Data Privacy
Learn about the risks of reidentification and some historical examples of this disaster.
Data is very revealing when it’s collected on a mass scale, as it is today. Many will be familiar with data breaches or leaks where personally identifiable information (PII) is hacked, leaked, or stolen from databases and either kept for a ransom or published on the internet. The gravity of situations like this is why global governments charge such a high fee for violation of data privacy laws. In Europe, the General Data Protection Regulation (GDPR) is of the utmost importance for any company handling potentially sensitive data. Clearly, data privacy is important, but why should we be concerned with it?
Motivation
There are many risks to holding private data. Let’s discuss some of these risks and the consequences that may result from them.
Reidentification
In a
If we know an individual watches lots of “rags to riches” movies and rates them highly, we can make the assumption that they not only enjoy this genre, but are also lower- to middle-class and dream of their own rags to riches story. We can then launch predatory advertising campaigns or otherwise use this information for our personal gain. In this manner, sexual orientation, political leanings, etc., can be garnered from just combining the two datasets. This is only possible because we know the full identity of the individual. Let’s recreate a toy example with a fake Netflix and IMDb dataset.
Get hands-on with 1400+ tech skills courses.