Using big data sets in clever ways, analysts can infer and predict the behavior of individuals or groups who themselves may not even be in the dataset used for the inference. This practice is common in online advertising, as well as in sectors such as public governance and insurance. Is this an issue for privacy law and data protection regulation, though? This question was raised by Kate Crawford and Jason Schultz in their 2013 paper “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms” and again during a panel discussion at the Amsterdam Privacy Conference.
As the widely discussed Target pregnancy case has made clear, data collected during innocuous transactions (e.g. shopping) can reveal particularly sensitive information about individuals. Not only can such data reveal significant life-changing events, but it can also reveal more permanent traits, such as a higher or lower IQ inferred from liking curly fries on a social media platform, as shown by Cambridge University researchers in their paper “Private traits and attributes are predictable from digital records of human behaviour.” While not all inferences will reveal sensitive facts, predictive analysis does appear to be an easy way to learn about matters that are not apparent from the data the individual has chosen to provide.
This is fertile ground for discrimination. Not only are the underlying algorithms designed by people with their own particular objectives, incentives, and views of the world (see Barocas and Selbst’s paper on Big Data’s Disparate Impact), but applying machine learning techniques also removes any moral agent from the process of inferring facts about others. A particularly serious loss of autonomy occurs when the derived data is used to make decisions about people on matters unrelated to the purpose for which the data was initially provided. Consider – in a doom scenario that is not based on science fiction – that your recent financial transactions fit the pattern of people who have a particular likelihood of getting a divorce one year later. Financial service providers could be alarmed when you apply for a new product.
Existing power balances are upset when institutions know more about you than you know yourself. The question remains how to deal with this upset informational power relation. It may be difficult to apply European data protection law, since the derived data about individuals or groups is not necessarily the result of their personal data being processed. A call for more transparency in data practices may be futile and a false reassurance, because few will take the time to understand modern information practices. The paper by Crawford and Schultz proposes that individuals affected by big data analyses should have avenues of redress similar to those found in privacy laws in the Anglo-Saxon world.
The question remains, though, whether predictive analyses constitute a privacy invasion under our current understanding of the concept. Given the Court of Justice of the EU’s recent active stance on privacy issues, it may be less hard than one would think to predict how the judges would answer this question. It seems, however, that privacy scholars are currently looking more towards ethical justifications for answers than towards legal solutions.
Bendert Zevenbergen, Academic Liaison at Princeton University