31 August, 2015
Recently I had a discussion with a fellow researcher who was doing a consultancy gig to quantify privacy risks in a big organisation’s databases, and to trigger employees’ responses when a certain threshold was met. While going through her steps to understand how she was approaching such a decision support system, I was reminded of the typical sentences that open every self-respecting paper on privacy, for example: “Privacy is a value so complex, so entangled in competing and contradictory dimensions, so engorged with various and distinct meanings, that I sometimes despair whether it can be usefully addressed at all.” (Robert C. Post, Three Concepts of Privacy, 2001). Other variations include Judith Jarvis Thomson’s opening in her paper The Right To Privacy (pdf): “Perhaps the most striking thing about the right to privacy is that nobody seems to have any very clear idea what it is.” Or Daniel Solove’s first sentence in his paper A Taxonomy of Privacy (pdf): “Privacy is a concept in disarray.”
It seems like a tall order to quantify the risks of a concept that is still vague to the brightest minds working on the topic. Risks resulting from disclosures of personal or private information would also need to be calculated with future technological developments in mind, an exercise which, as ethicist Hannah Maslen argues, is important if we are not to be overtaken by technology, but which is also largely speculative and hardly practical. For example, a recent paper from Italy titled “Security, privacy and trust in Internet of Things: The road ahead” analyses a survey about the necessary precautions and the interrelation of privacy and security in a world where billions of devices are interconnected and collecting data. To understand what to engineer, the different technical solutions for preserving privacy should be underpinned by more qualitative surveys, such as the recent Swedish study “Online privacy concerns: A broad approach to understanding the concerns of different groups for different uses.” This paper reiterates the finding that people with similar characteristics, such as their political orientation or inherent trust in people, have similar privacy concerns, but that these concerns can vary across defined groups in a population. While the qualitative survey does not give a more solid definition of the concept of privacy, it can guide technical approaches to preserving appropriate flows of information.
Perhaps the most interesting academic work on quantifying privacy risks can be found at the data publishing or dissemination phase. For example, the paper “Privacy-Preserving Big Data Publishing” from the University of Calgary and IBM Research explains how tried and tested models of data aggregation (k-anonymity and l-diversity) can be used in big data contexts. The advantage of such approaches to data publishing and dissemination is the ability to control the balance between privacy and data utility, which can be adjusted according to the context and a subsequent risk analysis of the data. However, the best way to control (and possibly quantify) privacy is to consider it from the outset, for example by applying privacy by design strategies as outlined by Jaap-Henk Hoepman a few years ago. Leaving privacy risk assessment or quantification until data has been collected and databases have been created leaves much to speculation about (possibly) poorly understood contextual risks.
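To make the idea of a measurable privacy level a little more concrete, here is a minimal sketch of how k-anonymity and (distinct) l-diversity could be computed for a small table, and how coarsening the data trades utility for privacy. It is not taken from the paper mentioned above: the function names `k_and_l` and `generalise_age`, the column names and the toy records are all made up for illustration.

```python
from collections import defaultdict


def k_and_l(records, quasi_identifiers, sensitive):
    """Return (k, l) for a non-empty list of dict records.

    k is the size of the smallest group of records that share the same
    quasi-identifier values; l is the smallest number of distinct
    sensitive values found in any such group (distinct l-diversity).
    """
    groups = defaultdict(list)
    for row in records:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row[sensitive])
    k = min(len(values) for values in groups.values())
    l = min(len(set(values)) for values in groups.values())
    return k, l


def generalise_age(row, width):
    """Replace an exact age with a bucket of the given width (less utility)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


# Hypothetical toy data: age and zip are quasi-identifiers, diagnosis is sensitive.
records = [
    {"age": 23, "zip": "081**", "diagnosis": "flu"},
    {"age": 27, "zip": "081**", "diagnosis": "asthma"},
    {"age": 34, "zip": "081**", "diagnosis": "flu"},
    {"age": 38, "zip": "081**", "diagnosis": "diabetes"},
]

raw = k_and_l(records, ["age", "zip"], "diagnosis")
coarse = k_and_l(
    [generalise_age(r, 10) for r in records], ["age", "zip"], "diagnosis"
)
print("exact ages:", raw)       # every record is unique
print("10-year buckets:", coarse)  # records now hide in groups
```

With exact ages every record stands alone; with ten-year buckets each record shares its quasi-identifiers with another one. The bucket width is the kind of dial that lets a publisher move between privacy and utility, although choosing its value still depends on the contextual risk analysis discussed above.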
Academic Liaison at Princeton University