Bendert Zevenbergen: Prejudiced Algorithms

The Guardian recently asked its readers, in a thought-provoking piece, “is an algorithm any less racist than a human?” Evidence suggesting the answer may be “no” was published as early as 2013 by Latanya Sweeney, Professor of Government and Technology in Residence at Harvard University, who discovered that searching for her African-American-sounding name prompted advertising services to display links to services that let users check whether the person named had previously been arrested. This did not happen when more Caucasian-sounding names, like Jill or Kristen, were submitted. Similar controversial topics were discussed at a recent week-long seminar in the beautiful converted German monastery in Dagstuhl, where more than 30 computer scientists (and three non-techies, including myself) exchanged ideas.

For European lawyers, the concept of discrimination is typically operationalised through statutory lists of prohibited or sensitive data types that warrant special attention or treatment when processed. In their multi-disciplinary literature review, computer scientists Andrea Romei and Salvatore Ruggieri from the University of Pisa show the depth of scholarship on the concept of discrimination. The authors note that data mining algorithms can be used to discover discriminatory practices in historical databases (with many potential flaws and limitations to be taken into account!). However, they also warn that “[m]ining algorithms may then assign to such discriminatory practices the status of general rules, which are subsequently used for automatic decision making in socially sensitive tasks.”
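To make that warning concrete, here is a minimal, hypothetical Python sketch (the toy data and the naive frequency-based rule learner are my own invention for illustration, not taken from Romei and Ruggieri's review). A rule "mined" from a biased historical dataset simply reproduces the past disparity as a general decision rule:

```python
import random
from collections import defaultdict

random.seed(0)

def historical_decision(group, qualified):
    # Past (biased) human decisions: qualified applicants from group "B"
    # were approved far less often than equally qualified applicants from group "A".
    if not qualified:
        return False
    return random.random() < (0.9 if group == "A" else 0.4)

# A toy "historical database" of past decisions.
records = [
    {"group": g, "qualified": q, "approved": historical_decision(g, q)}
    for g in ("A", "B")
    for q in (True, False)
    for _ in range(1000)
]

# "Mine" a rule per (group, qualified) cell: approve future applicants in a cell
# if the majority of historical cases in that cell were approved.
counts = defaultdict(lambda: [0, 0])  # (group, qualified) -> [approved, total]
for r in records:
    cell = (r["group"], r["qualified"])
    counts[cell][0] += r["approved"]
    counts[cell][1] += 1

learned_rule = {cell: approved / total > 0.5
                for cell, (approved, total) in counts.items()}
print(learned_rule)
# Typical output: {('A', True): True, ('A', False): False,
#                  ('B', True): False, ('B', False): False}
# The historical disparity has become a "general rule": equally qualified
# applicants from group "B" are now rejected automatically.
```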

The challenge of designing algorithms that are more in line with legal and policy objectives was taken up by a large group of academics in the paper “Accountable Algorithms,” which Solon Barocas presented at Dagstuhl. The paper contains a wide-ranging exploration of legal standards (such as ‘procedural regularity’ or ‘due process’) and computational techniques. The authors note that “accountability mechanisms and legal standards that govern decision processes have not kept pace with technology” and that “[t]he tools currently available to policymakers, legislators, and courts were developed primarily to oversee human decision makers.” One of the paper's assumptions is that total transparency of algorithms is of limited use, as strong technical proficiency is required to understand how the code is constructed, or how it constructs itself. Therefore, the paper offers alternative technological approaches, such as cryptographic commitments, zero-knowledge proofs, and fair random choices.
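For a concrete sense of the first of these, here is a minimal Python sketch of a hash-based commitment (illustrative only; the function names and the choice of SHA-256 are my own assumptions, not the specific construction used in “Accountable Algorithms”). The idea is that a decision maker can publish a commitment to its decision rule in advance, and reveal the rule later, so an auditor can verify that the rule was not changed after the fact:

```python
import hashlib
import secrets

def commit(policy: bytes):
    """Publish the commitment now; keep the nonce secret until the reveal."""
    nonce = secrets.token_bytes(32)
    commitment = hashlib.sha256(nonce + policy).digest()
    return commitment, nonce

def verify(commitment: bytes, nonce: bytes, policy: bytes) -> bool:
    """Anyone can later check that the revealed policy matches the commitment."""
    return hashlib.sha256(nonce + policy).digest() == commitment

# Hypothetical usage: an agency commits to its decision rule before processing
# any cases, then reveals the rule (and the nonce) to an auditor afterwards.
rule = b"approve if credit_score >= 700"
commitment, nonce = commit(rule)
assert verify(commitment, nonce, rule)                                   # honest reveal
assert not verify(commitment, nonce, b"approve if credit_score >= 720")  # altered rule detected
```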

The time to engage in the development of non-racist algorithms is now! Several communities are currently forming to develop approaches for understanding the social effects of algorithms and machine learning. The Data Transparency Lab at Columbia University in New York is hosting several workshops in November, such as Fairness, Accountability, and Transparency in Machine Learning and the Workshop on Data and Algorithmic Transparency. Finally, the Big Data journal is currently collecting papers for a “Special Issue on Social and Technical Trade-Offs.” I hope to see some readers at these workshops!