Searching Google Images for “person” returns pictures that are overwhelmingly of white people. That may not sound too alarming, and it is unlikely that there is anything malicious behind it, but there are increasing concerns that it is a consequence of the algorithms behind search results being trained on data sets that lack diversity.
In 2016 an organisation called the Algorithmic Justice League (AJL) was launched by Joy Buolamwini, a Massachusetts Institute of Technology postgraduate student, to attempt to combat the biases in written code.
Buolamwini found that facial recognition software could not detect her face, but worked fine for lighter-skinned colleagues. She then found a way to make it detect her: she wore a white mask, which was enough for the software to recognise her.
This was a problem that Buolamwini had encountered before, when, as an undergraduate studying computer science, she worked on social robots.
Buolamwini said: “One of my tasks was to get a robot to play peekaboo… the problem is peekaboo doesn’t really work if I can’t see you, and the robot couldn’t see me. I borrowed my roommate’s face for the assignment and figured someone else will solve this problem.”
Facial recognition software is developed using machine learning: the program is shown a data set that teaches it what is and is not a face, and how to detect other faces. If the data set does not include a diverse range of faces, the software is unlikely to learn to recognise faces that differ from those it was shown.
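The effect of a narrow data set can be illustrated with a deliberately simplified sketch. It is not how real facial recognition software works: each “face” is reduced to a single made-up brightness value, and the “detector” simply remembers the range of values it saw in training. All names and numbers here are hypothetical and chosen only for illustration.

```python
def train_detector(face_brightness_samples, margin=0.05):
    """Toy stand-in for training: remember the brightness range seen so far."""
    low = min(face_brightness_samples) - margin
    high = max(face_brightness_samples) + margin
    return low, high


def detects_face(detector, brightness):
    """The toy detector only accepts values inside the range it was trained on."""
    low, high = detector
    return low <= brightness <= high


def detection_rate(detector, samples):
    """Fraction of samples the detector accepts as faces."""
    hits = sum(detects_face(detector, s) for s in samples)
    return hits / len(samples)


# Hypothetical brightness values between 0 and 1 (assumed, not real data).
lighter_faces = [0.70, 0.75, 0.80, 0.85, 0.90]
darker_faces = [0.20, 0.25, 0.30, 0.35, 0.40]

# One detector is trained on lighter faces only, the other on both groups.
narrow_detector = train_detector(lighter_faces)
full_detector = train_detector(lighter_faces + darker_faces)

narrow_rates = (
    detection_rate(narrow_detector, lighter_faces),
    detection_rate(narrow_detector, darker_faces),
)
full_rates = (
    detection_rate(full_detector, lighter_faces),
    detection_rate(full_detector, darker_faces),
)

print("trained on lighter faces only:", narrow_rates)
print("trained on both groups:", full_rates)
```

In this toy setup the detector trained only on lighter faces accepts every lighter face and none of the darker ones, while the detector trained on both groups accepts all of them. Reporting the detection rate separately for each group, rather than as one overall figure, is what makes the gap visible.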
Buolamwini wants to see more “full spectrum data sets” which would improve the accuracy of facial recognition software when used by people who were omitted from the data sets used in the generic version of the software.
She warned that the consequences of not doing this could be very serious: “Algorithmic bias can lead to discriminatory practices. US police departments are starting to use facial recognition software as part of their crime fighting arsenal. A report showed that almost one in two Americans have their faces in facial recognition networks. Police can use these despite the algorithms not having been checked for accuracy. The software is often unreliable, and mislabelling a wanted criminal is no laughing matter.”
To help solve this problem, Buolamwini said that we need “more inclusive coding” and “diverse teams working on the code who can check each other’s blind spots.”
The most recent diversity reports from big tech companies show that there is a long way to go on that front. Google said that 19 percent of its tech staff are women and only one percent are black. That compares to 17.5 percent female and 2.7 percent black or African American at Microsoft, and 17 and one percent respectively amongst Facebook’s tech team.
Buolamwini launched the AJL so that anyone who wanted to help could “report biases in algorithms or become a tester.”
Issues surrounding the lack of diversity within coding have been around for a long time. It was only in July 2015 that the Unicode Consortium (the organisation that makes it possible for the scripts of hundreds of languages to be used in software, or, as most people know it, the body that decides upon new emojis) finally decided to add emojis that represented non-white people. It did this using five modifier characters that can be applied to an emoji depending on which skin tone the user wants.
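For readers curious about how the modifiers work in practice, here is a minimal Python sketch. The five modifier characters occupy the code points U+1F3FB to U+1F3FF, and a modifier is applied by placing it directly after a base emoji that supports it. The waving hand (U+1F44B) is used as the example base, and the helper name `with_skin_tone` is hypothetical. Whether the pair is drawn as a single tinted emoji depends on the font and platform displaying it.

```python
import unicodedata

# Base emoji chosen for illustration: the waving hand (U+1F44B).
WAVING_HAND = "\U0001F44B"

# The five skin tone modifier characters, U+1F3FB through U+1F3FF.
SKIN_TONE_MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F3FF + 1)]


def with_skin_tone(base_emoji, modifier):
    """Apply a modifier by appending it directly after the base emoji."""
    return base_emoji + modifier


variants = [with_skin_tone(WAVING_HAND, m) for m in SKIN_TONE_MODIFIERS]

# Print each variant with its modifier's code point and official name.
for modifier, variant in zip(SKIN_TONE_MODIFIERS, variants):
    print(variant, f"U+{ord(modifier):04X}", unicodedata.name(modifier))
```

Each variant is stored as two code points, the base emoji followed by the modifier, so software that does not understand the modifiers can still show the original emoji next to a colour swatch.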