Are Robots Racist? Interrogating biases in AI
In 2016, Microsoft's artificial intelligence (AI) Twitter chatbot "Tay" caused controversy when it started using racist and sexist slurs and calling for genocide, forcing Microsoft to take the bot offline within 16 hours of its launch.
[examples of Tay's tweets]
The pertinent question is: what was the root cause of this outburst? Whilst Tay’s Twitter outburst conjured fears of a dystopian future of killer robots, the bot, launched as a pilot to learn and mimic natural language, was merely reproducing the awful speech being tweeted at it. Paradoxically, then, we can be somewhat relieved that the outburst was the result of white supremacist users deliberately exploiting the bot’s unsupervised machine learning vulnerabilities to reproduce their racist propaganda, and not the creation of a “racist robot”.
However, this was not the only example of AI exhibiting such discriminatory behaviour. In 2015, Google’s Photos app, an image recognition service, categorised a picture of two black users as “gorillas”. Similarly, the results of an algorithm-judged beauty contest sparked controversy because its 44 chosen winners (out of 6,000 contestants from 100 different countries) were overwhelmingly white.
We can hope that none of these algorithms were designed with racist and exclusionary intent, so what is going on here?
To break it down (rather simplistically), machine learning models reflect the quality and diversity of the data they are fed. In Google’s case, it could be assumed that the algorithms were not trained and tested on a diverse enough data set before their release, i.e. predominantly white faces were used to train the model to identify a “man” and/or a “woman”. Whilst diverse data sets free of bias can sometimes be difficult to come by (e.g. a Google image search for “man” returns homogeneous results), I believe these examples are indicative of a wider diversity problem within the tech industry.
If the teams of developers and product owners designing and building these technologies are predominantly of a specific race or gender, they may fail to consider other users and their differences. For example, if testing is conducted on individuals with similar looks, needs and preferences to the designer’s, the problems won’t become evident until the product is released to the wider public. Encouraging diversity in the creation and testing of these products therefore becomes even more important.
This lack of diversity and consideration may explain why none of the Microsoft team working on Tay thought to include content filters for racially or sexually explicit language, let alone anticipate deliberate abuse of a bot launched on a platform with a well-known trolling community (following incidents such as #gamergate and increasing complaints of death and rape threats on the platform).
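As a rough illustration of what checking a data set for diversity could look like, here is a minimal sketch in Python. It assumes each training example comes with a group label (for instance, self-reported metadata), which is itself a big assumption. The function names, the `min_share` threshold and the toy labels are all hypothetical and are not taken from any real system.

```python
from collections import Counter


def group_shares(group_labels):
    """Return the fraction of examples belonging to each group."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def under_represented(group_labels, min_share=0.2):
    """List groups whose share of the data falls below min_share.

    min_share is an arbitrary, hypothetical threshold chosen for illustration.
    """
    shares = group_shares(group_labels)
    return sorted(group for group, share in shares.items() if share < min_share)


# Invented toy labels, purely for illustration.
toy_labels = ["group_a"] * 8 + ["group_b"] * 1 + ["group_c"] * 1

print(group_shares(toy_labels))
print(under_represented(toy_labels))
```

A check like this only shows who is missing from the data. It says nothing about whether the labels or the images themselves carry bias.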
Where we’ve tried to side-step into a post-racial colour-blind future, these technologies highlight and amplify the biases still embedded within our societies. Whilst product and service design has always been riddled with biases (sometimes deliberately), the implications of these inaccuracies are becoming increasingly severe with the growing adoption of autonomous technologies, such as self-driving vehicles and predictive policing.
Worryingly, for example, ProPublica reports that the AI software used in the US to predict the probability of recidivism was almost twice as likely to falsely flag black defendants as being at high risk of reoffending as it was white defendants, while white defendants were more likely than black defendants to be falsely flagged as low risk. Not only is this a questionable use of AI, but these inaccuracies also have grave consequences for defendants of colour, affecting a person’s sentence length, and further criminalise black communities under the guise of objective decision making.
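To make "falsely flagged" concrete: a disparity of this kind can be surfaced by comparing error rates per group rather than looking at overall accuracy alone. The sketch below is a simplified illustration, not ProPublica's methodology. The function name, the record layout and the toy records are all hypothetical, and the numbers are invented purely to show the calculation.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Each record is a tuple: (group, predicted_high_risk, reoffended).
    A false positive is someone flagged high-risk who did not reoffend;
    a false negative is someone flagged low-risk who did reoffend.
    """
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, predicted_high_risk, reoffended in records:
        if reoffended:
            key = "tp" if predicted_high_risk else "fn"
        else:
            key = "fp" if predicted_high_risk else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        did_reoffend = c["fn"] + c["tp"]
        fpr = c["fp"] / did_not_reoffend if did_not_reoffend else None
        fnr = c["fn"] / did_reoffend if did_reoffend else None
        rates[group] = (fpr, fnr)
    return rates


# Invented toy records (group, predicted_high_risk, reoffended); not real data.
toy_records = (
    [("group_a", True, False)] * 2 + [("group_a", False, False)] * 2
    + [("group_a", True, True)] * 3 + [("group_a", False, True)] * 1
    + [("group_b", True, False)] * 1 + [("group_b", False, False)] * 3
    + [("group_b", True, True)] * 2 + [("group_b", False, True)] * 2
)

for group, (fpr, fnr) in sorted(error_rates_by_group(toy_records).items()):
    print(f"{group}: false positive rate={fpr:.2f}, false negative rate={fnr:.2f}")
```

In this toy data both groups have the same overall accuracy (five of eight predictions correct), yet the errors fall in opposite directions: one group is wrongly flagged as high-risk twice as often, while the other is wrongly flagged as low-risk twice as often. That is why a single headline accuracy figure can hide exactly the kind of harm described above.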
The notion that AI eliminates or is immune to human error and is value-free needs to be thoroughly interrogated, especially as many algorithms rely on codifying historical data, which can be flawed due to a long history of discriminatory policies and can reflect historical biases and disparities. Robust systems should therefore be developed to counter these biases, taking historical facts and different viewpoints into account and adjusting accordingly.