Superintelligent computers have long graced the pages of comic books. Whether it is Ultron or Brainiac, these machines are quick to take over the world. But while superintelligent machines remain fixtures of the cultural zeitgeist, the algorithms at work in the real world may pose a more insidious threat.
If I were simply judging the progress of machine learning (ML) and artificial intelligence (AI) by news headlines, I might begin to think a superintelligent AI could go online soon. A quick search on the New York Times website brings up a few choice headlines:
“Is Coffee Good for Us? Maybe Machine Learning Can Help Figure It Out.”
“Can A.I. All but End Car Crashes? The Potential Is There.”
“A.I. Is Mastering Language. Should We Trust What It Says?”
They evoke the idea that ML and AI algorithms have some sort of agency. AI is helping us make health decisions, prevent accidents and master language. These algorithms have been integrated into many aspects of our daily lives: targeted online advertisements, predictive policing, and even autocomplete in Google Docs.
There are also plenty of big names behind these initiatives, from OpenAI, which was co-founded by Elon Musk, to the research teams at Google, Microsoft and Meta. It is no surprise that the market is valued at $93.5 billion, according to Grand View Research. But after cutting through the hype and marketing, researchers and scientists are pointing to the shortcomings of many of these applications.
Some studies, even those published in prestigious journals, rely on antiquated and racist ideas to “determine” trustworthiness or detect emotions. Large language models trained on data scraped from the internet produce racist and sexist output. Models meant to detect sepsis in hospitals have failed to catch the deadly condition again and again. Experts believe that we need to do a better job of regulating ML and AI applications to ensure that they don’t cause harm.
ML and AI algorithms don’t have agency
Unlike Ultron, an entity with control over its own actions, algorithms in the real world cannot act on their own. Experts disagree with the wording in a bevy of New York Times headlines suggesting algorithms can do research, stop car crashes, and master language.
At their core, ML and AI algorithms are tame and mundane in comparison to fictional robot villains. They are programs that take data as input and perform calculations to classify that input based on features in the data. These models can then learn from their past mistakes to become more accurate.
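To make that concrete, here is a minimal, hypothetical sketch (the numbers are invented, and scikit-learn stands in for whatever library a real system might use): the whole process amounts to fitting a model to labeled examples and then applying it to new inputs.

```python
# A minimal sketch of what an ML classifier often amounts to: fit a model to
# labeled feature vectors, then use it to classify new inputs.
# The data below is made up purely for illustration.
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: each row is a list of numeric features,
# each label marks which of two classes that row belongs to.
X_train = [[0.1, 1.2], [0.4, 0.9], [3.1, 0.2], [2.8, 0.5]]
y_train = [0, 0, 1, 1]

model = LogisticRegression()
model.fit(X_train, y_train)  # "learning" is adjusting weights to reduce error

print(model.predict([[0.3, 1.0]]))  # classifying a new input is arithmetic on its features
```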
On March 17, 2022, Emily Tucker, the executive director of the Center on Privacy & Technology at Georgetown Law, wrote that “the Privacy Center will stop using the terms ‘artificial intelligence,’ ‘AI,’ and ‘machine learning’ in our work to expose and mitigate the harms of digital technologies in the lives of individuals and communities.”
Tucker wrote that this is because the terms ascribe agency to ML and AI algorithms and often obfuscate what the underlying application actually does.
“The terms ‘artificial intelligence,’ ‘AI’ and ‘machine learning,’ placehold everywhere for the scrupulous descriptions that would make the technologies they refer to transparent for the average person,” Tucker wrote.
Instead of explaining that an algorithm looks at the websites you visit and suggests products based on those patterns, a company might market it as something that serves you personalized ads. She also argues that ML and AI don’t “imitate” human thinking, because we barely understand human thinking and cognitive processes in the first place.
These algorithms are built and designed by humans, and all of their input data is curated, selected and created by humans.
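As a hypothetical illustration of how mundane the underlying logic can be (the categories and products below are invented, and real ad systems are more elaborate), “personalized ads” can reduce to counting which kinds of sites a user visits and surfacing products from the most frequent category:

```python
# A toy sketch of "personalized ads": tally the categories of pages a user
# visited, then suggest products from the most common category.
# The browsing history and catalog are invented for illustration.
from collections import Counter

visited_site_categories = ["running", "running", "cooking", "running"]
products_by_category = {
    "running": ["trail shoes", "GPS watch"],
    "cooking": ["cast-iron pan"],
}

top_category, _ = Counter(visited_site_categories).most_common(1)[0]
print(products_by_category[top_category])  # "personalization" by simple counting
```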
The problem with benchmarks
One of the goals of developing these algorithms is to create generalizable AI, meaning the same algorithm could be used to complete many different tasks. Scientists at the University of California, Berkeley, the University of Washington, and Google wrote about the limitations of popular AI benchmarking datasets.
Even if these algorithms manage to classify words and images from these datasets, the datasets themselves represent only a finite set of categories. The authors, led by Deborah Raji, made an apt comparison to a Sesame Street picture book titled “Grover and the Everything in the Whole Wide World Museum,” in which the titular character realizes that a museum cannot contain everything in the world.
An excerpt from their paper reads: “We argue that benchmarks presented as measurements of progress towards general ability within vague tasks such as ‘visual understanding’ or ‘language understanding’ are as ineffective as the finite museum is at representing ‘everything in the whole wide world,’ and for similar reasons—being inherently specific, finite and contextual.”
Benchmarks can be misleading, perpetuating the idea that an algorithm is more effective at these tasks than humans. The datasets used to train and benchmark these algorithms are also problematic, because the data they contain is often biased.
The problem with racism and sexism
Sure, the sentient machines of science fiction often attempt to take over the world and subjugate humans. But at least these depictions of ML and AI aren’t overtly racist and sexist — a stark contrast to real-life algorithms.
Microsoft’s chatbot
Take the case of Tay, Microsoft’s AI Twitter chatbot, which went live in 2016. Within a day, the algorithm went from sending playful, friendly messages to posting racist ones, and the developers behind the chatbot shut it down.
The chatbot itself wasn’t designed to be racist, but constant exposure to hateful tweets led it to mimic them. “As it learns, some of its responses are inappropriate and indicative of the types of interactions some people are having with it,” Microsoft wrote in a statement to Business Insider. “We’re making some adjustments to Tay.”
OpenAI’s GPT-3
OpenAI’s impressive natural language model can generate lengthy texts including news articles, stories and poetry. It also has a huge problem: OpenAI’s own research from May 2020 shows that GPT-3 associates Black people with more negative sentiment and has a tendency to generate sexist text.
Despite these glaring issues, OpenAI has gone ahead with licensing out and commercializing the algorithm. It can be applied in many settings, including rewriting passages of text in simpler or more complex language for schoolchildren. Through these commercial applications, the racist and sexist tendencies of the model may seep into everyday tools and perpetuate harmful stereotypes.
Predictive policing
Despite the prevalence of policing algorithms in the U.S., they may not be all that equitable. Tools like COMPAS are fed inputs such as age, gender, marital status, history of substance abuse, and criminal record in order to predict how likely a person is to be rearrested if released from prison.
However, the data feeding these algorithms is already deeply biased. A Black person is twice as likely to be arrested as a white person and five times as likely to be stopped without just cause. Trained on this data, an algorithm is more likely to predict that a Black person will be rearrested than a white person, which can prevent their release.
These tools don’t predict whether a particular person will commit a crime, but rather whether people with a similar profile have tended to be rearrested.
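To make the reasoning concrete, here is a minimal, entirely hypothetical simulation (it is not COMPAS, and the numbers are invented): two groups reoffend at the same underlying rate, but one is policed, and therefore recorded as rearrested, twice as heavily. A model trained on those records learns to score that group as higher risk.

```python
# Synthetic demonstration of how biased arrest records become "risk" predictions.
# Both groups reoffend at the same rate; group B is simply policed more heavily.
import random
from sklearn.linear_model import LogisticRegression

random.seed(0)
X, y = [], []
for _ in range(5000):
    group = random.choice([0, 1])              # 0 = group A, 1 = group B
    reoffends = random.random() < 0.3          # identical underlying behaviour
    arrest_rate = 0.8 if group == 1 else 0.4   # group B is stopped/arrested twice as often
    rearrested = reoffends and random.random() < arrest_rate
    X.append([group])
    y.append(int(rearrested))

model = LogisticRegression().fit(X, y)
print("Predicted rearrest risk, group A:", model.predict_proba([[0]])[0][1])
print("Predicted rearrest risk, group B:", model.predict_proba([[1]])[0][1])
# The model scores group B at roughly double the risk, even though the
# underlying reoffence rate was set to be identical for both groups.
```

The model is not measuring behaviour; it is reproducing the pattern of enforcement baked into its training data.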
Conclusions
ML and AI algorithms are hyped up in the news cycle, promising to cut through complex data and provide easy solutions. In reality, however, these algorithms have recreated and lent legitimacy to existing biases. There are problems with the way they are developed, the benchmarks used to evaluate them, and many of their applications.
Models that string together paragraphs and sentences perpetuate racist and sexist stereotypes. Applications in predictive policing are troubling because these widespread algorithms automate biased predictions. Since a Black person is more likely to be stopped without just cause, the algorithm judges them more likely to be rearrested.
ML and AI algorithms are often ascribed agency, obscuring the role of the companies and developers behind the harms they may perpetuate. That may make the current use of these algorithms more immediately dangerous than any superintelligent AI.
Simon Spichak is a science communicator and journalist with an MSc in neuroscience. His work has been featured in outlets that include Massive Science, Being Patient, Futurism, and Time magazine. You can follow his work at his website, http://simonspichak.com, or on Twitter @SpichakSimon.