AI's machine learning process has shown biases due to faulty training data. New research from MIT suggests these biases can be overcome

Can AI’s Machine Learning Process Actually Be Unbiased?

While helpful and effective at problem-solving, artificial intelligence has also been behind several concerning incidents. In one example, Amazon revealed that its AI recruiting tool, which scanned resumes to flag promising candidates, was biased against female applicants. In another, researchers at the University of California, Berkeley found that an AI system used to allocate care to over 200 million patients in the U.S. gave Black patients a significantly lower level of care.

These biases often originate with the people who supply the input data that an AI system uses in its machine learning process. As a result, many researchers are studying exactly how bias enters machine learning. Now, researchers from MIT believe they may have found a possible solution to the problem, recently published in Nature Machine Intelligence.

Background: Machine Learning and Bias

Machine learning is a branch of artificial intelligence that uses algorithms to detect patterns and solve problems, basing its predictions on sample, or training, data. Because humans provide the training data for the machine learning process, the data can carry their biases, usually unintentionally. For example, a training set containing mostly the faces of white men will skew the resulting algorithm, producing errors that fall hardest on Black women. A data set can become biased for many reasons: it may be too small or incomplete, raising the probability of bias. For anyone using machine learning algorithms, it is essential to examine the training data thoroughly for potential biases; some users may need to revise their training data, or start over entirely, depending on the size of the data set.
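To make the mechanism concrete, here is a minimal sketch (not from the MIT study) of how a skewed training set biases a model. The two "demographic groups," their features, and all parameters below are invented for illustration: a classifier is trained on data dominated by one group, then evaluated on each group separately.

```python
# Illustrative sketch: train a classifier on data dominated by "group A",
# then measure accuracy separately for group A and under-represented "group B".
# Groups, features, and numbers are all hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, shift):
    # Two Gaussian classes for one hypothetical demographic group;
    # `shift` moves the group's features so the groups differ slightly.
    X0 = rng.normal([0.0 + shift, 0.0], 1.0, size=(n // 2, 2))
    X1 = rng.normal([2.0 + shift, 2.0], 1.0, size=(n // 2, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * (n // 2) + [1] * (n // 2))
    return X, y

# Skewed training set: 95% group A, 5% group B.
Xa, ya = make_group(950, shift=0.0)   # over-represented group A
Xb, yb = make_group(50, shift=3.0)    # under-represented group B
model = LogisticRegression().fit(np.vstack([Xa, Xb]),
                                 np.concatenate([ya, yb]))

# Balanced test sets expose the accuracy gap the skew creates.
for name, shift in [("group A", 0.0), ("group B", 3.0)]:
    Xt, yt = make_group(1000, shift)
    print(f"{name} accuracy: {model.score(Xt, yt):.2f}")
```

Because the under-represented group contributes so few training examples, the learned decision boundary fits the majority group, and the accuracy gap in the printout is the bias.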

Analysis: Overcoming Biases

Researchers at MIT wanted to test whether a machine learning system could overcome biased data. To do so, they built a series of data sets containing different objects in various poses, with some data sets showing more pose diversity than others, varying how biased each data set was. The researchers then fed each data set into a neural network, a type of machine learning system, to see how it performed. From the results, the team found that the more diverse the data set, the lower the chance of bias in the machine learning process.
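As a rough analogy to that experiment (the paper's actual data sets and network architectures differ), the sketch below renders the same two synthetic classes at several rotations standing in for "poses," trains on either a narrow or a diverse range of poses, and tests on unseen poses. Everything here is an illustrative assumption.

```python
# Illustrative sketch: does training on a wider range of "poses" (rotations)
# improve accuracy on poses never seen in training?
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def rotated_classes(angles, n_per_angle=100):
    # Two Gaussian classes, re-rendered at several rotations ("poses").
    X, y = [], []
    centers = {0: np.array([1.5, 0.0]), 1: np.array([-1.5, 0.0])}
    for a in angles:
        R = np.array([[np.cos(a), -np.sin(a)],
                      [np.sin(a),  np.cos(a)]])
        for label, c in centers.items():
            pts = rng.normal(c, 0.5, size=(n_per_angle, 2)) @ R.T
            X.append(pts)
            y += [label] * n_per_angle
    return np.vstack(X), np.array(y)

test_angles = np.linspace(0, np.pi / 2, 12)   # unseen poses at test time
for name, train_angles in [("narrow ", [0.0]),
                           ("diverse", np.linspace(0, np.pi / 2, 6))]:
    Xtr, ytr = rotated_classes(train_angles)
    model = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    Xte, yte = rotated_classes(test_angles)
    print(f"{name} training poses -> test accuracy: {model.score(Xte, yte):.2f}")
```

The narrow model only ever sees one pose and fails on the rest; the diverse model learns a boundary that holds across the whole range, mirroring the paper's broad finding that diversity reduces bias.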

The team then compared how the neural network performed when trained on two tasks at once versus one at a time. They found that the probability of bias rose significantly when the network learned two tasks simultaneously rather than separately. MIT research scientist Xavier Boix stated: “The results were really striking. In fact, the first time we did this experiment, we thought it was a bug. It took us several weeks to realize it was a real result because it was so unexpected.”
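For readers who want to see what such a comparison looks like in code, here is a hedged sketch of single-task versus multitask training on invented synthetic tasks. It only shows how the two regimes can be set up and scored side by side; it uses a generic scikit-learn network rather than the team's architecture, and this toy setup is not expected to reproduce their result.

```python
# Illustrative sketch: one network per task vs. one shared network
# trained on two labels at once (multilabel output). Tasks are invented.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
n = 2000
X = rng.normal(size=(n, 10))
# Two hypothetical labels (say, "category" and "pose"), each a function of X.
y_task1 = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
y_task2 = (X[:, 2] - 0.5 * X[:, 3] > 0).astype(int)

Xtr, Xte = X[:1500], X[1500:]

# Regime 1: separate networks, one per task.
for i, y in enumerate([y_task1, y_task2], start=1):
    net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                        random_state=0).fit(Xtr, y[:1500])
    print(f"separate net, task {i}: {net.score(Xte, y[1500:]):.2f}")

# Regime 2: one shared network trained on both tasks simultaneously.
Y = np.column_stack([y_task1, y_task2])
shared = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                       random_state=0).fit(Xtr, Y[:1500])
pred = shared.predict(Xte)
for i in range(2):
    acc = (pred[:, i] == Y[1500:, i]).mean()
    print(f"shared net,   task {i + 1}: {acc:.2f}")
```

Comparing the per-task scores across the two regimes is the shape of the experiment; in the MIT study, the shared-network regime was where bias rose.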

Outcome: Diversity is Key

After publishing their findings, the researchers hope others will use their results to develop more diverse training data, reducing potential bias. Because biased machine learning systems often make headlines, many big companies are eager for practical solutions. The simplest one may be a more diverse set of training data.

Kenna Castleberry is a staff writer at The Debrief and the Science Communicator at JILA (a partnership between the University of Colorado Boulder and NIST). She focuses on deep tech, the metaverse, and quantum technology. You can find more of her work at her website: https://kennacastleberry.com/