Noted Artificial Intelligence (AI) tool ChatGPT was shown to be more creative than 99% of college students who took the same well-regarded creativity test, researchers report.
In previous tests, earlier versions of the widely known AI were not nearly as creative as their human counterparts. However, these new results using GPT-4 show that AI may be closing the creativity gap faster than anticipated.
Fears of AI Replacing Human Workers Generally Spared Creative Thinkers
Since the launch of AI tools like ChatGPT showed the world how advanced some of these systems have become, fears have increased that AI may replace human workers in a number of fields. Some reports indicate that is already happening in places like advertising and code writing, but even the direst predictions of AI job displacement surmised that areas of human creativity are the likeliest to be safe, at least for the foreseeable future.
According to the researchers behind this latest finding, that was still the case a year ago, when an earlier version of ChatGPT was unable to compete with many humans in creative thinking.
“We learned of previous research on GPT-3 that was done a year ago,” said Dr. Erik Guzik, an assistant clinical professor at the University of Montana’s College of Business and the study’s director. “At that time, ChatGPT did not score as well as humans on tasks that involved original thinking.”
Still, Guzik and his colleagues Christian Gilde of UM Western and Christian Byrge of Vilnius University say that after spending a year playing around with ChatGPT, they were fascinated by the possibility that the newest version had increased its creativity enough to give human minds a run for their money.
“We had all been exploring with ChatGPT, and we noticed it had been doing some interesting things that we didn’t expect,” Guzik said. “Some of the responses were novel and surprising. That’s when we decided to put it to the test to see how creative it really is.”
Standardized Test Measuring Creative Thought in Humans Yields Astonishing Results
To see if the newest AI tool was indeed becoming more creative, Guzik and his research team decided to put it to the test. Literally.
First, they used a tool known as the Torrance Tests of Creative Thinking (TTCT). Employed for decades to measure human creativity, the TTCT prompts the test subject with a series of questions designed to determine how creative they are.
“Let’s say it’s a basketball,” Guzik explains of the type of prompt the TTCT might offer a test subject. “Think of as many uses of a basketball as you can.”
Obvious responses to this query may include shooting it into a basketball hoop or playing a game of catch with your friends. But creative-thinking humans may offer responses a computer might not.
“If you force yourself to think of new uses, maybe you cut it up and use it as a planter,” Guzik explains. “Or with a brick, you can build things, or it can be used as a paperweight. But maybe you grind it up and reform it into something completely new.”
Again, the researcher notes that until recently, AI tools didn’t fare as well as humans in giving unconventional, creative options to these queries.
After ChatGPT was given a series of eight TTCT questions designed to evoke creative solutions, Guzik administered the same test to 24 of his entrepreneurship students. The researchers also gathered 2,700 responses from students in a 2016 nationwide test to dramatically enlarge their comparison data set.
Next, Guzik and his colleagues sent the responses to the Scholastic Testing Service (STS) for scoring. Notably, the folks at the STS did not know that AI was involved in the examination and treated all subjects as if they were humans. Also significant, the researchers point out, is that the TTCT is protected, proprietary material, so ChatGPT could not cheat by looking online for creative answers to its questions.
After the folks at the STS finished their scoring of the replies submitted by Guzik and his colleagues, the results were even better than expected.
“The AI application was in the top percentile for fluency – the ability to generate a large volume of ideas – and for originality – the ability to come up with new ideas,” the researchers report.
“For ChatGPT and GPT-4, we showed for the first time that it performs in the top 1% for originality,” Guzik said. “That was new.”
Notably, they showed that the AI “only” scored in the 97th percentile for flexibility, which the test defines as the ability to generate different types and categories of ideas. Still, Guzik and his team say this was a staggering number.
AI Says Humans May Need Better Tools and Understanding to Measure Creativity
One interesting result of their research came about when Guzik asked ChatGPT what it would mean if it scored well on the TTCT.
“ChatGPT told us we may not fully understand human creativity, which I believe is correct,” Guzik said. “It also suggested we may need more sophisticated assessment tools that can differentiate between human and AI-generated ideas.”
Of course, the research only looked at the basics of creativity, as opposed to asking ChatGPT to write a symphony or a multi-volume fantasy fiction series. As a result, those sorts of endeavors may still be out of reach for even the most sophisticated AI tools. Still, the results the team did glean from their analysis were generally unexpected and may show that AI systems are learning to think like humans faster than anyone had expected.
“I think we know the future is going to include AI in some fashion,” Guzik said. “We have to be careful about how it’s used and consider needed rules and regulations. But businesses already are using it for many creative tasks. In terms of entrepreneurship and regional innovation, this is a game changer.”