
Yes, the George Carlin AI-Generated Comedy Special Was Fake. But AI Can Still Be Funnier Than Humans.

In January 2024, the YouTube comedy special “George Carlin: I’m Glad I’m Dead” created quite a buzz when it was initially reported as AI-generated content, effectively bringing the legendary comedian “back from the dead.” 

After Carlin’s estate filed a lawsuit for copyright violation, it came to light that the jokes were not actually created by AI. Instead, they were the work of the “Dudesy” podcast hosts: actor and comedian Will Sasso and writer Chad Kultgen. The pair had used AI voice- and image-generation tools to make it seem as though artificial intelligence had crafted the entire stand-up routine.

While the incident was primarily discussed in the context of growing concerns about copyright and AI-generated content, it also raised an intriguing underlying question about AI’s ability to understand, create, or replicate human humor and wit.

The question is a particularly compelling one, given that a good sense of humor, the ability to provoke laughter and provide amusement, is often considered one of the most complex and nuanced forms of human expression.

Now, in a new study published in PLOS ONE, researchers have found that jokes generated by artificial intelligence (AI) can be just as funny, if not funnier, than those created by humans. 

This discovery challenges long-held beliefs about the unique human touch required for humor and opens new avenues for AI applications in entertainment and creative industries.

The research, conducted by Drew Gorenz, a PhD student in social psychology at the University of Southern California (USC), and Dr. Norbert Schwarz, a professor of social psychology at USC, explored the comedic capabilities of OpenAI’s ChatGPT 3.5.

In the study, participants were asked to rate the humor of jokes generated by humans or by ChatGPT across various comedic tasks. Surprisingly, the results showed that AI-generated jokes could be just as funny as, and in some cases funnier than, those created by humans.

“ChatGPT 3.5-produced jokes were rated as equally funny or funnier than human-produced jokes regardless of the comedic task and the expertise of the human comedy writer,” the study reports. 

Creating humor is markedly challenging. Humor relies on timing, cultural nuances, wordplay, and the element of surprise. Traditionally, humor has been seen as a complex cognitive skill that requires emotion, creativity, and the ability to leverage subtleties in language—elements that have seemed beyond the reach of AI. 

However, the ability of large language models (LLMs) like ChatGPT to analyze vast amounts of data and recognize patterns has enabled them to mimic and even innovate comedic expressions.

Gorenz and Dr. Schwarz conducted two studies to test ChatGPT’s humor against human jokes. In the first study, laypeople and ChatGPT were given the same prompts to generate jokes. In the second study, ChatGPT’s ability to create satirical headlines was compared to that of professional comedy writers from the satirical publication The Onion.

In the first study, 200 participants rated the jokes on a seven-point scale without knowing their source. The results showed that ChatGPT’s jokes were rated as funny as, or funnier than, human-generated jokes across three comedic tasks: acronym creation, fill-in-the-blank, and roast jokes.

For acronym creation, participants were given an acronym, such as “C.O.W.,” and asked to come up with a humorous phrase for the abbreviation. This task was designed to test the ability to be funny under character-based constraints. Ultimately, the results showed that ChatGPT’s jokes outperformed those of 73% of human participants on the acronym task.

The fill-in-the-blank task tested the ability to be funny under semantic rather than character-based constraints. For example, participants were asked to come up with a witty answer to the prompt: “A lesser talked about room in the White House: ___.” On this task, ChatGPT outperformed 63% of human participants.

One of the standout findings was ChatGPT’s performance in the roast joke task, in which participants were asked to come up with honest and humorous jokes about a negative real-world experience, such as making fun of a friend who cooks you a disgusting meal.

ChatGPT was a master of the “sick burn,” outperforming 87% of human participants. This was a surprising result given the aggressive nature of roast humor, which often requires a delicate balance to avoid offensiveness while still being funny.
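Across all three tasks, the comparison ultimately came down to blind funniness ratings of jokes from two sources. To make that concrete, here is a minimal, purely hypothetical sketch in Python of how such ratings might be compared; the numbers below are invented for illustration and are not the study’s data or analysis code.

```python
# Toy illustration only: hypothetical 1-7 funniness ratings, not the study's data.
import numpy as np
from scipy import stats

human_ratings = np.array([3, 4, 2, 5, 3, 4, 3, 2, 4, 3])    # made-up values
chatgpt_ratings = np.array([4, 5, 3, 5, 4, 4, 5, 3, 4, 4])  # made-up values

# Independent-samples t-test comparing mean funniness by joke source.
t_stat, p_value = stats.ttest_ind(chatgpt_ratings, human_ratings)

print(f"Mean human rating:   {human_ratings.mean():.2f}")
print(f"Mean ChatGPT rating: {chatgpt_ratings.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```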

In the second study, ChatGPT was tested against professional comedy writers working in a commercially successful format. “For this purpose, we assessed its ability to produce satirical humor,” the researchers wrote. “Satire is one of the oldest types of comedy and usually involves ridiculing the vices, follies, abuses, and shortcomings of another person, group of people, or society in general.”

Specifically, ChatGPT was tasked with creating satirical headlines akin to those found on the popular satire website The Onion. A total of 217 students from the USC psychology subject pool were then asked to rate the funniness of 10 satirical headlines randomly drawn either from The Onion or from ChatGPT’s output. Participants rated the headlines without knowing their source.

Results revealed that ChatGPT’s headlines were rated as just as funny as those written by professional satirists at The Onion.

Out of curiosity, The Debrief informally tried the exercise itself, prompting ChatGPT to devise a humorous headline for a satirical July 2010 article by The Onion titled: “Uncle Greg To Attempt Comeback At Family Barbecue.” The AI chatbot came up with: “Uncle Greg Plans Epic Comeback at Family BBQ, Promises to Outshine Last Year’s Charred Hot Dogs and Awkward Jokes.”
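For anyone who wants to repeat that informal exercise, a minimal sketch using the OpenAI Python client is shown below. The prompt wording, model settings, and package version are assumptions made for illustration, not details taken from the study or from The Debrief’s test.

```python
# Rough sketch, assuming the `openai` Python package (v1+) and an
# OPENAI_API_KEY environment variable; the prompt and settings are guesses.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

prompt = (
    "Write a funny satirical headline, in the style of The Onion, about an "
    "uncle attempting a comeback at a family barbecue."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the model family evaluated in the study
    messages=[{"role": "user", "content": prompt}],
    temperature=1.0,  # some randomness so repeated runs yield different headlines
)

print(response.choices[0].message.content)
```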

These findings could have significant implications for the future of AI in creative industries. AI’s ability to generate humor that resonates with people could revolutionize the entertainment industry, offering new tools for writers and creators. 

“If LLMs (large language models) are capable of producing comparable output to professional comedians, there are large economic implications for current and aspirational comedy writers,” researchers wrote. “Future research should investigate the potential of using LLMs for comedic writing across other commercially successful formats such as script writing, cartoon captioning, and meme generation once LLMs begin to incorporate image generation capabilities.”

Moreover, the findings challenge the view of AI as merely an analytical tool, highlighting its potential to perform tasks traditionally thought to require human emotion. As AI technology continues evolving, its capacity to handle cognitively complex tasks will likely expand even further.

Despite the promising results, challenges and ethical considerations remain. One major concern is the AI’s tendency to “hallucinate” facts, presenting false information as accurate. While this is less of a concern in comedic contexts, it underscores the need for careful oversight and fact-checking in other applications.

Nevertheless, Gorenz and Dr. Schwarz’s study showcases AI’s potential in humor and opens the door to myriad possibilities in AI-driven creativity. As technology advances, the collaboration between human ingenuity and artificial intelligence will undoubtedly lead to exciting new developments.

Ultimately, the “George Carlin: I’m Glad I’m Dead” comedy special may have been a rare instance of human-made content being passed off as AI-generated. However, the next AI comedy special to crop up could very well be the real thing. 

“Our studies pose several new questions,” researchers conclude. “If LLMs can produce humor better than the average person, to what extent do they understand it compared to the average person? To what extent can they accurately predict how funny different jokes are for different audiences? Is the capability to feel the emotions associated with appreciating a good joke necessary to create good jokes?” 

“Our studies suggest that the subjective experience of humor may not be required for the production of good humor–merely knowing the patterns that make up comedy may suffice.”

Tim McMillan is a retired law enforcement executive, investigative reporter and co-founder of The Debrief. His writing typically focuses on defense, national security, the Intelligence Community and topics related to psychology. You can follow Tim on Twitter: @LtTimMcMillan.  Tim can be reached by email: tim@thedebrief.org or through encrypted email: LtTimMcMillan@protonmail.com