New Research Finds AI Labels Can Backfire, Making Misinformation Seem More Credible

Labeling AI-generated content so audiences know when machines are involved has emerged as a seemingly simple solution to the risks posed by artificial intelligence online. The assumption is straightforward: transparency helps audiences think more critically about the content they are consuming.

However, new research suggests the reality may be far more complicated.

In a startling twist, a new study found that labeling content as AI-generated can sometimes produce the opposite of the intended effect. Instead of helping people identify false information, the disclosure can actually increase the perceived credibility of misinformation while simultaneously reducing trust in correct scientific information.

The research, published in the Journal of Science Communication, examined how audiences evaluate science-related social media posts when told the content was produced by artificial intelligence.

In an experiment involving more than 400 participants, researchers sought to understand how disclosure labels affect credibility judgments when people encounter scientific information online.

The unexpected results point to a phenomenon researchers describe as a “truth-falsity crossover effect,” in which labeling AI involvement redistributes credibility in unanticipated ways.

“AI disclosure significantly reduced the perceived credibility of correct information while unexpectedly increasing the perceived credibility of misinformation,” researchers write.

The Rise of AI-Generated Content

The research comes at a time when AI-generated text is spreading rapidly across the internet. Tools capable of generating convincing articles, social media posts, and even scientific explanations have become widely accessible, raising concerns that these systems could flood digital platforms with persuasive misinformation.

A real-world example of these concerns is already unfolding on social media during the ongoing U.S.–Iran conflict, where a surge of AI-generated images, videos, and text posts has blurred the line between authentic reporting and synthetic propaganda.

In many cases, the content spreads rapidly before it can be verified or debunked, illustrating how generative AI can amplify confusion during fast-moving geopolitical crises.

In their study, researchers defined AI-generated content as information produced either entirely by artificial intelligence or created with AI assistance. While such systems can produce useful educational material, they can also produce errors or hallucinated facts that appear credible to readers.

“Such content may contain highly persuasive misinformation that humans struggle to detect,” researchers explained, pointing to the growing difficulty people face when trying to evaluate the accuracy of information online.

To address these risks, governments and technology platforms have increasingly turned to disclosure requirements. The European Union’s AI Act, for example, mandates transparency when AI-generated content is used, and some countries require clear labeling on synthetic media.

The logic behind such policies is rooted in the idea that if people know information was generated by AI, they will examine it more carefully.

However, new research suggests that this assumption may not always hold.

Testing AI Disclosure in a Social Media Environment

To examine how disclosure affects credibility, researchers designed an experiment simulating the experience of browsing science posts on social media.

Participants were shown a series of short science-related posts written in the style commonly seen on the Chinese platform Sina Weibo, which functions similarly to X (formerly Twitter). Some posts contained accurate scientific information, while others were deliberately written to mimic common online rumors or pseudo-scientific claims.

Importantly, the posts were generated using GPT-4 and then reviewed by researchers to ensure that the correct information remained accurate and that the misinformation reflected common rumor patterns.

Each post appeared either with or without an AI disclosure label. The label, placed prominently above the text, stated: “Attention: The content was detected as being generated by AI.”

Participants were then asked to rate each post’s credibility on a five-point scale. In total, 433 participants evaluated eight different posts covering topics such as food safety, heart health, and disease prevention.

Results revealed that when accurate information was labeled as AI-generated, participants judged it to be less credible than identical information presented without the disclosure label.

At the same time, misinformation labeled as AI-generated was often rated as more credible than the same misinformation shown without the label. The statistical analysis showed a strong interaction between disclosure and information type, confirming the unexpected crossover effect.
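For readers curious about what such a "crossover" interaction looks like in practice, the short sketch below shows one generic way an effect of this kind could be tested, using a two-way model of credibility ratings in Python. It is purely illustrative: the file name, column names, and model are assumptions made for this example and are not taken from the study's own data or analysis code.

```python
# Illustrative sketch only: a generic disclosure x information-type interaction test.
# The CSV, column names, and coding scheme are hypothetical, not the study's data.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical long-format data: one row per participant-post rating.
# Columns: credibility (1-5), labeled ("ai_label"/"no_label"),
# info_type ("accurate"/"misinfo").
df = pd.read_csv("ratings.csv")

# Two-way model with an interaction term; a significant labeled:info_type
# term is how a crossover effect would show up statistically.
model = smf.ols("credibility ~ C(labeled) * C(info_type)", data=df).fit()
print(anova_lm(model, typ=2))

# Cell means make the crossover visible directly: accurate posts rated lower
# with the label, misinformation rated higher with it.
print(df.groupby(["info_type", "labeled"])["credibility"].mean())
```

In a layout like this, the crossover appears as the labeled and unlabeled means moving in opposite directions across the two information types, alongside a significant interaction term.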

Why an AI Label Can Boost False Information

Researchers suggest that the effect may be rooted in how people interpret technological cues.

One possibility comes from the Elaboration Likelihood Model, a well-known theory of persuasion that proposes people process information through either careful analysis or quick mental shortcuts known as heuristics.

In this case, the AI disclosure label may act as a shortcut cue. Rather than prompting deeper scrutiny, it may signal to readers that the content is produced by sophisticated technology and therefore likely to be objective or data-driven.

Another explanation involves what researchers call the “machine heuristic”—the tendency for people to assume that computers and algorithms produce more objective information than humans.

According to the study, the label indicating machine involvement can trigger that mental shortcut, leading people to view AI-generated claims as more factual even when they are not.

A version of this dynamic can be seen on X, where users frequently ask the platform’s AI chatbot, Grok, to “fact-check” posts and then treat its responses as authoritative. That reliance persists despite a 2025 analysis by the Tow Center for Digital Journalism, which found that Grok 3 answered 94% of tested news-identification queries incorrectly.

The pattern underscores how an AI label or AI association can sometimes function less as a warning sign than as a credibility cue, encouraging people to accept machine-generated responses as ground truth when they may be anything but.

Yet AI labels may undermine trust in legitimate scientific explanations. Correct information often requires nuance and contextual explanation. When audiences believe that a machine produced the explanation rather than a human expert, they may discount the information’s credibility.

The Role of Attitudes Toward AI-Generated Content

The study also examined how personal attitudes toward artificial intelligence shape these credibility judgments.

As expected, participants who already held negative views of AI tended to distrust AI-generated content more strongly. However, even among these skeptical individuals, the credibility boost for misinformation did not disappear entirely.

This suggests that distrust of AI does not simply cause people to reject machine-generated information outright. Instead, attitudes toward AI interact with how different types of information are interpreted.

The researchers describe this pattern as an asymmetric form of “algorithm aversion,” in which people may distrust AI explanations while still accepting claims that appear factual or data-driven.

Implications for Science Communication

The findings carry important implications for journalists, researchers, and policymakers attempting to manage the spread of AI-generated misinformation.

Labeling content as AI-generated has become a central strategy in global debates over AI transparency and regulation. Yet the new research suggests that disclosure alone may not reliably help audiences identify false information.

Instead, the researchers warn that simple labels may create what they describe as a “heuristic trust trap,” where the disclosure unintentionally signals objectivity and increases trust in misleading content.

One possible solution could involve pairing AI labels with additional warnings. Researchers suggest that future systems might combine disclosure with explicit verification cues—such as notices that the information has not been independently confirmed—to encourage deeper evaluation by readers.

Another approach could involve more nuanced labeling systems that distinguish between factual accuracy and interpretive explanation, rather than treating all AI-generated content the same.
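As a thought experiment on what a more nuanced label might carry, the sketch below defines a hypothetical label payload that separates AI involvement from verification status and claim type, rather than relying on a single AI/not-AI flag. The field names and categories are invented for illustration and do not correspond to any existing platform standard or to a specific proposal in the study.

```python
# Hypothetical richer content label, invented for illustration only.
# It distinguishes how the content was produced, whether it has been
# verified, and what kind of claim it makes.
from dataclasses import dataclass
from typing import Literal

@dataclass
class ContentLabel:
    ai_involvement: Literal["none", "assisted", "fully_generated"]
    verification: Literal["independently_verified", "unverified", "disputed"]
    claim_type: Literal["factual", "interpretive", "opinion"]

    def display_text(self) -> str:
        """Compose a reader-facing notice pairing disclosure with a verification cue."""
        parts = []
        if self.ai_involvement != "none":
            parts.append("This content was produced with AI.")
        if self.verification == "unverified":
            parts.append("Its claims have not been independently confirmed.")
        elif self.verification == "disputed":
            parts.append("Its claims are disputed by fact-checkers.")
        return " ".join(parts) or "No special notice."

# Example: an AI-generated, unverified factual claim triggers both cues.
print(ContentLabel("fully_generated", "unverified", "factual").display_text())
```

The point of the sketch is simply that a disclosure and a verification warning can travel together, so the AI label is never shown in isolation as an implicit credibility signal.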

The research also underscores a broader challenge facing science communication in the AI era.

Unlike traditional misinformation campaigns, AI-generated content can produce large volumes of plausible explanations that mimic the tone and structure of legitimate science communication.

When audiences lack the expertise to verify complex scientific claims, they often rely on cues—such as author credentials, institutional reputation, or perceived technological sophistication—to judge credibility.

Findings from this recent study suggest that AI disclosure may itself become one of those cues, sometimes leading audiences to the wrong conclusion.

As generative AI continues to reshape the online information landscape, the researchers argue that understanding how audiences interpret these signals will be crucial.

Ultimately, the study raises a provocative question: in an age of AI transparency, could simply telling people that machines wrote something sometimes make misinformation more believable?

If so, the next generation of AI governance may need to move beyond transparency alone and rethink how credibility signals function in the digital world.

“The findings suggest AI disclosure may create what may be described as a heuristic trust trap, whereby the label activates a schema of objectivity, leading audiences to assume that the information is factually sound by default, potentially diminishing the likelihood of thorough verification,” researchers conclude. “This indicates a single disclosure label may be insufficient.”

Tim McMillan is a retired law enforcement executive, investigative reporter and co-founder of The Debrief. His writing typically focuses on defense, national security, the Intelligence Community and topics related to psychology. You can follow Tim on Twitter: @LtTimMcMillan.  Tim can be reached by email: tim@thedebrief.org or through encrypted email: LtTimMcMillan@protonmail.com