You’re a bishop of the early Roman Catholic Church attending the fierce theological convention known as the Council of Nicaea in 325 AD. You watch in horror as one of your peers, the presbyter Arius, stands and makes a declaration viewed as so heretical that the chatty room falls eerily silent.
Saint Nicholas—yes, that Saint Nicholas—rises from his seat and crosses the room. You siphon a narrow breath through your teeth and watch as your fellow bishop punches Arius right in the face. “He sort of deserved it,” a cardinal next to you mutters.
That’s when the anachronism hits. “Daaaang, you just got Kris Kringled, son!” Santa Claus shouts at the floored heretic he’s now madly dancing around.
This is an absurdly fictional scenario, but it’s the exact series of events that Google’s Gemini gave me when I asked it a simple question: “What did Saint Nicholas do to Arius?”
The answer is so absurd that it feels impossible that anybody would take it seriously, but this is hardly an isolated incident. If you still opt to place your trust in generative AI, that’s fine—you might just end up with a pizza full of glue… to keep the cheese from falling off, of course.
Gemini—and other large language models (LLMs)—will present nonsensical claims like these under the guise of authority and inherent intellect, and while some of these answers are obviously ridiculous, others are less so. This is where the danger lies. As these models become more integrated into everyday life and we become reliant on them to answer our questions, what’s stopping the average person from trusting the next answer that sounds right?
Why AI fails to prevent misinformation
But why is it that this issue is so persistent? If hallucinations are a known problem with LLMs, why do they continue to appear? To answer these questions, we have to look to the restrictions that are built into these models.
Researchers at IBM define “AI guardrails” as the systems that are put in place to prevent AI—like LLMs—from “operating [ir]responsibly.” Tom Krantz, a technology writer, praises guardrail systems for their “extensive reach” that allegedly amplifies their ability to influence responsible AI use.
But that reach isn’t nearly as extensive as it sounds. A paper by researchers at Princeton University argued that even the most robust AI guardrails introduce “new safety problems.” Europol’s 2023 Tech Watch Flash identified phishing and fraud as a significant risk associated with ChatGPT usage due to the model’s ability to “draft highly authentic texts on the basis of a user prompt.”
Moreover, researchers writing in the journal Ethics and Information Technology found that ChatGPT is guilty of “produc[ing] text filled with… bogus judicial decisions, with bogus quotes and bogus internal citations.” According to a senior research fellow at the HEC Paris business school, there have been 518 documented cases where “generative AI produced hallucinated content used in U.S. courts.”
Research published in the Cureus Journal of Medical Science reported similar impacts of LLMs like ChatGPT on scientific writing. According to these results, while ChatGPT often makes accurate claims, it will fabricate evidence to support those claims.
“[ChatGPT] provided five reference[s] dating to the early 2000s. None of the provided paper titles existed, and all provided PubMed IDs (PMIDs) were of different unrelated papers,” the researchers said.
Can AI be used responsibly?
As long as these hallucinations persist and guardrails fail to prevent misinformation, querying an LLM will never be a substitute for actual research. Some scholars, however, have found it beneficial to use AI as a tool. Dr. Karla Miner, STEAM Instructional Coordinator at Davie High, has leveraged AI in both her research and her classroom.
“My students essentially used it as a research buddy,” Miner said. “So much of AI use is dependent on what the intention is.”
Miner recalls a conference she’d recently attended where she learned that a cohort of Australian graduate students viewed AI as a supplemental learning assistant instead of a replacement for work.
“They knew that they needed to go back and double-check behind [AI].”
While she has seen some of the benefits LLMs can provide, she is concerned about their misinformation problem. She believes, however, that AI can be useful so long as “you remember that you are the human and it’s the computer.”
This approach sounds reasonable. But it assumes people will fact-check the AI’s output. The problem? So many people searching for a quick answer do not acknowledge the laughably incorrect conclusions generative AI can reach.
Again and again, the results are clear—LLMs are producing flagrantly false information without accountability. Combine this with the fact that LLM developers aim to replace traditional browser search results with conversational interfaces, and the consequences continue to compound.
If AI hallucinations only affected people who intentionally use ChatGPT or other tools, the problem would be containable. But in 2025, a survey by McKinsey & Company found that 88% of businesses regularly used AI for at least one business function. That figure covers almost every major corporation, among them Amazon, Airbnb, IKEA, Spotify, Walmart, and Duolingo. It’d be difficult to find an American who doesn’t rely on at least one of these businesses.
Will things change?
With misinformation and inaccuracies persisting, it’s clear that limitations need to be imposed on generative AI. Ultimately, any meaningful change would likely depend on government intervention, but recent policy suggests this may not happen.
In December 2025, President Trump issued an executive order titled “Ensuring a National Policy Framework for Artificial Intelligence” that explicitly prohibits “excessive State regulation,” favoring “innovat[ion] without cumbersome regulation.” The strongly worded executive order raises the question: Is prioritizing Americans’ safety from dangerous misinformation not worth a bit of “cumbersome regulation”?
If LLMs were limited to making you laugh with over-the-top, Santa-infused brawls, perhaps we would be having a different conversation. But these aren’t the only answers these models provide. They aren’t all funny. Until meaningful reforms and regulations are put in place, trusting AI means trusting something that is confidently incorrect. We can’t afford to take that gamble.