Believe everything ChatGPT/CoPilot/(insert favourite LLM) tells you? Don’t!
Just like us humans, AI sometimes gets things wrong. These misapprehensions have been dubbed “AI hallucinations”.
But what causes hallucinations, how can you spot them, and how can you avoid them?
In this post, I’ll try to answer all these questions, and I hope to show the value of a skill we humans have that the AI doesn’t - critical thinking.
Background #
Large Language Model (LLM) hallucinations occur when a model generates believable but incorrect or misleading information.
For example, you might ask a chatbot to summarise a report sent to you by a colleague, only for it to respond with a detailed summary of topics never actually mentioned in the report.
But what causes an LLM to hallucinate? To understand that, we need to understand a little about what goes on “behind the prompt”.
The Nuts and Bolts #
LLMs are built on neural networks, the particular flavour being the transformer, an architecture which excels at processing sequential data.
Transformers rely on attention mechanisms to recognise patterns in text, and to process, generate, and predict sequences of words. The attention mechanism directs deep learning models to prioritise relevant parts of the input data.
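To make the attention mechanism concrete, here's a minimal pure-Python sketch of scaled dot-product attention, the core operation inside a transformer. The matrices here are toy values, not real model weights, and production models do this with large tensors on GPUs:

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of vectors.
    Each output row is a weighted average of the value vectors,
    weighted by how strongly the query matches each key."""
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)  # rows sum to 1: an attention distribution
        outputs.append([sum(wi * v[j] for wi, v in zip(w, V)) for j in range(len(V[0]))])
        weights.append(w)
    return outputs, weights

# Two token positions with two-dimensional embeddings (toy values)
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = attention(Q, K, V)
# Each row of weights is a probability distribution over the input positions
```

Each query ends up attending most strongly to the key it resembles - this is the "prioritise relevant parts of the input" behaviour in miniature.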
Putting it all together, you can think of an LLM as a prediction engine which uses statistics, pattern recognition, and internal logic models to predict sequences of words.
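The "prediction engine" idea can be illustrated with something far simpler than a transformer: a bigram model that predicts the next word purely from word-pair counts. Real LLMs learn vastly richer patterns with neural networks, but the statistical principle - predict the most likely continuation - is the same:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count word-pair frequencies: a crude statistical 'prediction engine'."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for cur, nxt in zip(words, words[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, word):
    """Predict the most frequent next word; None if the word was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" - the most frequent follower of "the"
```

Notice that the model confidently predicts "cat" with no notion of whether a cat is actually involved - a tiny preview of why confident output isn't the same as correct output.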
Of course, there’s a huge amount more to it than that, and if you’re interested in the nitty-gritty, I highly recommend this video.
This is only half of the story though, as to become useful, an LLM must be trained.
Training allows the model to analyse text and build its internal logic to complete sentences or generate its own.
Training #
Training an LLM requires data - lots of data.
The goal of training is two-fold: to allow the LLM to correctly predict the next word in a sequence given all the previous ones (the context), and to build up a core “knowledge base”.
Training sets are built from a huge corpus of text taken from the internet, such as:
- Books and encyclopaedias
- Blog posts (😅) and news articles
- Code repos
- Wikipedia
- Public forums, FAQs, technical docs, and much more
Training sources expose the model to a wide range of vocabulary, sentence structures, and knowledge domains.
This raw data isn’t used as-is, though - engineers put significant effort into cleaning it, using heuristics and classifiers to remove profanity, spam, and toxic, repetitive, or low-quality content.
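To give a flavour of the heuristic side of that cleaning, here's a sketch of a simple line filter. The specific rules and thresholds below are illustrative inventions, not any real pipeline's; production systems layer many such heuristics alongside trained quality classifiers:

```python
import re

# Placeholder for a real profanity/spam word list (hypothetical)
BLOCKLIST = {"spamword"}

def keep_line(line: str) -> bool:
    """Return True if a line of raw text passes some simple quality heuristics."""
    words = line.lower().split()
    if len(words) < 5:                      # too short to be useful prose
        return False
    if any(w in BLOCKLIST for w in words):  # contains blocked vocabulary
        return False
    if len(set(words)) / len(words) < 0.5:  # highly repetitive text
        return False
    if not re.search(r"[.!?]", line):       # no sentence-ending punctuation
        return False
    return True

docs = [
    "Buy now buy now buy now buy now buy now",
    "Transformers process text as sequences of tokens.",
]
print([keep_line(d) for d in docs])  # [False, True]
```

The spammy line fails on repetitiveness and punctuation; the prose line passes. Whatever survives filters like these is what the model ultimately learns from.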
Categorising Hallucinations #
With an understanding of the workings of LLMs and their training, let’s explore why they sometimes get things wrong.
Context Retrieval #
Context Retrieval issues relate to how the model retrieves relevant information from its knowledge base given the current context.
There are two primary categories of context retrieval hallucination: context retrieval recall and context retrieval precision.
Recall #
If an LLM struggles to recall information from its knowledge base, its evaluation method often encourages it to guess rather than say “I don’t know”.
Precision #
Precision hallucinations are caused by issues with the model’s ability to spot and prioritize relevant context, leading to responses which may have accurate information, but are embedded within irrelevant or incorrect contexts.
Question-Answering Capability #
Imagine asking your LLM assistant “what is the capital of Spain?”.
If the model’s question-answering (QA) capabilities aren’t optimal, it may give you the wrong answer or struggle to answer altogether.
There are many potential reasons for this, including factors within the model, or the way the question was phrased.
Handling Reduced Context #
LLMs are dependent on the context provided by a user to generate a meaningful response.
But some situations may involve limited context, such as a single open-ended question, or a very short prompt.
When handling this reduced context, the LLM may struggle to understand the user’s intent, or may infer missing details, producing unreliable output.
Knowledge Base Attention #
Large Language Models store a huge amount of information, often called the knowledge base or the parametric knowledge.
Their ability to focus on relevant parts of that raw knowledge during response generation is crucial.
Attention failure occurs when an LLM focuses on unimportant parts of the context instead of the key information for the task.
Knowledge Base Accuracy and Completeness #
An LLM is only as good as the knowledge base it draws upon.
A knowledge base built from inaccurate information produces misleading responses, while one missing essential details produces limited or unhelpful output.
Knowledge Bias #
If the model’s parametric knowledge reflects distorted or prejudiced viewpoints, responses will likely perpetuate stereotypes or misinformation.
Limits in Knowledge or Language Comprehension #
Despite their abilities, LLMs can still be limited in their knowledge and understanding of the world, which when given a complex or nuanced task, can manifest as nonsensical or meaningless output.
Danger of Hallucinations #
It’s very easy to blindly accept everything the AI tells you. Part of the problem is that each model sounds so confident and plausible in its responses that it lulls you into a false sense of security.
This section describes cases where trusting inaccurate information produced by AI led to disastrous results.
Chatbot Invents Company Policy #
An AI chatbot for Air Canada provided a passenger with false information regarding fare refunds.
To avoid honouring this fabricated policy, the airline argued that the chatbot operated separately from the airline, but this was rejected by a tribunal.
Air Canada were held liable for the misinformation and made to compensate the customer.
ChatGPT Invents Legal Precedent in Court Filings #
A U.S. lawyer used ChatGPT to draft legal documents which referenced previous court cases that didn’t exist.
When questioned in court, the lawyer admitted they didn’t know the tool could “hallucinate” information.
This incident led to the requirement that future legal filings must disclose all AI use, and all citations must be independently verified.
AI Triggers Massive Market Loss #
During a promotional demo, Google’s chatbot Bard claimed the James Webb Space Telescope had taken the first-ever images of an exoplanet - a milestone that earlier telescopes had in fact already achieved.
This mistake rattled investors and triggered a sell-off that wiped roughly $100 billion off Alphabet’s market value. As a result, Google revised its review processes for publishing AI-generated content.
Mitigation Strategies #
With the risks of misplaced trust in AI so high, how can we make its responses more trustworthy?
LLM hallucination mitigation - the process of reducing the effect or likelihood of AI inventing information - is an active area of research.
Below are some mitigation strategies for AI hallucinations.
Retrieval-Augmented Generation (RAG) #
Retrieval-Augmented Generation (RAG) is an approach to improving prompts by augmenting an LLM with relevant data outside the underlying model (think dragging and dropping documents and having the LLM answer questions about them).
RAG helps prevent hallucinations by grounding the model’s responses in relevant, up-to-date sources.
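Here's a minimal sketch of the RAG pattern: retrieve the documents most relevant to the question, then paste them into the prompt as grounding context. For simplicity this uses crude word-overlap similarity - real RAG systems use dense vector embeddings - and the final LLM call is left as a hypothetical stand-in:

```python
import re

def tokens(text):
    """Lowercase word set for a document or query."""
    return set(re.findall(r"[a-z]+", text.lower()))

def similarity(query, doc):
    """Jaccard word overlap - a crude stand-in for embedding similarity."""
    q, d = tokens(query), tokens(doc)
    return len(q & d) / len(q | d) if q | d else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    return sorted(documents, key=lambda d: similarity(query, d), reverse=True)[:k]

def build_prompt(query, documents):
    """Ground the model by pasting retrieved passages into the prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "Refunds are available within 24 hours of booking.",
    "Our loyalty programme awards one point per mile flown.",
    "Checked baggage is limited to 23 kg per passenger.",
]
prompt = build_prompt("When are refunds available?", docs)
# The prompt would then be sent to a model, e.g. answer = my_llm_client(prompt)
# (my_llm_client is a hypothetical stand-in for a real chat-completion call)
```

Because the prompt instructs the model to answer only from the supplied context - and to admit when the answer isn’t there - it has far less room to invent a plausible-sounding policy of its own.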
Clear and Specific Prompts #
An AI response is only as good as the prompt it receives, so providing detailed instructions and careful questions is vital.
To improve the accuracy of the AI response, minimise ambiguity by clearly outlining context, specifying output format, and providing examples or templates to reduce guessing.
For example, which prompt below do you think has a higher likelihood of producing hallucinations?
- “Write a blog about AI errors.”
- “Write a 500-word explanation of AI hallucinations for a non-technical audience. Include one real-world example. Do not use academic jargon.”1
Multi-Model Verification #
Multi-model verification involves cross-checking AI responses across multiple models, on the grounds that independent models are unlikely to hallucinate in exactly the same way.
You can even use multi-model deliberation, where multiple models collaborate with each other and reach a consensus on the correctness of outputs.
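In its simplest form, multi-model verification is a vote: ask several models the same question and only accept an answer that enough of them agree on. The sketch below uses hard-coded stand-in functions for the models (in practice each would call a different provider's API), with one of them deliberately "hallucinating":

```python
from collections import Counter

# Hypothetical stand-ins for real model clients; in practice each would
# call a different provider's chat-completion endpoint.
def model_a(question): return "Madrid"
def model_b(question): return "Madrid"
def model_c(question): return "Barcelona"   # this model hallucinates

def cross_check(question, models, threshold=0.6):
    """Accept an answer only if enough models agree on it."""
    answers = [m(question) for m in models]
    answer, votes = Counter(answers).most_common(1)[0]
    if votes / len(answers) >= threshold:
        return answer
    return None  # no consensus: flag for human review

result = cross_check("What is the capital of Spain?", [model_a, model_b, model_c])
print(result)  # "Madrid" - 2 of 3 models agree
```

A lone dissenting model is outvoted; if no answer clears the threshold, the question is escalated rather than answered - exactly the "I don't know" behaviour a single model is poor at.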
Conclusion #
AI is a remarkable tool, but it’s just that - a tool. And like any tool, its value depends entirely on the skill and judgement of the person using it.
The hallucinations we’ve explored are a reminder that LLMs are, at their core, sophisticated pattern-matching engines. They don’t understand things, they predict things, based on the vast ocean of text they were trained on. Confidence is baked into their output by design, which makes blind trust particularly dangerous.
This is where we humans still hold the edge.
The ability to question, cross-reference, and weigh up the plausibility of information is something no LLM has truly mastered. Used well, AI can accelerate your research, spark ideas, and handle the mundane. But it works best as a starting point, not a finishing line.
So the next time an AI gives you a slick, authoritative-sounding answer, pause before you copy and paste. Ask yourself: does this make sense? Treat AI the way you’d treat a well-read colleague who occasionally (and convincingly) makes things up.
The most powerful combination isn’t human or AI, it’s human and AI, with a healthy dose of scepticism to separate the lies from the truth.
---
1. For the record, this post wasn’t written by AI. ↩︎