Why Do LLMs Hallucinate?

Large Language Models (LLMs) like ChatGPT have captivated researchers, businesses, and casual users alike. Their ability to generate human-like text often feels magical, but beneath the surface lies a significant limitation: hallucination. This term refers to instances where LLMs produce confident yet incorrect or outright fabricated answers. But why does this happen? And what does it say about the limits of these systems?

At the core of LLMs is autoregressive prediction. When an LLM generates text, it does so by predicting the next word in a sequence, one word at a time, based on probabilities learned from its training data. Each word is chosen because the model “thinks” it is the most likely one to follow. However, this process is not infallible. There’s always a non-zero chance that the model selects a wrong word, and over a long response these small errors accumulate: the chance that the output stays accurate from start to finish shrinks roughly exponentially with its length.

Imagine trying to walk a straight line while blindfolded. With each step, a tiny error in direction may occur. Over time, those small deviations add up, leading you far off course. This is essentially what happens when an LLM produces a response. The longer the output, the more likely it is to “drift” into nonsensical territory.
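To put that drift in numbers, here is a minimal sketch in Python (not from the article, and the 1% per-token error rate is invented purely for illustration). It assumes each generated token is correct independently with a fixed probability, so the whole response is only correct if every token is.

# Minimal sketch: how a small per-token error rate compounds over a response.
# Assumption (illustrative only): tokens are correct independently with
# probability (1 - error_rate).

def chance_fully_correct(error_rate: float, num_tokens: int) -> float:
    """Probability that every one of num_tokens tokens is correct."""
    return (1 - error_rate) ** num_tokens

for tokens in (50, 200, 1000):
    print(f"{tokens:>5} tokens at a 1% per-token error rate: "
          f"{chance_fully_correct(0.01, tokens):.2%} chance of an error-free response")

Under these toy assumptions, a 50-token answer comes out error-free about 60% of the time, a 200-token answer about 13% of the time, and a 1,000-token answer almost never: the blindfolded walk, in numbers.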

The Long Tail of Prompts

Fine-tuning can help reduce hallucinations. By training LLMs on specific datasets and refining their responses to common questions, developers can make the models perform well on familiar prompts. However, this approach has significant limits.

The number of possible inputs a user might generate is virtually infinite. This is known as the “long tail” problem. LLMs can be trained to handle the most commonly encountered prompts well, but the long tail of rare and unusual prompts is enormously vast and unpredictable. Even small variations in phrasing or context can trigger hallucinations. For instance, substituting one word with its equivalent in another language can completely derail an otherwise accurate response.

Moreover, some prompts are specifically designed to exploit the weaknesses of LLMs. By feeding the system an unusual sequence of characters or deliberately ambiguous input, users can “jailbreak” the model, pushing it into gibberish or unrelated answers.

Even perfectly sensible prompts, though, produce hallucinations more often than you might expect.

Training Limitations

Part of the cause lies in the training process itself. LLMs learn from large datasets, but these datasets represent only a tiny fraction of all possible knowledge. That is enough for them to replicate patterns and produce seemingly intelligent responses, but they lack true understanding - and do a good job of convincing you otherwise!

In essence, LLMs are not reasoning machines. They do not comprehend the content they generate; they merely predict what comes next based on probabilities. This makes them powerful tools for summarisation or pattern recognition but unreliable when it comes to novel or complex queries.
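To see why fluency and truth can come apart, here is a toy illustration in Python (not a real LLM; the probability table is invented for the example). It runs the same kind of autoregressive loop described above, sampling each next word from a table of likelihoods, and at no point does anything in the loop check whether the resulting sentence is true.

import random

# Hypothetical next-word probabilities, keyed by the previous word.
# The words and numbers are made up purely to illustrate the sampling step.
NEXT_WORD_PROBS = {
    "the":       {"capital": 0.5, "city": 0.3, "answer": 0.2},
    "capital":   {"of": 1.0},
    "of":        {"australia": 0.6, "france": 0.4},
    "australia": {"is": 1.0},
    "france":    {"is": 1.0},
    "is":        {"sydney": 0.5, "canberra": 0.3, "paris": 0.2},
}

def generate(start: str, max_words: int = 6) -> str:
    """Autoregressively sample one word at a time from the probability table."""
    words = [start]
    for _ in range(max_words):
        options = NEXT_WORD_PROBS.get(words[-1])
        if not options:
            break
        # Pick the next word in proportion to its probability - no fact-checking anywhere.
        next_word = random.choices(list(options), weights=list(options.values()))[0]
        words.append(next_word)
    return " ".join(words)

print(generate("the"))  # e.g. "the capital of australia is sydney": fluent, confident, wrong

The loop only ever asks which word is likely to come next, never whether the claim it is assembling is true, and that gap is exactly where hallucinations live.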

The Implications for Research

For researchers and academics, the limitations of LLMs are particularly significant. Hallucinations can lead to the dissemination of incorrect information, undermining the credibility of sources and of AI-assisted research. As reliance on these tools grows, it’s crucial to understand their limitations and to use them with caution.

While LLMs can be valuable aids, they are far from infallible. Recognising their weaknesses is the first step toward using them responsibly, and using them less.
