Large Language Models Don’t “Hallucinate”
Using this term attributes to LLMs properties they don't have while ignoring the real dynamics behind the production of their made-up information
You hear it everywhere. Whenever someone discusses the tendency of Large Language Models (LLMs) like ChatGPT to make up facts or present fictional information, they say the LLM is hallucinating. Beyond simply being a term used by the media, it is also used by researchers and laypeople alike to describe any instance in which an LLM produces text that, in one way or another, does not correspond to reality.
Despite its prevalence, the term is, at best, somewhat deceptive and, at worst, actively counterproductive to thinking about what LLMs are actually doing when they produce text that is deemed problematic or untrue. It seems to me that "hallucination" is a bad term for several reasons. It attributes to LLMs properties they don't have while ignoring the real dynamics behind the production of made-up information in their outputs.
While it is clear the latest language models, such as ChatGPT and Claude, are increasingly impressive in what they can produce (and I don’t take stock in the out-of-touch perspective of Chomsky and others), there is still a lot of room for scrutiny in how we talk and think about them.
It is first perhaps useful to examine the history of the term "hallucination." It was introduced into English in the mid-17th century in a medical context to refer to visual experiences disconnected from sensory information. The word itself comes from the Ancient Greek ἀλύω, meaning to "wander in mind." Today, the term has a precise clinical meaning within psychology and neuroscience, one that refers to specific kinds of disturbances in perceptual experience.
This brings us to the first problem with the term "hallucination" when applied to LLMs. Hallucinations in the medical context are conscious phenomenal experiences of perception. In humans, these can be divided into two broad categories: pseudohallucinations and true hallucinations. Pseudohallucinations are phenomenal experiences that one understands not to reflect the external world or the sensory information it provides. A common example is the kind of experience individuals report on psychedelic drugs such as psilocybin.
Despite the fact that an individual may report seeing a number of bizarre or supernatural things, they are (almost) always aware of them being internally generated. A basic version of this kind of hallucination can be produced simply by closing one’s eyes and pressing the palms against the eyelids.
In the case of a true hallucination, one actually experiences the hallucination as being "real" in the sense that it seems to be a genuine change in sensory perception, one that cannot be distinguished from other sensory perceptions. As such, it is experienced as part of the external rather than the internal world. These experiences are typically reported by individuals during psychotic episodes or by those who have consumed an anticholinergic drug such as atropine.
In both cases, hallucinations refer to changes in one's perceptual experience. Right away, an issue presents itself, though, since LLMs do not have any perceptual experiences to speak of. All they do is consume and produce text. Even if one were inclined to be poetic and refer to the model's input as its sensory experience and its output as a kind of motor response, it still doesn't make sense, because the so-called "hallucinations" of the model are in the output it produces rather than in its ability to process the input provided to it.
For the sake of argument, we could go even further and assume we live in a world in which LLMs are capable of some kind of phenomenal experience driven by sensory input. Even then, their production of false text still would not count as a hallucination, because there is no way to know whether an LLM has truly experienced the contents of the false information it outputs or is simply inclined to report false information.
As humans, we can invent all kinds of fictional information without once actually hallucinating in any sense of the word. It is precisely this capacity that allows fantasy writers to still function in their daily lives, despite producing text which often bears no resemblance to everyday reality. The same would be true for our hypothetical "experiencing" LLMs.
For the sake of argument, we could further suppose that the LLM is not simply playing or simulating a role but, in some sense, does indeed "believe" that the text it is producing corresponds to some actual experience or knowledge. In this case, "delusion" would actually be a much better term than "hallucination." Rather than referring to sensory perceptions which do not correspond to actual sensory information, delusions are beliefs that do not correspond to the state of reality. Like true hallucinations, delusions most often occur in individuals experiencing psychotic episodes or other forms of severe mental disorder.
Here again, we run into a similar issue as before. Just as there is no sense in which LLMs have phenomenal experience, there is likewise no sense in which they can be said to hold specific beliefs about the world. That said, arguments have been made that the representations within the networks themselves could "mean" something in a way analogous to meaning in our own brains. Meaning and belief are two different things, however. Even if an internal representation of "dog" captures the meaning of a dog, an LLM outputting the text "I am a dog" says nothing about the actual beliefs of the network. It could simply be stating false information or making up facts because it is simulating a specific role or persona for which such language would be appropriate. This is especially apparent in cases where querying an LLM about the truthfulness of its made-up facts sometimes leads it to concede that they are not true.
If not "hallucination" or "delusion," then what term would be appropriate? Though it is less than polite language, I actually agree with a number of commentators who suggest "bullshitting" as a term that captures quite accurately what is really going on in these models. Lying, like delusion, would require the model to hold some true beliefs about the world; bullshitting, by contrast, suggests the model has no meaningful relationship to either beliefs or truth, which is indeed the state of current models.
I have also heard "confabulation" used, which likewise avoids connotations of belief or perception. For my own taste, the more prosaic "making things up" captures the essence of what is going on just as well. All three are certainly better than the continued trend toward anthropomorphic language for LLMs, particularly language whose meaning is tied to human psychopathology. Such language only reinforces the misguided notion that these tools are (or could one day be) sentient, which distracts from both the real risks and the real promises of AI.