I'm now running a follow-up poll on whether or not "there are 3 Bs in blueberry" should count as a hallucination and the early numbers are much closer - currently 41% say it is, 59% say it isn't. https://twitter.com/simonw/status/1953777495309746363
So? That doesn't change the fact that it fits the formal definition. Just because LLM companies have fooled a bunch of people into thinking they are different doesn't make it true.
If they were different things (objectively, not "in my opinion these things are different") then they'd be handled differently. Internally they are the exact same thing: wrong statistics, and they are "solved" the same way: more training and more data.
Edit: even the "fabricated fact" definition is subjective. To me, the model saying "this is in first person" is it confidently presenting a wrong thing as fact.
What I've learned from the Twitter polls is to avoid the word "hallucination" entirely, because it turns out there are enough people out there with differing definitions that it's not a useful shorthand for clear communication.
This just seems like goalpost shifting to make it sound like these models are more capable than they are. Oh, it didn't "hallucinate" (a term which I think sucks because it anthropomorphizes the model), it just "fabricated a fact" or "made an error".
It doesn't matter what you call it, the output was wrong. And it's not like something new and different is going on here vs whatever your definition of a hallucination is: in both cases the model predicted the wrong sequence of tokens in response to the prompt.
My definition of "hallucination" is evidently not nearly as widespread as I had assumed.
I ran a Twitter poll about this earlier - https://twitter.com/simonw/status/1953565571934826787
All mistakes by models — ~145 votes
Fabricated facts — ~1,650 votes
Nonsensical output — ~145 votes
So 85% of people agreed with my preferred "fabricated facts" one (that's the best I could fit into the Twitter poll option character limit) but that means 15% had another definition in mind.
And sure, you could argue that "this sentence is in first person" also qualifies as a "fabricated fact" here.
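For anyone who wants to sanity-check the 85% figure, here is a minimal sketch that recomputes the shares from the approximate vote counts listed above. The counts are the rounded "~" values from the poll, and the `vote_shares` helper is a hypothetical name used only for illustration, so treat the output as approximate.

```python
# Approximate vote counts as quoted above (the "~" values, so not exact).
votes = {
    "All mistakes by models": 145,
    "Fabricated facts": 1650,
    "Nonsensical output": 145,
}


def vote_shares(counts):
    """Return each option's share of the total vote as a percentage."""
    total = sum(counts.values())
    return {option: 100 * n / total for option, n in counts.items()}


shares = vote_shares(votes)
for option, pct in shares.items():
    print(f"{option}: {pct:.1f}%")
```

With these rounded counts, "Fabricated facts" lands at roughly 85% and the other two options split the remaining 15% about evenly.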