I qualify that as a mistake, not a hallucination - same as I wouldn't call "blue...

simonw · 2025-08-08T11:40:03 1754653203

I'm now running a follow-up poll on whether or not "there are 3 Bs in blueberry" should count as a hallucination and the early numbers are much closer - currently 41% say it is, 59% say it isn't. https://twitter.com/simonw/status/1953777495309746363

vrighter · 2025-08-08T13:22:16 1754659336

So? That doesn't change the fact that it fits the formal definition. Just because LLM companies have fooled a bunch of people into thinking they are different doesn't make it true.

If they were different things (objectively, not "in my opinion these things are different") then they'd be handled differently. Internally they are the exact same thing: wrong statistics, and they are "solved" the same way, with more training and more data.

Edit: even the "fabricated fact" definition is subjective. To me, the model saying "this is in first person" is the model confidently presenting a wrong thing as fact.

simonw · 2025-08-08T14:08:17 1754662097

What I've learned from the Twitter polls is to avoid the word "hallucination" entirely, because it turns out there are enough people out there with differing definitions that it's not a useful shorthand for clear communication.

strange_quark · 2025-08-08T16:38:16 1754671096

This just seems like goalpost shifting to make it sound like these models are more capable than they are. Oh, it didn't "hallucinate" (a term which I think sucks because it anthropomorphizes the model), it just "fabricated a fact" or "made an error".

It doesn't matter what you call it, the output was wrong. And it's not like something new and different is going on here vs whatever your definition of a hallucination is: in both cases the model predicted the wrong sequence of tokens in response to the prompt.

gnowlescentic · 2025-08-08T16:35:25 1754670925

My toddler has recently achieved professional level athlete performance[0]

0 - Not faceplanting when trying to run