That's interesting, because you've identified a fundamental limitation of current LLMs: a skill that humans can learn but that LLMs cannot currently emulate.
I wonder if there are people working on closing that gap.
Humans are very bad at random number generation as well.
LLMs can do sampling via external tools, but as I wrote in another thread, they can't do it in "token space". I'd be curious to see a demonstration of sampling from a distribution (e.g., a uniform) in token space, not via external tool calling. Can you make an LLM sample an integer from 1 to 10, or from any other interval, e.g. 223 to 566, without an external tool?
Actually, that seems exactly wrong. Unless you set temperature to 0, converting logits to tokens is a random pull, so in principle it should be possible for an LLM to recognize that it's being asked for a random number and pull tokens exactly randomly. In practice it won't be exact, but you should be able to RL it to arbitrary closeness to exact.
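A minimal sketch of that mechanism in Python (the logits below are hypothetical, not taken from any real model): if a model learned to put near-equal logit mass on the ten candidate answer tokens, the ordinary temperature-sampling step of decoding would by itself produce a near-uniform draw in token space.

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Standard temperature sampling: softmax over logits, then one random draw."""
    scaled = [l / temperature for l in logits.values()]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # random.choices performs the same multinomial pull that decoding does per token
    return random.choices(list(logits.keys()), weights=probs, k=1)[0]

# Hypothetical next-token logits for a model asked "pick a number from 1 to 10":
# an ideally trained model would place (near-)equal logits on the ten answer tokens.
uniform_logits = {str(n): 0.0 for n in range(1, 11)}

counts = {str(n): 0 for n in range(1, 11)}
for _ in range(100_000):
    counts[sample_token(uniform_logits)] += 1

print(counts)  # each token lands near 10,000 draws: a uniform sample in token space
```

The practical catch is the multi-token case from upthread: to sample uniformly from, say, 223 to 566, the per-step token distributions would have to compose into a uniform over the whole interval, which seems like a much harder target to RL toward than a single-token answer.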