I'm wondering about this too. Would be nice to see an ablation here, or at least see some analysis on the reasoning traces.
It definitely doesn't wipe its internal knowledge of Crystal clean (that's not how LLMs work). My guess is that it slightly encourages the model to explore more and second-guess it's likely very-strong Crystal game knowledge but that's about it.
It definitely doesn't wipe its internal knowledge of Crystal clean (that's not how LLMs work). My guess is that it slightly encourages the model to explore more and second-guess it's likely very-strong Crystal game knowledge but that's about it.