
tootyskooty 8 days ago | on: Gemini 3 Pro vs. 2.5 Pro in Pokemon Crystal

I'm wondering about this too. Would be nice to see an ablation here, or at least some analysis of the reasoning traces. It definitely doesn't wipe its internal knowledge of Crystal clean (that's not how LLMs work). My guess is that it slightly encourages the model to explore more and second-guess its likely very strong knowledge of Crystal, but that's about it.

The model probably recognizes the need for a grassroots effort to solve the problem, to "show its work".