Beyond Reasoning: Reinforcement Learning Unlocks Parametric Knowledge in LLMs
…Our data-attribution study reveals that the hardest examples are the most informative: those whose answers never appear in 128 pre-RL samples (only ~18% of training data) drive ~83% of the…
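
To make the "never appears in 128 pre-RL samples" criterion concrete, here is a minimal sketch of how training examples might be partitioned. It is an illustration under stated assumptions, not the paper's procedure: the field names `answer` and `samples`, the helper names, and the use of normalized exact match to decide whether an answer "appears" are all hypothetical choices, since the excerpt does not specify them.

```python
from typing import Dict, List, Sequence, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (assumed matching rule, not from the paper)."""
    return " ".join(text.lower().split())


def is_never_sampled(gold: str, samples: Sequence[str]) -> bool:
    """True if the gold answer matches none of the pre-RL samples."""
    target = normalize(gold)
    return all(normalize(s) != target for s in samples)


def split_by_difficulty(dataset: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split examples into (never-sampled, sampled-at-least-once).

    Each example is assumed to hold a gold `answer` string and a list of
    `samples` drawn from the model before RL (128 per example in the text).
    """
    hard, rest = [], []
    for example in dataset:
        if is_never_sampled(example["answer"], example["samples"]):
            hard.append(example)
        else:
            rest.append(example)
    return hard, rest


if __name__ == "__main__":
    # Toy data with two samples per example instead of 128.
    toy = [
        {"answer": "Paris", "samples": ["paris ", "Lyon"]},
        {"answer": "Oslo", "samples": ["Bergen", "Stockholm"]},
    ]
    hard, rest = split_by_difficulty(toy)
    print(f"never-sampled fraction: {len(hard) / len(toy):.2f}")
```

In practice the matching rule matters: a stricter or looser notion of "appears" would change which examples fall into the never-sampled group, so the fraction reported by a sketch like this need not match the ~18% quoted above.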