Beyond Reasoning: Reinforcement Learning Unlocks Parametric Knowledge in LLMs
…To put it directly, RL fundamentally optimizes the recall of latent knowledge.
2️⃣ The unexpected contribution of 0/128 samples: Remarkably, ~83% of the performance jump is driven by training on the…