Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning
…Prefix-Based Rollout Reuse in Agentic Search Training (2026)
Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model (2026)
Accelerating RL Post-Training Rollouts via…
