Learning to Explore: Scaling Agentic Reasoning via Exploration-Aware Policy Optimization
…Empirically, we demonstrate that our approach achieves consistent improvements across a range of challenging text-based and GUI-based agent benchmarks. Code is available at https://github.com/HansenHua/EAPO-ICML26 and…