Representation over Routing: Diagnosing Temporal Routing Pathologies in Multi-Timescale PPO
… Code and reproducible scripts are open-sourced in the repo. The core idea that really sticks is target decoupling: keep multi-timescale predictions on the critic for auxiliary representation learning, while actor updates are driven only by long-horizon advantages. This separation seems to block… …
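To make the target-decoupling idea concrete, here is a minimal NumPy sketch of how the two losses could be separated. It is not the paper's implementation: the function names `gae` and `decoupled_losses`, the one-critic-head-per-discount layout, the GAE estimator, and the clipping constant are all assumptions chosen for illustration.

```python
import numpy as np


def gae(rewards, values, gamma, lam=0.95):
    """Generalized advantage estimation for a single trajectory.

    `values` has length T + 1: the last entry is the bootstrap value.
    """
    T = len(rewards)
    adv = np.zeros(T)
    running = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        adv[t] = running
    return adv


def decoupled_losses(rewards, values_per_head, gammas, log_ratio, clip=0.2):
    """Hypothetical decoupled PPO losses.

    values_per_head: array of shape (H, T + 1), one critic head per discount.
    gammas:          H discount factors, one per head.
    log_ratio:       log(pi_new / pi_old) per step, shape (T,).
    """
    values_per_head = np.asarray(values_per_head, dtype=float)
    advs = np.stack(
        [gae(rewards, v, g) for v, g in zip(values_per_head, gammas)]
    )
    # Regression targets per head (treated as constants in a real
    # autograd framework; NumPy has no gradients to stop).
    returns = advs + values_per_head[:, :-1]

    # Critic: every timescale head is trained on its own return,
    # so the multi-timescale predictions act as auxiliary targets.
    critic_loss = np.mean((returns - values_per_head[:, :-1]) ** 2)

    # Actor: only the longest-horizon advantage drives the policy update.
    long_adv = advs[int(np.argmax(gammas))]
    ratio = np.exp(log_ratio)
    surrogate = np.minimum(
        ratio * long_adv,
        np.clip(ratio, 1.0 - clip, 1.0 + clip) * long_adv,
    )
    actor_loss = -np.mean(surrogate)
    return actor_loss, critic_loss
```

In this sketch, changing the short-horizon head's values alters only the critic loss, while the actor loss depends solely on the head with the largest discount factor. That is the separation the comment describes, under the assumptions stated above.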