The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development?
…To ensure evaluation integrity, this framework is secured by multi-layer defenses against reward hacking. Leveraging this framework, we demonstrate that meta-agents rarely match human-engineered baseline policies, and the…