Paper page - On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
… These results suggest that failed trajectories can provide structured repair supervision for safer self-evolving agents. …
… The substantial differences arise upstream of the chain, in claim-contract enforcement and deployment fit. A Norwegian public-sector procurement case comparing Borealis and Gemma 3 shows the resulting evidence in practice: which model is safer depends on the scenario category and the risk measure. …