AgentLens: Revealing The Lucky Pass Problem in SWE-Agent Evaluation
…We release the anonymized project repository, including the AgentLens-Bench dataset and AgentLens SDK, at https://github.com/microsoft/code-agent-state-trajectories/.