Search

Showing top 4 results for "LLM-driven engineering"

Building Effective AI Agents

Engineering at Anthropic: Building effective agents. Over the past year, we've worked with dozens of teams building large language model (LLM) agents across industries. …

Dec 19, 2024

Demystifying evals for AI agents

… LLMs have progressed from 40% to 80% on this eval in just one year. …

Jan 9, 2026

Eval awareness in Claude Opus 4.6’s BrowseComp performance

… New contamination sources appear continuously, driven by the research community’s practice of using benchmark questions as worked examples in papers. …

Mar 6, 2026

Anthropic Economic Index report: Economic primitives

… In our productivity work, Claude’s time estimates correlate with actual time spent on software engineering tasks. …

Jan 15, 2026
