Introducing Sonnet 4.6
…agents in a workflow, and problems where getting it just right is paramount. For Claude in Excel users, our add-in now supports MCP connectors, letting Claude work with the other tools…
…We’ve been talking a lot on our Engineering Blog about how to set up AI agents for success, and much of it involves giving them the correct tools. Could we apply…
…To read all the bug reports our agent found, see https://mmaaz-git.github.io/agentic-pbt-site/. Maintainer validation Evaluating the effectiveness of any tool which discovers bugs in code is…
…by pre-defining some code-based “tools” that allow the model to more easily take actions on computer networks. The researchers then tasked this agent with emulating attacks against a cyber-physical…
Science Long-running Claude for scientific computing Mar 23, 2026 In this post, Siddharth Mishra-Sharma, a researcher on the Discovery team, explains how to apply multi-day agentic coding workflows—test…
…An agent that writes lean, efficient code very fast will do well under tight constraints. An agent that brute-forces solutions with heavyweight tools will do well under generous ones. Both are…
…Claude can make good use of autonomy and tools The HackTheBox competition also demonstrated the agentic capabilities of Claude. Once our researcher started the script late, he went back to moving into…
…improved “scaffolding” (additional tools and training like we mentioned above) is a straightforward path by which Claudius-like agents could be more successful. General improvements to model intelligence and long-context performance…
Frontier Red Team LLMs with cyber toolkits can conduct multistage cyber operations on business-sized computer networks Jun 13, 2025 Anthropic (with Carnegie Mellon University’s CyLab) Large Language Models (LLMs) that…
…connect the tools you already use, and pick the job. Claude does the work; you approve before anything sends, posts, or pays. It ships with 15 ready-to-run agentic workflows across…