Showing top 55 results for "agent safety focus"

How we contain Claude across products

…The second approach to capping the blast radius, and the focus of much of this post, is containment. Rather than supervising what the agent does, we supervise what it’s able…

May 25, 2026

Introducing The Anthropic Institute

…Public Policy focuses on the areas where Anthropic has defined priorities and perspectives, including model safety and transparency, energy ratepayer protections, infrastructure investments, export controls, and democratic leadership in AI. Sarah Heck…

Mar 11, 2026

Building Effective AI Agents

Discover how Anthropic approaches the development of reliable AI agents. Learn about our research on agent capabilities, safety considerations, and technical framework for building trustworthy AI.

Dec 19, 2024

The Long-Term Benefit Trust

…AI Safety Institute. In January 2026, Kanika Bahl stepped down to begin a new nonprofit, the AI Access Initiative, and Zach Robinson stepped down to focus on non-profit and philanthropic work…

Sep 19, 2023

Partnering with Mozilla to improve Firefox’s security

…a trusted method of confirming whether an AI agent’s output actually achieves its goal. Task verifiers give the agent real-time feedback as it explores a codebase, allowing it to iterate…

Mar 6, 2026

I stopped using Claude for coding, but now I can't live without it for everything else

…Interestingly, the topic of conversation was no longer strictly focused on Claude's excellent coding capabilities. Don't get me wrong, people were still relying on Claude for coding. But somewhere along…

May 18, 2026 · Mahnoor Faisal

2028: Two scenarios for global AI leadership

…While increasing numbers of researchers in China’s AI labs and policy community are concerned with AI safety risks, this trend has not translated into safety practices on par with labs in…

May 14, 2026

Project Vend: Can Claude run a small shop? (And why does that matter?)

…Finally, in a world where larger fractions of economic activity are autonomously managed by AI agents, odd scenarios like this could have cascading effects, especially if multiple agents based on similar…

Jun 27, 2025

Orchestrating AI Code Review at scale

…promptText }], agent: input.agent, model: { providerID, modelID }, }, }); Each sub-reviewer runs in its own OpenCode session with its own agent prompt. The coordinator doesn't see or control what tools the sub…

Apr 20, 2026 · Ryan Skidmore

How people ask Claude for personal guidance

…we saw sycophantic behavior in 38% of conversations focused on spirituality, and 25% of conversations on relationships. We chose to focus model training efforts on relationship guidance as the domain with the…

Apr 30, 2026
