Search: privacy & security

Trustworthy agents in practice

… Below, we walk through examples drawn from three: human control, alignment with user expectations, and security. Our other two principles—transparency and privacy—run through each. …

Apr 9, 2026

Measuring LLMs’ ability to develop exploits

… It was developed as a collaboration between UC Berkeley, the Max Planck Institute for Security and Privacy, UC Santa Barbara, and Arizona State University with contributions from security researchers at Anthropic, OpenAI, and Google, as a follow-on to the CyberGym vulnerability-reproduction benchm… …

May 22, 2026

How people ask Claude for personal guidance

Results from first Anthropic Public Record

Measuring AI agent autonomy in practice

Anthropic Economic Index report: Economic primitives

Claude Fable 5 and Claude Mythos 5

How AI Is Transforming Work at Anthropic