Cyber toolkits for LLMs
Frontier Red Team: LLMs with cyber toolkits can conduct multistage cyber operations on business-sized computer networks. Jun 13, 2025. Anthropic (with Carnegie Mellon University’s CyLab). Large Language Models (LLMs) that…
…connect the tools you already use, and pick the job. Claude does the work; you approve before anything sends, posts, or pays. It ships with 15 ready-to-run agentic workflows across…
…This is similar to how AI models used existing software editing tools like string-replace when they made the transition to more agentic coding. We are plausibly entering the early era of…
…There is no ATT&CK ID for this type of agentic orchestration—yet these are precisely the behaviors we expect to see much more of as AI agents become more capable. Looking…
Alignment: Donating our open-source alignment tool. May 7, 2026. In October 2025, we launched Petri, an open-source toolbox of alignment tests that can be applied to any large language model…
…Claude 4 models outperform other frontier models as research agents across financial tasks in Vals AI's Finance Agent benchmark. When deployed by FundamentalLabs to build an Excel agent, Claude Opus 4…
…Key to both these goals is integrating Claude into the tools the creative industry already knows and trusts. Today, we’re releasing a set of connectors—tools that let Claude work alongside…
…Together, we will develop secure, industry-specific AI products for the Japanese market, starting with tools for finance, manufacturing, and local government. “This long-term partnership with Anthropic enables NEC to maximize…
…The result is: /etc/pam.d/common-password /usr/bin/systemd-ask-password /usr/bin/systemd-tty-ask-password-agent /usr/share/initramfs-tools/hooks/cryptsetup-nuke-password /usr/share/ri/3…
…fact that modern large language models are generally-capable agents that can already reason about how to best make use of the tools available. To ensure that Claude hadn’t hallucinated bugs…