Search: Prompt injection attacks

How we contain Claude across products

… This category includes both prompt injection and conventional attacks on the agent's runtime, orchestration layer, or proxy. …

May 25, 2026

Trustworthy agents in practice

… Defending against attacks: Prompt injections are malicious instructions hidden inside the content that an agent is asked to process. …

Apr 9, 2026

Claude Code auto mode: a safer way to skip permissions

… Because the input is identical other than the final instruction, stage 2's prompt is almost entirely cache-hit from stage 1. Why the prompt-injection probe matters: The transcript classifier's injection defense is structural as it never sees tool results. …

Mar 25, 2026

Introducing Sonnet 4.6

… You can find out more about how to mitigate prompt injections and other safety concerns in our API docs. …

Feb 17, 2026

Introducing Claude Opus 4.5

… With Opus 4.5, we’ve made substantial progress in robustness against prompt injection attacks, which smuggle in deceptive instructions to fool the model into harmful behavior. …

Nov 24, 2025

Scaling Managed Agents: Decoupling the brain from the hands

… In the coupled design, any untrusted code that Claude generated was run in the same container as credentials—so a prompt injection only had to convince Claude to read its own environment. …

Apr 8, 2026

Introducing Claude Opus 4.7

… On some measures, such as honesty and resistance to malicious “prompt injection” attacks, Opus 4.7 is an improvement on Opus 4.6; in others such as its tendency to give overly detailed harm-reduction advice on controlled substances, Opus 4.7 is modestly weaker. …

Apr 16, 2026
