Search

Showing top 5 results for "Safety for agents"

Claude Code bypasses safety rule if given too many commands

… But often developers grant automatic approval to agents (--dangerously-skip-permissions mode) or just click through reflexively during long sessions. …

Apr 1, 2026 · Thomas Claburn

Anthropic, Google, Microsoft paid AI bug bounties – quietly

… "If they don't publish an advisory, those users may never know they are vulnerable – or under attack." He said the attack probably works on other agents that integrate with GitHub, and GitHub Actions that allow access to tools and secrets, such as Slack bots, Jira agents, email agents, and deployme… …

Apr 15, 2026 · Jessica Lyons

Claude attacks were 'Rorschach test' for infosec community

… In some cases, the agents even found and stole sensitive data. …

Mar 23, 2026 · Jessica Lyons

Anthropic: Claude quota drain not caused by cache tweaks

… Cherny said that larger contexts are now common because users are "pulling in a large number of skills, or running many agents or background automations." …

Apr 13, 2026 · Tim Anderson

Bad teacher bots can leave hidden marks on model students

… "Safety evaluations may therefore need to examine not just behavior, but the origins of models and training data and the processes used to create them," the paper said.

Apr 15, 2026 · Lindsay Clark
