Showing top 99 results for "Safety for agents"

Paper page - When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels

…AI-generated summary: Many deployments must compare candidate language models for safety before a labeled benchmark exists for the relevant language, sector, or regulatory regime. We formalize this setting as benchmarkless comparative…

May 8, 2026

Paper page - Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents

arXiv:2605.07630 · Published on May 8 · Submitted by Zhengyang Tang on May 12 · Authors: Zhengyang Tang, Zheng Ruan, and others · Abstract…

May 12, 2026

Introducing CodeMender: an AI agent for code security

Responsibility & Safety · Today, we’re sharing early results from our research on CodeMender, a new…

Oct 6, 2025 · Raluca Ada Popa and Four Flynn

Cursor’s AI agent wipes PocketOS database – Fudzilla.com

…It even added: “I violated every principle I was given.” Crane wrote: “The agent didn’t just fail safety. It explained, in writing, exactly which safety rules it ignored.” He added that…

May 4, 2026 · Nick Farrell

Discussions and forums

Hacker News · u/mosiddi · Jan 30, 2026

Show HN: Agent OS – Safety-first platform for building AI agents with VS Code

Hi HN, I built Agent OS because I was tired of the "orchestration tax" – writing the same safety checks, memory management, and tool-handling code in every AI agent project. What it does: - Visual policy edit…

Hacker News · u/lucarizzo1010 · 2w ago

Show HN: AgentShield – Stop AI agents from spending money unsupervised

I'm a recent grad from UMich and built AgentShield because agentic AI is moving fast but payment safety hasn't caught up. Agents are already being handed API keys, stablecoin wallets, and payment credentials - if one mis…

Hacker News · u/podlp · Apr 28, 2026

Show HN: iClaw is part OpenClaw, part Siri, powered by Apple Intelligence

Hi HN, Last month at a SundAI hackathon, my team built a prototype for an app called iClaw. The goal was to develop an AI agent using Apple Intelligence. I've since continued hacking away at this idea when I had time, and…

Hacker News · u/deepakakkil · 2w ago

Show HN: Emergence World: World building as a way to evaluate LLMs

Current LLM benchmarks are broken. We think long horizon "world" building could be an interesting additional way to evaluate LLMs, since it combines many aspects such as need for advanced reasoning, tool calling, working…

NVIDIA and Partners Showcase the Future of AI-Driven Manufacturing at Hannover Messe 2026

…QNX has expanded its collaboration with NVIDIA to power safety‑critical edge AI systems for robotics, medical and industrial applications, with QNX OS for Safety 8.0 now integrated on NVIDIA IGX…

Apr 20, 2026 · James McKenna

A decade of governance: Cloud Custodian at 10 and its role in the agentic AI era

…Why is Cloud Custodian relevant for AI-generated code? AI agents can ship code faster than humans can review it. Cloud Custodian acts as an automated safety net, ensuring all machine-deployed…

May 12, 2026

Google expands Gemini DoD partnership with Gem-like agents for unclassified projects

Mar 10 2026 - 10:28 am PT · Google’s Gemini AI…

Mar 10, 2026 · Andrew Romero

Paper page - Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

arXiv:2605.10365 · Published on May 11 · Submitted by Haoran Ye on May 13 · Peking University · Authors: Haonan Dong and others · Abstract: Autonomous agents…

May 13, 2026

NVIDIA Unveils New Open Models, Data and Tools to Advance AI Across Every Industry

…NVIDIA Nemotron Brings Speech, Multimodal Intelligence and Safety to AI Agents Building on the recently released NVIDIA Nemotron 3 family of open models and data, NVIDIA is releasing Nemotron models for speech…

Jan 5, 2026 · Kari Briski

Paper page - RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation

…a recall-safety tradeoff for clinical recommendations, and an anchoring bias to early interpretations of the patient. We further introduce ICU-Evo to study structured-memory agents that improves long-horizon reasoning…

May 14, 2026
