Search

Showing top 106 results for "Safety for agents"

Videos

NVIDIA Factory Operations Blueprint Gives Factories a New AI Brain

…helps developers build secure, centralized factory manager agents for orchestrating and optimizing specialized industrial AI agents for quality control, material transport and worker safety. Built with NVIDIA NemoClaw, AI-Q Blueprint and…

Jun 1, 2026 · Esther Lee

Apple research explores AI-assisted UI prototyping, image safety rating, more

…dataset for image safety rating. Apple's latest AI research explores how vibe-coding UI designs can be made easier. With Xcode 26.3, Apple introduced support for agentic coding tools…

Apr 7, 2026 · Marko Zivkovic

Why Perfect AI Alignment Is Mathematically Out Of Reach

…safety, but more serious about what safety actually requires. RELATED: OpenAI’s Moonshot: Solving the AI Alignment Problem IEEE Spectrum: How did you test your strategy? Zenil: We placed different AI agents…

May 4, 2026 · Charles Q. Choi

I used Claude Code, Google Antigravity, and Codex for a month and I have a clear winner for you

…Instead of waiting for one AI to finish a response, I can run multiple agents to work on different parts of a project. For instance, while one agent is refactoring a backend…

May 11, 2026 · Parth Shah

Discussions and forums

Hacker News · u/mosiddi · Jan 30, 2026

Show HN: Agent OS – Safety-first platform for building AI agents with VS Code

Hi HN, I built Agent OS because I was tired of the "orchestration tax" – writing the same safety checks, memory management, and tool-handling code in every AI agent project. What it does: - Visual policy edit…

Hacker News · u/arr0wassass1n · 1d ago

Show HN: Kintsugi – a local-first safety net for AI agents and humans

AI coding agents now run real shell commands on your machine — rm -rf, git push --force, DROP TABLE, dd, writes straight to disk. Almost always that's fine. The one time it isn't (a hallucinated path, a prompt-injected i…

Hacker News · u/lucarizzo1010 · 4w ago

Show HN: AgentShield – Stop AI agents from spending money unsupervised

I'm a recent grad from UMich and built AgentShield because agentic AI is moving fast but payment safety hasn't caught up. Agents are already being handed API keys, stablecoin wallets, and payment credentials - if one mis…

Hacker News · u/dreis_sw · 1w ago

Show HN: GitHub Copilot port of Anthropic's AI vulnerability discovery harness

Last week, Anthropic released https://github.com/anthropics/defending-code-reference-harne..., a reference harness for autonomous vulnerability discovery that uses Claude Code agents to find, verify, and patch memory-saf…

Hacker News · u/podlp · Apr 28, 2026

Show HN: iClaw is part OpenClaw, part Siri, powered by Apple Intelligence

Hi HN, last month at a SundAI hackathon, my team built a prototype for an app called iClaw. The goal was to develop an AI agent using Apple Intelligence. I've since continued hacking away at this idea when I had time, and…

Paper page - MedSkillAudit: A Domain-Specific Audit Framework for Medical Research Agent Skills

…Medical research agent skills require safeguards beyond general-purpose evaluation, including scientific integrity, methodological validity, reproducibility, and boundary safety. This study developed and preliminarily evaluated a domain-specific audit framework for medical…

May 7, 2026

Top stories

Paper page - The Cold-Start Safety Gap in LLM Agents

Google DeepMind and partners announce multi-agent safety research funding call.

For Robotaxis, Safety Must Be Built In, Not Bolted On

Paper page - When Behavioral Safety Evaluation Fails: A Representation-Level Perspective

Paper page - XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity

Paper page - One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

NVIDIA NeMo Agent Toolkit

Paper page - Sparse Autoencoders as Plug-and-Play Firewalls for Adversarial Attack Detection in VLMs

Why cloud native belongs at the heart of agentic AI: Lessons from building a multi-agent security platform on Kubernetes