Search

Showing top 121 results for "AI safety"

Paper page - When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels

…Sushant Gautam. Abstract. Comparative safety scoring without labeled benchmarks relies on scenario-based audits with validity chains measuring responsiveness, variance dominance, and stability to establish deployment evidence. AI-generated summary. Many deployments…

May 8, 2026

I used Claude Code, Google Antigravity, and Codex for a month and I have a clear winner for you

…Anthropic is primarily an AI safety and research company. Its founding mission is rooted in making AI that is safe and understandable, which is why safety-focused training methods like Constitutional AI…

May 11, 2026 · Parth Shah

3 new ways Ads Advisor is making Google Ads safer and faster

…Summaries were generated by Google AI. Generative AI is experimental. Bullet points This article highlights three new safety features in Google Ads Advisor, making Google Ads safer and faster. Ads Advisor now…

Apr 21, 2026 · Priya Baliga

Roblox releases agentic AI tools for creators, promising ability to "build a game with a single prompt"

Roblox has released an agentic AI feature for its Roblox Studio tool for game creators, which it claims will turn text prompts into a game design document which the tool can then…

Apr 17, 2026 · Jon Hicks, Editorial Director

Discussions and forums

Hacker News · u/mosiddi · Jan 30, 2026

Show HN: Agent OS – Safety-first platform for building AI agents with VS Code

Hi HN, I built Agent OS because I was tired of the "orchestration tax" – writing the same safety checks, memory management, and tool-handling code in every AI agent project. What it does: - Visual policy edit…

Hacker News · u/lucarizzo1010 · 1w ago

Show HN: AgentShield – Stop AI agents from spending money unsupervised

I'm a recent grad from UMich and built AgentShield because agentic AI is moving fast but payment safety hasn't caught up. Agents are already being handed API keys, stablecoin wallets, and payment credentials - if one mis…

Hacker News · u/podlp · Apr 28, 2026

Show HN: iClaw is part OpenClaw, part Siri, powered by Apple Intelligence

Hi HN, Last month at a SundAI hackathon, my team built a prototype for an app called iClaw. The goal was to develop an AI agent using Apple Intelligence. I've since continued hacking away at this idea when I had time, and…

Hacker News · u/rbuccigrossi · 4d ago

Show HN: Decoding the Language Machine – AI video series and CC repo

Hi HN! I released 3 parts of an educational video series (out of 6 planned), paired with a GitHub repository containing scripts and artifacts (released under Creative Commons). - Main Site: https://skepticcto.com/ (includ…

r/LocalLLaMA · u/OttoRenner · 3d ago

Stop traumatizing AI into loops and turn hallucinations into an honest "I don't know!" by being NICE to them (Proof of Concept, Research, I don't want to sell anything)

!UPDATE! (20.05.2026) WE HAVE NEW NUMBERS FROM 1.500+ TESTS, IT'S WORKING! Check my update post https://www.reddit.com/r/LocalLLaMA/s/AyNOehjkYT or go straight to my GitHub https://github.com/OttoRenner/Gentle-Codi…

Top stories

Illinois Lawmakers Just Passed America’s Strongest AI Safety Bill

Trump scrapped a major AI safety plan — here’s why that matters for ChatGPT users

Apple Provides Update on App Store, Highlights Key 2025 Safety Stats

Former OpenAI Staffers Warn That xAI’s Poor Safety Record Could Complicate SpaceX’s IPO

I replaced Claude Pro with a local 9B model for a week, and finally found out what I was paying $20 a month for

Families of Tumbler Ridge shooting victims sue OpenAI - Engadget

Claude Code with a local LLM running offline is the hybrid setup I didn't know I needed

I use OpenCode over Claude Code, and it's every bit as good

I replaced ChatGPT and Claude with this powerful local LLM and saved over $20 a month while gaining full control

Mira Murati tells the court that she couldn’t trust Sam Altman’s words