Keeping Google Play & Android app ecosystems safe in 2025
…Upgrading Google Play’s AI-powered, multi-layered user protections: We’ve seen a clear impact from these safety efforts on Google Play. In 2025, we prevented over 1.75 million policy…
As noted above, we have deployed the classifier as an experimental addition to our Safeguards framework, monitoring a percentage of Claude traffic. Its real-world performance has confirmed that the classifier works effectively beyond our testing environment. Whereas our synthetic test data provided clear examples of harmful and benign exchanges, the distribution of actual user traffic proved more complex and surprising, yet the classifier still performed well. One example of how real-world deployment differs from testing is that the classifier flagged certain conversations about nuclear weapon
Developing Nuclear Safeguards for AI
AI News May 26, 2026 by Nick Farrell Open models stripped bare by safety-busting tools AI safety guardrails are being ripped off open models, leaving policymakers staring at a fresh and…
Evolving global regulations aimed at limiting child access to harmful content, as well as incidents and lawsuits related to youth safety online, have prompted Roblox to revise how it handles accounts for…
Valve is investigating a reported safety issue involving the charging puck bundled with its Steam Controller after a Reddit user claimed the accessory overheated and nearly caused a fire following an accidental…
…raises questions around safety, especially for enterprise consumers. To mitigate those risks, Google has used targeted adversarial training for the model. It is also introducing two new safeguards built into computer use…
June 18, 2026 Responsibility & Safety Securing the future of AI agents Rohin Shah and Four Flynn AI agents are transforming our relationship with technology. By autonomously executing complex tasks — from cyber defence…
…By stress-testing existing screening systems against AI-designed biological sequences, the project showed both where safeguards could fail and how they could be improved. The effort followed a familiar model from…
…safeguards to the broader biology and life sciences community so these capabilities can be used to accelerate biomedical research and drug discovery,” a spokesman said Wednesday. TOPICS: AI models · AI safety · anthropic…
Anthropic released Claude Fable, its first Mythos-class AI model, yesterday and it’s already causing concerns inside Microsoft. Sources tell me that Microsoft is limiting the use of Claude Fable 5…
Apple has stepped in to warn that EU proposals to force Google to open Android to competing AI services pose serious risks to user privacy, security, and safety. Apple's latest submission…