The US government just forced Anthropic to pull its most advanced AI models
… The core of the issue appears to center on a discovered “jailbreak” — a technique used to bypass an AI model’s built-in safety guardrails. …
… The core of the issue appears to center on a discovered “jailbreak” — a technique used to bypass an AI model’s built-in safety guardrails. …
… It adds that Sol will have the company’s “most robust safety stack to date,” with strengthened protections against higher-risk activity, sensitive cyber requests, and repeated misuse. …
… Another area of friction was training the AI on sexual content. This posed as a challenge due to the models previously being trained to avoid such conversations for safety reasons. This isn’t the first time OpenAI has made news this week. …