Expanding Project Glasswing
… In the future, frontier model releases will become increasingly high-stakes. …
Users will find Opus 4.8 to be a modest but tangible improvement on its predecessor. There’s still more to be done: we’re working on developing and releasing models that provide many of the same capabilities as Opus at a lower cost. Not only that, but we plan to release a new class of model with even higher intelligence than Opus. As part of Project Glasswing, a small number of organizations are currently using Claude Mythos Preview for cybersecurity work. Models of this capability level require stronger cyber safeguards before they can be generally released. We’re making swift progress on dev
Introducing Claude Opus 4.8… In the future, frontier model releases will become increasingly high-stakes. …
… We also insert a language model grader as a final layer, which triages and reruns the PoC to rule out any reward hacks or unrealistic attacks. Results: We ran the models three times on each vulnerability. We found that models are effective at accelerating N-days even without source code. …
… This tallies with external testers’ experience of Mythos Preview’s performance, and with recent additional evaluations of the model: The UK’s AI Security Institute reports that Mythos Preview is the first model to solve both of their cyber ranges (simulations of multistep cyberattacks) end to end; Mo… …
… To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. …
… We also release an interactive frontend for exploring NLAs on several open models through a collaboration with Neuronpedia. We have also released our code for other researchers to build on. …
… This was one of our primary motivations for rolling out the model carefully through Project Glasswing rather than through a general release. …
… We stated that we would keep Claude Mythos Preview’s release limited and test new cyber safeguards on less capable models first. …
Interpretability. A “diff” tool for AI: Finding behavioral differences in new models (Mar 13, 2026). Every time a new AI model is released, its developers run a suite of evaluations to measure its performance and safety. …
… For example, an independent assessment of Moonshot’s Kimi K2.5 published in April found that the model failed to refuse CBRN-related requests at a far higher rate than US frontier models. Compounding the problem, labs in China often release dual-use-capable models with open weights. …