When an AI solves nineteen out of twenty software engineering problems where the best public model stops at sixteen, the question is no longer performance but control. This is exactly what happens with Claude Mythosnew frontier model of Anthropicconsidered too dangerous to be put into everyone’s hands.
Reserved for a few hand-picked partners, this Claude Mythos Preview already serves as the basis for the publisher’s future security ecosystem. Anthropic believes it marks the “D-day” when an AI becomes competent enough to represent, on its own, a major risk to the world’s digital infrastructures.
Claude Mythos: scores that explode classic benchmarks
On SWE-bench Verifiedreference for automatic bug correction in open source projects, Claude Mythos achieved 93.9% success compared to 80.8% for Claude Opus 4.6. On the hardened version SWE-bench Proit climbs to 77.8% when Opus settles for 53.4%.
Same observation on advanced reasoning tests: 94.6% on GPQA Diamondabove the typical level of PhD experts. Concretely, Mythos handles complex code with more accuracy and consistency than the best current models supporting senior developers.
Cybersecurity: why Claude Mythos panics defenders
However, it is in safety that the gap becomes dizzying. On CyberGymdedicated to vulnerability analysis, Mythos reached 83.1% compared to 66.6% for Opus. On Cybench and its 35 Capture The Flag type tests, it achieves a perfect score of 100%, to the point of making the benchmark obsolete for this generation of models.
In just a few weeks of testing, Anthropic says Mythos has identified thousands of zero-day vulnerabilities in every major browser and operating system, including a 27-year-old bug in OpenBSDpassed through millions of automated scans. “The window between the discovery of a vulnerability and its exploitation by an adversary has collapsed: what took months now happens in minutes with AI,” warns Elia Zaitsev, CTO of CrowdStrike.
Who will have access to Claude Mythos, and in what highly monitored setting?
Faced with this offensive potential, Anthropic decided. “We do not plan to make Claude Mythos Preview available to the general public due to its cybersecurity capabilities,” confirmed Newton Cheng, head of the Frontier Red Team Cyber, in an interview cited by IT for Business. The model is only accessible to members of the Project Glasswinga defensive coalition bringing together a few giants like Apple, Google, Microsoft and Nvidia as well as the Linux Foundation.
Anthropic is promising up to $100 million (around €93 million) in Mythos credits and several million dollars in donations to strengthen open source security. “No organization can solve these cybersecurity problems alone,” warns the publisher in its launch post. For CIOs, a message is essential: whether they like it or not, the next arms race in cybersecurity will be played out with, or against, AI of the caliber of Claude Mythos.