Anthropic built Claude Mythos Preview to write code. What it learned to do instead is break it.
In internal testing over the past several weeks, the unreleased model found a 27-year-old vulnerability in OpenBSD, the security-focused operating system used to run firewalls and other critical infrastructure. The flaw allowed a remote attacker to crash any machine running the system with a single TCP connection. It also discovered a 16-year-old bug in FFmpeg, the foundational video codec library used by countless applications, hiding in a line of code that automated testing tools had executed five million times without ever catching the problem. It autonomously chained together Linux kernel vulnerabilities to escalate from ordinary user access to full machine control. And in another test, it wrote a web browser exploit that chained four separate vulnerabilities, using a JIT heap spray technique to escape both the renderer and operating system sandboxes.
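The FFmpeg anecdote turns on a general property of automated testing: code coverage is not bug detection. A hypothetical sketch (this is not the actual FFmpeg flaw; `parse_length` and the mask are invented for illustration) shows how a line can execute on every input yet only misbehave for values a naive fuzzer rarely generates:

```python
import random

def parse_length(header: int) -> int:
    # Buggy line: executed on every call, so any fuzzer "covers" it,
    # but the mask only misbehaves when the header exceeds 16 bits.
    return header & 0xFFFF  # bug: silently drops the high bits

# A fuzzer that draws 16-bit inputs exercises the buggy line 100,000
# times without ever observing a failure.
random.seed(0)
for _ in range(100_000):
    h = random.randrange(0x10000)
    assert parse_length(h) == h  # passes: no high bits to lose

# One crafted input outside that range exposes the truncation.
assert parse_length(0x10005) == 5  # high bits silently dropped
```

Coverage-guided fuzzers reward reaching new code, not new input shapes through old code, which is how a line can be "hit" millions of times while the triggering condition goes unexplored.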
These findings are described in detail in a technical post Anthropic published Tuesday on its Frontier Red Team blog, alongside a broader announcement of Project Glasswing, a twelve-member coalition that will use the model defensively: Amazon, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and Anthropic itself. More than 40 additional organizations building or maintaining critical software infrastructure have also been granted access to scan their own systems.
The vulnerabilities found so far represent only a small fraction of what Mythos Preview has identified. Anthropic said the model has uncovered thousands of high-severity zero-days across every major operating system and browser. Fewer than 1% have been patched, because the coordinated disclosure process takes time and Mythos keeps finding more. The technical details of most findings remain under wraps, with Anthropic publishing cryptographic hashes of the vulnerabilities today and committing to release specifics no later than 90 days after initial vendor notification, plus an additional 45-day buffer before full publication.
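Publishing hashes now works as a commit-then-reveal scheme: today's digest binds Anthropic to a specific report without disclosing it, and anyone can later check the published details against it. A minimal sketch, assuming a salted SHA-256 commitment (the scheme and field contents here are illustrative; Anthropic's actual hash format is not public):

```python
import hashlib
import os

def commit(report: bytes) -> tuple[str, bytes]:
    """Publish only the digest now; a random nonce keeps short or
    guessable reports from being brute-forced from the hash."""
    nonce = os.urandom(32)
    return hashlib.sha256(nonce + report).hexdigest(), nonce

def verify(report: bytes, nonce: bytes, digest: str) -> bool:
    """Anyone can recompute the hash once the report is released."""
    return hashlib.sha256(nonce + report).hexdigest() == digest

# Day 0: vendor notified privately; only the digest is published.
report = b"<full vulnerability report, withheld until disclosure>"
digest, nonce = commit(report)

# Up to 135 days later: report and nonce go public, and the hash
# proves the published text is what was committed to on day 0.
assert verify(report, nonce, digest)
assert not verify(b"altered report", nonce, digest)
```

The design choice matters for accountability: the commitment proves the finding existed on the announcement date, even though independent verification of its contents must wait for the reveal.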
Benchmark results from Anthropic's own evaluations paint a stark picture. On CyberGym, a test of cybersecurity vulnerability reproduction, Mythos Preview scored 83.1% against 66.6% for Claude Opus 4.6, Anthropic's previous best model. On SWE-bench Verified, a software engineering benchmark, Mythos achieved 93.9% versus 80.8% for Opus 4.6. The gap was even wider on exploit development: on a Firefox JavaScript engine benchmark, Opus 4.6 turned vulnerabilities into working exploits just twice in several hundred attempts, while Mythos Preview succeeded 181 times. On a five-tier severity ladder measuring crash depth, from basic crashes to full control-flow hijack, Mythos achieved ten tier-5 exploits on fully patched targets; Opus 4.6 managed a single tier-3.
The most notable aspect of these capabilities may be how they came to exist. "We have not trained it specifically to be good at cyber," Dario Amodei, Anthropic's co-founder and CEO, told Wired. "We trained it to be good at code, but as a side effect of being good at code, it is also good at cyber." This is not a cyber weapon Anthropic set out to build; the offensive skill emerged as a downstream property of general code reasoning, which raises the question of whether the properties Anthropic optimized for are the ones that matter most for safety.
Logan Graham, a researcher on Anthropic's safety team, put the timeline bluntly in the same Wired interview: "We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months. Many of the assumptions that we have built the modern security paradigms on might break."
The coalition itself is notable. Amazon, Apple, Google, and Microsoft do not routinely collaborate on security initiatives, let alone with financial institutions like JPMorgan Chase and cybersecurity firms like CrowdStrike and Palo Alto Networks. That they have agreed on the danger and agreed to share a model that none of them will be able to release publicly suggests the threat assessment is being taken seriously across a set of companies that rarely agree on anything.
Anthropic is committing up to $100 million in usage credits for Mythos Preview across Project Glasswing efforts, and has donated $4 million directly to open-source security organizations, including $2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and $1.5 million to the Apache Software Foundation. The model will not be sold on the open market. Post-trial pricing listed in VentureBeat's reporting ($25 per million input tokens, $125 per million output tokens) would make it the most expensive Claude model Anthropic has offered, and the one most explicitly priced around risk rather than capability.
The disclosure timeline is a structural compromise. Anthropic wants to give vendors time to patch before technical details go public, which is reasonable. But the 135-day window between initial report and publication means the security research community cannot independently verify what has been found, for how long, or in what configurations. That is a meaningful constraint on public accountability for a capability that affects software everyone depends on.
The long-term calculus on whether AI benefits defenders more than attackers remains genuinely open. Anthropic's own view, articulated in the Glasswing announcement, is that in equilibrium, powerful language models will advantage defenders who can direct resources and fix bugs before new code ships. The short-term risk is less clear. What Glasswing demonstrates concretely is that Anthropic can build something this capable. The question the announcement does not answer is who else already has, or soon will.
Primary sources: Anthropic Glasswing announcement · Anthropic Frontier Red Team technical blog · Wired · VentureBeat