Anthropic built an AI model that can find serious security bugs in existing software. Then it found thousands of them. Fewer than 10 percent have been fixed. That is not a warning about what AI might do to cybersecurity. That is an accounting of damage already done.
The model, called Claude Mythos Preview, achieved what security researchers describe as a step-change reduction in the difficulty of finding vulnerabilities. In a single automated run across 7,000 open-source software projects, Mythos produced 595 program crashes at lower severity levels and achieved full control-flow hijacks on ten separate, fully patched targets the researchers classify as tier 5, the most serious class of remotely exploitable bugs. One of the bugs it found had been sitting in OpenBSD, a widely respected operating system, for 27 years. These findings come from Anthropic's own technical disclosure.
The findings matter because Mythos did this without specialist security knowledge. Anthropic's own engineers, who have no background in vulnerability research, asked the model to find remote code execution bugs overnight. They woke to complete, working exploits, according to Anthropic's technical blog. "Mythos Preview can look across a very complex architecture, including this legacy infrastructure where, frankly, these undiscovered vulnerabilities and complexities are now accessible to threat actors," said TJ Marlin of Guardrail Technologies, speaking to GV Wire.
Anthropic disclosed the results on April 10. Reuters and the New York Times reported on the disclosure within hours. U.S. Treasury Secretary Scott Bessent and former NEC director Lloyd Bentsen warned bank CEOs about Anthropic model risks in separate briefings, Reuters reported. Within days, government officials in the United States, Canada, and Britain had met with top banking officials to discuss the implications, according to GV Wire. The Cloud Security Alliance, an industry group, warned that the model represents a trajectory that lowers the cost and skill floor for discovering and exploiting vulnerabilities faster than organizations can patch them.
The most uncomfortable detail is also the simplest: Anthropic found thousands of vulnerabilities. Fewer than 10 percent have been patched. The rest remain, in software already deployed, in systems already running, available to whoever asks an AI model the right question. Anthropic has not published the specific vulnerabilities it found and says it cannot do so without creating exact exploit blueprints that would make the problem worse. That logic is self-consistent. It does not make the overhang smaller.
There is a counterforce worth naming. Research from the Gray Swan organization and reporting by Fortune suggest that several of the vulnerabilities Anthropic highlighted could have been detected by freely available open-weight models, ones that anyone can download and run without charge. If that is correct, the capability Anthropic is warning about is not confined to its own systems. Spencer Whitman of Gray Swan described the core challenge precisely: finding vulnerabilities requires locating weak points buried within millions of lines of code and verifying that those weak points actually produce working exploits. Mythos, apparently, completed both steps autonomously. The skill floor for that kind of research just dropped for everyone.
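Whitman's two-step framing, locate candidate weak points and then confirm they actually produce working crashes, is the same loop a basic fuzzer runs, just at vastly smaller scale. A minimal sketch of that loop follows; the toy parser and all names here are hypothetical illustrations, not anything from Anthropic's or Gray Swan's actual tooling:

```python
import random
import string

def toy_parser(data: str) -> str:
    """A deliberately fragile stand-in for a real codebase.

    Expects input shaped like "<length>:<payload>"; anything else
    raises, which is our toy proxy for a crash.
    """
    head, _, payload = data.partition(":")
    n = int(head)  # raises ValueError on a non-numeric prefix
    if n != len(payload):
        raise ValueError("length prefix does not match payload")
    return payload

def fuzz(target, trials: int = 2000, seed: int = 0) -> list[str]:
    """Step 1: locate weak points by throwing random inputs at the target
    and recording every input that makes it blow up."""
    rng = random.Random(seed)
    crashers = []
    for _ in range(trials):
        candidate = "".join(
            rng.choice(string.printable) for _ in range(rng.randint(1, 8))
        )
        try:
            target(candidate)
        except Exception:
            crashers.append(candidate)
    return crashers

def verify(target, crashers: list[str]) -> list[str]:
    """Step 2: replay each candidate to confirm the crash reproduces,
    separating real findings from one-off flukes."""
    confirmed = []
    for c in crashers:
        try:
            target(c)
        except Exception:
            confirmed.append(c)
    return confirmed

crashers = fuzz(toy_parser)
confirmed = verify(toy_parser, crashers)
print(f"{len(confirmed)} reproducible crashes out of {len(crashers)} candidates")
```

The gap the article describes is between this kind of brute-force search, which finds shallow bugs, and a model that can reason about millions of lines of unfamiliar code and then carry the same find-then-verify loop all the way to a working exploit.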
Jonathan Iwry of the Wharton School put the governance problem plainly in comments to Fortune: the world is relying on the judgment of a handful of private actors who are not accountable to the public. Anthropic made a deliberate choice to disclose the risk publicly while withholding the specific vulnerabilities. Whether that balance is right is not something the public can evaluate, because the methodology behind the private warnings has not been made available for scrutiny.
What happens next is not clear. The vulnerabilities cannot be disclosed without risk of exploitation. Patching them requires the maintainers of thousands of independent projects to act, many of whom have no idea their code was scanned. Axios reported that the U.S. government held related briefings across multiple agencies. The public record contains a measured, quantified fact: the bugs exist, most of them are still there, and the model that found them is not the only one capable of looking.