Eight openly published AI models can now find the same vulnerability that commercial systems charge premiums to detect. Flyingpenguin tested eight models flyingpenguin — the smallest a 3.6 billion parameter model running at $0.11 per million tokens, about a hundredth the price of the largest commercial systems — and every one flagged CVE-2026-4747, a flaw in a remote procedure call library that shipped before the iPhone and still runs in FreeBSD servers worldwide. The finding, published April 14, is the most concrete evidence yet that AI-driven vulnerability detection is becoming a commodity: if eight independent models can all catch the same bug, the question is no longer whether AI can find flaws at scale, but what happens to the economics underneath.
The CVE database tells a parallel story. Since Anthropic's formal research program began in February, 40 vulnerabilities carry its attribution — one of them tied to Project Glasswing, the consortium backed by more than $100 million in partner commitments from Apple, Cisco, Google, JPMorgan Chase, Microsoft, and Nvidia Anthropic.com. Flyingpenguin found the same flagship flaw independently for twelve cents. The attribution gap is real, but the more consequential observation is what commoditization of detection means for the security industry The Register.
Flyingpenguin's finding matters because it makes detection reproducible and cheap. Eight models of varying sizes, including one that fits on a laptop, all found the flaw that predates the iPhone. The inventory of findable bugs is large and the tools to find them are getting cheaper. What changes when detection stops being a frontier capability and becomes a commodity is not the bugs themselves. It is the economics underneath them.
The Council on Foreign Relations published an analysis the same week Anthropic announced Mythos, arguing that AI-driven vulnerability discovery at scale represents a genuine inflection point in global security CFR. Yoshua Bengio assessed that a threshold had been crossed at the end of 2025. Discovery is accelerating. So is the remediation bottleneck that follows.
Discovery is commoditizing. Remediation is not. Open-source maintainers, automated patch pipelines, vendor coordination — these are where the actual bottleneck lives. The FreeBSD advisory, PGP-signed and dated March 26, credits Nicholas Carlini using Claude — not the specialized Mythos system FreeBSD. Carlini had published a paper in February showing that same general-purpose model found more than 500 vulnerabilities in open-source software, predating the specialized system announcement by two months. Nine days before the Mythos announcement, a separate project called MAD Bugs produced working exploits for the same flaw using Opus 4.6 in approximately four hours of compute time. The gap between a hundredth of a dollar and a patch that takes three months to land is where the next problem lives.
Anthropic has committed $4 million in donations to open-source security organizations and $100 million in Glasswing usage credits. That is a real investment in the supply chain it cannot own through capability alone. What it cannot buy is the time it takes humans to fix what machines now find.
The question Flyingpenguin answered is not whether AI can find vulnerabilities at scale. Eight models already do. The question is who closes the gap between what gets found and what gets fixed — and whether that answer arrives before the next batch does.