The Bug Report Is Dead. Long Live the Bug Report.
When something breaks in your software, the ancient ritual is simple: you write down what you did, what you expected, and what happened instead. That bug report — three or four sentences from a human who actually saw it — is the raw signal that maintainers have used for decades to fix things. It is also, now, being destroyed.
Armin Ronacher has the numbers to prove it. Ronacher, the creator of Flask and Rye and a principal at Earendil-Works, spent 90 days pulling public GitHub data from the issue tracker for Pi, the AI coding agent that his colleague Mario Zechner has been building. The count: 3,145 external issues and pull requests from non-team members. Of those, 2,504 were auto-closed because they came from unapproved accounts — a 79.6 percent auto-close rate. Of the ones that made it through, fewer than 10 percent of pull requests were merged. Seventeen percent got reopened after a human took a second look.
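The headline rate is easy to check from the two counts quoted above. This is a minimal sketch that uses only those figures; the variable and function names are illustrative and do not come from Ronacher's own analysis.

```python
# Counts as quoted in the article for the 90-day window.
external_submissions = 3145  # issues and pull requests from non-team members
auto_closed = 2504           # closed automatically (unapproved accounts)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


auto_close_rate = percent(auto_closed, external_submissions)
not_auto_closed = external_submissions - auto_closed

print(f"auto-close rate: {auto_close_rate}%")
print(f"submissions not auto-closed: {not_auto_closed}")
```

By that arithmetic, roughly one submission in five got past the automatic filter at all.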
The numbers are ugly, but they are not the real complaint. The real complaint is in the quality of what came through.
"The most frustrating failure mode is that people submit issues that are not in their own voice. They contain an observed problem somewhere, but it has been thrown into a clanker and the clanker reworded it and made a huge mess of it. Typically, it was prompted so badly that the conclusions produced are more often than not inaccurate but always full of confidence."
What arrives in the issue tracker is not a human's description of a bug. It is a human's description of a bug, processed through an AI that hallucinated a root cause, invented a minimal reproduction case, suggested three implementation strategies, and cited the wrong part of the codebase. The prose is confident. The stack trace is real. The diagnosis is useless. And because Pi itself uses issue descriptions as prompts — as inputs to a second AI that will try to reproduce and fix the problem — that confident wrong analysis gets baked into the next step of the process. Pi has a workaround: a slash command called /is that explicitly tells the agent not to trust the issue analysis and to derive its own diagnosis from the code. Ronacher admits it does not fully work.
This is not a niche problem. Daniel Stenberg, who maintains curl, ended the project's bug bounty program on January 31 of this year. Over six years the program had paid out more than $100,000 across 87 confirmed vulnerabilities, with roughly 15 percent of submissions turning out to be real. Starting in 2025, that rate collapsed to below 5 percent. "Not even one in twenty was real," Stenberg wrote. "The never-ending slop submissions take a serious mental toll to manage and sometimes also a long time to debunk." He also noticed a change in tone: AI-assisted reporters argued harder, submitted with more entitlement, and were less interested in actually fixing the problem. Stenberg left the bounty platform he had been using and moved to direct GitHub reporting. He has also banned and publicly ridiculed people who sent AI slop. "I believe the best and our most valued security reporters still will tell us when they find security vulnerabilities," he wrote. "Without the incentive of money."
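To put the two validity rates in triage terms, here is a rough sketch of how many reports a maintainer has to read per genuine vulnerability. It assumes every report costs about the same to evaluate, which is a simplification: Stenberg's own point is that slop can take a long time to debunk. The function name is hypothetical.

```python
def reports_per_valid_finding(valid_rate: float) -> float:
    """Expected number of reports triaged for each genuine vulnerability."""
    if not 0 < valid_rate <= 1:
        raise ValueError("valid_rate must be in (0, 1]")
    return 1 / valid_rate


# Rates as quoted in the article. The second is an upper bound,
# since the article says the rate fell to *below* 5 percent.
before = reports_per_valid_finding(0.15)
after = reports_per_valid_finding(0.05)

print(f"before 2025: about {before:.1f} reports per real vulnerability")
print(f"from 2025: at least {after:.0f} reports per real vulnerability")
```

Under that assumption the triage load per real finding roughly tripled, before counting the extra time a confident but wrong report takes to refute.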
GitHub has responded with a feature that lets projects disable pull requests entirely. Whether many projects are actually using it is not well documented, but the fact that it exists is itself an indicator of where the ecosystem thinks this is heading.
Ronacher's diagnosis is the most honest framing I have seen: the problem is not volume. The problem is that the act of describing what happened — the human's job — is being outsourced to a machine that cannot do it reliably. When you delegate the description, you do not just introduce noise. You remove the thing that made the feedback loop work in the first place. A bad human bug report is vague and incomplete, but it is honest. A bad AI bug report is precise, confident, and wrong, and it takes more time to refute than a vague human one.
What Ronacher wants instead is simpler. "What the human actually observed is enough," he says. "If you used an LLM to understand the problem, great, maybe leave it as a follow-up comment. But the issue and the issue text should be something you own. If your repro is a guess, say that."
The irony is that the people who built the tools generating this slop — Ronacher, Zechner, the broader Earendil-Works team — are the first ones documenting its costs in detail. They are not warning from outside. They are inside the machine, watching it eat the feedback mechanism they depend on. That is what makes the story different from the usual AI-is-coming-for-X lamentation. This is a direct observation from people who built the thing and are now measuring its consequences with GitHub data.
Whether this generalizes beyond a handful of high-profile open source projects is the open question. The curl experience suggests it does: the collapse in valid vulnerability reports was sharp enough and recent enough that Stenberg could point to a before and after with real numbers. The next thing to watch is whether the bug bounty model, which has been a pillar of open source security funding for a decade, becomes untenable for projects that cannot afford to triage a queue in which 95 percent of submissions are slop. If it does, the people who lose are not the well-funded companies with security teams. They are the maintainers of small, critical libraries that millions of projects depend on, and they have no institutional mechanism to replace what bug bounties used to provide.