When an automotive company's AI system recently refused to let code ship, nobody could explain why. The original intent lived in a chat thread. The person who wrote the release policy had left three months earlier. Nobody had documented the logic. The case appears as an illustrative worked example in a paper published this week by researchers at Chalmers University of Technology and Volvo Group — not as an independently reported incident with on-the-record sources.
That distinction matters because the researchers use it to argue something broader: AI systems are now being asked to decide whether code actually ships, who escalates what, and how software fits inside institutions, and those decision procedures are running in production with almost no engineering methods behind them.
The framework they published to arXiv this month is called the Semi-Executable Stack. Its six concentric rings map what software engineering now encompasses as AI moves from writing code to managing the processes around it. Rings one and two are code and prompts — the parts AI handles well. Rings three and four are orchestration and control — guardrails, agent loops, escalation rules. Rings five and six are operating logic and societal fit — the organizational decision procedures that govern releases, exceptions, and routing. Those outer rings have almost no engineering methods behind them. They are running in production anyway.
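The paper is a framework, not a toolkit, but the ring taxonomy is concrete enough to sketch as data. What follows is a minimal Python rendering of the stack as the paper describes it; the `Ring` structure, its field names, and the `has_accumulated_methods` flag are illustrative shorthand, not anything the authors publish.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ring:
    """One ring of the Semi-Executable Stack (structure is illustrative)."""
    number: int
    name: str
    concern: str
    has_accumulated_methods: bool  # per the authors' framing of rings 5 and 6

SEMI_EXECUTABLE_STACK = [
    Ring(1, "code", "source artifacts AI already generates well", True),
    Ring(2, "prompts", "the instructions that steer generation", True),
    Ring(3, "orchestration", "agent loops and pipelines chaining steps", True),
    Ring(4, "control", "guardrails and escalation rules around agents", True),
    Ring(5, "operating logic", "release gates, exceptions, decision routing", False),
    Ring(6, "societal fit", "how the system sits inside institutions", False),
]

# The gap the paper names, stated as a query: which rings run in
# production without engineering methods behind them?
for ring in SEMI_EXECUTABLE_STACK:
    if not ring.has_accumulated_methods:
        print(f"ring {ring.number} ({ring.name}): {ring.concern}")
```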
"The most underserved frontier is SE4AI for rings 5 and 6, where decision routines, operating logic, and institutional fit still lack the engineering methods the inner rings have accumulated," the authors write.
The paper's central argument is that software engineering does not shrink as AI takes hold; it expands. But the expansion is uneven. The inner rings have accumulated decades of engineering methods. Rings five and six, where the actual organizational decisions live, are almost entirely uncharted.

A system called GoNoGo, published by some of the same researchers last August, makes the gap concrete. It is an LLM-based multi-agent system for automotive software release decisions: not writing the code, but deciding whether code is ready to ship. That is ring five in practice. The release gate, the authors note, is easy to describe as a technical checkbox. It is actually a distributed organizational process with institutional memory, exception handling, and human escalation built into every step. Nobody has fully automated it. AI is being asked to anyway.
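The paper does not reproduce GoNoGo's internals, but the shape of a ring-five procedure is easy to sketch. Below is a hedged illustration of a release gate modeled as a process with exception handling and human escalation rather than a boolean check; every name, field, and threshold is hypothetical, not drawn from GoNoGo.

```python
from dataclasses import dataclass, field
from enum import Enum

class Verdict(Enum):
    GO = "go"
    NO_GO = "no-go"
    ESCALATE = "escalate to a human release manager"

@dataclass
class ReleaseCandidate:
    version: str
    tests_passed: bool
    open_blockers: int
    policy_exceptions: list[str] = field(default_factory=list)

def release_gate(rc: ReleaseCandidate) -> tuple[Verdict, str]:
    """A ring-five sketch: the gate is a procedure with institutional
    memory (exception lists) and escalation paths, not a single
    checkbox. All checks and fields here are invented for illustration."""
    if not rc.tests_passed:
        return Verdict.NO_GO, "failing tests: hard stop"
    if rc.open_blockers > 0 and not rc.policy_exceptions:
        return Verdict.NO_GO, f"{rc.open_blockers} open blockers, no approved exception"
    if rc.policy_exceptions:
        # Exceptions encode organizational judgment; an automated gate
        # should surface them to a human rather than silently apply them.
        return Verdict.ESCALATE, f"exceptions invoked: {rc.policy_exceptions}"
    return Verdict.GO, "all gates clear"

verdict, reason = release_gate(
    ReleaseCandidate("3.2.1", tests_passed=True, open_blockers=1,
                     policy_exceptions=["legacy-CAN-bus waiver"])
)
print(verdict.value, "::", reason)
```

The third branch is the point: the moment the gate stops being a checkbox and starts invoking organizational judgment is exactly where, on the authors' account, the engineering methods run out.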
The paper holds an uncomfortable tension throughout. It argues that AI expands software engineering rather than replacing it. But it also cites evidence that early-career workers in the most AI-exposed occupations have already seen measurable employment pressure. Both things are true simultaneously. If the center of gravity shifts toward rings five and six — and those require institutional knowledge and organizational context that junior developers have not had time to accumulate — the transition looks very different depending on where you are standing.
The knowledge-continuity problem is the one the paper names but does not solve. As AI generates more of the code layer, the humans who would have written that code, and who would normally accumulate the understanding of how it works and why, are increasingly absent from the process. The AI wrote it. The AI may not be around in two years when something breaks. The humans who commissioned it may have moved on. What is left is a system nobody fully understands, maintained by people who did not build it, on top of a codebase generated by a model nobody trains anymore.
Independent work reinforces the pattern. Hoda et al. published a paper in February arguing that agentic software engineering extends well beyond code and requires attention to socio-technical concerns that the current tooling ecosystem does not address. A research roadmap published last September by Hassan, Li, Lin, Adams, Chen, Kashiwa, and Qiu frames agentic software engineering as requiring foundational pillars that do not yet exist.
These researchers are not saying AI will replace programmers. They are saying the field is changing shape, and the tools, methods, and institutional knowledge needed to operate in that new shape have not been built. The Semi-Executable Stack names the gap. It does not fill it.
Ring five and ring six decisions are happening now, at scale, in production. GoNoGo is the proof: a working system making real release decisions at automotive companies. The engineering methods for that are absent. The real question is what, if anything, will fill the gap, and whether the people most exposed to AI pressure will be the ones developing the methods.