
Enterprise AI Agents Do Not Need More Memory. They Need Non-Regression.

reported by Mycroft · 4 min read · published May 24, 2026


A support agent handles a pricing exception correctly on Monday. On Tuesday the policy changes. The agent still has Monday's context — retrieval works fine. It has no mechanism to know the rule it applied yesterday is no longer valid. So it applies it again, confidently, and now the organization has a compliance problem.

This is the failure mode that Paris-based RippleTide aims to prevent with what it calls non-regression. The startup, founded in February 2024, has built its product around a single idea: when an agent discovers a sequence of actions that produces a satisfactory outcome, that validated sequence can be frozen and preserved as the agent continues to learn. Future exploration starts from that stable base rather than from scratch. Newly acquired behaviors are checked against RippleTide's context graph before they overwrite what already works.
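The check described above resembles a regression-test gate in software. Here is a minimal Python sketch, assuming a policy can be modeled as a function from a situation to an action and that frozen sequences are stored as situation/action pairs. The names `Policy`, `find_regressions`, and `adopt_if_safe`, along with the sample data, are hypothetical illustrations and not RippleTide's API.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical model: a policy maps a situation label to an action label.
Policy = Callable[[str], str]


def find_regressions(
    candidate: Policy, frozen_cases: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (situation, expected, actual) for each frozen case the candidate breaks."""
    broken = []
    for situation, expected in frozen_cases.items():
        actual = candidate(situation)
        if actual != expected:
            broken.append((situation, expected, actual))
    return broken


def adopt_if_safe(
    current: Policy, candidate: Policy, frozen_cases: Dict[str, str]
) -> Policy:
    """Keep the current policy unless the candidate preserves every frozen case."""
    return current if find_regressions(candidate, frozen_cases) else candidate


# Validated outcomes that must keep working (illustrative data only).
frozen = {"refund_under_50": "approve", "address_change": "verify_identity"}


def current_policy(situation: str) -> str:
    known = {"refund_under_50": "approve", "address_change": "verify_identity"}
    return known.get(situation, "escalate")


def candidate_policy(situation: str) -> str:
    # Learns a new case ("bulk_order") but silently breaks an old one.
    known = {
        "refund_under_50": "escalate",
        "address_change": "verify_identity",
        "bulk_order": "offer_discount",
    }
    return known.get(situation, "escalate")


print(find_regressions(candidate_policy, frozen))
chosen = adopt_if_safe(current_policy, candidate_policy, frozen)
print(chosen is current_policy)  # the regressing candidate is rejected
```

The gate is deliberately all-or-nothing: a candidate that improves one case while breaking a frozen one is rejected outright, which is the property the term non-regression implies.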

The compounding arithmetic shows why this becomes a production problem at scale. If an AI agent has a 5 percent chance of making the wrong call on any single step in a workflow (approve a refund, update a shipping address, flag a transaction) and that workflow has ten independent steps, the probability that all ten are correct is 0.95¹⁰, or roughly 60 percent. About four out of ten multi-step interactions contain at least one error. RippleTide's technical documentation makes the same point: a 5 percent error rate becomes 40 percent in a 10-step workflow.

The 5 percent per-step baseline is an industry working assumption. It appears consistently in vendor documentation and technical blog posts but is not traced to a single published study. The argument does not depend on that specific number: any non-trivial per-step error rate, compounded across enough steps, produces outcomes that cannot be shipped to real customers without human review. And 88 percent of enterprise AI agent projects never reach production, according to data attributed to Gartner and McKinsey, not because they fail spectacularly, but because they accumulate quiet errors that make the system too unreliable to go live.
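The compounding arithmetic is easy to check. The short Python sketch below assumes every step fails independently with the same probability, which is the same simplification behind the 40 percent figure. The function name `workflow_success_rate` is a hypothetical helper for illustration.

```python
def workflow_success_rate(per_step_error: float, steps: int) -> float:
    """Probability that every step is correct, assuming independent steps."""
    if not 0.0 <= per_step_error <= 1.0:
        raise ValueError("per_step_error must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - per_step_error) ** steps


# Sweep a few error rates and workflow lengths to show how quickly failures pile up.
for error in (0.01, 0.05, 0.10):
    for steps in (5, 10, 20):
        failure = 1.0 - workflow_success_rate(error, steps)
        print(f"per-step error {error:.0%}, {steps:>2} steps: "
              f"workflow failure {failure:.1%}")
```

For the article's case, `workflow_success_rate(0.05, 10)` is about 0.599, which matches the roughly 60 percent success and 40 percent failure quoted above.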

The standard fix the industry reached for is retrieval-augmented generation: give the agent access to documents, let it search for relevant context, stuff the results into the prompt. This works for question answering. It does not work for agents that take actions, and it fails for a structural reason that VentureBeat reported.

"RAG retrieves documents, not decision context," said Yann Bilien, co-founder and chief scientific officer at RippleTide. "A retrieved document does not tell you whether it still applies, whether it has been superseded, or whether there is a conflicting rule that takes priority."

A document from a policy database carries no expiration date, no scope constraint, no record of whether it has been overridden in similar cases. The agent receives information and must decide how to act on it — but the system that delivered the information has no mechanism to evaluate whether that action is correct given everything else that is true.
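To make the contrast concrete, here is a minimal Python sketch of a policy record that carries the metadata a bare retrieved document lacks: a validity window, a scope, and a supersession marker. Every field name, the `applies` helper, and the sample rule are assumptions made for illustration. They do not describe RippleTide's schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class PolicyRule:
    rule_id: str
    text: str
    scope: FrozenSet[str]                # segments the rule covers (hypothetical)
    valid_from: date
    valid_until: Optional[date] = None   # None means no known expiry
    superseded_by: Optional[str] = None  # id of the rule that replaced this one


def applies(rule: PolicyRule, segment: str, on: date) -> bool:
    """Decide whether a rule may be acted on for this segment on this date."""
    if rule.superseded_by is not None:
        return False  # a newer rule takes priority
    if segment not in rule.scope:
        return False  # outside the rule's scope
    if on < rule.valid_from:
        return False  # not yet in force
    if rule.valid_until is not None and on > rule.valid_until:
        return False  # expired
    return True


# Illustrative data only: an exception that was valid through Monday.
monday, tuesday = date(2026, 5, 18), date(2026, 5, 19)
pricing_exception = PolicyRule(
    rule_id="pricing-exception-v1",
    text="Enterprise accounts may receive a pricing exception.",
    scope=frozenset({"enterprise"}),
    valid_from=date(2026, 5, 1),
    valid_until=monday,
)

print(applies(pricing_exception, "enterprise", monday))   # still in force
print(applies(pricing_exception, "enterprise", tuesday))  # expired
```

Retrieval alone would return the rule's text on both days. Only the applicability check distinguishes Monday from Tuesday.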

Consultant Wyatt Mayham of Northwest AI Consulting describes working with teams that built sophisticated RAG pipelines and watched their agents confidently apply versions of policy documents that had been superseded by updates months earlier. "The biggest thing builders struggle with is the gap between retrieval and applicability," Mayham said. "Agents need decision context, not just information."

RippleTide's approach to non-regression changes the compounding arithmetic directly. If each step in a ten-step workflow can be held to the correct pattern, the error rate does not compound across the full sequence — it compounds within each step's evaluation. The 40 percent failure rate assumes each step is an independent guess. A frozen validated sequence is not a guess.

RippleTide has exposed its context graph via four MCP primitives: remember, relate, recall, and invalidate. The invalidate primitive addresses the expired-policy problem directly: when a rule changes, the graph can mark previously frozen sequences as superseded rather than allowing them to continue operating on outdated logic.
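The four primitive names come from RippleTide. Everything else in the following Python sketch is an assumption. It is a toy in-memory graph meant only to show how such operations could fit together, and the signatures, the `FrozenSequence` and `ContextGraph` classes, and the sample data are hypothetical rather than the company's actual MCP interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Set, Tuple


@dataclass
class FrozenSequence:
    situation: str                     # what the sequence handles
    steps: Tuple[str, ...]             # the validated actions, stored immutably
    invalidated_reason: Optional[str] = None

    @property
    def is_valid(self) -> bool:
        return self.invalidated_reason is None


class ContextGraph:
    """Toy stand-in for a decision context graph (hypothetical design)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, FrozenSequence] = {}
        self._edges: Set[Tuple[str, str, str]] = set()

    def remember(self, node_id: str, situation: str, steps: Sequence[str]) -> None:
        """Freeze a validated sequence; existing nodes cannot be overwritten."""
        if node_id in self._nodes:
            raise ValueError(f"{node_id} is frozen and cannot be overwritten")
        self._nodes[node_id] = FrozenSequence(situation, tuple(steps))

    def relate(self, source: str, relation: str, target: str) -> None:
        """Record a typed edge between two existing nodes."""
        for node_id in (source, target):
            if node_id not in self._nodes:
                raise KeyError(node_id)
        self._edges.add((source, relation, target))

    def recall(self, situation: str) -> List[FrozenSequence]:
        """Return only sequences for this situation that are still valid."""
        return [n for n in self._nodes.values()
                if n.situation == situation and n.is_valid]

    def invalidate(self, node_id: str, reason: str) -> None:
        """Mark a sequence as superseded without deleting its history."""
        self._nodes[node_id].invalidated_reason = reason


graph = ContextGraph()
graph.remember("pricing-v1", "pricing_exception", ["check_tier", "apply_exception"])
print(len(graph.recall("pricing_exception")))  # one valid sequence

# The policy changes: the old sequence is superseded, not silently reused.
graph.invalidate("pricing-v1", reason="policy changed")
graph.remember("pricing-v2", "pricing_exception", ["check_tier", "escalate_to_human"])
graph.relate("pricing-v2", "supersedes", "pricing-v1")
print([seq.steps for seq in graph.recall("pricing_exception")])
```

In this sketch `invalidate` keeps the old node so the history stays auditable, while `recall` filters it out. That separation is one plausible way to avoid reapplying Monday's rule on Tuesday.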

RippleTide describes its approach as neuro-symbolic rather than purely neural. The symbolic layer — formal, machine-readable logic encoded explicitly — is what allows the system to evaluate applicability and freeze sequences deterministically. The neural component handles pattern recognition during the initial ontology-building phase. Bilien describes the combination as giving agents large autonomy through the neural part while the symbolic part brings control and reduces the volume of training data required.

This is a real architectural distinction, not a marketing rebrand. VentureBeat reported that formal encoding of decision logic is architecturally distinct from graph-RAG techniques: the key property, deterministic freezing and preservation of validated decision sequences, requires a symbolic layer that graph-RAG does not provide.

RippleTide claims its automatic ontology generation can build a production-grade decision ontology in two weeks, with outcome improvements of 15 percentage points and decision evaluations running at 350 milliseconds. Those are RippleTide's own figures for its own system. Independent benchmarks do not yet exist for decision context graphs as a category, and the 15 percentage point improvement has not been independently audited.

RippleTide raised four million euros in September 2025, 19 months after its February 2024 founding. The neuro-symbolic framing is a real technical distinction, but it is also a bet that formal encoding of decision logic at enterprise scale is tractable — which has historically been harder than it sounds in architecture diagrams.

What RippleTide has correctly identified is the gap. The retrieval layer for AI agents is mature. The decision context layer, which covers not just what is true but whether it applies right now, is not. That gap is why compounding error rates turn promising pilots into production rejections.

Enterprise teams evaluating agent platforms should be asking not just what the agent remembers, but what it knows about the conditions under which what it remembers is still valid. The answer requires a different architectural layer than most current deployments have built.

The arithmetic is not on your side until that layer exists.
