When Perplexity reversed its MCP deployment in March, the explanation from its CTO at an industry conference was blunt: the protocol that lets AI agents from different providers share tools was burning through context windows faster than the team could tolerate, and the authentication layer added friction the product team would not carry. The company switched back to classic APIs and command-line tools, launched its own Agent API as a single endpoint routing to six model providers, and effectively published a negative result for the protocol at production scale. Nobody announced it as a funeral. It was a quiet unwinding with no post-mortem.
Six weeks later, Cloudflare published what it found when it audited its own MCP deployments: servers that employees had connected without approval, multiplying silently across product, sales, marketing, and finance with no centralized visibility. The company called it Shadow MCP. The response, published April 14 alongside two product launches, is the most complete answer the industry has produced to the production problems Perplexity discovered at scale. It covers centralized team approval for new MCP server deployments, default-deny write controls so agents cannot modify systems without explicit permission, automated CI/CD pipelines for tool definitions, and a discovery mechanism to surface servers that never went through the approval process. The post cites the OWASP MCP Top 10 risks, including prompt injection, tool poisoning, and supply chain attacks via unvetted server software.
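The default-deny write control is the piece most easily shown in code. The sketch below is illustrative only: the names (`ToolCall`, `writeAllowlist`, `gateToolCall`) are hypothetical and this is not Cloudflare's implementation, just the pattern the post describes, where reads pass by default and writes are refused unless a central team has approved them.

```typescript
// A minimal default-deny gate for agent tool calls. All identifiers here
// are invented for illustration; only the policy shape follows the post.
type ToolCall = { tool: string; action: "read" | "write" };

// Writes an agent may perform, approved centrally, one entry at a time.
const writeAllowlist = new Set<string>(["tickets.close"]);

function gateToolCall(call: ToolCall): { allowed: boolean; reason: string } {
  if (call.action === "read") {
    return { allowed: true, reason: "reads are permitted by default" };
  }
  if (writeAllowlist.has(call.tool)) {
    return { allowed: true, reason: "write explicitly approved" };
  }
  // Default-deny: any write not on the allowlist is refused.
  return { allowed: false, reason: "write denied: not on allowlist" };
}

console.log(gateToolCall({ tool: "crm.lookup", action: "read" }).allowed);  // → true
console.log(gateToolCall({ tool: "dns.update", action: "write" }).allowed); // → false
```

The design choice worth noting is the direction of the default: an agent integration added without approval can still read, but it cannot modify anything until someone deliberately widens the allowlist.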
Cloudflare's architectural fix is called Code Mode. Rather than feeding the model a catalog of every available function — standard tool-calling, where the overhead compounds silently as the API surface grows — Code Mode exposes two operations, search and execute, and trusts the agent to write JavaScript that calls the API directly. The generated code runs in a sandboxed V8 environment, the same engine that powers Chrome, with no file system access and outbound requests controlled via explicit handlers. Against Cloudflare's 2,500-plus endpoint API, the approach cuts token cost from 1.17 million to roughly 1,000 per task — a 99.9 percent reduction, according to Cloudflare's own benchmark. Anthropic independently described the same pattern in a post on code execution with MCP, suggesting the solution is convergent rather than proprietary.
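The two-operation pattern can be sketched in a few lines. Everything below is a hypothetical stand-in, not Cloudflare's actual API: the catalog, `search`, and `execute` names are invented, and Node's `vm` module stands in for the hardened V8 isolate the post describes. The point is the shape: the model never receives the full endpoint catalog, only what it searches for, and its generated JavaScript runs in a context where nothing is reachable except what the host injects.

```typescript
// Sketch of search-and-execute instead of per-endpoint tool-calling.
// Names and the tiny catalog are illustrative assumptions.
import vm from "node:vm";

// A tiny stand-in for a large (2,500-plus endpoint) API catalog.
const catalog = [
  { name: "dns.listRecords", description: "List DNS records for a zone" },
  { name: "kv.get", description: "Read a key from a KV namespace" },
  { name: "workers.deploy", description: "Deploy a Worker script" },
];

// Operation 1: search. The model pays tokens only for matching entries,
// never for the whole catalog.
function search(query: string) {
  return catalog.filter(
    (e) =>
      e.name.includes(query) ||
      e.description.toLowerCase().includes(query.toLowerCase())
  );
}

// Operation 2: execute. Agent-generated code runs in an isolated context
// with no ambient globals (no file system, no fetch); only the injected
// `api` object is reachable, mirroring the explicit-handler idea.
function execute(agentCode: string): unknown {
  const api = {
    dns: { listRecords: (zone: string) => [`a.${zone}`, `b.${zone}`] },
  };
  const context = vm.createContext({ api, result: undefined });
  vm.runInContext(agentCode, context, { timeout: 1000 });
  return context.result;
}

// The agent searched, found dns.listRecords, then wrote this snippet:
const records = execute(`result = api.dns.listRecords("example.com");`);
console.log(search("dns"));
console.log(records);
```

A Node `vm` context is a much weaker sandbox than a dedicated V8 isolate with controlled outbound handlers, so treat this purely as a diagram of the token economics: two fixed operations in the prompt, arbitrary code on the other side.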
The trade-off is architectural rather than incremental. Standard tool-calling describes every available function to the model; when a call goes wrong, the error stays within the described interface. Code Mode requires the agent to generate correct JavaScript and handle errors in code execution — a different failure mode entirely. For simpler tasks, standard tool-calling works fine. For complex API surfaces where descriptions alone run into token limits, the question is whether code generation reliability is a better failure mode than context overflow. Perplexity's answer was no. Cloudflare's answer is that the benchmark speaks for itself.
The products shipping this week, the cf CLI and Browser Run, make the pitch operational rather than theoretical: every Cloudflare product accessible through a consistent interface designed for agents, not humans. The code is on GitHub, and the playbook is documented. But the organizational infrastructure required to run it (a dedicated platform engineering team, centralized approval workflows, automated discovery) remains beyond the reach of companies without Cloudflare's resources. The governance post makes one thing clear: the production problem is not abstract. It is already running, approved by nobody, visible to no one.