When Anthropic committed to running its future models on a million of Google's not-yet-shipping tensor processing units last week, the tech press covered the chip specs. That was the wrong place to look.
The better question is the one nobody in the announcement is answering: when an AI agent acts on your behalf, inside someone else's data center, on someone else's silicon, who sets the rules? Anthropic has raised $51 billion over the past two years and is now deploying it on a multi-year infrastructure reservation that commits the next generation of its most valuable models to hardware it does not own, in data centers it does not control. That is a commercial relationship with a governance problem hiding inside it.
The immediate announcement was Google's bifurcated chip strategy for the agentic era. The TPU 8t, designed for training, strings up to 9,600 chips together in a single superpod sharing two petabytes of high-bandwidth memory. The TPU 8i, designed for inference, is a different chip entirely: optimized for serving responses from large language models, where memory bandwidth matters more than raw compute throughput. According to Google's Cloud blog, the inference chip carries 384 MB of on-chip SRAM, triple the previous generation, which keeps more of a model's key-value cache resident close to the tensor cores rather than fetching from off-chip memory. The result, Google says, is 80 percent better price-performance than Ironwood for large mixture-of-experts models.
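To see why on-chip residency is the lever, consider a back-of-envelope sketch of KV-cache sizing. Every number below is an assumed, illustrative model shape, not the spec of any shipping model or of the TPU 8i:

```python
# Back-of-envelope KV-cache sizing. All model parameters are illustrative
# assumptions, not the specs of any shipping model or of the TPU 8i.
num_layers = 60        # transformer layers (assumed)
num_kv_heads = 8       # KV heads after grouped-query attention (assumed)
head_dim = 128         # dimension per head (assumed)
bytes_per_value = 2    # bf16/fp16 storage

# Per token, each layer stores one key and one value vector per KV head.
kv_bytes_per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")  # ~240 KiB

sram_bytes = 384 * 1024 * 1024  # the 384 MB figure from the announcement
print(f"Tokens resident on-chip: {sram_bytes // kv_bytes_per_token}")  # ~1,600
```

On numbers like these, even 384 MB holds only a couple thousand tokens of cache, so the win is keeping the hottest slice of it near the compute rather than fetching the whole thing from off-chip memory on every decode step.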
The infrastructure Anthropic reserved sits behind that chip. HyperFRAME Research, a semiconductor analysis firm, estimates the commitment represents multiple gigawatts of capacity beginning in 2027, enough to power a mid-sized city, reserved for a single customer's model generation before a single TPU 8 has shipped to a paying customer. Anthropic has raised capital in three tranches over the past two years: $8 billion from Amazon in February 2024, $13 billion in September 2025, and $30 billion in February 2026. It is deploying that money on a reservation with Google and Broadcom that its own blog described as well over a gigawatt of 2026 capacity, expanding to 3.5 gigawatts in 2027.
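For scale, a rough check of the mid-sized-city comparison. The per-home figure below is an assumed approximation, not a number from the announcement:

```python
# Rough scale check on "enough to power a mid-sized city".
# The per-home draw is an assumed approximation, not a figure from the story.
capacity_gw = 1.0      # the "well over a gigawatt" 2026 floor
avg_home_kw = 1.2      # approximate average continuous draw per US household

homes = capacity_gw * 1e9 / (avg_home_kw * 1e3)
print(f"~{homes / 1e3:.0f} thousand homes per gigawatt")  # ~833 thousand

# At the 3.5 GW 2027 figure, that scales to roughly 2.9 million homes.
```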
Google Cloud's agentic AI stack, a managed platform that combines a language model with the ability to execute code, call external tools, and maintain memory across sessions, runs on that infrastructure. Customers who use it are deploying autonomous agents on Google's silicon in Google's data centers. Who controls what those agents do, and who is liable when they act, is a question Google's marketing materials do not answer.
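Google has not published the control surface, but the general shape of such a loop is well understood. A minimal sketch, with entirely hypothetical names standing in for whatever Google's managed platform actually exposes:

```python
# Minimal agent loop. Every class and function name here is hypothetical,
# sketched to show where the control points sit, not Google's actual API.
import json

def run_agent(model, tools, memory, task, max_steps=10):
    """Drive a model through a plan -> act -> observe loop until it stops."""
    history = memory.load(task)                # state persists across sessions
    history.append({"role": "user", "content": task})

    for _ in range(max_steps):
        reply = model.generate(history)        # model picks the next action
        history.append({"role": "assistant", "content": reply["content"]})

        call = reply.get("tool_call")
        if call is None:                       # no tool requested: task done
            break

        # Governance lives on the next line: which tools exist, who audits
        # the call, and who can refuse it are set by whoever runs this loop.
        result = tools[call["name"]](**call["arguments"])
        history.append({"role": "tool", "content": json.dumps(result)})

    memory.save(task, history)
    return history
```

The point of the sketch is where control sits: the tool registry, the step cap, and the memory store all belong to whoever operates the loop, not to the person the agent is acting for.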
The counterforce is real. Whether Anthropic's TPU bet is actually better economics than running on Nvidia is an open question. As The Register reported, Nvidia's Rubin GPU delivers 35 PFLOPS of FP4 training performance with 288 GB of HBM4 memory. Google has not disclosed per-hour pricing for TPU 8i instances. MLCommons benchmarks for TPU 8 have not been published. As TechCrunch reported, the TorchTPU software layer, which would let models built in PyTorch run on Google silicon without a full rewrite, remains in preview. The conversion friction has not been resolved.
Google is also selling Nvidia Rubin instances alongside the TPU 8i. The Vera Rubin NVL72, running in A5X bare-metal instances, remains a first-class product. As Implicator.ai noted, this is the behavior of a company building margin across the board rather than betting its infrastructure future on a single silicon horse.
But the capacity commitment is not in dispute. Anthropic is building for a generation of models that requires compute at a scale that did not previously exist, and it has chosen to build that generation on Google's terms. What Google Cloud's agentic stack does with the agents running on that infrastructure, who can inspect them, who can shut them down, who bears liability when one acts badly, is a question the company has not resolved. Anthropic, for its part, has not explained why a company that raised $51 billion in two years would commit the next generation of its most valuable assets to hardware it does not own, in data centers it does not control.
TPU 8t and 8i chips are expected to be generally available later in 2026. The governance question is not expected to be resolved by then.
A Google Cloud spokesperson did not respond to a request for comment by publication. Anthropic declined to comment on its infrastructure commitments.