The chip ecosystem is restructuring around a bottleneck nobody was talking about two years ago. In traditional AI data centers, the typical ratio was one CPU for every four to eight GPUs. In the agentic AI era — systems that run continuous reasoning loops, execute code step by step, and orchestrate multiple tools simultaneously — that ratio is moving toward one-to-one or even one-to-two, according to TrendForce. Arm estimates that CPU demand has climbed from roughly 30 million cores per gigawatt of data-center capacity to about 120 million cores per gigawatt, a fourfold increase driven by the different ways agentic systems consume compute. In some agent architectures, the CPU-bound portion of the pipeline accounts for up to 90.6 percent of total latency, TrendForce reported, citing research posted to arXiv (2511.00739) in November 2025.
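A back-of-envelope check shows how quickly those figures compound. This is a minimal sketch: the ratios and the cores-per-gigawatt estimates come from the TrendForce and Arm figures above, but the 10,000-GPU cluster size is a hypothetical example chosen for illustration.

```python
# Back-of-envelope: what the CPU:GPU ratio shift means for CPU demand.
# Ratios are from the TrendForce figures cited above; the 10,000-GPU
# cluster size is hypothetical.
gpus = 10_000

cpus_traditional = gpus / 8   # old era: one CPU per eight GPUs
cpus_agentic = gpus / 1       # agentic era: roughly one CPU per GPU

print(f"CPUs at 1:8 ratio: {cpus_traditional:,.0f}")               # 1,250
print(f"CPUs at 1:1 ratio: {cpus_agentic:,.0f}")                   # 10,000
print(f"Increase: {cpus_agentic / cpus_traditional:.0f}x")         # 8x

# Arm's per-gigawatt estimate points the same direction:
print(f"Cores per GW: {120e6 / 30e6:.0f}x increase (30M -> 120M)")  # 4x
```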
That is the bet AWS is selling, and Meta is buying. Amazon Web Services announced this week that Meta is now one of the largest Graviton customers in the world, deploying tens of millions of Graviton5 cores at launch with scope to expand. The deal is cloud, not hardware: AWS keeps the silicon in its own data centers, and Meta pays for capacity without capital expenditure, per The Next Web.

Two details gave the agreement its particular shape. The first is the Graviton5 spec sheet: 192 cores on a 3-nanometer process, a cache five times larger than the previous generation's, and a 33 percent improvement in intercore communication latency, delivering up to 25 percent better performance per core, per AWS's announcement. The second is the timing. AWS announced the deal the same week Google Cloud Next wrapped up, where Google was pitching its own AI infrastructure story to the same enterprise technology buyers. Intel is the variable that made the timing exploitable: delays in Intel's 18A manufacturing process have pushed its next-generation Xeon 6 and 7 server chips out to 2027, leaving a gap that AMD and AWS Graviton can fill in 2026. AWS is announcing now to fill exactly that window.
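The core count alone hints at the physical scale of the deployment. As a rough illustration, the 192-core figure is AWS's; the 20-million-core total below is an assumed midpoint reading of "tens of millions" and purely hypothetical.

```python
# Rough scale of the Meta deployment, using AWS's stated 192 cores
# per Graviton5 chip. The 20M-core total is an assumed reading of
# "tens of millions of cores" and is illustrative only.
cores_total = 20_000_000
cores_per_chip = 192

chips = cores_total / cores_per_chip
print(f"~{chips:,.0f} Graviton5 chips")   # ~104,167 chips
```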
The structural shift is real. As AI inference shifts from batch jobs (process a request, stop) to persistent, always-on reasoning loops, the buying criteria move from peak mathematical throughput toward sustained efficiency and total cost of ownership over years of continuous operation, Network World reported. That is a different conversation from the GPU procurement wars, and it favors long-duration, high-core-count contracts of exactly the kind AWS is offering.
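A toy cost model shows why utilization changes the calculus. Every price, wattage, and lifetime below is hypothetical; only the batch-versus-continuous framing comes from the reporting above.

```python
# Toy total-cost-of-ownership model: capex plus energy over the
# service life. All numbers are hypothetical.
HOURS_PER_YEAR = 8_760

def tco(capex_usd, watts, utilization, years=4, usd_per_kwh=0.10):
    energy_kwh = watts / 1_000 * HOURS_PER_YEAR * years * utilization
    return capex_usd + energy_kwh * usd_per_kwh

# Batch inference: bursty, low average utilization, so energy is a
# rounding error next to capex and peak throughput wins the deal.
print(f"batch chip, 20% util:   ${tco(10_000, 500, 0.20):,.0f}")

# Agentic loops: effectively always on, so the energy term compounds
# and performance per watt starts to drive the purchasing decision.
print(f"same chip, 95% util:    ${tco(10_000, 500, 0.95):,.0f}")
print(f"efficient chip, 95%:    ${tco(10_000, 300, 0.95):,.0f}")
```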
AWS is supplying its most advanced CPU chip to Meta — the company that infrastructure analysts describe as the most credible threat to AWS's core cloud business in three to five years. That is the awkward entanglement at the center of the deal. Neither company is pretending it is comfortable.
Meta is not committed to any single architecture. The company has signed agreements worth a combined $48 billion with CoreWeave and Nebius for GPU access in recent weeks, and is now adding billions more in CPU cloud from AWS on top of existing arrangements with Google and AMD, plus its own MTIA custom silicon. When commitments cross into multi-year, multi-billion-dollar territory, The Next Web noted, the boundary between cloud provider and chip supplier becomes hard to distinguish from the strategic relationship itself.
The exact CPU latency figure, 90.6 percent, traces to a single unreplicated academic paper. But the directional claim, that CPU-bound tool processing is a meaningful bottleneck, is consistent with what AWS itself described on the AWS Blog: agentic AI is creating massive demand for CPU-intensive workloads, including real-time reasoning, code generation, search, and multi-step task orchestration.
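The structure of an agent loop makes the claim concrete. The sketch below is generic and hypothetical, not AWS's or the paper's code: one GPU-bound inference call sits inside a loop of CPU-bound parsing, tool execution, and context assembly, which is where the wall-clock time can pile up.

```python
# Minimal sketch of an agentic reasoning loop. All names are
# hypothetical stand-ins; the point is the structure: one GPU-bound
# inference call surrounded by CPU-bound orchestration work.

def call_model(context):
    # Stand-in for a GPU-bound inference call.
    return "final: done" if len(context) > 3 else "tool: search('chips')"

def parse_tool_call(plan):
    # CPU-bound: parse the model's structured output.
    return plan.split("tool: ")[1] if plan.startswith("tool:") else None

def execute_tool(action):
    # CPU-bound: execute code, hit a search index, call an API.
    return f"result of {action}"

def run_agent(task, max_steps=10):
    context = [task]
    for _ in range(max_steps):
        plan = call_model(context)          # GPU: model inference
        action = parse_tool_call(plan)      # CPU: parse output
        if action is None:
            return plan                     # model signalled completion
        result = execute_tool(action)       # CPU: tool execution
        context = context + [plan, result]  # CPU: assemble next prompt
    return context[-1]

print(run_agent("compare CPU demand in agentic vs batch inference"))
```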
That is the bet. The question neither company is answering publicly is what happens when Meta's own inference infrastructure matures — and the window AWS is selling closes.