Was Compute Scarcity Ever Real? DeepSeek Just Locked In the Price Floor.
DeepSeek made the 75 percent discount on its flagship V4-Pro model permanent on May 22. Nine days early.
The company had said the promotional pricing would end May 31 at 15:59 UTC, according to its own pricing documentation. Instead, it quietly locked in the floor on its own schedule, dropping output costs to $0.87 per million tokens, down from the original $3.48. Input costs fell similarly: $0.003625 per million cached tokens, $0.435 per million cache misses.
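The discount arithmetic can be checked directly. The short Python sketch below uses only the two output prices quoted above; the function and constant names are illustrative and do not come from DeepSeek's documentation.

```python
# Output prices in USD per million tokens, as quoted in the article.
ORIGINAL_OUTPUT_PRICE = 3.48
DISCOUNTED_OUTPUT_PRICE = 0.87


def discount_fraction(original: float, discounted: float) -> float:
    """Return the fraction by which `discounted` undercuts `original`."""
    return 1 - discounted / original


discount = discount_fraction(ORIGINAL_OUTPUT_PRICE, DISCOUNTED_OUTPUT_PRICE)
print(f"Output discount: {discount:.0%}")
```

The result is consistent with the stated 75 percent cut: $0.87 is one quarter of $3.48.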
DeepSeek declined to say why.
In a statement confirming the permanent cut, the company did not disclose whether increased supply of Huawei Ascend 950 chips was the reason. That omission is the part of the story most worth sitting with.
When DeepSeek launched V4-Pro in April, it gave a clear account of why the model was expensive. Pro would cost up to 12 times more than the Flash version because of constraints in high-end compute capacity, the company said in its launch announcement. Pricing would drop sharply, it said, once Huawei Ascend 950 supernodes shipped at scale in the second half of 2026.
That H2 2026 timeline is the part that does not line up with the May 22 price action. Mass production of the 950PR began in April, according to people familiar with Huawei's plans, with full-scale shipments targeted for the second half of the year. Huawei had planned to ship around 750,000 units in 2026, Reuters reported in April. ByteDance, Tencent, and Alibaba had all reached out to Huawei about new chip orders following the V4 release.
A faster-than-expected supply ramp is a plausible explanation for the structural price cut. Huawei's chips matter because Chinese labs cannot buy Nvidia's most advanced chips, a constraint U.S. export policy created and has maintained. If supply is genuinely accelerating, DeepSeek locking in a permanent discount makes economic sense.
But it is not the only explanation.
The other version: DeepSeek is running a sustained below-cost price to capture developer mindshare while the company that made the Huawei partnership possible is still scaling. If that is true, the $0.87 floor is not a structural equilibrium. It is a promotional price subsidized by capital, designed to make competing at higher output prices economically painful for any lab not operating on Huawei silicon.
Both stories cannot be true simultaneously. The timing is the clue. DeepSeek said pricing would fall once the Huawei chips were available at scale in H2. The price fell before the chips were available at scale. The company will not explain the gap.
The context is harder to ignore when the numbers are laid out. OpenAI's GPT-5.4 charges $15 per million output tokens. Its GPT-5.5 charges $30. DeepSeek is at $0.87. For any developer or company that built a product or a capital expenditure program around the higher tier, the spread between $0.87 and $15 is not a rounding error. It is a margin structure.
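To put that spread in concrete terms, the Python sketch below applies the three quoted output prices to a single workload. The workload size of one billion output tokens per month is a hypothetical figure chosen for illustration, not a number from any of the companies, and the function name is likewise illustrative. Input-token costs are ignored.

```python
# Output prices in USD per million tokens, as quoted in the article.
OUTPUT_PRICE_PER_MILLION = {
    "DeepSeek V4-Pro": 0.87,
    "GPT-5.4": 15.00,
    "GPT-5.5": 30.00,
}

# Hypothetical workload, assumed purely for illustration.
MONTHLY_OUTPUT_TOKENS = 1_000_000_000


def output_cost(price_per_million: float, output_tokens: int) -> float:
    """Return the USD cost of generating `output_tokens` at the given price."""
    return price_per_million * output_tokens / 1_000_000


baseline = OUTPUT_PRICE_PER_MILLION["DeepSeek V4-Pro"]
for model, price in OUTPUT_PRICE_PER_MILLION.items():
    cost = output_cost(price, MONTHLY_OUTPUT_TOKENS)
    multiple = price / baseline  # how many times the DeepSeek price
    print(f"{model}: ${cost:,.0f} per month ({multiple:.1f}x DeepSeek)")
```

At the quoted rates, the GPT-5.4 price is roughly 17 times the DeepSeek price, and the GPT-5.5 price is roughly 34 times.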
The question DeepSeek opened and did not answer is whether that gap was always optional. If the compute scarcity premium was negotiable all along, the companies that priced accordingly were not recovering costs. They were collecting a rent that rested on the assumption that the floor was higher than it turned out to be.