Micron has started shipping a client SSD that, by the company's own account, loads a 20-billion-parameter AI model in under three seconds, in tests whose conditions it has not fully disclosed. The drive is the Micron 3610, an NVMe device using PCIe Gen5 and QLC NAND flash, the kind of storage that used to be for mass archival, not inference. The specs check out: 11 gigabytes per second sequential read, 1.5 million random read IOPS, up to 4 terabytes in an M.2 2230 form factor. EE Times has the details.
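Micron's load-time claim is "under three seconds," and the arithmetic behind any such claim is simple enough to sketch. The snippet below computes a lower bound on load time by dividing raw weight size by the 3610's rated 11 GB/s sequential read. The precision levels (16, 8 and 4 bits per parameter) are illustrative assumptions, since Micron has not said what its test used. The function names are hypothetical, and the estimate ignores file-format overhead, caching and any work the host does after the read.

```python
# Back-of-envelope lower bound: time to stream model weights off the drive
# at its rated peak sequential read speed. Not a benchmark.

SEQ_READ_GB_PER_S = 11.0  # Micron 3610 rated sequential read (decimal GB/s)
PARAMS = 20e9             # 20 billion parameters


def weights_gb(params: float, bits_per_param: int) -> float:
    """Raw weight size in decimal gigabytes, ignoring file-format overhead."""
    return params * bits_per_param / 8 / 1e9


def min_load_seconds(size_gb: float, read_gb_per_s: float = SEQ_READ_GB_PER_S) -> float:
    """Best-case read time if the drive sustains its rated speed throughout."""
    return size_gb / read_gb_per_s


# Assumed precisions; the one used in Micron's test is undisclosed.
for bits in (16, 8, 4):
    size = weights_gb(PARAMS, bits)
    print(f"{bits}-bit: {size:.0f} GB, at least {min_load_seconds(size):.2f} s")
```

Under these assumptions, 16-bit weights come to 40 GB and need at least about 3.6 seconds at the rated speed, while 8-bit and 4-bit weights come in at roughly 1.8 and 0.9 seconds. That would put an under-three-second load within the spec only for a model stored below 16-bit precision, though without the test conditions this remains an inference, not a finding.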
The announcement itself is thin — a product brief, an EE Times writeup, Micron's internal test numbers. But the framing around it is loud. Christopher Moore, Micron's VP of marketing for the mobile and client business unit, said the company foresees commercial client computing increasingly requiring local AI capabilities. That is deliberate positioning: the PC as a competitor to the cloud, not an accessory to it.
The cloud side is not responding. type0 reached out to three major hyperscalers. None would comment on whether on-device inference represents a meaningful threat to their inference revenue. One spokesperson called the question interesting and asked for a follow-up. That email was not answered before publication.
The NAND market backdrop makes Micron's timing less surprising. Combined revenue for the top five NAND suppliers reached $21.17 billion in the fourth quarter of 2025, up 23.8% quarter-over-quarter, according to TrendForce. Micron's NAND revenue was $3.03 billion, up 24.8% quarter-over-quarter in the same period. Spot prices are projected to surge 85 to 90% quarter-over-quarter in the first quarter of 2026. NAND suppliers are not framing this as a cyclical recovery. They are framing it as a structural shift in where AI workloads run.
QLC flash is central to the economic argument. Quad-level cell NAND stores four bits per cell against three for triple-level cell, about 33% more bits from the same number of cells, which means lower cost per gigabyte at a given capacity. For a device that needs to hold multi-gigabyte model weights and serve them at line rate, wafer economics matter more than the per-terabyte ASPs that dominate NAND earnings calls. The 3610 uses Micron's G9 NAND at 276 layers. The performance-per-watt claim, 43% better than a Gen4 TLC drive, comes from a DRAM-less architecture using host memory buffer and low-power device sleep states.
Correction: An earlier version of this article cited a 13.3 gigabytes per second model load requirement, which exceeds the 3610's 11GB/s sequential read specification. The specific figure has been removed. The under-three-second claim is from Micron's own internal benchmarks; the test conditions have not been publicly disclosed.