glTF 2.1 Lifts the 4GB Ceiling and Standardizes Multi-File Scene Graphs
Khronos' open 3D delivery format grows from a single asset container to a system designed for the factory floors, BIM drops, and city blocks that modern pipelines actually ship.
For nine years, glTF 2.0 treated every 3D scene as one file. That assumption broke down the moment factory-digital-twin developers, BIM architects, and smart-city builders tried to ship scenes assembled from hundreds of vendor-supplied assets. The glTF 2.1 release announced by the Khronos 3D Formats Working Group on 11 June 2026 is the direct fix: a 64-bit binary format that lifts the 4GB ceiling, a standardized multi-file scene graph that composes external assets declaratively at load time, and embedded thumbnails that let reviewers inspect a scene without spinning up a 3D engine.
The release is backward-compatible with the 2.0 spec that became the web's default 3D interchange language in 2017. Phoronix's same-day write-up frames it as a maturation step: an open standard finally catching up to the scale of the pipelines that adopted it. The headline change is the 64-bit binary GLB. Version 3 of the binary container raises the per-chunk size cap from 4 GiB to 8 EiB, with existing version 2 files remaining readable. For teams that have been splitting single scenes across multiple GLBs because they hit the ceiling, that one change removes a real production tax.
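To make the ceiling concrete: the GLB version 2 container stores its header and chunk lengths as unsigned 32-bit integers, which is where the 4 GiB limit comes from. The sketch below parses that 12-byte header with Python's struct module and works through the arithmetic behind both figures. The helper name read_glb_header is made up for illustration, and the version 3 byte layout is deliberately not reproduced here; the only thing taken from the 2.1 announcement is the 8 EiB figure, which equals 2**63 bytes.

```python
import struct

GLB_MAGIC = 0x46546C67  # the ASCII bytes "glTF" read as a little-endian uint32


def read_glb_header(data: bytes) -> tuple[int, int]:
    """Return (version, total_length) from a 12-byte GLB version 2 header."""
    magic, version, length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    return version, length


# Largest value a version 2 length field can hold: one byte short of 4 GiB.
MAX_U32 = 2**32 - 1
GIB = 2**30
EIB = 2**60

# A hand-built version 2 header claiming the maximum representable length.
header = struct.pack("<III", GLB_MAGIC, 2, MAX_U32)
print(read_glb_header(header))  # (2, 4294967295)
print((MAX_U32 + 1) // GIB)     # 4
print(2**63 // EIB)             # 8
```

Any length past MAX_U32 simply cannot be written into a 32-bit field, which is why oversized scenes had to be split across several GLBs.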
The more interesting work is the structural shift. glTF 2.0 was designed for shipping one asset: a model, a material, an animation. Modern pipeline builders are not shipping one asset. They are shipping compositions: a factory floor assembled from per-machine CAD exports, a city block assembled from per-building BIM drops, a simulation environment assembled from a library of parts. 2.1 turns that composition into a first-class operation rather than a workaround. The new "files" array at the top of a glTF document works as a virtual file system, listing every external buffer, image, mesh, or arbitrary resource the scene references. External Assets then let a glTF document reference another glTF declaratively, with the reference resolved at load time, closing out the years-long community discussion about a companion "glTFX" format that never quite landed. Packaging layers on top, letting a single multi-file scene behave as one self-contained unit without inventing a new container.
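A resolution layer for this kind of composition might look roughly like the sketch below. It is not the 2.1 schema: the shape of each "files" entry (a dict with a "uri" key), the example paths, and the helper names resolve_uri and collect_files are all assumptions made for illustration. The sketch walks the "files" arrays of in-memory documents, resolves each URI relative to the document that references it, and skips anything already visited so that circular references terminate.

```python
import posixpath


def resolve_uri(base_path: str, uri: str) -> str:
    """Resolve a relative URI against the document that references it."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(base_path), uri))


def collect_files(root: str, documents: dict[str, dict]) -> list[str]:
    """Depth-first walk over hypothetical "files" arrays, skipping repeats."""
    seen: set[str] = set()
    order: list[str] = []

    def visit(path: str) -> None:
        if path in seen:
            return  # already loaded, or a circular reference
        seen.add(path)
        order.append(path)
        doc = documents.get(path)
        if doc is None:
            return  # a leaf resource such as a buffer or image
        for entry in doc.get("files", []):
            visit(resolve_uri(path, entry["uri"]))

    visit(root)
    return order


# Hypothetical documents: a factory scene that pulls in a per-machine asset.
documents = {
    "scenes/factory.gltf": {
        "files": [{"uri": "machines/press.gltf"}, {"uri": "../shared/floor.bin"}]
    },
    "scenes/machines/press.gltf": {
        "files": [{"uri": "../../shared/steel.webp"}, {"uri": "../factory.gltf"}]
    },
}

load_order = collect_files("scenes/factory.gltf", documents)
print(load_order)
```

A real loader would fetch and parse each file instead of looking it up in a dict, but the relative-path resolution and the visited set are the parts a hand-written layer has to get right.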
Around that core, 2.1 adds the geometry primitives that spatial workloads actually need. A new "shapes" array exposes box, sphere, capsule, cylinder, and plane primitives at the top level, superseding the KHR_implicit_shapes extension, and each node now carries a "boundingVolume" property built from those shapes. Because nodes nest, those per-node volumes together form a bounding volume hierarchy (BVH), the data structure that makes spatial queries, frustum culling, and level-of-detail selection tractable on large scenes. The Khronos blog post is explicit that this is a foundation for the progressive streaming work still to come, drawing on patterns OGC 3D Tiles already proved out: spatial subdivision, availability maps, LOD, quality metrics, and streaming hints.
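The payoff of per-node bounding volumes is that a query can discard a whole subtree after one cheap test. The sketch below illustrates that idea with sphere volumes only. The Node class, its field names, and the example scene are assumptions for illustration and do not mirror the actual "shapes" or "boundingVolume" schema.

```python
from dataclasses import dataclass, field

Vec3 = tuple[float, float, float]


@dataclass
class Node:
    name: str
    center: Vec3   # center of this node's bounding sphere
    radius: float  # the sphere must enclose every descendant
    children: list["Node"] = field(default_factory=list)


def spheres_overlap(c1: Vec3, r1: float, c2: Vec3, r2: float) -> bool:
    """Compare squared distance with squared radius sum to avoid a sqrt."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return dist_sq <= (r1 + r2) ** 2


def query(node: Node, center: Vec3, radius: float, hits=None) -> list[str]:
    """Return the names of leaf nodes whose volume overlaps the query sphere."""
    if hits is None:
        hits = []
    if not spheres_overlap(node.center, node.radius, center, radius):
        return hits  # prune: nothing below this node can overlap either
    if not node.children:
        hits.append(node.name)
    for child in node.children:
        query(child, center, radius, hits)
    return hits


# Hypothetical factory floor with two machines far apart.
floor = Node("floor", (0.0, 0.0, 0.0), 100.0, [
    Node("press", (10.0, 0.0, 0.0), 5.0),
    Node("lathe", (-50.0, 0.0, 0.0), 5.0),
])

print(query(floor, (12.0, 0.0, 0.0), 2.0))   # ['press']
print(query(floor, (500.0, 0.0, 0.0), 1.0))  # []
```

The second query never touches either machine because the root test fails first. Frustum culling follows the same pattern, with a sphere-against-frustum test in place of the sphere-against-sphere one.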
Quality-of-life additions are smaller but still noticeable. First-class thumbnails let a glTF ship a preview image, either embedded or as a sidecar, that a review tool can display without booting a renderer. Four previously optional extensions become required in 2.1 implementations: EXT_texture_webp, KHR_materials_emissive_strength, KHR_mesh_quantization, and KHR_node_visibility. The rarely used multiple-scenes pattern from 2.0 is formally deprecated, though 2.1 importers must still read it and existing multi-scene files remain valid.
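For an importer maintainer, the promotion of those four extensions translates into a simple gap check. The sketch below is plain set arithmetic over the extension names listed above. The function name missing_promoted and the example importer's supported set are hypothetical.

```python
# The four extensions named above as required in 2.1 implementations.
PROMOTED_EXTENSIONS = {
    "EXT_texture_webp",
    "KHR_materials_emissive_strength",
    "KHR_mesh_quantization",
    "KHR_node_visibility",
}


def missing_promoted(supported: set[str]) -> list[str]:
    """List the promoted extensions an importer does not yet handle."""
    return sorted(PROMOTED_EXTENSIONS - supported)


# Hypothetical 2.0 importer that supports two of the four.
importer_supports = {
    "KHR_mesh_quantization",
    "EXT_texture_webp",
    "KHR_draco_mesh_compression",
}

print(missing_promoted(importer_supports))
# ['KHR_materials_emissive_strength', 'KHR_node_visibility']
```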
The positioning matters as much as the mechanics. The Khronos post is careful to keep glTF as a delivery and runtime format, complementary to authoring ecosystems like OpenUSD. The model is: build the scene in OpenUSD, deliver the scene as glTF. That is a deliberate division of labor, and it is the right one for a working group whose members ship rendering runtimes, not DCC tools. Amanda Morgan, Khronos 3D Formats co-chair and senior director of Open Standards at Bentley Systems, frames the work in the post as a "Design First" effort: lean, backward-compatible additions chosen for impact rather than completeness.
The honest caveat is the ecosystem. The spec work is public and traceable through the glTF GitHub repository (master tracking issue #2585, with feature explainers #2586 through #2594), and 2.1 is binary-readable by every existing 2.0 importer. First-class support for the new features, however, lives in whichever engines, viewers, and DCC tools ship updates. Progressive streaming in particular is designed-for, not delivered-in. A developer who needs External Assets, BVH, and unified file references to compose scenes today will be reading the spec and writing their own resolution layer, at least until Three.js, Babylon.js, Cesium, and the major DCC exporters catch up.
What to watch next: the first 2.1-aware releases of the major web renderers, the first glTF 2.1 assets shipping through a real digital-twin or BIM pipeline, and whether the working group publishes a 2.1 conformance suite on a date rather than as a "coming soon." Until those land, glTF 2.1 is a well-designed floor for the next decade of open 3D delivery, not yet a finished ceiling above it.