A dependency-free C++ port of distilHuBERT lands as hubert.cpp
A solo developer ships a self-contained audio embedding library whose weights compile into the binary, easing the path to on-device speech AI in audio plugins and embedded targets.
When a neural audio model runs to tens of millions of parameters, the practical question for an audio software developer is not whether the model is good. The question is what else they have to drag into their build to run it. hubert.cpp, a C++ reimplementation of distilHuBERT posted this week to r/MachineLearning, answers that question in an unusually direct way: by compiling the weights into the library itself and shipping with no external runtime dependencies.
The project is hosted on GitHub at pfeatherstone/hubert.cpp. Its target, distilHuBERT, is the speech representation model that Chang et al. distilled from HuBERT in 2022. That work showed that a smaller student model could recover most of the larger model's utility for downstream tasks. hubert.cpp does not replace that research. It is a C++ port intended to make the distilled model usable in environments where the original PyTorch reference implementation is awkward to deploy.
The mechanism is what makes the library interesting. The author describes it as having no runtime dependencies, with the model weights compiled directly into the library rather than shipped as a separate file. CMake integration lets the project drop into native C++ builds the way a single header-and-source library traditionally would. The project also claims to support dynamic input sizes, and the author's own tests report performance on par with ONNX Runtime.
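To make the idea of compiled-in weights and dynamic input sizes concrete, here is a minimal sketch. It is not taken from hubert.cpp, and every name in it (the embedded namespace, kConvWeight, conv_out_len, conv1d) is hypothetical, as are the made-up weight values. It assumes only that the weights live in a constant array inside the translation unit and that the layer is a strided 1-D convolution, the kind of operation HuBERT-style feature encoders apply to raw waveforms.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for generated weight data. In a real build an array
// like this would be emitted from a checkpoint by a conversion step; the
// values here are invented for illustration.
namespace embedded {
constexpr std::size_t kKernel = 3;
constexpr std::size_t kStride = 2;
constexpr std::array<float, kKernel> kConvWeight = {{0.25f, 0.5f, 0.25f}};
constexpr float kConvBias = 0.0f;
}  // namespace embedded

// Output length of an unpadded, strided 1-D convolution over n samples.
constexpr std::size_t conv_out_len(std::size_t n) {
    return n < embedded::kKernel
               ? 0
               : (n - embedded::kKernel) / embedded::kStride + 1;
}

// Applies the embedded kernel to an input of any length. Nothing is read
// from disk: the weights are part of the compiled binary.
std::vector<float> conv1d(const std::vector<float>& x) {
    std::vector<float> y(conv_out_len(x.size()));
    for (std::size_t t = 0; t < y.size(); ++t) {
        const std::size_t start = t * embedded::kStride;
        assert(start + embedded::kKernel <= x.size());
        float acc = embedded::kConvBias;
        for (std::size_t k = 0; k < embedded::kKernel; ++k) {
            acc += embedded::kConvWeight[k] * x[start + k];
        }
        y[t] = acc;
    }
    return y;
}
```

In this sketch the output length is computed from the input length at call time, so the same function handles a short clip or a long recording. It also makes the cost of the approach visible: changing the weights means regenerating the array and recompiling.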
The performance claim deserves careful reading. The author phrases it as "on par with ONNX Runtime (in my tests)," which is exactly the hedge it sounds like: a single developer's benchmark, run on the developer's hardware, against a runtime the developer chose to compare against. It is not a peer-reviewed result, and no independent reproduction has surfaced yet. The honest framing is that the design goal was parity with a widely used ONNX inference path, and the author reports reaching it under their own conditions.
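For readers who want to check the parity claim themselves, a timing harness along the following lines is one starting point. This is a generic sketch, not the author's benchmark: median_ms is a hypothetical helper, and the callable passed to it stands in for whatever inference call is being measured.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Returns the median wall-clock time, in milliseconds, of `runs` calls to
// `fn`, after `warmup` untimed calls. For an even number of runs this is
// the upper of the two middle samples.
template <typename Fn>
double median_ms(Fn&& fn, std::size_t runs, std::size_t warmup = 3) {
    assert(runs > 0);
    for (std::size_t i = 0; i < warmup; ++i) {
        fn();  // untimed: lets caches and allocators settle
    }
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    // nth_element puts the element that would be at the middle index of
    // the sorted samples into that position.
    const auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

A fair comparison would wrap each runtime's forward pass in its own callable, feed both the same waveform at several input lengths, and make sure each call's output is actually used so the compiler cannot optimize the work away. Reporting the hardware, compiler flags, and thread counts alongside the medians is what would turn "in my tests" into something others can reproduce.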
That design goal itself is the story. Audio developers who want speech embeddings inside a VST plugin, a desktop application, or an embedded device have historically faced a forced choice: either pull in ONNX Runtime, a sizable C++ dependency in its own right, or maintain a hand-tuned inference path for the specific model they want to use. Compiled-in weights and a zero-dependency library collapse that choice for the distilHuBERT family of models. The trade-off is flexibility. Because the weights are baked in, shipping a new model checkpoint means rebuilding the library rather than swapping a file. For a fixed checkpoint in a stable deployment, the simplification is real.
The scope is also worth stating plainly. hubert.cpp is a single-developer project that surfaced on Reddit a day ago, under the handle pfeatherstone. So far there is no visible community reception, no production adoption, and no vendor backing. It is one open-source port of one distilled model. The distilHuBERT upstream it implements remains the work of Chang and colleagues, and the audio-representation quality of hubert.cpp inherits whatever the original student model captured.
For a developer evaluating whether distilHuBERT can now slot into a C++ pipeline without the ONNX tax, hubert.cpp is a concrete option to test. The next things worth watching are whether the build and integration story holds up across the platforms implied by the CMake claim, and whether anyone outside the author runs an independent benchmark against a reference runtime.