The last release of Llamafile was back in May, which recently led me to wonder whether Mozilla was slowly abandoning this AI project as it had done in the past with DeepSpeech and other software projects. Fortunately, that’s not the case: out today is Llamafile 0.10 with some big updates.
Llamafile is a Mozilla.ai project for distributing and running large language models as a single file. With a single Llamafile you can run the LLM across platforms and with varying hardware support. The intention is to make LLMs more accessible and convenient for both developers and end-users.
Llamafile 0.10 carries a lot of changes that have built up since that prior release last May. There is now a new build system for Llamafile, support for new modes, an updated Llama.cpp, Whisper.cpp integrated as a sub-module, Stable Diffusion support as a sub-module, improved BSD support, and many other changes.
Llamafile until now has supported TUI chat and server modes. With Llamafile 0.10 there is now a hybrid text user interface chat/server mode, a CLI mode for one-shot questions, improved logging, and improved argument handling.
There is now also a "--image" argument for specifying images, Metal GPU support on macOS now works out-of-the-box, NVIDIA CUDA support has been restored, and there are a variety of other improvements.
More details on Llamafile 0.10 can be found in the documentation, with downloads available via GitHub.
