In a recent Reddit post, Unsloth published comprehensive tutorials for all of the open models they support. The tutorials can be used to compare the models’ strengths and weaknesses, as well as their performance benchmarks.
The tutorials cover many of the widely used open model families, such as Qwen, Kimi, DeepSeek, Mistral, Phi, Gemma, and Llama. They are useful for architects, ML scientists, and developers looking for guidance on model selection, as well as instructions on fine-tuning, quantization, and reinforcement learning.
For each model the tutorial contains a description of the model and the use cases it supports well. For example:
Qwen3-Coder-480B-A35B delivers SOTA advancements in agentic coding and code tasks, matching or outperforming Claude Sonnet-4, GPT-4.1, and Kimi K2. The 480B model achieves 61.8% on Aider Polyglot and supports a 256K token context, extendable to 1M tokens.
The tutorials then provide instructions on how to run the model on llama.cpp, Ollama, and OpenWebUI, including recommended parameters and system prompts. They also give Unsloth users instructions and resources for fine-tuning the model.
For Gemma 3n and Ollama the instructions are:
Install ollama if you haven’t already!
apt-get update
apt-get install pciutils -y
curl -fsSL https://ollama.com/install.sh | sh
Run the model! Note you can call ollama serve in another terminal if it fails! We include all our fixes and suggested parameters (temperature etc) in params in our Hugging Face upload!
ollama run hf.co/unsloth/gemma-3n-E4B-it-GGUF:UD-Q4_K_XL
The fine-tuning instructions are specific to the Unsloth platform with practical tips to work around potential issues with the model implementations. For example, the Gemma 3n fine-tuning guide includes the following remark:
Gemma 3n, like Gemma 3, had issues running on Float16 GPUs such as Tesla T4s in Colab. You will encounter NaNs and infinities if you do not patch Gemma 3n for inference or finetuning. More information below.
[…]
We also found that because Gemma 3n’s unique architecture reuses hidden states in the vision encoder it poses another interesting quirk with Gradient Checkpointing described below
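The float16 problem the guide describes can be illustrated with a short sketch. This uses NumPy arithmetic as a stand-in for GPU math; it shows the generic overflow behavior of the float16 format, not Unsloth's actual patch:

```python
import numpy as np

# float16 has a maximum finite value of 65504. Tesla T4 GPUs lack
# bfloat16 support, so mixed-precision training falls back to float16,
# where large intermediate activations overflow to infinity.
max_fp16 = np.finfo(np.float16).max               # 65504.0
overflow = np.float16(max_fp16) * np.float16(2)   # exceeds the range -> inf
nan_result = overflow - overflow                  # inf - inf -> nan

print(max_fp16)     # 65504.0
print(overflow)     # inf
print(nan_result)   # nan

# The same computation stays finite in float32 (or bfloat16), whose
# exponent range is far wider -- which is why the guide patches the
# affected layers rather than running them in pure float16.
print(np.float32(max_fp16) * np.float32(2))  # 131008.0
```

Once a single activation becomes inf, downstream subtractions and normalizations turn it into NaN, which then poisons the whole forward pass.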
Open-source fine-tuning framework creators, such as Unsloth and Axolotl, hope to reduce the time it takes teams to create models for specific use cases.
Users of alternative fine-tuning frameworks and model ecosystems, such as AWS, should still find the tutorials useful for the instructions on running models and summaries of their capabilities.
Unsloth, a San Francisco startup founded in 2023, provides a number of open fine-tuned and quantized models on the Hugging Face Hub. These models are trained for specific purposes, such as code generation or agentic tool support. Quantization reduces the memory footprint of the weights, making the models cheaper to run in inference mode. The Unsloth documentation explains the purpose of the system is to simplify “model training locally and on [cloud] platforms. Our streamlined workflow handles everything from model loading and quantization to training, evaluation, saving, exporting, and integration with inference engines.”
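The cost savings from quantization come down to arithmetic on the storage format. A minimal NumPy sketch of a naive absmax scheme shows the idea; this toy code is for illustration only and is not the GGUF/Q4_K_XL format Unsloth actually ships:

```python
import numpy as np

# Toy 4-bit absmax quantization: store weights as signed integers in
# [-7, 7] plus one float32 scale, instead of one float32 per weight.
rng = np.random.default_rng(0)
weights = rng.standard_normal(1024).astype(np.float32)

scale = np.abs(weights).max() / 7.0
quantized = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)

# Dequantize on the fly at inference time.
restored = quantized.astype(np.float32) * scale

print(weights.nbytes)        # 4096 bytes as float32
print(quantized.size // 2)   # 512 bytes if packed two nibbles per byte
print(np.abs(weights - restored).max() <= scale / 2)  # bounded rounding error
```

Packed 4-bit storage is roughly 8x smaller than float32, at the price of a rounding error bounded by half the quantization step; real schemes like Q4_K use per-block scales to keep that error small.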
You can find the Unsloth beginner’s guide on the company website.