In a recent Reddit post, Unsloth published comprehensive tutorials for all of the open models they support. The tutorials can be used to compare the models’ strengths and weaknesses, as well as their performance benchmarks.
The tutorials cover many of the widely used open model families, such as Qwen, Kimi, DeepSeek, Mistral, Phi, Gemma, and Llama. They are useful for architects, ML scientists, and developers looking for guidance on model selection, as well as instructions on fine-tuning, quantization, and reinforcement learning.
For each model the tutorial contains a description of the model and the use cases it supports well. For example:
Qwen3-Coder-480B-A35B delivers SOTA advancements in agentic coding and code tasks, matching or outperforming Claude Sonnet-4, GPT-4.1, and Kimi K2. The 480B model achieves 61.8% on Aider Polyglot and supports a 256K token context, extendable to 1M tokens.
The tutorials then provide instructions on how to run the model on llama.cpp, Ollama, and OpenWebUI, including recommended parameters and system prompts. They also give Unsloth users instructions and resources for fine-tuning the model.
For Gemma 3n and Ollama the instructions are:
Install ollama if you haven’t already!
apt-get update
apt-get install pciutils -y
curl -fsSL https://ollama.com/install.sh | sh
Run the model! Note you can call ollama serve in another terminal if it fails! We include all our fixes and suggested parameters (temperature etc) in params in our Hugging Face upload!
ollama run hf.co/unsloth/gemma-3n-E4B-it-GGUF:UD-Q4_K_XL
The fine-tuning instructions are specific to the Unsloth platform with practical tips to work around potential issues with the model implementations. For example, the Gemma 3n fine-tuning guide includes the following remark:
Gemma 3n, like Gemma 3, had issues running on Float16 GPUs such as Tesla T4s in Colab. You will encounter NaNs and infinities if you do not patch Gemma 3n for inference or finetuning. More information below.
[…]
We also found that because Gemma 3n’s unique architecture reuses hidden states in the vision encoder it poses another interesting quirk with Gradient Checkpointing described below
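The float16 problem the guide describes can be illustrated with a short sketch. This uses NumPy arithmetic as a stand-in for GPU math; it shows the generic overflow behavior of the float16 format, not Unsloth's actual patch:

```python
import numpy as np

# float16 has a maximum finite value of 65504. Tesla T4 GPUs lack
# bfloat16 support, so mixed-precision training falls back to float16,
# where large intermediate activations overflow to infinity.
max_fp16 = np.finfo(np.float16).max               # 65504.0
overflow = np.float16(max_fp16) * np.float16(2)   # exceeds the range -> inf
nan_result = overflow - overflow                  # inf - inf -> nan

print(max_fp16)     # 65504.0
print(overflow)     # inf
print(nan_result)   # nan

# The same computation stays finite in float32 (or bfloat16), whose
# exponent range is far wider -- which is why the guide patches the
# affected layers rather than running them in pure float16.
print(np.float32(max_fp16) * np.float32(2))  # 131008.0
```

Once a single activation becomes inf, downstream subtractions and normalizations turn it into NaN, which then poisons the whole forward pass.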
Open-source fine-tuning framework creators, such as Unsloth and Axolotl, hope to reduce the time it takes teams to create models for specific use cases.
Users of alternative fine-tuning frameworks and model ecosystems, such as AWS, should still find the tutorials useful for the instructions on running models and summaries of their capabilities.
Unsloth, a San Francisco startup founded in 2023, provides a number of open fine-tuned and quantized models on the Hugging Face Hub. These models are trained for specific purposes, such as code generation or agentic tool support. Quantization reduces the memory footprint of the weights, making the models cheaper to run in inference mode. The Unsloth documentation explains the purpose of the system is to simplify “model training locally and on [cloud] platforms. Our streamlined workflow handles everything from model loading and quantization to training, evaluation, saving, exporting, and integration with inference engines.”
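The cost savings from quantization come down to arithmetic on the storage format. A minimal NumPy sketch of a naive absmax scheme shows the idea; this toy code is for illustration only and is not the GGUF/Q4_K_XL format Unsloth actually ships:

```python
import numpy as np

# Toy 4-bit absmax quantization: store weights as signed integers in
# [-7, 7] plus one float32 scale, instead of one float32 per weight.
rng = np.random.default_rng(0)
weights = rng.standard_normal(1024).astype(np.float32)

scale = np.abs(weights).max() / 7.0
quantized = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)

# Dequantize on the fly at inference time.
restored = quantized.astype(np.float32) * scale

print(weights.nbytes)        # 4096 bytes as float32
print(quantized.size // 2)   # 512 bytes if packed two nibbles per byte
print(np.abs(weights - restored).max() <= scale / 2)  # bounded rounding error
```

Packed 4-bit storage is roughly 8x smaller than float32, at the price of a rounding error bounded by half the quantization step; real schemes like Q4_K use per-block scales to keep that error small.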
You can find the Unsloth beginner’s guide on the company website.