Complete Ollama Tutorial (2026) – LLMs via CLI, Cloud & Python | HackerNoon

News Room · Published 5 January 2026

Ollama has become the standard for running Large Language Models (LLMs) locally. In this tutorial, I'll walk you through the most important things you should know about Ollama: the CLI, Modelfiles, the local API, Python integration, and Ollama Cloud.

Watch on YouTube: Ollama Full Tutorial

What is Ollama?

Ollama is an open-source platform for running and managing LLMs entirely on your local machine. It bundles model weights, configuration, and data into a single package defined by a Modelfile. Ollama offers a command-line interface (CLI), a REST API, and Python/JavaScript SDKs, allowing users to download models, run them offline, and even call user-defined functions. Running models locally gives users privacy, removes network latency, and keeps data on the user’s device.

Install Ollama

Download Ollama from the official website: https://ollama.com/. It’s available for macOS, Windows, and Linux.

Linux:

curl -fsSL https://ollama.com/install.sh | sh

macOS:

brew install ollama

Windows: download the .exe installer and run it.
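Once installed, verify the CLI is available from a terminal:

ollama --version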

How to Run Ollama

Before running models, it helps to understand quantization. Ollama typically runs models quantized to 4 bits (q4_0), which significantly reduces memory usage with minimal loss in quality.

Recommended Hardware:

  • 7B models (e.g., Llama 3, Mistral): require ~8 GB RAM (run on most modern laptops).

  • 13B to 30B models: require 16 to 32 GB RAM.

  • 70B+ models: require 64 GB+ RAM or dual GPUs.

  • GPU: an NVIDIA GPU or Apple Silicon (M1/M2/M3) is highly recommended for speed.
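As a rough sanity check, you can estimate these numbers yourself: weight memory is roughly parameter count × bits per weight ÷ 8, plus runtime overhead. A minimal Python sketch (the 20% overhead factor is my own assumption, not an official figure):

def estimate_memory_gb(params_billions: float, bits_per_weight: int = 4,
                       overhead: float = 1.2) -> float:
    # Weights take params * bits / 8 bytes; add headroom for KV cache and buffers
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

for size in (7, 13, 30, 70):
    print(f"{size}B @ 4-bit: ~{estimate_memory_gb(size):.1f} GB")

These figures line up with the list above once you leave room for the OS and the context window.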

Select the Model

Go to the Ollama website, click “Models”, and select the model you want to test.

Click the model name and copy the terminal command shown on the page. Then open a terminal window and paste the command; this downloads the model and drops you straight into a chat with it.
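For example, to download and chat with Llama 3 (swap in whichever model you picked):

ollama run llama3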

Ollama CLI: Core Commands

Ollama’s CLI is central to model management. Common commands include (a short example session follows the list):

  • ollama pull <model> – Download a model
  • ollama run <model> – Run a model interactively
  • ollama list (or ollama ls) – List downloaded models
  • ollama rm <model> – Remove a model
  • ollama create <name> -f <Modelfile> – Create a custom model
  • ollama serve – Start the Ollama API server
  • ollama ps – Show running models
  • ollama stop <model> – Stop a running model
  • ollama help – Show help
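A typical session, using llama3 as a stand-in for your model of choice, might look like this:

ollama pull llama3
ollama run llama3
ollama ps
ollama stop llama3
ollama rm llama3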

Advanced Customization: Custom model with Modelfiles

You can “fine-tune” a model’s personality and constraints using a Modelfile. This is similar to a Dockerfile.

  • Create a file named Modelfile
  • Add the following configuration:
# 1. Base the model on an existing one
FROM llama3
# 2. Set the creative temperature (0.0 = precise, 1.0 = creative)
PARAMETER temperature 0.7
# 3. Set the context window size (default is 4096 tokens)
PARAMETER num_ctx 4096
# 4. Define the System Prompt (The AI’s “brain”)
SYSTEM """
You are a Senior Python Backend Engineer.
Only answer with code snippets and brief technical explanations.
Do not be conversational.
"""

  • FROM defines the base model
  • SYSTEM sets the system prompt
  • PARAMETER controls inference behavior

After that, build the model with this command:

ollama create [change-to-your-custom-name] -f Modelfile

This wraps the model + prompt template together into a reusable package.

Then run it:

ollama run [change-to-your-custom-name]
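To double-check that the system prompt and parameters were baked in, you can print the packaged Modelfile back out:

ollama show --modelfile [change-to-your-custom-name]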


Ollama Server (Local API)

Ollama can run as a local server that other apps can call. To start the server, use:

ollama serve

It listens on http://localhost:11434 by default.
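By default the server binds only to localhost. If you need it reachable from other machines, set the OLLAMA_HOST environment variable before starting it, for example:

OLLAMA_HOST=0.0.0.0:11434 ollama serve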

Raw HTTP

import requests

r = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello Ollama"}],
        "stream": False,  # the API streams by default; request one JSON response
    },
)
print(r.json()["message"]["content"])

This lets you embed Ollama into apps or services.
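The chat endpoint streams newline-delimited JSON chunks unless you set "stream" to false, which is why the example above disables it. If you want tokens as the model generates them, consume the stream instead. A minimal sketch:

import json
import requests

# With streaming enabled (the API default), each response line is a JSON chunk
with requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [{"role": "user", "content": "Hello Ollama"}],
        "stream": True,
    },
    stream=True,
) as r:
    for line in r.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        if chunk.get("done"):
            break
        print(chunk["message"]["content"], end="", flush=True)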

Python Integration

Use Ollama inside Python applications with the official library. Run these commands:

Create and activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate

Install the official library:

pip install ollama

Use this simple Python code:

import ollama

# This sends a message to the model 'gemma:2b'
response = ollama.chat(model="gemma:2b", messages=[
  {
    'role': 'user',
    'content': 'Write a short poem about coding.'
  },
])

# Print the AI's reply
print(response['message']['content'])

This works over the local API automatically when Ollama is running.

You can also call the local server directly over HTTP, exactly as in the Raw HTTP example shown earlier.
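The official library supports streaming as well; pass stream=True and iterate over the chunks:

import ollama

# stream=True turns the call into a generator of partial responses
for chunk in ollama.chat(
    model="gemma:2b",
    messages=[{"role": "user", "content": "Write a short poem about coding."}],
    stream=True,
):
    print(chunk["message"]["content"], end="", flush=True)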

Using Ollama Cloud

Ollama also supports cloud models — useful when your machine can’t run very large models.

First, create an account at https://ollama.com/cloud and sign in. Then, on the Models page, click the cloud link and select any model you want to test.


In the models list, you will see models with the -cloud suffix, which means they are available in the Ollama cloud.

Click on it and copy the CLI command. Then, in the terminal, sign in to your Ollama account:

ollama signin

Once you are signed in, you can run cloud models:

ollama run nemotron-3-nano:30b-cloud
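Once signed in, cloud models can also be used from Python just like local ones; my understanding is that the -cloud name is routed through the same local API, so the earlier code works unchanged (worth verifying against the current docs):

import ollama

# Assumes `ollama signin` has been run; the -cloud suffix sends the request
# to Ollama's hosted infrastructure instead of local hardware
response = ollama.chat(
    model="nemotron-3-nano:30b-cloud",
    messages=[{"role": "user", "content": "Hello from the cloud"}],
)
print(response["message"]["content"])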

Your Own Model in the Cloud

While Ollama is local-first, Ollama Cloud allows you to push your custom models (the ones you built with Modelfiles) to the web to share with your team or use across devices.

  • Create an account at ollama.com.
  • Add your public key (found in ~/.ollama/id_ed25519.pub).
  • Push your custom model:
ollama push your-username/change-to-your-custom-model-name
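If your local model name doesn’t already include your username, copy it into your namespace first (pushed names must follow the your-username/model-name pattern):

ollama cp change-to-your-custom-model-name your-username/change-to-your-custom-model-name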

Conclusion

That’s the complete overview of Ollama! It’s a powerful tool that gives you full control over the models you run. If you found this tutorial helpful, please share your feedback.

Cheers! 😉
