Ollama Basics: Running a Model Yourself
Ollama turns 'I want to run an LLM locally' into a one-line install and a two-word command. Here's the stack, the key commands, and the models worth pulling first.
What Ollama is
Ollama is a command-line tool that downloads, manages, and serves local LLMs. Under the hood it runs on llama.cpp, a fast open-source inference runtime, and packages models in the GGUF format. You get a localhost API that any app can call — including Claude Code, CrewAI, LangGraph, and OpenClaw.
Install and first model
From zero to local model in three commands.
# macOS: one-line install
brew install ollama
# Or download the app from ollama.com
# Start the background server
ollama serve &
# Pull and run Llama 4 (8B — works on most modern laptops)
ollama run llama4:8b
# You're now chatting with a local model. Ctrl-D to exit.
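Before moving on, it's worth confirming the server is actually up. One quick sanity check uses Ollama's native API to list what's installed; if it answers with JSON, the server is healthy.

# Sanity check: ask the native API for the installed models
curl http://localhost:11434/api/tags
# A JSON "models" array (containing llama4:8b after the pull above)
# means the server is up and ready.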
The commands worth knowing
Core Ollama commands and the OpenAI-compatible API.
ollama list # what's installed
ollama pull qwen3.5:8b # download (no run)
ollama run gemma4:4b # download if needed, then chat
ollama rm llama4:70b # free disk space
ollama ps # what's loaded in memory
ollama show qwen3.5:8b # details: params, context, quant
# API access (OpenAI-compatible)
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.5:8b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
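The same server also exposes Ollama's native endpoints. As a sketch, a one-shot completion against /api/generate looks like this (same local model; stream disabled so the response arrives as a single JSON object):

# Native API: one-shot, non-streaming completion
curl http://localhost:11434/api/generate \
  -d '{
    "model": "qwen3.5:8b",
    "prompt": "Hello",
    "stream": false
  }'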
Models to try (April 2026)
Compare the options
| Model | Size | Best for |
|---|---|---|
| llama4:8b | ~5 GB | General chat, balanced speed/quality. |
| qwen3.5:8b | ~5 GB | Strong function calling + coding. |
| gemma4:12b | ~8 GB | Google's frontier-at-size model, reasoning-tuned. |
| qwen3.5:32b | ~20 GB | Near-frontier quality on a 32 GB Mac. |
| deepseek-coder:16b | ~10 GB | Code-focused. Fast on a laptop GPU. |
| llama4:70b | ~40 GB | Highest quality. Needs a Mac Studio or a real GPU box. |
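Disk fills up fast at these sizes. One way to see what pulled models are costing you, assuming Ollama's default model store location on macOS/Linux:

# Models live under ~/.ollama/models by default
du -sh ~/.ollama/models
# Per-model sizes also appear in:
ollama list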
Using Ollama with agents
Because Ollama exposes an OpenAI-compatible API at localhost:11434, it drops into any agent framework that speaks the OpenAI API. Point CrewAI, LangGraph, OpenClaw, or AutoGen at that URL and they'll happily run against your local model. As of March 2026, llama.cpp merged full MCP client support — meaning you can plug MCP servers (GitHub, Notion, Supabase) into a local Qwen or Llama too.
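Many OpenAI SDKs (and the frameworks built on them) read the standard base-URL and key environment variables, so often no code change is needed at all. A minimal sketch — the key value is an arbitrary placeholder, since Ollama ignores it but some SDKs insist one is set:

# Point any OpenAI-SDK-based tool at the local server
export OPENAI_BASE_URL=http://localhost:11434/v1
export OPENAI_API_KEY=ollama   # placeholder; Ollama doesn't check it

Check your framework's docs for the exact setting names; some take the base URL as a constructor argument or config field instead.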
Point OpenClaw at a local Ollama model. No cloud provider involved; no API key required.
# OpenClaw talking to local Ollama — no cloud, no real credentials
# Pull once, then run against the local server at localhost:11434.
ollama pull qwen3.5:8b
openclaw config set backend local-ollama
openclaw config set model qwen3.5:8b
openclaw run "organize my Downloads folder"
# All traffic stays on localhost. Nothing leaves the machine.
You now have the full local stack: Ollama for the model, OpenClaw or your framework of choice for the agent, MCP for the tools. Next is the builder capstone — design (not code) an agent for your own life.