Ollama turns 'I want to run an LLM locally' into a one-line install and a two-word command. Here's the stack, the key commands, and the models worth pulling first.
Ollama is a command-line tool that downloads, manages, and serves local LLMs. Under the hood it uses llama.cpp, a highly optimized open-source inference engine, and ships models in GGUF format. You get a localhost API that any app can call — including Claude Code, CrewAI, LangGraph, and OpenClaw.
# macOS: one-line install
brew install ollama
# Or download the app from ollama.com
# Start the background server
ollama serve &
# Pull and run Llama 4 (8B — works on most modern laptops)
ollama run llama4:8b
# You're now chatting with a local model. Ctrl-D to exit.

From zero to local model in three commands.

ollama list # what's installed
ollama pull qwen3.5:8b # download (no run)
ollama run gemma4:4b # download if needed, then chat
ollama rm llama4:70b # free disk space
ollama ps # what's loaded in memory
ollama show qwen3.5:8b # details: params, context, quant
# API access (OpenAI-compatible)
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3.5:8b",
"messages": [{"role": "user", "content": "Hello"}]
}'

Core Ollama commands and the OpenAI-compatible API.

| Model | Size | Best for |
|---|---|---|
| llama4:8b | ~5 GB | General chat, balanced speed/quality. |
| qwen3.5:8b | ~5 GB | Strong function calling + coding. |
| gemma4:12b | ~8 GB | Google's frontier-at-size model, reasoning-tuned. |
| qwen3.5:32b | ~20 GB | Near-frontier quality on a 32 GB Mac. |
| deepseek-coder:16b | ~10 GB | Code-focused. Fast on a laptop GPU. |
| llama4:70b | ~40 GB | Highest quality. Needs a Mac Studio or a real GPU box. |
Because Ollama exposes an OpenAI-compatible API at localhost:11434, it drops into any agent framework that supports OpenAI. Point CrewAI, LangGraph, OpenClaw, or AutoGen at that URL and they'll happily run against your local model. As of March 2026, llama.cpp merged full MCP client support — meaning you can plug MCP servers (GitHub, Notion, Supabase) into a local Qwen or Llama too.
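That same endpoint is easy to hit from plain Python as well. Here is a minimal stdlib-only sketch (no SDK required), assuming Ollama's default port and a model you've already pulled — the model name and prompt are just examples:

```python
import json
import urllib.request

# Assumed default: Ollama's OpenAI-compatible endpoint on its standard port.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    """Build the OpenAI-style chat payload that Ollama's /v1 endpoint accepts."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(model: str, prompt: str) -> str:
    """POST one chat turn to the local server and return the assistant's reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible response shape: choices[0].message.content holds the text.
    return body["choices"][0]["message"]["content"]

# chat("qwen3.5:8b", "Hello")  # requires `ollama serve` running locally
```

Because the payload and response shapes match OpenAI's, swapping a cloud model for a local one is usually just a base-URL change in whatever client or framework you already use.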
# OpenClaw talking to local Ollama — no cloud, no real credentials
# Pull once, then run against the local server at localhost:11434.
ollama pull qwen3.5:8b
openclaw config set backend local-ollama
openclaw config set model qwen3.5:8b
openclaw run "organize my Downloads folder"
# All traffic stays on localhost. Nothing leaves the machine.

Point OpenClaw at a local Ollama model. No cloud provider involved; no API key required.

You now have the full local stack: Ollama for the model, OpenClaw or your framework of choice for the agent, MCP for the tools. Next is the builder capstone: design (not code) an agent for your own life.
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-agentic-ollama-basics-builders