Choosing a Local Model: Llama, Mistral, Hermes, Qwen, DeepSeek, and Friends
There are too many open-weight models. A short, opinionated tour of the major families and what each is actually good at.
Lesson map
What this lesson covers
Learning path
The main moves in order
1. The open-weight skyline
2. open weights
3. model family
4. fine-tune
Section 1
The open-weight skyline
Open-weight models cluster into a few families with distinct personalities and strengths. Within each family, sizes range from 1B (laptop-friendly) to 70B+ (small cluster). Knowing the families saves you from drowning in Hugging Face.
Compare the options
| Family | Origin | Sweet spot | Reputation |
|---|---|---|---|
| Llama | Meta | General purpose, broad ecosystem | The default — well-supported |
| Mistral / Mixtral | Mistral AI (France) | Efficient, strong reasoning per parameter | European, MoE-friendly |
| Qwen | Alibaba | Coding, multilingual, long context | Often best-in-class at small sizes |
| DeepSeek | DeepSeek (China) | Reasoning and coding | Punches well above its size |
| Hermes / Nous | Community fine-tunes | Chattier, less refusal-y | Fine-tunes of base models |
| Phi | Microsoft Research | Tiny but capable | Great for embedded / edge |
| Gemma | Google | Light, well-tuned | Polished, conservative |
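Before filtering by family, check what your hardware can actually hold. A quick back-of-envelope rule: weight memory is roughly parameter count times bits per weight, divided by eight, plus some overhead. The sketch below uses an assumed 1.2x overhead factor and covers weights only; a real deployment also needs room for the KV cache.

```python
# Rough memory needed to hold a model's weights at a given quantization.
# The 1.2x overhead factor is an assumption; KV cache is NOT included.
def weight_memory_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Approximate GB of memory for model weights alone."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * overhead / 1e9

# A 7B model quantized to 4 bits needs about 4.2 GB for weights:
print(round(weight_memory_gb(7), 1))
# A 70B model at 16-bit precision needs well over 100 GB — cluster territory:
print(round(weight_memory_gb(70, bits_per_weight=16), 1))
```

This is why the laptop recommendations below all sit around 7-8B: at 4-bit quantization they fit comfortably in 8 GB of RAM or VRAM.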
How to pick — by job, not by hype
1. Coding assistant on a laptop: Qwen-coder, Llama-code, or DeepSeek-coder at 7-8B
2. General chat with broad world knowledge: Llama 3.x or 4.x at the largest size that fits
3. Long-context document analysis: Qwen long variants; strong tokenization for non-English
4. Small device or embedded: Phi or Gemma at 1-3B
5. Fewer refusals (research-only, not production): a Hermes or Nous fine-tune
Reading a model card before you commit
- License: not all open-weight models are commercially usable
- Context length: advertised vs. effective often differ
- Tokenizer: matters for non-English performance and cost estimation
- Base vs Instruct: are you getting the chat-tuned version?
- Eval scores: take vendor numbers with a grain of salt; trust independent leaderboards more
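The checklist above can be turned into a small vetting function. The field names below are assumptions, not a real model-card schema — adapt them to whatever the card you are reading actually states.

```python
# Hypothetical model-card checklist. The dict keys are illustrative
# assumptions, not a standard schema — map them to the real card's fields.
def vet_model_card(card: dict) -> list[str]:
    """Return a list of red flags worth checking before committing."""
    warnings = []
    if "commercial" not in card.get("license_terms", ""):
        warnings.append("license may forbid commercial use")
    if card.get("effective_context", 0) < card.get("advertised_context", 0):
        warnings.append("effective context shorter than advertised")
    if not card.get("instruct_tuned", False):
        warnings.append("base model: expect raw completions, not chat")
    return warnings

# Example: a research-only base model with an optimistic context claim.
card = {"license_terms": "research only",
        "advertised_context": 128_000,
        "effective_context": 32_000,
        "instruct_tuned": False}
for w in vet_model_card(card):
    print("warning:", w)
```

None of these checks replace reading the card, but encoding them keeps you from skipping a step when comparing half a dozen candidates.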
Apply this
- Pick three models from three different families that fit your hardware
- Run the same five-prompt eval set on each
- Pick the one that wins on your real task — not the one with the highest leaderboard rank
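The three-model bake-off above is a simple loop. The sketch below stubs out the model call and the judging step with placeholder lambdas, since how you query each model (Ollama, a llama.cpp server, an API) and how you score answers are up to you; the model names are examples, not recommendations.

```python
# Sketch of the three-model, five-prompt bake-off. `ask` and `judge`
# are stand-ins (assumptions) for your real model call and scoring rule.
def run_eval(models, prompts, ask, judge):
    """Score each model on every prompt; return {model: total_score}."""
    scores = {}
    for model in models:
        scores[model] = sum(judge(p, ask(model, p)) for p in prompts)
    return scores

# Toy stand-ins so the sketch runs; replace with real calls and judging.
ask = lambda model, prompt: f"{model} answer to {prompt}"
judge = lambda prompt, answer: 1 if "answer" in answer else 0

results = run_eval(
    ["llama-3-8b", "qwen-2.5-7b", "mistral-7b"],   # example model names
    ["p1", "p2", "p3", "p4", "p5"],                # your five real prompts
    ask, judge)
print(results)
```

Keep the prompt set fixed across models and drawn from your actual workload; a leaderboard-style generic prompt set defeats the point of the exercise.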
Key terms in this lesson
The big idea: the right local model is the one that wins on your prompts, on your hardware. Family names are a starting filter, not an answer.
