Lesson 10 of 2116
The Three Ingredients: Data, Compute, Algorithms (Capstone)
Every AI breakthrough of the past decade rests on three interacting ingredients. Synthesize everything you have learned into one working model.
Lesson map
What this lesson covers
Learning path
The main moves in order
- 1. The Ingredient Model
- 2. Data
- 3. Compute
- 4. Algorithms
Section 1
The Ingredient Model
At the broadest level, every AI system in history is a mix of three ingredients. Data provides the examples. Compute runs the calculations. Algorithms determine how the calculations fit together. A major shift in any one of the three has opened a new era.
The ingredients mapped to history
Compare the eras
| Era | Data | Compute | Algorithm |
|---|---|---|---|
| 1980s expert systems | Handwritten rules | Desktop CPUs | Rule-based inference |
| 2000s statistical ML | Small curated datasets | Commodity servers | SVM, decision trees |
| 2010s deep learning | ImageNet-scale labeled sets | GPUs in parallel | Convolutional and recurrent nets |
| 2020s LLM era | Internet-scale text + human feedback | Datacenter-scale clusters | Transformer with RLHF |
| 2025-2026 reasoning era | Synthetic + verifiable traces | Test-time inference scaling | RL on self-generated reasoning |
How the ingredients interact
- Data bottlenecks: public text ran out, driving synthetic and multimodal data
- Compute bottlenecks: GPU supply and energy availability cap scaling
- Algorithm bottlenecks: pure transformers plateau, pushing MoE, state space, and hybrid architectures
- Adding one ingredient often just exposes a shortage of another
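The last bullet can be captured in a toy budget model. This is an illustrative sketch, not a real scaling law: it assumes capability is gated by whichever ingredient is scarcest, so doubling an abundant ingredient barely moves the needle.

```python
# Toy bottleneck model: capability is limited by the scarcest ingredient.
# The min() rule and the numbers below are illustrative assumptions.

def capability(data: float, compute: float, algorithms: float) -> float:
    """Effective capability, gated by whichever ingredient is scarcest."""
    return min(data, compute, algorithms)

baseline = capability(data=1.0, compute=1.0, algorithms=1.0)

# Doubling compute alone does nothing once data is the bottleneck:
more_compute = capability(data=1.0, compute=2.0, algorithms=1.0)
assert more_compute == baseline  # the shortage just moved to data

# Progress requires raising the scarce ingredient:
more_data = capability(data=2.0, compute=2.0, algorithms=1.0)
assert more_data == baseline  # now algorithms are the bottleneck
```

The point of the `min()` shape is exactly the bullet above: adding one ingredient often just exposes a shortage of another.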
Walkthrough: mentally train your own tiny LLM
- 1. Choose your data: 20B tokens of curated web + code (roughly 20 tokens per parameter, Chinchilla-style)
- 2. Choose your compute: 8 H100 GPUs for 72 hours
- 3. Choose your architecture: a 1B-parameter decoder transformer with rotary embeddings and grouped-query attention
- 4. Tokenize the corpus with a 32k-token BPE vocabulary
- 5. Train with AdamW, a cosine learning-rate schedule, and 2048-token sequences
- 6. Evaluate on held-out perplexity and a small MMLU slice
- 7. Fine-tune on 50k human-rated instruction pairs
- 8. Apply RLHF or DPO for a final style pass
- 9. Deploy via vLLM with structured output support
- 10. Monitor for drift and retrain quarterly
Every decision here is a bet on one of the three ingredients.
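Step 5's optimizer settings can be made concrete. Below is a minimal cosine learning-rate schedule with linear warmup, a common pairing with AdamW; the warmup length, peak, and floor values are illustrative choices, not prescribed by the walkthrough.

```python
import math

def cosine_lr(step: int, max_steps: int, peak_lr: float = 3e-4,
              warmup_steps: int = 2000, min_lr: float = 3e-5) -> float:
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# The LR ramps up during warmup, peaks, then decays toward the floor:
print(cosine_lr(0, 100_000))        # tiny warmup value
print(cosine_lr(2000, 100_000))     # peak (3e-4)
print(cosine_lr(100_000, 100_000))  # floor (3e-5)
```

The schedule is just a scalar function of the step count, which is why it appears in the config below as a single string: the training loop calls something like this each step and hands the result to the optimizer.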
```python
# Minimal training config sketch
config = {
    "model": {
        "layers": 24,
        "d_model": 2048,
        "n_heads": 16,
        "kv_heads": 4,
        "vocab_size": 32000,
        "max_seq_len": 2048,
        "position_embedding": "rope",
    },
    "train": {
        "tokens": 20_000_000_000,  # ~20 tokens per param, Chinchilla-ish
        "batch_size": 512,
        "optimizer": "adamw",
        "lr": 3e-4,
        "schedule": "cosine",
    },
    "data": {
        "sources": ["fineweb", "the_stack_v2", "textbooks_synthetic"],
        "dedup": True,
        "language_filter": "en",
    },
}
```

Capstone exercise
- Pick a real AI product you use (chat, code assistant, image generator).
- Estimate which ingredient is its main competitive advantage and why.
- Predict what will change first for that product: the data, the compute, or the algorithm.
- Identify what signal in the wild would confirm your prediction within 12 months.
- Write it down and revisit it later. You will be grading yourself on real-world feedback.
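To see why budgets dominate this kind of reasoning, run the arithmetic on the walkthrough's compute line item. The 8-GPU, 72-hour budget comes from the walkthrough; the 20B-token target is a Chinchilla-style assumption (about 20 tokens per parameter for a 1B-parameter model).

```python
# Back-of-envelope throughput check for the walkthrough's compute budget.
# GPU count and wall-clock time are the walkthrough's numbers; the token
# target is a Chinchilla-style assumption for a 1B-parameter model.

tokens = 20_000_000_000   # training tokens
gpus = 8
hours = 72

seconds = hours * 3600
tokens_per_second_total = tokens / seconds
tokens_per_second_per_gpu = tokens_per_second_total / gpus

print(f"{tokens_per_second_total:,.0f} tokens/s across the cluster")
print(f"{tokens_per_second_per_gpu:,.0f} tokens/s per GPU")
```

That works out to roughly 77,000 tokens/s for the cluster, or about 9,600 per GPU: a plausible rate for a 1B-parameter model on modern accelerators, so the three line items are at least mutually consistent.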
“Everything in AI is ultimately a budget problem with three line items.”
The big idea: AI is not magic. It is three ingredients, thousands of engineers, and careful attention to which ingredient is scarce this year. You now have the vocabulary, the history, and the mental model to reason about whatever comes next.
