Loading lessons…
Creators · Ages 14–17
The full LLM pipeline, agentic AI with OpenClaw + Ollama, subscription-tier literacy, and a real capstone.
Meet your guide: Atlas — a minimal octahedron
Your progress
Loading your progress…
Where should I start?
Chapters
Modules · 72
Open-source AI is both a technical movement and a political one. Understand the arguments so you can pick a stack and defend it.
Alignment is not a vibes debate. It is a concrete technical problem about getting systems to pursue goals we actually want. Here is what researchers work on when they say they work on alignment.
Most predictions about AI and jobs are either panic or dismissal. Here is what the best evidence through 2025 actually shows — including what is overstated.
The AI safety ecosystem is small, influential, and often misunderstood. Here is who does what, how they get funded, and how to tell real work from rhetoric.
Model Context Protocol is the most important open standard in agents. One protocol, 1,200+ servers, and your agent can plug into almost any system. Here's how it actually works.
Everything comes together. Design, code, test, secure, and ship a production-quality agent with open-source code you can fork today.
Assemble the four or five AI tools that actually belong in your daily life. A tested template for the stack that earns its keep.
Claude Projects, ChatGPT Projects, Notion AI, Perplexity Spaces. How persistent context changes AI from search box to actual assistant.
Perplexity Comet is a full web browser that treats AI as a first-class citizen. It reads, summarizes, and acts on pages you visit.
AP Bio has roughly a thousand terms and four big concepts. NotebookLM and Claude Projects can turn your textbook into a custom tutor that actually knows what you are studying.
Debate rewards knowing the other side's best argument better than they do. AI is built for exactly this kind of fast, balanced research.
Ambient scribes, diagnostic copilots, and evidence engines sit in every exam room. Here is what a physician's workday now looks like — and what still rests on your judgment.
Literature review in minutes, protein structures on demand, AI-proposed drug candidates. The discovery cycle has compressed — but the human posing the question still sets the direction.
NVIDIA GR00T, Physical Intelligence π0, and Figure Helix took the vision-language-action paradigm from research paper to factory floor. This is the hottest hardware-software frontier.
Harvey and CoCounsel research case law, draft briefs, and summarize depositions. The paralegal-and-first-year tier of the profession is genuinely shrinking. The judgment tier is thriving. What AI touches Legal research — Lexis+ AI, Westlaw Precision, Paxton AI, vLex Vincent search and synthesize case law.
The role has inverted: paralegals who used to do research and doc prep now direct the AI that does it. The job is not gone — but it is changing faster than any legal role.
AlphaSense, Hebbia, and Bloomberg GPT read every filing before you do. The edge is the question you ask and the thesis you write.
McKinsey Lilli, Gamma, and Claude generate first-draft slides and research in minutes. The real consulting work — client relationships and implementation — is more human than ever.
v0, Linear AI, and Dovetail synthesize research, draft PRDs, and ship prototypes in hours. The PM role has leveled up from communicator to quasi-builder.
Species identification from underwater footage used to take a season. A model trained on 8 million fish does it in a single afternoon.
Generative imagery, 3D garment sim, and on-demand pattern-making have collapsed the front end. Taste is still the scarce resource.
AI runs the research and drafts the decks. The strategist still has to decide what a brand means.
Wildfire detection, wildlife cameras, and visitor demand modeling changed the job. The ranger still walks the trail at dawn.
Codex CLI is OpenAI's open-source terminal coding agent. Look at how it compares to Claude Code, what it does uniquely, and why it matters to non-Anthropic shops.
Consensus searches 200M+ academic papers and gives evidence-based answers. Deep look at how researchers use it, what it does differently from Perplexity, and its limits.
Elicit automates slow parts of academic research: finding papers, extracting data, building literature matrices. Look at what it saves PhDs 20 hours a week.
What a constitution actually contains, how the training loop works, where the research is now, and the honest trade-offs.
Sparse autoencoders, features, circuits. How researchers try to see what a model actually thinks, and why it may be the most strategically important safety work.
A golden dataset is a curated set of hard, representative examples you trust completely. It is the backbone of every serious eval.
Some capabilities grow smoothly with scale. Others seem to appear out of nowhere. Telling them apart is a whole research program. The Big Question Is AI capability a smooth climb or a staircase?
AI turns weeks of literature review into days — if you know how to use it. Here is a workflow that actually works.
NotebookLM turns a pile of PDFs into a searchable, askable brain. Here is how to build a research notebook that keeps paying dividends.
The norms for disclosing AI use in research are still being written. Here is the emerging consensus and how to stay on the right side of it.
The best way to truly understand an AI claim is to try it yourself. Here is how to run a small experiment that actually teaches you something.
Real data is expensive, private, or scarce. Synthetic data is generated by models themselves. It is rapidly becoming as important as scraped data.
Behind every supervised model is an army of human labelers. Understanding how labeling works is understanding who really builds AI.
The old mantra was more data always wins. The new reality is more complicated. Sometimes a small, hand-crafted dataset beats a giant messy one.
A data card is like a nutrition label for a dataset: who collected it, how, what is in it, and what it should not be used for.
If your training data is 90 percent men, your model will work worse for women. Representation bias is the most pervasive issue in AI.
Measurement bias happens when the thing you measure is a flawed stand-in for what you actually care about. It is subtle and surprisingly common.
Even accurate data can encode an unjust history. The COMPAS recidivism tool shows what happens when AI learns from a biased past.
Every labeled dataset has mistakes. Studies have found error rates of 3 to 6 percent in famous benchmarks like ImageNet. Noisy labels confuse models and mislead evaluations.
If two reasonable humans cannot agree on a label, neither can a model. Inter-annotator agreement tells you if a task is even well-defined.
Small populations get hurt first when datasets are built carelessly. Fixing this requires intentional collection, not just better algorithms.
AI has a geography problem. Training data over-represents North America and Europe, and it shows in subtle and not-so-subtle ways.
English is 6 percent of the world's speakers but 50+ percent of the training data. This asymmetry shapes every model we use.
A data audit is a structured process to find bias, errors, and ethical issues before a model goes live. Every creator should know how.
Everyone wants to debias AI. But the literature is full of methods that look good on paper and fail in the wild. Here is the honest scorecard.
Saying the average is 50,000 dollars can mean three different things. Picking the wrong kind of average is how statistics starts lying to you.
Mean tells you the center. Variance and standard deviation tell you the spread. Without both, you are missing half the story.
Data comes in shapes. The shape determines which tools you can use, and which assumptions will silently betray you.
Some things grow multiplicatively, not additively. Log scales reveal patterns that linear scales hide, especially for anything related to scale or growth.
A trend that appears in every subgroup can reverse when you combine the groups. This is Simpson's Paradox, and it hides in plain sight.
A single weird value can distort your entire analysis. But outliers are also where the most interesting stories live. Knowing when to remove them is an art.
Resampling techniques draw new samples from your data to estimate uncertainty, balance classes, or validate models. It is one of the most underused superpowers in statistics.
Bootstrapping estimates the uncertainty of any statistic, even when you have no clean mathematical formula. It is simple, powerful, and surprisingly deep.
Ownership of data is not one question but a tangle of rights: copyright, contract, privacy, and control. Untangling them is essential for responsible use.
Violating a website's Terms of Service and violating copyright are different legal problems. Understanding the distinction is critical for data work. Fair use in training The argument AI companies make is that training is transformative fair use.
Europe's General Data Protection Regulation (2018) reshaped how the world handles personal data. Understanding its core concepts is now essential. In 2023, Italy briefly banned ChatGPT over GDPR concerns.
Thousands of companies you have never heard of trade your personal data every second. Understanding this invisible market is understanding modern privacy. Brokers and AI training Much training data for specialized models (ad targeting, credit scoring, risk assessment) comes from brokers.
Many AI companies now offer opt-outs from training. But how well do they actually work, and what are the catches?
A 30-year-old simple text file, robots.txt, is how the web has tried to regulate crawlers. The new ai.txt proposal aims to refine this for the AI era.
If you build a dataset, how you license it determines who can use it and how. Picking the right license matters as much as the data itself.
Removing names does not make data anonymous. Combinations of a few seemingly innocent fields can re-identify nearly anyone.
A complete walkthrough from question to shareable dataset. The first project is the hardest; this lesson gets you to the other side.
Jupyter is the data scientist's notebook. Code, output, and narrative in one document. Learning Jupyter well pays dividends for every future project.
Pandas is the Python library that made data science what it is today. Ten verbs get you through 90 percent of day-to-day data work.
These two formats are the bread and butter of data interchange. Handling them well means handling edge cases well.
Creating a dataset from scratch teaches you more than using someone else's. Here is how to build a high-quality small labeled dataset for a real task.
Hugging Face Hub is the GitHub of AI data and models. Uploading a dataset there makes it instantly accessible to millions of practitioners.
A 2015 paper from Microsoft Research let neural networks go 150 layers deep by adding a shortcut.
A lot of civics class is pretending you read the news. AI makes it possible to actually understand a bill, a court case, or a political ad in under ten minutes.