AI doesn't read words — it reads tokens. Knowing the difference makes you a better prompter.
The big idea: AI thinks in tokens, not words. Once you see the split, you write better prompts.
Every AI call burns GPU compute, which providers price per million tokens of input and output. As of 2026, GPT-5 costs roughly $5/million input tokens and $20/million output tokens; Claude Sonnet 4.5 is ~$3/$15. A typical paragraph-length response is ~300 output tokens, so a single ChatGPT response costs the company a fraction of a cent. This explains a lot: why free tiers throttle, why heavy vision/voice use gets cut off, and why building your own AI app is now within a teen's budget ($20 of API credit can run hundreds of experiments).
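The arithmetic above is worth doing yourself. Here's a minimal sketch, using the lesson's example prices as assumptions (GPT-5 at $5/million input tokens and $20/million output tokens) and a made-up 100-token prompt:

```python
def call_cost(input_tokens: int, output_tokens: int,
              in_price_per_m: float, out_price_per_m: float) -> float:
    """Dollar cost of one API call, given per-million-token prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# A typical exchange: ~100 tokens of prompt, ~300 tokens of reply,
# priced at the lesson's example GPT-5 rates.
cost = call_cost(100, 300, 5.0, 20.0)
print(f"${cost:.4f} per response")            # → $0.0065 per response
print(f"responses per $20: {int(20 / cost)}")  # → responses per $20: 3076
```

Run it and you'll see why "$20 of credit" stretches so far: at these rates a single response costs well under a cent.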
Open OpenAI's tokenizer (platform.openai.com/tokenizer). Paste any paragraph of yours and see exactly how many tokens it is. Now you can think in tokens, not words.
AI reads in 'tokens' — chunks like words or word-pieces. Each model has a max it can hold (the 'context window'). When you exceed it, the oldest stuff drops off — that's why ChatGPT 'forgets' something from earlier. GPT-4o holds about 128k tokens (~96k words). Claude can hold a million. Knowing this means you stop blaming the AI and start managing context.
Ask AI: 'How many tokens are you holding right now?' Some can answer. Then count the words in your chat and divide by 0.75 to estimate the token count yourself.
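The divide-by-0.75 trick from the exercise above can be turned into a tiny estimator. This is a rough sketch of the rule of thumb (1 token ≈ 0.75 English words), not a real tokenizer; for exact counts you'd still use the tokenizer demo:

```python
def estimate_tokens(text: str) -> int:
    """Rough token count: 1 token is about 0.75 English words."""
    words = len(text.split())
    return round(words / 0.75)

def tokens_to_words(tokens: int) -> int:
    """Go the other way: roughly how many words fit in a token budget."""
    return int(tokens * 0.75)

print(estimate_tokens("Explain why the sky is blue in one short paragraph."))
# 10 words → about 13 tokens

print(tokens_to_words(128_000))
# → 96000: the lesson's "~96k words" for a 128k-token context window
```

Same math, both directions: words ÷ 0.75 gives tokens, tokens × 0.75 gives words. That's how a 128k-token context window works out to roughly 96k words.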
15 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-builders-foundations-AI-and-what-a-token-actually-is-teen
What does an AI language model actually read when processing your text input?
Which of these words is MOST likely to be split into multiple tokens by a tokenizer?
If a prompt contains 100 tokens, approximately how many words would that typically represent in English?
Why might writing a very concise prompt save a user money?
Which of these names would a tokenizer be MOST likely to split into multiple tokens?
What is the process called that converts text into tokens for an AI to process?
A student writes two prompts with the same meaning: Prompt A is 50 words, Prompt B is 30 words. If both use typical English, which is likely to use fewer tokens?
The lesson mentions 'BPE' as a key term. What is BPE?
A user writes 'The quick brown fox jumps over the lazy dog.' How many tokens would this sentence most likely contain?
Which statement best captures the 'big idea' from this lesson?
A user includes this unusual word in their prompt: 'Supercalifragilisticexpialidocious'. How will the tokenizer likely handle this?
If you want to reduce the token count of a prompt without changing its meaning, which approach would help most?
The lesson suggests using a tokenizer demo. What would you observe when using such a tool?
Compare these two prompts with identical instructions: Prompt 1 uses simple common words, Prompt 2 uses technical jargon and rare terms. Which will likely use more tokens?
What happens when you write a prompt that is extremely long?