Kimi K2 — long-context workflow
Moonshot's Kimi K2 specializes in long documents and retrieval-heavy workflows. Here is when it beats a generalist.
A document-first chat model
Kimi K2 is tuned for uploads and long-document chat. Its attention mechanisms and instruction tuning emphasize consistent recall across hundreds of pages.
- Strong on multi-document synthesis
- Bilingual (Chinese + English) out of the box
- Competitive context window, reported in the hundreds of thousands of tokens
- Agentic extensions for browser and file tools
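To make the multi-document point concrete, here is a minimal sketch of packing several files into one long prompt under a rough token budget. Everything in it is an illustrative assumption rather than part of Kimi's tooling: the `Doc` class, the `build_synthesis_prompt` helper, the `<document>` wrapper tags, and the 4-characters-per-token estimate (a crude English-text heuristic; Chinese text and Kimi's real tokenizer will count differently).

```python
from dataclasses import dataclass

# Assumption: crude heuristic for English text, not Kimi's actual tokenizer.
CHARS_PER_TOKEN = 4


@dataclass
class Doc:
    """Hypothetical container for one uploaded file."""
    name: str
    text: str


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def build_synthesis_prompt(docs, question, budget_tokens):
    """Pack whole documents, in order, while they fit the rough budget.

    Returns the prompt string and the names of documents left out.
    """
    header = "Answer using only the documents below. Cite document names.\n"
    footer = f"\nQuestion: {question}\n"
    used = estimate_tokens(header) + estimate_tokens(footer)
    parts, skipped = [header], []
    for doc in docs:
        block = f'\n<document name="{doc.name}">\n{doc.text}\n</document>\n'
        cost = estimate_tokens(block)
        if used + cost > budget_tokens:
            # Skip documents that would overflow instead of truncating them.
            skipped.append(doc.name)
            continue
        parts.append(block)
        used += cost
    parts.append(footer)
    return "".join(parts), skipped
```

Skipping whole documents rather than truncating them keeps every included file intact, which matters when the question depends on cross-references between files. The returned `skipped` list tells you what to send in a follow-up turn.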
Compare the options
| Task | Kimi K2 | Gemini 2.5 Pro | Grok 4.1 Fast |
|---|---|---|---|
| Multi-doc synthesis | Excellent | Excellent | Good |
| Chinese legal/finance | Excellent | Good | Good |
| Price | $$ | $$ | $ |
| Long-context QPS | Moderate | High | High |
Moonshot's API mirrors OpenAI's chat-completions interface; the 128k and longer-context variants carry the Kimi brand.
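Because the interface follows the OpenAI shape, the request itself can be sketched with nothing but the standard library. This is a minimal sketch under stated assumptions: the base URL is assumed and should be confirmed in your Moonshot console, `build_chat_request` and `send` are hypothetical helper names, and the response is assumed to follow the OpenAI chat-completions layout.

```python
import json
import urllib.request

# Assumption: Moonshot's OpenAI-compatible base URL; verify it for your account/region.
MOONSHOT_BASE_URL = "https://api.moonshot.cn/v1"


def build_chat_request(api_key, prompt, model="moonshot-v1-128k",
                       base_url=MOONSHOT_BASE_URL):
    """Build (but do not send) an OpenAI-style chat-completions POST request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and return the first choice's text.

    Assumes the response uses the OpenAI layout: choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the official `openai` Python SDK, the equivalent is the call shown next, where `kimi_client` would be an `OpenAI(api_key=..., base_url=...)` client pointed at the same endpoint.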
    resp = kimi_client.chat.completions.create(
        model="moonshot-v1-128k",
        messages=[{"role": "user", "content": long_doc_prompt}],
    )

Workflow tip
Kimi's UI handles drag-and-drop of dozens of files at once, which is smoother than most Western chat UIs for heavy research. Even if you ship on a different model, Kimi can be the research scratchpad.
