One smart agent is fine. Two agents checking each other's work is better. Master the canonical orchestration patterns: planner/executor, judge/worker, debate, and swarm.
A single agent trying to do everything in one context window hits limits: context bloat, role confusion, and weak self-critique. Splitting the work across specialized agents with narrow roles is one of the cheapest ways to add reliability. The patterns below are documented in production systems at Anthropic and OpenAI and in the research literature.
| Agent | Role | Model tier |
|---|---|---|
| Planner | Breaks the goal into ordered steps. | Smartest model (Opus 4.7, GPT-5). |
| Executor | Runs each step. Uses tools. | Mid tier (Sonnet 4.6, GPT-5-mini). |
| Verifier | Checks result against original goal. | Smart + strict (Opus 4.7 at low temp). |
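```python
# Simplified planner/executor/verifier loop.
# planner, executor, verifier, TOOLS, and log are placeholders for your own code.
goal = "Migrate all CSV files in /data to parquet, preserving schemas."

steps = list(planner(goal).steps)  # ordered steps
history = []
i = 0
while i < len(steps):
    step = steps[i]
    result = executor(step, tools=TOOLS)  # has MCP + shell + file access
    ok, notes = verifier(step, result, goal)
    log(step, result, ok)
    history.append((step, result, ok))
    if not ok:
        # Replace the failed step and everything after it with a fresh plan.
        # Real code should cap the number of replans to avoid looping forever.
        fix = planner(f"Step failed: {notes}. Replan from here.")
        steps[i:] = fix.steps
        continue
    i += 1

final_ok, summary = verifier("final", history, goal)
```

Planner writes the plan. Executor runs it. Verifier checks. Replan on failure.

In the judge/worker pattern, you spawn N workers to attempt the same task with different prompts or temperatures. A judge scores their outputs and returns the best. Variants of this best-of-N approach appear in AlphaCode, Anthropic's research tooling, and many top SWE-bench submissions. The trade is simple: more compute for better results.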
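The judge/worker idea can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `best_of_n`, `Candidate`, `fake_worker`, and `fake_judge` are hypothetical names, and the two `fake_*` functions are deterministic stand-ins for real model calls so the flow can be run without an API key.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    worker_id: int
    temperature: float
    output: str
    score: float = 0.0


def best_of_n(
    task: str,
    worker: Callable[[str, float], str],
    judge: Callable[[str, str], float],
    temperatures: List[float],
) -> List[Candidate]:
    """Run one worker per temperature, score every output, return best-first."""
    candidates = [
        Candidate(i, t, worker(task, t)) for i, t in enumerate(temperatures)
    ]
    for c in candidates:
        c.score = judge(task, c.output)
    return sorted(candidates, key=lambda c: c.score, reverse=True)


# Hypothetical stand-ins for model calls (assumptions, not a real API).
def fake_worker(task: str, temperature: float) -> str:
    # Pretend that a higher temperature produces a more emphatic answer.
    return f"{task}: answer" + "!" * round(temperature * 10)


def fake_judge(task: str, output: str) -> float:
    # Toy rubric: outputs closest to five exclamation marks score highest.
    return -abs(output.count("!") - 5)


ranked = best_of_n("Summarize the report", fake_worker, fake_judge, [0.2, 0.5, 0.9])
best = ranked[0]
print(best.worker_id, best.temperature, best.score)
```

In a real system the judge would usually be a separate model call with a strict rubric, and the workers would run concurrently; the sorting step stays the same.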
In the debate pattern, two agents argue opposite sides of a question. A third agent reads the debate and picks a winner. This is effective for subjective tasks (editorial decisions, design tradeoffs) where a single pass lacks rigor. OpenAI's "AI safety via debate" research explores this setup, and Anthropic's Constitutional AI (CAI) pipeline uses a related critique-and-revise loop.
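A debate loop can be sketched as alternating turns over a shared transcript, followed by one judging call. The names here (`run_debate`, `pro_stub`, `con_stub`, `judge_stub`) are hypothetical, and the stubs are deterministic placeholders standing in for model calls; treat this as a sketch under those assumptions.

```python
from typing import Callable, List, Tuple

# (question, side, transcript so far) -> next argument
Debater = Callable[[str, str, List[str]], str]
# (question, full transcript) -> name of the winning side
Judge = Callable[[str, List[str]], str]


def run_debate(
    question: str, pro: Debater, con: Debater, judge: Judge, rounds: int = 2
) -> Tuple[str, List[str]]:
    """Alternate pro/con turns for a fixed number of rounds, then ask the judge."""
    transcript: List[str] = []
    for r in range(1, rounds + 1):
        for side, debater in (("pro", pro), ("con", con)):
            argument = debater(question, side, transcript)
            transcript.append(f"[{side} r{r}] {argument}")
    return judge(question, transcript), transcript


# Hypothetical stubs; real debaters and judges would be separate model calls.
def pro_stub(question: str, side: str, transcript: List[str]) -> str:
    return "evidence: smaller PRs are reviewed faster"


def con_stub(question: str, side: str, transcript: List[str]) -> str:
    return "large PRs keep related changes together"


def judge_stub(question: str, transcript: List[str]) -> str:
    # Toy rule: the side that cites evidence more often wins.
    pro_hits = sum("evidence" in t for t in transcript if t.startswith("[pro"))
    con_hits = sum("evidence" in t for t in transcript if t.startswith("[con"))
    return "pro" if pro_hits > con_hits else "con"


winner, transcript = run_debate(
    "Should we enforce small PRs?", pro_stub, con_stub, judge_stub
)
print(winner)
print("\n".join(transcript))
```

Passing the running transcript to each debater is what lets later turns rebut earlier ones; the judge sees only the finished transcript, not the debaters' prompts.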
In the swarm pattern, a coordinator sends the same input to specialist agents (e.g., "legal reviewer", "UX reviewer", "accessibility reviewer") and merges their feedback. This beats one generalist because each specialist can have a narrower, sharper system prompt and a different MCP toolset. CrewAI and Microsoft Agent Framework lean into this pattern.
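The fan-out and merge step can be sketched with a thread pool from the standard library. The coordinator functions (`fan_out`, `merge`) and the three reviewer stubs are hypothetical; in practice each reviewer would wrap a model call with its own system prompt and toolset.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Specialist = Callable[[str], str]


def fan_out(document: str, specialists: Dict[str, Specialist]) -> Dict[str, str]:
    """Send the same document to every specialist in parallel."""
    with ThreadPoolExecutor(max_workers=len(specialists)) as pool:
        futures = {name: pool.submit(fn, document) for name, fn in specialists.items()}
        return {name: future.result() for name, future in futures.items()}


def merge(feedback: Dict[str, str]) -> str:
    """Combine per-specialist notes into one report, ordered by specialist name."""
    return "\n".join(f"{name}: {note}" for name, note in sorted(feedback.items()))


# Hypothetical reviewer stubs standing in for model-backed agents.
def legal_reviewer(doc: str) -> str:
    return "add a privacy notice" if "email" in doc else "no issues"


def ux_reviewer(doc: str) -> str:
    return "shorten the form" if len(doc) > 40 else "no issues"


def accessibility_reviewer(doc: str) -> str:
    return "label every input field"


specialists = {
    "legal": legal_reviewer,
    "ux": ux_reviewer,
    "accessibility": accessibility_reviewer,
}
feedback = fan_out("Signup form that collects name, email, and phone number", specialists)
report = merge(feedback)
print(report)
```

A simple string merge is the weakest part of this sketch; a real coordinator would more likely hand the collected feedback to another agent to deduplicate and prioritize.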
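```python
from typing import List, TypedDict

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, StateGraph


class State(TypedDict):
    goal: str
    plan: List[str]
    current_step: int
    results: List[dict]
    verdict: str


def route_after_verify(state: State) -> str:
    # Check the verdict first so a failed step triggers a replan
    # even when more steps remain in the plan.
    if state["verdict"] == "fail":
        return "replan"
    if state["current_step"] < len(state["plan"]):
        return "execute"
    return END


# plan_fn, execute_fn, verify_fn, and replan_fn are your own node functions:
# each takes the State and returns a partial state update.
graph = StateGraph(State)
graph.add_node("plan", plan_fn)
graph.add_node("execute", execute_fn)
graph.add_node("verify", verify_fn)
graph.add_node("replan", replan_fn)

graph.set_entry_point("plan")
graph.add_edge("plan", "execute")
graph.add_edge("execute", "verify")
graph.add_conditional_edges("verify", route_after_verify)
graph.add_edge("replan", "execute")

# MemorySaver keeps checkpoints in process memory;
# use a persistent checkpointer when state must survive restarts.
app = graph.compile(checkpointer=MemorySaver())
```

Planner/executor/verifier as an explicit state machine. Checkpointers let you pause, rewind, and resume.

Next lesson: how to build the planner/executor/verifier in LangGraph.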
8 questions · take it digitally for instant feedback at tendril.neural-forge.io/learn/quiz/end-agentic-multi-agent-patterns-creators
What is the main idea of "Multi-Agent Orchestration: Planner + Executor + Verifier"?
Which concept is most central to "Multi-Agent Orchestration: Planner + Executor + Verifier"?
Which use of AI fits this topic best?
What should a careful learner remember about "More agents ≠ better by default"?
You want to use AI after this lesson. What is the safest next step?
How should AI output about orchestration be treated?
Name one way to verify an AI answer about orchestration.
Which action would help you apply "Multi-Agent Orchestration: Planner + Executor + Verifier" responsibly?