Multi-Agent Workflow Design Patterns
Created at: Mar 17, 2026 at 06:51:08 AM
Research into established multi-agent design patterns — the philosophical/architectural patterns themselves, not framework-specific implementations. Frameworks cited as evidence of adoption and naming conventions.
Pattern 1: Executor-Reviewer (Generator-Critic)
Core idea: Separate the act of doing from the act of judging. One agent produces work; another evaluates it against criteria. Neither role is complete without the other.
Why it matters: Mirrors how quality emerges in human organizations — writers need editors, developers need code review. The executor optimizes for completion; the reviewer optimizes for correctness. This tension produces better outcomes than either alone.
Variations:
- Single-pass review: Executor produces → Reviewer evaluates → done or reject
- Iterative refinement: Executor produces → Reviewer critiques → Executor revises → loop until satisfied
- Advisory review: Reviewer gives verdict but doesn’t block (ttal’s approach — reviewer posts LGTM/NEEDS_WORK, coder triages)
Trade-offs:
- Higher quality output at the cost of latency and compute
- Risk of infinite revision loops without termination conditions
- Reviewer must be genuinely independent (different prompt/context) to add value — a rubber-stamp reviewer wastes tokens
Seen in: LangGraph (generator-critic), Anthropic’s building-effective-agents blog (evaluator-optimizer), ttal (coder + reviewer windows), traditional software engineering (PR review)
Sources:
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
- Google Cloud, “Choose a design pattern for agentic AI” — https://docs.cloud.google.com/architecture/choose-design-pattern-agentic-ai-system
Pattern 2: Router (Dispatcher)
Core idea: A decision point that examines input and sends it to the right specialist. The router itself does minimal work — its job is classification and dispatch.
Why it matters: Avoids the “one agent does everything” anti-pattern. Specialized agents with narrow tool sets and focused prompts outperform general-purpose agents. The router is the switch that makes specialization possible.
Variations:
- Static routing: Rules-based dispatch (if tag=research → Athena, if tag=design → Inke)
- LLM routing: An LLM classifies the request and picks the destination
- Semantic routing with fallback: Lightweight classifier first, LLM only when confidence is low (cost optimization)
- Hierarchical routing: Router → sub-router → specialist (multi-level dispatch)
Trade-offs:
- Routing errors cascade — wrong specialist gets the task, wastes time
- Simple routers are fast but brittle; LLM routers are flexible but add latency
- Need clear capability descriptions for each destination agent
Seen in: LangGraph (conditional edges), OpenAI Swarm/Agents SDK (triage agent), ttal (ttal task route --to <agent>), CrewAI (process selection)
Sources:
- Google Cloud Architecture — https://docs.cloud.google.com/architecture/choose-design-pattern-agentic-ai-system
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
Pattern 3: Orchestrator (Central Coordinator)
Core idea: A single agent holds the big picture. It decomposes complex tasks, delegates subtasks to workers, tracks progress, and synthesizes results. The orchestrator thinks; workers execute.
Why it matters: Complex tasks require decomposition, and someone needs to own the decomposition. Without an orchestrator, agents either duplicate work or leave gaps. The orchestrator is the project manager.
Variations:
- Static orchestration: Predefined workflow graph (DAG of agent calls)
- Dynamic orchestration: Orchestrator decides subtasks at runtime based on input
- Hierarchical orchestration: Orchestrator delegates to sub-orchestrators who each manage a team
- Long-running orchestration: Persistent orchestrator that maintains state across sessions (ttal’s Yuki)
Key difference from Router: A router dispatches and forgets. An orchestrator dispatches, monitors, collects results, and makes decisions about what happens next.
Trade-offs:
- Single point of failure — if the orchestrator’s context window fills up, everything stalls
- Orchestrator overhead (tokens spent on coordination, not execution)
- Must balance micromanaging (too much control) against being hands-off (too little visibility)
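The decompose/delegate/synthesize loop can be sketched as follows. `decompose` and `worker` are hypothetical stand-ins for LLM calls; a real orchestrator would also track failures and retries:

```python
def decompose(task: str) -> list[str]:
    # Stand-in for LLM-driven planning: break the task into subtasks.
    return [f"{task}: step {i}" for i in (1, 2, 3)]

def worker(subtask: str) -> str:
    # Stand-in for a delegated worker agent.
    return f"done({subtask})"

def orchestrate(task: str) -> str:
    subtasks = decompose(task)
    results = {}
    for sub in subtasks:          # could also fan out to workers in parallel
        results[sub] = worker(sub)
    # Synthesis: the orchestrator combines worker outputs into one answer.
    return "; ".join(results[sub] for sub in subtasks)
```

Note that, unlike a router, `orchestrate` keeps the `results` map and produces the final synthesis itself, which is exactly the "dispatches, monitors, collects" distinction drawn above.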
Seen in: Anthropic’s orchestrator-workers workflow, LangGraph (supervisor pattern), CrewAI (hierarchical process with manager agent), ttal (Yuki as orchestrator + daemon as infrastructure)
Sources:
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
- CrewAI hierarchical process docs — https://docs.crewai.com/en/learn/hierarchical-process
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
Pattern 4: Pipeline (Sequential Chain)
Core idea: Work flows through a fixed sequence of stages. Each agent transforms the output and passes it forward. Assembly line for AI.
Why it matters: When the workflow is predictable and each stage has clear input/output contracts, pipelines are the simplest and most debuggable pattern. No routing decisions, no coordination overhead.
Variations:
- Linear pipeline: A → B → C → done
- Pipeline with gates: Each stage can reject/send-back to previous stage
- Fan-out/fan-in pipeline: Stage splits into parallel branches, then merges
- Research pipeline: ttal’s research → design → implementation flow (Athena → Inke → Worker)
Trade-offs:
- Simple and predictable, but inflexible for tasks that don’t fit the fixed sequence
- Stage coupling — if stage B’s contract changes, stage C breaks
- No good way to handle tasks that skip stages or need non-linear flow
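A linear pipeline reduces to function composition over a shared payload. This sketch uses hypothetical `research`/`design`/`implement` stages mirroring the tag flow above; the dict keys are illustrative, not part of any real contract:

```python
from functools import reduce

def research(topic: str) -> dict:
    return {"topic": topic, "findings": f"notes on {topic}"}

def design(doc: dict) -> dict:
    doc["plan"] = f"plan based on {doc['findings']}"
    return doc

def implement(doc: dict) -> dict:
    doc["artifact"] = f"built from {doc['plan']}"
    return doc

STAGES = [research, design, implement]

def run_pipeline(topic: str) -> dict:
    # Thread the payload through each stage in fixed order.
    return reduce(lambda payload, stage: stage(payload), STAGES, topic)
```

The stage-coupling trade-off is visible here: `design` reads the `findings` key that `research` wrote, so renaming that key in one stage silently breaks the next.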
Seen in: CrewAI (sequential process, the default), Anthropic’s prompt chaining workflow, Unix pipes philosophy, ttal’s task tag pipeline (+research → +design → execute)
Sources:
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
- CrewAI sequential process docs — https://docs.crewai.com/en/learn/sequential-process
Pattern 5: Debate / Consensus (Multi-Perspective Reasoning)
Core idea: Multiple agents independently reason about the same problem, then exchange perspectives and refine. Truth emerges from disagreement.
Why it matters: A single LLM has blind spots. Multiple agents with different prompts/temperatures/roles surface different aspects of a problem. Particularly effective for reasoning tasks where the “obvious” answer may be wrong.
Variations:
- Full debate: All agents see all other agents’ responses, iterate for N rounds
- Sparse debate: Agents only see neighbors’ responses (reduces token cost, maintains diversity)
- Majority voting: Simple aggregation — most popular answer wins (no iterative refinement)
- Judge-mediated: A separate judge agent evaluates competing arguments
Trade-offs:
- Expensive — N agents × M rounds of full-context reasoning
- Risk of convergence to groupthink (agents agree too quickly)
- Sparse communication topologies help maintain diversity but reduce cross-pollination
- Best for high-stakes decisions where correctness matters more than speed
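The majority-voting variation is the simplest to sketch. Here `agent_answer` is a hypothetical stand-in for N independent LLM calls with different prompts or temperatures; one dissenting agent is hard-coded to show a minority answer being outvoted:

```python
from collections import Counter

def agent_answer(agent_id: int, question: str) -> str:
    # Stand-in for independent LLM calls; agent 0 dissents.
    return "B" if agent_id == 0 else "A"

def majority_vote(question: str, n_agents: int = 5) -> str:
    """Simple aggregation: most popular answer wins, no debate rounds."""
    answers = [agent_answer(i, question) for i in range(n_agents)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

Full or sparse debate would add message exchange between the calls before voting; this sketch shows only the aggregation step.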
Seen in: AutoGen (multi-agent debate with sparse topology, based on arxiv:2406.11776), LangGraph (swarm pattern), academic research on LLM debate for reasoning
Sources:
- AutoGen multi-agent debate docs — https://microsoft.github.io/autogen/stable/user-guide/core-user-guide/design-patterns/multi-agent-debate.html
- “Improving Multi-Agent Debate with Sparse Communication Topology” — https://arxiv.org/abs/2406.11776
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
Pattern 6: Supervisor (Monitor-Correct)
Core idea: An agent watches other agents work and intervenes when things go wrong. Unlike an orchestrator that actively delegates, a supervisor passively monitors and only acts on deviation.
Why it matters: Autonomous agents make mistakes — hallucinate, get stuck in loops, pursue wrong approaches. A supervisor provides a safety net without the overhead of micromanagement.
Variations:
- Quality supervisor: Checks output quality, requests rework if below threshold
- Safety supervisor: Monitors for policy violations, harmful outputs, or security issues
- Resource supervisor: Tracks token usage, time, cost — kills runaway agents
- Human-in-the-loop supervisor: Escalates to humans when confidence is low
Key difference from Orchestrator: Orchestrator proactively plans and delegates. Supervisor reactively monitors and corrects. An orchestrator says “do X, Y, Z.” A supervisor says “I’m watching — carry on, but I’ll step in if needed.”
Trade-offs:
- Adds overhead for monitoring that may rarely trigger
- Must define clear intervention criteria (too sensitive = micromanaging, too loose = missing problems)
- Can create false sense of safety if supervisor itself has blind spots
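The resource-supervisor variation reduces to a budget check wrapped around each step. This sketch is an illustrative assumption, not a real library: the `ResourceSupervisor` class charges a token count before each step runs and raises once the budget is exceeded, standing in for killing a runaway agent:

```python
class BudgetExceeded(Exception):
    pass

class ResourceSupervisor:
    def __init__(self, token_budget: int):
        self.token_budget = token_budget
        self.tokens_used = 0

    def charge(self, tokens: int) -> None:
        self.tokens_used += tokens
        if self.tokens_used > self.token_budget:
            raise BudgetExceeded(
                f"used {self.tokens_used} of {self.token_budget} tokens"
            )

def run_with_supervision(steps, supervisor):
    """steps: list of (estimated_tokens, step_fn) pairs."""
    results = []
    for step_tokens, step_fn in steps:
        supervisor.charge(step_tokens)  # intervene before the work runs
        results.append(step_fn())
    return results
```

The intervention criterion here is a crisp numeric threshold; quality and safety supervisors need fuzzier criteria, which is where the too-sensitive/too-loose trade-off above bites.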
Seen in: LangGraph (supervisor pattern), Microsoft reference architecture (Supervisor Agent pattern), ttal (daemon monitors worker sessions via fsnotify, PR watcher polls CI status)
Sources:
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
- LangGraph supervisor docs — https://docs.langchain.com/oss/python/langgraph/workflows-agents
Pattern 7: Swarm (Peer-to-Peer Handoff)
Core idea: No central coordinator. Agents hand off control directly to each other based on the conversation state. Each agent decides who should handle the next step.
Why it matters: Eliminates the orchestrator bottleneck. Agents are autonomous peers. The system is emergent — the workflow arises from individual handoff decisions rather than top-down planning.
Variations:
- Simple handoff: Agent A → Agent B (one-way transfer of control)
- Handoff with context: Transfer includes updated context variables
- Bidirectional handoff: Agents can hand control back if they can’t handle the task
- Triage-first: Initial triage agent routes, then specialists hand off among themselves
Trade-offs:
- Flexible and decentralized, but less predictable than orchestrated patterns
- Hard to debug — no single place to see the full workflow
- Risk of circular handoffs (A → B → A → B…) without termination logic
- Works best when agent boundaries are clear and handoff criteria are well-defined
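The handoff loop with a guard against circular handoffs can be sketched as follows. The `triage` and `billing` agents and the `("handoff", name)` return convention are hypothetical illustrations, not the OpenAI Swarm API:

```python
def triage(state):
    # Each agent decides who handles the next step.
    if "invoice" in state["msg"]:
        return ("handoff", "billing")
    return ("done", "general help")

def billing(state):
    return ("done", "billing resolved")

AGENTS = {"triage": triage, "billing": billing}

def run_swarm(msg: str, start: str = "triage", max_handoffs: int = 5):
    state = {"msg": msg}
    current, hops = start, 0
    while hops <= max_handoffs:
        kind, value = AGENTS[current](state)
        if kind == "done":
            return value
        current, hops = value, hops + 1  # hand control to the named peer
    raise RuntimeError("handoff loop exceeded budget")
```

The `max_handoffs` cap is the termination logic the trade-offs above call for; without it, two agents that each defer to the other bounce control forever.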
Seen in: OpenAI Swarm/Agents SDK (the defining implementation), LangGraph (swarm pattern with create_handoff_tool), ttal (agent-to-agent messaging via ttal send --to)
Sources:
- OpenAI Swarm repository — https://github.com/openai/swarm
- OpenAI Agents SDK handoffs docs — https://openai.github.io/openai-agents-python/handoffs/
Cross-Cutting Concerns
Communication Topologies
All patterns face the question of how agents talk to each other:
- Hub-and-spoke: All messages go through orchestrator/daemon (ttal, LangGraph supervisor)
- Peer-to-peer: Direct agent communication (OpenAI Swarm, ttal’s ttal send)
- Pub/sub: Agents subscribe to message topics (Microsoft reference architecture)
- Shared state: Agents read/write to a common state object (LangGraph’s StateGraph)
State Management
- Stateless: Each agent call is independent (OpenAI Swarm — no state between runs)
- Shared memory: Common context passed between agents (LangGraph state, AutoGen GroupChat)
- External persistence: State lives outside agents in DB/files (ttal’s taskwarrior + flicknote)
- Agent-local memory: Each agent maintains its own memory (ttal’s per-agent memory/ directories)
Termination Conditions
Every multi-agent pattern needs exit criteria:
- Max iterations / rounds
- Quality threshold (reviewer approves)
- Task completion signal (task marked done)
- Human approval gate
- Token/cost budget exhaustion
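These criteria compose naturally into a single check evaluated once per iteration. All parameter names in this sketch are illustrative:

```python
import time

def should_stop(iteration, max_iterations, approved,
                tokens_used, token_budget, started_at, deadline_s):
    """Return the exit reason if any termination criterion fires, else None."""
    if iteration >= max_iterations:
        return "max iterations"
    if approved:                                  # quality threshold met
        return "reviewer approved"
    if tokens_used >= token_budget:
        return "budget exhausted"
    if time.monotonic() - started_at >= deadline_s:
        return "deadline"
    return None
```

A human approval gate and a task-completion signal would slot in as two more boolean inputs; the point is that exit criteria belong in one place, checked every round, rather than scattered across agents.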
How ttal Combines These Patterns
ttal doesn’t implement a single pattern — it’s a hybrid system that composes multiple patterns:
| Pattern | ttal Implementation |
|---|---|
| Orchestrator | Yuki (persistent, plans and delegates) |
| Router | ttal task route (dispatch to specialists by role) |
| Pipeline | Task tag flow: +research → +design → execute |
| Executor-Reviewer | Worker coder + reviewer window per PR |
| Supervisor | Daemon (monitors workers via fsnotify, CI polling) |
| Swarm | Agent-to-agent messaging (ttal send --to) |
What’s distinctive about ttal:
- Two-plane architecture — manager plane (long-lived agents on Claude Code) vs worker plane (short-lived coders in git worktrees). Most frameworks treat all agents the same.
- External state — tasks live in taskwarrior, docs in flicknote, not in agent memory. Agents are stateless executors with external persistence.
- Process-level isolation — each worker is a separate tmux session + git worktree. No shared Python/JS runtime. True OS-level isolation.
- Human-native tools — uses git, taskwarrior, tmux — tools humans already understand. Not a custom runtime.
Recommendations
- Document ttal’s pattern composition — the fact that ttal combines 6+ patterns is itself a pattern worth naming (maybe “Composite Agent Architecture” or “Hybrid Orchestration”)
- The two-plane split is genuinely novel — most frameworks don’t distinguish between long-running thinkers and short-lived doers. This maps to manager/IC in orgs.
- External state is underappreciated — most frameworks use shared in-memory state. ttal’s use of taskwarrior + flicknote means agents can crash and restart without losing context.
- Consider formalizing the pipeline — ttal’s +research → +design → execute flow is implicit in tag conventions. Making it a first-class concept could improve observability.