Multi-Agent Workflow Design Patterns

Created at: Mar 17, 2026 at 06:51:08 AM

Research into established multi-agent design patterns — the philosophical/architectural patterns themselves, not framework-specific implementations. Frameworks cited as evidence of adoption and naming conventions.

Pattern 1: Executor-Reviewer (Generator-Critic)

Core idea: Separate the act of doing from the act of judging. One agent produces work; another evaluates it against criteria. Neither role is complete without the other.

Why it matters: Mirrors how quality emerges in human organizations — writers need editors, developers need code review. The executor optimizes for completion; the reviewer optimizes for correctness. This tension produces better outcomes than either alone.

Variations:

  • Single-pass review: Executor produces → Reviewer evaluates → done or reject
  • Iterative refinement: Executor produces → Reviewer critiques → Executor revises → loop until satisfied
  • Advisory review: Reviewer gives verdict but doesn’t block (ttal’s approach — reviewer posts LGTM/NEEDS_WORK, coder triages)
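The iterative-refinement variation reduces to a bounded generate-critique loop. A minimal sketch, with `execute` and `review` as stand-ins for the two agents' LLM calls (all names here are illustrative, not any framework's API):

```python
MAX_ROUNDS = 3  # termination condition: cap revision rounds to avoid infinite loops

def execute(task, feedback=None):
    # Stand-in for the executor agent (an LLM call in practice).
    draft = f"draft of {task}"
    return draft + " (revised)" if feedback else draft

def review(draft):
    # Stand-in for the independent reviewer: returns a verdict plus critique.
    if "revised" in draft:
        return "LGTM", None
    return "NEEDS_WORK", "tighten the draft"

def executor_reviewer(task):
    feedback = None
    for _ in range(MAX_ROUNDS):
        draft = execute(task, feedback)
        verdict, feedback = review(draft)
        if verdict == "LGTM":
            return draft
    return draft  # round budget spent: hand back the best effort so far
```

The round cap is the termination condition flagged in the trade-offs; without it, a strict reviewer can trap the pair in an endless revision loop.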

Trade-offs:

  • Higher quality output at the cost of latency and compute
  • Risk of infinite revision loops without termination conditions
  • Reviewer must be genuinely independent (different prompt/context) to add value — a rubber-stamp reviewer wastes tokens

Seen in: LangGraph (generator-critic), Anthropic’s building-effective-agents blog (evaluator-optimizer), ttal (coder + reviewer windows), traditional software engineering (PR review)

Pattern 2: Router (Dispatcher)

Core idea: A decision point that examines input and sends it to the right specialist. The router itself does minimal work — its job is classification and dispatch.

Why it matters: Avoids the “one agent does everything” anti-pattern. Specialized agents with narrow tool sets and focused prompts outperform general-purpose agents. The router is the switch that makes specialization possible.

Variations:

  • Static routing: Rules-based dispatch (if tag=research → Athena, if tag=design → Inke)
  • LLM routing: An LLM classifies the request and picks the destination
  • Semantic routing with fallback: Lightweight classifier first, LLM only when confidence is low (cost optimization)
  • Hierarchical routing: Router → sub-router → specialist (multi-level dispatch)
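The semantic-routing-with-fallback variation can be sketched as rules-first dispatch with an LLM classifier only when no rule matches (agent names and the `llm_classify` stand-in are hypothetical):

```python
# Static routing table: cheap, deterministic, brittle.
RULES = {"research": "Athena", "design": "Inke"}

def llm_classify(request):
    # Stand-in for an LLM classification call, used only when rules fail.
    return "Athena" if "why" in request.lower() else "Worker"

def route(tag, request):
    if tag in RULES:              # rules first: fast path, no model call
        return RULES[tag]
    return llm_classify(request)  # fallback: flexible but adds latency
```

This layering captures the cost optimization: the model is only invoked for the minority of requests the rules cannot classify.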

Trade-offs:

  • Routing errors cascade — wrong specialist gets the task, wastes time
  • Simple routers are fast but brittle; LLM routers are flexible but add latency
  • Need clear capability descriptions for each destination agent

Seen in: LangGraph (conditional edges), OpenAI Swarm/Agents SDK (triage agent), ttal (ttal task route --to <agent>), CrewAI (process selection)

Pattern 3: Orchestrator (Central Coordinator)

Core idea: A single agent holds the big picture. It decomposes complex tasks, delegates subtasks to workers, tracks progress, and synthesizes results. The orchestrator thinks; workers execute.

Why it matters: Complex tasks require decomposition, and someone needs to own the decomposition. Without an orchestrator, agents either duplicate work or leave gaps. The orchestrator is the project manager.

Variations:

  • Static orchestration: Predefined workflow graph (DAG of agent calls)
  • Dynamic orchestration: Orchestrator decides subtasks at runtime based on input
  • Hierarchical orchestration: Orchestrator delegates to sub-orchestrators who each manage a team
  • Long-running orchestration: Persistent orchestrator that maintains state across sessions (ttal’s Yuki)

Key difference from Router: A router dispatches and forgets. An orchestrator dispatches, monitors, collects results, and makes decisions about what happens next.
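That decompose-delegate-collect-synthesize cycle can be sketched in a few lines (a minimal illustration; `decompose` and `worker` stand in for LLM-backed planning and execution):

```python
def decompose(task):
    # Stand-in for the orchestrator's planning step (an LLM call in practice).
    return [f"{task}: research", f"{task}: design", f"{task}: implement"]

def worker(subtask):
    # Stand-in for a specialist worker agent.
    return f"done({subtask})"

def orchestrate(task):
    results = []
    for subtask in decompose(task):      # dynamic decomposition at runtime
        results.append(worker(subtask))  # delegate, then collect the result
    return " | ".join(results)           # synthesize into a single answer
```

Unlike the router sketch, control returns to `orchestrate` after every delegation, which is exactly where the coordination overhead (and the single point of failure) lives.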

Trade-offs:

  • Single point of failure — if the orchestrator’s context window fills up, everything stalls
  • Orchestrator overhead (tokens spent on coordination, not execution)
  • Must balance between micromanaging (too much control) and hands-off (too little visibility)

Seen in: Anthropic’s orchestrator-workers workflow, LangGraph (supervisor pattern), CrewAI (hierarchical process with manager agent), ttal (Yuki as orchestrator + daemon as infrastructure)

Pattern 4: Pipeline (Sequential Chain)

Core idea: Work flows through a fixed sequence of stages. Each agent transforms its input and passes the result forward. Assembly line for AI.

Why it matters: When the workflow is predictable and each stage has clear input/output contracts, pipelines are the simplest and most debuggable pattern. No routing decisions, no coordination overhead.

Variations:

  • Linear pipeline: A → B → C → done
  • Pipeline with gates: Each stage can reject/send-back to previous stage
  • Fan-out/fan-in pipeline: Stage splits into parallel branches, then merges
  • Research pipeline: ttal’s research → design → implementation flow (Athena → Inke → Worker)
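A linear pipeline is just function composition over an ordered stage list. A minimal sketch (stage names are illustrative; each stage's output is the next stage's input contract):

```python
def research(topic):
    return {"topic": topic, "findings": f"notes on {topic}"}

def design(doc):
    return {**doc, "plan": f"plan from {doc['findings']}"}

def implement(doc):
    return f"built {doc['topic']} per {doc['plan']}"

STAGES = [research, design, implement]  # fixed sequence: A -> B -> C

def run_pipeline(topic):
    value = topic
    for stage in STAGES:  # each stage transforms and passes forward
        value = stage(value)
    return value
```

The stage-coupling trade-off is visible here: `design` reads `doc["findings"]`, so renaming that key in `research` breaks the next stage.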

Trade-offs:

  • Simple and predictable, but inflexible for tasks that don’t fit the fixed sequence
  • Stage coupling — if stage B’s contract changes, stage C breaks
  • No good way to handle tasks that skip stages or need non-linear flow

Seen in: CrewAI (sequential process, the default), Anthropic’s prompt chaining workflow, Unix pipes philosophy, ttal’s task tag pipeline (+research → +design → execute)

Pattern 5: Debate / Consensus (Multi-Perspective Reasoning)

Core idea: Multiple agents independently reason about the same problem, then exchange perspectives and refine. Truth emerges from disagreement.

Why it matters: A single LLM has blind spots. Multiple agents with different prompts/temperatures/roles surface different aspects of a problem. Particularly effective for reasoning tasks where the “obvious” answer may be wrong.

Variations:

  • Full debate: All agents see all other agents’ responses, iterate for N rounds
  • Sparse debate: Agents only see neighbors’ responses (reduces token cost, maintains diversity)
  • Majority voting: Simple aggregation — most popular answer wins (no iterative refinement)
  • Judge-mediated: A separate judge agent evaluates competing arguments
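The majority-voting variation is the simplest to sketch: sample independent answers and aggregate (here `agent_answer` is a stand-in for N differently-seeded agent calls):

```python
from collections import Counter

def agent_answer(seed, question):
    # Stand-in for independent agents (different prompts/temperatures in practice).
    # One in three agents dissents, to make the aggregation visible.
    return "4" if seed % 3 else "5"

def majority_vote(question, n_agents=5):
    answers = [agent_answer(i, question) for i in range(n_agents)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n_agents  # winning answer plus agreement ratio
```

Returning the agreement ratio alongside the answer gives a cheap confidence signal: low agreement is often a cue to escalate to the more expensive full-debate variation.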

Trade-offs:

  • Expensive — N agents × M rounds of full-context reasoning
  • Risk of convergence to groupthink (agents agree too quickly)
  • Sparse communication topologies help maintain diversity but reduce cross-pollination
  • Best for high-stakes decisions where correctness matters more than speed

Seen in: AutoGen (multi-agent debate with sparse topology, based on arXiv:2406.11776), LangGraph (swarm pattern), academic research on LLM debate for reasoning

Pattern 6: Supervisor (Monitor-Correct)

Core idea: An agent watches other agents work and intervenes when things go wrong. Unlike an orchestrator that actively delegates, a supervisor passively monitors and only acts on deviation.

Why it matters: Autonomous agents make mistakes — hallucinate, get stuck in loops, pursue wrong approaches. A supervisor provides a safety net without the overhead of micromanagement.

Variations:

  • Quality supervisor: Checks output quality, requests rework if below threshold
  • Safety supervisor: Monitors for policy violations, harmful outputs, or security issues
  • Resource supervisor: Tracks token usage, time, cost — kills runaway agents
  • Human-in-the-loop supervisor: Escalates to humans when confidence is low
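The resource-supervisor variation is the easiest to make concrete: the supervisor stays passive until a budget is breached, then kills the runaway agent. A hypothetical sketch (not any framework's API):

```python
class BudgetExceeded(Exception):
    pass

class ResourceSupervisor:
    # Passive monitor: intervenes only when the intervention criterion fires.
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.used = 0

    def record(self, tokens):
        self.used += tokens
        if self.used > self.max_tokens:
            raise BudgetExceeded(f"used {self.used} of {self.max_tokens} tokens")

supervisor = ResourceSupervisor(max_tokens=100)
killed = False
for step_cost in [40, 40, 40]:  # a worker loop the supervisor watches
    try:
        supervisor.record(step_cost)
    except BudgetExceeded:
        killed = True  # intervene: stop the runaway agent
        break
```

The `max_tokens` threshold is the intervention criterion from the trade-offs: set it too low and the supervisor micromanages, too high and it misses runaways.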

Key difference from Orchestrator: Orchestrator proactively plans and delegates. Supervisor reactively monitors and corrects. An orchestrator says “do X, Y, Z.” A supervisor says “I’m watching — carry on, but I’ll step in if needed.”

Trade-offs:

  • Adds overhead for monitoring that may rarely trigger
  • Must define clear intervention criteria (too sensitive = micromanaging, too loose = missing problems)
  • Can create false sense of safety if supervisor itself has blind spots

Seen in: LangGraph (supervisor pattern), Microsoft reference architecture (Supervisor Agent pattern), ttal (daemon monitors worker sessions via fsnotify, PR watcher polls CI status)

Pattern 7: Swarm (Peer-to-Peer Handoff)

Core idea: No central coordinator. Agents hand off control directly to each other based on the conversation state. Each agent decides who should handle the next step.

Why it matters: Eliminates the orchestrator bottleneck. Agents are autonomous peers. The system is emergent — the workflow arises from individual handoff decisions rather than top-down planning.

Variations:

  • Simple handoff: Agent A → Agent B (one-way transfer of control)
  • Handoff with context: Transfer includes updated context variables
  • Bidirectional handoff: Agents can hand back if they can’t handle it
  • Triage-first: Initial triage agent routes, then specialists hand off among themselves
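The triage-first variation with handoff-carried context can be sketched as agents that each return the next agent's name, or `None` to stop (agent names are illustrative):

```python
MAX_HANDOFFS = 5  # termination logic to break circular handoffs (A -> B -> A ...)

def triage(msg, ctx):
    # Initial triage agent: picks a specialist, passes context along.
    return ("billing" if "invoice" in msg else "support", ctx)

def billing(msg, ctx):
    return (None, {**ctx, "handled_by": "billing"})

def support(msg, ctx):
    return (None, {**ctx, "handled_by": "support"})

AGENTS = {"triage": triage, "billing": billing, "support": support}

def run_swarm(msg):
    agent, ctx = "triage", {}
    for _ in range(MAX_HANDOFFS):   # each agent decides who handles the next step
        nxt, ctx = AGENTS[agent](msg, ctx)
        if nxt is None:             # no further handoff: done
            return ctx
        agent = nxt                 # handoff with updated context
    return ctx
```

Note there is no central loop owner beyond the thin driver: the workflow is emergent from each agent's handoff decision, and `MAX_HANDOFFS` is the safety net against cycles.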

Trade-offs:

  • Flexible and decentralized, but less predictable than orchestrated patterns
  • Hard to debug — no single place to see the full workflow
  • Risk of circular handoffs (A → B → A → B…) without termination logic
  • Works best when agent boundaries are clear and handoff criteria are well-defined

Seen in: OpenAI Swarm/Agents SDK (the defining implementation), LangGraph (swarm pattern with create_handoff_tool), ttal (agent-to-agent messaging via ttal send --to)

Cross-Cutting Concerns

Communication Topologies

All patterns face the question of how agents talk to each other:

  • Hub-and-spoke: All messages go through orchestrator/daemon (ttal, LangGraph supervisor)
  • Peer-to-peer: Direct agent communication (OpenAI Swarm, ttal’s ttal send)
  • Pub/sub: Agents subscribe to message topics (Microsoft reference architecture)
  • Shared state: Agents read/write to a common state object (LangGraph’s StateGraph)

State Management

  • Stateless: Each agent call is independent (OpenAI Swarm — no state between runs)
  • Shared memory: Common context passed between agents (LangGraph state, AutoGen GroupChat)
  • External persistence: State lives outside agents in DB/files (ttal’s taskwarrior + flicknote)
  • Agent-local memory: Each agent maintains its own memory (ttal’s per-agent memory/ directories)

Termination Conditions

Every multi-agent pattern needs exit criteria:

  • Max iterations / rounds
  • Quality threshold (reviewer approves)
  • Task completion signal (task marked done)
  • Human approval gate
  • Token/cost budget exhaustion
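These criteria usually combine into a single stop check evaluated each iteration, where any one condition terminates the run (field names here are hypothetical):

```python
def should_stop(state):
    # Any one exit criterion terminates the run.
    return (
        state["iteration"] >= state["max_iterations"]   # max iterations / rounds
        or state["reviewer_approved"]                   # quality threshold met
        or state["task_done"]                           # task completion signal
        or state["tokens_used"] >= state["token_budget"]  # cost budget exhausted
    )
```

A human approval gate fits the same shape as another boolean in the disjunction, typically checked before the automated criteria.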

How ttal Combines These Patterns

ttal doesn’t implement a single pattern — it’s a hybrid system that composes multiple patterns:

Pattern → ttal implementation:

  • Orchestrator: Yuki (persistent, plans and delegates)
  • Router: ttal task route (dispatch to specialists by role)
  • Pipeline: task tag flow: +research → +design → execute
  • Executor-Reviewer: worker coder + reviewer window per PR
  • Supervisor: daemon (monitors workers via fsnotify, CI polling)
  • Swarm: agent-to-agent messaging (ttal send --to)

What’s distinctive about ttal:

  1. Two-plane architecture — manager plane (long-lived agents on Claude Code) vs worker plane (short-lived coders in git worktrees). Most frameworks treat all agents the same.
  2. External state — tasks live in taskwarrior, docs in flicknote, not in agent memory. Agents are stateless executors with external persistence.
  3. Process-level isolation — each worker is a separate tmux session + git worktree. No shared Python/JS runtime. True OS-level isolation.
  4. Human-native tools — uses git, taskwarrior, tmux — tools humans already understand. Not a custom runtime.

Recommendations

  1. Document ttal’s pattern composition — the fact that ttal combines 6+ patterns is itself a pattern worth naming (maybe “Composite Agent Architecture” or “Hybrid Orchestration”)
  2. The two-plane split is genuinely novel — most frameworks don’t distinguish between long-running thinkers and short-lived doers. This maps to manager/IC in orgs.
  3. External state is underappreciated — most frameworks use shared in-memory state. ttal’s use of taskwarrior + flicknote means agents can crash and restart without losing context.
  4. Consider formalizing the pipeline — ttal’s +research → +design → execute flow is implicit in tag conventions. Making it a first-class concept could improve observability.

Sources Summary

  • Anthropic, “Building Effective Agents” (Blog): https://www.anthropic.com/engineering/building-effective-agents
  • Microsoft Multi-Agent Reference Architecture (Reference): https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
  • Google Cloud, “Choose a design pattern” (Architecture Guide): https://docs.cloud.google.com/architecture/choose-design-pattern-agentic-ai-system
  • AutoGen Multi-Agent Debate (Docs): https://microsoft.github.io/autogen/stable/user-guide/core-user-guide/design-patterns/multi-agent-debate.html
  • OpenAI Swarm (Repository): https://github.com/openai/swarm
  • OpenAI Agents SDK Handoffs (Docs): https://openai.github.io/openai-agents-python/handoffs/
  • CrewAI Hierarchical Process (Docs): https://docs.crewai.com/en/learn/hierarchical-process
  • LangGraph Workflows & Agents (Docs): https://docs.langchain.com/oss/python/langgraph/workflows-agents
  • “Improving Multi-Agent Debate with Sparse Topology” (Paper): https://arxiv.org/abs/2406.11776
  • C# Corner Agent Orchestration Patterns (Article): https://www.c-sharpcorner.com/article/llm-agent-orchestration-patterns-architectural-frameworks-for-managing-complex/