Multi-Agent Workflow Design Patterns
Created at: Mar 17, 2026 at 06:51:08 AM
Research into established multi-agent design patterns — the philosophical/architectural patterns themselves, not framework-specific implementations. Frameworks cited as evidence of adoption and naming conventions.
Pattern 1: Executor-Reviewer (Generator-Critic)
Core idea: Separate the act of doing from the act of judging. One agent produces work; another evaluates it against criteria. Neither role is complete without the other.
Why it matters: Mirrors how quality emerges in human organizations — writers need editors, developers need code review. The executor optimizes for completion; the reviewer optimizes for correctness. This tension produces better outcomes than either alone.
Variations:
- Single-pass review: Executor produces → Reviewer evaluates → done or reject
- Iterative refinement: Executor produces → Reviewer critiques → Executor revises → loop until satisfied
- Advisory review: Reviewer gives verdict but doesn’t block (ttal’s approach — reviewer posts LGTM/NEEDS_WORK, coder triages)
Trade-offs:
- Higher quality output at the cost of latency and compute
- Risk of infinite revision loops without termination conditions
- Reviewer must be genuinely independent (different prompt/context) to add value — a rubber-stamp reviewer wastes tokens
Seen in: LangGraph (generator-critic), Anthropic’s building-effective-agents blog (evaluator-optimizer), ttal (coder + reviewer windows), traditional software engineering (PR review)
Sources:
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
- Google Cloud, “Choose a design pattern for agentic AI” — https://docs.cloud.google.com/architecture/choose-design-pattern-agentic-ai-system
Pattern 2: Router (Dispatcher)
Core idea: A decision point that examines input and sends it to the right specialist. The router itself does minimal work — its job is classification and dispatch.
Why it matters: Avoids the “one agent does everything” anti-pattern. Specialized agents with narrow tool sets and focused prompts outperform general-purpose agents. The router is the switch that makes specialization possible.
Variations:
- Static routing: Rules-based dispatch (if tag=research → Athena, if tag=design → Inke)
- LLM routing: An LLM classifies the request and picks the destination
- Semantic routing with fallback: Lightweight classifier first, LLM only when confidence is low (cost optimization)
- Hierarchical routing: Router → sub-router → specialist (multi-level dispatch)
Trade-offs:
- Routing errors cascade — wrong specialist gets the task, wastes time
- Simple routers are fast but brittle; LLM routers are flexible but add latency
- Need clear capability descriptions for each destination agent
Seen in: LangGraph (conditional edges), OpenAI Swarm/Agents SDK (triage agent), ttal (ttal task route --to <agent>), CrewAI (process selection)
Sources:
- Google Cloud Architecture — https://docs.cloud.google.com/architecture/choose-design-pattern-agentic-ai-system
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
Pattern 3: Orchestrator (Central Coordinator)
Core idea: A single agent holds the big picture. It decomposes complex tasks, delegates subtasks to workers, tracks progress, and synthesizes results. The orchestrator thinks; workers execute.
Why it matters: Complex tasks require decomposition, and someone needs to own the decomposition. Without an orchestrator, agents either duplicate work or leave gaps. The orchestrator is the project manager.
Variations:
- Static orchestration: Predefined workflow graph (DAG of agent calls)
- Dynamic orchestration: Orchestrator decides subtasks at runtime based on input
- Hierarchical orchestration: Orchestrator delegates to sub-orchestrators who each manage a team
- Long-running orchestration: Persistent orchestrator that maintains state across sessions (ttal’s Yuki)
Key difference from Router: A router dispatches and forgets. An orchestrator dispatches, monitors, collects results, and makes decisions about what happens next.
Trade-offs:
- Single point of failure — if the orchestrator’s context window fills up, everything stalls
- Orchestrator overhead (tokens spent on coordination, not execution)
- Must balance micromanaging (too much control) against being hands-off (too little visibility)
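The decompose/delegate/synthesize loop can be sketched as follows. `decompose` and `worker` are hypothetical stand-ins for LLM calls; a real orchestrator would also track failures and retries:

```python
def decompose(task: str) -> list[str]:
    # Stand-in for LLM-driven planning: break the task into subtasks.
    return [f"{task}: step {i}" for i in (1, 2, 3)]

def worker(subtask: str) -> str:
    # Stand-in for a delegated worker agent.
    return f"done({subtask})"

def orchestrate(task: str) -> str:
    subtasks = decompose(task)
    results = {}
    for sub in subtasks:          # could also fan out to workers in parallel
        results[sub] = worker(sub)
    # Synthesis: the orchestrator combines worker outputs into one answer.
    return "; ".join(results[sub] for sub in subtasks)
```

Note that, unlike a router, `orchestrate` keeps the `results` map and produces the final synthesis itself, which is exactly the "dispatches, monitors, collects" distinction drawn above.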
Seen in: Anthropic’s orchestrator-workers workflow, LangGraph (supervisor pattern), CrewAI (hierarchical process with manager agent), ttal (Yuki as orchestrator + daemon as infrastructure)
Sources:
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
- CrewAI hierarchical process docs — https://docs.crewai.com/en/learn/hierarchical-process
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
Pattern 4: Pipeline (Sequential Chain)
Core idea: Work flows through a fixed sequence of stages. Each agent transforms the output and passes it forward. Assembly line for AI.
Why it matters: When the workflow is predictable and each stage has clear input/output contracts, pipelines are the simplest and most debuggable pattern. No routing decisions, no coordination overhead.
Variations:
- Linear pipeline: A → B → C → done
- Pipeline with gates: Each stage can reject/send-back to previous stage
- Fan-out/fan-in pipeline: Stage splits into parallel branches, then merges
- Research pipeline: ttal’s research → design → implementation flow (Athena → Inke → Worker)
Trade-offs:
- Simple and predictable, but inflexible for tasks that don’t fit the fixed sequence
- Stage coupling — if stage B’s contract changes, stage C breaks
- No good way to handle tasks that skip stages or need non-linear flow
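A linear pipeline reduces to function composition over a shared payload. This sketch uses hypothetical `research`/`design`/`implement` stages mirroring the tag flow above; the dict keys are illustrative, not part of any real contract:

```python
from functools import reduce

def research(topic: str) -> dict:
    return {"topic": topic, "findings": f"notes on {topic}"}

def design(doc: dict) -> dict:
    doc["plan"] = f"plan based on {doc['findings']}"
    return doc

def implement(doc: dict) -> dict:
    doc["artifact"] = f"built from {doc['plan']}"
    return doc

STAGES = [research, design, implement]

def run_pipeline(topic: str) -> dict:
    # Thread the payload through each stage in fixed order.
    return reduce(lambda payload, stage: stage(payload), STAGES, topic)
```

The stage-coupling trade-off is visible here: `design` reads the `findings` key that `research` wrote, so renaming that key in one stage silently breaks the next.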
Seen in: CrewAI (sequential process, the default), Anthropic’s prompt chaining workflow, Unix pipes philosophy, ttal’s task tag pipeline (+research → +design → execute)
Sources:
- Anthropic, “Building Effective Agents” — https://www.anthropic.com/engineering/building-effective-agents
- CrewAI sequential process docs — https://docs.crewai.com/en/learn/sequential-process
Pattern 5: Debate / Consensus (Multi-Perspective Reasoning)
Core idea: Multiple agents independently reason about the same problem, then exchange perspectives and refine. Truth emerges from disagreement.
Why it matters: A single LLM has blind spots. Multiple agents with different prompts/temperatures/roles surface different aspects of a problem. Particularly effective for reasoning tasks where the “obvious” answer may be wrong.
Variations:
- Full debate: All agents see all other agents’ responses, iterate for N rounds
- Sparse debate: Agents only see neighbors’ responses (reduces token cost, maintains diversity)
- Majority voting: Simple aggregation — most popular answer wins (no iterative refinement)
- Judge-mediated: A separate judge agent evaluates competing arguments
Trade-offs:
- Expensive — N agents × M rounds of full-context reasoning
- Risk of convergence to groupthink (agents agree too quickly)
- Sparse communication topologies help maintain diversity but reduce cross-pollination
- Best for high-stakes decisions where correctness matters more than speed
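The majority-voting variation is the simplest to sketch. Here `agent_answer` is a hypothetical stand-in for N independent LLM calls with different prompts or temperatures; one dissenting agent is hard-coded to show a minority answer being outvoted:

```python
from collections import Counter

def agent_answer(agent_id: int, question: str) -> str:
    # Stand-in for independent LLM calls; agent 0 dissents.
    return "B" if agent_id == 0 else "A"

def majority_vote(question: str, n_agents: int = 5) -> str:
    """Simple aggregation: most popular answer wins, no debate rounds."""
    answers = [agent_answer(i, question) for i in range(n_agents)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

Full or sparse debate would add message exchange between the calls before voting; this sketch shows only the aggregation step.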
Seen in: AutoGen (multi-agent debate with sparse topology, based on arxiv:2406.11776), LangGraph (swarm pattern), academic research on LLM debate for reasoning
Sources:
- AutoGen multi-agent debate docs — https://microsoft.github.io/autogen/stable/user-guide/core-user-guide/design-patterns/multi-agent-debate.html
- “Improving Multi-Agent Debate with Sparse Communication Topology” — https://arxiv.org/abs/2406.11776
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
Pattern 6: Supervisor (Monitor-Correct)
Core idea: An agent watches other agents work and intervenes when things go wrong. Unlike an orchestrator that actively delegates, a supervisor passively monitors and only acts on deviation.
Why it matters: Autonomous agents make mistakes — hallucinate, get stuck in loops, pursue wrong approaches. A supervisor provides a safety net without the overhead of micromanagement.
Variations:
- Quality supervisor: Checks output quality, requests rework if below threshold
- Safety supervisor: Monitors for policy violations, harmful outputs, or security issues
- Resource supervisor: Tracks token usage, time, cost — kills runaway agents
- Human-in-the-loop supervisor: Escalates to humans when confidence is low
Key difference from Orchestrator: Orchestrator proactively plans and delegates. Supervisor reactively monitors and corrects. An orchestrator says “do X, Y, Z.” A supervisor says “I’m watching — carry on, but I’ll step in if needed.”
Trade-offs:
- Adds overhead for monitoring that may rarely trigger
- Must define clear intervention criteria (too sensitive = micromanaging, too loose = missing problems)
- Can create false sense of safety if supervisor itself has blind spots
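The resource-supervisor variation reduces to a budget check wrapped around each step. This sketch is an illustrative assumption, not a real library: the `ResourceSupervisor` class charges a token count before each step runs and raises once the budget is exceeded, standing in for killing a runaway agent:

```python
class BudgetExceeded(Exception):
    pass

class ResourceSupervisor:
    def __init__(self, token_budget: int):
        self.token_budget = token_budget
        self.tokens_used = 0

    def charge(self, tokens: int) -> None:
        self.tokens_used += tokens
        if self.tokens_used > self.token_budget:
            raise BudgetExceeded(
                f"used {self.tokens_used} of {self.token_budget} tokens"
            )

def run_with_supervision(steps, supervisor):
    """steps: list of (estimated_tokens, step_fn) pairs."""
    results = []
    for step_tokens, step_fn in steps:
        supervisor.charge(step_tokens)  # intervene before the work runs
        results.append(step_fn())
    return results
```

The intervention criterion here is a crisp numeric threshold; quality and safety supervisors need fuzzier criteria, which is where the too-sensitive/too-loose trade-off above bites.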
Seen in: LangGraph (supervisor pattern), Microsoft reference architecture (Supervisor Agent pattern), ttal (daemon monitors worker sessions via fsnotify, PR watcher polls CI status)
Sources:
- Microsoft Multi-Agent Reference Architecture — https://microsoft.github.io/multi-agent-reference-architecture/docs/reference-architecture/Patterns.html
- LangGraph supervisor docs — https://docs.langchain.com/oss/python/langgraph/workflows-agents
Pattern 7: Swarm (Peer-to-Peer Handoff)
Core idea: No central coordinator. Agents hand off control directly to each other based on the conversation state. Each agent decides who should handle the next step.
Why it matters: Eliminates the orchestrator bottleneck. Agents are autonomous peers. The system is emergent — the workflow arises from individual handoff decisions rather than top-down planning.
Variations:
- Simple handoff: Agent A → Agent B (one-way transfer of control)
- Handoff with context: Transfer includes updated context variables
- Bidirectional handoff: Agents can hand control back if they can’t handle the task
- Triage-first: Initial triage agent routes, then specialists hand off among themselves
Trade-offs:
- Flexible and decentralized, but less predictable than orchestrated patterns
- Hard to debug — no single place to see the full workflow
- Risk of circular handoffs (A → B → A → B…) without termination logic
- Works best when agent boundaries are clear and handoff criteria are well-defined
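The handoff loop with a guard against circular handoffs can be sketched as follows. The `triage` and `billing` agents and the `("handoff", name)` return convention are hypothetical illustrations, not the OpenAI Swarm API:

```python
def triage(state):
    # Each agent decides who handles the next step.
    if "invoice" in state["msg"]:
        return ("handoff", "billing")
    return ("done", "general help")

def billing(state):
    return ("done", "billing resolved")

AGENTS = {"triage": triage, "billing": billing}

def run_swarm(msg: str, start: str = "triage", max_handoffs: int = 5):
    state = {"msg": msg}
    current, hops = start, 0
    while hops <= max_handoffs:
        kind, value = AGENTS[current](state)
        if kind == "done":
            return value
        current, hops = value, hops + 1  # hand control to the named peer
    raise RuntimeError("handoff loop exceeded budget")
```

The `max_handoffs` cap is the termination logic the trade-offs above call for; without it, two agents that each defer to the other bounce control forever.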
Seen in: OpenAI Swarm/Agents SDK (the defining implementation), LangGraph (swarm pattern with create_handoff_tool), ttal (agent-to-agent messaging via ttal send --to)
Sources:
- OpenAI Swarm repository — https://github.com/openai/swarm
- OpenAI Agents SDK handoffs docs — https://openai.github.io/openai-agents-python/handoffs/
Cross-Cutting Concerns
Communication Topologies
All patterns face the question of how agents talk to each other:
- Hub-and-spoke: All messages go through orchestrator/daemon (ttal, LangGraph supervisor)
- Peer-to-peer: Direct agent communication (OpenAI Swarm, ttal’s ttal send)
- Pub/sub: Agents subscribe to message topics (Microsoft reference architecture)
- Shared state: Agents read/write to a common state object (LangGraph’s StateGraph)
State Management
- Stateless: Each agent call is independent (OpenAI Swarm — no state between runs)
- Shared memory: Common context passed between agents (LangGraph state, AutoGen GroupChat)
- External persistence: State lives outside agents in DB/files (ttal’s taskwarrior + flicknote)
- Agent-local memory: Each agent maintains its own memory (ttal’s per-agent memory/ directories)
Termination Conditions
Every multi-agent pattern needs exit criteria:
- Max iterations / rounds
- Quality threshold (reviewer approves)
- Task completion signal (task marked done)
- Human approval gate
- Token/cost budget exhaustion
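These criteria compose naturally into a single check evaluated once per iteration. All parameter names in this sketch are illustrative:

```python
import time

def should_stop(iteration, max_iterations, approved,
                tokens_used, token_budget, started_at, deadline_s):
    """Return the exit reason if any termination criterion fires, else None."""
    if iteration >= max_iterations:
        return "max iterations"
    if approved:                                  # quality threshold met
        return "reviewer approved"
    if tokens_used >= token_budget:
        return "budget exhausted"
    if time.monotonic() - started_at >= deadline_s:
        return "deadline"
    return None
```

A human approval gate and a task-completion signal would slot in as two more boolean inputs; the point is that exit criteria belong in one place, checked every round, rather than scattered across agents.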
How ttal Combines These Patterns
ttal doesn’t implement a single pattern — it’s a hybrid system that composes multiple patterns:
| Pattern | ttal Implementation |
|---|---|
| Orchestrator | Yuki (persistent, plans and delegates) |
| Router | ttal task route (dispatch to specialists by role) |
| Pipeline | Task tag flow: +research → +design → execute |
| Executor-Reviewer | Worker coder + reviewer window per PR |
| Supervisor | Daemon (monitors workers via fsnotify, CI polling) |
| Swarm | Agent-to-agent messaging (ttal send --to) |
What’s distinctive about ttal:
- Two-plane architecture — manager plane (long-lived agents on Claude Code) vs worker plane (short-lived coders in git worktrees). Most frameworks treat all agents the same.
- External state — tasks live in taskwarrior, docs in flicknote, not in agent memory. Agents are stateless executors with external persistence.
- Process-level isolation — each worker is a separate tmux session + git worktree. No shared Python/JS runtime. True OS-level isolation.
- Human-native tools — uses git, taskwarrior, tmux — tools humans already understand. Not a custom runtime.
Recommendations
- Document ttal’s pattern composition — the fact that ttal combines 6+ patterns is itself a pattern worth naming (maybe “Composite Agent Architecture” or “Hybrid Orchestration”)
- The two-plane split is genuinely novel — most frameworks don’t distinguish between long-running thinkers and short-lived doers. This maps to manager/IC in orgs.
- External state is underappreciated — most frameworks use shared in-memory state. ttal’s use of taskwarrior + flicknote means agents can crash and restart without losing context.
- Consider formalizing the pipeline — ttal’s +research → +design → execute flow is implicit in tag conventions. Making it a first-class concept could improve observability.