Multi-Agent Systems
Supervisors, handoffs, swarms, and when each one breaks.
One agent works. Two agents talk. Five agents argue forever unless the orchestration is right. Multi-agent systems are not "more agents" — they are an architecture decision about who decides what, when, and with what budget. The orchestration pattern matters more than the model choice.
Supervisor vs handoffs
Supervisor pattern: one orchestrator delegates to specialists and integrates their answers. Clean on paper, slow in production when the supervisor becomes a bottleneck. Handoffs: agents pass the baton along with the current context. Messier in code, but it recovers better when one agent loses the plot. Most teams that start with a supervisor end up on handoffs for the recovery story.
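The difference in control flow can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the agents are plain functions standing in for model-backed specialists, and every name here (`supervisor`, `run_handoffs`, `Context`, `max_hops`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


# Hypothetical specialists: plain functions standing in for LLM-backed agents.
def research(task: str) -> str:
    return f"notes on {task}"


def write(task: str) -> str:
    return f"draft about {task}"


SPECIALISTS: dict[str, Callable[[str], str]] = {"research": research, "write": write}


def supervisor(task: str) -> str:
    """Supervisor pattern: one orchestrator calls each specialist and integrates."""
    results = [SPECIALISTS[name](task) for name in ("research", "write")]
    # Every result flows back through this one function, which is the bottleneck.
    return " | ".join(results)


@dataclass
class Context:
    """Shared context that travels with the baton in the handoff pattern."""
    task: str
    history: list[str] = field(default_factory=list)


# Handoff pattern: each agent does its part, then names the next agent (or None).
def research_agent(ctx: Context) -> Optional[str]:
    ctx.history.append(f"notes on {ctx.task}")
    return "writer"


def writer_agent(ctx: Context) -> Optional[str]:
    ctx.history.append(f"draft about {ctx.task}")
    return None  # nobody left to hand off to


HANDOFF_AGENTS = {"researcher": research_agent, "writer": writer_agent}


def run_handoffs(task: str, start: str = "researcher", max_hops: int = 5) -> Context:
    """Follow handoffs until an agent returns None or the hop budget runs out."""
    ctx = Context(task)
    current: Optional[str] = start
    hops = 0
    while current is not None and hops < max_hops:
        current = HANDOFF_AGENTS[current](ctx)
        hops += 1
    return ctx
```

In the handoff version no single function sees every result, so a confused agent can pass the context to a peer instead of waiting on a central planner. The `max_hops` budget is an assumption added here so the loop cannot run forever.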
When to reach for a swarm
Swarms shine when sub-goals are genuinely parallelisable and the work is exploratory: research, broad search, multi-source synthesis. Skip swarms for sequential workflows; the coordination overhead wipes out the parallel win on anything that has to happen in order.
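A back-of-the-envelope cost model makes the trade-off concrete. This is a toy sketch under stated assumptions: a fixed coordination overhead per agent, and sub-goals that are either fully independent or fully ordered. The function names and the numbers are illustrative, not measurements.

```python
def sequential_seconds(step_seconds: list[float]) -> float:
    """One agent runs every step in order: wall-clock time is the sum."""
    return sum(step_seconds)


def swarm_seconds(
    step_seconds: list[float],
    overhead_per_agent: float,
    independent: bool,
) -> float:
    """Toy wall-clock estimate for a swarm with one agent per step.

    Assumption: each agent adds a fixed coordination cost (spawning,
    context sharing, merging results).
    """
    if not step_seconds:
        return 0.0
    coordination = overhead_per_agent * len(step_seconds)
    if independent:
        # Steps overlap, so the slowest step dominates.
        return max(step_seconds) + coordination
    # Ordered steps cannot overlap; the swarm only adds overhead.
    return sum(step_seconds) + coordination


steps = [60.0, 60.0, 60.0, 60.0]
baseline = sequential_seconds(steps)                         # 240.0
parallel = swarm_seconds(steps, 10.0, independent=True)      # 100.0
ordered = swarm_seconds(steps, 10.0, independent=False)      # 280.0
```

With independent steps the swarm finishes in well under half the baseline. With ordered steps it is slower than a single agent, because the overhead is paid and nothing overlaps.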
Parallel sub-agent execution in practice
Modern agent runtimes can fan out to sub-agents for independent reads, then serialise on the write turn. This cuts a 20-minute refactor to under 4 minutes. The orchestration pattern matches the supervisor / swarm split: supervisor for the plan, swarm for the read phase, supervisor again for the write.
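The fan-out-then-serialise shape can be sketched with `asyncio`. This is a minimal illustration, not a real runtime: `read_file_summary` is a hypothetical read-only sub-agent, and the "edit" is a string rather than an actual file write.

```python
import asyncio


async def read_file_summary(path: str) -> str:
    """Hypothetical read-only sub-agent; a real one would call a model or tool."""
    await asyncio.sleep(0.01)  # stands in for I/O or model latency
    return f"summary of {path}"


async def refactor(paths: list[str]) -> list[str]:
    # Read phase (swarm): independent reads run concurrently.
    # asyncio.gather returns results in the same order as its inputs.
    summaries = await asyncio.gather(*(read_file_summary(p) for p in paths))

    # Write phase (supervisor): apply edits one at a time so writes
    # never interleave or race on shared files.
    applied: list[str] = []
    for path, summary in zip(paths, summaries):
        applied.append(f"edited {path} using {summary}")
    return applied


result = asyncio.run(refactor(["a.py", "b.py"]))
```

The reads overlap, so the read phase costs roughly as much as the slowest read. The writes stay in a plain loop, which keeps ordering deterministic at the cost of no parallelism in that phase.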
Deep dives on Multi-Agent Systems
Agentic RAG vs vanilla RAG: why a Sufficient Context Agent beats retrieve-then-pray
Google Research shipped Agentic RAG on Gemini Enterprise with a Sufficient Context Agent that refuses to answer when retrieval is incomplete. On factuality benchmarks they report up to 34% higher accuracy versus standard RAG. Here is when one-shot RAG is still enough, when you need iterative retrieval, and how I wire the pattern without blowing latency budgets.
Inside Recruiting Atelier: a runnable reference for the primitives of an agentic system
A working open studio that vets duplicates, plans the run, screens, scores, shortlists, and notifies. The whole pipeline lives in roughly ninety lines of supervisor code and a tool registry you can read in one sitting. Here is what is inside, why every piece is there, and what you can copy into your own stack.
AI agent vs agentic AI: what the distinction actually means when you ship one
Vendors blur the line because "agentic" sells. The two terms describe different architectures, with different cost shapes, different observability needs, and different scoping conversations. Here is the framing I use with clients and the three-question test for which one your project actually needs.
Why I am replacing supervisor patterns with handoffs
Supervisors looked clean on paper and shipped slow in production. Handoffs read messier in the code but recover better when an agent loses the plot. Two real systems and where supervisors still earn their keep.
Three patterns I broke in 2025, and what I do instead now
Self-correction loops without budgets, single-agent solutions to multi-domain problems, and using JSON mode to force structure I should have built into the schema. An honest review.
Haiku 4.5 made our router 5x cheaper. The trade-off matters
Replacing Sonnet with Haiku in the dispatcher role cut our orchestration cost dramatically. It also cost us in two specific places I did not predict.
Latest in Multi-Agent Systems
Xiaomi MiMo V2-Flash and TTS endpoints auto-route to MiMo-V2.5 on June 18: legacy model IDs retire June 30
Cursor adds /in-cloud subagents, /babysit for PR iteration, and reliable handoff between local and cloud agent sessions
Zhipu ships GLM-5.2: MIT open weights, 1M context, and Anthropic-compatible API for long-horizon coding agents
IIT Bombay unveils BharatGen Param2: a 17B MoE with tool calling across all 22 scheduled Indian languages, plus Shrutam2 ASR and Patram document vision
MiniMax M3 open weights ship on Hugging Face: 428B MoE with 1M sparse-attention context, native multimodality, and computer use
Google ships Agentic RAG on Gemini Enterprise with a Sufficient Context Agent that stops when retrieval is incomplete
The agent-to-agent layer consolidates: Microsoft Foundry adds A2A support at Build 2026 as the protocol passes 150 organizations
Cortex ships persistent memory for Claude Code: a local, pgvector-backed engine exposing 49 MCP tools
How Multi-Agent Systems ships in our engagements
The pages below are the buyer-focused, conversion-grade versions of this topic — deliverables, methodology, ROI, security considerations, and CTAs to scope a real engagement.
Agentic AI Consulting
Designed, built, and handed off — production agentic systems for enterprise teams.
Explore the Agentic AI Consulting solution
AI Systems Engineering Training
Eight-day corporate training programs that take dev teams from AI-assisted coding to production agentic systems.
Explore the AI Systems Engineering Training solution
Enterprise AI Architecture
Reference architectures for organisations standing up an AI platform — not one agent, but the foundation for many.
Explore the Enterprise AI Architecture solution
AI Observability
Tracing, eval, cache-hit telemetry, and cost attribution for production agents.
Explore the AI Observability solution
Multi-Agent Workflows
Supervisor + handoff orchestration for portfolios of agents that need to cooperate without arguing.
Explore the Multi-Agent Workflows solution
Full specs for implementers
Multi-Agent Systems — the questions teams actually ask
Train your team on Multi-Agent Systems
Two tracks — one for developers who build agents, one for business teams who use them. Customised to your stack, hands-on from session 1.
See Multi-Agent Systems training tracks
Ship your first Multi-Agent Systems system
Architecture design, production implementation on Claude API and MCP, full observability, and a real handoff. Working agents, not slides.
Explore Multi-Agent Systems consulting
Adjacent topics to read next

