**DRAFT — pending editorial expansion.** This article is a working draft published as scaffolding for the NINtec content programme. The current version covers the substantive perspective in compressed form; the published version will expand each section to the 2,000+ word depth the topic warrants. Editorial review is required before promotion.
Multi-agent systems are the architectural pattern most often requested and least often genuinely needed. Single-agent loops handle most production workloads; multi-agent comes with its own failure modes — orchestration deadlocks, cost amplification, debugging complexity. This piece covers the decision framework and the engineering discipline.
When multi-agent is genuinely required
Genuine multi-agent fit: workloads where specialisation matters (a research agent, a planning agent, an execution agent), workloads where parallel exploration adds value (multiple agents proposing solutions concurrently), workloads where boundaries are clear (manager-worker hierarchies).
Single-agent often suffices
Most workloads we evaluate end up as single-agent. The agent can call multiple tools, pursue multiple strategies, and self-correct — without the multi-agent overhead. The Discovery phase weighs whether multi-agent is genuinely required or whether single-agent is simpler and more robust.
Orchestration patterns
Hierarchical (manager assigns work to specialist agents) is the most common production pattern. Peer-to-peer (agents collaborate without central coordination) is harder to debug. Anthropic's Agent SDK provides primitives for hierarchical orchestration; LangGraph supports more complex state machines.
Failure-mode-first design
Multi-agent systems fail in distinctive ways — orchestration deadlocks, cost amplification, conflict between agents. The architecture phase maps the failure modes before scoping the happy path. Production multi-agent systems include explicit timeout, escalation, and rollback patterns.
Multi-agent engagements run 16–24 weeks because the orchestration complexity adds calendar to the build. The shadow-mode validation period is longer than single-agent equivalents.