Multi-Agent Case Studies

Real-world examples of multi-agent systems in production, drawn from field experience.

Case Study Index

| # | Name | Pattern | Agents | Key Lesson |
|---|------|---------|--------|------------|
| 1 | AI Docs Loader | Sub-agent delegation | 8-10 | Parallel work without context pollution |
| 2 | SDK Migration | Scout-plan-build | 6 | Search + plan + implement workflow |
| 3 | Codebase Summarization | Orchestrator + QA | 3 | Divide and conquer with synthesis |
| 4 | UI Component Creation | Scout-builder | 2 | Precise targeting before building |
| 5 | PLAN-BUILD-REVIEW-SHIP | Task board lifecycle | 4 | Quality gates between phases |
| 6 | Meta-Agent System | Agent building agents | Variable | Recursive agent creation |
| 7 | Observability Dashboard | Fleet monitoring | 5-10+ | Real-time multi-agent visibility |
| 8 | AFK Agent Device | Autonomous background work | 3-5 | Out-of-loop while you sleep |

Case Study 1: AI Docs Loader

Pattern: Sub-agent delegation for parallel work

Problem: Loading 10 documentation URLs consumes 30k+ tokens per scrape. A single agent handling all of them would hit 150k+ tokens.

Solution: Delegate each scrape to isolated sub-agent

Architecture:

Primary Agent (9k tokens)
├→ Sub-Agent 1: Scrape doc 1 (3k tokens, isolated)
├→ Sub-Agent 2: Scrape doc 2 (3k tokens, isolated)
├→ Sub-Agent 3: Scrape doc 3 (3k tokens, isolated)
...
└→ Sub-Agent 10: Scrape doc 10 (3k tokens, isolated)

Total work: 39k tokens
Primary agent: Only 9k tokens ✅
Context protected: 30k tokens kept out of primary

Implementation:

# Single command
/load-ai-docs

# Agent reads list from ai-docs/README.md
# For each URL older than 24 hours:
#   - Spawn sub-agent
#   - Sub-agent scrapes URL
#   - Sub-agent saves to file
#   - Sub-agent reports completion
# Primary agent never sees scrape content
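A minimal sketch of this delegation pattern, using only the Python standard library; the scrape_to_file function below is a hypothetical stand-in for spawning an isolated sub-agent, not the actual /load-ai-docs implementation. The property being illustrated is that each scrape runs in its own worker and only a one-line report flows back to the caller.

# load_ai_docs_sketch.py - illustration only, not the actual /load-ai-docs implementation
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

DOC_URLS = ["https://example.com/doc1", "https://example.com/doc2"]  # stand-in list
CACHE_DIR = Path("ai-docs")
MAX_AGE_SECONDS = 24 * 60 * 60

def needs_refresh(url: str) -> bool:
    """Only re-scrape files older than 24 hours, mirroring the command's freshness check."""
    target = CACHE_DIR / (url.split("/")[-1] + ".md")
    return not target.exists() or time.time() - target.stat().st_mtime > MAX_AGE_SECONDS

def scrape_to_file(url: str) -> str:
    """Stand-in for one isolated sub-agent: fetch, save to disk, return a short report."""
    # In the real system this runs in a separate context window, so the scraped
    # content never enters the primary agent's context.
    target = CACHE_DIR / (url.split("/")[-1] + ".md")
    target.write_text(f"scraped content for {url}")  # placeholder for the actual scrape
    return f"saved {target}"

def main() -> None:
    CACHE_DIR.mkdir(exist_ok=True)
    stale = [u for u in DOC_URLS if needs_refresh(u)]
    # Fan out: every scrape runs in parallel; the caller only sees the short reports.
    with ThreadPoolExecutor(max_workers=10) as pool:
        for report in pool.map(scrape_to_file, stale):
            print(report)

if __name__ == "__main__":
    main()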

Key techniques:

  • Sub-agents for isolation - Each scrape in separate context
  • Parallel execution - All 10 scrapes run simultaneously
  • Context delegation - 30k tokens stay out of primary

Results:

  • Time: 10 scrapes run in parallel instead of sequentially (~10x faster)
  • Context: Primary agent stays at 9k tokens throughout
  • Scalability: Can handle 50+ URLs without primary context issues

Source: Elite Context Engineering transcript


Case Study 2: SDK Migration

Pattern: Scout-plan-build with multiple perspectives

Problem: Migrating codebase to new Claude Agent SDK across 8 applications

Challenge:

  • 100+ files potentially affected
  • Agent reading everything = 150k+ tokens
  • Planning without full context = mistakes

Solution: Three-phase workflow with delegation

Phase 1: Scout (Reduce context for planner)

Orchestrator spawns 4 scout agents (parallel):
├→ Scout 1: Gemini Lightning (fast, different perspective)
├→ Scout 2: CodeX (specialized for code search)
├→ Scout 3: Gemini Flash Preview
└→ Scout 4: Haiku (cheap, fast)

Each scout:
- Searches codebase for SDK usage
- Identifies exact files and line numbers
- Notes patterns (e.g., "system prompt now explicit")

Output: relevant-files.md (5k tokens)
├── File paths
├── Line number offsets
├── Character ranges
└── Relevant code snippets

Why multiple models? Diverse perspectives catch edge cases a single model might miss.

Phase 2: Plan (Focus on relevant subset)

Planner agent (new instance):
├── Reads relevant-files.md (5k tokens)
├── Scrapes SDK documentation (8k tokens)
├── Analyzes migration patterns
└── Creates detailed-plan.md (3k tokens)

Context used: 16k tokens
vs. 150k if reading entire codebase
Savings: 89% reduction

Phase 3: Build (Execute plan)

Builder agent (new instance):
├── Reads detailed-plan.md (3k tokens)
├── Implements changes across 8 apps
├── Updates system prompts
├── Tests each application
└── Reports completion

Context used: ~80k tokens
Still within safe limits
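A sketch of how the three phases chain through files, assuming hypothetical run_scout, run_planner, and run_builder helpers that each stand in for a fresh agent. The only thing that crosses a phase boundary is a small markdown artifact, which is what keeps any single context window small.

# scout_plan_build_sketch.py - illustrative chaining; the helper functions are hypothetical
from pathlib import Path

def run_scout(task: str, model: str) -> str:
    """Stand-in for one scout agent: search the codebase, return findings as markdown."""
    return f"## findings from {model}\n(files, line numbers, snippets for: {task})\n"

def run_planner(relevant_files: str, docs: str) -> str:
    """Stand-in for the planner agent: reads only the filtered findings plus SDK docs."""
    return f"# Migration plan\nBased on {len(relevant_files)} chars of findings and docs at {docs}\n"

def run_builder(plan: str) -> str:
    """Stand-in for the builder agent: executes the plan, returns a completion report."""
    return f"Implemented plan ({len(plan)} chars) across 8 apps"

def main() -> None:
    # Phase 1: several scouts (ideally parallel, each in its own context) write one artifact.
    findings = "".join(run_scout("find Claude Agent SDK usage", m)
                       for m in ["model-a", "model-b", "model-c", "model-d"])
    Path("relevant-files.md").write_text(findings)

    # Phase 2: a fresh planner reads the small artifact, never the whole codebase.
    plan = run_planner(Path("relevant-files.md").read_text(), docs="sdk-docs-url")
    Path("detailed-plan.md").write_text(plan)

    # Phase 3: a fresh builder reads only the plan.
    print(run_builder(Path("detailed-plan.md").read_text()))

if __name__ == "__main__":
    main()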

Final context analysis:

If single agent:
├── Search: 40k tokens
├── Read files: 60k tokens
├── Plan: 20k tokens
├── Implement: 30k tokens
└── Total: 150k tokens (75% used)

With scout-plan-build:
├── Primary orchestrator: 10k tokens
├── 4 scouts (parallel, isolated): 4 × 15k = 60k total, 0k in primary
├── Planner (new agent): 16k tokens
├── Builder (new agent): 80k tokens
└── Max per agent: 80k tokens (40% per agent)

Key techniques:

  • Composable workflows - Chain /scout, /plan, /build
  • Multiple scout models - Diverse perspectives
  • Context offloading - Scouts protect planner's context
  • Fresh agents per phase - No context accumulation

Results:

  • 8 applications migrated successfully
  • 51% context used in builder phase (safe margins)
  • No context explosions across entire workflow
  • Completed in single session (~30 minutes)

Near miss: "We were 14% away from exploding our context" due to the autocompact buffer

Lesson: Disable the autocompact buffer. That 22% matters at scale.

Source: Claude 2.0 transcript


Case Study 3: Codebase Summarization

Pattern: Orchestrator with specialized QA agents

Problem: Summarize large codebase (frontend + backend) with architecture docs

Approach: Divide and conquer with synthesis

Architecture:

Orchestrator Agent
├→ Creates Frontend QA Agent
│  ├─ Summarizes frontend components
│  └─ Outputs: frontend-summary.md
├→ Creates Backend QA Agent
│  ├─ Summarizes backend APIs
│  └─ Outputs: backend-summary.md
└→ Creates Primary QA Agent
   ├─ Reads both summaries
   ├─ Synthesizes unified view
   └─ Outputs: codebase-overview.md

Orchestrator behavior:

1. Parse user request: "Summarize codebase"
2. Create 3 agents with specialized tasks
3. Command each agent with detailed prompts
4. SLEEP (not observing their work)
5. Wake every 15s to check status
6. Agents complete → Orchestrator wakes
7. Collect results (read produced files)
8. Summarize for user
9. Delete all 3 agents
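The sleep-and-poll loop above can be sketched as follows. create_agent, agent_status, and delete_agent are hypothetical stand-ins for whatever agent-management API the orchestrator uses; the property being illustrated is that the orchestrator only ever handles short status strings and finished summary files, never the agents' working context.

# orchestrator_sketch.py - hypothetical agent-management helpers; pattern illustration only
import time

def create_agent(name: str, prompt: str) -> str:
    """Stand-in: start an agent with a detailed task prompt, return its ID."""
    return f"{name}-001"

def agent_status(agent_id: str) -> str:
    """Stand-in: cheap status check ('running' or 'complete'), not the agent's transcript."""
    return "complete"

def delete_agent(agent_id: str) -> None:
    """Stand-in: free the agent once its output file exists."""

def summarize_codebase() -> None:
    agents = [
        create_agent("frontend-qa", "Summarize src/frontend/ -> docs/frontend-summary.md"),
        create_agent("backend-qa", "Summarize src/backend/ -> docs/backend-summary.md"),
    ]
    # Orchestrator "sleeps": wakes every 15s, checks status only, reads nothing else.
    while any(agent_status(a) != "complete" for a in agents):
        time.sleep(15)

    # The synthesis agent runs only after both summaries exist on disk.
    qa = create_agent("primary-qa", "Read both summaries -> docs/codebase-overview.md")
    while agent_status(qa) != "complete":
        time.sleep(15)

    for a in agents + [qa]:
        delete_agent(a)

if __name__ == "__main__":
    summarize_codebase()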

Prompts from orchestrator:

Frontend QA Agent:
"Analyze all files in src/frontend/. Create markdown summary with:
- Key components and their responsibilities
- State management approach
- Routing structure
- Technology stack
Output to docs/frontend-summary.md"

Backend QA Agent:
"Analyze all files in src/backend/. Create markdown summary with:
- API endpoints and their purposes
- Database schema
- Authentication/authorization
- External integrations
Output to docs/backend-summary.md"

Primary QA Agent:
"Read frontend-summary.md and backend-summary.md. Create unified overview with:
- High-level architecture
- How components interact
- Data flow
- Key technologies
Output to docs/codebase-overview.md"

Observability interface shows:

[Agent 1] Frontend QA
├── Status: Complete ✅
├── Context: 28k tokens used
├── Files consumed: 15 files
├── Files produced: frontend-summary.md
└── Time: 45 seconds

[Agent 2] Backend QA
├── Status: Complete ✅
├── Context: 32k tokens used
├── Files consumed: 12 files
├── Files produced: backend-summary.md
└── Time: 52 seconds

[Agent 3] Primary QA
├── Status: Complete ✅
├── Context: 18k tokens used
├── Files consumed: 2 files (summaries)
├── Files produced: codebase-overview.md
└── Time: 30 seconds

Orchestrator:
├── Context: 12k tokens (commands only, not observing work)
├── Total time: 52 seconds (parallel execution)
└── All agents deleted after completion

Key techniques:

  • Parallel frontend/backend - 2x speedup
  • Orchestrator sleeps - Protects its context
  • Synthesis agent - Combines perspectives
  • Deletable agents - Freed after use

Results:

  • 3 comprehensive docs created
  • Max context per agent: 32k tokens (16%)
  • Orchestrator context: 12k tokens (6%)
  • Time: 52 seconds (vs. 2+ minutes sequential)

Source: One Agent to Rule Them All transcript


Case Study 4: UI Component Creation

Pattern: Scout-builder two-stage

Problem: Create gray pills for app header information display

Challenge: Codebase has specific conventions. Need to find exact files and follow patterns.

Solution: Scout locates, builder implements

Phase 1: Scout

Scout Agent:
├── Task: "Find header UI component files"
├── Searches for: header, display, pills, info components
├── Identifies patterns: existing pill styles, color conventions
├── Locates exact files:
│   ├── src/components/AppHeader.vue
│   ├── src/styles/pills.css
│   └── src/utils/formatters.ts
└── Outputs: scout-header-report.md with:
    ├── File locations
    ├── Line numbers for modifications
    ├── Existing patterns to follow
    └── Recommended approach
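One way to read the scout's output is as a structured handoff. The sketch below models scout-header-report.md as a small dataclass so the builder starts with exact file paths, line numbers, and conventions instead of searching on its own; the field names and values are illustrative, not the actual report format.

# scout_report_sketch.py - illustrative handoff schema, not the actual report format
from dataclasses import dataclass, field

@dataclass
class ScoutReport:
    files: list[str] = field(default_factory=list)              # exact files to touch
    line_targets: dict[str, int] = field(default_factory=dict)  # where to modify
    patterns: list[str] = field(default_factory=list)           # conventions to follow
    approach: str = ""                                           # recommended plan

report = ScoutReport(
    files=["src/components/AppHeader.vue", "src/styles/pills.css"],
    line_targets={"src/components/AppHeader.vue": 42},  # illustrative line number
    patterns=["reuse the existing .pill class", "gray informational pill color"],
    approach="Add a GrayPill component and render it in the header info slot",
)

# The builder reads only this report (a few KB), not the whole frontend tree.
for path in report.files:
    print(f"builder will open {path} at line {report.line_targets.get(path, 1)}")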

Phase 2: Builder

Builder Agent:
├── Reads scout-header-report.md
├── Follows identified patterns
├── Creates gray pill components
├── Applies consistent styling
├── Outputs modified files with exact changes
└── Context: Only 30k tokens (vs. 80k+ without scout)

Orchestrator involvement:

1. User prompts: "Create gray pills for header"
2. Orchestrator creates Scout
3. Orchestrator SLEEPS (checks every 15s)
4. Scout completes → Orchestrator wakes
5. Orchestrator reads scout output
6. Orchestrator creates Builder with detailed instructions
7. Orchestrator SLEEPS again
8. Builder completes → Orchestrator wakes
9. Orchestrator reports results
10. Orchestrator deletes both agents

Key techniques:

  • Scout reduces uncertainty - Builder knows exactly where to work
  • Pattern following - Scout identifies conventions
  • Orchestrator sleep - Two phases, minimal orchestrator context
  • Precise targeting - No wasted reads

Results:

  • Scout: 15k tokens, 20 seconds
  • Builder: 30k tokens, 35 seconds
  • Orchestrator: 8k tokens final
  • Total time: 55 seconds
  • Feature shipped correctly on first try

Source: One Agent to Rule Them All transcript


Case Study 5: PLAN-BUILD-REVIEW-SHIP Task Board

Pattern: Structured lifecycle with quality gates

Problem: Ensure all changes go through proper review before shipping

Architecture:

Task Board Columns:
┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐
│  PLAN   │→ │  BUILD  │→ │ REVIEW  │→ │  SHIP   │
└─────────┘  └─────────┘  └─────────┘  └─────────┘

Example task: "Update HTML titles"

Column 1: PLAN

Planner Agent:
├── Analyzes requirement
├── Identifies affected files:
│   ├── index.html
│   └── src/App.tsx (has <title> in render)
├── Creates implementation plan:
│   1. Update index.html <title>
│   2. Update App.tsx header component
│   3. Test both pages load correctly
└── Moves task to BUILD column

Column 2: BUILD

Builder Agent:
├── Reads plan from PLAN column
├── Implements changes:
│   ├── index.html: "Plan Build Review Ship"
│   └── App.tsx: header="Plan Build Review Ship"
├── Runs tests: All passing ✅
└── Moves task to REVIEW column

Column 3: REVIEW

Reviewer Agent:
├── Reads plan and implementation
├── Checks:
│   ├── Plan followed? ✅
│   ├── Tests passing? ✅
│   ├── Code quality? ✅
│   └── No security issues? ✅
├── Approves changes
└── Moves task to SHIP column

Column 4: SHIP

Shipper Agent:
├── Creates git commit
├── Pushes to remote
├── Updates deployment
└── Marks task complete

Orchestrator's role:

- NOT micromanaging each step
- Responding to user commands like "Move task to next phase"
- Tracking task state in database
- Providing UI showing current phase
- Can intervene if phase fails (e.g., tests fail in BUILD)
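A minimal sketch of the gate logic, assuming the orchestrator tracks tasks in some store. The only rule encoded here is the one this case study relies on: a task can move only to the next column, so nothing reaches SHIP without passing REVIEW.

# task_board_sketch.py - phase-gate illustration; storage and agents are out of scope here
from enum import Enum

class Phase(Enum):
    PLAN = 0
    BUILD = 1
    REVIEW = 2
    SHIP = 3

class Task:
    def __init__(self, title: str):
        self.title = title
        self.phase = Phase.PLAN
        self.history: list[str] = []

    def advance(self, agent_id: str) -> None:
        """Move to the next column only; skipping REVIEW is impossible by construction."""
        if self.phase == Phase.SHIP:
            raise ValueError("Task already shipped")
        self.history.append(f"{self.phase.name} completed by {agent_id}")
        self.phase = Phase(self.phase.value + 1)

task = Task("Update HTML titles")
task.advance("planner-001")   # PLAN -> BUILD
task.advance("builder-002")   # BUILD -> REVIEW
task.advance("reviewer-003")  # REVIEW -> SHIP
print(task.phase.name, task.history)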

UI representation:

Task: Update Titles
├── Status: REVIEW
├── Assigned: reviewer-agent-003
├── History:
│   ├── PLAN: planner-001 (completed 2m ago)
│   ├── BUILD: builder-002 (completed 1m ago)
│   └── REVIEW: reviewer-003 (in progress)
└── Files modified: 2

Key techniques:

  • Clear phases - No ambiguity about current state
  • Quality gates - Can't skip to SHIP without REVIEW
  • Agent specialization - Each agent expert in its phase
  • Failure isolation - If BUILD fails, PLAN preserved

Results:

  • No untested code shipped (REVIEW gate catches issues)
  • Clear audit trail (who did what in which phase)
  • Parallel tasks (multiple agents in different columns)
  • Single interface (user sees all tasks across all phases)

Source: Custom Agents transcript


Case Study 6: Meta-Agent System

Pattern: Agents building agents

Problem: Need new specialized agent but don't want to hand-write configuration

Solution: Meta-agent that builds other agents

Meta-agent prompt:

# meta-agent.md

You are a meta-agent that builds new sub-agents from user descriptions.

When user says "build a new sub-agent":
1. Ask what the agent should do
2. Fetch Claude Code sub-agent documentation
3. Design system prompt for new agent
4. Create agent configuration file
5. Test agent with sample prompts
6. Report usage examples

Output: .claude/agents/<agent-name>.md with complete configuration

Example: Building TTS summary agent

User: "Build agent that summarizes what my code does using text-to-speech"

Meta-agent process:

Step 1: Understand requirements
├── Parse: "summarize code" + "text-to-speech"
├── Infer: Needs code reading + TTS API access
└── Clarify: Voice provider? (user chooses 11Labs)

Step 2: Fetch documentation
├── Reads Claude Code sub-agent docs
├── Reads 11Labs API docs
└── Understands agent configuration format

Step 3: Design system prompt
├── Purpose: Concise code summaries via voice
├── Tools needed: read files, 11Labs TTS
├── Response format: Audio file output
└── Trigger: "use TTS summary"

Step 4: Create configuration
Writes .claude/agents/tts-summary.md:
---
name: tts-summary
description: Concisely summarizes code with text-to-speech. Trigger: "TTS summary"
---
Purpose: Review user's code and provide 1-sentence summary via 11Labs voice
[... full system prompt ...]

Step 5: Test
├── Runs test prompt: "TTS summary for hooks.py"
├── Agent reads file, generates summary
├── Outputs audio with summary
└── Validates: Works correctly ✅

Step 6: Report
├── Explains how to use new agent
├── Shows example prompts
└── Notes: Can adjust voice, length, etc.

Result: Fully functional TTS summary agent created from natural language description
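The heart of step 4 is writing the agent file. A sketch of what that might look like is below; the frontmatter fields mirror the example above, but the helper function and its inputs are hypothetical, not the meta-agent's actual code.

# write_agent_config_sketch.py - illustrative emitter for a sub-agent file; format assumed
from pathlib import Path

def write_agent(name: str, description: str, system_prompt: str) -> Path:
    """Write .claude/agents/<name>.md with YAML frontmatter followed by the system prompt."""
    target = Path(".claude/agents") / f"{name}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n"
        f"{system_prompt}\n"
    )
    return target

path = write_agent(
    name="tts-summary",
    description='Concisely summarizes code with text-to-speech. Trigger: "TTS summary"',
    system_prompt="Purpose: Review the user's code and provide a 1-sentence summary via 11Labs voice",
)
print(f"Created {path}")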

Recursion depth:

Level 0: Human user
  └→ Level 1: Meta-agent (builds agents)
      └→ Level 2: TTS summary agent (built by meta-agent)
          └→ Level 3: Sub-agents (if TTS agent spawns any)

Key techniques:

  • Documentation fetching - Meta-agent reads official docs
  • Template following - Follows agent configuration patterns
  • Validation loop - Tests before declaring success
  • Recursive creation - Agents can build agents

Challenges:

  • Dependency coupling - New agent depends on meta-agent's understanding
  • Debugging difficulty - If generated agent fails, hard to trace
  • Version drift - Meta-agent's docs knowledge may become outdated

Results:

  • New agent in ~2 minutes vs. 15+ minutes manually
  • Follows best practices automatically
  • Tested before delivery
  • Documented usage

Source: Sub-Agents transcript


Case Study 7: Observability Dashboard

Pattern: Real-time multi-agent monitoring

Problem: Running 5-10 agents simultaneously. Can't see what they're doing.

Solution: Centralized observability system

Architecture:

┌──────────────────── Multiple Agents ────────────────────┐
│  Agent 1    Agent 2    Agent 3    Agent 4    Agent 5    │
│    ↓          ↓          ↓          ↓          ↓        │
│             pre/post-tool-use hooks                      │
│                        ↓                                 │
└──────────────────────────────────────────────────────────┘
                         ↓
┌──────────────────── Bun Server ─────────────────────────┐
│  POST /events endpoint                                   │
│         ├→ Store in SQLite (persistence)                 │
│         └→ Broadcast via WebSocket (real-time)           │
└──────────────────────────────────────────────────────────┘
                         ↓
┌──────────────────── Web Client ─────────────────────────┐
│  ┌─────────────────────────────────────────────────┐    │
│  │ Live Activity Pulse (1min/3min/5min windows)    │    │
│  ├─────────────────────────────────────────────────┤    │
│  │ Event Stream (filtered by app/session/type)     │    │
│  ├─────────────────────────────────────────────────┤    │
│  │ Event Details (with AI-generated summaries)     │    │
│  └─────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────┘

Hook implementation:

# .claude/hooks/post-tool-use.py
import sys, json, subprocess

def main():
    event = json.load(sys.stdin)

    # Send to observability server (with AI summary)
    subprocess.run([
        "uv", "run",
        ".claude/hooks/utils/send-event.py",
        "my-codebase",          # App name
        "post-tool-use",        # Event type
        "--summarize"           # Generate Haiku summary
    ], input=json.dumps(event), text=True)

if __name__ == "__main__":
    main()

What user sees:

┌─────────────── Live Activity Pulse ───────────────┐
│ ▂▄▆█▆▄▂▁ Agent A (very active)                    │
│ ▁▁▂▂▃▃▂▂ Agent B (moderate activity)              │
│ ▂▂▂▂▂▂▂▂ Agent C (steady work)                    │
│ ▁▁▁█▁▁▁▁ Agent D (spike, then quiet)              │
└────────────────────────────────────────────────────┘

┌─────────────── Event Stream ──────────────────────┐
│ [Agent A] post-tool-use                            │
│   Summary: "Wrote authentication logic to user.py"│
│   Time: 2s ago                                     │
├────────────────────────────────────────────────────┤
│ [Agent B] sub-agent-stop                           │
│   Summary: "Completed documentation scrape"        │
│   Time: 5s ago                                     │
├────────────────────────────────────────────────────┤
│ [Agent C] notification                             │
│   Summary: "Needs approval for rm command"         │
│   Time: 8s ago                                     │
└────────────────────────────────────────────────────┘

Filtering:

Filters available:
├── By app (codebase-1, codebase-2, etc.)
├── By agent session ID
├── By event type (pre-tool, post-tool, stop, etc.)
└── By time window (1min, 3min, 5min)

Event summarization:

# Each event summarized by Haiku ($0.0002 per event)
Event: post-tool-use for Write tool
Input: {file: "auth.py", content: "...500 lines..."}
Output: Success

Summary generated:
"Implemented JWT authentication with refresh tokens in auth.py"

Cost: $0.0002
Human value: Instant understanding without reading 500 lines
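A sketch of what the summarization step inside send-event.py might look like, assuming the Anthropic Python SDK and a Haiku-class model. The model alias, prompt wording, and truncation limit here are assumptions for illustration, not the repository's actual code.

# summarize_event_sketch.py - illustrative; requires ANTHROPIC_API_KEY and the anthropic package
import json
import anthropic

def summarize_event(event: dict) -> str:
    """Ask a small, cheap model for a one-line human-readable summary of a hook event."""
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-3-5-haiku-latest",   # assumed alias; any Haiku-class model works
        max_tokens=60,
        messages=[{
            "role": "user",
            "content": "Summarize this agent tool event in one sentence:\n"
                       + json.dumps(event)[:4000],  # truncate large payloads before sending
        }],
    )
    return message.content[0].text.strip()

if __name__ == "__main__":
    print(summarize_event({"tool": "Write", "file": "auth.py", "status": "success"}))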

Key techniques:

  • One-way data stream - Simple, fast, scalable
  • Edge summarization - AI summaries generated at hook time
  • Dual storage - SQLite (history) + WebSocket (real-time)
  • Color coding - Consistent colors per agent session

Results:

  • 5-10 agents monitored simultaneously
  • Thousands of events logged (cost: ~$0.20)
  • Real-time visibility into all agent work
  • Historical analysis via SQLite queries

Business value:

  • Catch errors fast (notification events = agent blocked)
  • Optimize workflows (which tools used most?)
  • Debug issues (what happened before failure?)
  • Scale confidence (can observe 10+ agents easily)

Source: Multi-Agent Observability transcript


Case Study 8: AFK Agent Device

Pattern: Autonomous background work while you're away

Problem: Long-running tasks block your terminal. You want to work on something else.

Solution: Dedicated device running agent fleet

Architecture:

Your Device (interactive):
├── Claude Code session
├── Send job to agent device
└── Monitor status updates

Agent Device (autonomous):
├── Picks up job from queue
├── Executes: Scout → Plan → Build → Ship
├── Reports status every 60s
└── Ships results to git

Workflow:

# From your device
/afk-agents \
  --prompt "Build 3 OpenAI SDK agents: basic, with-tools, realtime-voice" \
  --adw "plan-build-ship" \
  --docs "https://openai-agent-sdk.com/docs"

# Job sent to dedicated device
# You continue working on your device
# Background: Agent device executes workflow

Agent device execution:

[00:00] Job received: Build 3 SDK agents
[00:05] Planner agent created
[00:45] Plan complete: 3 agents specified
[01:00] Builder agent 1 created (basic agent)
[02:30] Builder agent 1 complete: basic-agent.py ✅
[02:35] Builder agent 2 created (with tools)
[04:15] Builder agent 2 complete: agent-with-tools.py ✅
[04:20] Builder agent 3 created (realtime voice)
[07:45] Builder agent 3 partial: needs audio libraries
[08:00] Builder agent 3 complete: realtime-agent.py ⚠️ (partial)
[08:05] Shipper agent created
[08:20] Git commit created
[08:25] Pushed to remote
[08:30] Job complete ✅

Status updates (every 60s):

Your device shows:

[60s] Status: Planning agents...
[120s] Status: Building agent 1 of 3...
[180s] Status: Building agent 2 of 3...
[240s] Status: Building agent 3 of 3...
[300s] Status: Testing agents...
[360s] Status: Shipping to git...
[420s] Status: Complete ✅

Click to view: results/sdk-agents-20250105/

What you do:

1. Send job (10 seconds)
2. Go AFK (work on something else)
3. Get notified when complete (7 minutes later)
4. Review results

Key techniques:

  • Job queue - Agents pick up work from queue
  • Async status - Reports back periodically
  • Autonomous execution - No human in the loop
  • Git integration - Results automatically committed

Results:

  • 3 SDK agents built in 7 minutes
  • You worked on other things during that time
  • Autonomous end-to-end - plan + build + test + ship
  • Code review - Quick glance confirms quality

Infrastructure required:

  • Dedicated machine (M4 Mac Mini, cloud VM, etc.)
  • Agent queue system
  • Job scheduler
  • Status reporting
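The queue and status-reporting pieces above do not need to be elaborate; a file-based sketch is below, with the real Scout → Plan → Build → Ship workflow reduced to a placeholder. The 60-second status cadence comes from this case study; directory names and everything else are illustrative assumptions.

# afk_worker_sketch.py - minimal job queue + status loop for a dedicated agent device
import json
import time
from pathlib import Path

QUEUE_DIR = Path("jobs/pending")
STATUS_DIR = Path("jobs/status")

def run_workflow(job: dict, report) -> None:
    """Placeholder for the real Scout -> Plan -> Build -> Ship workflow."""
    for phase in ["planning", "building", "testing", "shipping"]:
        report(f"{phase}...")
        time.sleep(1)  # the real phases take minutes; shortened for the sketch

def main() -> None:
    QUEUE_DIR.mkdir(parents=True, exist_ok=True)
    STATUS_DIR.mkdir(parents=True, exist_ok=True)
    for job_file in sorted(QUEUE_DIR.glob("*.json")):
        job = json.loads(job_file.read_text())
        status_file = STATUS_DIR / job_file.name

        def report(msg: str) -> None:
            # The requesting device polls this file (in the case study, every 60s).
            status_file.write_text(json.dumps({"job": job_file.name, "status": msg}))

        report("picked up")
        run_workflow(job, report)
        report("complete")
        job_file.unlink()  # remove the job from the queue once shipped

if __name__ == "__main__":
    main()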

Use cases:

  • Long-running builds
  • Overnight work
  • Prototyping experiments
  • Documentation generation
  • Codebase refactors

Source: Claude 2.0 transcript


Cross-Cutting Patterns

Pattern: Context Window as Resource Constraint

Appears in:

  • Case 1: Sub-agent delegation protects primary
  • Case 2: Scout-plan-build reduces planner context
  • Case 3: Orchestrator sleeps to protect its context
  • Case 8: Fresh agents for each phase (no accumulation)

Lesson: Context is precious. Protect it aggressively.

Pattern: Specialized Agents Over General

Appears in:

  • Case 3: Frontend/Backend/QA agents vs. one do-everything agent
  • Case 4: Scout finds, builder builds (not one agent doing both)
  • Case 5: Planner/builder/reviewer/shipper (4 specialists)
  • Case 6: Meta-agent only builds, doesn't execute

Lesson: "A focused agent is a performant agent."

Pattern: Observability Enables Scale

Appears in:

  • Case 3: Orchestrator tracks agent status
  • Case 5: Task board shows current phase
  • Case 7: Real-time dashboard for all agents
  • Case 8: Status updates every 60s

Lesson: "If you can't measure it, you can't scale it."

Pattern: Deletable Temporary Resources

Appears in:

  • Case 3: All 3 agents deleted after completion
  • Case 4: Scout and builder deleted
  • Case 5: Each phase agent deleted after task moves
  • Case 8: Builder agents deleted after shipping

Lesson: "The best agent is a deleted agent."

Performance Comparisons

Single Agent vs. Multi-Agent

| Task | Single Agent | Multi-Agent | Speedup |
|------|--------------|-------------|---------|
| Load 10 docs | 150k tokens, 5min | 30k primary, 2min | 2.5x faster, 80% less context |
| SDK migration | Fails (overflow) | 80k max/agent, 30min | Completes vs. fails |
| Codebase summary | 120k tokens, 3min | 32k max/agent, 52s | 3.5x faster |
| UI components | 80k tokens, 2min | 30k max, 55s | 2.2x faster |

With vs. Without Orchestration

| Metric | Manual (no orchestrator) | With Orchestrator |
|--------|--------------------------|-------------------|
| Commands per task | 8-12 manual prompts | 1 prompt to orchestrator |
| Context management | Manual (forget limits) | Automatic (orchestrator sleeps) |
| Error recovery | Start over | Retry failed phase only |
| Observability | Terminal logs | Real-time dashboard |

Common Failure Modes

Failure: Context Explosion

Scenario: Case 2 without scouts

  • Single agent reads 100+ files
  • Context hits 180k tokens
  • Agent slows down, makes mistakes
  • Eventually fails or times out

Fix: Add scout phase to filter files first

Failure: Orchestrator Watching Everything

Scenario: Case 3 with observing orchestrator

  • Orchestrator watches all agent work
  • Orchestrator context grows to 100k+
  • Can't coordinate more than 2-3 agents
  • System doesn't scale

Fix: Implement orchestrator sleep pattern

Failure: No Observability

Scenario: Case 7 without dashboard

  • 5 agents running
  • One agent stuck on permission request
  • No way to know which agent needs attention
  • Entire workflow blocked

Fix: Add hooks + observability system

Failure: Agent Accumulation

Scenario: Case 5 not deleting agents

  • 20 tasks completed
  • 80 agents still running (4 per task)
  • System resources exhausted
  • New agents can't start

Fix: Delete agents after task completion

Key Takeaways

  1. Parallelization = Sub-agents - Nothing else runs agents in parallel

  2. Context protection = Specialization - Focused agents use less context

  3. Orchestration = Scale - Single interface manages fleet

  4. Observability = Confidence - Can't scale what you can't see

  5. Deletable = Sustainable - Free resources for next task

  6. Multi-agent is Level 5 - Requires mastering Levels 1-4 first

When to Use Multi-Agent Patterns

Use multi-agent when:

  • Task naturally divides into parallel subtasks
  • Single agent context approaching limits
  • Need quality gates between phases
  • Want to work on other things while agents execute
  • Have observability infrastructure

Don't use multi-agent when:

  • Simple one-off task
  • Learning/prototyping phase
  • No way to monitor agents
  • Task requires tight human-in-loop feedback

Source Attribution

All case studies drawn from field experience documented in 8 source transcripts:

  1. Elite Context Engineering - Case 1 (AI docs loader)
  2. Claude 2.0 - Case 2 (SDK migration), Case 8 (AFK device)
  3. Custom Agents - Case 5 (task board)
  4. Sub-Agents - Case 6 (meta-agent)
  5. Multi-Agent Observability - Case 7 (dashboard)
  6. Hooked - Supporting patterns
  7. One Agent to Rule Them All - Case 3 (summarization), Case 4 (UI components)
  8. (Transcript 8 name not specified in context)

Remember: These are real systems in production. Start simple, add complexity only when needed.