Multi-Agent Case Studies
Real-world examples of multi-agent systems in production, drawn from field experience.
Case Study Index
| # | Name | Pattern | Agents | Key Lesson |
|---|---|---|---|---|
| 1 | AI Docs Loader | Sub-agent delegation | 8-10 | Parallel work without context pollution |
| 2 | SDK Migration | Scout-plan-build | 6 | Search + plan + implement workflow |
| 3 | Codebase Summarization | Orchestrator + QA | 3 | Divide and conquer with synthesis |
| 4 | UI Component Creation | Scout-builder | 2 | Precise targeting before building |
| 5 | PLAN-BUILD-REVIEW-SHIP | Task board lifecycle | 4 | Quality gates between phases |
| 6 | Meta-Agent System | Agent building agents | Variable | Recursive agent creation |
| 7 | Observability Dashboard | Fleet monitoring | 5-10+ | Real-time multi-agent visibility |
| 8 | AFK Agent Device | Autonomous background work | 3-5 | Out-of-loop while you sleep |
Case Study 1: AI Docs Loader
Pattern: Sub-agent delegation for parallel work
Problem: Loading 10 documentation URLs consumes 30k+ tokens per scrape; a single agent handling all of them would blow past 150k tokens.
Solution: Delegate each scrape to an isolated sub-agent
Architecture:
Primary Agent (9k tokens)
├→ Sub-Agent 1: Scrape doc 1 (3k tokens, isolated)
├→ Sub-Agent 2: Scrape doc 2 (3k tokens, isolated)
├→ Sub-Agent 3: Scrape doc 3 (3k tokens, isolated)
...
└→ Sub-Agent 10: Scrape doc 10 (3k tokens, isolated)
Total work: 39k tokens
Primary agent: Only 9k tokens ✅
Context protected: 30k tokens kept out of primary
Implementation:
# Single command
/load-ai-docs
# Agent reads list from ai-docs/README.md
# For each URL older than 24 hours:
# - Spawn sub-agent
# - Sub-agent scrapes URL
# - Sub-agent saves to file
# - Sub-agent reports completion
# Primary agent never sees scrape content
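A minimal sketch of the delegation step, assuming the `claude` CLI is on PATH and supports headless invocation via `-p`; the URLs and output paths are illustrative, not the actual ai-docs list.
import asyncio

DOC_URLS = [
    "https://example.com/docs/page-1",
    "https://example.com/docs/page-2",
    # ...one entry per URL listed in ai-docs/README.md
]

async def scrape_with_subagent(url: str, index: int) -> int:
    # Each URL gets its own isolated sub-agent; its scrape never enters our context.
    prompt = (
        f"Scrape {url} and save a markdown summary to ai-docs/doc-{index}.md. "
        "Reply only with the output file path."
    )
    proc = await asyncio.create_subprocess_exec(
        "claude", "-p", prompt,
        stdout=asyncio.subprocess.DEVNULL,  # keep scrape output out of the primary context
        stderr=asyncio.subprocess.DEVNULL,
    )
    return await proc.wait()

async def main() -> None:
    # All scrapes run in parallel; the primary process only sees exit codes.
    codes = await asyncio.gather(
        *(scrape_with_subagent(url, i) for i, url in enumerate(DOC_URLS, start=1))
    )
    print(f"{sum(c == 0 for c in codes)}/{len(codes)} scrapes succeeded")

if __name__ == "__main__":
    asyncio.run(main())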
Key techniques:
- Sub-agents for isolation - Each scrape in separate context
- Parallel execution - All 10 scrapes run simultaneously
- Context delegation - 30k tokens stay out of primary
Results:
- Time: 10 scrapes in parallel vs. sequential (10x faster)
- Context: Primary agent stays at 9k tokens throughout
- Scalability: Can handle 50+ URLs without primary context issues
Source: Elite Context Engineering transcript
Case Study 2: SDK Migration
Pattern: Scout-plan-build with multiple perspectives
Problem: Migrating a codebase to the new Claude Agent SDK across 8 applications
Challenge:
- 100+ files potentially affected
- Agent reading everything = 150k+ tokens
- Planning without full context = mistakes
Solution: Three-phase workflow with delegation
Phase 1: Scout (Reduce context for planner)
Orchestrator spawns 4 scout agents (parallel):
├→ Scout 1: Gemini Lightning (fast, different perspective)
├→ Scout 2: CodeX (specialized for code search)
├→ Scout 3: Gemini Flash Preview
└→ Scout 4: Haiku (cheap, fast)
Each scout:
- Searches codebase for SDK usage
- Identifies exact files and line numbers
- Notes patterns (e.g., "system prompt now explicit")
Output: relevant-files.md (5k tokens)
├── File paths
├── Line number offsets
├── Character ranges
└── Relevant code snippets
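A sketch of how the four scout reports might be merged into the single relevant-files artifact; the JSON report format and file names below are assumptions for illustration.
import json
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class RelevantFile:
    path: str          # file path relative to the repo root
    line_start: int    # first line of the relevant region
    line_end: int      # last line of the relevant region
    snippet: str       # short excerpt showing the SDK usage

def merge_scout_reports(report_paths: list[Path]) -> list[RelevantFile]:
    merged: dict[str, RelevantFile] = {}
    for report in report_paths:
        for entry in json.loads(report.read_text()):
            rf = RelevantFile(**entry)
            if rf.path in merged:
                # Keep the widest line range any scout reported for the same file.
                prev = merged[rf.path]
                prev.line_start = min(prev.line_start, rf.line_start)
                prev.line_end = max(prev.line_end, rf.line_end)
            else:
                merged[rf.path] = rf
    return list(merged.values())

if __name__ == "__main__":
    reports = sorted(Path("scout-reports").glob("scout-*.json"))
    files = merge_scout_reports(reports)
    Path("relevant-files.json").write_text(json.dumps([asdict(f) for f in files], indent=2))
    print(f"{len(files)} unique files across {len(reports)} scout reports")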
Why multiple models? Diverse perspectives catch edge cases a single model might miss.
Phase 2: Plan (Focus on relevant subset)
Planner agent (new instance):
├── Reads relevant-files.md (5k tokens)
├── Scrapes SDK documentation (8k tokens)
├── Analyzes migration patterns
└── Creates detailed-plan.md (3k tokens)
Context used: 16k tokens
vs. 150k if reading entire codebase
Savings: 89% reduction
Phase 3: Build (Execute plan)
Builder agent (new instance):
├── Reads detailed-plan.md (3k tokens)
├── Implements changes across 8 apps
├── Updates system prompts
├── Tests each application
└── Reports completion
Context used: ~80k tokens
Still within safe limits
Final context analysis:
If single agent:
├── Search: 40k tokens
├── Read files: 60k tokens
├── Plan: 20k tokens
├── Implement: 30k tokens
└── Total: 150k tokens (75% used)
With scout-plan-build:
├── Primary orchestrator: 10k tokens
├── 4 scouts (parallel, isolated): 4 × 15k = 60k total, 0k in primary
├── Planner (new agent): 16k tokens
├── Builder (new agent): 80k tokens
└── Max per agent: 80k tokens (40% per agent)
Key techniques:
- Composable workflows - Chain /scout, /plan, /build
- Multiple scout models - Diverse perspectives
- Context offloading - Scouts protect planner's context
- Fresh agents per phase - No context accumulation
Results:
- 8 applications migrated successfully
- 51% context used in builder phase (safe margins)
- No context explosions across entire workflow
- Completed in single session (~30 minutes)
Near miss: "We were 14% away from exploding our context" due to autocompact buffer
Lesson: Disable the autocompact buffer. That 22% of headroom matters at scale.
Source: Claude 2.0 transcript
Case Study 3: Codebase Summarization
Pattern: Orchestrator with specialized QA agents
Problem: Summarize a large codebase (frontend + backend) and produce architecture docs
Approach: Divide and conquer with synthesis
Architecture:
Orchestrator Agent
├→ Creates Frontend QA Agent
│ ├─ Summarizes frontend components
│ └─ Outputs: frontend-summary.md
├→ Creates Backend QA Agent
│ ├─ Summarizes backend APIs
│ └─ Outputs: backend-summary.md
└→ Creates Primary QA Agent
├─ Reads both summaries
├─ Synthesizes unified view
└─ Outputs: codebase-overview.md
Orchestrator behavior:
1. Parse user request: "Summarize codebase"
2. Create 3 agents with specialized tasks
3. Command each agent with detailed prompts
4. SLEEP (not observing their work)
5. Wake every 15s to check status
6. Agents complete → Orchestrator wakes
7. Collect results (read produced files)
8. Summarize for user
9. Delete all 3 agents
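A minimal sketch of the sleep-and-wake cycle in steps 4-6 above, assuming a check_status() helper that reads each agent's status from wherever the orchestrator tracks it (status file, database row, etc.); all names are illustrative.
import time

POLL_INTERVAL_SECONDS = 15

def wait_for_agents(agent_ids: list[str], check_status) -> None:
    # Sleeping between polls keeps agent output out of the orchestrator's context;
    # only short status strings are ever read.
    pending = set(agent_ids)
    while pending:
        time.sleep(POLL_INTERVAL_SECONDS)   # the orchestrator is "asleep" here
        pending -= {a for a in pending if check_status(a) == "complete"}

# Usage: wait_for_agents(["frontend-qa", "backend-qa"], check_status=my_status_lookup),
# then spawn the primary QA agent, wait again, collect the produced files, delete all three.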
Prompts from orchestrator:
Frontend QA Agent:
"Analyze all files in src/frontend/. Create markdown summary with:
- Key components and their responsibilities
- State management approach
- Routing structure
- Technology stack
Output to docs/frontend-summary.md"
Backend QA Agent:
"Analyze all files in src/backend/. Create markdown summary with:
- API endpoints and their purposes
- Database schema
- Authentication/authorization
- External integrations
Output to docs/backend-summary.md"
Primary QA Agent:
"Read frontend-summary.md and backend-summary.md. Create unified overview with:
- High-level architecture
- How components interact
- Data flow
- Key technologies
Output to docs/codebase-overview.md"
Observability interface shows:
[Agent 1] Frontend QA
├── Status: Complete ✅
├── Context: 28k tokens used
├── Files consumed: 15 files
├── Files produced: frontend-summary.md
└── Time: 45 seconds
[Agent 2] Backend QA
├── Status: Complete ✅
├── Context: 32k tokens used
├── Files consumed: 12 files
├── Files produced: backend-summary.md
└── Time: 52 seconds
[Agent 3] Primary QA
├── Status: Complete ✅
├── Context: 18k tokens used
├── Files consumed: 2 files (summaries)
├── Files produced: codebase-overview.md
└── Time: 30 seconds
Orchestrator:
├── Context: 12k tokens (commands only, not observing work)
├── Total time: 52 seconds (parallel execution)
└── All agents deleted after completion
Key techniques:
- Parallel frontend/backend - 2x speedup
- Orchestrator sleeps - Protects its context
- Synthesis agent - Combines perspectives
- Deletable agents - Freed after use
Results:
- 3 comprehensive docs created
- Max context per agent: 32k tokens (16%)
- Orchestrator context: 12k tokens (6%)
- Time: 52 seconds (vs. 2+ minutes sequential)
Source: One Agent to Rule Them All transcript
Case Study 4: UI Component Creation
Pattern: Scout-builder two-stage
Problem: Create gray pills for app header information display
Challenge: The codebase has specific conventions; the builder needs to find the exact files and follow existing patterns.
Solution: Scout locates, builder implements
Phase 1: Scout
Scout Agent:
├── Task: "Find header UI component files"
├── Searches for: header, display, pills, info components
├── Identifies patterns: existing pill styles, color conventions
├── Locates exact files:
│ ├── src/components/AppHeader.vue
│ ├── src/styles/pills.css
│ └── src/utils/formatters.ts
└── Outputs: scout-header-report.md with:
├── File locations
├── Line numbers for modifications
├── Existing patterns to follow
└── Recommended approach
Phase 2: Builder
Builder Agent:
├── Reads scout-header-report.md
├── Follows identified patterns
├── Creates gray pill components
├── Applies consistent styling
├── Outputs modified files with exact changes
└── Context: Only 30k tokens (vs. 80k+ without scout)
Orchestrator involvement:
1. User prompts: "Create gray pills for header"
2. Orchestrator creates Scout
3. Orchestrator SLEEPS (checks every 15s)
4. Scout completes → Orchestrator wakes
5. Orchestrator reads scout output
6. Orchestrator creates Builder with detailed instructions
7. Orchestrator SLEEPS again
8. Builder completes → Orchestrator wakes
9. Orchestrator reports results
10. Orchestrator deletes both agents
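A minimal sketch of that two-stage handoff as two headless CLI calls, assuming the `claude` CLI is on PATH; the report file name and prompts are illustrative.
import subprocess
from pathlib import Path

REPORT = Path("scout-header-report.md")

def run_agent(prompt: str) -> None:
    # Each call is a fresh, isolated context; only files are shared between phases.
    subprocess.run(["claude", "-p", prompt], check=True)

def scout_then_build(feature_request: str) -> None:
    run_agent(
        "Scout only, do not modify code. Find the files and conventions needed for: "
        f"{feature_request}. Write file paths, line numbers, and patterns to {REPORT}."
    )
    findings = REPORT.read_text()   # a small report instead of the whole codebase
    run_agent(
        f"Implement: {feature_request}\nFollow these scout findings exactly:\n{findings}"
    )

if __name__ == "__main__":
    scout_then_build("Create gray info pills for the app header")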
Key techniques:
- Scout reduces uncertainty - Builder knows exactly where to work
- Pattern following - Scout identifies conventions
- Orchestrator sleep - Two phases, minimal orchestrator context
- Precise targeting - No wasted reads
Results:
- Scout: 15k tokens, 20 seconds
- Builder: 30k tokens, 35 seconds
- Orchestrator: 8k tokens final
- Total time: 55 seconds
- Feature shipped correctly on first try
Source: One Agent to Rule Them All transcript
Case Study 5: PLAN-BUILD-REVIEW-SHIP Task Board
Pattern: Structured lifecycle with quality gates
Problem: Ensure all changes go through proper review before shipping
Architecture:
Task Board Columns:
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ PLAN │→ │ BUILD │→ │ REVIEW │→ │ SHIP │
└─────────┘ └─────────┘ └─────────┘ └─────────┘
Example task: "Update HTML titles"
Column 1: PLAN
Planner Agent:
├── Analyzes requirement
├── Identifies affected files:
│ ├── index.html
│ └── src/App.tsx (has <title> in render)
├── Creates implementation plan:
│ 1. Update index.html <title>
│ 2. Update App.tsx header component
│ 3. Test both pages load correctly
└── Moves task to BUILD column
Column 2: BUILD
Builder Agent:
├── Reads plan from PLAN column
├── Implements changes:
│ ├── index.html: "Plan Build Review Ship"
│ └── App.tsx: header="Plan Build Review Ship"
├── Runs tests: All passing ✅
└── Moves task to REVIEW column
Column 3: REVIEW
Reviewer Agent:
├── Reads plan and implementation
├── Checks:
│ ├── Plan followed? ✅
│ ├── Tests passing? ✅
│ ├── Code quality? ✅
│ └── No security issues? ✅
├── Approves changes
└── Moves task to SHIP column
Column 4: SHIP
Shipper Agent:
├── Creates git commit
├── Pushes to remote
├── Updates deployment
└── Marks task complete
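The gate ordering across the four columns can be enforced with a tiny state machine. A minimal sketch, using in-memory objects for illustration (the real board tracks state in a database):
from enum import Enum

class Phase(Enum):
    PLAN = 1
    BUILD = 2
    REVIEW = 3
    SHIP = 4

class Task:
    def __init__(self, title: str):
        self.title = title
        self.phase = Phase.PLAN
        self.history: list[str] = []

    def advance(self, agent_id: str) -> Phase:
        # A task moves exactly one column forward; it cannot skip REVIEW to reach SHIP.
        if self.phase is Phase.SHIP:
            raise ValueError(f"{self.title} has already shipped")
        self.history.append(f"{self.phase.name}: {agent_id}")
        self.phase = Phase(self.phase.value + 1)
        return self.phase

# Usage:
# task = Task("Update HTML titles")
# task.advance("planner-001"); task.advance("builder-002")
# task.advance("reviewer-003")   # only now may a shipper move it to SHIP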
Orchestrator's role:
- NOT micromanaging each step
- Responding to user commands like "Move task to next phase"
- Tracking task state in database
- Providing UI showing current phase
- Can intervene if phase fails (e.g., tests fail in BUILD)
UI representation:
Task: Update Titles
├── Status: REVIEW
├── Assigned: reviewer-agent-003
├── History:
│ ├── PLAN: planner-001 (completed 2m ago)
│ ├── BUILD: builder-002 (completed 1m ago)
│ └── REVIEW: reviewer-003 (in progress)
└── Files modified: 2
Key techniques:
- Clear phases - No ambiguity about current state
- Quality gates - Can't skip to SHIP without REVIEW
- Agent specialization - Each agent expert in its phase
- Failure isolation - If BUILD fails, PLAN preserved
Results:
- No untested code ships (the REVIEW gate catches issues)
- Clear audit trail (who did what in which phase)
- Parallel tasks (multiple agents in different columns)
- Single interface (user sees all tasks across all phases)
Source: Custom Agents transcript
Case Study 6: Meta-Agent System
Pattern: Agents building agents
Problem: You need a new specialized agent but don't want to hand-write its configuration
Solution: Meta-agent that builds other agents
Meta-agent prompt:
# meta-agent.md
You are a meta-agent that builds new sub-agents from user descriptions.
When user says "build a new sub-agent":
1. Ask what the agent should do
2. Fetch Claude Code sub-agent documentation
3. Design system prompt for new agent
4. Create agent configuration file
5. Test agent with sample prompts
6. Report usage examples
Output: .claude/agents/<agent-name>.md with complete configuration
Example: Building TTS summary agent
User: "Build agent that summarizes what my code does using text-to-speech"
Meta-agent process:
Step 1: Understand requirements
├── Parse: "summarize code" + "text-to-speech"
├── Infer: Needs code reading + TTS API access
└── Clarify: Voice provider? (user chooses 11Labs)
Step 2: Fetch documentation
├── Reads Claude Code sub-agent docs
├── Reads 11Labs API docs
└── Understands agent configuration format
Step 3: Design system prompt
├── Purpose: Concise code summaries via voice
├── Tools needed: read files, 11Labs TTS
├── Response format: Audio file output
└── Trigger: "use TTS summary"
Step 4: Create configuration
Writes .claude/agents/tts-summary.md:
---
name: tts-summary
description: Concisely summarizes code with text-to-speech. Trigger: "TTS summary"
---
Purpose: Review user's code and provide 1-sentence summary via 11Labs voice
[... full system prompt ...]
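A minimal sketch of how Step 4 might write that configuration file, assuming the name, description, and system prompt have already been drafted; the values shown are illustrative.
from pathlib import Path

def write_agent_config(name: str, description: str, system_prompt: str) -> Path:
    # Emit the frontmatter format shown above: name and description, then the prompt body.
    path = Path(".claude/agents") / f"{name}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"---\nname: {name}\ndescription: {description}\n---\n\n{system_prompt}\n")
    return path

# Usage:
# write_agent_config(
#     "tts-summary",
#     'Concisely summarizes code with text-to-speech. Trigger: "TTS summary"',
#     "Purpose: Review the user's code and provide a 1-sentence summary via 11Labs voice...",
# )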
Step 5: Test
├── Runs test prompt: "TTS summary for hooks.py"
├── Agent reads file, generates summary
├── Outputs audio with summary
└── Validates: Works correctly ✅
Step 6: Report
├── Explains how to use new agent
├── Shows example prompts
└── Notes: Can adjust voice, length, etc.
Result: Fully functional TTS summary agent created from natural language description
Recursion depth:
Level 0: Human user
└→ Level 1: Meta-agent (builds agents)
└→ Level 2: TTS summary agent (built by meta-agent)
└→ Level 3: Sub-agents (if TTS agent spawns any)
Key techniques:
- Documentation fetching - Meta-agent reads official docs
- Template following - Follows agent configuration patterns
- Validation loop - Tests before declaring success
- Recursive creation - Agents can build agents
Challenges:
- Dependency coupling - New agent depends on meta-agent's understanding
- Debugging difficulty - If a generated agent fails, the cause is hard to trace
- Version drift - Meta-agent's docs knowledge may become outdated
Results:
- New agent in ~2 minutes vs. 15+ minutes manually
- Follows best practices automatically
- Tested before delivery
- Documented usage
Source: Sub-Agents transcript
Case Study 7: Observability Dashboard
Pattern: Real-time multi-agent monitoring
Problem: Running 5-10 agents simultaneously. Can't see what they're doing.
Solution: Centralized observability system
Architecture:
┌──────────────────── Multiple Agents ────────────────────┐
│ Agent 1 Agent 2 Agent 3 Agent 4 Agent 5 │
│ ↓ ↓ ↓ ↓ ↓ │
│ pre/post-tool-use hooks │
│ ↓ │
└──────────────────────────────────────────────────────────┘
↓
┌──────────────────── Bun Server ─────────────────────────┐
│ POST /events endpoint │
│ ├→ Store in SQLite (persistence) │
│ └→ Broadcast via WebSocket (real-time) │
└──────────────────────────────────────────────────────────┘
↓
┌──────────────────── Web Client ─────────────────────────┐
│ ┌─────────────────────────────────────────────────┐ │
│ │ Live Activity Pulse (1min/3min/5min windows) │ │
│ ├─────────────────────────────────────────────────┤ │
│ │ Event Stream (filtered by app/session/type) │ │
│ ├─────────────────────────────────────────────────┤ │
│ │ Event Details (with AI-generated summaries) │ │
│ └─────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
Hook implementation:
# .claude/hooks/post-tool-use.py
import sys, json, subprocess

def main():
    event = json.load(sys.stdin)
    # Send to observability server (with AI summary)
    subprocess.run(
        [
            "uv", "run",
            ".claude/hooks/utils/send-event.py",
            "my-codebase",     # App name
            "post-tool-use",   # Event type
            "--summarize",     # Generate Haiku summary
        ],
        input=json.dumps(event),
        text=True,
    )

if __name__ == "__main__":
    main()
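The send-event.py utility referenced above is not shown in the source; here is a minimal sketch of what it might do, assuming the Bun server listens on localhost:4000 (the port is an assumption) and accepts JSON at POST /events.
# .claude/hooks/utils/send-event.py (sketch)
import argparse, json, sys, urllib.request

SERVER_URL = "http://localhost:4000/events"   # assumed address of the Bun server

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("app")                 # e.g. "my-codebase"
    parser.add_argument("event_type")          # e.g. "post-tool-use"
    parser.add_argument("--summarize", action="store_true")
    args = parser.parse_args()

    payload = {
        "app": args.app,
        "event_type": args.event_type,
        "summarize": args.summarize,
        "payload": json.load(sys.stdin),       # the hook event piped in by the hook script
    }
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=5)

if __name__ == "__main__":
    main()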
What user sees:
┌─────────────── Live Activity Pulse ───────────────┐
│ ▂▄▆█▆▄▂▁ Agent A (very active) │
│ ▁▁▂▂▃▃▂▂ Agent B (moderate activity) │
│ ▂▂▂▂▂▂▂▂ Agent C (steady work) │
│ ▁▁▁█▁▁▁▁ Agent D (spike, then quiet) │
└────────────────────────────────────────────────────┘
┌─────────────── Event Stream ──────────────────────┐
│ [Agent A] post-tool-use │
│ Summary: "Wrote authentication logic to user.py"│
│ Time: 2s ago │
├────────────────────────────────────────────────────┤
│ [Agent B] sub-agent-stop │
│ Summary: "Completed documentation scrape" │
│ Time: 5s ago │
├────────────────────────────────────────────────────┤
│ [Agent C] notification │
│ Summary: "Needs approval for rm command" │
│ Time: 8s ago │
└────────────────────────────────────────────────────┘
Filtering:
Filters available:
├── By app (codebase-1, codebase-2, etc.)
├── By agent session ID
├── By event type (pre-tool, post-tool, stop, etc.)
└── By time window (1min, 3min, 5min)
Event summarization:
# Each event summarized by Haiku ($0.0002 per event)
Event: post-tool-use for Write tool
Input: {file: "auth.py", content: "...500 lines..."}
Output: Success
Summary generated:
"Implemented JWT authentication with refresh tokens in auth.py"
Cost: $0.0002
Human value: Instant understanding without reading 500 lines
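A sketch of that summarization call using the anthropic Python SDK, assuming ANTHROPIC_API_KEY is set; the model alias is an assumption and should be swapped for whichever Haiku-class model is current.
import json
from anthropic import Anthropic

client = Anthropic()

def summarize_event(event: dict) -> str:
    # One cheap Haiku call per hook event; returns a single human-readable sentence.
    response = client.messages.create(
        model="claude-3-5-haiku-latest",       # assumed alias
        max_tokens=60,
        messages=[{
            "role": "user",
            "content": "Summarize this agent tool event in one short sentence:\n"
                       + json.dumps(event)[:4000],   # truncate large payloads
        }],
    )
    return response.content[0].text.strip()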
Key techniques:
- One-way data stream - Simple, fast, scalable
- Edge summarization - AI summaries generated at hook time
- Dual storage - SQLite (history) + WebSocket (real-time)
- Color coding - Consistent colors per agent session
Results:
- 5-10 agents monitored simultaneously
- Thousands of events logged (cost: ~$0.20)
- Real-time visibility into all agent work
- Historical analysis via SQLite queries
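A sketch of that historical analysis, assuming the events live in an events.db SQLite file with app, event_type, and timestamp columns (the schema is an assumption for illustration).
import sqlite3

with sqlite3.connect("events.db") as conn:
    rows = conn.execute(
        """
        SELECT app, event_type, COUNT(*) AS n
        FROM events
        WHERE timestamp >= datetime('now', '-1 day')
        GROUP BY app, event_type
        ORDER BY n DESC
        """
    ).fetchall()

for app, event_type, n in rows:
    print(f"{app:20s} {event_type:15s} {n}")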
Business value:
- Catch errors fast (notification events = agent blocked)
- Optimize workflows (which tools used most?)
- Debug issues (what happened before failure?)
- Scale confidence (can observe 10+ agents easily)
Source: Multi-Agent Observability transcript
Case Study 8: AFK Agent Device
Pattern: Autonomous background work while you're away
Problem: Long-running tasks block your terminal. You want to work on something else.
Solution: Dedicated device running agent fleet
Architecture:
Your Device (interactive):
├── Claude Code session
├── Send job to agent device
└── Monitor status updates
Agent Device (autonomous):
├── Picks up job from queue
├── Executes: Scout → Plan → Build → Ship
├── Reports status every 60s
└── Ships results to git
Workflow:
# From your device
/afk-agents \
--prompt "Build 3 OpenAI SDK agents: basic, with-tools, realtime-voice" \
--adw "plan-build-ship" \
--docs "https://openai-agent-sdk.com/docs"
# Job sent to dedicated device
# You continue working on your device
# Background: Agent device executes workflow
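A minimal sketch of the agent-device side: poll a shared job queue, run the workflow headlessly, and write periodic status updates. The queue and status paths, the 60-second cadence, and the use of `claude -p` are all assumptions.
import json, subprocess, time
from pathlib import Path

QUEUE_DIR = Path("queue/pending")
STATUS_FILE = Path("queue/status.json")

def report(message: str) -> None:
    STATUS_FILE.write_text(json.dumps({"status": message, "ts": time.time()}))

def run_job(job_file: Path) -> None:
    job = json.loads(job_file.read_text())
    report(f"Running: {job['prompt'][:60]}...")
    # One long-running headless session executes the plan-build-ship workflow.
    subprocess.run(["claude", "-p", job["prompt"]], check=True)
    report("Complete")
    job_file.unlink()                # remove the finished job from the queue

if __name__ == "__main__":
    while True:
        jobs = sorted(QUEUE_DIR.glob("*.json"))
        if jobs:
            run_job(jobs[0])
        time.sleep(60)               # poll/report cadence matches the 60s status updates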
Agent device execution:
[00:00] Job received: Build 3 SDK agents
[00:05] Planner agent created
[00:45] Plan complete: 3 agents specified
[01:00] Builder agent 1 created (basic agent)
[02:30] Builder agent 1 complete: basic-agent.py ✅
[02:35] Builder agent 2 created (with tools)
[04:15] Builder agent 2 complete: agent-with-tools.py ✅
[04:20] Builder agent 3 created (realtime voice)
[07:45] Builder agent 3 partial: needs audio libraries
[08:00] Builder agent 3 complete: realtime-agent.py ⚠️ (partial)
[08:05] Shipper agent created
[08:20] Git commit created
[08:25] Pushed to remote
[08:30] Job complete ✅
Status updates (every 60s):
Your device shows:
[60s] Status: Planning agents...
[120s] Status: Building agent 1 of 3...
[180s] Status: Building agent 2 of 3...
[240s] Status: Building agent 3 of 3...
[300s] Status: Testing agents...
[360s] Status: Shipping to git...
[420s] Status: Complete ✅
Click to view: results/sdk-agents-20250105/
What you do:
1. Send job (10 seconds)
2. Go AFK (work on something else)
3. Get notified when complete (7 minutes later)
4. Review results
Key techniques:
- Job queue - Agents pick up work from queue
- Async status - Reports back periodically
- Autonomous execution - No human in the loop
- Git integration - Results automatically committed
Results:
- 3 SDK agents built in 7 minutes
- You worked on other things during that time
- Autonomous end-to-end - plan + build + test + ship
- Code review - Quick glance confirms quality
Infrastructure required:
- Dedicated machine (M4 Mac Mini, cloud VM, etc.)
- Agent queue system
- Job scheduler
- Status reporting
Use cases:
- Long-running builds
- Overnight work
- Prototyping experiments
- Documentation generation
- Codebase refactors
Source: Claude 2.0 transcript
Cross-Cutting Patterns
Pattern: Context Window as Resource Constraint
Appears in:
- Case 1: Sub-agent delegation protects primary
- Case 2: Scout-plan-build reduces planner context
- Case 3: Orchestrator sleeps to protect its context
- Case 8: Fresh agents for each phase (no accumulation)
Lesson: Context is precious. Protect it aggressively.
Pattern: Specialized Agents Over General
Appears in:
- Case 3: Frontend/Backend/QA agents vs. one do-everything agent
- Case 4: Scout finds, builder builds (not one agent doing both)
- Case 5: Planner/builder/reviewer/shipper (4 specialists)
- Case 6: Meta-agent only builds, doesn't execute
Lesson: "A focused agent is a performant agent."
Pattern: Observability Enables Scale
Appears in:
- Case 3: Orchestrator tracks agent status
- Case 5: Task board shows current phase
- Case 7: Real-time dashboard for all agents
- Case 8: Status updates every 60s
Lesson: "If you can't measure it, you can't scale it."
Pattern: Deletable Temporary Resources
Appears in:
- Case 3: All 3 agents deleted after completion
- Case 4: Scout and builder deleted
- Case 5: Each phase agent deleted after task moves
- Case 8: Builder agents deleted after shipping
Lesson: "The best agent is a deleted agent."
Performance Comparisons
Single Agent vs. Multi-Agent
| Task | Single Agent | Multi-Agent | Speedup |
|---|---|---|---|
| Load 10 docs | 150k tokens, 5min | 30k primary, 2min | 2.5x faster, 80% less context |
| SDK migration | Fails (overflow) | 80k max/agent, 30min | Completes vs. fails |
| Codebase summary | 120k tokens, 3min | 32k max/agent, 52s | 3.5x faster |
| UI components | 80k tokens, 2min | 30k max, 55s | 2.2x faster |
With vs. Without Orchestration
| Metric | Manual (no orchestrator) | With Orchestrator |
|---|---|---|
| Commands per task | 8-12 manual prompts | 1 prompt to orchestrator |
| Context management | Manual (forget limits) | Automatic (orchestrator sleeps) |
| Error recovery | Start over | Retry failed phase only |
| Observability | Terminal logs | Real-time dashboard |
Common Failure Modes
Failure: Context Explosion
Scenario: Case 2 without scouts
- Single agent reads 100+ files
- Context hits 180k tokens
- Agent slows down, makes mistakes
- Eventually fails or times out
Fix: Add scout phase to filter files first
Failure: Orchestrator Watching Everything
Scenario: Case 3 with observing orchestrator
- Orchestrator watches all agent work
- Orchestrator context grows to 100k+
- Can't coordinate more than 2-3 agents
- System doesn't scale
Fix: Implement orchestrator sleep pattern
Failure: No Observability
Scenario: Case 7 without dashboard
- 5 agents running
- One agent stuck on permission request
- No way to know which agent needs attention
- Entire workflow blocked
Fix: Add hooks + observability system
Failure: Agent Accumulation
Scenario: Case 5 not deleting agents
- 20 tasks completed
- 80 agents still running (4 per task)
- System resources exhausted
- New agents can't start
Fix: Delete agents after task completion
Key Takeaways
- Parallelization = Sub-agents - Nothing else runs agents in parallel
- Context protection = Specialization - Focused agents use less context
- Orchestration = Scale - Single interface manages fleet
- Observability = Confidence - Can't scale what you can't see
- Deletable = Sustainable - Free resources for next task
- Multi-agent is Level 5 - Requires mastering Levels 1-4 first
When to Use Multi-Agent Patterns
Use multi-agent when:
- ✅ Task naturally divides into parallel subtasks
- ✅ Single agent context approaching limits
- ✅ Need quality gates between phases
- ✅ Want to work on other things while agents execute
- ✅ Have observability infrastructure
Don't use multi-agent when:
- ❌ Simple one-off task
- ❌ Learning/prototyping phase
- ❌ No way to monitor agents
- ❌ Task requires tight human-in-loop feedback
Source Attribution
All case studies drawn from field experience documented in 8 source transcripts:
- Elite Context Engineering - Case 1 (AI docs loader)
- Claude 2.0 - Case 2 (SDK migration), Case 8 (AFK device)
- Custom Agents - Case 5 (task board)
- Sub-Agents - Case 6 (meta-agent)
- Multi-Agent Observability - Case 7 (dashboard)
- Hooked - Supporting patterns
- One Agent to Rule Them All - Case 3 (summarization), Case 4 (UI components)
- (Transcript 8 name not specified in context)
Related Documentation
- Orchestrator Pattern - Multi-agent coordination
- Hooks for Observability - Monitoring implementation
- Context Window Protection - Resource management
- Evolution Path - Progression to multi-agent mastery
Remember: These are real systems in production. Start simple, add complexity only when needed.