skills/multi-agent-composition/patterns/context-management.md
# Context Window Protection

> "200k context window is plenty. You're just stuffing a single agent with too much work. Don't force your agent to context switch."

Context window protection is about managing your agent's most precious resource: attention. A focused agent is a performant agent.

## The Core Problem

**Every engineer hits this wall:**

```text
Agent starts: 10k tokens (5% used)
                  ↓
After exploration: 80k tokens (40% used)
                  ↓
After planning: 120k tokens (60% used)
                  ↓
During implementation: 170k tokens (85% used) ⚠️
                  ↓
Context explodes: 195k tokens (98% used) ❌
                  ↓
Agent performance degrades, fails, or times out
```

**The realization:** More context ≠ better performance. Too much context = cognitive overload.

## The R&D Framework

There are only two ways to manage your context window:

```text
R - REDUCE
└─→ Minimize what enters the context window

D - DELEGATE
└─→ Move work to other agents' context windows
```

**Everything else is a tactic implementing R or D.**

## The Four Levels of Context Protection

### Level 1: Beginner - Reduce Waste

**Focus:** Stop wasting tokens on unused resources

#### Tactic 1: Eliminate Default MCP Servers

**Problem:**

```bash
# Default mcp.json
{
  "mcpServers": {
    "firecrawl": {...},  # 6k tokens
    "github": {...},     # 8k tokens
    "postgres": {...},   # 5k tokens
    "redis": {...}       # 5k tokens
  }
}
# Total: 24k tokens always loaded (12% of 200k window!)
```

**Solution:**

```bash
# Option 1: Delete default mcp.json entirely
rm .claude/mcp.json

# Option 2: Load selectively (only the servers named in the given config)
claude --strict-mcp-config --mcp-config specialized-configs/firecrawl-only.json
# Result: 6k tokens instead of 24k (75% reduction)
```

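The overhead arithmetic can be checked with a few lines of Python. This is a minimal sketch that reuses the per-server figures listed in the example above; `mcp_overhead` is a hypothetical helper for illustration, not part of any tool.

```python
# Token cost per MCP server, copied from the example above (illustrative figures).
MCP_SERVER_TOKENS = {
    "firecrawl": 6_000,
    "github": 8_000,
    "postgres": 5_000,
    "redis": 5_000,
}


def mcp_overhead(enabled: list[str]) -> int:
    """Sum the always-loaded token cost of the enabled MCP servers."""
    return sum(MCP_SERVER_TOKENS[name] for name in enabled)


all_servers = mcp_overhead(list(MCP_SERVER_TOKENS))
firecrawl_only = mcp_overhead(["firecrawl"])
reduction = 1 - firecrawl_only / all_servers

print(f"all servers:    {all_servers:,} tokens")
print(f"firecrawl only: {firecrawl_only:,} tokens ({reduction:.0%} reduction)")
```

Measuring each server's real cost with `/context` and plugging it into the table gives the same comparison for your own setup.
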
#### Tactic 2: Minimize CLAUDE.md

**Before:**

```markdown
# CLAUDE.md (23,000 tokens = 11.5% of window)
- 500 lines of API documentation
- 300 lines of deployment procedures
- 1,500 lines of coding standards
- Architecture diagrams
- Always loaded, whether relevant or not
```

**After:**

```markdown
# CLAUDE.md (500 tokens = 0.25% of window)
# Only universal essentials

- Fenced code blocks MUST have language
- Use rg instead of grep
- ALWAYS use set -euo pipefail
```

**Rule:** Only include what you're 100% sure you want loaded 100% of the time.

#### Tactic 3: Disable Autocompact Buffer

**Problem:**

```bash
/context

# Output:
autocompact buffer: 22% ⚠️ (44k tokens gone!)
messages: 51%
system_tools: 8%
---
Total available: 78% (should be 100%)
```

**Solution:**

```bash
/config
# Set: autocompact = false

# Now:
/context
# Output:
messages: 51%
system_tools: 8%
custom_agents: 2%
---
Total available: 100% ✅ (reclaimed 22%!)
```

**Impact:** Reclaims 40k+ tokens immediately.

### Level 2: Intermediate - Dynamic Loading

**Focus:** Load what you need, when you need it

#### Tactic 4: Context Priming

**Replace static CLAUDE.md with task-specific `/prime` commands**

```markdown
# .claude/commands/prime.md
# General codebase context (2k tokens)
Read README, understand structure, report findings

# .claude/commands/prime-feature.md
# Feature development context (3k tokens)
Read feature requirements, understand dependencies, plan implementation

# .claude/commands/prime-api.md
# API work context (4k tokens)
Read API docs, understand endpoints, review integration patterns
```

**Usage pattern:**

```bash
# Starting feature work
/prime-feature

# vs. having 23k tokens always loaded
```

**Savings:** 20k tokens (87% reduction)

#### Tactic 5: Sub-Agent Delegation

**Problem:** Primary agent doing parallel work fills its own context

```text
Primary Agent tries to do:
├── Web scraping (15k tokens)
├── Documentation fetch (12k tokens)
├── Data analysis (10k tokens)
└── Synthesis (5k tokens)
= 42k tokens in one agent
```

**Solution:** Delegate to sub-agents with isolated contexts

```text
Primary Agent (9k tokens):
├→ Sub-Agent 1: Web scraping (15k tokens, isolated)
├→ Sub-Agent 2: Docs fetch (12k tokens, isolated)
└→ Sub-Agent 3: Analysis (10k tokens, isolated)

Total work: 46k tokens
Primary agent context: Only 9k tokens ✅
```

**Example:**

```bash
/load-ai-docs

# Agent spawns 10 sub-agents for web scraping
# Each scrape: ~3k tokens
# Total work: 30k tokens
# Primary agent context: Still only 9k tokens
# Savings: 21k tokens protected
```

**Key insight:** Sub-agents run on system prompts (not user prompts), which keeps their context isolated from the primary agent's.

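One way to picture the isolation: each sub-agent does its bulky work in its own scope and hands back only a short summary. The sketch below is a toy model under stated assumptions. `run_sub_agent`, `SubAgentResult`, and the 4-characters-per-token estimate are all hypothetical, and no real agent API is called.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class SubAgentResult:
    task: str
    summary: str          # the only text returned to the primary agent
    tokens_consumed: int  # spent inside the sub-agent's own context


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def run_sub_agent(task: str) -> SubAgentResult:
    """Stand-in for a real sub-agent: does bulky work, returns a summary."""
    raw_output = f"raw output for {task}. " * 500  # stays in the sub-agent
    summary = f"{task}: done, see report"          # crosses the boundary
    return SubAgentResult(task, summary, estimate_tokens(raw_output))


tasks = ["web scraping", "docs fetch", "analysis"]
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_sub_agent, tasks))  # map preserves task order

primary_cost = sum(estimate_tokens(r.summary) for r in results)
delegated_cost = sum(r.tokens_consumed for r in results)
print(f"primary context grew by ~{primary_cost} tokens")
print(f"sub-agents absorbed ~{delegated_cost} tokens")
```

The design point is the return type: the primary agent only ever sees `summary`, so its context grows with the number of tasks rather than with the amount of work.
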
### Level 3: Advanced - Multi-Agent Handoff

**Focus:** Chain agents together without context explosion

#### Tactic 6: Context Bundles

**Problem:** Agent 1's context explodes (180k tokens). You need to hand off to a fresh Agent 2 without a full replay.

**Solution:** Bundle the essential 60-70% of the context

```markdown
# context-bundle-2025-01-05-<session-id>.md

## Context Bundle
Created: 2025-01-05 14:30
Source Agent: agent-abc123

## Initial Setup
/prime-feature

## Read Operations (deduplicated)
- src/api/endpoints.ts
- src/components/Auth.tsx
- config/env.ts

## Key Findings
- Auth system uses JWT
- API has 15 endpoints
- Config needs migration

## User Prompts (summarized)
1. "Implement OAuth2 flow"
2. "Add refresh token logic"

[Excluded: full write operations, detailed read contents, tool execution details]
```

**Usage:**

```bash
# Agent 1: Context exploding at 180k
# Automatic bundle saved

# Agent 2: Fresh start (10k base)
/loadbundle /path/to/context-bundle-<timestamp>.md
# Agent 2 now has 70% of Agent 1's context in ~15k tokens

# Total: 25k tokens vs. 180k (86% reduction)
```

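The bundle format above is easy to generate from a session log. The following is a minimal sketch, assuming a hypothetical `SessionLog` record and `build_bundle` function; how a real tool captures reads and prompts (for example via hooks) is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    """Hypothetical record of what an agent did during a session."""
    prime_command: str
    reads: list[str] = field(default_factory=list)
    findings: list[str] = field(default_factory=list)
    prompts: list[str] = field(default_factory=list)


def build_bundle(log: SessionLog) -> str:
    """Render a compact bundle: deduplicated reads, findings, prompts."""
    unique_reads = list(dict.fromkeys(log.reads))  # dedupe, keep first-seen order
    lines = ["## Context Bundle", "", "## Initial Setup", log.prime_command, ""]
    lines += ["## Read Operations (deduplicated)"]
    lines += [f"- {path}" for path in unique_reads]
    lines += ["", "## Key Findings"]
    lines += [f"- {item}" for item in log.findings]
    lines += ["", "## User Prompts (summarized)"]
    lines += [f'{i}. "{p}"' for i, p in enumerate(log.prompts, start=1)]
    return "\n".join(lines) + "\n"


log = SessionLog(
    prime_command="/prime-feature",
    reads=["src/api/endpoints.ts", "config/env.ts", "src/api/endpoints.ts"],
    findings=["Auth system uses JWT"],
    prompts=["Implement OAuth2 flow", "Add refresh token logic"],
)
bundle = build_bundle(log)
print(bundle)
```

Recording file paths rather than file contents is what keeps the bundle small: the fresh agent re-reads only what it actually needs.
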
#### Tactic 7: Composable Workflows (Scout-Plan-Build)

**Problem:** Single agent searching + planning + building = context explosion

```text
Monolithic Agent:
├── Search codebase: 40k tokens
├── Read files: 60k tokens
├── Plan changes: 20k tokens
├── Implement: 30k tokens
├── Test: 15k tokens
└── Total: 165k tokens (83% used!)
```

**Solution:** Break into composable steps that delegate

```text
/scout-plan-build workflow:

Step 1: /scout (delegates to 4 parallel sub-agents)
├→ Sub-agents search codebase: 4 × 15k = 60k total
├→ Output: relevant-files.md (5k tokens)
└→ Primary agent context: unchanged

Step 2: /plan-with-docs
├→ Reads relevant-files.md: 5k tokens
├→ Scrapes docs: 8k tokens
├→ Creates plan: 3k tokens
└→ Total added: 16k tokens

Step 3: /build
├→ Reads plan: 3k tokens
├→ Implements: 30k tokens
└→ Total added: 33k tokens

Final primary agent context: 10k + 16k + 33k = 59k tokens
Savings: 106k tokens (64% reduction)
```

**Why this works:** The scout step offloads searching from the planner (R&D: Reduce + Delegate).

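The savings figure follows directly from the per-step numbers. A short sketch of the arithmetic, using only the token counts quoted in the two diagrams above (variable names are illustrative):

```python
BASE = 10_000  # primary agent's starting context, from the example above

# Tokens each step adds to the primary agent's context.
steps = {
    "scout": 0,                               # work happens in sub-agents
    "plan-with-docs": 5_000 + 8_000 + 3_000,  # relevant files + docs + plan
    "build": 3_000 + 30_000,                  # read plan + implement
}

# Search + read + plan + implement + test, all in one context.
monolithic = 40_000 + 60_000 + 20_000 + 30_000 + 15_000

primary_total = BASE + sum(steps.values())
savings = monolithic - primary_total
print(f"primary context: {primary_total:,} tokens")
print(f"savings: {savings:,} tokens ({savings / monolithic:.0%})")
```
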
### Level 4: Agentic - Out-of-Loop Systems

**Focus:** Agents working autonomously while you're AFK

#### Tactic 8: Focused Agents (One Agent, One Task)

**Anti-pattern:**

```text
Super Agent (trying to do everything):
├── API development
├── UI implementation
├── Database migrations
├── Testing
├── Documentation
├── Deployment
└── Context: 170k tokens (85% used)
```

**Pattern:**

```text
Focused Agent Fleet:
├── Agent 1: API only (30k tokens)
├── Agent 2: UI only (35k tokens)
├── Agent 3: DB only (20k tokens)
├── Agent 4: Tests only (25k tokens)
├── Agent 5: Docs only (15k tokens)
└── Each agent: ≤35k tokens (max 18% per agent)
```

**Principle:** "A focused engineer is a performant engineer. A focused agent is a performant agent."

#### Tactic 9: Deletable Agents

**Pattern:**

```bash
# Create agent for specific task
/create-agent docs-writer "Document frontend components"

# Agent completes task (used 30k tokens)

# DELETE agent immediately
/delete-agent docs-writer

# Result: 30k tokens freed for next agent
```

**Lifecycle:**

```text
1. Create agent → Task-specific context loaded
2. Agent works → Context grows to completion
3. Agent completes → Context maxed out
4. DELETE agent → Context freed
5. Create new agent → Fresh start
6. Repeat
```

**Engineering analogy:** "The best code is no code at all. The best agent is a deleted agent."

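The lifecycle above maps naturally onto a context manager, which guarantees deletion even if the task fails. A minimal sketch, assuming a hypothetical `AgentPool` registry that only tracks token counts; it does not create real agents.

```python
from contextlib import contextmanager


class AgentPool:
    """Hypothetical registry tracking tokens held by live agents."""

    def __init__(self) -> None:
        self.live: dict[str, int] = {}

    @contextmanager
    def agent(self, name: str):
        self.live[name] = 0      # 1. create: fresh context
        try:
            yield self           # 2-3. agent works and completes
        finally:
            del self.live[name]  # 4. delete: context freed, even on error

    def consume(self, name: str, tokens: int) -> None:
        self.live[name] += tokens

    def tokens_in_use(self) -> int:
        return sum(self.live.values())


pool = AgentPool()
with pool.agent("docs-writer"):
    pool.consume("docs-writer", 30_000)
    during = pool.tokens_in_use()
after = pool.tokens_in_use()
print(f"during task: {during:,} tokens, after delete: {after:,} tokens")
```
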
#### Tactic 10: Background Agent Delegation

**Problem:** You're in the loop, waiting for an agent to finish a long task

**Solution:** Delegate to a background agent and continue working

```bash
# In-loop (you wait, your context stays open)
/implement-feature "Build auth system"
# Your terminal blocked for 20 minutes
# Context accumulates: 150k tokens

# Out-of-loop (you continue working)
/background "Build auth system" \
  --model opus \
  --report agents/auth-report.md

# Background agent works independently
# Your terminal freed immediately
# Background agent context isolated
# You get notified when complete
```

**Context protection:**

- Primary agent: 10k tokens (just manages job queue)
- Background agent: 150k tokens (isolated, will be deleted)
- Your interactive session: 10k tokens (protected)

#### Tactic 11: Orchestrator Sleep Pattern

**Problem:** Orchestrator observing all agent work = context explosion

```text
Orchestrator watches everything:
├── Scout 1 work: 15k tokens observed
├── Scout 2 work: 15k tokens observed
├── Scout 3 work: 15k tokens observed
├── Planner work: 25k tokens observed
├── Builder work: 35k tokens observed
└── Orchestrator context: 105k tokens
```

**Solution:** Orchestrator sleeps while agents work

```text
Orchestrator pattern:
1. Create scouts → 3k tokens (commands only)
2. SLEEP (not observing)
3. Wake every 15s, check status → 1k tokens
4. Scouts complete, read outputs → 5k tokens
5. Create planner → 2k tokens
6. SLEEP (not observing)
7. Wake every 15s, check status → 1k tokens
8. Planner completes, read output → 3k tokens
9. Create builder → 2k tokens
10. SLEEP (not observing)

Orchestrator final context: 17k tokens ✅
vs. 105k if watching everything (84% reduction)
```

**Key principle:** Orchestrator wakes to coordinate, sleeps while agents work.

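The sleep-and-poll loop can be sketched as follows. This is an illustration under assumptions: `FakeAgent` and `orchestrate` are hypothetical stand-ins, and the status check is simulated with a poll counter rather than a real agent query.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeAgent:
    """Stand-in agent that finishes after a fixed number of status polls."""
    name: str
    polls_until_done: int
    polls: int = 0

    def is_done(self) -> bool:
        self.polls += 1
        return self.polls >= self.polls_until_done

    def read_summary(self) -> str:
        return f"{self.name}: finished"


def orchestrate(agents: list[FakeAgent], poll_seconds: float = 15.0) -> list[str]:
    """Sleep between cheap status checks; read only final summaries."""
    pending = list(agents)
    summaries: list[str] = []
    while pending:
        time.sleep(poll_seconds)  # SLEEP: nothing is observed
        still_running = []
        for agent in pending:
            if agent.is_done():   # small status check
                summaries.append(agent.read_summary())
            else:
                still_running.append(agent)
        pending = still_running
    return summaries


scouts = [FakeAgent("scout-1", 1), FakeAgent("scout-2", 2), FakeAgent("scout-3", 1)]
print(orchestrate(scouts, poll_seconds=0.01))
```

The orchestrator's context only ever receives the status booleans and the final summaries, which is why its token count stays flat while the agents work.
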
## Monitoring Context Health

### The /context Command

```bash
/context

# Healthy agent (beginner level):
messages: 8%
system_tools: 5%
custom_agents: 2%
---
Total used: 15% ✅ (85% free)

# Warning (intermediate):
messages: 45%
mcp_tools: 18%
system_tools: 5%
---
Total used: 68% ⚠️ (32% free, approaching limits)

# Danger (needs intervention):
messages: 72%
mcp_tools: 24%
system_tools: 5%
---
Total used: 101% ❌ (context overflow!)
```

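The three readings above can be classified mechanically. A minimal sketch: `context_health` is a hypothetical helper, and the 60% and 80% cutoffs are assumptions chosen to fit the sample outputs, not values reported by `/context` itself.

```python
def context_health(used_pct: float) -> str:
    """Classify usage; the cutoffs are assumptions, tune them to taste."""
    if used_pct >= 100:
        return "overflow"
    if used_pct > 80:
        return "danger"
    if used_pct >= 60:
        return "warning"
    return "healthy"


def total_used(breakdown: dict[str, float]) -> float:
    """Add up the per-category percentages from a /context-style breakdown."""
    return sum(breakdown.values())


samples = {
    "healthy": {"messages": 8, "system_tools": 5, "custom_agents": 2},
    "warning": {"messages": 45, "mcp_tools": 18, "system_tools": 5},
    "danger": {"messages": 72, "mcp_tools": 24, "system_tools": 5},
}
for label, breakdown in samples.items():
    used = total_used(breakdown)
    print(f"{label}: {used:.0f}% used -> {context_health(used)}")
```
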
### Success Metrics by Level

| Level | Target Context Free | What This Enables |
|-------|---------------------|-------------------|
| Beginner | 85-90% | Basic tasks without running out |
| Intermediate | 60-75% | Complex tasks with breathing room |
| Advanced | 40-60% | Multi-step workflows without overflow |
| Agentic | Per-agent 60-80% | Fleet of focused agents |

### Warning Signs

**Your context window is in danger when:**

❌ **Single agent exceeds 150k tokens**

- Solution: Split work across multiple agents

❌ **Agent needs to read >20 files**

- Solution: Use scout agents to find relevant subset

❌ **`/context` shows >80% used**

- Solution: Start fresh agent, use context bundles

❌ **Agent gets slower/less accurate**

- Solution: Check context usage, delegate to sub-agents

❌ **Autocompact buffer active**

- Solution: Disable it, reclaim 20%+ tokens

## Context Window Hard Limits

> "Context window is a hard limit. We have to respect this and work around it."

### The Reality

```text
Claude Opus 200k limit:
├── System prompt: ~8k tokens (4%)
├── Available tools: ~5k tokens (2.5%)
├── MCP servers: 0-24k tokens (0-12%)
├── CLAUDE.md: 0-23k tokens (0-11.5%)
├── Custom agents: ~2k tokens (1%)
└── Available for work: 138-185k tokens (69-92.5%)

Best case (optimized): 185k available
Worst case (unoptimized): 138k available
Difference: 47k tokens (23.5% of total capacity!)
```

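The best-case and worst-case figures are a straightforward budget subtraction. A short sketch using the overhead numbers from the breakdown above; the dictionary keys and the `available` helper are illustrative names only.

```python
WINDOW = 200_000

# Overhead that is always present, from the breakdown above.
FIXED = {"system_prompt": 8_000, "tools": 5_000, "custom_agents": 2_000}
# Overhead you control, shown at its unoptimized maximum.
OPTIONAL_MAX = {"mcp_servers": 24_000, "claude_md": 23_000}


def available(optional: dict[str, int]) -> int:
    """Tokens left for actual work after fixed and optional overhead."""
    return WINDOW - sum(FIXED.values()) - sum(optional.values())


best = available({})
worst = available(OPTIONAL_MAX)
difference = best - worst
print(f"best case:  {best:,} tokens")
print(f"worst case: {worst:,} tokens")
print(f"difference: {difference:,} tokens ({difference / WINDOW:.1%} of the window)")
```
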
### Real Example from the Field

> "We were 14% away from exploding our context in our scout-plan-build workflow."

```text
Scout-Plan-Build execution:
├── Base context: 15k tokens
├── Scout work (4 sub-agents): +40k tokens
├── Planner work: +35k tokens
├── Builder work: +80k tokens
└── Total: 170k tokens

With autocompact buffer (22%):
170k / 0.78 = 218k tokens
❌ Exceeds 200k limit by 18k (9% overflow)

Without autocompact buffer:
170k / 1.0 = 170k tokens
✅ Within limits with 30k buffer (15% free)
```

**Lesson:** Every percentage point matters when approaching limits.

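The buffer calculation above divides the workload by the usable fraction of the window. A minimal sketch of that model, assuming the 22% buffer is simply unavailable for work; `required_window` is a hypothetical helper.

```python
WINDOW = 200_000
workflow_tokens = 15_000 + 40_000 + 35_000 + 80_000  # base + scout + plan + build


def required_window(work_tokens: int, reserved_fraction: float) -> float:
    """Window size needed when a fraction of it is reserved (e.g. a buffer)."""
    return work_tokens / (1 - reserved_fraction)


with_buffer = required_window(workflow_tokens, 0.22)
without_buffer = required_window(workflow_tokens, 0.0)

print(f"with buffer:    needs {with_buffer:,.0f} tokens, fits: {with_buffer <= WINDOW}")
print(f"without buffer: needs {without_buffer:,.0f} tokens, fits: {without_buffer <= WINDOW}")
```
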
## Common Context Explosion Patterns

### Pattern 1: The Sponge Agent

**Symptoms:**

- Agent reads entire codebase
- Opens 50+ files
- Context grows 10k tokens every few minutes

**Cause:** No filtering strategy

**Fix:**

```bash
# Before: Agent reads everything
Agent: "Analyzing codebase..."
[reads 100 files = 150k tokens]

# After: Scout first
/scout "Find files related to authentication"
# Scout outputs: 5 relevant files
Agent reads only those 5 files = 8k tokens
```

### Pattern 2: The Accumulator

**Symptoms:**

- Long conversation
- Many tool calls
- Context steadily grows to limit

**Cause:** Not resetting agent between phases

**Fix:**

```bash
# Phase 1: Exploration
[Agent explores, context hits 120k]

# Phase 2: Implementation
# ❌ Bad: Continue same agent (will overflow)
# ✅ Good: New agent with context bundle

/loadbundle context-from-phase-1.md
# Fresh agent (15k) + bundle (20k) = 35k tokens
# Ready for implementation without overflow
```

### Pattern 3: The Observer

**Symptoms:**

- Orchestrator context growing rapidly
- Watching all sub-agent work
- Can't coordinate more than 2-3 agents

**Cause:** Not using sleep pattern

**Fix:**

```python
# ❌ Bad: Orchestrator watches everything
for agent in agents:
    result = orchestrator.watch_agent_work(agent)  # Observes all work
    orchestrator.context += result  # Context explodes

# ✅ Good: Orchestrator sleeps
for agent in agents:
    orchestrator.create_and_command(agent)
    orchestrator.sleep()  # Not observing

orchestrator.wake_and_check_status()  # Only reads summaries
```

## The "200k is Plenty" Principle

> "I'm super excited for larger effective context windows, but 200k context window is plenty. You're just stuffing a single agent with too much work."

**The mindset shift:**

```text
Beginner thinking:
"I need a bigger context window"
"If only I had 500k tokens..."
"My task is too complex for 200k"

Expert thinking:
"I need better context management"
"I'm overloading a single agent"
"I should split this across focused agents"
```

**The truth:** Most context explosions are design problems, not capacity problems.

### Why 200k is Sufficient

**With proper protection:**

```text
Task: Refactor authentication across 50-file codebase

Approach 1 (Single Agent - fails):
├── Agent reads 50 files: 75k tokens
├── Agent plans changes: 20k tokens
├── Agent implements: 80k tokens
├── Agent tests: 30k tokens
└── Total: 205k tokens ❌ (overflow by 5k)

Approach 2 (Multi-Agent - succeeds):
├── Scout finds relevant 10 files: 15k tokens
├── Planner creates strategy: 20k tokens (new agent)
├── Builder 1 (auth logic): 35k tokens (new agent)
├── Builder 2 (UI changes): 30k tokens (new agent)
├── Tester verifies: 25k tokens (new agent)
└── Max per agent: 35k tokens ✅ (all within limits)
```

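The difference between the two approaches is whether the limit is tested against a sum or a maximum. A brief sketch with the figures from the comparison above (variable names are illustrative):

```python
WINDOW = 200_000

# Approach 1: every phase accumulates in a single context.
single_agent_phases = [75_000, 20_000, 80_000, 30_000]
# Approach 2: each phase runs in a fresh agent with its own context.
multi_agent_phases = [15_000, 20_000, 35_000, 30_000, 25_000]

single_peak = sum(single_agent_phases)  # one context holds every phase
multi_peak = max(multi_agent_phases)    # the limit applies per agent

print(f"single agent: {single_peak:,} tokens, fits: {single_peak <= WINDOW}")
print(f"multi-agent:  {multi_peak:,} tokens peak, fits: {multi_peak <= WINDOW}")
```

Splitting the work does not reduce the total tokens spent; it changes the quantity that has to fit inside one window.
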
## Integration with Other Patterns

Context window protection enables:

**Progressive Disclosure:**

- Reduces: Minimal static context
- Enables: Dynamic loading via priming

**Core 4 Management:**

- Protects: Context (pillar #1)
- Enables: Better model/prompt/tools choices

**Orchestration:**

- Requires: Context protection (orchestrator sleep)
- Enables: Fleet management without overflow

**Observability:**

- Monitors: Context usage via hooks
- Prevents: Unnoticed context explosion

## Key Principles

1. **Reduce and Delegate** - The only two strategies that matter

2. **A focused agent is a performant agent** - Single-purpose beats multi-purpose

3. **Agents are deletable** - Free context by removing completed agents

4. **200k is plenty** - Context explosions are design problems

5. **Monitor constantly** - `/context` command is your best friend

6. **Orchestrators must sleep** - Don't observe all agent work

7. **Context bundles over full replay** - 70% of context in 10% of tokens

## Source Attribution

**Primary sources:**

- Elite Context Engineering (R&D framework, 4 levels, all tactics)
- Claude 2.0 (autocompact buffer, hard limits, scout-plan-build)

**Supporting sources:**

- One Agent to Rule Them All (orchestrator sleep, 200k principle, deletable agents)
- Sub-Agents (sub-agent delegation, context isolation)

**Key quotes:**

- "200k context window is plenty. You're just stuffing a single agent with too much work." (One Agent)
- "A focused agent is a performant agent." (Elite Context Engineering)
- "We were 14% away from exploding our context." (Claude 2.0)
- "There are only two ways to manage your context window: R and D." (Elite Context Engineering)

## Related Documentation

- [Progressive Disclosure](../reference/progressive-disclosure.md) - Context loading strategies
- [Orchestrator Pattern](orchestrator-pattern.md) - Fleet management requiring protection
- [Evolution Path](../workflows/evolution-path.md) - Progression through protection levels
- [Core 4 Framework](../reference/core-4-framework.md) - Context as first pillar

---

**Remember:** Context window management separates beginners from experts. Master it, and you can scale infinitely with focused agents.