Initial commit
This commit is contained in:
567
skills/create-subagents/references/context-management.md
Normal file
567
skills/create-subagents/references/context-management.md
Normal file
@@ -0,0 +1,567 @@
|
||||
# Context Management for Subagents
|
||||
|
||||
<core_problem>
|
||||
|
||||
|
||||
"Most agent failures are not model failures, they are context failures."
|
||||
|
||||
<stateless_nature>
|
||||
LLMs are stateless by default. Each invocation starts fresh with no memory of previous interactions.
|
||||
|
||||
**For subagents, this means**:
|
||||
- Long-running tasks lose context between tool calls
|
||||
- Repeated information wastes tokens
|
||||
- Important decisions from earlier in workflow forgotten
|
||||
- Context window fills with redundant information
|
||||
</stateless_nature>
|
||||
|
||||
<context_window_limits>
|
||||
Full conversation history leads to:
|
||||
- Degraded performance (important info buried in noise)
|
||||
- High costs (paying for redundant tokens)
|
||||
- Context limits exceeded (workflow fails)
|
||||
|
||||
**Critical threshold**: When context approaches limit, quality degrades before hard failure.
|
||||
</context_window_limits>
|
||||
</core_problem>
|
||||
|
||||
<memory_architecture>
|
||||
|
||||
|
||||
<short_term_memory>
|
||||
**Short-term memory (STM)**: Last 5-9 interactions.
|
||||
|
||||
**Implementation**: Preserved in context window.
|
||||
|
||||
**Use for**:
|
||||
- Current task state
|
||||
- Recent tool call results
|
||||
- Immediate decisions
|
||||
- Active conversation flow
|
||||
|
||||
**Limitation**: Limited capacity, volatile (lost when context cleared).
|
||||
</short_term_memory>
|
||||
|
||||
<long_term_memory>
|
||||
**Long-term memory (LTM)**: Persistent storage across sessions.
|
||||
|
||||
**Implementation**: External storage (files, databases, vector stores).
|
||||
|
||||
**Use for**:
|
||||
- Historical patterns
|
||||
- Accumulated knowledge
|
||||
- User preferences
|
||||
- Past task outcomes
|
||||
|
||||
**Access pattern**: Retrieve relevant memories into working memory when needed.
|
||||
</long_term_memory>
|
||||
|
||||
<working_memory>
|
||||
**Working memory**: Current context + retrieved memories.
|
||||
|
||||
**Composition**:
|
||||
- Core task information (always present)
|
||||
- Recent interaction history (STM)
|
||||
- Retrieved relevant memories (from LTM)
|
||||
- Current tool outputs
|
||||
|
||||
**Management**: This is what fits in context window. Optimize aggressively.
|
||||
</working_memory>
|
||||
|
||||
<core_memory>
|
||||
**Core memory**: Actively used information in current interaction.
|
||||
|
||||
**Examples**:
|
||||
- Current task goal and constraints
|
||||
- Key facts about the codebase being worked on
|
||||
- Critical requirements from user
|
||||
- Active workflow state
|
||||
|
||||
**Principle**: Keep core memory minimal and highly relevant. Everything else is retrievable.
|
||||
</core_memory>
|
||||
|
||||
<archival_memory>
|
||||
**Archival memory**: Persistent storage for less critical data.
|
||||
|
||||
**Examples**:
|
||||
- Complete conversation transcripts
|
||||
- Full tool output logs
|
||||
- Historical metrics
|
||||
- Deprecated approaches that were tried
|
||||
|
||||
**Access**: Rarely needed, searchable when required, doesn't consume context window.
|
||||
</archival_memory>
|
||||
</memory_architecture>
|
||||
|
||||
<context_strategies>
|
||||
|
||||
|
||||
<summarization>
|
||||
**Pattern**: Move information from context to searchable database, keep summary in memory.
|
||||
|
||||
<when_to_summarize>
|
||||
Trigger summarization when:
|
||||
- Context reaches 75% of limit
|
||||
- Task transitions to new phase
|
||||
- Information is important but no longer actively needed
|
||||
- Repeated information appears multiple times
|
||||
</when_to_summarize>
|
||||
|
||||
<summary_quality>
|
||||
**Quality guidelines**:
|
||||
|
||||
1. **Highlight important events**
|
||||
```markdown
|
||||
Bad: "Reviewed code, found issues, provided fixes"
|
||||
Good: "Identified critical SQL injection in auth.ts:127, provided parameterized query fix. High-priority: requires immediate attention before deployment."
|
||||
```
|
||||
|
||||
2. **Include timing for sequential reasoning**
|
||||
```markdown
|
||||
"First attempt: Direct fix failed due to type mismatch.
|
||||
Second attempt: Added type conversion, introduced runtime error.
|
||||
Final approach: Refactored to use type-safe wrapper (successful)."
|
||||
```
|
||||
|
||||
3. **Structure into categories vs long paragraphs**
|
||||
```markdown
|
||||
Issues found:
|
||||
- Security: SQL injection (Critical), XSS (High)
|
||||
- Performance: N+1 query (Medium)
|
||||
- Code quality: Duplicate logic (Low)
|
||||
|
||||
Actions taken:
|
||||
- Fixed SQL injection with prepared statements
|
||||
- Added input sanitization for XSS
|
||||
- Deferred performance optimization (noted in TODOs)
|
||||
```
|
||||
|
||||
**Benefit**: Organized grouping improves relationship understanding.
|
||||
</summary_quality>
|
||||
|
||||
<example_workflow>
|
||||
```markdown
|
||||
<context_management>
|
||||
When conversation history exceeds 15 turns:
|
||||
1. Identify information that is:
|
||||
- Important (must preserve)
|
||||
- Complete (no longer actively changing)
|
||||
- Historical (not needed for next immediate step)
|
||||
2. Create structured summary with categories
|
||||
3. Store full details in file (archival memory)
|
||||
4. Replace verbose history with concise summary
|
||||
5. Continue with reduced context load
|
||||
</context_management>
|
||||
```
|
||||
</example_workflow>
|
||||
</summarization>
|
||||
|
||||
<sliding_window>
|
||||
**Pattern**: Recent interactions in context, older interactions as vectors for retrieval.
|
||||
|
||||
<implementation>
|
||||
```markdown
|
||||
<sliding_window_strategy>
|
||||
Maintain in context:
|
||||
- Last 5 tool calls and results (short-term memory)
|
||||
- Current task state and goals (core memory)
|
||||
- Key facts from user requirements (core memory)
|
||||
|
||||
Move to vector storage:
|
||||
- Tool calls older than 5 steps
|
||||
- Completed subtask results
|
||||
- Historical debugging attempts
|
||||
- Exploration that didn't lead to solution
|
||||
|
||||
Retrieval trigger:
|
||||
- When current issue similar to past issue
|
||||
- When user references earlier discussion
|
||||
- When pattern matching suggests relevant history
|
||||
</sliding_window_strategy>
|
||||
```
|
||||
|
||||
**Benefit**: Bounded context growth, relevant history still accessible.
|
||||
</implementation>
|
||||
</sliding_window>
|
||||
|
||||
<semantic_context_switching>
|
||||
**Pattern**: Detect context changes, respond appropriately.
|
||||
|
||||
<example>
|
||||
```markdown
|
||||
<context_switch_detection>
|
||||
Monitor for topic changes:
|
||||
- User switches from "fix bug" to "add feature"
|
||||
- Subagent transitions from "analysis" to "implementation"
|
||||
- Task scope changes mid-execution
|
||||
|
||||
On context switch:
|
||||
1. Summarize current context state
|
||||
2. Save state to working memory/file
|
||||
3. Load relevant context for new topic
|
||||
4. Acknowledge switch: "Switching from bug analysis to feature implementation. Bug analysis results saved for later reference."
|
||||
</context_switch_detection>
|
||||
```
|
||||
|
||||
**Prevents**: Mixing contexts, applying wrong constraints, forgetting important info when switching tasks.
|
||||
</example>
|
||||
</semantic_context_switching>
|
||||
|
||||
<scratchpads>
|
||||
**Pattern**: Record intermediate results outside LLM context.
|
||||
|
||||
<use_cases>
|
||||
**When to use scratchpads**:
|
||||
- Complex calculations with many steps
|
||||
- Exploration of multiple approaches
|
||||
- Detailed analysis that may not all be relevant
|
||||
- Debugging traces
|
||||
- Intermediate data transformations
|
||||
|
||||
**Implementation**:
|
||||
```markdown
|
||||
<scratchpad_workflow>
|
||||
For complex debugging:
|
||||
1. Create scratchpad file: `.claude/scratch/debug-session-{timestamp}.md`
|
||||
2. Log each hypothesis and test result in scratchpad
|
||||
3. Keep only current hypothesis and key findings in context
|
||||
4. Reference scratchpad for full debugging history
|
||||
5. Summarize successful approach in final output
|
||||
</scratchpad_workflow>
|
||||
```
|
||||
|
||||
**Benefit**: Context contains insights, scratchpad contains exploration. User gets clean summary, full details available if needed.
|
||||
</use_cases>
|
||||
</scratchpads>
|
||||
|
||||
<smart_memory_management>
|
||||
**Pattern**: Auto-add key data, retrieve on demand.
|
||||
|
||||
<smart_write>
|
||||
```markdown
|
||||
<auto_capture>
|
||||
Automatically save to memory:
|
||||
- User-stated preferences: "I prefer TypeScript over JavaScript"
|
||||
- Project conventions: "This codebase uses Jest for testing"
|
||||
- Critical decisions: "Decided to use OAuth2 for authentication"
|
||||
- Frequent patterns: "API endpoints follow REST naming: /api/v1/{resource}"
|
||||
|
||||
Store in structured format for easy retrieval.
|
||||
</auto_capture>
|
||||
```
|
||||
</smart_write>
|
||||
|
||||
<smart_read>
|
||||
```markdown
|
||||
<auto_retrieval>
|
||||
Automatically retrieve from memory when:
|
||||
- User asks about past decision: "Why did we choose OAuth2?"
|
||||
- Similar task encountered: "Last time we added auth, we used..."
|
||||
- Pattern matching: "This looks like the payment flow issue from last week"
|
||||
|
||||
Inject relevant memories into working context.
|
||||
</auto_retrieval>
|
||||
```
|
||||
</smart_read>
|
||||
</smart_memory_management>
|
||||
|
||||
<compaction>
|
||||
**Pattern**: Summarize near-limit conversations, reinitiate with summary.
|
||||
|
||||
<workflow>
|
||||
```markdown
|
||||
<compaction_workflow>
|
||||
When context reaches 90% capacity:
|
||||
1. Identify essential information:
|
||||
- Current task and status
|
||||
- Key decisions made
|
||||
- Critical constraints
|
||||
- Important discoveries
|
||||
2. Generate concise summary (max 20% of context size)
|
||||
3. Save full context to archival storage
|
||||
4. Create new conversation initialized with summary
|
||||
5. Continue task in fresh context
|
||||
|
||||
Summary format:
|
||||
**Task**: [Current objective]
|
||||
**Status**: [What's been completed, what remains]
|
||||
**Key findings**: [Important discoveries]
|
||||
**Decisions**: [Critical choices made]
|
||||
**Next steps**: [Immediate actions]
|
||||
</compaction_workflow>
|
||||
```
|
||||
|
||||
**When to use**: Long-running tasks, exploratory analysis, iterative debugging.
|
||||
</workflow>
|
||||
</compaction>
|
||||
</context_strategies>
|
||||
|
||||
<framework_support>
|
||||
|
||||
|
||||
<langchain>
|
||||
**LangChain**: Provides automatic memory management.
|
||||
|
||||
**Features**:
|
||||
- Conversation memory buffers
|
||||
- Summary memory
|
||||
- Vector store memory
|
||||
- Entity extraction
|
||||
|
||||
**Use case**: Building subagents that need sophisticated memory without manual implementation.
|
||||
</langchain>
|
||||
|
||||
<llamaindex>
|
||||
**LlamaIndex**: Indexing for longer conversations.
|
||||
|
||||
**Features**:
|
||||
- Semantic search over conversation history
|
||||
- Automatic chunking and indexing
|
||||
- Retrieval augmentation
|
||||
|
||||
**Use case**: Subagents working with large codebases, documentation, or extensive conversation history.
|
||||
</llamaindex>
|
||||
|
||||
<file_based>
|
||||
**File-based memory**: Simple, explicit, debuggable.
|
||||
|
||||
```markdown
|
||||
<memory_structure>
|
||||
.claude/memory/
|
||||
core-facts.md # Essential project information
|
||||
decisions.md # Key decisions and rationale
|
||||
patterns.md # Discovered patterns and conventions
|
||||
{subagent}-state.json # Subagent-specific state
|
||||
</memory_structure>
|
||||
|
||||
<usage>
|
||||
Subagent reads relevant files at start, updates during execution, summarizes at end.
|
||||
</usage>
|
||||
```
|
||||
|
||||
**Benefit**: Transparent, version-controllable, human-readable.
|
||||
</file_based>
|
||||
</framework_support>
|
||||
|
||||
<subagent_patterns>
|
||||
|
||||
|
||||
<stateful_subagent>
|
||||
**For long-running or frequently-invoked subagents**:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: code-architect
|
||||
description: Maintains understanding of system architecture across multiple invocations
|
||||
tools: Read, Write, Grep, Glob
|
||||
model: sonnet
|
||||
---
|
||||
|
||||
<role>
|
||||
You are a system architect maintaining coherent design across project evolution.
|
||||
</role>
|
||||
|
||||
<memory_management>
|
||||
On each invocation:
|
||||
1. Read `.claude/memory/architecture-state.md` for current system state
|
||||
2. Perform assigned task with full context
|
||||
3. Update architecture-state.md with new components, decisions, patterns
|
||||
4. Maintain concise state (max 500 lines), summarize older decisions
|
||||
|
||||
State file structure:
|
||||
- Current architecture (always up-to-date)
|
||||
- Recent changes (last 10 modifications)
|
||||
- Key design decisions (why choices were made)
|
||||
- Active concerns (issues to address)
|
||||
</memory_management>
|
||||
```
|
||||
</stateful_subagent>
|
||||
|
||||
<stateless_subagent>
|
||||
**For simple, focused subagents**:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: syntax-checker
|
||||
description: Validates code syntax without maintaining state
|
||||
tools: Read, Bash
|
||||
model: haiku
|
||||
---
|
||||
|
||||
<role>
|
||||
You are a syntax validator. Check code for syntax errors.
|
||||
</role>
|
||||
|
||||
<workflow>
|
||||
1. Read specified files
|
||||
2. Run syntax checker (language-specific linter)
|
||||
3. Report errors with line numbers
|
||||
4. No memory needed - each invocation is independent
|
||||
</workflow>
|
||||
```
|
||||
|
||||
**When to use stateless**: Single-purpose validators, formatters, simple transformations.
|
||||
</stateless_subagent>
|
||||
|
||||
<context_inheritance>
|
||||
**Inheriting context from main chat**:
|
||||
|
||||
Subagents automatically have access to:
|
||||
- User's original request
|
||||
- Any context provided in invocation
|
||||
|
||||
```markdown
|
||||
Main chat: "Review the authentication changes for security issues.
|
||||
Context: We recently switched from JWT to session-based auth."
|
||||
|
||||
Subagent receives:
|
||||
- Task: Review authentication changes
|
||||
- Context: Recent switch from JWT to session-based auth
|
||||
- This context informs review focus without explicit memory management
|
||||
```
|
||||
</context_inheritance>
|
||||
</subagent_patterns>
|
||||
|
||||
<anti_patterns>
|
||||
|
||||
|
||||
<anti_pattern name="context_dumping">
|
||||
❌ Including everything in context "just in case"
|
||||
|
||||
**Problem**: Buries important information in noise, wastes tokens, degrades performance.
|
||||
|
||||
**Fix**: Include only what's relevant for current task. Everything else is retrievable.
|
||||
</anti_pattern>
|
||||
|
||||
<anti_pattern name="no_summarization">
|
||||
❌ Letting context grow unbounded until limit hit
|
||||
|
||||
**Problem**: Sudden context overflow mid-task, quality degradation before failure.
|
||||
|
||||
**Fix**: Proactive summarization at 75% capacity, continuous compaction.
|
||||
</anti_pattern>
|
||||
|
||||
<anti_pattern name="lossy_summarization">
|
||||
❌ Summaries that discard critical information
|
||||
|
||||
**Example**:
|
||||
```markdown
|
||||
Bad summary: "Tried several approaches, eventually fixed bug"
|
||||
Lost information: What approaches failed, why, what the successful fix was
|
||||
```
|
||||
|
||||
**Fix**: Summaries preserve essential facts, decisions, and rationale. Details go to archival storage.
|
||||
</anti_pattern>
|
||||
|
||||
<anti_pattern name="no_memory_structure">
|
||||
❌ Unstructured memory (long paragraphs, no organization)
|
||||
|
||||
**Problem**: Hard to retrieve relevant information, poor for LLM reasoning.
|
||||
|
||||
**Fix**: Structured memory with categories, bullet points, clear sections.
|
||||
</anti_pattern>
|
||||
|
||||
<anti_pattern name="context_failure_ignorance">
|
||||
❌ Assuming all failures are model limitations
|
||||
|
||||
**Reality**: "Most agent failures are context failures, not model failures."
|
||||
|
||||
Check context quality before blaming model:
|
||||
- Is relevant information present?
|
||||
- Is it organized clearly?
|
||||
- Is important info buried in noise?
|
||||
- Has context been properly maintained?
|
||||
</anti_pattern>
|
||||
</anti_patterns>
|
||||
|
||||
<best_practices>
|
||||
|
||||
|
||||
<principle name="core_memory_minimal">
|
||||
Keep core memory minimal and highly relevant.
|
||||
|
||||
**Rule of thumb**: If information isn't needed for next 3 steps, it doesn't belong in core memory.
|
||||
</principle>
|
||||
|
||||
<principle name="summaries_structured">
|
||||
Summaries should be structured, categorized, and scannable.
|
||||
|
||||
**Template**:
|
||||
```markdown
|
||||
|
||||
**Status**: [Progress]
|
||||
**Completed**:
|
||||
- [Key accomplishment 1]
|
||||
- [Key accomplishment 2]
|
||||
|
||||
**Active**:
|
||||
- [Current work]
|
||||
|
||||
**Decisions**:
|
||||
- [Important choice 1]: [Rationale]
|
||||
- [Important choice 2]: [Rationale]
|
||||
|
||||
**Next**: [Immediate next steps]
|
||||
```
|
||||
</principle>
|
||||
|
||||
<principle name="timing_matters">
|
||||
Include timing for sequential reasoning.
|
||||
|
||||
"First tried X (failed), then tried Y (worked)" is more useful than "Used approach Y".
|
||||
</principle>
|
||||
|
||||
<principle name="retrieval_over_retention">
|
||||
Better to retrieve information on-demand than keep it in context always.
|
||||
|
||||
**Exception**: Frequently-used core facts (task goal, critical constraints).
|
||||
</principle>
|
||||
|
||||
<principle name="external_storage">
|
||||
Use filesystem for:
|
||||
- Full logs and traces
|
||||
- Detailed exploration results
|
||||
- Historical data
|
||||
- Intermediate work products
|
||||
|
||||
Use context for:
|
||||
- Current task state
|
||||
- Key decisions
|
||||
- Active workflow
|
||||
- Immediate next steps
|
||||
</principle>
|
||||
</best_practices>
|
||||
|
||||
<prompt_caching_interaction>
|
||||
|
||||
|
||||
Prompt caching (see [subagents.md](subagents.md#prompt_caching)) works best with stable context.
|
||||
|
||||
<cache_friendly_context>
|
||||
**Structure context for caching**:
|
||||
|
||||
```markdown
|
||||
[CACHEABLE: Stable subagent instructions]
|
||||
<role>...</role>
|
||||
<focus_areas>...</focus_areas>
|
||||
<workflow>...</workflow>
|
||||
---
|
||||
[CACHE BREAKPOINT]
|
||||
---
|
||||
[VARIABLE: Task-specific context]
|
||||
Current task: ...
|
||||
Recent context: ...
|
||||
```
|
||||
|
||||
**Benefit**: Stable instructions cached, task-specific context fresh. 90% cost reduction on cached portion.
|
||||
</cache_friendly_context>
|
||||
|
||||
<cache_invalidation>
|
||||
**When context changes invalidate cache**:
|
||||
- Subagent prompt updated
|
||||
- Core memory structure changed
|
||||
- Context reorganization
|
||||
|
||||
**Mitigation**: Keep stable content (role, workflow, constraints) separate from variable content (current task, recent history).
|
||||
</cache_invalidation>
|
||||
</prompt_caching_interaction>
|
||||
Reference in New Issue
Block a user