10 KiB
Recovery Strategy Patterns
Version: 1.0 Source: Bootstrap-003 Error Recovery Methodology Last Updated: 2025-10-18
This document provides proven recovery patterns for each error category.
Pattern 1: Syntax Error Fix-and-Retry
Applicable to: Build/Compilation Errors (Category 1)
Strategy: Fix syntax error in source code and rebuild
Steps:
- Locate: Identify file and line from error (
file.go:line:col) - Read: Read the problematic file section
- Fix: Edit file to correct syntax error
- Verify: Run
go buildorgo test - Retry: Retry original operation
Automation: Semi-automated (detection automatic, fix manual)
Success Rate: >90%
Time to Recovery: 2-5 minutes
Example:
Error: cmd/root.go:4:2: "fmt" imported and not used
Recovery:
1. Read cmd/root.go
2. Edit cmd/root.go - remove line 4: import "fmt"
3. Bash: go build
4. Verify: Build succeeds
Pattern 2: Test Fixture Update
Applicable to: Test Failures (Category 2)
Strategy: Update test fixtures or expectations to match current code
Steps:
- Analyze: Understand test expectation vs code output
- Decide: Determine if code or test is incorrect
- Update: Fix code or update test fixture/assertion
- Verify: Run test again
- Full test: Run complete test suite
Automation: Low (requires human judgment)
Success Rate: >85%
Time to Recovery: 5-15 minutes
Example:
Error: --- FAIL: TestLoadFixture (0.00s)
fixtures_test.go:34: Missing 'sequence' field
Recovery:
1. Read tests/fixtures/sample-session.jsonl
2. Identify missing 'sequence' field
3. Edit fixture to add 'sequence' field
4. Bash: go test ./internal/testutil -v
5. Verify: Test passes
Pattern 3: Path Correction ⚠️ AUTOMATABLE
Applicable to: File Not Found (Category 3)
Strategy: Correct file path or create missing file
Steps:
- Verify: Confirm file doesn't exist (
lsorfind) - Locate: Search for file with correct name
- Decide: Path typo vs file not created
- Fix:
- If typo: Correct path
- If not created: Create file or reorder workflow
- Retry: Retry with correct path
Automation: High (path validation, fuzzy matching, "did you mean?")
Success Rate: >95%
Time to Recovery: 1-3 minutes
Example:
Error: No such file: /path/internal/testutil/fixture.go
Recovery:
1. Bash: ls /path/internal/testutil/
2. Find: File is fixtures.go (not fixture.go)
3. Bash: wc -l /path/internal/testutil/fixtures.go
4. Verify: Success
Pattern 4: Read-Then-Write ⚠️ AUTOMATABLE
Applicable to: Write Before Read (Category 5)
Strategy: Add Read step before Write, or use Edit
Steps:
- Check existence: Verify file exists
- Decide tool:
- For modifications: Use Edit
- For complete rewrite: Read then Write
- Read: Read existing file content
- Write/Edit: Perform operation
- Verify: Confirm desired content
Automation: Fully automated (can auto-insert Read step)
Success Rate: >98%
Time to Recovery: 1-2 minutes
Example:
Error: File has not been read yet.
Recovery:
1. Bash: ls internal/testutil/fixtures.go
2. Read internal/testutil/fixtures.go
3. Edit internal/testutil/fixtures.go
4. Verify: Updated successfully
Pattern 5: Build-Then-Execute
Applicable to: Command Not Found (Category 6)
Strategy: Build binary before executing, or add to PATH
Steps:
- Identify: Determine missing command
- Check buildable: Is this a project binary?
- Build: Run build command (
make build) - Execute: Use local path or install to PATH
- Verify: Command executes
Automation: Medium (can detect and suggest build)
Success Rate: >90%
Time to Recovery: 2-5 minutes
Example:
Error: meta-cc: command not found
Recovery:
1. Bash: ls meta-cc (check if exists)
2. If not: make build
3. Bash: ./meta-cc --version
4. Verify: Command runs
Pattern 6: Pagination for Large Files ⚠️ AUTOMATABLE
Applicable to: File Size Exceeded (Category 4)
Strategy: Use offset/limit or alternative tools
Steps:
- Detect: File size check before read
- Choose approach:
- Option A: Read with offset/limit
- Option B: Use grep/head/tail
- Option C: Process in chunks
- Execute: Apply chosen approach
- Verify: Obtained needed information
Automation: Fully automated (can auto-detect and paginate)
Success Rate: 100%
Time to Recovery: 1-2 minutes
Example:
Error: File exceeds 25000 tokens
Recovery:
1. Bash: wc -l large-file.jsonl # Check size
2. Read large-file.jsonl offset=0 limit=1000 # Read first 1000 lines
3. OR: Bash: head -n 1000 large-file.jsonl
4. Verify: Got needed content
Pattern 7: JSON Schema Fix
Applicable to: JSON Parsing Errors (Category 7)
Strategy: Fix JSON structure or update schema
Steps:
- Validate: Use
jqto check JSON validity - Locate: Find exact parsing error location
- Analyze: Determine if JSON or code schema is wrong
- Fix:
- If JSON: Fix structure (commas, braces, types)
- If schema: Update Go struct tags/types
- Test: Verify parsing succeeds
Automation: Medium (syntax validation yes, schema fix no)
Success Rate: >85%
Time to Recovery: 3-8 minutes
Example:
Error: json: cannot unmarshal string into field .count of type int
Recovery:
1. Read testdata/fixture.json
2. Find: "count": "42" (string instead of int)
3. Edit: Change to "count": 42
4. Bash: go test ./internal/parser
5. Verify: Test passes
Pattern 8: String Exact Match
Applicable to: String Not Found (Edit Errors) (Category 13)
Strategy: Re-read file and copy exact string
Steps:
- Re-read: Read file to get current content
- Locate: Find target section (grep or visual)
- Copy exact: Copy current string exactly (no retyping)
- Retry Edit: Use exact old_string
- Verify: Edit succeeds
Automation: High (auto-refresh content before edit)
Success Rate: >95%
Time to Recovery: 1-3 minutes
Example:
Error: String to replace not found in file
Recovery:
1. Read internal/parser/parse.go # Fresh read
2. Grep: Search for target function
3. Copy exact string from current file
4. Edit with exact old_string
5. Verify: Edit succeeds
Pattern 9: MCP Server Health Check
Applicable to: MCP Server Errors (Category 9)
Strategy: Check server health, restart if needed
Steps:
- Check status: Verify MCP server is running
- Test connection: Simple query to test connectivity
- Restart: If down, restart MCP server
- Optimize query: If timeout, add pagination/filters
- Retry: Retry original query
Automation: Medium (health checks yes, query optimization no)
Success Rate: >80%
Time to Recovery: 2-10 minutes
Example:
Error: MCP server connection failed
Recovery:
1. Bash: ps aux | grep mcp-server
2. If not running: Restart MCP server
3. Test: Simple query (e.g., get_session_stats)
4. If working: Retry original query
5. Verify: Query succeeds
Pattern 10: Permission Fix
Applicable to: Permission Denied (Category 10)
Strategy: Change permissions or use appropriate user
Steps:
- Check current:
ls -lato see permissions - Identify owner:
ls -lshows file owner - Fix permission:
- Option A:
chmodto add permissions - Option B:
chownto change owner - Option C: Use sudo (if appropriate)
- Option A:
- Retry: Retry original operation
- Verify: Operation succeeds
Automation: Low (security implications)
Success Rate: >90%
Time to Recovery: 1-3 minutes
Example:
Error: Permission denied: /path/to/file
Recovery:
1. Bash: ls -la /path/to/file
2. See: -r--r--r-- (read-only)
3. Bash: chmod u+w /path/to/file
4. Retry: Write operation
5. Verify: Success
Recovery Pattern Selection
Decision Tree
Error occurs
├─ Build/compilation? → Pattern 1 (Fix-and-Retry)
├─ Test failure? → Pattern 2 (Test Fixture Update)
├─ File not found? → Pattern 3 (Path Correction) ⚠️ AUTOMATE
├─ File too large? → Pattern 6 (Pagination) ⚠️ AUTOMATE
├─ Write before read? → Pattern 4 (Read-Then-Write) ⚠️ AUTOMATE
├─ Command not found? → Pattern 5 (Build-Then-Execute)
├─ JSON parsing? → Pattern 7 (JSON Schema Fix)
├─ String not found (Edit)? → Pattern 8 (String Exact Match)
├─ MCP server? → Pattern 9 (MCP Health Check)
├─ Permission denied? → Pattern 10 (Permission Fix)
└─ Other? → Consult taxonomy for category
Automation Priority
High Priority (Full automation possible):
- Pattern 3: Path Correction (validate-path.sh)
- Pattern 4: Read-Then-Write (check-read-before-write.sh)
- Pattern 6: Pagination (check-file-size.sh)
Medium Priority (Partial automation): 4. Pattern 5: Build-Then-Execute 5. Pattern 7: JSON Schema Fix 6. Pattern 9: MCP Server Health
Low Priority (Manual required): 7. Pattern 1: Syntax Error Fix 8. Pattern 2: Test Fixture Update 9. Pattern 10: Permission Fix
Best Practices
General Recovery Workflow
- Classify: Match error to category (use taxonomy.md)
- Select pattern: Choose appropriate recovery pattern
- Execute steps: Follow pattern steps systematically
- Verify: Confirm recovery successful
- Document: Note if pattern needs refinement
Efficiency Tips
- Keep taxonomy.md open for quick classification
- Use automation tools when available
- Don't skip verification steps
- Track recurring errors for prevention
Common Mistakes
❌ Don't: Retry without understanding error ❌ Don't: Skip verification step ❌ Don't: Ignore automation opportunities ✅ Do: Classify error first ✅ Do: Follow pattern steps systematically ✅ Do: Verify recovery completely
Source: Bootstrap-003 Error Recovery Methodology Framework: BAIME (Bootstrapped AI Methodology Engineering) Status: Production-ready, validated with 1336 errors