Initial commit

2025-11-30 09:07:22 +08:00
commit fab98d059b
179 changed files with 46209 additions and 0 deletions
--- a/skills/error-recovery/reference/recovery-patterns.md
+++ b/skills/error-recovery/reference/recovery-patterns.md
@@ -0,0 +1,418 @@
+# Recovery Strategy Patterns
+
+**Version**: 1.0
+**Source**: Bootstrap-003 Error Recovery Methodology
+**Last Updated**: 2025-10-18
+
+This document provides proven recovery patterns for each error category.
+
+---
+
+## Pattern 1: Syntax Error Fix-and-Retry
+
+**Applicable to**: Build/Compilation Errors (Category 1)
+
+**Strategy**: Fix syntax error in source code and rebuild
+
+**Steps**:
+1. **Locate**: Identify file and line from error (`file.go:line:col`)
+2. **Read**: Read the problematic file section
+3. **Fix**: Edit file to correct syntax error
+4. **Verify**: Run `go build` or `go test`
+5. **Retry**: Retry original operation
+
+**Automation**: Semi-automated (detection automatic, fix manual)
+
+**Success Rate**: >90%
+
+**Time to Recovery**: 2-5 minutes
+
+**Example**:
+```
+Error: cmd/root.go:4:2: "fmt" imported and not used
+
+Recovery:
+1. Read cmd/root.go
+2. Edit cmd/root.go - remove line 4: import "fmt"
+3. Bash: go build
+4. Verify: Build succeeds
+```
+
+---
+
+## Pattern 2: Test Fixture Update
+
+**Applicable to**: Test Failures (Category 2)
+
+**Strategy**: Update test fixtures or expectations to match current code
+
+**Steps**:
+1. **Analyze**: Understand test expectation vs code output
+2. **Decide**: Determine if code or test is incorrect
+3. **Update**: Fix code or update test fixture/assertion
+4. **Verify**: Run test again
+5. **Full test**: Run complete test suite
+
+**Automation**: Low (requires human judgment)
+
+**Success Rate**: >85%
+
+**Time to Recovery**: 5-15 minutes
+
+**Example**:
+```
+Error: --- FAIL: TestLoadFixture (0.00s)
+    fixtures_test.go:34: Missing 'sequence' field
+
+Recovery:
+1. Read tests/fixtures/sample-session.jsonl
+2. Identify missing 'sequence' field
+3. Edit fixture to add 'sequence' field
+4. Bash: go test ./internal/testutil -v
+5. Verify: Test passes
+```
+
+---
+
+## Pattern 3: Path Correction ⚠️ AUTOMATABLE
+
+**Applicable to**: File Not Found (Category 3)
+
+**Strategy**: Correct file path or create missing file
+
+**Steps**:
+1. **Verify**: Confirm file doesn't exist (`ls` or `find`)
+2. **Locate**: Search for file with correct name
+3. **Decide**: Path typo vs file not created
+4. **Fix**:
+   - If typo: Correct path
+   - If not created: Create file or reorder workflow
+5. **Retry**: Retry with correct path
+
+**Automation**: High (path validation, fuzzy matching, "did you mean?")
+
+**Success Rate**: >95%
+
+**Time to Recovery**: 1-3 minutes
+
+**Example**:
+```
+Error: No such file: /path/internal/testutil/fixture.go
+
+Recovery:
+1. Bash: ls /path/internal/testutil/
+2. Find: File is fixtures.go (not fixture.go)
+3. Bash: wc -l /path/internal/testutil/fixtures.go
+4. Verify: Success
+```
+
+---
+
+## Pattern 4: Read-Then-Write ⚠️ AUTOMATABLE
+
+**Applicable to**: Write Before Read (Category 5)
+
+**Strategy**: Add Read step before Write, or use Edit
+
+**Steps**:
+1. **Check existence**: Verify file exists
+2. **Decide tool**:
+   - For modifications: Use Edit
+   - For complete rewrite: Read then Write
+3. **Read**: Read existing file content
+4. **Write/Edit**: Perform operation
+5. **Verify**: Confirm desired content
+
+**Automation**: Fully automated (can auto-insert Read step)
+
+**Success Rate**: >98%
+
+**Time to Recovery**: 1-2 minutes
+
+**Example**:
+```
+Error: File has not been read yet.
+
+Recovery:
+1. Bash: ls internal/testutil/fixtures.go
+2. Read internal/testutil/fixtures.go
+3. Edit internal/testutil/fixtures.go
+4. Verify: Updated successfully
+```
+
+---
+
+## Pattern 5: Build-Then-Execute
+
+**Applicable to**: Command Not Found (Category 6)
+
+**Strategy**: Build binary before executing, or add to PATH
+
+**Steps**:
+1. **Identify**: Determine missing command
+2. **Check buildable**: Is this a project binary?
+3. **Build**: Run build command (`make build`)
+4. **Execute**: Use local path or install to PATH
+5. **Verify**: Command executes
+
+**Automation**: Medium (can detect and suggest build)
+
+**Success Rate**: >90%
+
+**Time to Recovery**: 2-5 minutes
+
+**Example**:
+```
+Error: meta-cc: command not found
+
+Recovery:
+1. Bash: ls meta-cc (check if exists)
+2. If not: make build
+3. Bash: ./meta-cc --version
+4. Verify: Command runs
+```
+
+---
+
+## Pattern 6: Pagination for Large Files ⚠️ AUTOMATABLE
+
+**Applicable to**: File Size Exceeded (Category 4)
+
+**Strategy**: Use offset/limit or alternative tools
+
+**Steps**:
+1. **Detect**: File size check before read
+2. **Choose approach**:
+   - **Option A**: Read with offset/limit
+   - **Option B**: Use grep/head/tail
+   - **Option C**: Process in chunks
+3. **Execute**: Apply chosen approach
+4. **Verify**: Obtained needed information
+
+**Automation**: Fully automated (can auto-detect and paginate)
+
+**Success Rate**: 100%
+
+**Time to Recovery**: 1-2 minutes
+
+**Example**:
+```
+Error: File exceeds 25000 tokens
+
+Recovery:
+1. Bash: wc -l large-file.jsonl  # Check size
+2. Read large-file.jsonl offset=0 limit=1000  # Read first 1000 lines
+3. OR: Bash: head -n 1000 large-file.jsonl
+4. Verify: Got needed content
+```
+
+---
+
+## Pattern 7: JSON Schema Fix
+
+**Applicable to**: JSON Parsing Errors (Category 7)
+
+**Strategy**: Fix JSON structure or update schema
+
+**Steps**:
+1. **Validate**: Use `jq` to check JSON validity
+2. **Locate**: Find exact parsing error location
+3. **Analyze**: Determine if JSON or code schema is wrong
+4. **Fix**:
+   - If JSON: Fix structure (commas, braces, types)
+   - If schema: Update Go struct tags/types
+5. **Test**: Verify parsing succeeds
+
+**Automation**: Medium (syntax validation yes, schema fix no)
+
+**Success Rate**: >85%
+
+**Time to Recovery**: 3-8 minutes
+
+**Example**:
+```
+Error: json: cannot unmarshal string into field .count of type int
+
+Recovery:
+1. Read testdata/fixture.json
+2. Find: "count": "42" (string instead of int)
+3. Edit: Change to "count": 42
+4. Bash: go test ./internal/parser
+5. Verify: Test passes
+```
+
+---
+
+## Pattern 8: String Exact Match
+
+**Applicable to**: String Not Found (Edit Errors) (Category 13)
+
+**Strategy**: Re-read file and copy exact string
+
+**Steps**:
+1. **Re-read**: Read file to get current content
+2. **Locate**: Find target section (grep or visual)
+3. **Copy exact**: Copy current string exactly (no retyping)
+4. **Retry Edit**: Use exact old_string
+5. **Verify**: Edit succeeds
+
+**Automation**: High (auto-refresh content before edit)
+
+**Success Rate**: >95%
+
+**Time to Recovery**: 1-3 minutes
+
+**Example**:
+```
+Error: String to replace not found in file
+
+Recovery:
+1. Read internal/parser/parse.go  # Fresh read
+2. Grep: Search for target function
+3. Copy exact string from current file
+4. Edit with exact old_string
+5. Verify: Edit succeeds
+```
+
+---
+
+## Pattern 9: MCP Server Health Check
+
+**Applicable to**: MCP Server Errors (Category 9)
+
+**Strategy**: Check server health, restart if needed
+
+**Steps**:
+1. **Check status**: Verify MCP server is running
+2. **Test connection**: Simple query to test connectivity
+3. **Restart**: If down, restart MCP server
+4. **Optimize query**: If timeout, add pagination/filters
+5. **Retry**: Retry original query
+
+**Automation**: Medium (health checks yes, query optimization no)
+
+**Success Rate**: >80%
+
+**Time to Recovery**: 2-10 minutes
+
+**Example**:
+```
+Error: MCP server connection failed
+
+Recovery:
+1. Bash: ps aux | grep mcp-server
+2. If not running: Restart MCP server
+3. Test: Simple query (e.g., get_session_stats)
+4. If working: Retry original query
+5. Verify: Query succeeds
+```
+
+---
+
+## Pattern 10: Permission Fix
+
+**Applicable to**: Permission Denied (Category 10)
+
+**Strategy**: Change permissions or use appropriate user
+
+**Steps**:
+1. **Check current**: `ls -la` to see permissions
+2. **Identify owner**: `ls -l` shows file owner
+3. **Fix permission**:
+   - Option A: `chmod` to add permissions
+   - Option B: `chown` to change owner
+   - Option C: Use sudo (if appropriate)
+4. **Retry**: Retry original operation
+5. **Verify**: Operation succeeds
+
+**Automation**: Low (security implications)
+
+**Success Rate**: >90%
+
+**Time to Recovery**: 1-3 minutes
+
+**Example**:
+```
+Error: Permission denied: /path/to/file
+
+Recovery:
+1. Bash: ls -la /path/to/file
+2. See: -r--r--r-- (read-only)
+3. Bash: chmod u+w /path/to/file
+4. Retry: Write operation
+5. Verify: Success
+```
+
+---
+
+## Recovery Pattern Selection
+
+### Decision Tree
+
+```
+Error occurs
+├─ Build/compilation? → Pattern 1 (Fix-and-Retry)
+├─ Test failure? → Pattern 2 (Test Fixture Update)
+├─ File not found? → Pattern 3 (Path Correction) ⚠️ AUTOMATE
+├─ File too large? → Pattern 6 (Pagination) ⚠️ AUTOMATE
+├─ Write before read? → Pattern 4 (Read-Then-Write) ⚠️ AUTOMATE
+├─ Command not found? → Pattern 5 (Build-Then-Execute)
+├─ JSON parsing? → Pattern 7 (JSON Schema Fix)
+├─ String not found (Edit)? → Pattern 8 (String Exact Match)
+├─ MCP server? → Pattern 9 (MCP Health Check)
+├─ Permission denied? → Pattern 10 (Permission Fix)
+└─ Other? → Consult taxonomy for category
+```
+
+---
+
+## Automation Priority
+
+**High Priority** (Full automation possible):
+1. Pattern 3: Path Correction (validate-path.sh)
+2. Pattern 4: Read-Then-Write (check-read-before-write.sh)
+3. Pattern 6: Pagination (check-file-size.sh)
+
+**Medium Priority** (Partial automation):
+4. Pattern 5: Build-Then-Execute
+5. Pattern 7: JSON Schema Fix
+6. Pattern 9: MCP Server Health
+
+**Low Priority** (Manual required):
+7. Pattern 1: Syntax Error Fix
+8. Pattern 2: Test Fixture Update
+9. Pattern 10: Permission Fix
+
+---
+
+## Best Practices
+
+### General Recovery Workflow
+
+1. **Classify**: Match error to category (use taxonomy.md)
+2. **Select pattern**: Choose appropriate recovery pattern
+3. **Execute steps**: Follow pattern steps systematically
+4. **Verify**: Confirm recovery successful
+5. **Document**: Note if pattern needs refinement
+
+### Efficiency Tips
+
+- Keep taxonomy.md open for quick classification
+- Use automation tools when available
+- Don't skip verification steps
+- Track recurring errors for prevention
+
+### Common Mistakes
+
+❌ **Don't**: Retry without understanding error
+❌ **Don't**: Skip verification step
+❌ **Don't**: Ignore automation opportunities
+✅ **Do**: Classify error first
+✅ **Do**: Follow pattern steps systematically
+✅ **Do**: Verify recovery completely
+
+---
+
+**Source**: Bootstrap-003 Error Recovery Methodology
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated with 1336 errors