Initial commit
skills/error-troubleshooter/SKILL.md
@@ -0,0 +1,221 @@
---
name: error-troubleshooter
description: "Automatically troubleshoot unexpected results OR command/script errors without user request. Triggers when: (1) unexpected behavior - command succeeded but expected effect didn't happen, missing expected errors, wrong output, silent failures; (2) explicit failures - stderr, exceptions, non-zero exit, SDK/API errors. Applies systematic diagnosis using error templates, hypothesis testing, and web research for any Stack Overflow-worthy issue."
---

# Error Troubleshooter

## Overview

This skill enables systematic troubleshooting of unexpected behavior and technical failures - whether explicit errors or silent anomalies where commands succeed but don't produce expected results. Proactively investigate any mismatch between expected and actual outcomes using a structured approach that balances quick fixes with thorough analysis.

## When to Use This Skill

Trigger this skill automatically when encountering either:

### (1) Unexpected Behavior (Priority)

- **Command succeeded but expected effect didn't happen** - e.g., configuration set but not taking effect, file created but empty
- **Missing expected errors** - e.g., test was designed to fail but passed, validation that should reject but accepted
- **Wrong or unexpected output** - e.g., different data than expected, incorrect format, unexpected side effects
- **Silent failures** - no error reported but operation clearly didn't work
- **Behavioral anomalies** - program runs but behaves differently than intended

### (2) Explicit Failures

- **Error messages from SDK/API calls** - exceptions, error codes, failure responses
- **Tool execution failures** - Bash errors, script crashes, MCP tool failures
- **Runtime errors** - exceptions in any programming language
- **Build or compilation failures** - compiler errors, linking failures
- **System errors** - permission denied, file not found, connection refused

**Key principle**: If there's any mismatch between expected and actual behavior - whether explicit error or silent anomaly - this skill applies.

## Troubleshooting Decision Tree

### 1. Initial Assessment

When unexpected behavior or an error occurs, immediately assess the situation:

```
Unexpected Behavior or Error Detected
    ↓
What type of issue is this?
    │
    ├─ UNEXPECTED BEHAVIOR (command succeeded but wrong result)
    │   ↓
    │   Document the mismatch:
    │   - What was expected?
    │   - What actually happened?
    │   - Any error messages? (none expected for unexpected behavior)
    │   ↓
    │   Is the cause obvious? (e.g., wrong variable, typo, wrong file)
    │   ├─ YES → Apply quick fix
    │   │   ↓
    │   │   Did expected behavior occur?
    │   │   ├─ YES → Done ✓
    │   │   └─ NO → Revert, proceed to Rigorous Investigation
    │   │
    │   └─ NO → Proceed directly to Rigorous Investigation
    │
    └─ EXPLICIT ERROR (stderr, exception, non-zero exit)
        ↓
        Is the fix obvious from the error message itself?
        ├─ YES → Apply quick fix (Happy Case Path)
        │   ↓
        │   Did it work?
        │   ├─ YES → Done ✓
        │   └─ NO → Revert changes, proceed to Rigorous Investigation
        │
        └─ NO → Is this a common trivial error?
            ├─ YES → Apply known fix based on experience
            │   ↓
            │   Did it work?
            │   ├─ YES → Done ✓
            │   └─ NO → Revert changes, proceed to Rigorous Investigation
            │
            └─ NO → Proceed directly to Rigorous Investigation
```
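
The top-level branch of the tree can be sketched in Python. This is a minimal illustration, not part of the skill itself: `CommandResult`, `IssueType`, and `classify` are hypothetical names, and `expectation_met` stands for whatever check confirms the expected effect. Following the tree literally, any stderr output counts as an explicit error; tools that write warnings to stderr would likely need a looser rule.

```python
from dataclasses import dataclass
from enum import Enum


class IssueType(Enum):
    NONE = "none"
    UNEXPECTED_BEHAVIOR = "unexpected behavior"
    EXPLICIT_ERROR = "explicit error"


@dataclass
class CommandResult:
    """Hypothetical record of one command run."""
    exit_code: int
    stderr: str
    expectation_met: bool  # result of the caller's own check of the expected effect


def classify(result: CommandResult) -> IssueType:
    """Map a command result onto the top-level branches of the decision tree."""
    # Explicit error: non-zero exit or anything written to stderr.
    if result.exit_code != 0 or result.stderr.strip():
        return IssueType.EXPLICIT_ERROR
    # Command "succeeded" but the expected effect did not happen.
    if not result.expectation_met:
        return IssueType.UNEXPECTED_BEHAVIOR
    return IssueType.NONE
```

A silent failure such as `CommandResult(exit_code=0, stderr="", expectation_met=False)` lands in the unexpected-behavior branch, which is the case the skill asks to prioritize.
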
### 2. Happy Case Path (Quick Resolution)

For issues with obvious causes and fixes:

**Unexpected Behavior Quick Fixes:**
- Obvious typo or wrong variable name
- Wrong file path or target
- Cached data (clear cache and retry)
- Tool not reloaded after changes (restart and retry)

**Explicit Error Quick Fixes:**
- Error message explicitly states the solution (e.g., "Missing required argument --config")
- Common trivial errors (e.g., "command not found" → check installation)
- Direct and unambiguous error descriptions

**Action**: Apply the fix immediately and verify the result.

**Failure criteria for unexpected behavior**: If expected behavior still doesn't occur, revert and switch to rigorous investigation.

**Failure criteria for explicit errors**: If the error message is unchanged OR the problem clearly worsened, revert immediately and switch to rigorous investigation.
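
The apply, verify, revert cycle described above can be sketched as a small helper. `try_quick_fix` and its three callbacks are hypothetical names introduced for illustration; how a fix is applied, checked, and rolled back depends entirely on the situation.

```python
from typing import Callable


def try_quick_fix(
    apply_fix: Callable[[], None],
    verify: Callable[[], bool],
    revert: Callable[[], None],
) -> bool:
    """Apply a quick fix and revert it if verification fails.

    Returns True when the fix held. Returns False when the fix was
    reverted, signalling that rigorous investigation should start.
    """
    apply_fix()
    if verify():
        return True
    # The failure criteria were met: undo the change before trying anything else.
    revert()
    return False
```

Keeping the revert inside the helper means a failed quick fix never leaves stray changes behind when the rigorous path begins.
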
### 3. Rigorous Investigation Path (Complex Problems)

When quick fixes fail or the problem is non-trivial, follow this systematic approach:

#### Step 1: Extract Problem Pattern

**For Explicit Errors:**
SDK/API errors often follow fixed templates. Extract the template by:
- Removing variable components (file paths, user inputs, timestamps, IDs)
- Isolating the core error message structure
- Preparing the template for web search

Example:
```
Original: FileNotFoundError: [Errno 2] No such file or directory: '/home/user/data.csv'
Template: FileNotFoundError No such file or directory
```
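
A rough sketch of template extraction with regular expressions follows. The substitution rules and the name `extract_template` are assumptions chosen for illustration, not the contents of the reference guide. They are deliberately aggressive and would also strip quoted identifiers (such as a module name) that a real search might want to keep.

```python
import re

# Hypothetical rules for variable components; order matters.
_VARIABLE_PARTS = [
    re.compile(r"'[^']*'"),                                          # single-quoted values (paths, inputs)
    re.compile(r'"[^"]*"'),                                          # double-quoted values
    re.compile(r"\[Errno \d+\]"),                                    # errno codes
    re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*"),      # ISO-like timestamps
    re.compile(r"0x[0-9a-fA-F]+"),                                   # memory addresses
    re.compile(r"(?<!\w)/[^\s:]+"),                                  # bare Unix-style paths
    re.compile(r"\b\d+\b"),                                          # remaining numeric IDs
]


def extract_template(message: str) -> str:
    """Strip variable components from an error message for use as a search query."""
    for pattern in _VARIABLE_PARTS:
        message = pattern.sub("", message)
    # Replace leftover separators with spaces, then collapse whitespace.
    message = re.sub(r"[:\[\]]", " ", message)
    return " ".join(message.split())
```

Applied to the original message in the example above, this sketch produces the same template: `FileNotFoundError No such file or directory`.
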

See `{baseDir}/references/error-template-patterns.md` for detailed guidance.

**For Unexpected Behavior:**
Document the behavior pattern:
- What command/operation was performed?
- What was the expected outcome?
- What actually happened instead?
- Are there any observable symptoms (wrong data, missing files, etc.)?
- Has this operation worked before? When did it stop working?

Formulate a search query focusing on the behavior:
- "command X succeeded but didn't create Y"
- "configuration Z not taking effect"
- "expected validation to fail but passed"
#### Step 2: Gather Environment Information

Collect relevant environment details when:
- The pattern search doesn't yield clear solutions
- Environment-specific factors are likely relevant (versions, configurations, system state)
- Context is needed to understand the problem

**IMPORTANT**: Avoid collecting sensitive information. If sensitive data is necessary, explicitly request user authorization first.

See `{baseDir}/references/environment-info-guide.md` for collection guidelines and privacy protection.
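
One way to keep collection narrow is an allow-list combined with redaction. The sketch below is an assumption-laden illustration, not the procedure from the referenced guide: `collect_environment`, `is_sensitive`, the marker list, and the default allow-list are all hypothetical, and substring matching on names is only a coarse safeguard.

```python
import os
import platform
import sys

# Assumed substrings that mark an environment variable name as sensitive.
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL")


def is_sensitive(name: str) -> bool:
    """Return True when a variable name looks like it holds a secret."""
    upper = name.upper()
    return any(marker in upper for marker in SENSITIVE_MARKERS)


def collect_environment(env=None, allowed=("LANG", "SHELL")) -> dict:
    """Collect basic version info plus allow-listed environment variables."""
    env = os.environ if env is None else env
    info = {
        "os": platform.system(),
        "os_release": platform.release(),
        "python": sys.version.split()[0],
    }
    for name in allowed:
        if name in env:
            # Redact even allow-listed names if they look sensitive.
            info[f"env.{name}"] = "<redacted>" if is_sensitive(name) else env[name]
    return info
```

Anything outside the allow-list is never read into the report, so widening the collection is an explicit decision that can be put to the user first.
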
#### Step 3: Research the Problem

Use efficient research strategies:
- **Web Search**: Search the extracted pattern (error template or behavior description)
- **Parallel Investigation**: Use Task tool with subagent_type=Explore for multiple research angles simultaneously
- **Documentation**: Search official docs, GitHub Issues, Stack Overflow for similar problems
- **Counter-evidence**: Look for cases where the expected behavior DID occur to identify what's different

**Token Efficiency**: For complex investigations, delegate research to subagents to avoid context exhaustion.

#### Step 4: Create Debug Notes File

For difficult problems, create a debug notes file to:
- Track theories and test results
- Enable parallel investigation
- Resume from interruptions
- Maintain systematic progress

Use the template in `{baseDir}/assets/debug-notes-template.md` to structure notes.
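
As an illustration of file-based notes, the sketch below appends one entry per tested theory. The entry layout and the name `append_note` are hypothetical and are not taken from `debug-notes-template.md`; the point is only that each result is written to disk as soon as it is known.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_note(notes_path: Path, theory: str, test: str, result: str) -> None:
    """Append one theory/test/result entry to a Markdown notes file."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    entry = (
        f"\n## Theory ({stamp})\n"
        f"- **Theory**: {theory}\n"
        f"- **Test**: {test}\n"
        f"- **Result**: {result}\n"
    )
    # Append mode keeps earlier entries, so an interrupted session can resume.
    with notes_path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
```

Because entries are only ever appended, several investigations can add to the same file without overwriting each other's earlier results.
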
#### Step 5: Formulate and Test Theories

Based on research:
1. List plausible theories explaining the problem (why error occurred OR why expected behavior didn't happen)
2. Order by likelihood
3. Design tests to verify each theory
4. Execute tests systematically
5. Document results in debug notes
6. Iterate until the correct solution is found

**For unexpected behavior**: Focus theories on "why the expected effect didn't occur" rather than "why an error happened".
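
The ordering and testing loop in the numbered steps above might look like the following sketch. `Theory` and `first_confirmed` are hypothetical names, and the likelihood values are subjective estimates rather than computed probabilities.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Theory:
    description: str
    likelihood: float         # subjective estimate in [0, 1]
    test: Callable[[], bool]  # returns True when the evidence supports the theory


def first_confirmed(
    theories: List[Theory], log: Optional[List[str]] = None
) -> Optional[Theory]:
    """Test theories from most to least likely and return the first confirmed one."""
    for theory in sorted(theories, key=lambda t: t.likelihood, reverse=True):
        confirmed = theory.test()
        if log is not None:
            # Record every outcome, including rejections, for the debug notes.
            log.append(f"{theory.description}: {'confirmed' if confirmed else 'rejected'}")
        if confirmed:
            return theory
    return None
```

A return value of `None` means every listed theory was rejected, which is the cue to go back to research and formulate new ones.
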
#### Step 6: Implement Solution

Once the correct theory is identified:
- Apply the fix
- Verify the problem is resolved:
  - **For errors**: Error no longer occurs
  - **For unexpected behavior**: Expected behavior now occurs as intended
- Document the solution in debug notes (if notes were created)
- Consider if this pattern should be added to common patterns

## Token Efficiency Strategies

For complex investigations:

1. **Use Subagents**: Delegate research tasks using the Task tool with subagent_type=Explore
2. **File-Based Notes**: Write debug notes to files instead of maintaining context in memory
3. **Parallel Research**: Launch multiple subagents simultaneously for different research angles
4. **Selective Context**: Only load reference files when specifically needed

## Key Principles

1. **Proactive Investigation**: Don't wait for the user to request troubleshooting; start investigating immediately when unexpected behavior or errors occur
2. **Prioritize Unexpected Behavior**: Check for silent failures and behavioral anomalies first, as they're more subtle than explicit errors
3. **Bold Hypotheses, Careful Verification**: Generate multiple competing theories, then rigorously verify each with concrete evidence (see `{baseDir}/references/problem-solving-mindset.md`)
4. **Challenge Your Own Reasoning**: Actively search for counter-evidence and successful counter-examples that would disprove your theories
5. **Acknowledge Uncertainty**: Present confidence levels; admit when evidence is incomplete rather than pretending certainty
6. **Revert on Failure**: If a fix doesn't work, always revert before trying another approach
7. **Systematic Documentation**: For difficult problems, maintain structured debug notes
8. **Privacy Protection**: Never collect sensitive information without explicit user authorization
9. **Efficient Resource Usage**: Use subagents and files to manage context for complex investigations

## Resources

This skill includes:

### references/
- `problem-solving-mindset.md` - Scientific approach to problem-solving: bold hypotheses, careful verification, and disciplined reasoning
- `systematic-debugging-methodology.md` - Practical debugging framework: Occam's Razor, diagnostic scripts, evidence hierarchy, and real-world examples
- `troubleshooting-sop.md` - Detailed standard operating procedures for systematic troubleshooting
- `error-template-patterns.md` - Guide to identifying and extracting error message templates
- `environment-info-guide.md` - Environment information collection guidelines with privacy protection
- `common-error-patterns.md` - Database of frequently encountered trivial errors and quick fixes

### assets/
- `debug-notes-template.md` - Template for structured debugging documentation

These resources are loaded as needed during the troubleshooting process.