Initial commit
This commit is contained in:
179
skills/debug/SKILL.md
Normal file
179
skills/debug/SKILL.md
Normal file
@@ -0,0 +1,179 @@
|
||||
---
|
||||
name: debug
|
||||
description: Apply systematic debugging methodology using medical differential diagnosis principles. Trigger when AI modifies working code and anomalies occur, or when users report unexpected test results or execution failures. Use observation without preconception, fact isolation, differential diagnosis lists, deductive exclusion, experimental verification, precise fixes, and prevention mechanisms.
|
||||
---
|
||||
|
||||
# Debug
|
||||
|
||||
## Overview
|
||||
|
||||
This skill applies a systematic debugging methodology inspired by medical differential diagnosis. It provides a rigorous 7-step process for investigating and resolving bugs through observation, classification, hypothesis testing, and verification. This approach prioritizes evidence-based reasoning over assumptions, ensuring root causes are identified rather than symptoms treated.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
Activate this skill in two primary scenarios:
|
||||
|
||||
**Scenario A: Post-Modification Anomalies**
|
||||
When modifying a previously tested and working version, and any unexpected behavior emerges after the changes.
|
||||
|
||||
**Scenario B: User-Reported Issues**
|
||||
When users report that test results don't meet expectations or the system fails to execute as intended.
|
||||
|
||||
## Debugging Workflow
|
||||
|
||||
Follow this 7-step systematic approach to diagnose and resolve issues.
|
||||
|
||||
For a detailed checklist of each step, refer to `{baseDir}/references/debugging_checklist.md`. For common bug patterns and their signatures, see `{baseDir}/references/common_patterns.md`.
|
||||
|
||||
### Step 1: Observe Without Preconception (Observe)
|
||||
|
||||
**Objective:** Collect all available evidence without jumping to conclusions.
|
||||
|
||||
**Process:**
|
||||
- Gather all accessible clues: user reports, system logs, dashboards, error stack traces, version changes (git diff), configuration parameters (configs/args/env)
|
||||
- Focus exclusively on facts and observable phenomena
|
||||
- Avoid premature hypotheses or assumptions about causes
|
||||
- Document all observations systematically
|
||||
|
||||
**Key Principle:** Observe, don't just see. At this stage, the goal is comprehensive data collection, not interpretation.
|
||||
|
||||
### Step 2: Classify and Isolate Facts (Classify & Isolate Facts)
|
||||
|
||||
**Objective:** Distinguish symptoms from root causes and narrow the problem scope.
|
||||
|
||||
**Process:**
|
||||
|
||||
**For Incremental Development (Scenario A - Post-Modification Anomalies):**
|
||||
- Confirm the previous step still works (ensure issue is from new changes)
|
||||
- List ALL changes since last working state (git diff, code modifications, config changes)
|
||||
- Identify implicit assumptions in these changes, such as:
|
||||
- API calling conventions ("I assume this API works this way")
|
||||
- Parameter types/order ("I assume this parameter accepts X")
|
||||
- Configuration values ("I assume this env var is set")
|
||||
- Data formats ("I assume the response is JSON")
|
||||
- [And other fundamental assumptions embedded in the changes]
|
||||
- **Apply Occam's Razor**: The simplest explanation is usually correct—prioritize basic assumption errors (typos, wrong parameters, incorrect API usage) over complex failure modes
|
||||
- Verify fundamental assumptions with this priority:
|
||||
1. Check how it was implemented in the last working version (proven to work)
|
||||
2. Consult official documentation for correct usage (may be outdated)
|
||||
3. Only then consider external issues (community-reported bugs, known issues)
|
||||
|
||||
**General Isolation:**
|
||||
- Separate "what is broken" (symptoms) from "why it's broken" (causes)
|
||||
- Systematically narrow down the problem domain by testing:
|
||||
- Does it occur only in specific browsers?
|
||||
- Does it happen on specific operating systems?
|
||||
- Is it time-dependent?
|
||||
- Is it triggered by specific parameter values or input data?
|
||||
- Eliminate all modules/components that function correctly
|
||||
- Isolate the suspicious area
|
||||
|
||||
**Key Principle:** Reduce the search space by eliminating what works correctly.
|
||||
|
||||
### Step 3: Build Differential Diagnosis List (Differential Diagnosis List)
|
||||
|
||||
**Objective:** Enumerate all possible technical failure points.
|
||||
|
||||
**Process:**
|
||||
- Create a comprehensive list of potential failure modes:
|
||||
- Cache errors
|
||||
- Database connection failures
|
||||
- Third-party API outages
|
||||
- Memory leaks
|
||||
- Configuration anomalies
|
||||
- Version compatibility issues
|
||||
- Race conditions
|
||||
- Resource exhaustion
|
||||
- Include even rare or unlikely scenarios
|
||||
- Draw on knowledge base and past experiences
|
||||
- Consider both common and edge cases
|
||||
- Consult `{baseDir}/references/common_patterns.md` for known bug patterns
|
||||
|
||||
**Key Principle:** Cast a wide net initially—don't prematurely exclude possibilities.
|
||||
|
||||
### Step 4: Apply Elimination and Deductive Reasoning (Deduce & Exclude)
|
||||
|
||||
**Objective:** Systematically eliminate impossible factors to find the truth.
|
||||
|
||||
**Process:**
|
||||
- Follow Sherlock Holmes' principle: "When you eliminate the impossible, whatever remains, however improbable, must be the truth"
|
||||
- Design precise tests to validate or invalidate each hypothesis
|
||||
- Use Chain-of-Thought reasoning to document the deductive process
|
||||
- Make reasoning transparent and verifiable
|
||||
- Progressively eliminate factors until a single root cause remains
|
||||
|
||||
**Key Principle:** Evidence-based elimination leads to certainty.
|
||||
|
||||
### Step 5: Experimental Verification and Investigation (Experimental Verification)
|
||||
|
||||
**Objective:** Validate hypotheses through controlled experiments.
|
||||
|
||||
**Process:**
|
||||
- Create restorable checkpoints before making changes
|
||||
- Design and execute targeted experiments to test remaining hypotheses
|
||||
- Research latest versions, known issues, and community discussions (GitHub issues, Stack Overflow)
|
||||
- Conduct focused verification tests
|
||||
- Use experimental evidence to prove each logical step
|
||||
- Iterate until the exact cause is confirmed
|
||||
|
||||
**Key Principle:** Prove hypotheses with experiments, not assumptions.
|
||||
|
||||
### Step 6: Locate and Implement Fix (Locate & Implement Fix)
|
||||
|
||||
**Objective:** Apply the most elegant and least invasive solution.
|
||||
|
||||
**Process:**
|
||||
- Pinpoint the exact code location or configuration causing the issue
|
||||
- Design the fix with minimal side effects
|
||||
- Prioritize elegant solutions over quick patches
|
||||
- Consider long-term maintainability
|
||||
- Implement the fix with precision
|
||||
|
||||
**Key Principle:** Seek elegant solutions, not temporary workarounds.
|
||||
|
||||
### Step 7: Prevention Mechanism (Prevent)
|
||||
|
||||
**Objective:** Ensure the same error doesn't recur and verify stability.
|
||||
|
||||
**Process:**
|
||||
- Verify all related modules remain stable after the fix
|
||||
- Run comprehensive regression tests
|
||||
- Review the entire debugging process
|
||||
- Generalize lessons learned
|
||||
- Document findings in CLAUDE.md or project documentation
|
||||
- Implement safeguards to prevent similar issues
|
||||
|
||||
**Key Principle:** Fix once, prevent forever.
|
||||
|
||||
## Best Practices
|
||||
|
||||
**Maintain Scientific Rigor:**
|
||||
- Bold hypotheses, careful verification
|
||||
- Evidence before assertions
|
||||
- Transparency in reasoning
|
||||
|
||||
**Documentation:**
|
||||
- Track all observations, hypotheses, and test results
|
||||
- Make the investigation reproducible
|
||||
- Document not just the fix, but the reasoning process
|
||||
- Use `{baseDir}/references/investigation_template.md` to structure investigation logs
|
||||
- Use `{baseDir}/assets/debug_report_template.md` for creating post-mortem reports
|
||||
|
||||
**Communication:**
|
||||
- Explain findings clearly to users
|
||||
- Provide context for why the issue occurred
|
||||
- Describe preventive measures implemented
|
||||
|
||||
## Resources
|
||||
|
||||
This skill includes bundled resources to support the debugging workflow:
|
||||
|
||||
### references/
|
||||
Load these into context as needed during investigation:
|
||||
- `{baseDir}/references/debugging_checklist.md` - Comprehensive checklist for each debugging step
|
||||
- `{baseDir}/references/common_patterns.md` - Common bug patterns and their signatures
|
||||
- `{baseDir}/references/investigation_template.md` - Template for documenting investigations
|
||||
|
||||
### assets/
|
||||
Use these templates for documentation and reporting:
|
||||
- `{baseDir}/assets/debug_report_template.md` - Template for summarizing debugging sessions and creating post-mortem reports
|
||||
134
skills/debug/assets/debug_report_template.md
Normal file
134
skills/debug/assets/debug_report_template.md
Normal file
@@ -0,0 +1,134 @@
|
||||
# Debug Report: [Issue Title]
|
||||
|
||||
**Date:** [YYYY-MM-DD]
|
||||
**Investigator:** [Name/AI]
|
||||
**Status:** 🟢 Resolved / 🔴 Unresolved / ⚠️ Workaround Applied
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
[2-3 sentence summary of the issue, root cause, and resolution]
|
||||
|
||||
---
|
||||
|
||||
## Issue Description
|
||||
|
||||
**Reported By:** [User/System]
|
||||
**Initial Report:**
|
||||
> [User's description or error message]
|
||||
|
||||
**Impact:**
|
||||
- **Severity:** Critical / High / Medium / Low
|
||||
- **Users Affected:** [Number or description]
|
||||
- **Systems Affected:** [List of affected components]
|
||||
|
||||
---
|
||||
|
||||
## Root Cause
|
||||
|
||||
**TL;DR:** [One sentence explanation]
|
||||
|
||||
**Detailed Explanation:**
|
||||
[Explain what caused the issue and why it manifested the way it did]
|
||||
|
||||
**Location:**
|
||||
- File: `[path/to/file]:[line]`
|
||||
- Component: [Component name]
|
||||
- Introduced in: [Commit hash or version]
|
||||
|
||||
---
|
||||
|
||||
## Investigation Process
|
||||
|
||||
### Observations
|
||||
- [Key observation 1]
|
||||
- [Key observation 2]
|
||||
- [Key observation 3]
|
||||
|
||||
### Hypotheses Considered
|
||||
1. ❌ [Eliminated hypothesis] - Ruled out because [reason]
|
||||
2. ❌ [Eliminated hypothesis] - Ruled out because [reason]
|
||||
3. ✅ [Confirmed hypothesis] - Confirmed by [evidence]
|
||||
|
||||
### Key Evidence
|
||||
- [Evidence 1 that led to root cause]
|
||||
- [Evidence 2 that confirmed the diagnosis]
|
||||
|
||||
---
|
||||
|
||||
## Resolution
|
||||
|
||||
### Fix Applied
|
||||
|
||||
```diff
|
||||
# File: [filename]
|
||||
- [removed code]
|
||||
+ [added code]
|
||||
```
|
||||
|
||||
**Rationale:** [Why this fix was chosen over alternatives]
|
||||
|
||||
### Verification
|
||||
- ✅ Original issue resolved
|
||||
- ✅ No regression in related functionality
|
||||
- ✅ Test suite passes
|
||||
- ✅ Deployed to production
|
||||
|
||||
---
|
||||
|
||||
## Prevention Measures
|
||||
|
||||
**Immediate Actions:**
|
||||
1. [Action 1 - e.g., Added validation]
|
||||
2. [Action 2 - e.g., Added test coverage]
|
||||
|
||||
**Long-term Improvements:**
|
||||
1. [Improvement 1 - e.g., Refactor error handling]
|
||||
2. [Improvement 2 - e.g., Add monitoring]
|
||||
|
||||
**Tests Added:**
|
||||
```
|
||||
[Description or snippet of regression test]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Timeline
|
||||
|
||||
| Time | Event |
|
||||
|------|-------|
|
||||
| [HH:MM] | Issue reported |
|
||||
| [HH:MM] | Investigation started |
|
||||
| [HH:MM] | Root cause identified |
|
||||
| [HH:MM] | Fix implemented |
|
||||
| [HH:MM] | Fix deployed |
|
||||
| [HH:MM] | Issue resolved |
|
||||
|
||||
**Total Resolution Time:** [Duration]
|
||||
|
||||
---
|
||||
|
||||
## Lessons Learned
|
||||
|
||||
**What Went Well:**
|
||||
- [Positive aspect 1]
|
||||
- [Positive aspect 2]
|
||||
|
||||
**What Could Be Improved:**
|
||||
- [Improvement area 1]
|
||||
- [Improvement area 2]
|
||||
|
||||
**Key Takeaway:**
|
||||
[Main lesson for future reference]
|
||||
|
||||
---
|
||||
|
||||
## Related Issues
|
||||
|
||||
- [Related issue #1]
|
||||
- [Related issue #2]
|
||||
|
||||
---
|
||||
|
||||
**Report Generated:** [YYYY-MM-DD HH:MM]
|
||||
306
skills/debug/references/common_patterns.md
Normal file
306
skills/debug/references/common_patterns.md
Normal file
@@ -0,0 +1,306 @@
|
||||
# Common Bug Patterns and Signatures
|
||||
|
||||
This reference documents frequently encountered bug patterns, their signatures, and diagnostic approaches.
|
||||
|
||||
## Pattern Categories
|
||||
|
||||
### 1. Timing and Concurrency Issues
|
||||
|
||||
#### Race Conditions
|
||||
**Signature:**
|
||||
- Intermittent failures
|
||||
- Works in development but fails in production
|
||||
- Different results with same input
|
||||
- Failures during high load
|
||||
|
||||
**Common Causes:**
|
||||
- Shared mutable state without synchronization
|
||||
- Incorrect thread-safety assumptions
|
||||
- Async operations completing in unexpected order
|
||||
|
||||
**Investigation Approach:**
|
||||
- Add extensive logging with timestamps
|
||||
- Use debugger breakpoints sparingly (changes timing)
|
||||
- Add delays to expose race windows
|
||||
- Review all shared state access patterns
|
||||
|
||||
#### Deadlocks
|
||||
**Signature:**
|
||||
- Application hangs indefinitely
|
||||
- No error messages
|
||||
- High CPU or complete freeze
|
||||
- Multiple threads waiting
|
||||
|
||||
**Common Causes:**
|
||||
- Circular wait for locks
|
||||
- Lock ordering violations
|
||||
- Database transaction deadlocks
|
||||
|
||||
**Investigation Approach:**
|
||||
- Check thread dumps / stack traces
|
||||
- Review lock acquisition order
|
||||
- Use database deadlock detection tools
|
||||
- Add timeout mechanisms
|
||||
|
||||
### 2. Memory Issues
|
||||
|
||||
#### Memory Leaks
|
||||
**Signature:**
|
||||
- Gradually increasing memory usage
|
||||
- Performance degradation over time
|
||||
- Out of memory errors after extended runtime
|
||||
- Works initially, fails after hours/days
|
||||
|
||||
**Common Causes:**
|
||||
- Event listeners not cleaned up
|
||||
- Cache without eviction policy
|
||||
- Circular references preventing garbage collection
|
||||
- Resource handles not closed
|
||||
|
||||
**Investigation Approach:**
|
||||
- Profile memory over time
|
||||
- Take heap dumps at intervals
|
||||
- Compare object counts between snapshots
|
||||
- Check for unclosed resources (files, connections, sockets)
|
||||
|
||||
#### Stack Overflow
|
||||
**Signature:**
|
||||
- Stack overflow error
|
||||
- Deep recursion errors
|
||||
- Crashes at predictable depth
|
||||
|
||||
**Common Causes:**
|
||||
- Unbounded recursion
|
||||
- Missing base case in recursive function
|
||||
- Circular data structure traversal
|
||||
|
||||
**Investigation Approach:**
|
||||
- Check recursion depth
|
||||
- Verify base case conditions
|
||||
- Look for circular references
|
||||
- Consider iterative alternative
|
||||
|
||||
### 3. State Management Issues
|
||||
|
||||
#### Stale Cache
|
||||
**Signature:**
|
||||
- Outdated data displayed
|
||||
- Inconsistency between systems
|
||||
- Works after cache clear
|
||||
- Different results on different servers
|
||||
|
||||
**Common Causes:**
|
||||
- Cache invalidation not triggered
|
||||
- TTL too long
|
||||
- Distributed cache synchronization issues
|
||||
|
||||
**Investigation Approach:**
|
||||
- Check cache invalidation logic
|
||||
- Verify cache key generation
|
||||
- Test with cache disabled
|
||||
- Review cache update patterns
|
||||
|
||||
#### State Corruption
|
||||
**Signature:**
|
||||
- Invalid state transitions
|
||||
- Data inconsistency
|
||||
- Unexpected null values
|
||||
- Objects in impossible states
|
||||
|
||||
**Common Causes:**
|
||||
- Direct state mutation
|
||||
- Missing validation
|
||||
- Incorrect error handling leaving partial updates
|
||||
- Concurrent modifications
|
||||
|
||||
**Investigation Approach:**
|
||||
- Add state validation assertions
|
||||
- Review state mutation points
|
||||
- Check transaction boundaries
|
||||
- Look for error handling gaps
|
||||
|
||||
### 4. Integration Issues
|
||||
|
||||
#### API Failures
|
||||
**Signature:**
|
||||
- Timeout errors
|
||||
- 500/503 errors
|
||||
- Network errors
|
||||
- Rate limiting responses
|
||||
|
||||
**Common Causes:**
|
||||
- Third-party API downtime
|
||||
- Network connectivity issues
|
||||
- Authentication token expiration
|
||||
- Rate limits exceeded
|
||||
|
||||
**Investigation Approach:**
|
||||
- Check API status pages
|
||||
- Verify network connectivity
|
||||
- Review authentication flow
|
||||
- Check rate limit headers
|
||||
- Test with API directly (curl/Postman)
|
||||
|
||||
#### Database Issues
|
||||
**Signature:**
|
||||
- Connection pool exhausted
|
||||
- Slow query performance
|
||||
- Lock wait timeouts
|
||||
- Connection refused errors
|
||||
|
||||
**Common Causes:**
|
||||
- Connection leaks (not closing connections)
|
||||
- Missing indexes causing full table scans
|
||||
- N+1 query problems
|
||||
- Database server overload
|
||||
|
||||
**Investigation Approach:**
|
||||
- Monitor connection pool metrics
|
||||
- Review slow query logs
|
||||
- Check execution plans
|
||||
- Look for repeated queries in loops
|
||||
|
||||
### 5. Configuration Issues
|
||||
|
||||
#### Environment Mismatches
|
||||
**Signature:**
|
||||
- Works locally, fails in production
|
||||
- Different behavior across environments
|
||||
- "It works on my machine"
|
||||
|
||||
**Common Causes:**
|
||||
- Different environment variables
|
||||
- Different dependency versions
|
||||
- Different configuration files
|
||||
- Platform-specific code paths
|
||||
|
||||
**Investigation Approach:**
|
||||
- Compare environment variables
|
||||
- Check dependency versions (package-lock.json, poetry.lock, etc.)
|
||||
- Review configuration for environment-specific values
|
||||
- Check platform-specific code paths
|
||||
|
||||
#### Missing Dependencies
|
||||
**Signature:**
|
||||
- Module not found errors
|
||||
- Import errors
|
||||
- Class/function not defined
|
||||
- Version incompatibility errors
|
||||
|
||||
**Common Causes:**
|
||||
- Missing package in requirements
|
||||
- Outdated dependency versions
|
||||
- Peer dependency conflicts
|
||||
- System library missing
|
||||
|
||||
**Investigation Approach:**
|
||||
- Review dependency manifests
|
||||
- Check installed versions vs required
|
||||
- Look for dependency conflicts
|
||||
- Verify system libraries installed
|
||||
|
||||
### 6. Logic Errors
|
||||
|
||||
#### Off-by-One Errors
|
||||
**Signature:**
|
||||
- Index out of bounds
|
||||
- Missing first or last element
|
||||
- Infinite loops
|
||||
- Incorrect boundary handling
|
||||
|
||||
**Common Causes:**
|
||||
- Using < instead of <=
|
||||
- 0-indexed vs 1-indexed confusion
|
||||
- Incorrect loop conditions
|
||||
|
||||
**Investigation Approach:**
|
||||
- Check boundary conditions
|
||||
- Test with edge cases (empty, single element)
|
||||
- Review loop conditions carefully
|
||||
- Add assertions for expected ranges
|
||||
|
||||
#### Type Coercion Bugs
|
||||
**Signature:**
|
||||
- Unexpected type errors
|
||||
- Comparison behaving unexpectedly
|
||||
- String concatenation instead of addition
|
||||
- Falsy value handling issues
|
||||
|
||||
**Common Causes:**
|
||||
- Implicit type conversion
|
||||
- Loose equality checks (== vs ===)
|
||||
- Type assumptions without validation
|
||||
- Mixed numeric types
|
||||
|
||||
**Investigation Approach:**
|
||||
- Add explicit type checks
|
||||
- Use strict equality
|
||||
- Add type annotations/hints
|
||||
- Check for implicit conversions
|
||||
|
||||
### 7. Error Handling Issues
|
||||
|
||||
#### Swallowed Exceptions
|
||||
**Signature:**
|
||||
- Silent failures
|
||||
- No error messages despite failure
|
||||
- Incomplete operations
|
||||
- Success reported despite failure
|
||||
|
||||
**Common Causes:**
|
||||
- Empty catch blocks
|
||||
- Broad exception catching
|
||||
- Returning default values on error
|
||||
- Not re-raising exceptions
|
||||
|
||||
**Investigation Approach:**
|
||||
- Search for empty catch/except blocks
|
||||
- Review exception handling patterns
|
||||
- Add logging to all error paths
|
||||
- Check for bare except/catch clauses
|
||||
|
||||
#### Error Propagation Failures
|
||||
**Signature:**
|
||||
- Low-level errors exposed to users
|
||||
- Unclear error messages
|
||||
- Generic "Something went wrong"
|
||||
- Stack traces in user interface
|
||||
|
||||
**Common Causes:**
|
||||
- No error translation layer
|
||||
- Missing error boundaries
|
||||
- Not catching specific exceptions
|
||||
- No user-friendly error messages
|
||||
|
||||
**Investigation Approach:**
|
||||
- Review error handling architecture
|
||||
- Check error message clarity
|
||||
- Verify error boundaries exist
|
||||
- Test error scenarios
|
||||
|
||||
## Pattern Recognition Strategies
|
||||
|
||||
### Look for These Red Flags:
|
||||
|
||||
1. **Time-based behavior**: If adding delays changes behavior, suspect timing issues
|
||||
2. **Load-based failures**: If failures increase with load, suspect resource exhaustion or race conditions
|
||||
3. **Environment-specific**: If only fails in certain environments, suspect configuration differences
|
||||
4. **Gradual degradation**: If performance worsens over time, suspect memory leaks or resource leaks
|
||||
5. **Intermittent failures**: If behavior is non-deterministic, suspect concurrency issues or external dependencies
|
||||
|
||||
### Diagnostic Quick Checks:
|
||||
|
||||
1. **Can you reproduce it consistently?** No → Likely timing/concurrency issue
|
||||
2. **Does it fail immediately?** Yes → Likely configuration or initialization issue
|
||||
3. **Does it fail after some time?** Yes → Likely resource leak or state corruption
|
||||
4. **Does it fail with specific input?** Yes → Likely validation or edge case handling issue
|
||||
5. **Does it fail only in production?** Yes → Likely environment or load-related issue
|
||||
|
||||
## Using This Reference
|
||||
|
||||
When encountering a bug:
|
||||
1. Match the signature to patterns above
|
||||
2. Review common causes for that pattern
|
||||
3. Follow the investigation approach
|
||||
4. Apply lessons from similar past issues
|
||||
5. Update this document if you discover new patterns
|
||||
176
skills/debug/references/debugging_checklist.md
Normal file
176
skills/debug/references/debugging_checklist.md
Normal file
@@ -0,0 +1,176 @@
|
||||
# Debugging Checklist
|
||||
|
||||
This checklist provides detailed action items for each step of the debugging workflow.
|
||||
|
||||
## Step 1: Observe Without Preconception ✓
|
||||
|
||||
**Evidence Collection:**
|
||||
- [ ] Review user's bug report or issue description
|
||||
- [ ] Examine error messages and stack traces
|
||||
- [ ] Check application logs (stderr, stdout, application-specific logs)
|
||||
- [ ] Review monitoring dashboards (if available)
|
||||
- [ ] Inspect recent code changes (`git diff`, `git log`)
|
||||
- [ ] Document current environment (OS, versions, dependencies)
|
||||
- [ ] Capture configuration files (config files, environment variables, CLI arguments)
|
||||
- [ ] Screenshot or record the error if visual
|
||||
- [ ] Note exact steps to reproduce
|
||||
|
||||
**Documentation:**
|
||||
- [ ] Create investigation log file
|
||||
- [ ] Record timestamp and initial observations
|
||||
- [ ] List all data sources consulted
|
||||
|
||||
## Step 2: Classify and Isolate Facts ✓
|
||||
|
||||
**Symptom Analysis:**
|
||||
- [ ] List all observable symptoms
|
||||
- [ ] Distinguish symptoms from potential causes
|
||||
- [ ] Identify what changed recently (code, config, dependencies, infrastructure)
|
||||
|
||||
**Scope Narrowing:**
|
||||
- [ ] Test across different environments (dev, staging, production)
|
||||
- [ ] Test across different platforms (Windows, Linux, macOS)
|
||||
- [ ] Test across different browsers (if web application)
|
||||
- [ ] Test with different input data
|
||||
- [ ] Test with different configurations
|
||||
- [ ] Identify minimal reproduction case
|
||||
- [ ] Test with previous working version (regression testing)
|
||||
|
||||
**Component Isolation:**
|
||||
- [ ] List all involved components/modules
|
||||
- [ ] Mark components known to work correctly
|
||||
- [ ] Highlight suspicious components
|
||||
- [ ] Draw dependency diagram if complex
|
||||
|
||||
## Step 3: Build Differential Diagnosis List ✓
|
||||
|
||||
**Infrastructure Issues:**
|
||||
- [ ] Network connectivity problems
|
||||
- [ ] DNS resolution failures
|
||||
- [ ] Load balancer misconfiguration
|
||||
- [ ] Firewall/security group blocking
|
||||
- [ ] Resource exhaustion (CPU, memory, disk)
|
||||
|
||||
**Application Issues:**
|
||||
- [ ] Cache staleness or corruption
|
||||
- [ ] Database connection pool exhaustion
|
||||
- [ ] Database deadlocks or slow queries
|
||||
- [ ] Third-party API failures or timeouts
|
||||
- [ ] Memory leaks
|
||||
- [ ] Race conditions or threading issues
|
||||
- [ ] Incorrect error handling
|
||||
- [ ] Invalid input validation
|
||||
|
||||
**Configuration Issues:**
|
||||
- [ ] Environment variable mismatch
|
||||
- [ ] Configuration file errors
|
||||
- [ ] Version incompatibility
|
||||
- [ ] Missing dependencies
|
||||
- [ ] Permission problems
|
||||
|
||||
**Code Issues:**
|
||||
- [ ] Logic errors in recent changes
|
||||
- [ ] Null pointer/undefined errors
|
||||
- [ ] Type mismatches
|
||||
- [ ] Off-by-one errors
|
||||
- [ ] Incorrect assumptions
|
||||
|
||||
## Step 4: Apply Elimination and Deductive Reasoning ✓
|
||||
|
||||
**Hypothesis Testing:**
|
||||
- [ ] Rank hypotheses by likelihood
|
||||
- [ ] Design test for most likely hypothesis
|
||||
- [ ] Execute test and document result
|
||||
- [ ] If hypothesis invalidated, mark as eliminated
|
||||
- [ ] If hypothesis confirmed, design further verification
|
||||
- [ ] Move to next hypothesis if needed
|
||||
|
||||
**Reasoning Documentation:**
|
||||
- [ ] Document "If X, then Y" statements
|
||||
- [ ] Record why each hypothesis was eliminated
|
||||
- [ ] Note which tests ruled out which possibilities
|
||||
- [ ] Maintain chain of reasoning for review
|
||||
|
||||
**Narrowing Down:**
|
||||
- [ ] Eliminate external factors first (network, APIs)
|
||||
- [ ] Then infrastructure (resources, configuration)
|
||||
- [ ] Then application-level issues (cache, database)
|
||||
- [ ] Finally code-level issues (logic, types)
|
||||
|
||||
## Step 5: Experimental Verification ✓
|
||||
|
||||
**Preparation:**
|
||||
- [ ] Create git branch for experiments
|
||||
- [ ] Backup current state (checkpoint)
|
||||
- [ ] Document experiment plan
|
||||
|
||||
**Experimentation:**
|
||||
- [ ] Add logging/instrumentation to suspected area
|
||||
- [ ] Add debug breakpoints if using debugger
|
||||
- [ ] Create controlled test case
|
||||
- [ ] Run experiment and capture output
|
||||
- [ ] Compare actual vs expected behavior
|
||||
|
||||
**Research:**
|
||||
- [ ] Search GitHub issues for similar problems
|
||||
- [ ] Check Stack Overflow for related questions
|
||||
- [ ] Review official documentation for edge cases
|
||||
- [ ] Check release notes for known issues
|
||||
- [ ] Consult language/framework changelog
|
||||
|
||||
**Validation:**
|
||||
- [ ] Can the issue be reproduced consistently?
|
||||
- [ ] Does the evidence match the hypothesis?
|
||||
- [ ] Are there alternative explanations?
|
||||
|
||||
## Step 6: Locate and Implement Fix ✓
|
||||
|
||||
**Root Cause Confirmation:**
|
||||
- [ ] Identify exact file and line number
|
||||
- [ ] Understand why the code fails
|
||||
- [ ] Confirm this is root cause, not symptom
|
||||
|
||||
**Solution Design:**
|
||||
- [ ] Consider multiple fix approaches
|
||||
- [ ] Evaluate side effects of each approach
|
||||
- [ ] Choose most elegant and maintainable solution
|
||||
- [ ] Ensure fix doesn't introduce new issues
|
||||
|
||||
**Implementation:**
|
||||
- [ ] Implement the fix
|
||||
- [ ] Add comments explaining the fix
|
||||
- [ ] Update related documentation
|
||||
- [ ] Add test case to prevent regression
|
||||
|
||||
**Verification:**
|
||||
- [ ] Test the fix resolves original issue
|
||||
- [ ] Run existing test suite
|
||||
- [ ] Test edge cases
|
||||
- [ ] Verify no new issues introduced
|
||||
|
||||
## Step 7: Prevention Mechanism ✓
|
||||
|
||||
**Stability Verification:**
|
||||
- [ ] Run full test suite
|
||||
- [ ] Perform integration testing
|
||||
- [ ] Test in staging environment
|
||||
- [ ] Monitor for unexpected behavior
|
||||
|
||||
**Documentation:**
|
||||
- [ ] Update CLAUDE.md or project docs
|
||||
- [ ] Document root cause
|
||||
- [ ] Document fix and reasoning
|
||||
- [ ] Add to knowledge base
|
||||
|
||||
**Prevention Measures:**
|
||||
- [ ] Add automated test for this scenario
|
||||
- [ ] Add validation/assertions to prevent recurrence
|
||||
- [ ] Update error messages for clarity
|
||||
- [ ] Add monitoring/alerting if applicable
|
||||
- [ ] Share learnings with team
|
||||
|
||||
**Post-Mortem:**
|
||||
- [ ] Review what went well
|
||||
- [ ] Identify what could improve
|
||||
- [ ] Update debugging procedures if needed
|
||||
- [ ] Celebrate the fix! 🎉
|
||||
292
skills/debug/references/investigation_template.md
Normal file
292
skills/debug/references/investigation_template.md
Normal file
@@ -0,0 +1,292 @@
|
||||
# Bug Investigation Log Template
|
||||
|
||||
Use this template to document debugging sessions systematically. Copy and adapt as needed.
|
||||
|
||||
---
|
||||
|
||||
## Investigation Metadata
|
||||
|
||||
**Issue ID/Reference:** [e.g., #123, TICKET-456]
|
||||
**Date Started:** [YYYY-MM-DD HH:MM]
|
||||
**Investigator:** [Name or AI assistant]
|
||||
**Priority:** [Critical / High / Medium / Low]
|
||||
**Status:** [🔴 Investigating / 🟡 In Progress / 🟢 Resolved]
|
||||
|
||||
---
|
||||
|
||||
## Step 1: Initial Observations
|
||||
|
||||
**User Report:**
|
||||
```
|
||||
[Paste user's bug report or description here]
|
||||
```
|
||||
|
||||
**Reproduction Steps:**
|
||||
1. [Step 1]
|
||||
2. [Step 2]
|
||||
3. [Step 3]
|
||||
|
||||
**Expected Behavior:**
|
||||
[What should happen]
|
||||
|
||||
**Actual Behavior:**
|
||||
[What actually happens]
|
||||
|
||||
**Environment:**
|
||||
- OS: [e.g., Ubuntu 22.04, Windows 11, macOS 14]
|
||||
- Application Version: [e.g., v2.3.1]
|
||||
- Runtime: [e.g., Node.js 18.16, Python 3.11]
|
||||
- Browser: [if applicable]
|
||||
|
||||
**Evidence Collected:**
|
||||
|
||||
*Error Messages:*
|
||||
```
|
||||
[Paste error messages, stack traces]
|
||||
```
|
||||
|
||||
*Logs:*
|
||||
```
|
||||
[Relevant log entries]
|
||||
```
|
||||
|
||||
*Configuration:*
|
||||
```
|
||||
[Relevant config values, environment variables]
|
||||
```
|
||||
|
||||
*Recent Changes:*
|
||||
- [Commit hash / PR / change description]
|
||||
- [git diff summary if relevant]
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Fact Classification
|
||||
|
||||
**Confirmed Symptoms (Observable Facts):**
|
||||
1. [Symptom 1]
|
||||
2. [Symptom 2]
|
||||
3. [Symptom 3]
|
||||
|
||||
**Scope Analysis:**
|
||||
|
||||
| Test | Result | Notes |
|
||||
|------|--------|-------|
|
||||
| Different environments (dev/staging/prod) | ✓/✗ | |
|
||||
| Different platforms (Win/Mac/Linux) | ✓/✗ | |
|
||||
| Different browsers | ✓/✗ | |
|
||||
| Different input data | ✓/✗ | |
|
||||
| Previous version | ✓/✗ | |
|
||||
|
||||
**Isolated Components:**
|
||||
- ✅ Working correctly: [Component A, Component B]
|
||||
- ❌ Suspected issues: [Component C, Component D]
|
||||
- ❓ Uncertain: [Component E]
|
||||
|
||||
**What Changed Recently:**
|
||||
- [Change 1 - date, description]
|
||||
- [Change 2 - date, description]
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Differential Diagnosis List
|
||||
|
||||
**Hypotheses (Ranked by Likelihood):**
|
||||
|
||||
### Hypothesis 1: [Name of hypothesis]
|
||||
**Likelihood:** High / Medium / Low
|
||||
**Category:** [Infrastructure / Application / Configuration / Code]
|
||||
**Reasoning:** [Why this is suspected]
|
||||
|
||||
### Hypothesis 2: [Name of hypothesis]
|
||||
**Likelihood:** High / Medium / Low
|
||||
**Category:** [Infrastructure / Application / Configuration / Code]
|
||||
**Reasoning:** [Why this is suspected]
|
||||
|
||||
### Hypothesis 3: [Name of hypothesis]
|
||||
**Likelihood:** High / Medium / Low
|
||||
**Category:** [Infrastructure / Application / Configuration / Code]
|
||||
**Reasoning:** [Why this is suspected]
|
||||
|
||||
[Add more as needed]
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Elimination and Deductive Reasoning
|
||||
|
||||
### Test 1: [Hypothesis being tested]
|
||||
**Test Design:** [How to test this hypothesis]
|
||||
**Expected Result if Hypothesis True:** [What you expect to see]
|
||||
**Actual Result:** [What you observed]
|
||||
**Conclusion:** ✅ Confirmed / ❌ Eliminated / ⚠️ Inconclusive
|
||||
**Reasoning:**
|
||||
```
|
||||
If [condition], then [expected behavior]
|
||||
We observed [actual behavior]
|
||||
Therefore [conclusion]
|
||||
```
|
||||
|
||||
### Test 2: [Hypothesis being tested]
|
||||
**Test Design:** [How to test this hypothesis]
|
||||
**Expected Result if Hypothesis True:** [What you expect to see]
|
||||
**Actual Result:** [What you observed]
|
||||
**Conclusion:** ✅ Confirmed / ❌ Eliminated / ⚠️ Inconclusive
|
||||
**Reasoning:**
|
||||
```
|
||||
[Chain of reasoning]
|
||||
```
|
||||
|
||||
[Continue for each test]
|
||||
|
||||
**Hypotheses Remaining:** [List hypotheses not yet eliminated]
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Experimental Verification
|
||||
|
||||
**Checkpoint Created:** [git branch name, commit hash, or backup location]
|
||||
|
||||
### Experiment 1: [Description]
|
||||
**Goal:** [What this experiment aims to prove/disprove]
|
||||
**Method:**
|
||||
```bash
|
||||
# Commands or code used
|
||||
```
|
||||
**Results:**
|
||||
```
|
||||
[Output or findings]
|
||||
```
|
||||
**Conclusion:** [What this proves]
|
||||
|
||||
### Experiment 2: [Description]
|
||||
**Goal:** [What this experiment aims to prove/disprove]
|
||||
**Method:**
|
||||
```bash
|
||||
# Commands or code used
|
||||
```
|
||||
**Results:**
|
||||
```
|
||||
[Output or findings]
|
||||
```
|
||||
**Conclusion:** [What this proves]
|
||||
|
||||
**Research Conducted:**
|
||||
- [ ] GitHub issues searched: [keywords used]
|
||||
- [ ] Stack Overflow checked: [relevant Q&As]
|
||||
- [ ] Documentation reviewed: [sections consulted]
|
||||
- [ ] Release notes: [findings]
|
||||
|
||||
**Findings:**
|
||||
[Summary of research findings]
|
||||
|
||||
**Root Cause Identified:** ✅ Yes / ❌ No
|
||||
|
||||
---
|
||||
|
||||
## Step 6: Root Cause and Fix
|
||||
|
||||
**Root Cause:**
|
||||
[Precise description of what's causing the issue]
|
||||
|
||||
**Location:**
|
||||
- File: [path/to/file.ext]
|
||||
- Line(s): [line number(s)]
|
||||
- Function/Method: [name]
|
||||
|
||||
**Why This Causes the Issue:**
|
||||
[Explanation of the causal mechanism]
|
||||
|
||||
**Fix Approaches Considered:**
|
||||
|
||||
| Approach | Pros | Cons | Selected |
|
||||
|----------|------|------|----------|
|
||||
| [Approach 1] | [pros] | [cons] | ✅/❌ |
|
||||
| [Approach 2] | [pros] | [cons] | ✅/❌ |
|
||||
| [Approach 3] | [pros] | [cons] | ✅/❌ |
|
||||
|
||||
**Selected Fix:**
|
||||
```diff
|
||||
[Show code diff or configuration change]
|
||||
```
|
||||
|
||||
**Rationale:** [Why this fix was chosen]
|
||||
|
||||
**Implementation Notes:**
|
||||
[Any important details about the fix]
|
||||
|
||||
**Verification:**
|
||||
- [ ] Original issue resolved
|
||||
- [ ] No new issues introduced
|
||||
- [ ] Test suite passes
|
||||
- [ ] Edge cases tested
|
||||
|
||||
---
|
||||
|
||||
## Step 7: Prevention and Documentation
|
||||
|
||||
**Regression Test Added:**
|
||||
```
|
||||
[Test code or test case description]
|
||||
```
|
||||
|
||||
**Documentation Updated:**
|
||||
- [ ] CLAUDE.md updated
|
||||
- [ ] Code comments added
|
||||
- [ ] API documentation updated
|
||||
- [ ] README updated (if needed)
|
||||
|
||||
**Prevention Measures Implemented:**
|
||||
1. [Measure 1 - e.g., added validation]
|
||||
2. [Measure 2 - e.g., improved error handling]
|
||||
3. [Measure 3 - e.g., added monitoring]
|
||||
|
||||
**Lessons Learned:**
|
||||
1. [Lesson 1]
|
||||
2. [Lesson 2]
|
||||
3. [Lesson 3]
|
||||
|
||||
**Knowledge Base Update:**
|
||||
- Pattern: [If this represents a new pattern to document]
|
||||
- Category: [What category of bug this was]
|
||||
- Key Insight: [Main takeaway for future debugging]
|
||||
|
||||
---
|
||||
|
||||
## Timeline Summary
|
||||
|
||||
| Time | Activity | Result |
|
||||
|------|----------|--------|
|
||||
| [HH:MM] | Investigation started | |
|
||||
| [HH:MM] | Initial observations completed | |
|
||||
| [HH:MM] | Hypothesis list created | |
|
||||
| [HH:MM] | Testing began | |
|
||||
| [HH:MM] | Root cause identified | |
|
||||
| [HH:MM] | Fix implemented | |
|
||||
| [HH:MM] | Verification completed | |
|
||||
| [HH:MM] | Issue resolved | |
|
||||
|
||||
**Total Time:** [Duration]
|
||||
|
||||
---
|
||||
|
||||
## Status Update for Stakeholders
|
||||
|
||||
**Summary for Non-Technical Audience:**
|
||||
[1-2 sentence explanation of what went wrong and how it was fixed]
|
||||
|
||||
**Impact:**
|
||||
- Users affected: [number or description]
|
||||
- Duration: [how long the issue existed]
|
||||
- Severity: [impact level]
|
||||
|
||||
**Resolution:**
|
||||
[Brief description of the fix]
|
||||
|
||||
**Follow-up Actions:**
|
||||
- [ ] [Action 1]
|
||||
- [ ] [Action 2]
|
||||
|
||||
---
|
||||
|
||||
**Investigation Completed:** [YYYY-MM-DD HH:MM]
|
||||
**Final Status:** 🟢 Resolved / 🔴 Unresolved / ⚠️ Workaround Applied
|
||||
33
skills/debug/repackage.py
Normal file
33
skills/debug/repackage.py
Normal file
@@ -0,0 +1,33 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Repackage this skill into a distributable zip file.
|
||||
|
||||
Usage:
|
||||
cd debug
|
||||
python repackage.py
|
||||
|
||||
Output: ../debug.zip
|
||||
"""
|
||||
import zipfile
|
||||
from pathlib import Path
|
||||
|
||||
# Paths relative to this script
|
||||
script_dir = Path(__file__).parent
|
||||
skill_name = script_dir.name
|
||||
zip_path = script_dir.parent / f'{skill_name}.zip'
|
||||
|
||||
# Remove old zip if exists
|
||||
if zip_path.exists():
|
||||
zip_path.unlink()
|
||||
print(f"Removed old: {zip_path.name}")
|
||||
|
||||
print(f"Packaging skill: {skill_name}\n")
|
||||
|
||||
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
|
||||
for file_path in script_dir.rglob('*'):
|
||||
if file_path.is_file() and file_path.name != 'repackage.py': # Don't include this script
|
||||
arcname = file_path.relative_to(script_dir.parent)
|
||||
zf.write(file_path, arcname)
|
||||
print(f" Added: {arcname}")
|
||||
|
||||
print(f"\n✅ Successfully packaged to: {zip_path.absolute()}")
|
||||
221
skills/error-troubleshooter/SKILL.md
Normal file
221
skills/error-troubleshooter/SKILL.md
Normal file
@@ -0,0 +1,221 @@
|
||||
---
|
||||
name: error-troubleshooter
|
||||
description: Automatically troubleshoot unexpected results OR command/script errors without user request. Triggers when: (1) unexpected behavior - command succeeded but expected effect didn't happen, missing expected errors, wrong output, silent failures; (2) explicit failures - stderr, exceptions, non-zero exit, SDK/API errors. Applies systematic diagnosis using error templates, hypothesis testing, and web research for any Stack Overflow-worthy issue.
|
||||
---
|
||||
|
||||
# Error Troubleshooter
|
||||
|
||||
## Overview
|
||||
|
||||
This skill enables systematic troubleshooting of unexpected behavior and technical failures - whether explicit errors or silent anomalies where commands succeed but don't produce expected results. Proactively investigate any mismatch between expected and actual outcomes using a structured approach that balances quick fixes with thorough analysis.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
Trigger this skill automatically when encountering either:
|
||||
|
||||
### (1) Unexpected Behavior (Priority)
|
||||
- **Command succeeded but expected effect didn't happen** - e.g., configuration set but not taking effect, file created but empty
|
||||
- **Missing expected errors** - e.g., test was designed to fail but passed, validation that should reject but accepted
|
||||
- **Wrong or unexpected output** - e.g., different data than expected, incorrect format, unexpected side effects
|
||||
- **Silent failures** - no error reported but operation clearly didn't work
|
||||
- **Behavioral anomalies** - program runs but behaves differently than intended
|
||||
|
||||
### (2) Explicit Failures
|
||||
- **Error messages from SDK/API calls** - exceptions, error codes, failure responses
|
||||
- **Tool execution failures** - Bash errors, script crashes, MCP tool failures
|
||||
- **Runtime errors** - exceptions in any programming language
|
||||
- **Build or compilation failures** - compiler errors, linking failures
|
||||
- **System errors** - permission denied, file not found, connection refused
|
||||
|
||||
**Key principle**: If there's any mismatch between expected and actual behavior - whether explicit error or silent anomaly - this skill applies.
|
||||
|
||||
## Troubleshooting Decision Tree
|
||||
|
||||
### 1. Initial Assessment
|
||||
|
||||
When unexpected behavior or an error occurs, immediately assess the situation:
|
||||
|
||||
```
|
||||
Unexpected Behavior or Error Detected
|
||||
↓
|
||||
What type of issue is this?
|
||||
│
|
||||
├─ UNEXPECTED BEHAVIOR (command succeeded but wrong result)
|
||||
│ ↓
|
||||
│ Document the mismatch:
|
||||
│ - What was expected?
|
||||
│ - What actually happened?
|
||||
│ - Any error messages? (none expected for unexpected behavior)
|
||||
│ ↓
|
||||
│ Is the cause obvious? (e.g., wrong variable, typo, wrong file)
|
||||
│ ├─ YES → Apply quick fix
|
||||
│ │ ↓
|
||||
│ │ Did expected behavior occur?
|
||||
│ │ ├─ YES → Done ✓
|
||||
│ │ └─ NO → Revert, proceed to Rigorous Investigation
|
||||
│ │
|
||||
│ └─ NO → Proceed directly to Rigorous Investigation
|
||||
│
|
||||
└─ EXPLICIT ERROR (stderr, exception, non-zero exit)
|
||||
↓
|
||||
Is the fix obvious from the error message itself?
|
||||
├─ YES → Apply quick fix (Happy Case Path)
|
||||
│ ↓
|
||||
│ Did it work?
|
||||
│ ├─ YES → Done ✓
|
||||
│ └─ NO → Revert changes, proceed to Rigorous Investigation
|
||||
│
|
||||
└─ NO → Is this a common trivial error?
|
||||
├─ YES → Apply known fix based on experience
|
||||
│ ↓
|
||||
│ Did it work?
|
||||
│ ├─ YES → Done ✓
|
||||
│ └─ NO → Revert changes, proceed to Rigorous Investigation
|
||||
│
|
||||
└─ NO → Proceed directly to Rigorous Investigation
|
||||
```
|
||||
|
||||
### 2. Happy Case Path (Quick Resolution)
|
||||
|
||||
For issues with obvious causes and fixes:
|
||||
|
||||
**Unexpected Behavior Quick Fixes:**
|
||||
- Obvious typo or wrong variable name
|
||||
- Wrong file path or target
|
||||
- Cached data (clear cache and retry)
|
||||
- Tool not reloaded after changes (restart and retry)
|
||||
|
||||
**Explicit Error Quick Fixes:**
|
||||
- Error message explicitly states the solution (e.g., "Missing required argument --config")
|
||||
- Common trivial errors (e.g., "command not found" → check installation)
|
||||
- Direct and unambiguous error descriptions
|
||||
|
||||
**Action**: Apply the fix immediately and verify the result.
|
||||
|
||||
**Failure criteria for unexpected behavior**: If expected behavior still doesn't occur, revert and switch to rigorous investigation.
|
||||
|
||||
**Failure criteria for explicit errors**: If the error message is unchanged OR the problem clearly worsened, revert immediately and switch to rigorous investigation.
|
||||
|
||||
### 3. Rigorous Investigation Path (Complex Problems)
|
||||
|
||||
When quick fixes fail or the problem is non-trivial, follow this systematic approach:
|
||||
|
||||
#### Step 1: Extract Problem Pattern
|
||||
|
||||
**For Explicit Errors:**
|
||||
SDK/API errors often follow fixed templates. Extract the template by:
|
||||
- Removing variable components (file paths, user inputs, timestamps, IDs)
|
||||
- Isolating the core error message structure
|
||||
- Preparing the template for web search
|
||||
|
||||
Example:
|
||||
```
|
||||
Original: FileNotFoundError: [Errno 2] No such file or directory: '/home/user/data.csv'
|
||||
Template: FileNotFoundError No such file or directory
|
||||
```
|
||||
|
||||
See `{baseDir}/references/error-template-patterns.md` for detailed guidance.
|
||||
|
||||
**For Unexpected Behavior:**
|
||||
Document the behavior pattern:
|
||||
- What command/operation was performed?
|
||||
- What was the expected outcome?
|
||||
- What actually happened instead?
|
||||
- Are there any observable symptoms (wrong data, missing files, etc.)?
|
||||
- Has this operation worked before? When did it stop working?
|
||||
|
||||
Formulate a search query focusing on the behavior:
|
||||
- "command X succeeded but didn't create Y"
|
||||
- "configuration Z not taking effect"
|
||||
- "expected validation to fail but passed"
|
||||
|
||||
#### Step 2: Gather Environment Information
|
||||
|
||||
Collect relevant environment details when:
|
||||
- The pattern search doesn't yield clear solutions
|
||||
- Environment-specific factors are likely relevant (versions, configurations, system state)
|
||||
- Context is needed to understand the problem
|
||||
|
||||
**IMPORTANT**: Avoid collecting sensitive information. If sensitive data is necessary, explicitly request user authorization first.
|
||||
|
||||
See `{baseDir}/references/environment-info-guide.md` for collection guidelines and privacy protection.
|
||||
|
||||
#### Step 3: Research the Problem
|
||||
|
||||
Use efficient research strategies:
|
||||
- **Web Search**: Search the extracted pattern (error template or behavior description)
|
||||
- **Parallel Investigation**: Use Task tool with subagent_type=Explore for multiple research angles simultaneously
|
||||
- **Documentation**: Search official docs, GitHub Issues, Stack Overflow for similar problems
|
||||
- **Counter-evidence**: Look for cases where the expected behavior DID occur to identify what's different
|
||||
|
||||
**Token Efficiency**: For complex investigations, delegate research to subagents to avoid context exhaustion.
|
||||
|
||||
#### Step 4: Create Debug Notes File
|
||||
|
||||
For difficult problems, create a debug notes file to:
|
||||
- Track theories and test results
|
||||
- Enable parallel investigation
|
||||
- Resume from interruptions
|
||||
- Maintain systematic progress
|
||||
|
||||
Use the template in `{baseDir}/assets/debug-notes-template.md` to structure notes.
|
||||
|
||||
#### Step 5: Formulate and Test Theories
|
||||
|
||||
Based on research:
|
||||
1. List plausible theories explaining the problem (why error occurred OR why expected behavior didn't happen)
|
||||
2. Order by likelihood
|
||||
3. Design tests to verify each theory
|
||||
4. Execute tests systematically
|
||||
5. Document results in debug notes
|
||||
6. Iterate until the correct solution is found
|
||||
|
||||
**For unexpected behavior**: Focus theories on "why the expected effect didn't occur" rather than "why an error happened".
|
||||
|
||||
#### Step 6: Implement Solution
|
||||
|
||||
Once the correct theory is identified:
|
||||
- Apply the fix
|
||||
- Verify the problem is resolved:
|
||||
- **For errors**: Error no longer occurs
|
||||
- **For unexpected behavior**: Expected behavior now occurs as intended
|
||||
- Document the solution in debug notes (if notes were created)
|
||||
- Consider if this pattern should be added to common patterns
|
||||
|
||||
## Token Efficiency Strategies
|
||||
|
||||
For complex investigations:
|
||||
|
||||
1. **Use Subagents**: Delegate research tasks using the Task tool with subagent_type=Explore
|
||||
2. **File-Based Notes**: Write debug notes to files instead of maintaining context in memory
|
||||
3. **Parallel Research**: Launch multiple subagents simultaneously for different research angles
|
||||
4. **Selective Context**: Only load reference files when specifically needed
|
||||
|
||||
## Key Principles
|
||||
|
||||
1. **Proactive Investigation**: Don't wait for the user to request troubleshooting—start investigating immediately when unexpected behavior or errors occur
|
||||
2. **Prioritize Unexpected Behavior**: Check for silent failures and behavioral anomalies first, as they're more subtle than explicit errors
|
||||
3. **Bold Hypotheses, Careful Verification**: Generate multiple competing theories, then rigorously verify each with concrete evidence (see `{baseDir}/references/problem-solving-mindset.md`)
|
||||
4. **Challenge Your Own Reasoning**: Actively search for counter-evidence and successful counter-examples that would disprove your theories
|
||||
5. **Acknowledge Uncertainty**: Present confidence levels; admit when evidence is incomplete rather than pretending certainty
|
||||
6. **Revert on Failure**: If a fix doesn't work, always revert before trying another approach
|
||||
7. **Systematic Documentation**: For difficult problems, maintain structured debug notes
|
||||
8. **Privacy Protection**: Never collect sensitive information without explicit user authorization
|
||||
9. **Efficient Resource Usage**: Use subagents and files to manage context for complex investigations
|
||||
|
||||
## Resources
|
||||
|
||||
This skill includes:
|
||||
|
||||
### references/
|
||||
- `problem-solving-mindset.md` - Scientific approach to problem-solving: bold hypotheses, careful verification, and disciplined reasoning
|
||||
- `systematic-debugging-methodology.md` - Practical debugging framework: Occam's Razor, diagnostic scripts, evidence hierarchy, and real-world examples
|
||||
- `troubleshooting-sop.md` - Detailed standard operating procedures for systematic troubleshooting
|
||||
- `error-template-patterns.md` - Guide to identifying and extracting error message templates
|
||||
- `environment-info-guide.md` - Environment information collection guidelines with privacy protection
|
||||
- `common-error-patterns.md` - Database of frequently encountered trivial errors and quick fixes
|
||||
|
||||
### assets/
|
||||
- `debug-notes-template.md` - Template for structured debugging documentation
|
||||
|
||||
These resources are loaded as needed during the troubleshooting process.
|
||||
396
skills/error-troubleshooter/assets/debug-notes-template.md
Normal file
396
skills/error-troubleshooter/assets/debug-notes-template.md
Normal file
@@ -0,0 +1,396 @@
|
||||
# Debug Session: [Brief Error Summary]
|
||||
|
||||
**Date**: [YYYY-MM-DD]
|
||||
**Status**: [Active/Resolved/Blocked]
|
||||
**Investigator**: Claude Error Troubleshooter
|
||||
|
||||
---
|
||||
|
||||
## Error Information
|
||||
|
||||
### Error Message
|
||||
```
|
||||
[Paste full error message here, including stack trace if available]
|
||||
```
|
||||
|
||||
### Context
|
||||
- **Command/Tool Executed**: `[command or tool that failed]`
|
||||
- **What Was Being Attempted**: [Brief description of the goal]
|
||||
- **When It Failed**: [e.g., during build, at runtime, on specific input]
|
||||
|
||||
### Initial Observations
|
||||
- [Key observation 1]
|
||||
- [Key observation 2]
|
||||
- [Key observation 3]
|
||||
|
||||
---
|
||||
|
||||
## Environment Information
|
||||
|
||||
### System
|
||||
- **OS**: [e.g., Ubuntu 22.04, macOS 13.1, Windows 11]
|
||||
- **Architecture**: [e.g., x86_64, ARM64]
|
||||
- **Shell**: [e.g., bash, zsh, PowerShell]
|
||||
|
||||
### Runtime
|
||||
- **Language/Runtime**: [e.g., Python 3.11.0, Node.js 18.12.0]
|
||||
- **Package Manager**: [e.g., pip 22.3.1, npm 8.19.2]
|
||||
- **Virtual Environment**: [e.g., active venv at /path/to/venv, none]
|
||||
|
||||
### Dependencies (Relevant Packages)
|
||||
```
|
||||
[package-1]==1.2.3
|
||||
[package-2]==4.5.6
|
||||
```
|
||||
|
||||
### Configuration (Sanitized)
|
||||
```
|
||||
[Relevant configuration settings, with sensitive data redacted]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Error Template
|
||||
|
||||
**Extracted Template** (for searching):
|
||||
```
|
||||
[Error template with variables removed - see error-template-patterns.md]
|
||||
```
|
||||
|
||||
**Search Queries Used**:
|
||||
- `[query 1]`
|
||||
- `[query 2]`
|
||||
- `[query 3]`
|
||||
|
||||
---
|
||||
|
||||
## Research Findings
|
||||
|
||||
### Source 1: [Stack Overflow / GitHub Issue / Docs / etc.]
|
||||
**URL**: [link]
|
||||
|
||||
**Summary**: [Brief summary of relevant information]
|
||||
|
||||
**Applicability**: [High/Medium/Low - how relevant is this to our case?]
|
||||
|
||||
**Key Quotes/Code**:
|
||||
```
|
||||
[Relevant code snippet or quote]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Source 2: [Title]
|
||||
**URL**: [link]
|
||||
|
||||
**Summary**: [Brief summary]
|
||||
|
||||
**Applicability**: [High/Medium/Low]
|
||||
|
||||
**Key Quotes/Code**:
|
||||
```
|
||||
[Relevant information]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Source 3: [Title]
|
||||
**URL**: [link]
|
||||
|
||||
**Summary**: [Brief summary]
|
||||
|
||||
**Applicability**: [High/Medium/Low]
|
||||
|
||||
---
|
||||
|
||||
## Theories
|
||||
|
||||
### Theory 1: [Concise Description]
|
||||
**Likelihood**: [High/Medium/Low]
|
||||
|
||||
**Evidence Supporting**:
|
||||
- [Evidence 1]
|
||||
- [Evidence 2]
|
||||
|
||||
**Evidence Against**:
|
||||
- [Counter-evidence 1]
|
||||
- [Counter-evidence 2]
|
||||
|
||||
**How to Test**:
|
||||
```bash
|
||||
[Command or approach to verify this theory]
|
||||
```
|
||||
|
||||
**Expected Outcome if Correct**:
|
||||
[What would we observe if this theory is correct?]
|
||||
|
||||
**Expected Outcome if Incorrect**:
|
||||
[What would we observe if this theory is wrong?]
|
||||
|
||||
**Status**: [Pending/Testing/Confirmed/Rejected]
|
||||
|
||||
---
|
||||
|
||||
### Theory 2: [Concise Description]
|
||||
**Likelihood**: [High/Medium/Low]
|
||||
|
||||
**Evidence Supporting**:
|
||||
- [Evidence 1]
|
||||
- [Evidence 2]
|
||||
|
||||
**Evidence Against**:
|
||||
- [Counter-evidence 1]
|
||||
|
||||
**How to Test**:
|
||||
```bash
|
||||
[Command or approach]
|
||||
```
|
||||
|
||||
**Expected Outcome if Correct**:
|
||||
[Description]
|
||||
|
||||
**Expected Outcome if Incorrect**:
|
||||
[Description]
|
||||
|
||||
**Status**: [Pending/Testing/Confirmed/Rejected]
|
||||
|
||||
---
|
||||
|
||||
### Theory 3: [Concise Description]
|
||||
**Likelihood**: [High/Medium/Low]
|
||||
|
||||
**Evidence Supporting**:
|
||||
- [Evidence]
|
||||
|
||||
**Evidence Against**:
|
||||
- [Counter-evidence]
|
||||
|
||||
**How to Test**:
|
||||
```bash
|
||||
[Command]
|
||||
```
|
||||
|
||||
**Expected Outcome if Correct**:
|
||||
[Description]
|
||||
|
||||
**Expected Outcome if Incorrect**:
|
||||
[Description]
|
||||
|
||||
**Status**: [Pending/Testing/Confirmed/Rejected]
|
||||
|
||||
---
|
||||
|
||||
## Tests Conducted
|
||||
|
||||
### Test 1: [Brief Test Description]
|
||||
**Date/Time**: [When was this test run]
|
||||
|
||||
**Theory Being Tested**: [Which theory or theories does this test address?]
|
||||
|
||||
**Test Procedure**:
|
||||
```bash
|
||||
[Exact commands or steps taken]
|
||||
```
|
||||
|
||||
**Expected Result**:
|
||||
[What we expected to happen]
|
||||
|
||||
**Actual Result**:
|
||||
```
|
||||
[What actually happened - output, errors, observations]
|
||||
```
|
||||
|
||||
**Analysis**:
|
||||
[What does this result tell us? Does it support or reject any theories?]
|
||||
|
||||
**Conclusion**:
|
||||
- [Theory X]: [Supported/Rejected/Inconclusive]
|
||||
- [New information learned]
|
||||
|
||||
---
|
||||
|
||||
### Test 2: [Brief Test Description]
|
||||
**Date/Time**: [When]
|
||||
|
||||
**Theory Being Tested**: [Theory name]
|
||||
|
||||
**Test Procedure**:
|
||||
```bash
|
||||
[Commands]
|
||||
```
|
||||
|
||||
**Expected Result**:
|
||||
[Description]
|
||||
|
||||
**Actual Result**:
|
||||
```
|
||||
[Output]
|
||||
```
|
||||
|
||||
**Analysis**:
|
||||
[Interpretation]
|
||||
|
||||
**Conclusion**:
|
||||
- [What was learned]
|
||||
|
||||
---
|
||||
|
||||
### Test 3: [Brief Test Description]
|
||||
**Date/Time**: [When]
|
||||
|
||||
**Theory Being Tested**: [Theory name]
|
||||
|
||||
**Test Procedure**:
|
||||
```bash
|
||||
[Commands]
|
||||
```
|
||||
|
||||
**Expected Result**:
|
||||
[Description]
|
||||
|
||||
**Actual Result**:
|
||||
```
|
||||
[Output]
|
||||
```
|
||||
|
||||
**Analysis**:
|
||||
[Interpretation]
|
||||
|
||||
**Conclusion**:
|
||||
- [What was learned]
|
||||
|
||||
---
|
||||
|
||||
## Solution
|
||||
|
||||
### Root Cause
|
||||
[Detailed explanation of what was actually causing the error]
|
||||
|
||||
### Fix Applied
|
||||
```bash
|
||||
[Exact commands or code changes made to resolve the issue]
|
||||
```
|
||||
|
||||
### Why This Works
|
||||
[Explanation of why this solution resolves the root cause]
|
||||
|
||||
### Verification
|
||||
**Verification Command**:
|
||||
```bash
|
||||
[Command to verify the fix]
|
||||
```
|
||||
|
||||
**Verification Result**:
|
||||
```
|
||||
[Output showing the error is resolved]
|
||||
```
|
||||
|
||||
**Additional Testing**:
|
||||
- [Test 1]: [Result]
|
||||
- [Test 2]: [Result]
|
||||
|
||||
---
|
||||
|
||||
## Lessons Learned
|
||||
|
||||
### What Went Well
|
||||
- [Approach or technique that was effective]
|
||||
- [Tool or resource that was helpful]
|
||||
|
||||
### What Could Be Improved
|
||||
- [Mistake or inefficiency to avoid in future]
|
||||
- [Better approach that could have been taken]
|
||||
|
||||
### Key Insights
|
||||
- [Important insight 1]
|
||||
- [Important insight 2]
|
||||
- [Important insight 3]
|
||||
|
||||
### Applicable to Future Cases
|
||||
- [General pattern or principle learned]
|
||||
- [Red flags to watch for in similar errors]
|
||||
- [Quick checks to try first next time]
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
### Documentation
|
||||
- [Link 1]: [Brief description]
|
||||
- [Link 2]: [Brief description]
|
||||
|
||||
### Related Issues
|
||||
- [Link to similar GitHub issue, Stack Overflow question, etc.]
|
||||
|
||||
### Tools Used
|
||||
- [Tool 1]: [What it was used for]
|
||||
- [Tool 2]: [What it was used for]
|
||||
|
||||
---
|
||||
|
||||
## Investigation Timeline
|
||||
|
||||
**[HH:MM]** - Investigation started
|
||||
- Initial error encountered: [brief description]
|
||||
|
||||
**[HH:MM]** - Extracted error template
|
||||
- Template: `[template]`
|
||||
|
||||
**[HH:MM]** - Gathered environment information
|
||||
- [Key findings]
|
||||
|
||||
**[HH:MM]** - Completed initial research
|
||||
- [Number] relevant sources found
|
||||
|
||||
**[HH:MM]** - Formulated theories
|
||||
- [Number] theories developed
|
||||
|
||||
**[HH:MM]** - Began testing
|
||||
- Started with Theory [N] (highest likelihood)
|
||||
|
||||
**[HH:MM]** - Test 1 completed
|
||||
- [Result and conclusion]
|
||||
|
||||
**[HH:MM]** - Test 2 completed
|
||||
- [Result and conclusion]
|
||||
|
||||
**[HH:MM]** - Root cause identified
|
||||
- [Brief description]
|
||||
|
||||
**[HH:MM]** - Fix applied
|
||||
- [Brief description]
|
||||
|
||||
**[HH:MM]** - Fix verified
|
||||
- Error resolved successfully
|
||||
|
||||
**[HH:MM]** - Investigation completed
|
||||
- Total time: [duration]
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
### Open Questions
|
||||
- [Question 1 that remains unanswered]
|
||||
- [Question 2 that needs clarification]
|
||||
|
||||
### Future Investigation
|
||||
- [Area to explore if this error recurs]
|
||||
- [Related issue to investigate separately]
|
||||
|
||||
### Blocked By
|
||||
- [If investigation is blocked, what's blocking it]
|
||||
- [What needs to happen to unblock]
|
||||
|
||||
---
|
||||
|
||||
## Metadata
|
||||
|
||||
**Status**: [Active/Resolved/Blocked]
|
||||
**Priority**: [High/Medium/Low]
|
||||
**Tags**: [error-type, language, component, etc.]
|
||||
**Related Files**:
|
||||
- `[file1.py]`
|
||||
- `[file2.js]`
|
||||
**Related Errors**:
|
||||
- [Link to other debug notes if related]
|
||||
784
skills/error-troubleshooter/references/common-error-patterns.md
Normal file
@@ -0,0 +1,784 @@
|
||||
# Common Error Patterns and Quick Fixes
|
||||
|
||||
This reference documents frequently encountered trivial errors and their known solutions. Use this for rapid diagnosis and resolution of common issues.
|
||||
|
||||
## How to Use This Reference
|
||||
|
||||
1. **Pattern Match**: Compare the error against patterns in this document
|
||||
2. **Apply Fix**: If there's a high-confidence match, apply the suggested fix
|
||||
3. **Verify**: Confirm the fix resolved the issue
|
||||
4. **Escalate**: If fix fails, revert and proceed to rigorous investigation
|
||||
|
||||
**Important**: Only apply quick fixes when confident. If the first attempt doesn't work, revert and use systematic troubleshooting.
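
The matching step above can also be scripted. A minimal sketch, assuming a small hand-maintained pattern table — the `KNOWN_PATTERNS` entries below are illustrative and not an exhaustive encoding of this reference:

```python
import re

# Hypothetical pattern table: error signature -> suggested quick fix.
# The entries are examples, not a complete mapping of this document.
KNOWN_PATTERNS = {
    r"ModuleNotFoundError: No module named": "install the missing package (pip install ...)",
    r"EADDRINUSE": "free the port or run the service on another port",
    r"Permission denied \(publickey\)": "check that an SSH key is configured for the remote",
}

def match_known_pattern(error_text):
    """Return (pattern, suggestion) for the first known pattern found, else None."""
    for pattern, suggestion in KNOWN_PATTERNS.items():
        if re.search(pattern, error_text):
            return pattern, suggestion
    return None

print(match_known_pattern("ModuleNotFoundError: No module named 'pandas'"))
# None means: fall back to systematic troubleshooting.
```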
|
||||
|
||||
## Python Common Errors
|
||||
|
||||
### 1. ModuleNotFoundError / ImportError
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
ModuleNotFoundError: No module named 'package_name'
|
||||
ImportError: cannot import name 'something' from 'package'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Package Not Installed
|
||||
```bash
|
||||
# Fix
|
||||
pip install package_name
|
||||
|
||||
# Or if using requirements.txt
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
#### Cause 2: Wrong Python Environment
|
||||
```bash
|
||||
# Check current Python
|
||||
which python
|
||||
python --version
|
||||
|
||||
# Activate correct virtual environment
|
||||
source venv/bin/activate # Unix
|
||||
.\venv\Scripts\activate # Windows
|
||||
|
||||
# Or use python3 explicitly
|
||||
pip3 install package_name
|
||||
```
|
||||
|
||||
#### Cause 3: Package Name Mismatch
|
||||
```bash
|
||||
# Install name might differ from import name
|
||||
# Example: pip install Pillow, but import PIL
|
||||
pip install Pillow # for 'import PIL'
|
||||
pip install opencv-python # for 'import cv2'
|
||||
pip install scikit-learn # for 'import sklearn'
|
||||
```
|
||||
|
||||
#### Cause 4: Circular Import
|
||||
- Check if files are importing each other
|
||||
- Restructure imports or use lazy imports (see the sketch below)
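
A minimal sketch of the lazy-import option, using hypothetical `orders`/`customers` modules that previously imported each other at module load time:

```python
# orders.py -- hypothetical module; it previously did
# `from customers import get_customer_name` at the top while customers.py
# imported orders, so both failed at import time.

def build_order_summary(order):
    # Lazy import: deferring the import until the function is called breaks
    # the import-time cycle between the two modules.
    from customers import get_customer_name
    return f"{get_customer_name(order['customer_id'])}: {order['total']}"
```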
|
||||
|
||||
### 2. SyntaxError
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
SyntaxError: invalid syntax
|
||||
SyntaxError: unexpected EOF while parsing
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Missing Colon
|
||||
```python
|
||||
# Wrong
|
||||
if condition
|
||||
do_something()
|
||||
|
||||
# Right
|
||||
if condition:
|
||||
do_something()
|
||||
```
|
||||
|
||||
#### Cause 2: Unclosed Brackets/Quotes
|
||||
```python
|
||||
# Wrong
|
||||
result = calculate(x, y
|
||||
print("Hello world)
|
||||
|
||||
# Right
|
||||
result = calculate(x, y)
|
||||
print("Hello world")
|
||||
```
|
||||
|
||||
#### Cause 3: Python Version Incompatibility
|
||||
```python
|
||||
# f-strings require Python 3.6+
|
||||
print(f"Value: {x}") # SyntaxError in Python 3.5
|
||||
|
||||
# Walrus operator requires Python 3.8+
|
||||
if (n := len(items)) > 10: # SyntaxError in Python 3.7
|
||||
```
|
||||
|
||||
**Fix:** Check Python version and upgrade if needed
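
If the project requires a newer interpreter, an explicit check at startup turns a confusing failure into a clear message (it cannot help with unsupported syntax in the same file, which fails before any code runs). A minimal sketch, with 3.8 as an example floor:

```python
import sys

# Fail early with a clear message; the (3, 8) floor is only an example
# (e.g. for code elsewhere in the project that uses the walrus operator).
if sys.version_info < (3, 8):
    raise SystemExit(f"This project needs Python 3.8+, found {sys.version.split()[0]}")
```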
|
||||
|
||||
### 3. IndentationError
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
IndentationError: unexpected indent
|
||||
IndentationError: expected an indented block
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause: Mixed Tabs and Spaces
|
||||
```bash
|
||||
# Fix: Convert to spaces (recommended)
|
||||
# In most editors: Settings → Convert indentation to spaces
|
||||
|
||||
# Or use autopep8
|
||||
pip install autopep8
|
||||
autopep8 --in-place --select=E101,E121 file.py
|
||||
```
|
||||
|
||||
### 4. AttributeError
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
AttributeError: 'NoneType' object has no attribute 'something'
|
||||
AttributeError: module 'X' has no attribute 'Y'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: None Value
|
||||
```python
|
||||
# Function returned None when you expected object
|
||||
result = function_that_returns_none()
|
||||
result.method() # AttributeError
|
||||
|
||||
# Fix: Check for None
|
||||
if result is not None:
|
||||
result.method()
|
||||
```
|
||||
|
||||
#### Cause 2: Wrong Module Version
|
||||
```bash
|
||||
# API changed in new version
|
||||
pip show package_name # Check version
|
||||
pip install package_name==1.2.3 # Install specific version
|
||||
```
|
||||
|
||||
#### Cause 3: Module Not Reloaded
|
||||
```python
|
||||
# In interactive session/Jupyter
|
||||
import importlib
|
||||
importlib.reload(module_name)
|
||||
```
|
||||
|
||||
### 5. FileNotFoundError
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
FileNotFoundError: [Errno 2] No such file or directory: 'path/to/file'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Wrong Current Directory
|
||||
```python
|
||||
# Check current directory
|
||||
import os
|
||||
print(os.getcwd())
|
||||
|
||||
# Fix: Use absolute path or change directory
|
||||
os.chdir('/correct/path')
|
||||
# Or use absolute path
|
||||
file_path = os.path.join(os.path.dirname(__file__), 'data', 'file.txt')
|
||||
```
|
||||
|
||||
#### Cause 2: Typo in Path
|
||||
- Check file name spelling (case-sensitive on Unix)
|
||||
- Check file extension
|
||||
|
||||
#### Cause 3: File Doesn't Exist Yet
|
||||
```python
|
||||
# Create file if it doesn't exist
|
||||
os.makedirs(os.path.dirname(file_path), exist_ok=True)
|
||||
with open(file_path, 'w') as f:
|
||||
f.write('')
|
||||
```
|
||||
|
||||
## JavaScript/Node.js Common Errors
|
||||
|
||||
### 1. Cannot find module
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error: Cannot find module 'module-name'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Module Not Installed
|
||||
```bash
|
||||
# Fix
|
||||
npm install module-name
|
||||
|
||||
# Or restore all dependencies
|
||||
npm install
|
||||
```
|
||||
|
||||
#### Cause 2: Wrong Node Version
|
||||
```bash
|
||||
# Check required version in package.json
|
||||
cat package.json | grep "engines"
|
||||
|
||||
# Use nvm to switch version
|
||||
nvm install 16
|
||||
nvm use 16
|
||||
```
|
||||
|
||||
#### Cause 3: Module Path Error
|
||||
```javascript
|
||||
// Wrong (relative path without ./)
|
||||
const myModule = require('utils/helper')
|
||||
|
||||
// Right
|
||||
const myModule = require('./utils/helper')
|
||||
```
|
||||
|
||||
### 2. Reference Error
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
ReferenceError: X is not defined
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Variable Not Declared
|
||||
```javascript
|
||||
// Wrong
|
||||
console.log(myVar) // ReferenceError
|
||||
|
||||
// Right
|
||||
const myVar = 'value'
|
||||
console.log(myVar)
|
||||
```
|
||||
|
||||
#### Cause 2: Scope Issue
|
||||
```javascript
|
||||
// Wrong
|
||||
if (true) {
|
||||
var x = 1
|
||||
}
|
||||
console.log(x) // Works with var, but not with let/const
|
||||
|
||||
// Right
|
||||
let x
|
||||
if (true) {
|
||||
x = 1
|
||||
}
|
||||
console.log(x)
|
||||
```
|
||||
|
||||
#### Cause 3: Typo in Variable Name
|
||||
- Check spelling and case (JavaScript is case-sensitive)
|
||||
|
||||
### 3. SyntaxError: Unexpected token
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
SyntaxError: Unexpected token 'x'
|
||||
SyntaxError: Unexpected token {
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: JSON Parsing Error
|
||||
```javascript
|
||||
// Wrong: Invalid JSON
|
||||
const data = JSON.parse("{'key': 'value'}") // Single quotes not valid JSON
|
||||
|
||||
// Right: Valid JSON
|
||||
const data = JSON.parse('{"key": "value"}')
|
||||
```
|
||||
|
||||
#### Cause 2: Missing Semicolons/Commas
|
||||
```javascript
|
||||
// Wrong
|
||||
const obj = {
|
||||
key1: 'value1'
|
||||
key2: 'value2'
|
||||
}
|
||||
|
||||
// Right
|
||||
const obj = {
|
||||
key1: 'value1',
|
||||
key2: 'value2'
|
||||
}
|
||||
```
|
||||
|
||||
#### Cause 3: ES6 Syntax in Old Node
|
||||
```javascript
|
||||
// Arrow functions require Node 4+
|
||||
const func = () => {}
|
||||
|
||||
// async/await requires Node 7.6+
|
||||
async function test() { await promise }
|
||||
```
|
||||
|
||||
**Fix:** Upgrade Node or use transpiler (Babel)
|
||||
|
||||
### 4. ECONNREFUSED
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error: connect ECONNREFUSED 127.0.0.1:3000
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Server Not Running
|
||||
```bash
|
||||
# Fix: Start the server
|
||||
npm start
|
||||
node server.js
|
||||
```
|
||||
|
||||
#### Cause 2: Wrong Port
|
||||
```javascript
|
||||
// Check server port in code
|
||||
const PORT = process.env.PORT || 3000
|
||||
|
||||
// Ensure client uses same port
|
||||
```
|
||||
|
||||
#### Cause 3: Firewall Blocking
|
||||
```bash
|
||||
# Check if port is listening
|
||||
netstat -an | grep 3000 # Unix
|
||||
netstat -an | findstr 3000 # Windows
|
||||
```
|
||||
|
||||
### 5. EADDRINUSE
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error: listen EADDRINUSE: address already in use :::3000
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix 1: Kill Process Using Port
|
||||
```bash
|
||||
# Unix/Linux/Mac
|
||||
lsof -ti:3000 | xargs kill
|
||||
# Or
|
||||
kill $(lsof -t -i:3000)
|
||||
|
||||
# Windows
|
||||
netstat -ano | findstr :3000
|
||||
taskkill /PID <PID> /F
|
||||
```
|
||||
|
||||
#### Fix 2: Use Different Port
|
||||
```javascript
|
||||
const PORT = process.env.PORT || 3001 // Change port
|
||||
```
|
||||
|
||||
## Git Common Errors
|
||||
|
||||
### 1. Permission Denied (publickey)
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Permission denied (publickey).
|
||||
fatal: Could not read from remote repository.
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause: SSH Key Not Set Up
|
||||
```bash
|
||||
# Generate SSH key
|
||||
ssh-keygen -t ed25519 -C "your_email@example.com"
|
||||
|
||||
# Add to SSH agent
|
||||
eval "$(ssh-agent -s)"
|
||||
ssh-add ~/.ssh/id_ed25519
|
||||
|
||||
# Copy public key and add to GitHub/GitLab
|
||||
cat ~/.ssh/id_ed25519.pub
|
||||
```
|
||||
|
||||
### 2. Merge Conflict
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
CONFLICT (content): Merge conflict in file.txt
|
||||
Automatic merge failed; fix conflicts and then commit the result.
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Standard Resolution
|
||||
```bash
|
||||
# 1. Open conflicted files and resolve markers
|
||||
# <<<<<<< HEAD
|
||||
# =======
|
||||
# >>>>>>> branch-name
|
||||
|
||||
# 2. Stage resolved files
|
||||
git add file.txt
|
||||
|
||||
# 3. Complete merge
|
||||
git commit
|
||||
```
|
||||
|
||||
### 3. Detached HEAD
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
You are in 'detached HEAD' state.
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Create Branch or Return to Branch
|
||||
```bash
|
||||
# Option 1: Create branch at current commit
|
||||
git checkout -b new-branch-name
|
||||
|
||||
# Option 2: Return to main branch
|
||||
git checkout main
|
||||
```
|
||||
|
||||
## Docker Common Errors
|
||||
|
||||
### 1. Cannot connect to Docker daemon
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Docker Not Running
|
||||
```bash
|
||||
# Start Docker
|
||||
sudo systemctl start docker # Linux
|
||||
# Or start Docker Desktop (Mac/Windows)
|
||||
```
|
||||
|
||||
#### Cause 2: Permission Issue
|
||||
```bash
|
||||
# Add user to docker group
|
||||
sudo usermod -aG docker $USER
|
||||
# Then logout and login again
|
||||
```
|
||||
|
||||
### 2. Port Already Allocated
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error starting userland proxy: listen tcp 0.0.0.0:8080: bind: address already in use
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix 1: Stop Conflicting Container
|
||||
```bash
|
||||
# Find container using port
|
||||
docker ps | grep 8080
|
||||
|
||||
# Stop it
|
||||
docker stop <container_id>
|
||||
```
|
||||
|
||||
#### Fix 2: Use Different Port
|
||||
```bash
|
||||
# Change port mapping
|
||||
docker run -p 8081:8080 image_name
|
||||
```
|
||||
|
||||
### 3. No Space Left on Device
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error: No space left on device
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Clean Up Docker Resources
|
||||
```bash
|
||||
# Remove unused containers, images, volumes
|
||||
docker system prune -a --volumes
|
||||
|
||||
# Or selectively
|
||||
docker container prune
|
||||
docker image prune -a
|
||||
docker volume prune
|
||||
```
|
||||
|
||||
## Database Common Errors
|
||||
|
||||
### 1. Connection Refused
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error: connect ECONNREFUSED 127.0.0.1:5432
|
||||
psycopg2.OperationalError: could not connect to server: Connection refused
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Database Not Running
|
||||
```bash
|
||||
# PostgreSQL
|
||||
sudo systemctl start postgresql # Linux
|
||||
brew services start postgresql # Mac
|
||||
|
||||
# MySQL
|
||||
sudo systemctl start mysql # Linux
|
||||
brew services start mysql # Mac
|
||||
|
||||
# MongoDB
|
||||
sudo systemctl start mongod # Linux
|
||||
brew services start mongodb-community # Mac
|
||||
```
|
||||
|
||||
#### Cause 2: Wrong Port
|
||||
- PostgreSQL default: 5432
|
||||
- MySQL default: 3306
|
||||
- MongoDB default: 27017
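
To confirm which of these ports is actually reachable, a quick TCP probe helps; a minimal sketch assuming the database runs on localhost (adjust the host and ports to the actual setup):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Defaults from the list above; adjust as needed.
for name, port in [("PostgreSQL", 5432), ("MySQL", 3306), ("MongoDB", 27017)]:
    status = "reachable" if port_open("127.0.0.1", port) else "not reachable"
    print(f"{name} on {port}: {status}")
```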
|
||||
|
||||
#### Cause 3: Wrong Host
|
||||
```python
|
||||
# Wrong
|
||||
host='localhost'
|
||||
|
||||
# Try
|
||||
host='127.0.0.1'
|
||||
# Or check actual host in database config
|
||||
```
|
||||
|
||||
### 2. Authentication Failed
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Authentication failed for user 'username'
|
||||
Access denied for user 'username'@'localhost'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Check Credentials
|
||||
```bash
|
||||
# PostgreSQL: Check user exists
|
||||
psql -U postgres
|
||||
\du # List users
|
||||
|
||||
# MySQL: Check user exists
|
||||
mysql -u root -p
|
||||
SELECT User, Host FROM mysql.user;
|
||||
|
||||
# Reset password if needed
|
||||
ALTER USER 'username' IDENTIFIED BY 'new_password';
|
||||
```
|
||||
|
||||
### 3. Database Does Not Exist
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
FATAL: database "dbname" does not exist
|
||||
ERROR 1049 (42000): Unknown database 'dbname'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Create Database
|
||||
```sql
|
||||
-- PostgreSQL
|
||||
CREATE DATABASE dbname;
|
||||
|
||||
-- MySQL
|
||||
CREATE DATABASE dbname;
|
||||
|
||||
-- Or use command line
|
||||
createdb dbname # PostgreSQL
|
||||
mysql -e "CREATE DATABASE dbname" # MySQL
|
||||
```
|
||||
|
||||
## Package Manager Common Errors
|
||||
|
||||
### 1. npm: EACCES Permission Denied
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
npm ERR! code EACCES
|
||||
npm ERR! EACCES: permission denied
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Don't Use Sudo (Instead Fix Permissions)
|
||||
```bash
|
||||
# Fix npm global directory permissions
|
||||
mkdir ~/.npm-global
|
||||
npm config set prefix '~/.npm-global'
|
||||
echo 'export PATH=~/.npm-global/bin:$PATH' >> ~/.profile
|
||||
source ~/.profile
|
||||
```
|
||||
|
||||
### 2. pip: Could Not Find a Version
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
ERROR: Could not find a version that satisfies the requirement package_name
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Typo in Package Name
|
||||
```bash
|
||||
# Fix: Check correct name on PyPI
|
||||
pip search package_name # (Note: search disabled on PyPI)
|
||||
# Or search on https://pypi.org
|
||||
```
|
||||
|
||||
#### Cause 2: Python Version Incompatibility
|
||||
```bash
|
||||
# Check Python version
|
||||
python --version
|
||||
|
||||
# Some packages require specific Python versions
|
||||
# Upgrade Python or find compatible package version
|
||||
pip install package_name==1.2.3
|
||||
```
|
||||
|
||||
#### Cause 3: No Internet / Proxy Issue
|
||||
```bash
|
||||
# Check connectivity
|
||||
ping pypi.org
|
||||
|
||||
# If behind proxy
|
||||
pip install --proxy http://proxy:port package_name
|
||||
```
|
||||
|
||||
### 3. Requirements File Error
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
ERROR: Invalid requirement: 'package_name==1.0.0\r' (from line X of requirements.txt)
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause: Windows Line Endings
|
||||
```bash
|
||||
# Fix: Convert to Unix line endings
|
||||
dos2unix requirements.txt
|
||||
|
||||
# Or with Python
|
||||
python -c "import sys; data = open('requirements.txt', 'r').read(); open('requirements.txt', 'w').write(data.replace('\r\n', '\n'))"
|
||||
```
|
||||
|
||||
## Build Tool Common Errors
|
||||
|
||||
### 1. Make: Command Not Found
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
make: command not found
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Install Build Tools
|
||||
```bash
|
||||
# Debian/Ubuntu
|
||||
sudo apt-get install build-essential
|
||||
|
||||
# macOS
|
||||
xcode-select --install
|
||||
|
||||
# CentOS/RHEL
|
||||
sudo yum groupinstall "Development Tools"
|
||||
```
|
||||
|
||||
### 2. Compiler Error: Missing Header
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
fatal error: Python.h: No such file or directory
|
||||
fatal error: openssl/ssl.h: No such file or directory
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Install Development Headers
|
||||
```bash
|
||||
# Python headers
|
||||
sudo apt-get install python3-dev # Debian/Ubuntu
|
||||
sudo yum install python3-devel # CentOS/RHEL
|
||||
|
||||
# OpenSSL headers
|
||||
sudo apt-get install libssl-dev # Debian/Ubuntu
|
||||
sudo yum install openssl-devel # CentOS/RHEL
|
||||
```
|
||||
|
||||
## Cross-Platform Path Issues
|
||||
|
||||
### Windows Backslash vs Unix Forward Slash
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
FileNotFoundError: [Errno 2] No such file or directory: 'path\\to\\file'
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Fix: Use os.path or pathlib
|
||||
```python
|
||||
# Wrong (platform-specific)
|
||||
path = 'C:\\Users\\name\\file.txt' # Windows only
|
||||
path = '/home/user/file.txt' # Unix only
|
||||
|
||||
# Right (cross-platform)
|
||||
import os
|
||||
path = os.path.join('Users', 'name', 'file.txt')
|
||||
|
||||
# Or use pathlib (Python 3.4+)
|
||||
from pathlib import Path
|
||||
path = Path('Users') / 'name' / 'file.txt'
|
||||
```
|
||||
|
||||
## Quick Diagnosis Checklist
|
||||
|
||||
When encountering an error, quickly check:
|
||||
|
||||
- [ ] Is the tool/service running?
|
||||
- [ ] Are dependencies installed?
|
||||
- [ ] Is it a typo (file name, variable, import)?
|
||||
- [ ] Am I in the right directory?
|
||||
- [ ] Am I using the right version (Python, Node, package)?
|
||||
- [ ] Am I in the right environment (virtual env, conda env)?
|
||||
- [ ] Is there a port conflict?
|
||||
- [ ] Do I have proper permissions?
|
||||
- [ ] Are there network/firewall issues?
|
||||
- [ ] Did I check the error message carefully?
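
A few of these checks can be answered mechanically. A minimal sketch that prints the environment facts behind several checklist items (the selection of checks is arbitrary, not a complete diagnosis):

```python
import os
import shutil
import sys

# Print raw facts; interpreting them against the checklist is still manual.
print("Python:", sys.version.split()[0])
print("Interpreter:", sys.executable)
print("Virtualenv active:", bool(os.environ.get("VIRTUAL_ENV")))
print("Working directory:", os.getcwd())
for tool in ("node", "docker", "git"):
    print(f"{tool} on PATH:", shutil.which(tool) is not None)
```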
|
||||
|
||||
## Adding New Patterns
|
||||
|
||||
As new common errors are discovered during troubleshooting:
|
||||
|
||||
1. **Document the pattern**: Error message template
|
||||
2. **Document the cause**: Why it happens
|
||||
3. **Document the fix**: Step-by-step solution
|
||||
4. **Test the fix**: Verify it works reliably
|
||||
5. **Add to this file**: So it's available for future use
|
||||
|
||||
**Format:**
|
||||
```markdown
|
||||
### N. Brief Error Description
|
||||
|
||||
**Pattern:**
|
||||
```
|
||||
Error message pattern
|
||||
```
|
||||
|
||||
**Common Causes & Fixes:**
|
||||
|
||||
#### Cause 1: Specific Cause
|
||||
```bash
|
||||
# Fix
|
||||
command or code to fix
|
||||
```
|
||||
|
||||
**Notes:** Additional context or warnings
|
||||
```
|
||||
591
skills/error-troubleshooter/references/environment-info-guide.md
Normal file
@@ -0,0 +1,591 @@
|
||||
# Environment Information Collection Guide
|
||||
|
||||
This guide provides systematic approaches for gathering environment information during error troubleshooting while maintaining strict privacy and security standards.
|
||||
|
||||
## Core Privacy Principles
|
||||
|
||||
### Never Collect Without Authorization
|
||||
|
||||
**Absolutely Forbidden** (never collect these):
|
||||
- Passwords or password hashes
|
||||
- API keys, tokens, or credentials
|
||||
- Private keys or certificates
|
||||
- Personal identifiable information (PII): names, emails, addresses, phone numbers
|
||||
- Financial information
|
||||
- Session cookies
|
||||
- Authentication headers
|
||||
- Database connection strings with credentials
|
||||
|
||||
**Requires Explicit User Permission**:
|
||||
- Project-specific file paths
|
||||
- Custom configuration files
|
||||
- Custom environment variables
|
||||
- Application logs (may contain sensitive data)
|
||||
- Network configurations
|
||||
- User-specific system settings
|
||||
|
||||
**Generally Safe** (collect without permission):
|
||||
- Public software versions
|
||||
- Operating system type and version
|
||||
- Public package versions
|
||||
- Standard error messages
|
||||
- Command outputs that don't reveal sensitive paths or data
|
||||
|
||||
### Privacy-First Collection Strategy
|
||||
|
||||
1. **Assess Necessity**: Only collect information directly relevant to the error
|
||||
2. **Request Permission**: When in doubt, ask the user first
|
||||
3. **Sanitize Output**: Remove sensitive data before recording
|
||||
4. **Minimize Scope**: Collect the smallest amount needed
|
||||
5. **Explain Purpose**: Tell the user why specific information is needed
|
||||
|
||||
## Environment Information Categories
|
||||
|
||||
### 1. System Environment
|
||||
|
||||
**What to Collect:**
|
||||
- Operating system and version
|
||||
- System architecture (x86, ARM, etc.)
|
||||
- Shell/terminal type
|
||||
- Locale and encoding settings
|
||||
|
||||
**Collection Commands:**
|
||||
|
||||
```bash
|
||||
# Cross-platform OS detection
|
||||
# Linux/Mac
|
||||
uname -a
|
||||
uname -s # Just OS name
|
||||
uname -r # Just kernel version
|
||||
|
||||
# Windows
|
||||
ver
|
||||
systeminfo | findstr /B /C:"OS Name" /C:"OS Version"
|
||||
|
||||
# Architecture
|
||||
uname -m # Linux/Mac
|
||||
echo %PROCESSOR_ARCHITECTURE% # Windows
|
||||
|
||||
# Locale
|
||||
locale # Linux/Mac
|
||||
echo $LANG # Unix-like
|
||||
chcp # Windows (code page)
|
||||
```
|
||||
|
||||
**Privacy Notes:**
|
||||
- These commands generally don't expose sensitive information
|
||||
- Hostname may be included in `uname -a` output (consider sanitizing)
|
||||
|
||||
### 2. Language Runtime Environment
|
||||
|
||||
**What to Collect:**
|
||||
- Programming language version
|
||||
- Runtime environment details
|
||||
- Virtual environment status
|
||||
|
||||
**Collection Commands by Language:**
|
||||
|
||||
#### Python
|
||||
```bash
|
||||
# Version
|
||||
python --version
|
||||
python3 --version
|
||||
|
||||
# Detailed info
|
||||
python -c "import sys; print(sys.version)"
|
||||
|
||||
# Virtual environment detection
|
||||
echo $VIRTUAL_ENV # Unix-like
|
||||
echo %VIRTUAL_ENV% # Windows
|
||||
|
||||
# Python path
|
||||
python -c "import sys; print(sys.executable)"
|
||||
```
|
||||
|
||||
#### Node.js/JavaScript
|
||||
```bash
|
||||
# Versions
|
||||
node --version
|
||||
npm --version
|
||||
yarn --version
|
||||
|
||||
# Node environment
|
||||
echo $NODE_ENV
|
||||
|
||||
# Global package location
|
||||
npm config get prefix
|
||||
```
|
||||
|
||||
#### Java
|
||||
```bash
|
||||
# Version
|
||||
java -version
|
||||
javac -version
|
||||
|
||||
# Runtime details
|
||||
java -XshowSettings:properties -version 2>&1 | grep 'java.version'
|
||||
```
|
||||
|
||||
#### Ruby
|
||||
```bash
|
||||
# Version
|
||||
ruby --version
|
||||
|
||||
# Gem environment
|
||||
gem env
|
||||
```
|
||||
|
||||
#### Go
|
||||
```bash
|
||||
# Version
|
||||
go version
|
||||
|
||||
# Environment
|
||||
go env
|
||||
```
|
||||
|
||||
**Privacy Notes:**
|
||||
- Installation paths may reveal usernames (sanitize if needed)
|
||||
- Custom environment variables may contain sensitive data
|
||||
|
||||
### 3. Package Dependencies
|
||||
|
||||
**What to Collect:**
|
||||
- Installed package versions (relevant to error)
|
||||
- Package manager version
|
||||
- Dependency conflicts
|
||||
- Lock file status
|
||||
|
||||
**Collection Commands:**
|
||||
|
||||
#### Python (pip)
|
||||
```bash
|
||||
# Specific package version
|
||||
pip show <package-name>
|
||||
pip list | grep <package-name>
|
||||
|
||||
# All packages (use sparingly, large output)
|
||||
pip list
|
||||
|
||||
# Dependency conflicts
|
||||
pip check
|
||||
|
||||
# Requirements file
|
||||
cat requirements.txt # Request permission first
|
||||
```
|
||||
|
||||
#### Python (conda)
|
||||
```bash
|
||||
# Environment info
|
||||
conda info
|
||||
|
||||
# Installed packages
|
||||
conda list <package-name>
|
||||
|
||||
# Environment exports
|
||||
conda env export # Use with caution, may be large
|
||||
```
|
||||
|
||||
#### Node.js (npm)
|
||||
```bash
|
||||
# Specific package version
|
||||
npm list <package-name>
|
||||
npm view <package-name> version
|
||||
|
||||
# Global packages
|
||||
npm list -g --depth=0
|
||||
|
||||
# Dependency audit
|
||||
npm doctor
|
||||
npm audit
|
||||
|
||||
# Lock file status
|
||||
ls -l package-lock.json
|
||||
```
|
||||
|
||||
#### Ruby (gem)
|
||||
```bash
|
||||
# Specific gem version
|
||||
gem list <gem-name>
|
||||
|
||||
# All gems
|
||||
gem list
|
||||
|
||||
# Gem environment
|
||||
gem env
|
||||
```
|
||||
|
||||
**Privacy Notes:**
|
||||
- Package lists can be large; collect only relevant packages when possible
|
||||
- Lock files may contain private registry URLs (review before sharing)
|
||||
- Package names might reveal business logic (request permission for private packages)
|
||||
|
||||
### 4. Configuration Files
|
||||
|
||||
**What to Collect:**
|
||||
Configuration files often contain sensitive data. Always exercise caution.
|
||||
|
||||
**Approach:**
|
||||
1. **Identify relevant config**: Only collect configs directly related to the error
|
||||
2. **Request permission**: Always ask before reading project-specific configs
|
||||
3. **Sanitize**: Remove credentials, API keys, and sensitive values before recording
|
||||
4. **Provide context**: Explain why the config is needed
|
||||
|
||||
**Common Configuration Files:**
|
||||
|
||||
```bash
|
||||
# Python
|
||||
# - setup.py, setup.cfg, pyproject.toml, tox.ini
|
||||
|
||||
# Node.js
|
||||
# - package.json (usually safe), .npmrc (check for tokens)
|
||||
|
||||
# General
|
||||
# - .env files (NEVER share without sanitization)
|
||||
# - config.json, config.yaml (sanitize before sharing)
|
||||
```
|
||||
|
||||
**Sanitization Example:**
|
||||
|
||||
```yaml
|
||||
# Before sanitization
|
||||
database:
|
||||
host: db.example.com
|
||||
username: admin
|
||||
password: super_secret_123
|
||||
port: 5432
|
||||
|
||||
# After sanitization
|
||||
database:
|
||||
host: [REDACTED]
|
||||
username: [REDACTED]
|
||||
password: [REDACTED]
|
||||
port: 5432
|
||||
```
|
||||
|
||||
### 5. Environment Variables
|
||||
|
||||
**What to Collect:**
|
||||
Environment variables often contain sensitive data. Collect with extreme caution.
|
||||
|
||||
**Approach:**
|
||||
1. **Be Specific**: Only check specific, relevant variables
|
||||
2. **Avoid Wildcards**: Never do `env` or `printenv` without filtering
|
||||
3. **Sanitize**: Redact values that might be sensitive
|
||||
4. **Public Variables Only**: Prefer checking well-known, non-sensitive variables
|
||||
|
||||
**Safe Environment Variables:**
|
||||
|
||||
```bash
|
||||
# Locale and encoding
|
||||
echo $LANG
|
||||
echo $LC_ALL
|
||||
|
||||
# Shell
|
||||
echo $SHELL
|
||||
|
||||
# Path (usually safe, but may reveal usernames)
|
||||
echo $PATH
|
||||
|
||||
# Node environment
|
||||
echo $NODE_ENV
|
||||
|
||||
# Python path
|
||||
echo $PYTHONPATH
|
||||
```
|
||||
|
||||
**Potentially Sensitive Variables:**
|
||||
|
||||
```bash
|
||||
# Requires permission or sanitization
|
||||
# - API keys: $API_KEY, $SECRET_KEY, $TOKEN
|
||||
# - Database URLs: $DATABASE_URL
|
||||
# - Credentials: $USERNAME, $PASSWORD
|
||||
# - Custom app settings: $APP_* variables
|
||||
```
|
||||
|
||||
**Collection Command (filtered):**
|
||||
|
||||
```bash
|
||||
# Safe: Check specific variable
|
||||
echo $LANG
|
||||
|
||||
# Risky: List all variables (AVOID unless necessary)
|
||||
env
|
||||
printenv
|
||||
|
||||
# Better: Filter for specific patterns
|
||||
env | grep -i "^PYTHON"
|
||||
env | grep -i "^NODE"
|
||||
```
|
||||
|
||||
### 6. Network and Connectivity
|
||||
|
||||
**What to Collect:**
|
||||
- Network connectivity status
|
||||
- DNS resolution (for external services)
|
||||
- Proxy settings
|
||||
- Firewall status (general)
|
||||
|
||||
**Collection Commands:**
|
||||
|
||||
```bash
|
||||
# Test connectivity
|
||||
ping -c 4 google.com # Linux/Mac
|
||||
ping -n 4 google.com # Windows
|
||||
|
||||
# DNS resolution
|
||||
nslookup example.com
|
||||
dig example.com # Linux/Mac
|
||||
|
||||
# Proxy settings (may contain credentials - sanitize)
|
||||
echo $HTTP_PROXY
|
||||
echo $HTTPS_PROXY
|
||||
|
||||
# Network interfaces (general info)
|
||||
ifconfig # Linux/Mac
|
||||
ipconfig # Windows
|
||||
|
||||
# Firewall status (general)
|
||||
sudo ufw status # Linux (Ubuntu)
|
||||
netsh advfirewall show allprofiles # Windows (requires admin)
|
||||
```
|
||||
|
||||
**Privacy Notes:**
|
||||
- Internal IP addresses are generally low-risk
|
||||
- Proxy settings may contain authentication credentials (sanitize)
|
||||
- Network topology might be sensitive in enterprise environments
|
||||
|
||||
### 7. File System and Permissions
|
||||
|
||||
**What to Collect:**
|
||||
- File existence and permissions (for files mentioned in error)
|
||||
- Directory structure (limited)
|
||||
- Disk space (if relevant)
|
||||
|
||||
**Collection Commands:**
|
||||
|
||||
```bash
|
||||
# File info (request permission if non-system file)
|
||||
ls -la /path/to/file # Linux/Mac
|
||||
dir /path/to/file # Windows
|
||||
|
||||
# Permissions
|
||||
stat /path/to/file # Linux/Mac
|
||||
|
||||
# Disk space
|
||||
df -h # Linux/Mac
|
||||
wmic logicaldisk get size,freespace,caption # Windows
|
||||
|
||||
# Check if file exists
|
||||
test -f /path/to/file && echo "exists" || echo "not found"
|
||||
```
|
||||
|
||||
**Privacy Notes:**
|
||||
- File paths may reveal usernames or project structure (request permission)
|
||||
- Avoid listing directory contents unless necessary
|
||||
|
||||
## Collection Workflow
|
||||
|
||||
### Step 1: Assess Relevance
|
||||
|
||||
Before collecting any information, ask:
|
||||
- Is this directly related to the error?
|
||||
- Will this information help diagnose or resolve the issue?
|
||||
- Is there a less invasive way to get the same information?
|
||||
|
||||
### Step 2: Categorize Sensitivity
|
||||
|
||||
Classify the information:
|
||||
- **Public**: Widely available, non-sensitive (e.g., OS version)
|
||||
- **Private**: User-specific but non-confidential (e.g., package versions)
|
||||
- **Confidential**: May contain sensitive data (e.g., config files)
|
||||
- **Secret**: Credentials, keys, PII (NEVER collect without explicit permission)
|
||||
|
||||
### Step 3: Request Permission When Needed
|
||||
|
||||
For private or confidential information:
|
||||
|
||||
```
|
||||
"To diagnose this error, I need to check [specific information].
|
||||
This will involve [specific action].
|
||||
Is it okay to proceed?"
|
||||
```
|
||||
|
||||
Example:
|
||||
```
|
||||
"To diagnose this database connection error, I need to check your database
|
||||
configuration settings. This will involve reading the config/database.yml file.
|
||||
Any sensitive values will be redacted. Is it okay to proceed?"
|
||||
```
|
||||
|
||||
### Step 4: Collect and Sanitize
|
||||
|
||||
Execute the collection command and immediately sanitize:
|
||||
|
||||
1. **Capture output**
|
||||
2. **Review for sensitive data**
|
||||
3. **Redact or replace sensitive values**
|
||||
4. **Document what was redacted**
|
||||
|
||||
### Step 5: Document Collection
|
||||
|
||||
Record what was collected and why:
|
||||
- What information was gathered
|
||||
- Why it was needed
|
||||
- What commands were used
|
||||
- What was sanitized
|
||||
|
||||
## Sanitization Techniques
|
||||
|
||||
### Pattern-Based Redaction
|
||||
|
||||
Common patterns to redact:
|
||||
|
||||
```bash
|
||||
# API keys (various formats)
|
||||
AIza[0-9A-Za-z_-]{35}          # Google API keys
|
||||
sk_live_[0-9a-zA-Z]{24} # Stripe keys
|
||||
[0-9a-f]{32} # Generic 32-char hex keys
|
||||
|
||||
# Email addresses
|
||||
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
|
||||
|
||||
# URLs with credentials
|
||||
https?://[^:]+:[^@]+@[^/]+
|
||||
|
||||
# IP addresses (if needed)
|
||||
\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
|
||||
|
||||
# File paths with usernames
|
||||
/home/[^/]+/ -> /home/[USERNAME]/
|
||||
C:\\Users\\[^\\]+\\ -> C:\Users\[USERNAME]\
|
||||
```
|
||||
|
||||
### Replacement Strategies
|
||||
|
||||
```bash
|
||||
# Replace with generic placeholder
|
||||
password: super_secret_123 → password: [REDACTED]
|
||||
|
||||
# Replace with type indicator
|
||||
api_key: sk_live_abc123xyz → api_key: [API_KEY]
|
||||
|
||||
# Partial redaction
|
||||
email: john.doe@example.com → email: [***]@example.com
|
||||
|
||||
# Anonymize paths
|
||||
/home/john/project → /home/[USER]/project
|
||||
```
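
These rules can be applied programmatically before output is recorded. A minimal sketch; the regexes and placeholders below are illustrative examples drawn from the patterns above, not a vetted or complete rule set:

```python
import re

# Illustrative redaction rules; extend and review before relying on them.
REDACTIONS = [
    (re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"), "[EMAIL]"),
    (re.compile(r"https?://[^:\s]+:[^@\s]+@\S+"), "[URL_WITH_CREDENTIALS]"),
    (re.compile(r"(?i)(password|api_key|token)\s*[:=]\s*\S+"), r"\1: [REDACTED]"),
    (re.compile(r"/home/[^/\s]+"), "/home/[USERNAME]"),
]

def sanitize(text):
    """Apply each redaction rule in order and return the sanitized text."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

print(sanitize("password: super_secret_123 for john.doe@example.com in /home/john/app"))
# password: [REDACTED] for [EMAIL] in /home/[USERNAME]/app
```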
|
||||
|
||||
## Quick Reference: Collection Decision Tree
|
||||
|
||||
```
|
||||
Need environment information?
|
||||
↓
|
||||
Is it sensitive or user-specific?
|
||||
├─ NO → Collect directly
|
||||
│ (e.g., OS version, Python version)
|
||||
│
|
||||
└─ YES → Does it contain credentials or PII?
|
||||
├─ YES → Request explicit permission
|
||||
│ ↓
|
||||
│ Permission granted?
|
||||
│ ├─ YES → Collect and sanitize
|
||||
│ └─ NO → Find alternative approach
|
||||
│
|
||||
└─ NO → Is it project-specific?
|
||||
├─ YES → Request permission
|
||||
└─ NO → Collect and sanitize proactively
|
||||
```
|
||||
|
||||
## Common Scenarios
|
||||
|
||||
### Scenario 1: Module Not Found Error
|
||||
|
||||
**Information Needed:**
|
||||
- Python version
|
||||
- pip version
|
||||
- Virtual environment status
|
||||
- Package installation status
|
||||
|
||||
**Collection:**
|
||||
```bash
|
||||
python --version
|
||||
pip --version
|
||||
echo $VIRTUAL_ENV
|
||||
pip show <package-name>
|
||||
```
|
||||
|
||||
**Privacy Impact:** Low (all public information)
|
||||
|
||||
### Scenario 2: Database Connection Error
|
||||
|
||||
**Information Needed:**
|
||||
- Database client version
|
||||
- Connection configuration (sanitized)
|
||||
- Network connectivity
|
||||
|
||||
**Collection:**
|
||||
```bash
|
||||
# Client version (safe)
|
||||
psql --version # PostgreSQL
|
||||
mysql --version # MySQL
|
||||
|
||||
# Configuration (REQUIRES PERMISSION)
|
||||
# Request permission, then read config with sanitization
|
||||
|
||||
# Connectivity (safe)
|
||||
ping -c 4 database.host.com
|
||||
nslookup database.host.com
|
||||
```
|
||||
|
||||
**Privacy Impact:** Medium-High (config contains credentials)
|
||||
|
||||
### Scenario 3: Build Failure
|
||||
|
||||
**Information Needed:**
|
||||
- Compiler/build tool version
|
||||
- System libraries
|
||||
- Build configuration
|
||||
|
||||
**Collection:**
|
||||
```bash
|
||||
# Build tools (safe)
|
||||
gcc --version
|
||||
make --version
|
||||
cmake --version
|
||||
|
||||
# Package manager (safe)
|
||||
apt list --installed | grep <lib-name> # Debian/Ubuntu
|
||||
brew info <lib-name> # macOS
|
||||
|
||||
# Build config (request permission for project-specific)
|
||||
cat CMakeLists.txt
|
||||
cat Makefile
|
||||
```
|
||||
|
||||
**Privacy Impact:** Low-Medium (build config might reveal project details)
|
||||
|
||||
## Best Practices Summary
|
||||
|
||||
1. **Collect Minimally**: Only gather what's directly relevant
|
||||
2. **Request Permission**: When information is user-specific or potentially sensitive
|
||||
3. **Sanitize Proactively**: Remove credentials and PII before recording
|
||||
4. **Document Purpose**: Explain why information is needed
|
||||
5. **Validate Necessity**: Double-check if collection is truly required
|
||||
6. **Use Specific Commands**: Avoid broad commands like `env` or `find /`
|
||||
7. **Respect User Privacy**: When uncertain, err on the side of asking permission
|
||||
8. **Provide Context**: Help users understand what information will be accessed
|
||||
|
||||
## Red Flags: Never Collect
|
||||
|
||||
- Raw credential files (.env, credentials.json)
|
||||
- Browser cookies or session storage
|
||||
- SSH keys or SSL certificates
|
||||
- Database dumps
|
||||
- Full process listings (might expose arguments with credentials)
|
||||
- Complete environment variable dumps
|
||||
- User home directory listings
|
||||
- Git repository contents (without permission)
|
||||
- Application logs (without permission and sanitization)
|
||||
@@ -0,0 +1,344 @@
|
||||
# Error Template Pattern Recognition
|
||||
|
||||
This guide explains how to identify and extract error message templates for effective web searching and pattern matching.
|
||||
|
||||
## Why Extract Templates?
|
||||
|
||||
SDK and API errors typically follow fixed templates with variable components. Searching for the full error (including variables like file paths, user inputs, or IDs) rarely yields useful results. Extracting the template allows finding relevant discussions and solutions that apply to your specific case.
|
||||
|
||||
## Template Extraction Process
|
||||
|
||||
### Step 1: Identify the Error Structure
|
||||
|
||||
Most errors follow this general structure:
|
||||
|
||||
```
|
||||
[Error Type/Class]: [Core Message] [Additional Context] [Variable Details]
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```
|
||||
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/data.csv'
|
||||
↑ ↑ ↑
|
||||
Error Code Core Message Variable (Path)
|
||||
```
|
||||
|
||||
### Step 2: Categorize Components
|
||||
|
||||
Classify each part of the error message:
|
||||
|
||||
#### Fixed Components (Keep These)
|
||||
- Error type/class names (e.g., `ValueError`, `TypeError`, `ConnectionError`)
|
||||
- Standard error codes (e.g., `[Errno 2]`, `HTTP 404`, `ENOENT`)
|
||||
- Core message structure (e.g., "No such file or directory")
|
||||
- Standard library/SDK function names
|
||||
- Generic parameter names in templates (e.g., "expected {type}, got {type}")
|
||||
|
||||
#### Variable Components (Remove These)
|
||||
- File system paths: `/home/user/project/file.py`, `C:\Users\Name\file.txt`
|
||||
- URLs: `https://api.example.com/endpoint`
|
||||
- User inputs: `'user_entered_value'`, `"some string"`
|
||||
- Identifiers: UUIDs, database IDs, session tokens
|
||||
- Timestamps: `2024-01-15 10:30:45`, `1642234567`
|
||||
- Usernames/emails: `john@example.com`, `user123`
|
||||
- Line numbers: `line 42` (unless part of standard template)
|
||||
- Function call arguments: `function(arg1='value', arg2=123)`
|
||||
- IP addresses and ports: `192.168.1.1:8080`
|
||||
|
||||
### Step 3: Extract the Template
|
||||
|
||||
Remove all variable components while preserving the error structure.
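
A rough first pass at this can be automated. The sketch below strips the most common variable components with heuristic regexes; it is illustrative only and will not handle every error format:

```python
import re

# Heuristic template extraction: strip the variable components listed above
# (URLs, addresses, paths, quoted values, line/column locations) and keep the
# fixed error structure.
def extract_template(error_message):
    text = error_message
    text = re.sub(r"https?://\S+", " ", text)                           # URLs
    text = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b", " ", text)   # IPs / ports
    text = re.sub(r"(?:[A-Za-z]:)?(?:[\\/][\w.\-]+){2,}", " ", text)    # file paths
    text = re.sub(r"'[^']*'|\"[^\"]*\"", " ", text)                     # quoted values
    text = re.sub(r"\b(?:at )?line \d+\b|\bcolumn \d+\b", " ", text)    # locations
    return " ".join(text.split())

print(extract_template(
    "FileNotFoundError: [Errno 2] No such file or directory: '/home/user/data.csv'"
))
# FileNotFoundError: [Errno 2] No such file or directory:
```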
|
||||
|
||||
## Template Extraction Examples
|
||||
|
||||
### Python Errors
|
||||
|
||||
#### Example 1: Import Error
|
||||
```
|
||||
Original:
|
||||
ModuleNotFoundError: No module named 'pandas'
|
||||
|
||||
Template:
|
||||
ModuleNotFoundError No module named
|
||||
|
||||
Reasoning:
|
||||
- 'ModuleNotFoundError': Error class (keep)
|
||||
- 'No module named': Core message (keep)
|
||||
- 'pandas': Specific module name (remove - variable)
|
||||
```
|
||||
|
||||
#### Example 2: Type Error
|
||||
```
|
||||
Original:
|
||||
TypeError: unsupported operand type(s) for +: 'int' and 'str' at line 42 in /home/user/script.py
|
||||
|
||||
Template:
|
||||
TypeError unsupported operand type(s) for
|
||||
|
||||
Reasoning:
|
||||
- 'TypeError': Error class (keep)
|
||||
- 'unsupported operand type(s) for': Core message (keep)
|
||||
- '+: 'int' and 'str'': Specific types and operator (remove - variable)
|
||||
- 'at line 42': Line number (remove - variable)
|
||||
- '/home/user/script.py': File path (remove - variable)
|
||||
```
|
||||
|
||||
#### Example 3: Value Error
|
||||
```
|
||||
Original:
|
||||
ValueError: invalid literal for int() with base 10: 'abc'
|
||||
|
||||
Template:
|
||||
ValueError invalid literal for int() with base 10
|
||||
|
||||
Reasoning:
|
||||
- 'ValueError': Error class (keep)
|
||||
- 'invalid literal for int() with base 10': Standard message (keep)
|
||||
- 'abc': User input value (remove - variable)
|
||||
```
|
||||
|
||||
### JavaScript/Node.js Errors
|
||||
|
||||
#### Example 4: Reference Error
|
||||
```
|
||||
Original:
|
||||
ReferenceError: myVariable is not defined at Object.<anonymous> (/home/user/project/app.js:15:5)
|
||||
|
||||
Template:
|
||||
ReferenceError is not defined
|
||||
|
||||
Reasoning:
|
||||
- 'ReferenceError': Error type (keep)
|
||||
- 'is not defined': Core message (keep)
|
||||
- 'myVariable': Specific variable name (remove - variable)
|
||||
- Location information: (remove - variable)
|
||||
```
|
||||
|
||||
#### Example 5: Network Error
|
||||
```
|
||||
Original:
|
||||
Error: connect ECONNREFUSED 127.0.0.1:3000 at TCPConnectWrap.afterConnect [as oncomplete]
|
||||
|
||||
Template:
|
||||
Error connect ECONNREFUSED
|
||||
|
||||
Reasoning:
|
||||
- 'Error': Error type (keep)
|
||||
- 'connect': Operation (keep)
|
||||
- 'ECONNREFUSED': Standard error code (keep)
|
||||
- '127.0.0.1:3000': IP and port (remove - variable)
|
||||
- Stack trace info: (remove - variable)
|
||||
```
|
||||
|
||||
### HTTP/API Errors
|
||||
|
||||
#### Example 6: HTTP Error
|
||||
```
|
||||
Original:
|
||||
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.example.com/v1/users/12345
|
||||
|
||||
Template:
|
||||
requests.exceptions.HTTPError 404 Client Error Not Found
|
||||
|
||||
Reasoning:
|
||||
- 'requests.exceptions.HTTPError': Exception class (keep)
|
||||
- '404': Standard HTTP status code (keep)
|
||||
- 'Client Error: Not Found': Standard HTTP message (keep)
|
||||
- URL: (remove - variable)
|
||||
```
|
||||
|
||||
#### Example 7: Connection Error
|
||||
```
|
||||
Original:
|
||||
requests.exceptions.ConnectionError: HTTPConnectionPool(host='api.example.com', port=443): Max retries exceeded with url: /v1/users (Caused by NewConnectionError)
|
||||
|
||||
Template:
|
||||
requests.exceptions.ConnectionError HTTPConnectionPool Max retries exceeded
|
||||
|
||||
Reasoning:
|
||||
- Exception class (keep)
|
||||
- 'HTTPConnectionPool', 'Max retries exceeded': Core message components (keep)
|
||||
- Host, port, URL: (remove - variable)
|
||||
- Specific cause details: (remove - variable)
|
||||
```
|
||||
|
||||
### Database Errors
|
||||
|
||||
#### Example 8: SQL Error
|
||||
```
|
||||
Original:
|
||||
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "users_email_key" DETAIL: Key (email)=(john@example.com) already exists.
|
||||
|
||||
Template:
|
||||
psycopg2.errors.UniqueViolation duplicate key value violates unique constraint
|
||||
|
||||
Reasoning:
|
||||
- Error class (keep)
|
||||
- Core message structure (keep)
|
||||
- Constraint name, field value: (remove - variable)
|
||||
```
|
||||
|
||||
### File System Errors
|
||||
|
||||
#### Example 9: Permission Error
|
||||
```
|
||||
Original:
|
||||
PermissionError: [Errno 13] Permission denied: '/var/log/app.log'
|
||||
|
||||
Template:
|
||||
PermissionError [Errno 13] Permission denied
|
||||
|
||||
Reasoning:
|
||||
- Error type (keep)
|
||||
- Error code (keep)
|
||||
- Core message (keep)
|
||||
- File path: (remove - variable)
|
||||
```
|
||||
|
||||
### Package Manager Errors
|
||||
|
||||
#### Example 10: NPM Error
|
||||
```
|
||||
Original:
|
||||
npm ERR! code ERESOLVE
|
||||
npm ERR! ERESOLVE unable to resolve dependency tree for myproject@1.0.0
|
||||
|
||||
Template:
|
||||
npm ERR! code ERESOLVE unable to resolve dependency tree
|
||||
|
||||
Reasoning:
|
||||
- Error prefix and code (keep)
|
||||
- Core message (keep)
|
||||
- Package name and version: (remove - variable)
|
||||
```
|
||||
|
||||
## Common Patterns by Language/Framework
|
||||
|
||||
### Python Standard Templates
|
||||
- `[ErrorType]: [Message]`
|
||||
- `[ErrorType]: [Message] at line [number] in [file]`
|
||||
- `[Module].[ErrorType]: [Message]`
|
||||
|
||||
### JavaScript/Node Standard Templates
|
||||
- `[ErrorType]: [Message] at [Location]`
|
||||
- `Error: [Operation] [ErrorCode]`
|
||||
- `Unhandled [PromiseRejection/Error]: [Message]`
|
||||
|
||||
### HTTP/REST API Templates
|
||||
- `[StatusCode] [StatusMessage]: [Description]`
|
||||
- `[Library].[Exception]: [StatusCode] [Message]`
|
||||
|
||||
### Database Templates
|
||||
- `[Driver].[ErrorType]: [Message]`
|
||||
- `[Database] Error [Code]: [Message]`
|
||||
|
||||
## Template Quality Checklist
|
||||
|
||||
A good error template should:
|
||||
|
||||
✓ Be searchable (yields relevant results on Stack Overflow/GitHub)
|
||||
✓ Be generic (applies to multiple specific instances of the error)
|
||||
✓ Retain error classification (type/class name)
|
||||
✓ Preserve standard error codes
|
||||
✓ Remove all user-specific or environment-specific details
|
||||
✓ Be concise (typically 3-8 words)
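
These criteria can be partially automated as a sanity pass before searching. A minimal sketch with heuristic rules mirroring the checklist (not a substitute for the search test below):

```python
def template_warnings(template):
    """Heuristic warnings based on the checklist above; an empty list is a good sign."""
    warnings = []
    words = template.split()
    if not 3 <= len(words) <= 8:
        warnings.append("usually 3-8 words works best")
    if "/" in template or "\\" in template:
        warnings.append("still contains what looks like a path")
    if "'" in template or '"' in template:
        warnings.append("still contains a quoted user value")
    if not any(w.rstrip(":").endswith(("Error", "Exception")) or w.isupper() for w in words):
        warnings.append("no obvious error class or error code retained")
    return warnings

print(template_warnings("ModuleNotFoundError No module named"))  # []
```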
|
||||
|
||||
## Testing Your Template
|
||||
|
||||
After extracting a template, validate it:
|
||||
|
||||
1. **Search Test**: Search the template on Stack Overflow or Google
|
||||
- Good template: Returns relevant discussions about the error type
|
||||
- Bad template: Returns no results or overly specific results
|
||||
|
||||
2. **Generalization Test**: Would this template match similar errors from other users?
|
||||
- Good template: Yes, it matches the general pattern
|
||||
- Bad template: No, it's too specific to your case
|
||||
|
||||
3. **Specificity Test**: Is the template specific enough to be useful?
|
||||
- Good template: Identifies a specific error condition
|
||||
- Bad template: Too vague (e.g., just "Error")
|
||||
|
||||
## Advanced: Multi-Error Messages
|
||||
|
||||
Some errors contain multiple nested errors or stack traces:
|
||||
|
||||
### Example: Nested Errors
|
||||
```
|
||||
Original:
|
||||
RuntimeError: Failed to initialize module
|
||||
Caused by: ImportError: cannot import name 'foo' from 'bar' (/path/to/bar.py)
|
||||
Caused by: AttributeError: module 'bar' has no attribute 'foo'
|
||||
|
||||
Template Approach:
|
||||
- Extract primary error: "RuntimeError Failed to initialize module"
|
||||
- Extract root cause: "AttributeError module has no attribute"
|
||||
- Search both templates for comprehensive results
|
||||
```
|
||||
|
||||
**Strategy**: Extract templates for both the outer error and the root cause, as either may lead to relevant solutions.
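
Splitting the chain can be scripted before template extraction. A minimal sketch that separates the outer error from each `Caused by:` layer:

```python
import re

# Separate the outer error from each "Caused by:" layer so templates can be
# extracted for both the surface error and the root cause.
def split_error_chain(report):
    parts = re.split(r"\n\s*Caused by:\s*", report.strip())
    return parts  # parts[0] is the outer error, parts[-1] is the root cause

report = """RuntimeError: Failed to initialize module
Caused by: ImportError: cannot import name 'foo' from 'bar' (/path/to/bar.py)
Caused by: AttributeError: module 'bar' has no attribute 'foo'"""

for layer in split_error_chain(report):
    print(layer)
```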
|
||||
|
||||
## Quick Reference: Variable Component Checklist
|
||||
|
||||
Remove these from error messages:
|
||||
- [ ] File paths (absolute or relative)
|
||||
- [ ] URLs and domains
|
||||
- [ ] User inputs and arguments
|
||||
- [ ] Database record IDs
|
||||
- [ ] Session tokens and API keys
|
||||
- [ ] Timestamps and dates
|
||||
- [ ] IP addresses and ports
|
||||
- [ ] Usernames and emails
|
||||
- [ ] Line numbers (usually)
|
||||
- [ ] Stack trace locations
|
||||
- [ ] Variable names (user-defined)
|
||||
- [ ] Specific package versions (unless relevant to breaking changes)
|
||||
|
||||
Keep these in templates:
|
||||
- [x] Error type/class names
|
||||
- [x] Standard error codes
|
||||
- [x] Core message structure
|
||||
- [x] Standard library function names
|
||||
- [x] HTTP status codes
|
||||
- [x] Generic type names (when part of template)
|
||||
- [x] Standard operations (e.g., "connect", "read", "write")
|
||||
|
||||
## Practice Examples
|
||||
|
||||
Try extracting templates from these errors:
|
||||
|
||||
### Exercise 1
|
||||
```
|
||||
JSONDecodeError: Expecting value: line 1 column 1 (char 0) in /home/user/data.json
|
||||
```
|
||||
|
||||
<details>
|
||||
<summary>Answer</summary>
|
||||
Template: `JSONDecodeError Expecting value`
|
||||
|
||||
Reasoning: Remove line/column numbers and file path; keep error type and core message.
|
||||
</details>
|
||||
|
||||
### Exercise 2
|
||||
```
|
||||
AssertionError: Expected response status 200, got 404 for endpoint /api/users
|
||||
```
|
||||
|
||||
<details>
|
||||
<summary>Answer</summary>
|
||||
Template: `AssertionError Expected response status got`
|
||||
|
||||
Reasoning: Remove specific status codes and endpoint; keep error type and message structure.
|
||||
</details>
|
||||
|
||||
### Exercise 3
|
||||
```
|
||||
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not connect to server: Connection refused on port 5432
|
||||
```
|
||||
|
||||
<details>
|
||||
<summary>Answer</summary>
|
||||
Template: `sqlalchemy.exc.OperationalError could not connect to server Connection refused`
|
||||
|
||||
Reasoning: Keep full error class path and core message; remove port number.
|
||||
</details>
|
||||
@@ -0,0 +1,120 @@
|
||||
# Claude Assistant Guidelines
|
||||
|
||||
## Problem-Solving Mindset: Bold Hypotheses, Careful Verification
|
||||
|
||||
### Core Principles
|
||||
|
||||
When encountering technical problems or errors, follow this disciplined approach:
|
||||
|
||||
1. **Bold Hypotheses** (大膽提出假說)
|
||||
- Generate multiple competing hypotheses to explain the problem
|
||||
- Don't be afraid to speculate about root causes
|
||||
- Cast a wide net - consider obvious and non-obvious possibilities
|
||||
|
||||
2. **Careful Verification** (小心求證)
|
||||
- Rigorously verify each hypothesis with concrete evidence
|
||||
- Actively search for counter-evidence that disproves your hypotheses
|
||||
- Distinguish between official documentation, user reports, and speculation
|
||||
- Check for exact matches (e.g., model names, version numbers) - similar ≠ identical
|
||||
|
||||
3. **Challenge Your Own Reasoning** (挑戰自己的推論)
|
||||
- Ask: "What would disprove this hypothesis?"
|
||||
- Look for successful counter-examples
|
||||
- Identify weak points in your evidence chain
|
||||
- Question assumptions and implicit leaps in logic
|
||||
|
||||
4. **Acknowledge Uncertainty** (承認不確定性)
|
||||
- It's better to say "I don't know" than to pretend certainty
|
||||
- Present confidence levels for each hypothesis
|
||||
- Avoid premature conclusions when evidence is incomplete
|
||||
- Be explicit about what is proven vs. what is speculation
|
||||
|
||||
5. **Design Experiments** (設計實驗)
|
||||
- Propose controlled experiments to distinguish between hypotheses
|
||||
- Order experiments by information gain and cost
|
||||
- Change one variable at a time
|
||||
- Collect diagnostic information before making changes
|
||||
|
||||
---
|
||||
|
||||
### Case Study: Gemini Live API Function Calling Investigation
|
||||
|
||||
**Context**: POC test script immediately closed connection with no error messages or responses.
|
||||
|
||||
#### ❌ Initial Mistakes
|
||||
|
||||
**Mistake 1: Over-interpreting vague language**
|
||||
- Saw: "Native audio models have limited tool use support"
|
||||
- Jumped to: "This is why our connection closed!"
|
||||
- Problem: "Limited" ≠ "doesn't work" or "causes connection closure"
|
||||
|
||||
**Mistake 2: Conflating user reports with official statements**
|
||||
- Found: GitHub Issues reporting function calling problems
|
||||
- Concluded: "Native audio doesn't support function calling"
|
||||
- Problem: User reports are bug reports, not design documentation. Issues were still OPEN/unresolved.
|
||||
|
||||
**Mistake 3: Assuming name similarity = identity**
|
||||
- Issue mentioned: `gemini-2.5-flash-preview-native-audio-dialog`
|
||||
- We used: `gemini-2.5-flash-native-audio-preview-09-2025`
|
||||
- Assumed: Same model, same limitations
|
||||
- Reality: Different releases (May vs September); the latter has "improved function calling"
|
||||
|
||||
**Mistake 4: Not seeking counter-evidence**
|
||||
- Never searched for: "Working examples of gemini-2.5-flash-native-audio-preview-09-2025 with function calling"
|
||||
- Never checked: Official model capability table
|
||||
- Result: Missed official documentation stating model DOES support function calling
|
||||
|
||||
**Mistake 5: Skipping other possible causes**
|
||||
- Jumped to "function calling incompatibility"
|
||||
- Ignored: API key issues, SDK version mismatches, config format errors, network issues, quota limits, etc.
|
||||
|
||||
#### ✅ Corrected Approach
|
||||
|
||||
**Step 1: Verify exact capabilities**
|
||||
- Searched official docs specifically for our model name
|
||||
- Found: Official table explicitly states function calling IS supported
|
||||
- Conclusion: Hypothesis 3 was DISPROVEN by official source
|
||||
|
||||
**Step 2: Search for counter-examples**
|
||||
- Looked for successful usage examples
|
||||
- Result: Found claims of support but NO complete working examples
|
||||
- Insight: Documentation says it works, but no proof in the wild
|
||||
|
||||
**Step 3: Acknowledge limitations of evidence**
|
||||
- Admitted: Can't prove function calling is the issue
|
||||
- Admitted: Can't prove model switch will fix it
|
||||
- Admitted: Don't know root cause without more diagnostics
|
||||
|
||||
**Step 4: Propose experiments**
|
||||
- Collect error messages (close code/reason)
|
||||
- Test same model without tools (isolate variable)
|
||||
- Test different model with tools (isolate variable)
|
||||
- Replicate official examples exactly
|
||||
|
||||
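A minimal sketch of how the "with tools vs. without tools" experiment might be wired up. It uses the generic `ws` package rather than the actual Gemini Live SDK; the endpoint, setup payload, and tool definition below are placeholders, not the real API surface:

```javascript
// isolation-experiment.js (illustrative sketch only)
// Runs the same connection twice (without tools, then with tools) and records
// the close code/reason for each, so the failing variable can be isolated.
const WebSocket = require('ws');

const ENDPOINT = 'wss://example.invalid/live'; // placeholder endpoint

function setupMessage(withTools) {
  // Placeholder payload; substitute the real setup message for your API.
  return JSON.stringify({ setup: { tools: withTools ? [{ name: 'demo_tool' }] : [] } });
}

function runOnce(label, withTools) {
  return new Promise((resolve) => {
    const ws = new WebSocket(ENDPOINT);
    ws.on('open', () => ws.send(setupMessage(withTools)));
    ws.on('message', (data) => console.log(`[${label}] message:`, data.toString().slice(0, 120)));
    ws.on('error', (err) => console.log(`[${label}] error:`, err.message));
    ws.on('close', (code, reason) => {
      console.log(`[${label}] close code: ${code}, reason: ${reason.toString()}`);
      resolve();
    });
  });
}

(async () => {
  await runOnce('without tools', false);
  await runOnce('with tools', true);
})();
```

Running both variants back to back makes the close codes directly comparable, which is exactly the kind of first-hand evidence the corrected approach calls for.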
---
|
||||
|
||||
### Checklist for Technical Investigations
|
||||
|
||||
Before presenting conclusions, verify:
|
||||
|
||||
- [ ] Have I checked official documentation for ALL components/versions involved?
|
||||
- [ ] Are my evidence sources exact matches (names, versions, dates)?
|
||||
- [ ] Have I actively searched for counter-evidence?
|
||||
- [ ] Can I distinguish between "official statement" vs "user report" vs "my inference"?
|
||||
- [ ] Have I listed what I DON'T know or can't prove?
|
||||
- [ ] Have I proposed experiments to test competing hypotheses?
|
||||
- [ ] Am I stating confidence levels appropriately?
|
||||
|
||||
---
|
||||
|
||||
## Application to Error Troubleshooting
|
||||
|
||||
Apply these principles when using the error-troubleshooter skill:
|
||||
|
||||
1. **Generate Multiple Theories**: Don't fixate on the first explanation
|
||||
2. **Verify with Official Sources**: Distinguish documentation from user reports
|
||||
3. **Check Exact Matches**: Version numbers, model names, and configurations must match exactly
|
||||
4. **Search for Counter-Evidence**: Look for cases where the theory should fail but doesn't
|
||||
5. **Design Targeted Experiments**: Isolate variables to test specific hypotheses
|
||||
|
||||
This disciplined approach prevents premature conclusions and leads to more reliable solutions.
|
||||
@@ -0,0 +1,835 @@
|
||||
# Systematic Debugging Methodology
|
||||
|
||||
## Guiding Principle: Occam's Razor for Debugging
|
||||
|
||||
**When facing mysterious errors, the root cause is almost always simpler than it appears.**
|
||||
|
||||
Common error distribution in real-world debugging:
|
||||
- **70%**: Configuration issues (wrong parameters, missing flags, incorrect paths)
|
||||
- **20%**: Environment issues (missing dependencies, version mismatches, path problems)
|
||||
- **8%**: Data format issues (encoding, structure assumptions)
|
||||
- **2%**: Actual bugs in external APIs or fundamental incompatibilities
|
||||
|
||||
**Core Rule**: Exhaust simple hypotheses before considering complex ones. Authentication failures are configuration problems 99% of the time, not API design flaws.
|
||||
|
||||
---
|
||||
|
||||
## 1. Hypothesis Priority Framework
|
||||
|
||||
### Always Start Here: The "Boring Checklist"
|
||||
|
||||
Before investigating complex hypotheses, verify these fundamentals:
|
||||
|
||||
1. **Configuration Loading**
|
||||
- Are environment variables actually loaded? (Not just "file exists")
|
||||
- Are config files in the correct location relative to execution path?
|
||||
- Are there hidden default parameters overriding your explicit config?
|
||||
|
||||
2. **Exact Version/Name Matching**
|
||||
- Library versions exact match with documentation examples?
|
||||
- API endpoint names exactly correct (not similar, not partially matching)?
|
||||
- Model/service names character-perfect? (e.g., `preview-09-2025` ≠ `preview-2025-09`)
|
||||
|
||||
3. **First-Hand Error Visibility**
|
||||
- Have you personally executed the failing code?
|
||||
- Have you seen the complete error output (not summarized or truncated)?
|
||||
- Are there error details hidden in non-obvious places (WebSocket close codes, HTTP headers)?
|
||||
|
||||
### Hypothesis Ordering Template
|
||||
|
||||
```markdown
|
||||
## Problem: [Description]
|
||||
|
||||
### Priority 1: Configuration Issues (Check First)
|
||||
- [ ] Hypothesis 1a: Config file not loaded from expected path
|
||||
Verification: Add logging immediately after config load
|
||||
- [ ] Hypothesis 1b: SDK using unexpected default parameters
|
||||
Verification: Log all parameters passed to SDK constructor
|
||||
- [ ] Hypothesis 1c: Authentication credentials format issue
|
||||
Verification: Check credential string length, prefix, encoding
|
||||
|
||||
### Priority 2: Environment Issues
|
||||
- [ ] Hypothesis 2a: Dependency version mismatch
|
||||
Verification: Check package.json vs installed versions
|
||||
- [ ] Hypothesis 2b: Runtime environment differences
|
||||
Verification: Compare working vs failing environment variables
|
||||
|
||||
### Priority 3: Data Format Issues
|
||||
- [ ] Hypothesis 3a: Incorrect data structure assumptions
|
||||
Verification: Log raw data structure, don't assume
|
||||
- [ ] Hypothesis 3b: Encoding mismatch (UTF-8, base64, binary)
|
||||
Verification: Inspect first few bytes/characters
|
||||
|
||||
### Priority 4: Complex Issues (Only After Above Exhausted)
|
||||
- [ ] Hypothesis 4a: API limitation or bug
|
||||
Verification: Find official documentation or working examples
|
||||
- [ ] Hypothesis 4b: Fundamental incompatibility
|
||||
Verification: Find counter-examples proving it can work
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Isolation Strategy: Diagnostic Scripts
|
||||
|
||||
**The Most Powerful Debugging Technique**: Create minimal, single-purpose scripts that test ONE thing at a time.
|
||||
|
||||
### Why This Works
|
||||
|
||||
- **Eliminates confounding variables**: Complex scripts have multiple failure points
|
||||
- **Provides clear pass/fail criteria**: If this 10-line script fails, the problem is X
|
||||
- **Builds confidence incrementally**: Each passing diagnostic narrows the search space
|
||||
- **Creates reusable verification tools**: Same scripts can verify fixes
|
||||
|
||||
### Diagnostic Script Principles
|
||||
|
||||
1. **One Variable Per Script**: Test configuration loading, API calls, and data processing in separate scripts
|
||||
2. **Verbose Logging**: Log every step, every value, every assumption
|
||||
3. **Explicit Success Criteria**: Script should clearly output PASS or FAIL
|
||||
4. **No Assumptions**: Check everything explicitly, even "obvious" things
|
||||
|
||||
### Template: Configuration Diagnostic
|
||||
|
||||
```javascript
|
||||
// diagnose-config.js
|
||||
// Purpose: Verify configuration loading and format
|
||||
// Success criteria: All checks pass with ✓
|
||||
|
||||
console.log('=== Configuration Diagnostic ===\n');
|
||||
|
||||
// 1. Environment
|
||||
console.log('--- Environment ---');
|
||||
console.log('Node version:', process.version);
|
||||
console.log('Working directory:', process.cwd());
|
||||
console.log('Script location:', __dirname);
|
||||
|
||||
// 2. Config File Loading
|
||||
console.log('\n--- Config File ---');
|
||||
const fs = require('fs');
|
||||
const path = require('path');
|
||||
|
||||
const configPath = path.join(__dirname, '.env');
|
||||
console.log('Expected path:', configPath);
|
||||
console.log('File exists:', fs.existsSync(configPath));
|
||||
|
||||
if (fs.existsSync(configPath)) {
|
||||
const content = fs.readFileSync(configPath, 'utf-8');
|
||||
console.log('File size:', content.length, 'bytes');
|
||||
console.log('Lines:', content.split('\n').length);
|
||||
}
|
||||
|
||||
// 3. Environment Variable
|
||||
require('dotenv').config({ path: configPath });
|
||||
console.log('\n--- Environment Variable ---');
|
||||
console.log('API_KEY defined:', !!process.env.API_KEY);
|
||||
console.log('API_KEY length:', process.env.API_KEY?.length);
|
||||
console.log('API_KEY prefix:', process.env.API_KEY?.slice(0, 4));
|
||||
console.log('API_KEY has whitespace:', /\s/.test(process.env.API_KEY || ''));
|
||||
|
||||
// 4. Format Validation
|
||||
console.log('\n--- Format Validation ---');
|
||||
const expectedPrefix = 'AIza'; // Example for Google API keys
|
||||
const expectedLength = 39;
|
||||
|
||||
const prefixMatch = process.env.API_KEY?.startsWith(expectedPrefix);
|
||||
const lengthMatch = process.env.API_KEY?.length === expectedLength;
|
||||
|
||||
console.log(prefixMatch ? '✓' : '✗', 'Prefix matches expected format');
|
||||
console.log(lengthMatch ? '✓' : '✗', 'Length matches expected format');
|
||||
|
||||
// 5. SDK Constructor (No Network Call)
|
||||
console.log('\n--- SDK Initialization ---');
|
||||
try {
|
||||
const SDK = require('./sdk'); // Your SDK
|
||||
const client = new SDK({
|
||||
apiKey: process.env.API_KEY,
|
||||
// Log all parameters, including defaults
|
||||
});
|
||||
console.log('✓ SDK initialized without errors');
|
||||
console.log('Client config:', JSON.stringify(client.config, null, 2));
|
||||
} catch (error) {
|
||||
console.log('✗ SDK initialization failed:', error.message);
|
||||
}
|
||||
|
||||
console.log('\n=== Diagnostic Complete ===');
|
||||
```
|
||||
|
||||
### Template: Data Structure Diagnostic
|
||||
|
||||
```javascript
|
||||
// diagnose-data-structure.js
|
||||
// Purpose: Inspect actual data structure without assumptions
|
||||
// Success criteria: Understand exact structure and location of target data
|
||||
|
||||
function inspectObject(obj, path = 'root', maxDepth = 3, currentDepth = 0) {
|
||||
const indent = ' '.repeat(currentDepth);
|
||||
|
||||
console.log(`${indent}${path}:`);
|
||||
console.log(`${indent} Type: ${typeof obj}`);
|
||||
console.log(`${indent} Null: ${obj === null}`);
|
||||
console.log(`${indent} Undefined: ${obj === undefined}`);
|
||||
|
||||
if (obj === null || obj === undefined) return;
|
||||
|
||||
if (typeof obj === 'object') {
|
||||
if (Buffer.isBuffer(obj)) {
|
||||
console.log(`${indent} [Buffer: ${obj.length} bytes]`);
|
||||
console.log(`${indent} First 20 bytes: ${obj.slice(0, 20).toString('hex')}`);
|
||||
return;
|
||||
}
|
||||
|
||||
if (Array.isArray(obj)) {
|
||||
console.log(`${indent} [Array: ${obj.length} items]`);
|
||||
if (obj.length > 0 && currentDepth < maxDepth) {
|
||||
inspectObject(obj[0], `${path}[0]`, maxDepth, currentDepth + 1);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
const keys = Object.keys(obj);
|
||||
console.log(`${indent} Keys: [${keys.join(', ')}]`);
|
||||
|
||||
if (currentDepth < maxDepth) {
|
||||
for (const key of keys.slice(0, 5)) { // Limit to first 5 keys
|
||||
inspectObject(obj[key], `${path}.${key}`, maxDepth, currentDepth + 1);
|
||||
}
|
||||
}
|
||||
} else if (typeof obj === 'string') {
|
||||
console.log(`${indent} Length: ${obj.length}`);
|
||||
console.log(`${indent} Preview: "${obj.slice(0, 50)}${obj.length > 50 ? '...' : ''}"`);
|
||||
|
||||
// Check encoding hints
|
||||
const isBase64Like = /^[A-Za-z0-9+/=]+$/.test(obj);
|
||||
const isHexLike = /^[0-9a-fA-F]+$/.test(obj);
|
||||
console.log(`${indent} Looks like base64: ${isBase64Like}`);
|
||||
console.log(`${indent} Looks like hex: ${isHexLike}`);
|
||||
} else {
|
||||
console.log(`${indent} Value: ${obj}`);
|
||||
}
|
||||
}
|
||||
|
||||
// Usage example with API response
|
||||
const response = getAPIResponse(); // Your actual API call
|
||||
console.log('=== Raw Response Structure ===\n');
|
||||
inspectObject(response);
|
||||
|
||||
// Specific path investigation
|
||||
console.log('\n=== Investigating Suspected Path ===');
|
||||
console.log('response.data exists:', !!response.data);
|
||||
console.log('response.body exists:', !!response.body);
|
||||
console.log('response.payload exists:', !!response.payload);
|
||||
|
||||
// If you're looking for binary data
|
||||
console.log('\n=== Binary Data Search ===');
|
||||
function findBuffers(obj, path = 'root') {
|
||||
if (Buffer.isBuffer(obj)) {
|
||||
console.log(`Found Buffer at: ${path} (${obj.length} bytes)`);
|
||||
return;
|
||||
}
|
||||
if (typeof obj === 'object' && obj !== null) {
|
||||
for (const key in obj) {
|
||||
findBuffers(obj[key], `${path}.${key}`);
|
||||
}
|
||||
}
|
||||
}
|
||||
findBuffers(response);
|
||||
```
|
||||
|
||||
### Template: API Interaction Diagnostic
|
||||
|
||||
```javascript
|
||||
// diagnose-api-minimal.js
|
||||
// Purpose: Minimal API call to isolate authentication from functionality
|
||||
// Success criteria: Connection succeeds, response received (any response)
|
||||
|
||||
console.log('=== Minimal API Test ===\n');
|
||||
|
||||
const API = require('./api-client');
|
||||
|
||||
async function testMinimalConnection() {
|
||||
console.log('1. Creating client...');
|
||||
const client = new API({
|
||||
apiKey: process.env.API_KEY,
|
||||
// Start with absolute minimal config
|
||||
});
|
||||
|
||||
console.log('2. Attempting connection...');
|
||||
try {
|
||||
await client.connect();
|
||||
console.log('✓ Connection established');
|
||||
} catch (error) {
|
||||
console.log('✗ Connection failed:', error.message);
|
||||
console.log('Error code:', error.code);
|
||||
console.log('Error details:', JSON.stringify(error, null, 2));
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
console.log('3. Sending minimal request...');
|
||||
try {
|
||||
// Simplest possible request
|
||||
const response = await client.send({ message: 'ping' });
|
||||
console.log('✓ Response received');
|
||||
console.log('Response type:', typeof response);
|
||||
console.log('Response keys:', Object.keys(response));
|
||||
} catch (error) {
|
||||
console.log('✗ Request failed:', error.message);
|
||||
}
|
||||
|
||||
console.log('4. Closing connection...');
|
||||
await client.close();
|
||||
console.log('✓ Test complete');
|
||||
}
|
||||
|
||||
testMinimalConnection().catch(console.error);
|
||||
```
|
||||
|
||||
### Real-World Example: WebSocket Authentication Mystery
|
||||
|
||||
**Problem**: WebSocket connection immediately closes with code 1007 "API key not valid"
|
||||
|
||||
**❌ Initial Approach** (jumping to complex hypotheses):
|
||||
- "Maybe the API doesn't support this model"
|
||||
- "Maybe the feature flag is disabled for my account"
|
||||
- "Maybe there's a rate limit"
|
||||
|
||||
**✅ Diagnostic Script Approach**:
|
||||
|
||||
```javascript
|
||||
// diagnose-websocket-auth.js
|
||||
console.log('=== WebSocket Auth Diagnostic ===\n');
|
||||
|
||||
// Step 1: Verify API key loading
|
||||
console.log('--- API Key ---');
|
||||
console.log('Loaded:', !!process.env.API_KEY);
|
||||
console.log('Length:', process.env.API_KEY?.length);
|
||||
console.log('Format:', process.env.API_KEY?.slice(0, 4) + '...' + process.env.API_KEY?.slice(-4));
|
||||
|
||||
// Step 2: Check SDK configuration
|
||||
console.log('\n--- SDK Config ---');
|
||||
const sdk = new SDK({
|
||||
apiKey: process.env.API_KEY,
|
||||
});
|
||||
|
||||
// KEY DISCOVERY: Log the actual endpoint being used
|
||||
console.log('Endpoint:', sdk.endpoint); // Revealed: Using Vertex AI endpoint!
|
||||
|
||||
// Step 3: Check SDK defaults
|
||||
console.log('Default vertexai:', sdk.config.vertexai); // Revealed: true by default!
|
||||
|
||||
// Step 4: Test with explicit configuration
|
||||
console.log('\n--- Testing explicit vertexai: false ---');
|
||||
const sdkFixed = new SDK({
|
||||
apiKey: process.env.API_KEY,
|
||||
vertexai: false, // Explicit override
|
||||
});
|
||||
console.log('Endpoint:', sdkFixed.endpoint); // Now using correct endpoint!
|
||||
```
|
||||
|
||||
**Result**: The diagnostic revealed that the SDK defaulted to `vertexai: true`, sending requests to the Vertex AI endpoint instead of the Gemini Developer API endpoint. The fix was a single parameter.
|
||||
|
||||
**Time saved**: This 15-line diagnostic script found the issue in 2 minutes. The alternative (reading SDK source code or trial-and-error config changes) would have taken hours.
|
||||
|
||||
---
|
||||
|
||||
## 3. Comparison with Working Examples
|
||||
|
||||
### Why This Is Critical
|
||||
|
||||
When you have a working example (different language, different version, official sample), it's a **treasure map** showing the correct configuration.
|
||||
|
||||
**Working example exists → Problem is in the differences**
|
||||
|
||||
### Systematic Comparison Process
|
||||
|
||||
1. **Find the Working Example**
|
||||
- Official SDK examples (preferred)
|
||||
- Successful previous implementations in your codebase
|
||||
- Community examples with verified success (check issues/discussions)
|
||||
|
||||
2. **Compare Layer by Layer**
|
||||
````markdown
|
||||
## Comparison Checklist
|
||||
|
||||
### Language/Runtime
|
||||
- [ ] Working: Python 3.11, Failing: Node.js 20
|
||||
- [ ] Any known language-specific issues?
|
||||
|
||||
### SDK Versions
|
||||
- [ ] Working: v2.1.0, Failing: v2.3.1
|
||||
- [ ] Check changelog between versions
|
||||
|
||||
### Configuration Parameters
|
||||
Working:
|
||||
```python
|
||||
client = Client(api_key=key) # Only 1 parameter
|
||||
```
|
||||
|
||||
Failing:
|
||||
```javascript
|
||||
const client = new Client({ apiKey: key, ...manyOtherParams });
|
||||
```
|
||||
- [ ] What are those other params? What are their defaults?
|
||||
|
||||
### API Endpoint
|
||||
- [ ] Working: api.service.com, Failing: ???
|
||||
- [ ] Log actual endpoint used by SDK
|
||||
|
||||
### Request Format
|
||||
- [ ] Compare actual HTTP/WebSocket frames sent (use network inspector)
|
||||
````
|
||||
|
||||
3. **Identify Hidden Differences**
|
||||
|
||||
Common gotchas:
|
||||
- **Default parameters**: JavaScript SDK has `vertexai: true` default, Python doesn't have this parameter
|
||||
- **Authentication methods**: One uses header, another uses query param
|
||||
- **Endpoint URLs**: SDKs may auto-select endpoints based on config
|
||||
- **Retry behavior**: One SDK retries automatically, hiding transient failures
|
||||
|
||||
### Example: Cross-Language Comparison
|
||||
|
||||
**Problem**: Python POC works, JavaScript POC fails with authentication error
|
||||
|
||||
**Comparison**:
|
||||
|
||||
```python
|
||||
# Python (WORKING)
|
||||
import google.generativeai as genai
|
||||
|
||||
genai.configure(api_key=api_key) # Simple, one-line config
|
||||
model = genai.GenerativeModel('gemini-2.5-flash')
|
||||
response = model.generate_content('Hello')
|
||||
```
|
||||
|
||||
```javascript
|
||||
// JavaScript (FAILING)
|
||||
const { GoogleGenerativeAI } = require('@google/generative-ai');
|
||||
|
||||
const ai = new GoogleGenerativeAI({
|
||||
apiKey: process.env.API_KEY,
|
||||
// What's different?
|
||||
});
|
||||
```
|
||||
|
||||
**Investigation**:
|
||||
1. Python library source: `genai.configure()` only sets API key, no other parameters
|
||||
2. JavaScript SDK docs: Constructor accepts `vertexai` parameter (default: `true`)
|
||||
3. Hypothesis: JavaScript defaulting to Vertex AI endpoint
|
||||
|
||||
**Verification**:
|
||||
```javascript
|
||||
const ai = new GoogleGenerativeAI({
|
||||
apiKey: process.env.API_KEY,
|
||||
vertexai: false, // Match Python's implicit behavior
|
||||
});
|
||||
```
|
||||
|
||||
**Result**: Fixed. The difference was an implicit vs explicit endpoint selection.
|
||||
|
||||
---
|
||||
|
||||
## 4. Data Structure Verification: Never Assume
|
||||
|
||||
### The Anti-Pattern
|
||||
|
||||
```javascript
|
||||
// ❌ Assumption-based code
|
||||
const audioData = response.data; // Assuming 'data' contains audio
|
||||
audioFile.write(audioData);
|
||||
// Result: Writes undefined or wrong data, produces corrupted file
|
||||
```
|
||||
|
||||
### The Correct Pattern
|
||||
|
||||
```javascript
|
||||
// ✅ Verification-first code
|
||||
|
||||
// Step 1: Inspect actual structure
|
||||
console.log('Response keys:', Object.keys(response));
|
||||
console.log('Response structure:', JSON.stringify(response, null, 2).slice(0, 500));
|
||||
|
||||
// Step 2: Search for target data
|
||||
function findAudioData(obj, path = 'response') {
|
||||
if (Buffer.isBuffer(obj)) {
|
||||
console.log(`Found Buffer at ${path}: ${obj.length} bytes`);
|
||||
}
|
||||
if (typeof obj === 'object' && obj !== null) {
|
||||
for (const [key, value] of Object.entries(obj)) {
|
||||
if (key.includes('audio') || key.includes('data')) {
|
||||
console.log(`Candidate at ${path}.${key}:`, typeof value);
|
||||
}
|
||||
findAudioData(value, `${path}.${key}`);
|
||||
}
|
||||
}
|
||||
}
|
||||
findAudioData(response);
|
||||
|
||||
// Step 3: Verify encoding
|
||||
const candidateData = response.serverContent.modelTurn.parts[0].inlineData.data;
|
||||
console.log('Data type:', typeof candidateData);
|
||||
console.log('First 50 chars:', candidateData.slice(0, 50));
|
||||
console.log('Looks like base64:', /^[A-Za-z0-9+/=]+$/.test(candidateData));
|
||||
|
||||
// Step 4: Test decoding
|
||||
const decoded = Buffer.from(candidateData, 'base64');
|
||||
console.log('Decoded size:', decoded.length, 'bytes');
|
||||
console.log('First 10 bytes (hex):', decoded.slice(0, 10).toString('hex'));
|
||||
|
||||
// Step 5: Use verified data
|
||||
audioFile.write(decoded); // Now confident this is correct
|
||||
```
|
||||
|
||||
### Layer-by-Layer Verification Template
|
||||
|
||||
```javascript
|
||||
// For nested data structures (e.g., API responses, message objects)
|
||||
|
||||
function verifyPath(obj, pathString) {
|
||||
console.log(`\n=== Verifying: ${pathString} ===`);
|
||||
|
||||
const parts = pathString.split('.');
|
||||
let current = obj;
|
||||
let currentPath = 'root';
|
||||
|
||||
for (const part of parts) {
|
||||
currentPath += `.${part}`;
|
||||
|
||||
console.log(`Checking ${currentPath}...`);
|
||||
|
||||
if (current === null || current === undefined) {
|
||||
console.log(`✗ Path broken at ${currentPath}: value is ${current}`);
|
||||
return null;
|
||||
}
|
||||
|
||||
if (typeof current !== 'object') {
|
||||
console.log(`✗ Path broken at ${currentPath}: not an object (${typeof current})`);
|
||||
return null;
|
||||
}
|
||||
|
||||
if (!(part in current)) {
|
||||
console.log(`✗ Key "${part}" doesn't exist`);
|
||||
console.log(` Available keys:`, Object.keys(current));
|
||||
return null;
|
||||
}
|
||||
|
||||
console.log(`✓ ${part} exists`);
|
||||
current = current[part];
|
||||
|
||||
if (Array.isArray(current)) {
|
||||
console.log(` (Array with ${current.length} items)`);
|
||||
} else if (Buffer.isBuffer(current)) {
|
||||
console.log(` (Buffer with ${current.length} bytes)`);
|
||||
} else {
|
||||
console.log(` (${typeof current})`);
|
||||
}
|
||||
}
|
||||
|
||||
console.log(`\n✓ Full path verified: ${pathString}`);
|
||||
return current;
|
||||
}
|
||||
|
||||
// Usage
|
||||
const audioData = verifyPath(
|
||||
response,
|
||||
'serverContent.modelTurn.parts.0.inlineData.data'
|
||||
);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Evidence Quality Hierarchy
|
||||
|
||||
Not all evidence is equal. Rank your evidence sources:
|
||||
|
||||
### Tier 1: Direct Verification (Strongest)
|
||||
- Running code that succeeds/fails in front of you
|
||||
- Network traffic you personally captured
|
||||
- Logs you personally generated with verbose flags
|
||||
- Binary data you inspected byte-by-byte
|
||||
|
||||
### Tier 2: Official Sources
|
||||
- Official API documentation (with version number matching yours)
|
||||
- Official SDK examples (with version number matching yours)
|
||||
- Official changelog entries
|
||||
|
||||
### Tier 3: Working Examples
|
||||
- Community examples with verified success (stars, recent activity)
|
||||
- Stack Overflow answers with upvotes and recent dates
|
||||
- Your own previous successful implementations
|
||||
|
||||
### Tier 4: Problem Reports
|
||||
- GitHub Issues (open or closed)
|
||||
- Stack Overflow questions (problems, not solutions)
|
||||
- Forum discussions
|
||||
|
||||
### Tier 5: Speculation (Weakest)
|
||||
- "I think this API doesn't support..."
|
||||
- "This probably means..."
|
||||
- Assumptions based on API names or parameter names
|
||||
|
||||
### Applying the Hierarchy
|
||||
|
||||
**Scenario**: Investigating why authentication fails
|
||||
|
||||
```markdown
|
||||
## Evidence Analysis
|
||||
|
||||
### Hypothesis: API doesn't support this authentication method
|
||||
|
||||
Evidence collected:
|
||||
1. [Tier 4] GitHub Issue #123: User reports auth failure (OPEN, no resolution)
|
||||
2. [Tier 5] Parameter named "beta" suggests experimental feature
|
||||
3. [Tier 2] Official docs state: "Authentication via API key is supported"
|
||||
4. [Tier 3] Example repo uses API key successfully (last updated 2 months ago)
|
||||
|
||||
Conclusion:
|
||||
- Tier 2 (official docs) contradicts hypothesis
|
||||
- Tier 3 (working example) disproves hypothesis
|
||||
- Tier 4 evidence (open issue) indicates others hit same problem, but doesn't prove API limitation
|
||||
- Hypothesis REJECTED: Auth method IS supported, problem is likely in configuration
|
||||
```
|
||||
|
||||
**Key Principle**: Higher-tier evidence always overrules lower-tier evidence.
|
||||
|
||||
---
|
||||
|
||||
## 6. The First-Hand Execution Rule
|
||||
|
||||
**Rule**: Before forming conclusions, personally execute the failing code and observe the complete output.
|
||||
|
||||
### Why This Matters
|
||||
|
||||
Second-hand error reports often omit critical details:
|
||||
- Full error messages (users summarize or truncate)
|
||||
- Error codes (users paste message but not code)
|
||||
- Preceding warnings (users skip "unimportant" output)
|
||||
- Environment differences (users assume their env is "normal")
|
||||
|
||||
### Checklist Before Forming Hypothesis
|
||||
|
||||
- [ ] Have I executed the failing code myself?
|
||||
- [ ] Have I seen the complete console output (not summarized)?
|
||||
- [ ] Have I checked for errors in non-obvious places (close codes, HTTP status, exit codes)?
|
||||
- [ ] Have I added extra logging to expose internal state?
|
||||
|
||||
### Example: The Hidden Close Code
|
||||
|
||||
**Second-hand report**: "The WebSocket connection closes immediately with no error"
|
||||
|
||||
**Assumptions formed**:
|
||||
- "No error" → Maybe timeout?
|
||||
- "Closes immediately" → Maybe connection refused?
|
||||
|
||||
**First-hand execution**:
|
||||
```javascript
|
||||
// Added logging
|
||||
websocket.on('close', (code, reason) => {
|
||||
console.log('Close code:', code); // Revealed: 1007
|
||||
console.log('Close reason:', reason); // Revealed: "API key not valid"
|
||||
});
|
||||
```
|
||||
|
||||
**Result**: Error WAS present, just not logged by default. The close code (1007) immediately pointed to authentication issue.
|
||||
|
||||
---
|
||||
|
||||
## 7. Debugging Session Template
|
||||
|
||||
Use this template to structure your investigation:
|
||||
|
||||
````markdown
|
||||
## Problem Statement
|
||||
[Concise description of unexpected behavior]
|
||||
|
||||
## Environment
|
||||
- Runtime: [Node.js 20.11.0, Python 3.11, etc.]
|
||||
- Library versions: [Exact versions from package.json/requirements.txt]
|
||||
- OS: [If potentially relevant]
|
||||
|
||||
## Reproduction
|
||||
[Minimal code that reproduces the issue]
|
||||
|
||||
## Expected vs Actual
|
||||
- Expected: [What should happen]
|
||||
- Actual: [What actually happens, with exact error messages]
|
||||
|
||||
## Hypothesis Priority List
|
||||
|
||||
### Priority 1: Configuration (Check First)
|
||||
- [ ] Hypothesis 1a: [Specific config issue]
|
||||
Evidence needed: [What would prove/disprove this]
|
||||
Diagnostic: [Script or test to verify]
|
||||
|
||||
### Priority 2: Environment
|
||||
- [ ] Hypothesis 2a: [Specific env issue]
|
||||
Evidence needed: [...]
|
||||
Diagnostic: [...]
|
||||
|
||||
### Priority 3: Data Format
|
||||
[...]
|
||||
|
||||
### Priority 4: Complex Issues (Only if above exhausted)
|
||||
[...]
|
||||
|
||||
## Evidence Collected
|
||||
|
||||
### [Hypothesis 1a]
|
||||
- **Status**: ✓ PROVEN / ✗ DISPROVEN / ⚠ INCONCLUSIVE
|
||||
- **Evidence tier**: [1-5]
|
||||
- **Details**: [What you found]
|
||||
- **Source**: [Where this evidence came from]
|
||||
|
||||
[Repeat for each hypothesis]
|
||||
|
||||
## Working Examples Comparison
|
||||
|
||||
### Python Implementation (WORKING)
|
||||
```python
|
||||
[Code]
|
||||
```
|
||||
|
||||
### JavaScript Implementation (FAILING)
|
||||
```javascript
|
||||
[Code]
|
||||
```
|
||||
|
||||
### Differences Identified
|
||||
1. [Difference 1]
|
||||
2. [Difference 2]
|
||||
|
||||
## Solution
|
||||
|
||||
### Root Cause
|
||||
[Final verified cause, with evidence tier]
|
||||
|
||||
### Fix Applied
|
||||
```javascript
|
||||
[Exact code change]
|
||||
```
|
||||
|
||||
### Verification
|
||||
[How you verified the fix works]
|
||||
|
||||
## Lessons Learned
|
||||
[What would make this faster next time]
|
||||
````
|
||||
|
||||
---
|
||||
|
||||
## 8. Common Anti-Patterns to Avoid
|
||||
|
||||
### Anti-Pattern 1: "Debugging by Modification"
|
||||
|
||||
**❌ Wrong**:
|
||||
```javascript
|
||||
// Try random changes hoping something works
|
||||
const client = new API({ apiKey: key, timeout: 5000 }); // Doesn't work
|
||||
const client = new API({ apiKey: key, timeout: 10000 }); // Doesn't work
|
||||
const client = new API({ apiKey: key, retry: true }); // Doesn't work
|
||||
// [30 more random attempts...]
|
||||
```
|
||||
|
||||
**✅ Right**:
|
||||
```javascript
|
||||
// Diagnose THEN fix
|
||||
// 1. Create diagnostic to understand current behavior
|
||||
// 2. Form hypothesis based on diagnostic output
|
||||
// 3. Make targeted change
|
||||
// 4. Verify with diagnostic
|
||||
```
|
||||
|
||||
### Anti-Pattern 2: "Complex First"
|
||||
|
||||
**❌ Wrong**: "The API must not support this feature with this model configuration"
|
||||
|
||||
**✅ Right**: "Let me first check if my API key is even loading correctly"
|
||||
|
||||
### Anti-Pattern 3: "Assumption Stacking"
|
||||
|
||||
**❌ Wrong**:
|
||||
```javascript
|
||||
// Assuming response.data exists
|
||||
// Assuming it's a Buffer
|
||||
// Assuming it's in the right format
|
||||
fs.writeFileSync('output.wav', response.data);
|
||||
```
|
||||
|
||||
**✅ Right**:
|
||||
```javascript
|
||||
// Verify each assumption
|
||||
console.log('data exists:', !!response.data);
|
||||
console.log('data type:', typeof response.data);
|
||||
console.log('is Buffer:', Buffer.isBuffer(response.data));
|
||||
// [Then use data]
|
||||
```
|
||||
|
||||
### Anti-Pattern 4: "Trust the Summary"
|
||||
|
||||
**❌ Wrong**: User says "no error", assume there's no error
|
||||
|
||||
**✅ Right**: Execute code yourself, log everything, find the hidden error code
|
||||
|
||||
---
|
||||
|
||||
## 9. Speed Optimization: Parallel Diagnostics
|
||||
|
||||
Once you've identified multiple hypotheses, test them in parallel when possible.
|
||||
|
||||
### Pattern: Parallel Diagnostic Scripts
|
||||
|
||||
```bash
|
||||
# Instead of running diagnostics sequentially:
|
||||
node diagnose-config.js # 5 seconds
|
||||
node diagnose-api.js # 10 seconds
|
||||
node diagnose-data-format.js # 5 seconds
|
||||
# Total: 20 seconds sequential
|
||||
|
||||
# Run in parallel:
|
||||
node diagnose-config.js &
|
||||
node diagnose-api.js &
|
||||
node diagnose-data-format.js &
|
||||
wait
|
||||
# Total: 10 seconds (limited by slowest)
|
||||
```
|
||||
|
||||
### Pattern: Multi-Hypothesis Test Script
|
||||
|
||||
```javascript
|
||||
// test-all-hypotheses.js
|
||||
async function testAll() {
|
||||
const tests = [
|
||||
testConfigLoading,
|
||||
testAPIEndpoint,
|
||||
testDataFormat,
|
||||
testVersionCompatibility
|
||||
];
|
||||
|
||||
const results = await Promise.allSettled(
|
||||
tests.map(test => test()) // Promise.allSettled records rejections as 'rejected' results
|
||||
);
|
||||
|
||||
results.forEach((result, i) => {
|
||||
console.log(`\nTest ${i + 1}: ${tests[i].name}`);
|
||||
if (result.status === 'fulfilled') {
|
||||
console.log('✓ PASSED');
|
||||
} else {
|
||||
console.log('✗ FAILED:', result.reason);
|
||||
}
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Application to Error Troubleshooting
|
||||
|
||||
When using the error-troubleshooter skill:
|
||||
|
||||
1. **Start with Occam's Razor**: Always check configuration and environment issues first (90% of problems)
|
||||
2. **Create Diagnostic Scripts**: Write minimal scripts to isolate variables
|
||||
3. **Compare with Working Examples**: If it works somewhere else, find the differences
|
||||
4. **Never Assume Data Structure**: Verify every layer explicitly
|
||||
5. **Rank Your Evidence**: Tier 1 (direct verification) beats Tier 5 (speculation)
|
||||
6. **Execute First-Hand**: Don't trust summaries, see the complete output yourself
|
||||
7. **Avoid Anti-Patterns**: Diagnose first, fix second; simple first, complex later
|
||||
|
||||
This systematic approach leads to faster, more reliable problem resolution.
|
||||
378
skills/error-troubleshooter/references/troubleshooting-sop.md
Normal file
378
skills/error-troubleshooter/references/troubleshooting-sop.md
Normal file
@@ -0,0 +1,378 @@
|
||||
# Troubleshooting Standard Operating Procedures
|
||||
|
||||
This document provides detailed, step-by-step procedures for systematic error troubleshooting.
|
||||
|
||||
## Core Troubleshooting Workflow
|
||||
|
||||
### Phase 1: Error Recognition and Initial Response
|
||||
|
||||
When a tool, script, or command fails:
|
||||
|
||||
1. **Capture Complete Error Information**
|
||||
- Full error message (stdout and stderr)
|
||||
- Tool/command that was executed
|
||||
- Context of what was being attempted
|
||||
- Any stack traces or error codes
|
||||
|
||||
2. **Assess Error Clarity**
|
||||
- Is the error message self-explanatory?
|
||||
- Does it explicitly state what went wrong and how to fix it?
|
||||
- Is this a commonly encountered error pattern?
|
||||
|
||||
3. **Decide on Investigation Depth**
|
||||
- **Trivial/Clear**: Apply quick fix
|
||||
- **Non-trivial/Ambiguous**: Proceed to rigorous investigation
|
||||
|
||||
### Phase 2: Quick Fix Attempt (Happy Case)
|
||||
|
||||
**Criteria for attempting quick fix:**
|
||||
- Error message explicitly describes the problem and solution
|
||||
- Error matches a well-known trivial pattern
|
||||
- Fix requires minimal changes with low risk
|
||||
|
||||
**Procedure:**
|
||||
1. Apply the fix based on error message or experience
|
||||
2. Re-execute the failing command
|
||||
3. Evaluate the result:
|
||||
- **Success**: Error is resolved → Done
|
||||
- **No change**: Error message identical → Revert and escalate
|
||||
- **Worse**: New errors or degraded state → Revert immediately and escalate
|
||||
|
||||
**Reversion Protocol:**
|
||||
- Undo all changes made during quick fix attempt
|
||||
- Verify system is back to pre-fix state
|
||||
- Document what was attempted (if creating debug notes)
|
||||
|
||||
### Phase 3: Rigorous Investigation
|
||||
|
||||
Enter this phase when:
|
||||
- Quick fix failed
|
||||
- Error is non-trivial from the start
|
||||
- Multiple potential causes exist
|
||||
- Context is ambiguous
|
||||
|
||||
#### Step 1: Error Template Extraction
|
||||
|
||||
**Purpose**: Prepare error message for effective searching by removing variable components.
|
||||
|
||||
**Procedure:**
|
||||
1. Identify the error type/category (e.g., FileNotFoundError, TypeError, ConnectionError)
|
||||
2. Locate the core error message
|
||||
3. Remove variable components:
|
||||
- File paths: `/home/user/project/file.py` → (remove)
|
||||
- Usernames: `user@example.com` → (remove)
|
||||
- IDs/numbers: `id=12345` → (remove)
|
||||
- Timestamps: `2024-01-15 10:30:45` → (remove)
|
||||
- User inputs: `input='value'` → (remove)
|
||||
- Line numbers: `line 42` → (keep only if part of standard template)
|
||||
|
||||
4. Retain:
|
||||
- Error type/class names
|
||||
- Standard error message structure
|
||||
- Function/method names from standard library/SDK
|
||||
- Standard error codes
|
||||
|
||||
**Example Transformations:**
|
||||
```
|
||||
Original:
|
||||
ValueError: invalid literal for int() with base 10: 'abc' at line 42 in /home/user/script.py
|
||||
|
||||
Template:
|
||||
ValueError invalid literal for int() with base 10
|
||||
|
||||
Original:
|
||||
requests.exceptions.ConnectionError: HTTPConnectionPool(host='api.example.com', port=443): Max retries exceeded with url: /v1/users
|
||||
|
||||
Template:
|
||||
requests.exceptions.ConnectionError HTTPConnectionPool Max retries exceeded
|
||||
|
||||
Original:
|
||||
ModuleNotFoundError: No module named 'pandas'
|
||||
|
||||
Template:
|
||||
ModuleNotFoundError No module named
|
||||
```
|
||||
|
||||
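The stripping step can be partially automated. The sketch below is illustrative only and not part of the SOP; its regular expressions cover file paths, quoted inputs, timestamps, and traceback line numbers, and the result still needs a quick manual pass:

```javascript
// rough-error-template.js (illustrative sketch, not part of the SOP)
// Strips the most common variable components before searching. Numbers and
// words that belong to the standard message itself still require judgment.
function roughErrorTemplate(rawError) {
  return rawError
    .replace(/(?:[A-Za-z]:)?[\\/][^\s'"),]+/g, '')          // file paths
    .replace(/'[^']*'|"[^"]*"/g, '')                         // quoted user inputs
    .replace(/\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}/g, '')  // timestamps
    .replace(/\bat line \d+\b/gi, '')                        // traceback line numbers
    .replace(/\s+/g, ' ')
    .trim();
}

console.log(roughErrorTemplate(
  "ValueError: invalid literal for int() with base 10: 'abc' at line 42 in /home/user/script.py"
));
// → "ValueError: invalid literal for int() with base 10: in"
```

The dangling "in" in the output shows the limit of the automation: regexes remove the obvious variables, but the final search template is still a human call.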
#### Step 2: Environment Information Collection
|
||||
|
||||
**Purpose**: Gather context needed to understand and resolve the error.
|
||||
|
||||
**Collection Strategy:**
|
||||
1. **Start Minimal**: Only collect what's clearly relevant
|
||||
2. **Expand as Needed**: Add more context if initial research is inconclusive
|
||||
3. **Always Protect Privacy**: Never collect passwords, API keys, personal data without explicit permission
|
||||
|
||||
**Standard Environment Information:**
|
||||
|
||||
**System Context:**
|
||||
- Operating system and version
|
||||
- Shell/terminal environment
|
||||
- Current working directory (if relevant)
|
||||
|
||||
**Runtime Context:**
|
||||
- Language/SDK versions (Python, Node.js, etc.)
|
||||
- Package manager versions (pip, npm, etc.)
|
||||
- Virtual environment status
|
||||
|
||||
**Dependency Context:**
|
||||
- Installed package versions (for error-related packages)
|
||||
- Package lock file status
|
||||
- Dependency conflicts
|
||||
|
||||
**Configuration Context:**
|
||||
- Relevant configuration files (only if directly related to error)
|
||||
- Environment variables (only non-sensitive ones)
|
||||
|
||||
**Collection Commands by Context:**
|
||||
|
||||
```bash
|
||||
# Python environment
|
||||
python --version
|
||||
pip --version
|
||||
pip list | grep <package-name>
|
||||
|
||||
# Node.js environment
|
||||
node --version
|
||||
npm --version
|
||||
npm list <package-name>
|
||||
|
||||
# System information
|
||||
uname -a # Unix/Linux/Mac
|
||||
ver # Windows
|
||||
|
||||
# Package conflicts
|
||||
pip check # Python
|
||||
npm doctor # Node.js
|
||||
|
||||
# Environment variables (careful with sensitive data)
|
||||
env | grep <RELEVANT_VAR>
|
||||
```
|
||||
|
||||
**Privacy Protection:**
|
||||
- **Never collect**: Passwords, API keys, tokens, private keys, or personally identifiable information
|
||||
- **Request permission before collecting**: Project-specific paths, configuration files, custom environment variables
|
||||
- **Sanitize output**: Remove sensitive data before recording
|
||||
|
||||
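One way to do that sanitization, sketched below; the name patterns and masking format are illustrative and not a complete safeguard:

```javascript
// sanitize-env.js (illustrative sketch)
// Masks likely-sensitive environment variables before they are copied into
// debug notes. Extend the name pattern to match your project's conventions.
const SENSITIVE = /KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL/i;

function sanitizedEnv(env = process.env) {
  const out = {};
  for (const [name, value] of Object.entries(env)) {
    out[name] = SENSITIVE.test(name) && value
      ? `${value.slice(0, 4)}...(${value.length} chars, masked)`
      : value;
  }
  return out;
}

// Record only the variables relevant to the error, already masked.
const relevant = ['NODE_ENV', 'API_KEY'];
const masked = sanitizedEnv();
for (const name of relevant) {
  console.log(`${name}=`, masked[name]);
}
```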
#### Step 3: Research and Information Gathering
|
||||
|
||||
**Research Sources (in order of efficiency):**
|
||||
|
||||
1. **Web Search with Error Template**
|
||||
- Search the extracted template (not full error)
|
||||
- Add language/framework name to query
|
||||
- Example: "ModuleNotFoundError No module named python"
|
||||
|
||||
2. **Official Documentation**
|
||||
- Error code references
|
||||
- SDK/API documentation
|
||||
- Known issues and breaking changes
|
||||
|
||||
3. **Community Resources**
|
||||
- Stack Overflow
|
||||
- GitHub Issues (especially for specific libraries)
|
||||
- Framework-specific forums
|
||||
|
||||
**Parallel Research Strategy:**
|
||||
|
||||
For complex problems, launch multiple research angles simultaneously using subagents:
|
||||
|
||||
```
|
||||
Investigation Angles:
|
||||
├─ Subagent 1: Web search for error template
|
||||
├─ Subagent 2: Search GitHub Issues for affected package
|
||||
├─ Subagent 3: Check official documentation for breaking changes
|
||||
└─ Subagent 4: Search for similar errors in codebase history
|
||||
```
|
||||
|
||||
**Research Efficiency:**
|
||||
- Delegate broad searches to subagents
|
||||
- Keep main context focused on synthesis and decision-making
|
||||
- Use file-based notes to accumulate findings
|
||||
|
||||
#### Step 4: Debug Notes Creation
|
||||
|
||||
**When to create debug notes:**
|
||||
- Investigation is expected to be complex
|
||||
- Multiple theories need tracking
|
||||
- Context is being consumed quickly
|
||||
- Investigation may span multiple sessions
|
||||
|
||||
**Debug Notes Structure:**
|
||||
|
||||
```markdown
|
||||
# Debug Session: [Error Summary]
|
||||
|
||||
## Error Information
|
||||
[Full error details]
|
||||
|
||||
## Environment
|
||||
[Relevant environment information]
|
||||
|
||||
## Theories
|
||||
1. [Theory 1]: [likelihood: high/medium/low]
|
||||
- Evidence: [supporting information]
|
||||
- Test: [how to verify]
|
||||
- Result: [pending/confirmed/rejected]
|
||||
|
||||
2. [Theory 2]: ...
|
||||
|
||||
## Research Findings
|
||||
- [Source]: [key information]
|
||||
- [Source]: [key information]
|
||||
|
||||
## Tests Conducted
|
||||
1. [Test description]
|
||||
- Command: [test command]
|
||||
- Result: [outcome]
|
||||
- Conclusion: [what was learned]
|
||||
|
||||
## Solution
|
||||
[Final solution that resolved the issue]
|
||||
```
|
||||
|
||||
Use the template in `assets/debug-notes-template.md` as a starting point.
|
||||
|
||||
#### Step 5: Theory Formulation and Testing
|
||||
|
||||
**Theory Development:**
|
||||
|
||||
Based on research and environment analysis:
|
||||
1. List all plausible explanations for the error
|
||||
2. Assess likelihood of each (high/medium/low)
|
||||
3. Identify evidence supporting or contradicting each theory
|
||||
4. Order theories by likelihood and ease of testing
|
||||
|
||||
**Theory Testing Protocol:**
|
||||
|
||||
For each theory (starting with most likely):
|
||||
|
||||
1. **Design Test**
|
||||
- What command/change will verify or reject this theory?
|
||||
- What outcome would confirm the theory?
|
||||
- What outcome would reject the theory?
|
||||
|
||||
2. **Execute Test**
|
||||
- Run the test in a controlled manner
|
||||
- Capture all output
|
||||
- Note any side effects
|
||||
|
||||
3. **Evaluate Result**
|
||||
- Does the result confirm or reject the theory?
|
||||
- Are there unexpected outcomes?
|
||||
- What new information was gained?
|
||||
|
||||
4. **Document in Debug Notes**
|
||||
- Record test and result
|
||||
- Update theory status
|
||||
- Note any new theories generated
|
||||
|
||||
5. **Iterate**
|
||||
- If theory confirmed: proceed to solution implementation
|
||||
- If theory rejected: move to next theory
|
||||
- If inconclusive: gather more information
|
||||
|
||||
**Testing Best Practices:**
|
||||
- Test one variable at a time
|
||||
- Use minimal reproducible cases when possible
|
||||
- Revert changes between theory tests
|
||||
- Document negative results (what didn't work is valuable information)
|
||||
|
||||
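As a sketch of how one theory test could be executed and documented in a single step (assuming a Node.js environment; the notes file name and entry format are illustrative):

```javascript
// record-theory-test.js (illustrative sketch)
// Runs one verification command for one theory and appends the outcome to the
// debug notes, so negative results get documented as reliably as positive ones.
const { execSync } = require('child_process');
const fs = require('fs');

function testTheory({ theory, command }, notesPath = 'debug-notes.md') {
  let result;
  try {
    const output = execSync(command, { encoding: 'utf-8', stdio: 'pipe' });
    result = `command succeeded (exit 0): ${output.trim().slice(0, 200)}`;
  } catch (error) {
    result = `command failed (exit ${error.status}): ${String(error.stderr || error.message).trim().slice(0, 200)}`;
  }
  fs.appendFileSync(notesPath, `\n- Theory: ${theory}\n  Command: ${command}\n  Result: ${result}\n`);
  return result;
}

// Test one variable at a time; revert any changes before moving to the next theory.
testTheory({ theory: 'Dependency not installed', command: 'npm ls some-package' });
```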
### Phase 4: Solution Implementation
|
||||
|
||||
Once the correct theory is identified:
|
||||
|
||||
1. **Plan Implementation**
|
||||
- What changes are needed?
|
||||
- Are there risks or side effects?
|
||||
- Can the solution be tested incrementally?
|
||||
|
||||
2. **Apply Fix**
|
||||
- Make the necessary changes
|
||||
- Document what was changed
|
||||
- Keep changes minimal and focused
|
||||
|
||||
3. **Verify Resolution**
|
||||
- Re-run the original failing command
|
||||
- Confirm error is completely resolved
|
||||
- Check for new errors or warnings
|
||||
|
||||
4. **Document Solution**
|
||||
- Record the fix in debug notes (if created)
|
||||
- Note root cause and solution for future reference
|
||||
- Consider adding to common error patterns if widely applicable
|
||||
|
||||
### Phase 5: Post-Resolution
|
||||
|
||||
1. **Clean Up**
|
||||
- Remove temporary debug files (if any)
|
||||
- Clean up test artifacts
|
||||
- Restore any temporary changes
|
||||
|
||||
2. **Knowledge Capture**
|
||||
- If this was a difficult problem with a general solution, consider documenting it
|
||||
- Update `references/common-error-patterns.md` if appropriate
|
||||
- Note any tools or techniques that were particularly effective
|
||||
|
||||
## Advanced Investigation Techniques
|
||||
|
||||
### Bisection Method
|
||||
|
||||
For errors introduced by recent changes:
|
||||
1. Identify last known good state
|
||||
2. Bisect the changes between good and bad state
|
||||
3. Test each bisection point
|
||||
4. Narrow down to the specific change that introduced the error
|
||||
|
||||
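For changes tracked in git, `git bisect` automates this procedure. The sketch below shows the same idea in plain code for any ordered list of changes, assuming a test that reports whether the system is broken after the first `i + 1` changes have been applied:

```javascript
// bisect-changes.js (illustrative sketch)
// Binary-search an ordered list of changes (oldest to newest) for the first
// one that introduces the error. The state before any change is known good,
// and the state after the last change is known bad.
function bisect(changes, isBrokenAt) {
  let good = -1;                 // index of the last change known to be good
  let bad = changes.length - 1;  // newest state is known bad
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2);
    if (isBrokenAt(mid)) bad = mid;
    else good = mid;
  }
  return changes[bad]; // first change at which the error appears
}

// Example with a fake test; in practice isBrokenAt(i) would check out the
// corresponding commit (or re-apply config changes) and run the failing command.
const changes = ['c1', 'c2', 'c3', 'c4', 'c5'];
console.log('First bad change:', bisect(changes, (i) => i >= 3)); // → "c4"
```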
### Differential Diagnosis
|
||||
|
||||
When multiple theories seem equally plausible:
|
||||
1. Identify distinguishing characteristics of each theory
|
||||
2. Design tests that differentiate between theories
|
||||
3. Execute targeted tests to rule out theories
|
||||
4. Converge on the correct diagnosis through elimination
|
||||
|
||||
### Reproduction Reduction
|
||||
|
||||
For complex errors:
|
||||
1. Create minimal reproducible example
|
||||
2. Strip away unrelated code/configuration
|
||||
3. Isolate the essential elements that trigger the error
|
||||
4. Use reduced case for easier investigation
|
||||
|
||||
## Communication and Escalation
|
||||
|
||||
### When to Request User Input
|
||||
|
||||
Request user input when:
|
||||
- Multiple equally valid solutions exist
|
||||
- User preference affects solution choice
|
||||
- Sensitive information access is needed
|
||||
- Problem domain knowledge is required
|
||||
- Verification of fix needs user testing
|
||||
|
||||
### How to Present Findings
|
||||
|
||||
When communicating with user:
|
||||
1. **Summarize**: Brief description of the error and its cause
|
||||
2. **Explain**: Why the error occurred
|
||||
3. **Solution**: What was done to fix it
|
||||
4. **Verification**: How to confirm it's resolved
|
||||
5. **Prevention**: How to avoid in the future (if applicable)
|
||||
|
||||
## Common Pitfalls to Avoid
|
||||
|
||||
1. **Assumption Paralysis**: Don't assume too much; verify theories with tests
|
||||
2. **Fix Stacking**: Don't apply multiple fixes simultaneously; test one at a time
|
||||
3. **Context Drift**: Stay focused on the original error; avoid rabbit holes
|
||||
4. **Incomplete Reversion**: Always fully revert failed fixes
|
||||
5. **Premature Success**: Verify the error is truly resolved, not just hidden
|
||||
6. **Privacy Violations**: Never collect sensitive data without permission
|
||||
|
||||
## Success Criteria
|
||||
|
||||
A troubleshooting session is successful when:
|
||||
1. The original error is completely resolved
|
||||
2. The fix is understood and documented
|
||||
3. No new errors were introduced
|
||||
4. The solution is appropriate and maintainable
|
||||
5. Lessons learned are captured for future reference
|
||||
33
skills/error-troubleshooter/repackage.py
Normal file
33
skills/error-troubleshooter/repackage.py
Normal file
@@ -0,0 +1,33 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Repackage this skill into a distributable zip file.
|
||||
|
||||
Usage:
|
||||
cd error-troubleshooter
|
||||
python repackage.py
|
||||
|
||||
Output: ../error-troubleshooter.zip
|
||||
"""
|
||||
import zipfile
|
||||
from pathlib import Path
|
||||
|
||||
# Paths relative to this script
|
||||
script_dir = Path(__file__).parent
|
||||
skill_name = script_dir.name
|
||||
zip_path = script_dir.parent / f'{skill_name}.zip'
|
||||
|
||||
# Remove old zip if exists
|
||||
if zip_path.exists():
|
||||
zip_path.unlink()
|
||||
print(f"Removed old: {zip_path.name}")
|
||||
|
||||
print(f"Packaging skill: {skill_name}\n")
|
||||
|
||||
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
|
||||
for file_path in script_dir.rglob('*'):
|
||||
if file_path.is_file() and file_path.name != 'repackage.py': # Don't include this script
|
||||
arcname = file_path.relative_to(script_dir.parent)
|
||||
zf.write(file_path, arcname)
|
||||
print(f" Added: {arcname}")
|
||||
|
||||
print(f"\n✅ Successfully packaged to: {zip_path.absolute()}")
|
||||