Files
gh-kylesnowschwartz-simplec…/commands/sc-five-whys.md
2025-11-30 08:36:25 +08:00

155 lines
4.5 KiB
Markdown

# sc-five-whys: Autonomous Root Cause Analysis Protocol
_A systematic approach to finding root causes in software problems._
## Process:
### 1. Capture the Problem
**Define specifically what's broken:**
- Problem statement exists in ${ARGUMENTS} or conversation context
- You have access to relevant data/logs/documentation
- What's the observed issue? (error, wrong behavior, visual bug)
- When does it occur? (specific triggers, conditions, frequency)
- Where does it manifest? (backend/frontend/specific component)
- Who/what is affected? (users, systems, features)
**Gather evidence:**
- Error messages, stack traces, console logs
- Screenshots for visual issues
- Steps to reproduce
- Recent changes that might relate
### 2. Ask "Why?" (Usually 5 Times, Sometimes Less)
**Build your causal chain:**
```
Problem: [Specific issue]
├── Why? → [Technical cause A]
│ └── Why? → [Deeper cause A2]
│ └── Why? → [Continue until root]
└── Why? → [Technical cause B] (if multiple factors)
└── Why? → [Continue separately]
```
**Focus your whys based on problem type:**
- **Backend issues**: logic errors, data state, resources, timing
- **Frontend issues**: events, state management, CSS, browser differences
- **Visual bugs**: often only need 2-3 whys (styling → specificity → root)
- **Performance**: may need 6-7 whys to reach architectural roots
**Watch for false roots (keep digging if you hit these):**
- "Human error" / "Someone made a mistake"
- "Lack of time/resources"
- "That's how it's always been"
- "Random occurrence"
- "Just make the test pass"
- "I'll commit this with --no-verify"
### 3. Validate Your Root Cause
**Can you answer YES to:**
- Can I point to specific code/config that embodies this cause?
- If I fix this exact thing, will the problem definitely stop?
- Does this cause explain all observed symptoms?
**For UI/UX issues also check:**
- Can I demonstrate the fix in DevTools?
- Does this explain why it works in some contexts but not others?
### 4. Develop the Fix
**Address multiple levels:**
- **Immediate**: Stop the bleeding (workaround, feature flag, revert)
- **Root fix**: Address the actual cause you found
- **Prevention**: Ensure this class of issue can't recur (tests, types, linting)
## Example - Backend:
**Problem**: API returns 500 error for specific users
1. **Why?** → Database query times out
2. **Why?** → Query joins 5 tables for these users
3. **Why?** → These users have 1000x normal data volume
4. **Why?** → No pagination on user data retrieval
5. **Why?** → Original design assumed <100 items per user
**Root**: Missing pagination in data access layer
**Fix**: Add pagination, set query limits, add volume tests
## Example - Frontend:
**Problem**: Button doesn't respond on mobile
1. **Why?** → Click handler not firing
2. **Why?** → Parent div has touch handler that prevents bubbling
3. **Why?** → Swipe gesture detection consumes all touch events
**Root**: Overly aggressive event.preventDefault() in swipe handler
**Fix**: Add conditional logic to allow tap-through on buttons
## Smart Patterns:
**When to stop before 5:**
- Found the exact line of broken code
- Identified the specific config value
- Located the CSS rule causing misalignment
- Going deeper would just explain "why coding exists"
**When to go beyond 5:**
- Intermittent failures (race conditions need deep tracing)
- Performance degradation (often architectural)
- Complex state corruption (may have long cause chains)
**When to branch your investigation:**
- "Works on dev but not prod" → investigate environment differences
- "Sometimes fails" → investigate timing/concurrency separately
- "Only certain users" → investigate data patterns separately
## Output Template:
```markdown
### Problem
[One sentence description]
### Five Whys Analysis
1. Why? [Answer] - _Evidence: [what proves this]_
2. Why? [Answer] - _Evidence: [what proves this]_
[...continue to root]
### Root Cause
**Location**: [file:line or component]
**Issue**: [Specific technical problem]
**Confidence**: [High/Medium/Low based on evidence]
### Solution
**Immediate mitigation**: [If urgent]
**Root fix**: [Code/config change needed]
**Prevention**: [Test/process to prevent recurrence]
### Notes
[Any branches explored, assumptions made, or additional context]
```
---
_Remember: Five Whys is a thinking tool, not a rigid formula. Sometimes three whys finds the root. Sometimes you need seven. The goal is understanding, not completing a checklist._
${ARGUMENTS}