# Pattern: Problem-Solution Structure

**Status**: ✅ Validated (2+ uses)
**Domain**: Documentation (especially troubleshooting and diagnostic guides)
**Transferability**: Universal (applies to all problem-solving documentation)
## Problem
Users come to documentation with problems, not abstract interest in features. Traditional feature-first documentation makes users hunt for solutions.
**Symptoms**:
- Users can't find answers to "How do I fix X?" questions
- Documentation organized by feature, not by problem
- Troubleshooting sections are afterthoughts (if they exist)
- No systematic diagnostic guidance
## Solution
Structure documentation around problems and their solutions, not features and capabilities.
**Pattern**: Problem → Diagnosis → Solution → Prevention

**Key Principle**: Start with the user's problem state (symptoms), guide them to the root cause (diagnosis), provide an actionable solution, then show how to prevent recurrence.
## Implementation

### Basic Structure
```markdown
## Problem: [User's Issue]

**Symptoms**: [Observable signs user experiences]

**Example**: [Concrete manifestation of the problem]

---

**Diagnosis**: [How to identify root cause]

**Common Causes**:
1. [Cause 1] - [How to verify]
2. [Cause 2] - [How to verify]
3. [Cause 3] - [How to verify]

---

**Solution**:

[For Each Cause]:

**If [Cause]**:
1. [Step 1]
2. [Step 2]
3. [Verify fix worked]

---

**Prevention**: [How to avoid this problem in future]
```
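To keep entries consistent with this structure, a small checker can flag troubleshooting documents that are missing one of the four sections. The following is a minimal sketch, not part of the pattern itself; it only looks for the bold labels used in the template above, and the command-line usage is an assumption.

```python
import re
import sys

# The four labels every problem-solution entry should contain,
# matching the template above.
REQUIRED_SECTIONS = ["Symptoms", "Diagnosis", "Solution", "Prevention"]

def missing_sections(markdown_text: str) -> list[str]:
    """Return the required labels that do not appear as bold headers."""
    missing = []
    for label in REQUIRED_SECTIONS:
        # Look for lines like "**Symptoms**:" anywhere in the entry.
        if not re.search(rf"\*\*{label}\*\*\s*:", markdown_text):
            missing.append(label)
    return missing

if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as f:
            gaps = missing_sections(f.read())
        if gaps:
            print(f"{path}: missing {', '.join(gaps)}")
```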
### Example: From BAIME Guide Troubleshooting
## Problem: Value scores not improving
**Symptoms**: V_instance or V_meta stuck or decreasing across iterations
**Example**:
- Iteration 0: V_instance = 0.35, V_meta = 0.25
- Iteration 1: V_instance = 0.37, V_meta = 0.28 (minimal progress)
- Iteration 2: V_instance = 0.34, V_meta = 0.30 (instance decreased!)
---
**Diagnosis**: Identify root cause of stagnation
**Common Causes**:
1. **Solving symptoms, not problems**
- Verify: Are you addressing surface issues or root causes?
- Example: "Low test coverage" (symptom) vs "No systematic testing strategy" (root cause)
2. **Incorrect value function definition**
- Verify: Do components actually measure quality?
- Example: Coverage % alone doesn't capture test quality
3. **Working on wrong priorities**
- Verify: Are you addressing highest-impact gaps?
- Example: Fixing grammar when structure is unclear
---
**Solution**:
**If Solving Symptoms**:
1. Re-analyze problems in iteration-N.md section 9
2. Identify root causes (not symptoms)
3. Focus next iteration on root cause solutions
**Example**:
- ❌ Problem: "Low test coverage" → Solution: "Write more tests"
- ✅ Problem: "No systematic testing strategy" → Solution: "Create TDD workflow pattern"
**If Incorrect Value Function**:
1. Review V_instance/V_meta component definitions
2. Ensure components measure actual quality, not proxies
3. Recalculate scores with corrected definitions
**If Wrong Priorities**:
1. Use gap analysis in evaluation section
2. Prioritize by impact (∆V potential)
3. Defer low-impact items
---
**Prevention**:
1. **Problem analysis before solution**: Spend 20% of iteration time on diagnosis
2. **Root cause identification**: Ask "why" 5 times to find true problem
3. **Impact-based prioritization**: Calculate potential ∆V for each gap
4. **Value function validation**: Ensure components measure real quality
---
**Success Indicators** (how to know fix worked):
- Next iteration shows meaningful progress (∆V ≥ 0.05)
- Problems addressed are root causes, not symptoms
- Value function components correlate with actual quality
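The "meaningful progress" threshold used above (∆V ≥ 0.05) is easy to check mechanically. Below is an illustrative sketch, not part of the BAIME tooling; the score list is invented example data.

```python
# Illustrative check for the ∆V ≥ 0.05 success indicator.
# The scores below are example values, not real iteration output.
MEANINGFUL_DELTA = 0.05

v_instance_by_iteration = [0.35, 0.37, 0.34, 0.52]

def flag_stagnation(scores, threshold=MEANINGFUL_DELTA):
    """Yield (iteration, delta) pairs where progress fell below the threshold."""
    for i in range(1, len(scores)):
        delta = scores[i] - scores[i - 1]
        if delta < threshold:
            yield i, delta

for iteration, delta in flag_stagnation(v_instance_by_iteration):
    print(f"Iteration {iteration}: ∆V_instance = {delta:+.2f} (below threshold)")
```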
## When to Use

### Use This Pattern For

- ✅ Troubleshooting guides (diagnosing and fixing issues)
- ✅ Diagnostic workflows (systematic problem identification)
- ✅ Error recovery (handling failures and restoring service)
- ✅ Optimization guides (identifying and removing bottlenecks)
- ✅ Debugging documentation (finding and fixing bugs)

### Don't Use For

- ❌ Feature documentation (use example-driven or tutorial patterns)
- ❌ Conceptual explanations (use concept explanation pattern)
- ❌ Getting started guides (use progressive disclosure pattern)
## Validation Evidence

### Use 1: BAIME Guide Troubleshooting (Iteration 0-2)
- 3 issues documented: Value scores not improving, Low reusability, Can't reach convergence
- Each issue: Symptoms → Diagnosis → Solution → Prevention
- Pattern emerged from user pain points (anticipated, then validated)
- Result: Users can self-diagnose and solve problems without asking for help
### Use 2: Troubleshooting Guide Template (Iteration 2)
- Template structure: Problem → Diagnosis → Solution → Prevention
- Comprehensive example with symptoms, decision trees, success indicators
- Validated through application to 3 BAIME issues
- Result: Reusable template for creating troubleshooting docs in any domain
### Use 3: Error Recovery Methodology (Iteration 3, second example)
- 13-category error taxonomy
- 8 diagnostic workflows (each: Symptom → Context → Root Cause → Solution)
- 5 recovery patterns (each: Problem → Recovery Strategy → Implementation)
- 8 prevention guidelines
- Result: 95.4% historical error coverage, 23.7% prevention rate
**Pattern Validated**: ✅ 3 uses across BAIME guide, troubleshooting template, error recovery methodology
## Best Practices

### 1. Start With User-Facing Symptoms

**Good** (User Perspective):
```markdown
**Symptoms**: My tests keep failing with "fixture not found" errors
```

**Less Effective** (System Perspective):
```markdown
**Problem**: Fixture loading mechanism is broken
```

**Why**: Users experience symptoms, not internal system states. Starting with symptoms meets users where they are.
### 2. Provide Multiple Root Causes

**Good** (Comprehensive Diagnosis):
```markdown
**Common Causes**:
1. Fixture file missing (check path)
2. Fixture in wrong directory (check structure)
3. Fixture name misspelled (check spelling)
```

**Less Effective** (Single Cause):
```markdown
**Cause**: File not found
```

**Why**: Same symptom can have multiple root causes. Comprehensive diagnosis helps users identify their specific issue.
### 3. Include Concrete Examples

**Good** (Concrete):
```markdown
**Example**:
Iteration 0: V_instance = 0.35
Iteration 1: V_instance = 0.37 (+0.02, minimal)
```

**Less Effective** (Abstract):
```markdown
**Example**: Value scores show little improvement
```

**Why**: Concrete examples help users recognize their situation ("Yes, that's exactly what I'm seeing!").
### 4. Provide Verification Steps

**Good** (Verifiable):
```markdown
**Diagnosis**: Check if value function components measure real quality
**Verify**: Do test coverage improvements correlate with actual test quality?
**Test**: Lower coverage with better tests should score higher than high coverage with brittle tests
```

**Less Effective** (Unverifiable):
```markdown
**Diagnosis**: Value function might be wrong
```

**Why**: Users need concrete steps to verify a diagnosis, not just vague possibilities.
### 5. Include Success Indicators

**Good** (Measurable):
```markdown
**Success Indicators**:
- Next iteration shows ∆V ≥ 0.05 (meaningful progress)
- Problems addressed are root causes
- Value scores correlate with perceived quality
```

**Less Effective** (Vague):
```markdown
**Success**: Things get better
```

**Why**: Users need to know the fix worked. Concrete indicators provide confidence.
### 6. Document Prevention, Not Just Solution

**Good** (Preventive):
```markdown
**Solution**: [Fix current problem]
**Prevention**: Add automated test to catch this class of errors
```

**Less Effective** (Reactive):
```markdown
**Solution**: [Fix current problem]
```

**Why**: Prevention reduces future support burden and improves user experience.
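As a concrete illustration of the "add automated test" prevention step above, a regression test can pin down the class of error so it cannot quietly return. This is a hypothetical sketch; the `fixtures/` directory and file names are invented for the example.

```python
# Hypothetical regression test guarding against the "fixture not found"
# class of errors. The directory layout and file names are examples only.
from pathlib import Path

import pytest

FIXTURE_DIR = Path(__file__).parent / "fixtures"
REQUIRED_FIXTURES = ["users.json", "orders.json"]

@pytest.mark.parametrize("name", REQUIRED_FIXTURES)
def test_required_fixture_exists(name):
    """Prevention: fail fast if a required fixture file goes missing."""
    assert (FIXTURE_DIR / name).is_file(), f"Missing fixture: {name}"
```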
## Variations

### Variation 1: Decision Tree Diagnosis

**Use For**: Complex problems with many potential causes

**Structure**:

**Diagnosis Decision Tree**:
```
Is V_instance improving?
├─ Yes → Check V_meta (see below)
└─ No → Is work addressing root causes?
   ├─ Yes → Check value function definition
   └─ No → Re-prioritize based on gap analysis
```

**Example from BAIME Troubleshooting**: Value score improvement decision tree
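Where the same diagnosis needs to run inside tooling rather than as prose, the tree can be encoded directly. The sketch below mirrors the tree above; the function name and answer strings are assumptions for the example, not part of the BAIME tooling.

```python
# Illustrative encoding of the decision tree above; the answer strings
# are assumptions for the example, not prescribed wording.
def diagnose(v_instance_improving: bool, addressing_root_causes: bool) -> str:
    """Walk the value-score decision tree and return the next check."""
    if v_instance_improving:
        return "Check V_meta"
    if addressing_root_causes:
        return "Check value function definition"
    return "Re-prioritize based on gap analysis"

print(diagnose(v_instance_improving=False, addressing_root_causes=True))
```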
### Variation 2: Before/After Solutions

**Use For**: Demonstrating fix impact

**Structure**:

**Before** (Problem State):
[Code/config/state showing problem]

**After** (Solution State):
[Code/config/state after fix]

**Impact**: [Measurable improvement]
**Example**:

**Before**:
```python
V_instance = 0.37  # Vague calculation
```

**After**:
```python
Coverage, Quality, Maintainability = 0.40, 0.25, 0.40
V_instance = (Coverage + Quality + Maintainability) / 3
# = (0.40 + 0.25 + 0.40) / 3 = 0.35
```

**Impact**: +0.20 accuracy through explicit component breakdown
### Variation 3: Symptom-Cause Matrix
**Use For**: Multiple symptoms mapping to overlapping causes
**Structure**: Table mapping symptoms to likely causes
**Example**:
| Symptom | Likely Cause 1 | Likely Cause 2 | Likely Cause 3 |
|---------|----------------|----------------|----------------|
| V stuck | Wrong priorities | Incorrect value function | Solving symptoms |
| V decreasing | New penalties discovered | Honest reassessment | System evolution broke deliverable |
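If the matrix is consulted by a triage script rather than read by hand, it can be mirrored as a lookup table. A minimal sketch using the two rows above; the dictionary and function names are assumptions.

```python
# Mirror of the symptom-cause matrix above as a lookup table.
SYMPTOM_CAUSES = {
    "V stuck": ["Wrong priorities", "Incorrect value function", "Solving symptoms"],
    "V decreasing": [
        "New penalties discovered",
        "Honest reassessment",
        "System evolution broke deliverable",
    ],
}

def likely_causes(symptom: str) -> list[str]:
    """Return candidate root causes for a symptom, in matrix order."""
    return SYMPTOM_CAUSES.get(symptom, [])

print(likely_causes("V stuck"))
```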
### Variation 4: Diagnostic Workflow
**Use For**: Systematic problem investigation
**Structure**: Step-by-step investigation process
**Example from Error Recovery**:
1. **Symptom identification**: What error occurred?
2. **Context gathering**: When? Where? Under what conditions?
3. **Root cause analysis**: Why did it occur? (5 Whys)
4. **Solution selection**: Which recovery pattern applies?
5. **Implementation**: Apply solution with verification
6. **Prevention**: Add safeguards to prevent recurrence
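For teams that log these investigations, the six steps can be captured as a lightweight record so none are skipped. The sketch below is illustrative only; the field names and example values are assumptions, not part of the error recovery methodology.

```python
# Illustrative record for the six-step diagnostic workflow above.
# Field names and example values are assumptions.
from dataclasses import dataclass, field

@dataclass
class DiagnosticRecord:
    symptom: str                                         # 1. What error occurred?
    context: str                                         # 2. When? Where? Under what conditions?
    root_cause: str = ""                                 # 3. Why did it occur? (5 Whys)
    recovery_pattern: str = ""                           # 4. Which recovery pattern applies?
    verified: bool = False                               # 5. Solution applied and verified?
    prevention: list[str] = field(default_factory=list)  # 6. Safeguards added

record = DiagnosticRecord(
    symptom="Tests fail with 'fixture not found'",
    context="CI only, after a directory restructure",
)
record.root_cause = "Fixture path hard-coded to the old layout"
record.recovery_pattern = "Fix the path and add an existence check"
record.verified = True
record.prevention.append("Regression test asserting fixture files exist")
print(record)
```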
---
## Common Mistakes
### Mistake 1: Starting With Solution Instead of Problem
**Bad**:
```markdown
## Use This New Feature
[Feature explanation]
```

**Good**:
```markdown
## Problem: Can't Quickly Reference Commands
**Symptoms**: Spend 5+ minutes searching docs for syntax
**Solution**: Use Quick Reference (this new feature)
```

**Why**: Users care about solving problems, not learning features for their own sake.
### Mistake 2: Diagnosis Without Verification Steps

**Bad**:
```markdown
**Diagnosis**: Value function might be wrong
```

**Good**:
```markdown
**Diagnosis**: Value function definition incorrect
**Verify**:
1. Review component definitions
2. Test: Do component scores correlate with perceived quality?
3. Check: Would high-quality deliverable score high?
```

**Why**: Users need concrete steps to confirm the diagnosis.
### Mistake 3: Solution Without Context

**Bad**:
```markdown
**Solution**: Recalculate V_instance with corrected formula
```

**Good**:
```markdown
**Solution** (If value function definition incorrect):
1. Review V_instance component definitions in iteration-0.md
2. Ensure components measure actual quality (not proxies)
3. Recalculate all historical scores with corrected definition
4. Update system-state.md with corrected values
```

**Why**: Context-free solutions are hard to apply correctly.
### Mistake 4: No Prevention Guidance

**Bad**: Only provides fix for current problem

**Good**: Provides fix + prevention strategy

**Why**: Prevention reduces recurring issues and support burden.
## Related Patterns

**Example-Driven Explanation**: Use examples to illustrate both problem and solution states
- Problem Example: "This is what goes wrong"
- Solution Example: "This is what it looks like when fixed"
**Progressive Disclosure**: Structure troubleshooting in layers
- Quick Fixes: Common issues (80% of cases)
- Diagnostic Guide: Systematic investigation
- Deep Troubleshooting: Edge cases and complex issues
**Decision Trees**: Structured diagnosis for complex problems
- Each decision point: Symptom → Question → Branch to cause/solution
## Transferability Assessment

**Domains Validated**:
- ✅ BAIME troubleshooting (methodology improvement)
- ✅ Template creation (troubleshooting guide template)
- ✅ Error recovery (comprehensive diagnostic workflows)
**Cross-Domain Applicability**: 100%
- Pattern works for any problem-solving documentation
- Applies to software errors, system failures, user issues, process problems
- Universal structure: Problem → Diagnosis → Solution → Prevention
**Adaptation Effort**: 0%
- Pattern applies as-is to all troubleshooting domains
- Content changes (specific problems/solutions), structure identical
- No modifications needed for different domains
**Evidence**:
- Software error recovery: 13 error categories, 8 diagnostic workflows
- Methodology troubleshooting: 3 BAIME issues, each with full problem-solution structure
- Template reuse: Troubleshooting guide template used for diverse domains
## Summary

**Pattern**: Problem → Diagnosis → Solution → Prevention

**When**: Troubleshooting, error recovery, diagnostic guides, optimization

**Why**: Users come with problems, not feature curiosity. Meeting users at their problem state improves discoverability and satisfaction.

**Structure**:
- **Symptoms**: Observable user-facing issues
- **Diagnosis**: Root cause identification with verification
- **Solution**: Actionable fix with success indicators
- **Prevention**: How to avoid the problem in the future
**Validation**: ✅ 3+ uses (BAIME troubleshooting, troubleshooting template, error recovery)

**Transferability**: 100% (universal across all problem-solving documentation)

**Best Practices**:
- Start with user symptoms, not system internals
- Provide multiple root causes with verification steps
- Include concrete examples users can recognize
- Document prevention, not just reactive fixes
- Add success indicators so users know fix worked
**Pattern Version**: 1.0
**Extracted**: Iteration 3 (2025-10-19)
**Status**: ✅ Validated and ready for reuse