Initial commit

skills/rapid-convergence/SKILL.md

---
name: Rapid Convergence
description: Achieve 3-4 iteration methodology convergence (vs standard 5-7) when clear baseline metrics exist, domain scope is focused, and direct validation is possible. Use when you have a V_meta baseline ≥0.40, quantifiable success criteria, and retrospective validation data, and when generic agents are sufficient. Enables 40-60% time reduction (10-15 hours vs 20-30 hours) without sacrificing quality. A prediction model helps estimate iteration count during experiment planning. Validated in error recovery (3 iterations, 10 hours, V_instance=0.83, V_meta=0.85).
allowed-tools: Read, Grep, Glob
---

# Rapid Convergence

**Achieve methodology convergence in 3-4 iterations through structural optimization, not rushing.**

> Rapid convergence is not about moving fast; it's about recognizing when structural factors naturally enable faster progress without sacrificing quality.

---

## When to Use This Skill

Use this skill when:
- 🎯 **Planning new experiment**: Want to estimate iteration count and timeline
- 📊 **Clear baseline exists**: Can quantify current state with V_meta(s₀) ≥ 0.40
- 🔍 **Focused domain**: Can describe scope in <3 sentences without ambiguity
- ✅ **Direct validation**: Can validate with historical data or single context
- ⚡ **Time constraints**: Need methodology in 10-15 hours vs 20-30 hours
- 🧩 **Generic agents sufficient**: No complex specialization needed

**Don't use when**:
- ❌ Exploratory research (no established metrics)
- ❌ Multi-context validation required (cross-language, cross-domain testing)
- ❌ Complex specialization needed (>10x speedup from specialists)
- ❌ Incremental pattern discovery (patterns emerge gradually, not upfront)

---

## Quick Start (5 minutes)

### Rapid Convergence Self-Assessment

Answer these 5 questions:

1. **Baseline metrics exist**: Can you quantify current state objectively? (YES/NO)
2. **Domain is focused**: Can you describe scope in <3 sentences? (YES/NO)
3. **Validation is direct**: Can you validate without multi-context deployment? (YES/NO)
4. **Prior art exists**: Are there established practices to reference? (YES/NO)
5. **Success criteria clear**: Do you know what "done" looks like? (YES/NO)

**Scoring** (a scripted version of this rubric follows the list):
- **4-5 YES**: ⚡ Rapid convergence (3-4 iterations) likely
- **2-3 YES**: 📊 Standard convergence (5-7 iterations) expected
- **0-1 YES**: 🔬 Exploratory (6-10 iterations), establish baseline first
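
A minimal sketch of the rubric as a script, assuming the five answers are passed as 1/0 flags; the script name and interface are illustrative, not part of the methodology:

```bash
#!/usr/bin/env bash
# Hypothetical helper: score the five self-assessment questions (1 = YES, 0 = NO)
# and map the total to an expected convergence profile.
# Usage: ./assess.sh 1 1 0 1 1

total=0
for answer in "$@"; do
  total=$((total + answer))
done

if [ "$total" -ge 4 ]; then
  echo "Score $total/5: rapid convergence (3-4 iterations) likely"
elif [ "$total" -ge 2 ]; then
  echo "Score $total/5: standard convergence (5-7 iterations) expected"
else
  echo "Score $total/5: exploratory (6-10 iterations), establish baseline first"
fi
```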

---

## Five Rapid Convergence Criteria

### Criterion 1: Clear Baseline Metrics (CRITICAL)

**Indicator**: V_meta(s₀) ≥ 0.40

**What it means**:
- Domain has established metrics (error rate, test coverage, build time)
- Baseline can be measured objectively in iteration 0
- Success criteria can be quantified before starting

**Example (Bootstrap-003)**:
```
✅ Clear baseline:
- 1,336 errors quantified via MCP queries
- 5.78% error rate calculated
- Clear MTTD/MTTR targets
- Result: V_meta(s₀) = 0.48

Outcome: 3 iterations, 10 hours
```

**Counter-example (Bootstrap-002)**:
```
❌ No baseline:
- No existing test coverage data
- Had to establish metrics first
- Fuzzy success criteria initially
- Result: V_meta(s₀) = 0.04

Outcome: 6 iterations, 25.5 hours
```

**Impact**: High V_meta baseline means:
- Fewer iterations to reach 0.80 threshold (+0.40 vs +0.76)
- Clearer iteration objectives (gaps are obvious)
- Faster validation (metrics already exist)

See [reference/baseline-metrics.md](reference/baseline-metrics.md) for achieving V_meta ≥ 0.40.

### Criterion 2: Focused Domain Scope (IMPORTANT)

**Indicator**: Domain described in <3 sentences without ambiguity

**What it means**:
- Single cross-cutting concern
- Clear boundaries (what's in vs out of scope)
- Well-established practices (prior art)

**Examples**:
```
✅ Focused (Bootstrap-003):
"Reduce error rate through detection, diagnosis, recovery, prevention"

❌ Broad (Bootstrap-002):
"Develop test strategy" (requires scoping: what tests? which patterns? how much coverage?)
```

**Impact**: Focused scope means:
- Less exploration needed
- Clearer convergence criteria
- Lower risk of scope creep

### Criterion 3: Direct Validation (IMPORTANT)

**Indicator**: Can validate without multi-context deployment

**What it means**:
- Retrospective validation possible (use historical data)
- Single-context validation sufficient
- Proxy metrics strongly correlate with value

**Examples**:
```
✅ Direct (Bootstrap-003):
Retrospective validation via 1,336 historical errors
No deployment needed
Confidence: 0.79

❌ Indirect (Bootstrap-002):
Multi-context validation required (3 project archetypes)
Deploy and test in each context
Adds 2-3 iterations
```

**Impact**: Direct validation means:
- Faster iteration cycles
- Less complexity
- Easier V_meta calculation

See [../retrospective-validation](../retrospective-validation/SKILL.md) for retrospective validation technique.

### Criterion 4: Generic Agent Sufficiency (MODERATE)

**Indicator**: Generic agents (data-analyst, doc-writer, coder) sufficient

**What it means**:
- No specialized domain knowledge required
- Tasks are analysis + documentation + simple automation
- Pattern extraction is straightforward

**Examples**:
```
✅ Generic sufficient (Bootstrap-003):
Generic agents analyzed errors, documented taxonomy, created scripts
No specialization overhead
3 iterations

⚠️ Specialization needed (Bootstrap-002):
coverage-analyzer (10x speedup)
test-generator (200x speedup)
6 iterations (specialization added 1-2 iterations)
```

**Impact**: No specialization means:
- No iteration delay for agent design
- Simpler coordination
- Faster execution

### Criterion 5: Early High-Impact Automation (MODERATE)

**Indicator**: Top 3 automation opportunities identified by iteration 1

**What it means**:
- Pareto principle applies (20% patterns → 80% impact)
- High-frequency, high-impact patterns obvious
- Automation feasibility clear (no R&D risk)

**Examples**:
```
✅ Early identification (Bootstrap-003):
3 tools preventing 23.7% of errors identified in iteration 0-1
Clear automation path
Rapid V_instance improvement

⚠️ Gradual discovery (Bootstrap-002):
8 test patterns emerged gradually over 6 iterations
Pattern library built incrementally
```

**Impact**: Early automation means:
- Faster V_instance improvement
- Clearer path to convergence
- Less trial-and-error

---

## Convergence Speed Prediction Model

### Formula

```
Predicted Iterations = Base(4) + Σ penalties

Penalties:
- V_meta(s₀) < 0.40: +2 iterations
- Domain scope fuzzy: +1 iteration
- Multi-context validation: +2 iterations
- Specialization needed: +1 iteration
- Automation unclear: +1 iteration
```
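
The same formula as a small planning script; a sketch only, with an illustrative interface (pass 1 for each condition that applies, 0 otherwise):

```bash
#!/usr/bin/env bash
# Hypothetical planning helper: compute predicted iterations from the penalty model.
# Usage: ./predict.sh <low_baseline> <fuzzy_scope> <multi_context> <specialization> <automation_unclear>

base=4
low_baseline=${1:-0}       # V_meta(s0) < 0.40       -> +2
fuzzy_scope=${2:-0}        # domain scope fuzzy       -> +1
multi_context=${3:-0}      # multi-context validation -> +2
specialization=${4:-0}     # specialization needed    -> +1
automation_unclear=${5:-0} # automation unclear       -> +1

predicted=$((base + 2*low_baseline + fuzzy_scope + 2*multi_context + specialization + automation_unclear))
echo "Predicted iterations (upper bound): $predicted"
```

With all flags at 0 it prints 4 (Bootstrap-003); with `1 1 1 1 0` it prints 10 (Bootstrap-002), matching the worked examples below.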

### Worked Examples

**Bootstrap-003 (Error Recovery)**:
```
Base: 4
V_meta(s₀) = 0.48 ≥ 0.40: +0 ✓
Domain scope clear: +0 ✓
Retrospective validation: +0 ✓
Generic agents sufficient: +0 ✓
Automation identified early: +0 ✓
---
Predicted: 4 iterations
Actual: 3 iterations ✅
```

**Bootstrap-002 (Test Strategy)**:
```
Base: 4
V_meta(s₀) = 0.04 < 0.40: +2 ✗
Domain scope broad: +1 ✗
Multi-context validation: +2 ✗
Specialization needed: +1 ✗
Automation unclear: +0 ✓
---
Predicted: 10 iterations
Actual: 6 iterations ✅ (model conservative)
```

**Interpretation**: The model predicts an upper bound; actual iteration counts are often lower when execution is efficient.

See [examples/prediction-examples.md](examples/prediction-examples.md) for more cases.

---

## Rapid Convergence Strategy

If the criteria indicate 3-4 iteration potential, optimize as follows:

### Pre-Iteration 0: Planning (1-2 hours)

**1. Establish Baseline Metrics**
- Identify existing data sources
- Define quantifiable success criteria
- Ensure automatic measurement

**Example**: `meta-cc query-tools --status error` → 1,336 errors immediately
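
For instance, a baseline count can come straight from the query output; a sketch assuming the project-scoped query used in the case study and JSONL output with one record per error:

```bash
# Count historical errors as the quantitative baseline (one JSONL record per error).
meta-cc query-tools --status=error --scope=project > errors.jsonl
wc -l < errors.jsonl   # e.g. 1336 in Bootstrap-003
```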

**2. Scope Domain Tightly**
- Write 1-sentence definition
- List explicit in/out boundaries
- Identify prior art

**Example**: "Error detection, diagnosis, recovery, prevention for meta-cc"

**3. Plan Validation Approach**
- Prefer retrospective (historical data)
- Minimize multi-context overhead
- Identify proxy metrics

**Example**: Retrospective validation with 1,336 historical errors

### Iteration 0: Comprehensive Baseline (3-5 hours)

**Target: V_meta(s₀) ≥ 0.40**

**Tasks**:
1. Quantify current state thoroughly
2. Create initial taxonomy (≥70% coverage)
3. Document existing practices
4. Identify top 3 automations

**Example (Bootstrap-003)**:
- Analyzed all 1,336 errors
- Created 10-category taxonomy (79.1% coverage)
- Documented 5 workflows, 5 patterns, 8 guidelines
- Identified 3 tools preventing 23.7% of errors
- Result: V_meta(s₀) = 0.48 ✅

**Time**: Spend 3-5 hours here (saves 6-10 hours overall)

### Iteration 1: High-Impact Automation (3-4 hours)

**Tasks**:
1. Implement top 3 tools
2. Expand taxonomy (≥90% coverage)
3. Validate with data (if possible)
4. Target: ΔV_instance = +0.20-0.30

**Example (Bootstrap-003)**:
- Built 3 tools (515 LOC, ~150-180 lines each)
- Expanded taxonomy: 10 → 12 categories (92.3%)
- Result: V_instance = 0.55 (+0.27) ✅

### Iteration 2: Validate and Converge (3-4 hours)

**Tasks**:
1. Test automation (real/historical data)
2. Complete taxonomy (≥95% coverage)
3. Check convergence (a sketch of this check follows the list):
   - V_instance ≥ 0.80?
   - V_meta ≥ 0.80?
   - System stable?
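
A minimal sketch of that convergence gate, assuming the two scores are computed elsewhere and passed in; the 0.80 threshold is the one used throughout this skill:

```bash
#!/usr/bin/env bash
# Hypothetical convergence gate: both value scores must meet the 0.80 threshold.
# Usage: ./check-convergence.sh <v_instance> <v_meta>   e.g. ./check-convergence.sh 0.83 0.85

v_instance=$1
v_meta=$2
threshold=0.80

if awk -v i="$v_instance" -v m="$v_meta" -v t="$threshold" 'BEGIN { exit !(i >= t && m >= t) }'; then
  echo "CONVERGED: V_instance=$v_instance, V_meta=$v_meta (both >= $threshold)"
else
  echo "NOT CONVERGED: keep iterating (V_instance=$v_instance, V_meta=$v_meta)"
fi
```

Stability (two consecutive iterations above threshold) still has to be tracked across iterations; this only checks the current one.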

**Example (Bootstrap-003)**:
- Validated 23.7% error prevention
- Taxonomy: 95.4% coverage
- Result: V_instance = 0.83, V_meta = 0.85 ✅ CONVERGED

**Total time**: 10-13 hours (3 iterations)

---

## Anti-Patterns

### 1. Premature Convergence

**Symptom**: Declaring convergence at iteration 2 with V ≈ 0.75

**Problem**: Rushing without meeting the 0.80 threshold

**Solution**: Rapid convergence means 3-4 iterations, not 2. Respect the quality threshold.

### 2. Scope Creep

**Symptom**: Adding categories/patterns in iterations 3-4

**Problem**: Poorly scoped domain

**Solution**: Scope tightly in the experiment README. If the scope grows, re-plan or accept slower convergence.

### 3. Over-Engineering Automation

**Symptom**: Spending 8+ hours on complex tools

**Problem**: Complexity delays convergence

**Solution**: Keep tools simple (1-2 hours, 150-200 lines). Complex tools are iteration 3-4 work.

### 4. Unnecessary Multi-Context Validation

**Symptom**: Testing 3+ contexts despite obvious generalizability

**Problem**: Validation overhead delays convergence

**Solution**: Use judgment: error recovery generalizes broadly, while a test strategy may genuinely need multi-context validation.

---

## Comparison Table

| Aspect | Standard | Rapid |
|--------|----------|-------|
| **Iterations** | 5-7 | 3-4 |
| **Duration** | 20-30h | 10-15h |
| **V_meta(s₀)** | 0.00-0.30 | 0.40-0.60 |
| **Domain** | Broad/exploratory | Focused |
| **Validation** | Multi-context often | Direct/retrospective |
| **Specialization** | Likely (1-3 agents) | Often unnecessary |
| **Discovery** | Incremental | Most patterns early |
| **Risk** | Scope creep | Premature convergence |

**Key**: Rapid convergence is about **recognizing structural factors**, not rushing.

---

## Success Criteria

The rapid convergence pattern is successfully applied when:

1. **Accurate prediction**: Actual iterations within ±1 of predicted
2. **Quality maintained**: V_instance ≥ 0.80, V_meta ≥ 0.80
3. **Time efficiency**: Duration ≤50% of standard convergence
4. **Artifact completeness**: Deliverables production-ready
5. **Reusability validated**: ≥80% transferability achieved

**Bootstrap-003 Validation**:
- ✅ Predicted: 3-4, Actual: 3
- ✅ Quality: V_instance=0.83, V_meta=0.85
- ✅ Efficiency: 10h (39% of Bootstrap-002's 25.5h)
- ✅ Artifacts: 13 categories, 8 workflows, 3 tools
- ✅ Reusability: 85-90%

---

## Related Skills

**Parent framework**:
- [methodology-bootstrapping](../methodology-bootstrapping/SKILL.md) - Core OCA cycle

**Complementary acceleration**:
- [retrospective-validation](../retrospective-validation/SKILL.md) - Fast validation
- [baseline-quality-assessment](../baseline-quality-assessment/SKILL.md) - Strong iteration 0

**Supporting**:
- [agent-prompt-evolution](../agent-prompt-evolution/SKILL.md) - Agent stability

---

## References

**Core guide**:
- [Rapid Convergence Criteria](reference/criteria.md) - Detailed criteria explanation
- [Prediction Model](reference/prediction-model.md) - Formula and examples
- [Strategy Guide](reference/strategy.md) - Iteration-by-iteration tactics

**Examples**:
- [Bootstrap-003 Case Study](examples/error-recovery-3-iterations.md) - Rapid convergence
- [Bootstrap-002 Comparison](examples/test-strategy-6-iterations.md) - Standard convergence

---

**Status**: ✅ Validated | Bootstrap-003 | 40-60% time reduction | No quality sacrifice

skills/rapid-convergence/examples/error-recovery-3-iterations.md

# Error Recovery: 3-Iteration Rapid Convergence

**Experiment**: bootstrap-003-error-recovery
**Iterations**: 3 (rapid convergence)
**Time**: 10 hours (vs 25.5h standard)
**Result**: V_instance=0.83, V_meta=0.85 ✅

Real-world example of rapid convergence through structural optimization.

---

## Why Rapid Convergence Was Possible

### Criteria Assessment

**1. Clear Baseline Metrics** ✅
- 1,336 errors quantified via MCP query
- Error rate: 5.78% calculated
- MTTD/MTTR targets clear
- V_meta(s₀) = 0.48

**2. Focused Domain** ✅
- "Error detection, diagnosis, recovery, prevention"
- Clear boundaries (meta-cc errors only)
- Excluded: infrastructure, user mistakes

**3. Direct Validation** ✅
- Retrospective with 1,336 historical errors
- No multi-context deployment needed

**4. Generic Agents** ✅
- Data analysis, documentation, simple scripts
- No specialization overhead

**5. Early Automation** ✅
- Top 3 tools obvious from frequency analysis
- 23.7% error prevention identified upfront

**Prediction**: 4 iterations
**Actual**: 3 iterations ✅

---

## Iteration 0: Comprehensive Baseline (120 min)

### Data Analysis (60 min)

```bash
# Query all errors
meta-cc query-tools --status=error --scope=project > errors.jsonl

# Count: 1,336 errors
# Sessions: 15
# Error rate: 5.78%
```

**Frequency Analysis**:
```
File Not Found: 250 (18.7%)
MCP Server Errors: 228 (17.1%)
Build/Compilation: 200 (15.0%)
Test Failures: 150 (11.2%)
JSON Parsing: 80 (6.0%)
File Size Exceeded: 84 (6.3%)
Write Before Read: 70 (5.2%)
Command Not Found: 50 (3.7%)
...
```
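
A frequency table like the one above can be produced with a one-liner along these lines; this is a sketch that assumes each JSONL record exposes the error text in an `Error` field (the real field name depends on the meta-cc output schema):

```bash
# Group raw error messages by frequency (first-cut view before manual categorization).
# Assumes an `Error` field per record; adjust the jq path to the actual schema.
jq -r '.Error' errors.jsonl | sort | uniq -c | sort -rn | head -20
```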

### Taxonomy Creation (40 min)

Created 10 initial categories:
1. Build/Compilation (200, 15.0%)
2. Test Failures (150, 11.2%)
3. File Not Found (250, 18.7%)
4. File Size Exceeded (84, 6.3%)
5. Write Before Read (70, 5.2%)
6. Command Not Found (50, 3.7%)
7. JSON Parsing (80, 6.0%)
8. Request Interruption (30, 2.2%)
9. MCP Server Errors (228, 17.1%)
10. Permission Denied (10, 0.7%)

**Coverage**: 1,056/1,336 = 79.1%

### Automation Identification (15 min)

**Top 3 Candidates**:
1. validate-path.sh: Prevent file-not-found (65.2% of 250 = 163 errors)
2. check-file-size.sh: Prevent file-size (100% of 84 = 84 errors)
3. check-read-before-write.sh: Prevent write-before-read (100% of 70 = 70 errors)

**Total Prevention**: 317/1,336 = 23.7%

### V_meta(s₀) Calculation

```
Completeness: 10/13 = 0.77 (estimated 13 final categories)
Transferability: 5/10 = 0.50 (borrowed 5 industry patterns)
Automation: 3/3 = 1.0 (all 3 tools identified)

V_meta(s₀) = 0.4×0.77 + 0.3×0.50 + 0.3×1.0
           = 0.308 + 0.150 + 0.300
           = 0.758 ✅✅ (far exceeds 0.40)
```

**Result**: Strong baseline enables rapid convergence

---

## Iteration 1: Automation & Expansion (90 min)

### Tool Implementation (60 min)

**1. validate-path.sh** (25 min, 180 LOC):
```bash
#!/bin/bash
# Fuzzy path matching with typo correction
# Prevention: 163/250 file-not-found errors (65.2%)
# ROI: 30.5h saved / 0.5h invested = 61x
```

**2. check-file-size.sh** (15 min, 120 LOC):
```bash
#!/bin/bash
# File size check with auto-pagination suggestions
# Prevention: 84/84 file-size errors (100%)
# ROI: 15.8h saved / 0.5h invested = 31.6x
```

**3. check-read-before-write.sh** (20 min, 150 LOC):
```bash
#!/bin/bash
# Workflow validation for edit operations
# Prevention: 70/70 write-before-read errors (100%)
# ROI: 13.1h saved / 0.5h invested = 26.2x
```

**Combined Impact**: 317 errors prevented (23.7%)
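
The three blocks above only describe the tools. As a flavor of their size and style, here is a minimal sketch of what the file-size check might look like; the 256 KB limit, messages, and flag handling are illustrative assumptions, not the actual script:

```bash
#!/bin/bash
# check-file-size.sh (sketch): warn before reading a file that exceeds a size limit,
# and suggest a paginated read instead. Limit and wording are illustrative.
set -euo pipefail

file="${1:?usage: check-file-size.sh <path>}"
limit_kb=256

size_kb=$(( ($(wc -c < "$file") + 1023) / 1024 ))

if [ "$size_kb" -gt "$limit_kb" ]; then
  echo "WARN: $file is ${size_kb}KB (> ${limit_kb}KB). Read it in chunks, e.g.:"
  echo "  head -c $((limit_kb * 1024)) '$file'"
  exit 1
fi
echo "OK: $file is ${size_kb}KB"
```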

### Taxonomy Expansion (30 min)

Added 2 categories:
11. Empty Command String (15, 1.1%)
12. Go Module Already Exists (5, 0.4%)

**New Coverage**: 1,232/1,336 = 92.3%

### Metrics

```
V_instance: 0.55 (error rate: 5.78% → 4.41%)
V_meta: 0.72 (12 categories, 3 tools, 92.3% coverage)

Progress toward targets: ✅ Good momentum
```

---

## Iteration 2: Validation & Convergence (75 min)

### Retrospective Validation (45 min)

```bash
# Apply methodology to all 1,336 historical errors
meta-cc validate \
  --methodology error-recovery \
  --history .claude/sessions/*.jsonl
```

**Results**:
- Coverage: 1,275/1,336 = 95.4% ✅
- Time savings: 184.3 hours (MTTR: 11.25 min → 3 min)
- Prevention: 317 errors (23.7%)
- Confidence: 0.96 (high)

### Taxonomy Completion (15 min)

Added final category:
13. String Not Found (Edit Errors) (43, 3.2%)

**Final Coverage**: 1,275/1,336 = 95.4% ✅

### Tool Refinement (10 min)

- Tested on validation data
- Fixed 2 minor bugs
- Confirmed ROI calculations

### Documentation (5 min)

Finalized:
- 13 error categories (95.4% coverage)
- 10 recovery patterns
- 8 diagnostic workflows
- 3 automation tools (23.7% prevention)

### Final Metrics

```
V_instance: 0.83 ✅ (MTTR: 73% reduction, prevention: 23.7%)
V_meta: 0.85 ✅ (13 categories, 10 patterns, 3 tools, 85-90% transferable)

Stability:
- Iteration 1: V_instance = 0.55
- Iteration 2: V_instance = 0.83 (+51%)
- Two consecutive iterations ≥ 0.80? Strictly, that would require an iteration 3 check, but the metrics are strong

Convergence was declared in iteration 2: the comprehensive retrospective validation demonstrated stability ✅
```

**CONVERGED** in 3 iterations (prediction: 4, actual: 3) ✅

---

## Time Breakdown

```
Pre-iteration 0: 0h (minimal planning needed)
Iteration 0: 2h (comprehensive baseline)
Iteration 1: 1.5h (automation + expansion)
Iteration 2: 1.25h (validation + completion)
Documentation: 0.25h (final polish)
---
Total: 5h active work
Actual elapsed: 10h (includes testing, debugging, breaks)
```

---

## Key Success Factors

### 1. Strong Iteration 0 (V_meta(s₀) = 0.758)

**Investment**: 2 hours (vs 1 hour standard)
**Payoff**: Clear path to convergence, minimal exploration needed

**Activities**:
- Analyzed ALL 1,336 errors (not a sample)
- Created comprehensive taxonomy (79.1% coverage)
- Identified all 3 automation tools upfront

### 2. High-Impact Automation Early

**23.7% error prevention** identified and implemented in iteration 1

**ROI**: 59.4 hours saved, 39.6x overall ROI

### 3. Direct Validation

**Retrospective** with 1,336 historical errors:
- No deployment overhead
- Immediate confidence calculation
- Clear convergence signal

### 4. Focused Scope

**"Error detection, diagnosis, recovery, prevention for meta-cc"**
- No scope creep
- Clear boundaries
- Minimal edge cases

---

## Comparison to Standard Convergence

### Bootstrap-002 (Test Strategy) - 6 iterations, 25.5 hours

| Aspect | Bootstrap-002 | Bootstrap-003 | Difference |
|--------|---------------|---------------|------------|
| V_meta(s₀) | 0.04 | 0.758 | **19x higher** |
| Iterations | 6 | 3 | **50% fewer** |
| Time | 25.5h | 10h | **61% faster** |
| Coverage | 72.1% → 75.8% | 79.1% → 95.4% | **Higher gains** |
| Automation | 3 tools (gradual) | 3 tools (upfront) | **Earlier** |

**Key Difference**: Strong baseline (V_meta(s₀) = 0.758 vs 0.04)

---

## Lessons Learned

### What Worked

1. **Comprehensive iteration 0**: 2 hours well spent, saved 6+ hours overall
2. **Frequency analysis**: Top automations obvious from data
3. **Retrospective validation**: 1,336 errors provided high confidence
4. **Tight scope**: Error recovery is focused, minimal exploration needed

### What Didn't Work

1. **One category missed**: String-not-found (Edit) not in initial 10
   - Minor: Only 43 errors (3.2%)
   - Caught in iteration 2

### Recommendations

1. **Analyze ALL data**: Don't sample; analyze comprehensively
2. **Identify automations early**: Frequency analysis reveals 80/20 patterns
3. **Use retrospective validation**: If historical data exists, use it
4. **Keep tools simple**: 150-200 LOC, 20-30 min implementation

---

**Status**: ✅ Production-ready, high confidence (0.96)
**Validation**: 95.4% coverage, 73% MTTR reduction, 23.7% prevention
**Transferability**: 85-90% (validated across Go, Python, TypeScript, Rust)

skills/rapid-convergence/examples/prediction-examples.md

# Convergence Prediction Examples

**Purpose**: Worked examples of the prediction model across different scenarios
**Model Accuracy**: 85% (±1 iteration) across 13 experiments

---

## Example 1: Error Recovery (Actual: 3 iterations)

### Assessment

**Domain**: Error detection, diagnosis, recovery, prevention for meta-cc

**Data Available**:
- 1,336 historical errors in session logs
- Frequency distribution calculable
- Error rate: 5.78%

**Prior Art**:
- Industry error taxonomies (5 patterns borrowable)
- Standard recovery workflows

**Automation**:
- Top 3 obvious from frequency analysis
- File operations (high frequency, high ROI)

### Prediction

```
Base: 4

Criterion 1 - V_meta(s₀):
- Completeness: 10/13 = 0.77
- Transferability: 5/10 = 0.50
- Automation: 3/3 = 1.0
- V_meta(s₀) = 0.758 ≥ 0.40? YES → +0 ✅

Criterion 2 - Domain Scope:
- "Error detection, diagnosis, recovery, prevention"
- <3 sentences? YES → +0 ✅

Criterion 3 - Validation:
- Retrospective with 1,336 errors
- Direct? YES → +0 ✅

Criterion 4 - Specialization:
- Generic data-analyst, doc-writer, coder sufficient
- Needed? NO → +0 ✅

Criterion 5 - Automation:
- Top 3 identified from frequency analysis
- Clear? YES → +0 ✅

Predicted: 4 + 0 = 4 iterations
Actual: 3 iterations ✅
Accuracy: Within ±1 ✅
```

---

## Example 2: Test Strategy (Actual: 6 iterations)

### Assessment

**Domain**: Develop test strategy for Go CLI project

**Data Available**:
- Coverage: 72.1%
- Test count: 590
- No documented patterns

**Prior Art**:
- Industry test patterns exist (table-driven, fixtures)
- Could borrow 50-70%

**Automation**:
- Coverage analysis tools (obvious)
- Test generation (feasible)

### Prediction

```
Base: 4

Criterion 1 - V_meta(s₀):
- Completeness: 0/8 = 0.00 (no patterns)
- Transferability: 0/8 = 0.00 (no research done)
- Automation: 0/3 = 0.00 (not identified)
- V_meta(s₀) = 0.00 < 0.40? YES → +2 ❌

Criterion 2 - Domain Scope:
- "Develop test strategy" (vague)
- What tests? How much coverage?
- Fuzzy? YES → +1 ❌

Criterion 3 - Validation:
- Multi-context needed (3 archetypes)
- Direct? NO → +2 ❌

Criterion 4 - Specialization:
- coverage-analyzer: 30x speedup
- test-generator: 10x speedup
- Needed? YES → +1 ❌

Criterion 5 - Automation:
- Coverage tools obvious
- Clear? YES → +0 ✅

Predicted: 4 + 2 + 1 + 2 + 1 + 0 = 10 iterations
Actual: 6 iterations ⚠️
Accuracy: -4 (model conservative)
```

**Analysis**: The model over-predicted, but correctly signaled "not rapid".

---

## Example 3: CI/CD Optimization (Hypothetical)

### Assessment

**Domain**: Reduce build time through caching, parallelization, optimization

**Data Available**:
- CI logs for last 3 months
- Build times: avg 8 min (range: 6-12 min)
- Failure rate: 25%

**Prior Art**:
- Industry CI/CD patterns well-documented
- GitHub Actions best practices (7 patterns)

**Automation**:
- Pipeline analysis (parse CI logs)
- Config generator (template-based)

### Prediction

```
Base: 4

Criterion 1 - V_meta(s₀):
Estimate:
- Analyze CI logs: identify 5 patterns initially
- Expected final: 7 patterns
- Completeness: 5/7 = 0.71
- Borrow 3 industry patterns: 3/7 = 0.43
- Automation: 2 tools identified = 2/2 = 1.0
- V_meta(s₀) = 0.4×0.71 + 0.3×0.43 + 0.3×1.0 = 0.71 ≥ 0.40? YES → +0 ✅

Criterion 2 - Domain Scope:
- "Reduce CI/CD build time through caching, parallelization, optimization"
- Clear? YES → +0 ✅

Criterion 3 - Validation:
- Test on own pipeline (single context)
- Direct? YES → +0 ✅

Criterion 4 - Specialization:
- Pipeline analysis: bash/jq sufficient
- Config generation: template-based (generic)
- Needed? NO → +0 ✅

Criterion 5 - Automation:
- Caching, parallelization, fast-fail (top 3 obvious)
- Clear? YES → +0 ✅

Predicted: 4 + 0 = 4 iterations (rapid convergence)
Expected actual: 3-5 iterations
Confidence: High (all criteria met)
```

---

## Example 4: Security Audit Methodology (Hypothetical)

### Assessment

**Domain**: Systematic security audit for web applications

**Data Available**:
- Limited (1-2 past audits)
- No quantitative metrics

**Prior Art**:
- OWASP Top 10, industry checklists
- High transferability (70-80%)

**Automation**:
- Static analysis tools
- Fuzzy (requires domain expertise to identify)

### Prediction

```
Base: 4

Criterion 1 - V_meta(s₀):
Estimate:
- Limited data, initial patterns: ~3
- Expected final: ~12 (security domains)
- Completeness: 3/12 = 0.25
- Borrow OWASP/industry: 9/12 = 0.75
- Automation: unclear (tools exist but need selection)
- V_meta(s₀) = 0.4×0.25 + 0.3×0.75 + 0.3×0.30 = 0.42 ≥ 0.40? YES → +0 ✅

Criterion 2 - Domain Scope:
- "Systematic security audit for web applications"
- But: which vulnerabilities? what depth?
- Fuzzy? YES → +1 ❌

Criterion 3 - Validation:
- Multi-context (need to test on multiple apps)
- Different tech stacks
- Direct? NO → +2 ❌

Criterion 4 - Specialization:
- Security-focused agents valuable
- Domain expertise needed
- Needed? YES → +1 ❌

Criterion 5 - Automation:
- Static analysis obvious
- But: which tools? how to integrate?
- Somewhat clear? PARTIAL → +0.5 ≈ +1 ❌

Predicted: 4 + 0 + 1 + 2 + 1 + 1 = 9 iterations
Expected actual: 7-10 iterations (exploratory)
Confidence: Medium (borderline V_meta(s₀), multiple penalties)
```

---

## Example 5: Documentation Management (Hypothetical)

### Assessment

**Domain**: Documentation quality and consistency for large codebase

**Data Available**:
- Existing docs: 150 files
- Quality issues logged: 80 items
- No systematic approach

**Prior Art**:
- Documentation standards (Google, Microsoft style guides)
- High transferability

**Automation**:
- Linters (markdownlint, prose)
- Doc generators

### Prediction

```
Base: 4

Criterion 1 - V_meta(s₀):
Estimate:
- Analyze 80 quality issues: 8 categories
- Expected final: 10 categories
- Completeness: 8/10 = 0.80
- Borrow style guide patterns: 7/10 = 0.70
- Automation: linters + generators = 3/3 = 1.0
- V_meta(s₀) = 0.4×0.80 + 0.3×0.70 + 0.3×1.0 = 0.83 ≥ 0.40? YES → +0 ✅✅

Criterion 2 - Domain Scope:
- "Documentation quality and consistency for codebase"
- Clear quality metrics (completeness, accuracy, style)
- Clear? YES → +0 ✅

Criterion 3 - Validation:
- Retrospective on 150 existing docs
- Direct? YES → +0 ✅

Criterion 4 - Specialization:
- Generic doc-writer + linters sufficient
- Needed? NO → +0 ✅

Criterion 5 - Automation:
- Linters, generators, templates (obvious)
- Clear? YES → +0 ✅

Predicted: 4 + 0 = 4 iterations (rapid convergence)
Expected actual: 3-4 iterations
Confidence: Very High (strong V_meta(s₀), all criteria met)
```

---

## Summary Table

| Example | V_meta(s₀) | Penalties | Predicted | Actual | Accuracy |
|---------|------------|-----------|-----------|--------|----------|
| Error Recovery | 0.758 | 0 | 4 | 3 | ✅ ±1 |
| Test Strategy | 0.00 | 5 | 10 | 6 | ⚠️ -4 (conservative) |
| CI/CD Opt. | 0.71 | 0 | 4 | (3-5 expected) | TBD |
| Security Audit | 0.42 | 4 | 9 | (7-10 expected) | TBD |
| Doc Management | 0.83 | 0 | 4 | (3-4 expected) | TBD |

---

## Pattern Recognition

### Rapid Convergence Profile (4-5 iterations)

**Characteristics**:
- V_meta(s₀) ≥ 0.50 (strong baseline)
- 0-1 penalties total
- Clear domain scope
- Direct/retrospective validation
- Obvious automation opportunities

**Examples**: Error Recovery, CI/CD Opt., Doc Management

---

### Standard Convergence Profile (6-8 iterations)

**Characteristics**:
- V_meta(s₀) = 0.20-0.40 (weak baseline)
- 2-4 penalties total
- Some scoping needed
- Multi-context validation OR specialization needed

**Examples**: Test Strategy (6 actual)

---

### Exploratory Profile (9+ iterations)

**Characteristics**:
- V_meta(s₀) < 0.20 (no baseline)
- 5+ penalties total
- Fuzzy scope
- Multi-context validation AND specialization needed
- Unclear automation

**Examples**: Security Audit (hypothetical)

---

## Using Predictions

### High Confidence (0-1 penalties)

**Action**: Invest in strong iteration 0 (3-5 hours)
**Expected**: Rapid convergence (3-5 iterations, 10-15 hours)
**Strategy**: Comprehensive baseline, aggressive iteration 1

---

### Medium Confidence (2-4 penalties)

**Action**: Standard iteration 0 (1-2 hours)
**Expected**: Standard convergence (6-8 iterations, 20-30 hours)
**Strategy**: Incremental improvements, focus on high-value work

---

### Low Confidence (5+ penalties)

**Action**: Minimal iteration 0 (<1 hour)
**Expected**: Exploratory (9+ iterations, 30-50 hours)
**Strategy**: Discovery-driven, establish baseline first
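
These three bands can be folded into the planning helper. A sketch that maps a penalty total to the expected profile, with names and thresholds taken from the bands above:

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a penalty total to the expected convergence profile.
# Usage: ./profile.sh <total_penalties>

penalties=${1:?usage: profile.sh <total_penalties>}

if [ "$penalties" -le 1 ]; then
  echo "High confidence: rapid convergence expected (3-5 iterations, 10-15h); invest 3-5h in iteration 0"
elif [ "$penalties" -le 4 ]; then
  echo "Medium confidence: standard convergence expected (6-8 iterations, 20-30h); standard 1-2h iteration 0"
else
  echo "Low confidence: exploratory (9+ iterations, 30-50h); establish a baseline first"
fi
```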

---

**Source**: BAIME Rapid Convergence Prediction Model
**Accuracy**: 85% (±1 iteration) on 13 experiments
**Purpose**: Planning tool for experiment design

skills/rapid-convergence/examples/test-strategy-6-iterations.md

# Test Strategy: 6-Iteration Standard Convergence

**Experiment**: bootstrap-002-test-strategy
**Iterations**: 6 (standard convergence)
**Time**: 25.5 hours
**Result**: V_instance=0.85, V_meta=0.82 ✅

A comparison case showing why standard convergence took longer.

---

## Why Standard Convergence (Not Rapid)

### Criteria Assessment

**1. Clear Baseline Metrics** ❌
- Coverage: 72.1% (but no patterns documented)
- No systematic test approach
- Fuzzy success criteria
- V_meta(s₀) = 0.04

**2. Focused Domain** ❌
- "Develop test strategy" (too broad)
- What tests? Which patterns? How much coverage?
- Required scoping work

**3. Direct Validation** ❌
- Multi-context validation needed (3 archetypes)
- Cross-language testing
- Deployment overhead: 6-8 hours

**4. Generic Agents** ❌
- Needed specialization:
  - coverage-analyzer (30x speedup)
  - test-generator (10x speedup)
- Added 1-2 iterations

**5. Early Automation** ✅
- Coverage tools obvious
- But implementation was gradual

**Prediction**: 4 + 2 + 1 + 2 + 1 + 0 = 10 iterations
**Actual**: 6 iterations (efficient execution beat prediction)

---

## Iteration Timeline

### Iteration 0: Minimal Baseline (60 min)

**Activities**:
- Ran coverage: 72.1%
- Counted tests: 590
- Wrote 3 ad-hoc tests
- Noted duplication

**V_meta(s₀)**:
```
Completeness: 0/8 = 0.00 (no patterns yet)
Transferability: 0/8 = 0.00 (no research)
Automation: 0/3 = 0.00 (ideas only)

V_meta(s₀) = 0.00 ❌
```

**Issue**: Weak baseline required more iterations

---

### Iteration 1: Core Patterns (90 min)

Created 2 patterns:
1. Table-Driven Tests (12 min per test)
2. Error Path Testing (14 min per test)

Applied to 5 tests, coverage: 72.1% → 72.8% (+0.7%)

**V_instance**: 0.72
**V_meta**: 0.25 (2/8 patterns)

---

### Iteration 2: Expand & First Tool (90 min)

Added 3 patterns:
3. CLI Command Testing
4. Integration Tests
5. Test Helpers

Built coverage-analyzer script (30x speedup)

Coverage: 72.8% → 73.5% (+0.7%)

**V_instance**: 0.76
**V_meta**: 0.42 (5/8 patterns, 1 tool)

---

### Iteration 3: CLI Focus (75 min)

Added 2 patterns:
6. Global Flag Testing
7. Fixture Patterns

Applied to CLI tests, coverage: 73.5% → 74.8% (+1.3%)

**V_instance**: 0.81 ✅ (exceeded target)
**V_meta**: 0.61

---

### Iteration 4: Meta-Layer Push (90 min)

Added final pattern:
8. Dependency Injection (Mocking)

Built test-generator (10x speedup)

Coverage: 74.8% → 75.2% (+0.4%)

**V_instance**: 0.82 ✅
**V_meta**: 0.67

---

### Iteration 5: Refinement (60 min)

Tested transferability (Python, Rust, TypeScript)
Refined documentation

Coverage: 75.2% → 75.6% (+0.4%)

**V_instance**: 0.84 ✅
**V_meta**: 0.78 (close)

---

### Iteration 6: Convergence (45 min)

Final polish, transferability guide

Coverage: 75.6% → 75.8% (+0.2%)

**V_instance**: 0.85 ✅ ✅ (2 consecutive ≥ 0.80)
**V_meta**: 0.82 ✅ ✅ (2 consecutive ≥ 0.80)

**CONVERGED** ✅

---

## Comparison: Standard vs Rapid

| Aspect | Bootstrap-002 (Standard) | Bootstrap-003 (Rapid) |
|--------|--------------------------|------------------------|
| **V_meta(s₀)** | 0.04 | 0.758 |
| **Iteration 0** | 60 min (minimal) | 120 min (comprehensive) |
| **Iterations** | 6 | 3 |
| **Total Time** | 25.5h | 10h |
| **Pattern Discovery** | Incremental (1-3 per iteration) | Upfront (10 categories in iteration 0) |
| **Automation** | Gradual (iterations 2, 4) | Early (iteration 1, all 3 tools) |
| **Validation** | Multi-context (3 archetypes) | Retrospective (1,336 errors) |
| **Specialization** | 2 agents needed | Generic sufficient |

---

## Key Differences

### 1. Baseline Investment

**Bootstrap-002**: 60 min → V_meta(s₀) = 0.04
- Minimal analysis
- No pattern library
- No automation plan

**Bootstrap-003**: 120 min → V_meta(s₀) = 0.758
- Comprehensive analysis (ALL 1,336 errors)
- 10 categories documented
- 3 tools identified

**Impact**: +60 min investment saved 15.5 hours overall (26x ROI)

---

### 2. Pattern Discovery

**Bootstrap-002**: Incremental
- Iteration 1: 2 patterns
- Iteration 2: 3 patterns
- Iteration 3: 2 patterns
- Iteration 4: 1 pattern
- Total: 6 iterations to discover 8 patterns

**Bootstrap-003**: Upfront
- Iteration 0: 10 categories (79.1% coverage)
- Iteration 1: 12 categories (92.3% coverage)
- Iteration 2: 13 categories (95.4% coverage)
- Total: 3 iterations, most patterns identified early

---

### 3. Validation Overhead

**Bootstrap-002**: Multi-Context
- 3 project archetypes tested
- Cross-language validation
- Deployment + testing: 6-8 hours
- Added 2 iterations

**Bootstrap-003**: Retrospective
- 1,336 historical errors
- No deployment needed
- Validation: 45 min
- Added 0 iterations

---

## Lessons: Could Bootstrap-002 Have Been Rapid?

**Probably not** - structural factors prevented rapid convergence:

1. **No existing data**: No historical test metrics to analyze
2. **Broad domain**: "Test strategy" required scoping
3. **Multi-context needed**: Testing methodology varies by project type
4. **Specialization valuable**: 10x+ speedup from specialized agents

**However, it could have been faster (4-5 iterations)**:

**Alternative Approach**:
- **Stronger iteration 0** (2-3 hours):
  - Research industry test patterns (borrow 5-6)
  - Analyze current codebase thoroughly
  - Identify automation candidates upfront
  - Target V_meta(s₀) = 0.30-0.40

- **Aggressive iteration 1**:
  - Implement 5-6 patterns immediately
  - Build both tools (coverage-analyzer, test-generator)
  - Target V_instance = 0.75+

- **Result**: Likely 4-5 iterations (vs actual 6)

---

## When Standard Is Appropriate

Bootstrap-002 demonstrates that **not all methodologies can or should use rapid convergence**:

**Standard convergence makes sense when**:
- Low V_meta(s₀) inevitable (no existing data)
- Domain requires exploration (patterns not obvious)
- Multi-context validation necessary (transferability critical)
- Specialization provides >10x value (worth the investment)

**Key insight**: Use the prediction model to set realistic expectations, not to force rapid convergence.

---

**Status**: ✅ Production-ready, both approaches valid
**Takeaway**: Rapid convergence is situational, not universal

skills/rapid-convergence/reference/baseline-metrics.md

# Achieving Strong Baseline Metrics

**Purpose**: How to achieve V_meta(s₀) ≥ 0.40 for rapid convergence
**Impact**: A strong baseline reduces iterations by 2-3 (40-60% time savings)

---

## V_meta Baseline Formula

```
V_meta(s₀) = 0.4 × completeness +
             0.3 × transferability +
             0.3 × automation_effectiveness

Where (at iteration 0):
- completeness = initial_coverage / target_coverage
- transferability = existing_patterns_reusable / total_patterns_needed
- automation_effectiveness = identified_automation_ops / automation_opportunities
```
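
The same formula as a small calculator; a sketch, with the three component scores supplied as values between 0 and 1:

```bash
#!/usr/bin/env bash
# Hypothetical helper: compute V_meta(s0) from its three components (each in 0..1).
# Usage: ./vmeta0.sh <completeness> <transferability> <automation_effectiveness>

completeness=${1:?usage: vmeta0.sh <completeness> <transferability> <automation>}
transferability=$2
automation=$3

awk -v c="$completeness" -v t="$transferability" -v a="$automation" 'BEGIN {
  v = 0.4*c + 0.3*t + 0.3*a
  printf "V_meta(s0) = %.2f (%s the 0.40 rapid-convergence target)\n", v, (v >= 0.40 ? "meets" : "below")
}'
```

With the Bootstrap-003 components (0.80, 0.50, 1.0) it prints 0.77, matching the worked example below.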

**Target**: V_meta(s₀) ≥ 0.40

---

## Component 1: Completeness (40% weight)

**Definition**: Initial taxonomy/pattern coverage

**Calculation**:
```
completeness = initial_categories / estimated_final_categories
```

**Achieve ≥0.50 by**:
1. Comprehensive data analysis (3-5 hours)
2. Create initial taxonomy (10-15 categories)
3. Classify ≥70% of observed cases

**Example (Bootstrap-003)**:
```
Iteration 0 taxonomy: 10 categories
Estimated final: 12-13 categories
Completeness: 10/12.5 = 0.80

Contribution: 0.4 × 0.80 = 0.32 ✅
```

---

## Component 2: Transferability (30% weight)

**Definition**: Reusability of existing patterns/knowledge

**Calculation**:
```
transferability = (borrowed_patterns + existing_knowledge) / total_patterns_needed
```

**Achieve ≥0.30 by**:
1. Research prior art (1-2 hours)
2. Identify similar methodologies
3. Document reusable patterns

**Example (Bootstrap-003)**:
```
Borrowed from industry: 5 error patterns
Existing knowledge: Error taxonomy basics
Total patterns needed: ~10

Transferability: 5/10 = 0.50

Contribution: 0.3 × 0.50 = 0.15 ✅
```

---

## Component 3: Automation Effectiveness (30% weight)

**Definition**: Early identification of automation opportunities

**Calculation**:
```
automation_effectiveness = identified_high_ROI_tools / expected_tool_count
```

**Achieve ≥0.30 by**:
1. Analyze high-frequency tasks (1-2 hours)
2. Identify top 3-5 automation candidates
3. Estimate ROI (>5x preferred)

**Example (Bootstrap-003)**:
```
Identified in iteration 0: 3 tools
- validate-path.sh: 65.2% prevention, 61x ROI
- check-file-size.sh: 100% prevention, 31.6x ROI
- check-read-before-write.sh: 100% prevention, 26.2x ROI

Expected final tool count: ~3

Automation effectiveness: 3/3 = 1.0

Contribution: 0.3 × 1.0 = 0.30 ✅
```

---

## Worked Example: Bootstrap-003

### Iteration 0 Investment: 120 min

**Data Analysis** (60 min):
- Queried session history: 1,336 errors
- Calculated error rate: 5.78%
- Identified frequency distribution

**Taxonomy Creation** (40 min):
- Created 10 initial categories
- Classified 1,056/1,336 errors (79.1%)
- Estimated 2-3 more categories needed

**Pattern Research** (15 min):
- Reviewed industry error taxonomies
- Identified 5 reusable patterns
- Documented error handling best practices

**Automation Identification** (5 min):
- Top 3 opportunities obvious from data:
  1. File-not-found: 250 errors (18.7%)
  2. File-size-exceeded: 84 errors (6.3%)
  3. Write-before-read: 70 errors (5.2%)

### V_meta(s₀) Calculation

```
Completeness: 10/12.5 = 0.80
Transferability: 5/10 = 0.50
Automation: 3/3 = 1.0

V_meta(s₀) = 0.4 × 0.80 +
             0.3 × 0.50 +
             0.3 × 1.0

           = 0.32 + 0.15 + 0.30
           = 0.77 ✅✅ (far exceeds 0.40 target)
```

**Result**: 3 iterations total (rapid convergence)

---

## Contrast: Bootstrap-002 (Weak Baseline)

### Iteration 0 Investment: 60 min

**Coverage Measurement** (30 min):
- Ran coverage analysis: 72.1%
- Counted tests: 590
- No systematic approach documented

**Pattern Identification** (20 min):
- Wrote 3 ad-hoc tests
- Noted duplication issues
- No pattern library yet

**No Prior Research** (0 min):
- Started from scratch
- No borrowed patterns

**No Automation Planning** (10 min):
- Vague ideas about coverage tools
- No concrete automation identified

### V_meta(s₀) Calculation

```
Completeness: 0/8 patterns = 0.00 (none documented)
Transferability: 0/8 = 0.00 (no research)
Automation: 0/3 tools = 0.00 (none identified)

V_meta(s₀) = 0.4 × 0.00 +
             0.3 × 0.00 +
             0.3 × 0.00

           = 0.00 ❌ (far below 0.40 target)
```

**Result**: 6 iterations total (standard convergence)

---

## Achieving V_meta(s₀) ≥ 0.40: Checklist

### Completeness Target: ≥0.50

**Tasks**:
- [ ] Analyze ALL available data (3-5 hours)
- [ ] Create initial taxonomy/pattern library (10-15 items)
- [ ] Classify ≥70% of observed cases
- [ ] Estimate final taxonomy size
- [ ] Calculate: initial_count / estimated_final ≥ 0.50?

**Time**: 3-5 hours
**Contribution**: 0.4 × 0.50 = 0.20

---

### Transferability Target: ≥0.30

**Tasks**:
- [ ] Research prior art (1-2 hours)
- [ ] Identify similar methodologies
- [ ] Document borrowed patterns (≥30% reusable)
- [ ] List applicable existing knowledge
- [ ] Calculate: borrowed / total_needed ≥ 0.30?

**Time**: 1-2 hours
**Contribution**: 0.3 × 0.30 = 0.09

---

### Automation Target: ≥0.30

**Tasks**:
- [ ] Analyze task frequency (1 hour)
- [ ] Identify top 3-5 automation candidates
- [ ] Estimate ROI for each (>5x preferred)
- [ ] Document automation plan
- [ ] Calculate: identified / expected ≥ 0.30?

**Time**: 1-2 hours
**Contribution**: 0.3 × 0.30 = 0.09

---

### Total Baseline Investment

**Minimum**: 5-9 hours for V_meta(s₀) = 0.38-0.40
**Recommended**: 6-10 hours for V_meta(s₀) = 0.45-0.55
**Aggressive**: 8-12 hours for V_meta(s₀) = 0.60-0.80

**ROI**: 5-9 hours investment → Save 10-15 hours overall (2-3x)

---

## Quick Assessment: Can You Achieve 0.40?

**Question 1**: Do you have quantitative data to analyze?
- YES: Proceed with completeness analysis
- NO: Gather data first (delays rapid convergence)

**Question 2**: Does prior art exist in this domain?
- YES: Research and document (1-2 hours)
- NO: Lower transferability expected (<0.20)

**Question 3**: Are high-frequency patterns obvious?
- YES: Identify automation opportunities (1 hour)
- NO: Requires deeper analysis (adds time)

**Scoring**:
- **3 YES**: V_meta(s₀) ≥ 0.40 achievable (5-9 hours)
- **2 YES**: V_meta(s₀) = 0.30-0.40 (7-12 hours)
- **0-1 YES**: V_meta(s₀) < 0.30 (not a rapid convergence candidate)

---

## Common Pitfalls

### ❌ Insufficient Data Analysis

**Symptom**: Analyzing <50% of available data
**Impact**: Low completeness (<0.40)
**Fix**: Comprehensive analysis (3-5 hours)

**Example**:
```
❌ Analyzed 200/1,336 errors → 5 categories → completeness = 0.38
✅ Analyzed 1,336/1,336 errors → 10 categories → completeness = 0.80
```

---

### ❌ Skipping Prior Art Research

**Symptom**: Starting from scratch
**Impact**: Zero transferability
**Fix**: 1-2 hours of research

**Example**:
```
❌ No research → 0 borrowed patterns → transferability = 0.00
✅ Research industry taxonomies → 5 patterns → transferability = 0.50
```

---

### ❌ Vague Automation Ideas

**Symptom**: "Maybe we could automate X"
**Impact**: Low automation score
**Fix**: Concrete identification + ROI estimate

**Example**:
```
❌ "Could automate coverage" → automation = 0.10
✅ "Coverage gap analyzer, 30x speedup, 6x ROI" → automation = 0.33
```

---

## Measurement Tools

**Completeness**:
```bash
# Count initial categories
initial=$(grep "^##" taxonomy.md | wc -l)

# Estimate final (from analysis)
estimated=12

# Calculate
echo "scale=2; $initial / $estimated" | bc
# Target: ≥0.50
```

**Transferability**:
```bash
# Count borrowed patterns
borrowed=$(grep "Source:" patterns.md | grep -v "Original" | wc -l)

# Estimate total needed
total=10

# Calculate
echo "scale=2; $borrowed / $total" | bc
# Target: ≥0.30
```

**Automation**:
```bash
# Count identified tools
identified=$(ls scripts/ | wc -l)

# Estimate final count
expected=3

# Calculate
echo "scale=2; $identified / $expected" | bc
# Target: ≥0.30
```

---

**Source**: BAIME Rapid Convergence Framework
**Target**: V_meta(s₀) ≥ 0.40 for 3-4 iteration convergence
**Investment**: 5-10 hours in iteration 0
**ROI**: 2-3x (saves 10-15 hours overall)

skills/rapid-convergence/reference/criteria.md

# Rapid Convergence Criteria - Detailed
|
||||
|
||||
**Purpose**: In-depth explanation of 5 rapid convergence criteria
|
||||
**Impact**: Understanding when 3-4 iterations are achievable
|
||||
|
||||
---
|
||||
|
||||
## Criterion 1: Clear Baseline Metrics ⭐ CRITICAL
|
||||
|
||||
### Definition
|
||||
|
||||
V_meta(s₀) ≥ 0.40 indicates strong foundational work enables rapid progress.
|
||||
|
||||
### Mathematical Basis
|
||||
|
||||
```
|
||||
ΔV_meta needed = 0.80 - V_meta(s₀)
|
||||
|
||||
If V_meta(s₀) = 0.40: Need +0.40 → 3-4 iterations achievable
|
||||
If V_meta(s₀) = 0.10: Need +0.70 → 5-7 iterations required
|
||||
```
|
||||
|
||||
**Assumption**: Average ΔV_meta per iteration ≈ 0.15-0.20

### What Strong Baseline Looks Like

**Quantitative metrics exist**:
- Error rate, test coverage, build time
- Measurable via tools (not subjective)
- Baseline established in <2 hours

**Success criteria are clear**:
- Target values defined (e.g., <3% error rate)
- Thresholds for convergence known
- No ambiguity about "done"

**Initial taxonomy comprehensive**:
- 70-80% coverage in iteration 0
- 10-15 categories/patterns documented
- Most edge cases identified

### Examples

**✅ Bootstrap-003 (V_meta(s₀) = 0.48)**:
```
- 1,336 errors quantified via MCP query
- Error rate: 5.78% calculated automatically
- 10 error categories (79.1% coverage)
- Clear targets: <3% error rate, <2 min MTTR
- Result: 3 iterations
```

**❌ Bootstrap-002 (V_meta(s₀) = 0.04)**:
```
- Coverage: 72.1% (but no patterns documented)
- No clear test patterns identified
- Ambiguous "done" criteria
- Had to establish metrics first
- Result: 6 iterations
```

### Impact Analysis

| V_meta(s₀) | Iterations Needed | Hours | Reason |
|------------|-------------------|-------|--------|
| 0.60-0.80 | 2-3 | 6-10h | Minimal gap to 0.80 |
| 0.40-0.59 | 3-4 | 10-15h | Moderate gap |
| 0.20-0.39 | 4-6 | 15-25h | Large gap |
| 0.00-0.19 | 6-10 | 25-40h | Exploratory |

---

## Criterion 2: Focused Domain Scope ⭐ IMPORTANT

### Definition

The domain can be described in <3 sentences without ambiguity.

### Why This Matters

**Focused scope** → Less exploration → Faster convergence

**Broad scope** → More patterns needed → Slower convergence

### Quantifying Focus

**Metric**: Boundary clarity ratio
```
BCR = clear_boundaries / total_boundaries

Where boundaries = {in-scope, out-of-scope, edge cases}
```

**Target**: BCR ≥ 0.80 (80% of boundaries unambiguous)

### Examples

**✅ Focused (Bootstrap-003)**:
```
Domain: "Error detection, diagnosis, recovery, prevention for meta-cc"

Boundaries:
✅ In-scope: All meta-cc errors
✅ Out-of-scope: Infrastructure failures, user errors
✅ Edge cases: Cascading errors (handle as single category)

BCR = 3/3 = 1.0 (perfectly focused)
```

**❌ Broad (Bootstrap-002)**:
```
Domain: "Develop test strategy"

Boundaries:
⚠️ In-scope: Which tests? Unit? Integration? E2E?
⚠️ Out-of-scope: What about test infrastructure?
⚠️ Edge cases: Multi-language support? CI integration?

BCR = 0/3 = 0.00 (needs scoping work)
```

### Scoping Technique

**Step 1**: Write 1-sentence domain definition
**Step 2**: List 3-5 explicit in-scope items
**Step 3**: List 3-5 explicit out-of-scope items
**Step 4**: Define edge case handling

**Example**:
```markdown
## Domain: Error Recovery for Meta-CC

**In-Scope**:
- Error detection and classification
- Root cause diagnosis
- Recovery procedures
- Prevention automation
- MTTR reduction

**Out-of-Scope**:
- Infrastructure failures (Docker, network)
- User mistakes (misuse of CLI)
- Feature requests
- Performance optimization (unless error-related)

**Edge Cases**:
- Cascading errors: Treat as single error with multiple symptoms
- Intermittent errors: Require 3+ occurrences for pattern
- Error prevention: In-scope if automatable
```

---

## Criterion 3: Direct Validation ⭐ IMPORTANT

### Definition

The methodology can be validated without multi-context deployment.

### Validation Complexity Spectrum

**Level 1: Retrospective** (Fastest)
- Use historical data
- No deployment needed
- Example: 1,336 historical errors

**Level 2: Single-Context** (Fast)
- Test in one environment
- Minimal deployment
- Example: Validate on current project

**Level 3: Multi-Context** (Slow)
- Test across multiple projects/languages
- Significant deployment overhead
- Example: 3 project archetypes

**Level 4: Production** (Slowest)
- Real-world validation required
- Months of data collection
- Example: Monitor for 3-6 months

### Time Impact

| Validation Level | Overhead | Example Iterations Added |
|------------------|----------|--------------------------|
| Retrospective | 0h | +0 (Bootstrap-003) |
| Single-Context | 2-4h | +0 to +1 |
| Multi-Context | 6-12h | +2 to +3 (Bootstrap-002) |
| Production | Months | N/A (not rapid) |

### When Retrospective Validation Works

**Requirements**:
1. Historical data exists (session logs, error logs)
2. Data is representative of current/future work
3. Metrics can be calculated from historical data
4. Methodology can be applied retrospectively

**Example** (Bootstrap-003):
```
✅ 1,336 historical errors in session logs
✅ Representative of typical development work
✅ Can classify errors retrospectively
✅ Can measure prevention rate via replay

Result: Direct validation, 0 overhead
```

---

## Criterion 4: Generic Agent Sufficiency 🟡 MODERATE

### Definition

Generic agents (data-analyst, doc-writer, coder) are sufficient for execution.

### Specialization Overhead

**Generic agents**: 0 overhead (use as-is)
**Specialized agents**: +1 to +2 iterations for design + testing

### When Specialization Adds Value

**10x+ speedup opportunity**:
- Example: coverage-analyzer (15 min → 30 sec = 30x)
- Example: test-generator (10 min → 1 min = 10x)
- Worth 1-2 iteration investment

**<5x speedup**:
- Use generic agents + simple scripts
- Not worth specialization overhead

### Examples

**✅ Generic Sufficient (Bootstrap-003)**:
```
Tasks:
- Analyze errors (generic data-analyst)
- Document taxonomy (generic doc-writer)
- Create validation scripts (generic coder)

Speedup from specialization: 2-3x (not worth it)
Result: 0 specialization overhead
```

**⚠️ Specialization Needed (Bootstrap-002)**:
```
Tasks:
- Coverage analysis (15 min → 30 sec = 30x with coverage-analyzer)
- Test generation (10 min → 1 min = 10x with test-generator)

Speedup: >10x for both
Investment: 1 iteration to design and test agents
Result: +1 iteration, but ROI positive overall
```

---

## Criterion 5: Early High-Impact Automation 🟡 MODERATE

### Definition

The top 3 automation opportunities are identified by iteration 1.

### Pareto Principle Application

**80/20 rule**: 20% of automations provide 80% of value

**Implication**: Identify top 3 early → rapid V_instance improvement

### Identification Signals

**High-frequency patterns**:
- Appears in >10% of cases
- Example: File-not-found (18.7% of errors)

**High-impact prevention**:
- Prevents >50% of pattern occurrences
- Example: validate-path.sh prevents 65.2%

**High ROI**:
- Time saved / time invested > 5x
- Example: validate-path.sh = 61x ROI

### Early Identification Techniques

**Frequency Analysis**:
```bash
# Count error types
cat errors.jsonl | jq -r '.error_type' | sort | uniq -c | sort -rn

# Top 3 = high-frequency candidates
```

**Impact Estimation**:
```
If tool prevents X% of pattern Y:
- Pattern Y occurs N times
- Prevention: X% × N
- Impact: (X% × N) / total_errors
```

**ROI Calculation**:
```
Manual time: M min per occurrence
Tool investment: T hours
Expected uses: N

ROI = (M × N) / (T × 60)
```
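A minimal sketch of that arithmetic in shell, with illustrative (hypothetical) numbers; substitute your own estimates:

```bash
manual_min=3        # M: minutes to handle one occurrence manually
tool_hours=2        # T: hours to build the tool
expected_uses=250   # N: expected occurrences the tool will handle

roi=$(echo "scale=1; ($manual_min * $expected_uses) / ($tool_hours * 60)" | bc)
echo "ROI: ${roi}x"   # > 5x suggests the tool is worth building
```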

### Example (Bootstrap-003)

**Iteration 0 Analysis**:
```
Top 3 by frequency:
1. File-not-found: 250/1,336 = 18.7%
2. MCP errors: 228/1,336 = 17.1%
3. Build errors: 200/1,336 = 15.0%

Automation feasibility:
1. File-not-found: ✅ Path validation (high prevention %)
2. MCP errors: ❌ Infrastructure (low automation value)
3. Build errors: ⚠️ Language-specific (moderate value)

Selected:
1. validate-path.sh: 250 errors, 65.2% prevention, 61x ROI
2. check-file-size.sh: 84 errors, 100% prevention, 31.6x ROI
3. check-read-before-write.sh: 70 errors, 100% prevention, 26.2x ROI

Total impact: 317/1,336 = 23.7% error prevention
```

**Result**: Clear automation path from iteration 0

---

## Criteria Interaction Matrix

| Criterion 1 | Criterion 2 | Criterion 3 | Likely Iterations |
|-------------|-------------|-------------|-------------------|
| ✅ (≥0.40) | ✅ Focused | ✅ Direct | 3-4 ⚡ |
| ✅ (≥0.40) | ✅ Focused | ❌ Multi | 4-5 |
| ✅ (≥0.40) | ❌ Broad | ✅ Direct | 4-5 |
| ❌ (<0.40) | ✅ Focused | ✅ Direct | 5-6 |
| ❌ (<0.40) | ❌ Broad | ❌ Multi | 7-10 |

**Key Insight**: Criteria 1-3 are multiplicative. Missing any one of them means slower convergence.

---

## Decision Tree

```
Start
│
├─ Can you achieve V_meta(s₀) ≥ 0.40?
│   YES → Continue
│   NO → Standard convergence (5-7 iterations)
│
├─ Is domain scope <3 sentences?
│   YES → Continue
│   NO → Refine scope first
│
├─ Can you validate without multi-context?
│   YES → Rapid convergence likely (3-4 iterations)
│   NO → Add +2 iterations for validation
│
└─ Generic agents sufficient?
    YES → No overhead
    NO → Add +1 iteration for specialization
```

---

**Source**: BAIME Rapid Convergence Criteria
**Validation**: 13 experiments, 85% prediction accuracy
**Critical Path**: Criteria 1-3 (must all be met for rapid convergence)

329
skills/rapid-convergence/reference/prediction-model.md
Normal file
@@ -0,0 +1,329 @@

# Convergence Speed Prediction Model

**Purpose**: Predict iteration count before starting an experiment
**Accuracy**: 85% (±1 iteration) across 13 experiments

---

## Formula

```
Predicted_Iterations = Base(4) + Σ penalties

Penalties:
1. V_meta(s₀) < 0.40: +2
2. Domain scope fuzzy: +1
3. Multi-context validation: +2
4. Specialization needed: +1
5. Automation unclear: +1
```

**Range**: 4-11 iterations (min 4, max 4+2+1+2+1+1=11)
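A minimal sketch of the formula as a script; each flag is a placeholder for your own assessment (1 if the penalty applies, 0 if not):

```bash
low_baseline=0        # V_meta(s₀) < 0.40 → +2
fuzzy_scope=0         # domain not describable in <3 sentences → +1
multi_context=0       # multi-context validation required → +2
needs_specialists=0   # generic agents insufficient → +1
automation_unclear=0  # top 3 automations not obvious → +1

predicted=$((4 + 2*low_baseline + fuzzy_scope + 2*multi_context + needs_specialists + automation_unclear))
echo "Predicted iterations: $predicted"
```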

---

## Penalty Definitions

### Penalty 1: Low Baseline (+2 iterations)

**Condition**: V_meta(s₀) < 0.40

**Rationale**: More gap to close (0.40+ needed to reach 0.80)

**Check**:
```bash
# Calculate V_meta(s₀) from iteration 0
completeness=$(calculate_initial_coverage)
transferability=$(calculate_borrowed_patterns)
automation=$(calculate_identified_tools)

v_meta=$(echo "scale=2; 0.4*$completeness + 0.3*$transferability + 0.3*$automation" | bc)

if (( $(echo "$v_meta < 0.40" | bc -l) )); then
  penalty=2
fi
```

---

### Penalty 2: Fuzzy Scope (+1 iteration)

**Condition**: Cannot describe domain in <3 clear sentences

**Rationale**: Requires scoping work, adds exploration

**Check**:
- Write domain definition
- Count sentences
- Ask: Are boundaries clear?

**Example**:
```
✅ Clear: "Error detection, diagnosis, recovery, prevention for meta-cc"
❌ Fuzzy: "Improve testing" (which tests? what aspects? how much?)
```

---

### Penalty 3: Multi-Context Validation (+2 iterations)

**Condition**: Requires testing across multiple projects/languages

**Rationale**: Deployment + validation overhead

**Check**:
- Is retrospective validation possible? (NO penalty)
- Single-context sufficient? (NO penalty)
- Need 2+ contexts? (+2 penalty)

---

### Penalty 4: Specialization Needed (+1 iteration)

**Condition**: Generic agents insufficient, need specialized agents

**Rationale**: Agent design + testing adds an iteration

**Check**:
- Can generic agents handle all tasks? (NO penalty)
- Need >10x speedup from specialist? (+1 penalty)

---

### Penalty 5: Automation Unclear (+1 iteration)

**Condition**: Top 3 automations not obvious by iteration 0

**Rationale**: Requires discovery phase

**Check**:
- Frequency analysis reveals clear candidates? (NO penalty)
- Need exploration to find automations? (+1 penalty)

---

## Worked Examples

### Example 1: Bootstrap-003 (Error Recovery)

**Assessment**:
```
Base: 4

1. V_meta(s₀) = 0.48 ≥ 0.40? YES → +0 ✅
2. Domain scope clear? YES ("Error detection, diagnosis...") → +0 ✅
3. Retrospective validation? YES (1,336 historical errors) → +0 ✅
4. Generic agents sufficient? YES → +0 ✅
5. Automation clear? YES (top 3 from frequency analysis) → +0 ✅

Predicted: 4 + 0 = 4 iterations
Actual: 3 iterations ✅ (within ±1)
```

**Analysis**: All criteria met → minimal penalties → rapid convergence

---

### Example 2: Bootstrap-002 (Test Strategy)

**Assessment**:
```
Base: 4

1. V_meta(s₀) = 0.04 ≥ 0.40? NO → +2 ❌
2. Domain scope clear? NO (testing is broad) → +1 ❌
3. Multi-context validation? YES (3 archetypes) → +2 ❌
4. Specialization needed? YES (coverage-analyzer, test-gen) → +1 ❌
5. Automation clear? YES (coverage tools obvious) → +0 ✅

Predicted: 4 + 2 + 1 + 2 + 1 + 0 = 10 iterations
Actual: 6 iterations ⚠️ (outside ±1; the model was conservative)
```

**Analysis**: The model predicts an upper bound. Efficient execution beat the estimate.

---

### Example 3: Hypothetical CI/CD Optimization

**Assessment**:
```
Base: 4

1. V_meta(s₀) = ?
   - Historical CI logs exist: YES
   - Initial analysis: 5 pipeline patterns identified
   - Estimated final: 7 patterns
   - Completeness: 5/7 = 0.71
   - Transferability: 0.40 (industry practices)
   - Automation: 0.67 (2/3 tools identified)
   - V_meta(s₀) = 0.4×0.71 + 0.3×0.40 + 0.3×0.67 ≈ 0.61 ≥ 0.40 → +0 ✅

2. Domain scope: "Reduce CI/CD build time through caching, parallelization, optimization"
   - Clear? YES → +0 ✅

3. Validation: Single CI pipeline (own project)
   - Single-context? YES → +0 ✅

4. Specialization: Pipeline analysis can use generic bash/jq
   - Sufficient? YES → +0 ✅

5. Automation: Top 3 = caching, parallelization, fast-fail
   - Clear? YES → +0 ✅

Predicted: 4 + 0 = 4 iterations
Expected actual: 3-5 iterations (rapid convergence)
```

---

## Calibration Data

**13 Experiments, Actual vs Predicted**:

| Experiment | Predicted | Actual | Δ | Accurate? |
|------------|-----------|--------|---|-----------|
| Bootstrap-003 | 4 | 3 | -1 | ✅ |
| Bootstrap-007 | 4 | 5 | +1 | ✅ |
| Bootstrap-005 | 5 | 5 | 0 | ✅ |
| Bootstrap-002 | 10 | 6 | -4 | ⚠️ |
| Bootstrap-009 | 6 | 7 | +1 | ✅ |
| Bootstrap-011 | 7 | 6 | -1 | ✅ |
| ... | ... | ... | ... | ... |

**Accuracy**: 11/13 = 85% within ±1 iteration

**Model Bias**: Slightly conservative (over-predicts by avg 0.7 iterations)

---

## Usage Guide

### Step 1: Assess Domain (15 min)

**Tasks**:
1. Analyze available data
2. Research prior art
3. Identify automation candidates
4. Calculate V_meta(s₀)

**Output**: V_meta(s₀) value

---

### Step 2: Evaluate Penalties (10 min)

**Checklist**:
- [ ] V_meta(s₀) ≥ 0.40? (NO → +2)
- [ ] Domain <3 clear sentences? (NO → +1)
- [ ] Direct/retrospective validation? (NO → +2)
- [ ] Generic agents sufficient? (NO → +1)
- [ ] Top 3 automations clear? (NO → +1)

**Output**: Total penalty sum

---

### Step 3: Calculate Prediction

```
Predicted = 4 + penalty_sum

Examples:
- 0 penalties → 4 iterations (rapid)
- 2-3 penalties → 6-7 iterations (standard)
- 5+ penalties → 9-11 iterations (exploratory)
```

---

### Step 4: Plan Experiment

**Rapid (4-5 iterations predicted)**:
- Strong iteration 0: 3-5 hours
- Aggressive iteration 1: Fix all P1 issues
- Target: 10-15 hours total

**Standard (6-8 iterations predicted)**:
- Normal iteration 0: 1-2 hours
- Incremental improvements
- Target: 20-30 hours total

**Exploratory (9+ iterations predicted)**:
- Minimal iteration 0: <1 hour
- Discovery-driven
- Target: 30-50 hours total

---

## Prediction Confidence

**High Confidence** (0-2 penalties):
- Predicted ±1 iteration
- 90% accuracy

**Medium Confidence** (3-4 penalties):
- Predicted ±2 iterations
- 75% accuracy

**Low Confidence** (5+ penalties):
- Predicted ±3 iterations
- 60% accuracy

**Reason**: More penalties = more unknowns = higher variance

---

## Model Limitations

### 1. Assumes Competent Execution

**Model assumes**:
- Comprehensive iteration 0 (if V_meta(s₀) ≥ 0.40)
- Efficient iteration execution
- No major blockers

**Reality**: Execution quality varies

---

### 2. Conservative Bias

**Model tends to over-predict** (actual < predicted)

**Reason**: Penalties are additive, but some synergies exist

**Example**: Bootstrap-002 predicted 10, actual 6 (efficient work offset penalties)

---

### 3. Domain-Specific Factors

**Not captured**:
- Developer experience
- Tool ecosystem maturity
- Team collaboration
- Unforeseen blockers

**Recommendation**: Use as guideline, not guarantee

---

## Decision Support

### Use Prediction to Decide:

**4-5 iterations predicted**:
→ Invest in strong iteration 0 (rapid convergence worth it)

**6-8 iterations predicted**:
→ Standard approach (diminishing returns from heavy baseline)

**9+ iterations predicted**:
→ Exploratory mode (discovery-first, optimize later)

---

**Source**: BAIME Rapid Convergence Prediction Model
**Validation**: 13 experiments, 85% accuracy (±1 iteration)
**Usage**: Planning tool for experiment design

426
skills/rapid-convergence/reference/strategy.md
Normal file
@@ -0,0 +1,426 @@

# Rapid Convergence Strategy Guide

**Purpose**: Iteration-by-iteration tactics for 3-4 iteration convergence
**Time**: 10-15 hours total (vs 20-30 standard)

---

## Pre-Iteration 0: Planning (1-2 hours)

### Objectives

1. Confirm rapid convergence feasible
2. Establish measurement infrastructure
3. Define scope boundaries
4. Plan validation approach

### Tasks

**1. Baseline Assessment** (30 min):
```bash
# Query existing data
meta-cc query-tools --status=error
meta-cc query-user-messages --pattern="test|coverage"

# Calculate baseline metrics
# Estimate V_meta(s₀)
```

**2. Scope Definition** (20 min):
```markdown
## Domain: [1-sentence definition]

**In-Scope**: [3-5 items]
**Out-of-Scope**: [3-5 items]
**Edge Cases**: [Handling approach]
```

**3. Success Criteria** (20 min):
```markdown
## Convergence Targets

**V_instance ≥ 0.80**:
- Metric 1: [Target]
- Metric 2: [Target]

**V_meta ≥ 0.80**:
- Patterns: [8-10 documented]
- Tools: [3-5 created]
- Transferability: [≥80%]
```

**4. Prediction** (10 min):
```
Use prediction model:
Base(4) + penalties = [X] iterations expected
```

**Deliverable**: `README.md` with scope, targets, prediction

---

## Iteration 0: Comprehensive Baseline (3-5 hours)

### Objectives

- Achieve V_meta(s₀) ≥ 0.40
- Initial taxonomy: 70-80% coverage
- Identify top 3 automations

### Time Allocation

- Data analysis: 60-90 min (40%)
- Taxonomy creation: 45-75 min (30%)
- Pattern research: 30-45 min (20%)
- Automation planning: 15-30 min (10%)

### Tasks

**1. Comprehensive Data Analysis** (60-90 min):
```bash
# Extract ALL available data
meta-cc query-tools --scope=project > tools.jsonl
meta-cc query-user-messages --pattern=".*" > messages.jsonl

# Analyze patterns
cat tools.jsonl | jq -r '.error' | sort | uniq -c | sort -rn | head -20

# Calculate frequencies
total=$(cat tools.jsonl | wc -l)
# For each pattern: count / total
```
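A minimal sketch of the "count / total" step above, assuming the same `.error` field queried earlier:

```bash
# Convert raw error counts into percentages of all tool calls
total=$(wc -l < tools.jsonl)
jq -r '.error' tools.jsonl | sort | uniq -c | sort -rn | head -10 |
  awk -v total="$total" '{ c = $1; $1 = ""; printf "%5.1f%%  %6d %s\n", 100*c/total, c, $0 }'
```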

**2. Initial Taxonomy** (45-75 min):
```markdown
## Taxonomy v0

### Category 1: [Name] ([frequency]%, [count])
**Pattern**: [Description]
**Examples**: [3-5 examples]
**Root Cause**: [Analysis]

### Category 2: ...
[Repeat for 10-15 categories]

**Coverage**: [X]% ([classified]/[total])
```

**3. Pattern Research** (30-45 min):
```markdown
## Prior Art

**Source 1**: [Industry taxonomy/framework]
- Borrowed: [Pattern A, Pattern B, ...]
- Transferability: [X]%

**Source 2**: [Similar project]
- Borrowed: [Pattern C, Pattern D, ...]
- Adaptations needed: [List]

**Total Borrowable**: [X]/[Y] patterns = [Z]%
```

**4. Automation Planning** (15-30 min):
```markdown
## Top Automation Candidates

**1. [Tool Name]**
- Frequency: [X]% of cases
- Prevention: [Y]% of pattern
- ROI estimate: [Z]x
- Feasibility: [High/Medium/Low]

**2. [Tool Name]**
[Same structure]

**3. [Tool Name]**
[Same structure]
```

### Metrics

Calculate V_meta(s₀):
```
Completeness: [initial_categories] / [estimated_final] = [X]
Transferability: [borrowed] / [total_needed] = [Y]
Automation: [identified] / [expected] = [Z]

V_meta(s₀) = 0.4×[X] + 0.3×[Y] + 0.3×[Z] = [RESULT]

Target: ≥ 0.40 ✅/❌
```

**Deliverables**:
- `taxonomy-v0.md` (10-15 categories, ≥70% coverage)
- `baseline-metrics.md` (V_meta(s₀), frequencies)
- `automation-plan.md` (top 3 tools, ROI estimates)

---

## Iteration 1: High-Impact Automation (3-4 hours)

### Objectives

- V_instance ≥ 0.60 (significant improvement)
- Implement top 2-3 tools
- Expand taxonomy to 90%+ coverage

### Time Allocation

- Tool implementation: 90-120 min (50%)
- Taxonomy expansion: 45-60 min (25%)
- Testing & validation: 45-60 min (25%)

### Tasks

**1. Build Automation Tools** (90-120 min):
```bash
# Tool 1: validate-path.sh (30-40 min)
#!/bin/bash
# Fuzzy path matching, typo correction
# Target: 150-200 LOC

# Tool 2: check-file-size.sh (20-30 min)
#!/bin/bash
# File size check, auto-pagination
# Target: 100-150 LOC

# Tool 3: check-read-before-write.sh (40-50 min)
#!/bin/bash
# Workflow validation
# Target: 150-200 LOC
```
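For a sense of scale, a minimal sketch of the check-file-size.sh idea (a hypothetical illustration, not the Bootstrap-003 tool; the 256KB default threshold is an assumption):

```bash
#!/bin/bash
# Warn before reading a file large enough to need pagination or filtering
FILE="$1"
LIMIT_KB="${2:-256}"   # assumed threshold; tune per project

if [ ! -f "$FILE" ]; then
  echo "ERROR: $FILE not found" >&2
  exit 2
fi

size_kb=$(( $(wc -c < "$FILE") / 1024 ))
if [ "$size_kb" -gt "$LIMIT_KB" ]; then
  echo "WARN: $FILE is ${size_kb}KB (> ${LIMIT_KB}KB); read it in chunks or filter with jq/grep first"
  exit 1
fi
echo "OK: $FILE is ${size_kb}KB"
```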

**2. Expand Taxonomy** (45-60 min):
```markdown
## Taxonomy v1

### [New Category 11]: [Name]
[Analysis of remaining 10-20% of cases]

### [New Category 12]: [Name]
[Continue until ≥90% coverage]

**Coverage**: [X]% ([classified]/[total])
**Gap Analysis**: [Remaining uncategorized patterns]
```

**3. Test & Measure** (45-60 min):
```bash
# Test tools on historical data
./scripts/validate-path.sh "path/to/file"      # Expect suggestions
./scripts/check-file-size.sh "large-file.json" # Expect warning

# Calculate impact
prevented=$(estimate_prevention_rate)
time_saved=$(calculate_time_savings)
roi=$(calculate_roi)

# Update metrics
```

### Metrics

```
V_instance calculation:
- Success rate: [X]%
- Quality: [Y]/5
- Efficiency: [Z] min/task

V_instance = 0.4×[success] + 0.3×[quality/5] + 0.2×[efficiency] + 0.1×[reliability]
           = [RESULT]

Target: ≥ 0.60 (progress toward 0.80)
```
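A minimal sketch of that calculation; the component scores are hypothetical placeholders:

```bash
success=0.85      # success rate (0-1)
quality=4.2       # rubric score out of 5
efficiency=0.80   # normalized efficiency (0-1)
reliability=0.90  # normalized reliability (0-1)

v_instance=$(echo "scale=2; 0.4*$success + 0.3*($quality/5) + 0.2*$efficiency + 0.1*$reliability" | bc)
echo "V_instance = $v_instance"   # target ≥ 0.60 this iteration, ≥ 0.80 at convergence
```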

**Deliverables**:
- `scripts/tool1.sh`, `scripts/tool2.sh`, `scripts/tool3.sh`
- `taxonomy-v1.md` (≥90% coverage)
- `iteration-1-results.md` (V_instance, V_meta, gaps)

---

## Iteration 2: Validation & Refinement (3-4 hours)

### Objectives

- V_instance ≥ 0.80 ✅
- V_meta ≥ 0.80 ✅
- Validate stability (2 consecutive iterations)

### Time Allocation

- Retrospective validation: 60-90 min (40%)
- Taxonomy completion: 30-45 min (20%)
- Tool refinement: 45-60 min (25%)
- Documentation: 30-45 min (15%)

### Tasks

**1. Retrospective Validation** (60-90 min):
```bash
# Apply methodology to historical data
meta-cc validate \
  --methodology error-recovery \
  --history .claude/sessions/*.jsonl

# Measure:
# - Coverage: [X]% of historical cases handled
# - Time savings: [Y] hours saved
# - Prevention: [Z]% errors prevented
# - Confidence: [Score]
```

**2. Complete Taxonomy** (30-45 min):
```markdown
## Taxonomy v2 (Final)

[Review all categories]
[Add final 1-2 categories if needed]
[Refine existing categories]

**Final Coverage**: [X]% ≥ 95% ✅
**Uncategorized**: [Y]% (acceptable edge cases)
```

**3. Refine Tools** (45-60 min):
```bash
# Based on validation feedback:
# - Fix bugs discovered
# - Improve accuracy
# - Add edge case handling
# - Optimize performance

# Re-test
# Re-measure ROI
```

**4. Documentation** (30-45 min):
```markdown
## Complete Methodology

### Patterns: [8-10 documented]
### Tools: [3-5 with usage]
### Transferability: [≥80%]
### Validation: [Results]
```

### Metrics

```
V_instance: [X] (≥0.80? ✅/❌)
V_meta: [Y] (≥0.80? ✅/❌)

Stability check:
- Iteration 1: V_instance = [A]
- Iteration 2: V_instance = [B]
- Change: [|B-A|] < 0.05? ✅/❌

Convergence: ✅/❌
```
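A minimal sketch of the stability check; the two V_instance values are placeholders:

```bash
prev=0.81   # V_instance from iteration 1
curr=0.83   # V_instance from iteration 2

delta=$(echo "$curr - $prev" | bc -l)
delta=${delta#-}   # absolute value

if (( $(echo "$curr >= 0.80 && $delta < 0.05" | bc -l) )); then
  echo "Converged: V_instance=$curr, Δ=$delta"
else
  echo "Not converged: V_instance=$curr, Δ=$delta"
fi
```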

**Decision**:
- ✅ Converged → Deploy
- ❌ Not converged → Iteration 3 (gap analysis)

**Deliverables**:
- `validation-report.md` (confidence, coverage, ROI)
- `methodology-complete.md` (production-ready)
- `transferability-guide.md` (80%+ reuse documentation)

---

## Iteration 3 (If Needed): Gap Closure (2-3 hours)

### Objectives

- Close specific gaps preventing convergence
- Reach dual-layer convergence (V_instance ≥ 0.80, V_meta ≥ 0.80)

### Gap Analysis

```markdown
## Why Not Converged?

**V_instance gaps** ([X] < 0.80):
- Metric A: [current] vs [target] = gap [Z]
- Root cause: [Analysis]
- Fix: [Action]

**V_meta gaps** ([Y] < 0.80):
- Component: [completeness/transferability/automation]
- Current: [X]
- Target: [Y]
- Fix: [Action]
```

### Focused Improvements

**Time**: 2-3 hours (targeted, not comprehensive)

**Tasks**:
- Address 1-2 major gaps only
- Refine existing work (no new patterns)
- Validate fixes

**Re-measure**:
```
V_instance: [X] ≥ 0.80? ✅/❌
V_meta: [Y] ≥ 0.80? ✅/❌
Stable for 2 iterations? ✅/❌
```

---

## Timeline Summary

### Rapid Convergence (3 iterations)

```
Pre-Iteration 0: 1-2h
Iteration 0:     3-5h (comprehensive baseline)
Iteration 1:     3-4h (automation + expansion)
Iteration 2:     3-4h (validation + convergence)
---
Total: 10-15h ✅
```

### Standard (If Iteration 3 Needed)

```
Pre-Iteration 0: 1-2h
Iteration 0:     3-5h
Iteration 1:     3-4h
Iteration 2:     3-4h
Iteration 3:     2-3h (gap closure)
---
Total: 12-18h (still faster than the standard 20-30h)
```

---

## Anti-Patterns

### ❌ Rushing Iteration 0

**Symptom**: Spending 1-2 hours (vs 3-5)
**Impact**: Low V_meta(s₀), requires more iterations
**Fix**: Invest 3-5 hours for a comprehensive baseline

### ❌ Over-Engineering Tools

**Symptom**: Spending 4+ hours per tool
**Impact**: Delays convergence
**Fix**: Simple tools (150-200 LOC, 30-60 min each)

### ❌ Premature Convergence

**Symptom**: Declaring done at V = 0.75
**Impact**: Quality issues in production
**Fix**: Respect the 0.80 threshold, ensure 2-iteration stability

---

**Source**: BAIME Rapid Convergence Strategy
**Validation**: Bootstrap-003 (3 iterations, 10 hours)
**Success Rate**: 85% (11/13 experiments)