466 lines
12 KiB
Markdown
466 lines
12 KiB
Markdown
---
|
|
name: Baseline Quality Assessment
|
|
description: Achieve comprehensive baseline (V_meta ≥0.40) in iteration 0 to enable rapid convergence. Use when planning iteration 0 time allocation, domain has established practices to reference, rich historical data exists for immediate quantification, or targeting 3-4 iteration convergence. Provides 4 quality levels (minimal/basic/comprehensive/exceptional), component-by-component V_meta calculation guide, and 3 strategies for comprehensive baseline (leverage prior art, quantify baseline, domain universality analysis). 40-50% iteration reduction when V_meta(s₀) ≥0.40 vs <0.20. Spend 3-4 extra hours in iteration 0, save 3-6 hours overall.
|
|
allowed-tools: Read, Grep, Glob, Bash, Edit, Write
|
|
---
|
|
|
|
# Baseline Quality Assessment
|
|
|
|
**Invest in iteration 0 to save 40-50% total time.**
|
|
|
|
> A strong baseline (V_meta ≥0.40) is the foundation of rapid convergence. Spend hours in iteration 0 to save days overall.
|
|
|
|
---
|
|
|
|
## When to Use This Skill
|
|
|
|
Use this skill when:
|
|
- 📋 **Planning iteration 0**: Deciding time allocation and priorities
|
|
- 🎯 **Targeting rapid convergence**: Want 3-4 iterations (not 5-7)
|
|
- 📚 **Prior art exists**: Domain has established practices to reference
|
|
- 📊 **Historical data available**: Can quantify baseline immediately
|
|
- ⏰ **Time constraints**: Need methodology in 10-15 hours total
|
|
- 🔍 **Gap clarity needed**: Want obvious iteration objectives
|
|
|
|
**Don't use when**:
|
|
- ❌ Exploratory domain (no prior art)
|
|
- ❌ Greenfield project (no historical data)
|
|
- ❌ Time abundant (standard convergence acceptable)
|
|
- ❌ Incremental baseline acceptable (build up gradually)
|
|
|
|
---
|
|
|
|
## Quick Start (30 minutes)
|
|
|
|
### Baseline Quality Self-Assessment
|
|
|
|
Calculate your V_meta(s₀):
|
|
|
|
**V_meta = (Completeness + Effectiveness + Reusability + Validation) / 4**
|
|
|
|
**Completeness** (Documentation exists?):
|
|
- 0.00: No documentation
|
|
- 0.25: Basic notes only
|
|
- 0.50: Partial documentation (some categories)
|
|
- 0.75: Most documentation complete
|
|
- 1.00: Comprehensive documentation
|
|
|
|
**Effectiveness** (Speedup quantified?):
|
|
- 0.00: No baseline measurement
|
|
- 0.25: Informal estimates
|
|
- 0.50: Some metrics measured
|
|
- 0.75: Most metrics quantified
|
|
- 1.00: Full quantitative baseline
|
|
|
|
**Reusability** (Transferable patterns?):
|
|
- 0.00: No patterns identified
|
|
- 0.25: Ad-hoc solutions only
|
|
- 0.50: Some patterns emerging
|
|
- 0.75: Most patterns codified
|
|
- 1.00: Universal patterns documented
|
|
|
|
**Validation** (Evidence-based?):
|
|
- 0.00: No validation
|
|
- 0.25: Anecdotal only
|
|
- 0.50: Some data analysis
|
|
- 0.75: Systematic analysis
|
|
- 1.00: Comprehensive validation
|
|
|
|
**Example** (Bootstrap-003, V_meta(s₀) = 0.48):
|
|
```
|
|
Completeness: 0.60 (10-category taxonomy, 79.1% coverage)
|
|
Effectiveness: 0.40 (Error rate quantified: 5.78%)
|
|
Reusability: 0.40 (5 workflows, 5 patterns, 8 guidelines)
|
|
Validation: 0.50 (1,336 errors analyzed)
|
|
---
|
|
V_meta(s₀) = (0.60 + 0.40 + 0.40 + 0.50) / 4 = 0.475 ≈ 0.48
|
|
```
|
|
|
|
**Target**: V_meta(s₀) ≥ 0.40 for rapid convergence
|
|
|
|
---
|
|
|
|
## Four Baseline Quality Levels
|
|
|
|
### Level 1: Minimal (V_meta <0.20)
|
|
|
|
**Characteristics**:
|
|
- No or minimal documentation
|
|
- No quantitative metrics
|
|
- No pattern identification
|
|
- No validation
|
|
|
|
**Iteration 0 time**: 1-2 hours
|
|
**Total iterations**: 6-10 (standard to slow convergence)
|
|
**Example**: Starting from scratch in novel domain
|
|
|
|
**When acceptable**: Exploratory research, no prior art
|
|
|
|
### Level 2: Basic (V_meta 0.20-0.39)
|
|
|
|
**Characteristics**:
|
|
- Basic documentation (notes, informal structure)
|
|
- Some metrics identified (not quantified)
|
|
- Ad-hoc patterns (not codified)
|
|
- Anecdotal validation
|
|
|
|
**Iteration 0 time**: 2-3 hours
|
|
**Total iterations**: 5-7 (standard convergence)
|
|
**Example**: Bootstrap-002 (V_meta(s₀) = 0.04, but quickly built to basic)
|
|
|
|
**When acceptable**: Standard timelines, incremental approach
|
|
|
|
### Level 3: Comprehensive (V_meta 0.40-0.60) ⭐ TARGET
|
|
|
|
**Characteristics**:
|
|
- Structured documentation (taxonomy, categories)
|
|
- Quantified metrics (baseline measured)
|
|
- Codified patterns (initial pattern library)
|
|
- Systematic validation (data analysis)
|
|
|
|
**Iteration 0 time**: 3-5 hours
|
|
**Total iterations**: 3-4 (rapid convergence)
|
|
**Example**: Bootstrap-003 (V_meta(s₀) = 0.48, converged in 3 iterations)
|
|
|
|
**When to target**: Time constrained, prior art exists, data available
|
|
|
|
### Level 4: Exceptional (V_meta >0.60)
|
|
|
|
**Characteristics**:
|
|
- Comprehensive documentation (≥90% coverage)
|
|
- Full quantitative baseline (all metrics)
|
|
- Extensive pattern library
|
|
- Validated methodology (proven in 1+ contexts)
|
|
|
|
**Iteration 0 time**: 5-8 hours
|
|
**Total iterations**: 2-3 (exceptional rapid convergence)
|
|
**Example**: Hypothetical (not yet observed in experiments)
|
|
|
|
**When to target**: Adaptation of proven methodology, domain expertise high
|
|
|
|
---
|
|
|
|
## Three Strategies for Comprehensive Baseline
|
|
|
|
### Strategy 1: Leverage Prior Art (2-3 hours)
|
|
|
|
**When**: Domain has established practices
|
|
|
|
**Steps**:
|
|
|
|
1. **Literature review** (30 min):
|
|
- Industry best practices
|
|
- Existing methodologies
|
|
- Academic research
|
|
|
|
2. **Extract patterns** (60 min):
|
|
- Common approaches
|
|
- Known anti-patterns
|
|
- Success metrics
|
|
|
|
3. **Adapt to context** (60 min):
|
|
- What's applicable?
|
|
- What needs modification?
|
|
- What's missing?
|
|
|
|
**Example** (Bootstrap-003):
|
|
```
|
|
Prior art: Error handling literature
|
|
- Detection: Industry standard (logs, monitoring)
|
|
- Diagnosis: Root cause analysis patterns
|
|
- Recovery: Retry, fallback patterns
|
|
- Prevention: Static analysis, linting
|
|
|
|
Adaptation:
|
|
- Detection: meta-cc MCP queries (novel application)
|
|
- Diagnosis: Session history analysis (context-specific)
|
|
- Recovery: Generic patterns apply
|
|
- Prevention: Pre-tool validation (novel approach)
|
|
|
|
Result: V_completeness = 0.60 (60% from prior art, 40% novel)
|
|
```
|
|
|
|
### Strategy 2: Quantify Baseline (1-2 hours)
|
|
|
|
**When**: Rich historical data exists
|
|
|
|
**Steps**:
|
|
|
|
1. **Identify data sources** (15 min):
|
|
- Logs, session history, metrics
|
|
- Git history, CI/CD logs
|
|
- Issue trackers, user feedback
|
|
|
|
2. **Extract metrics** (30 min):
|
|
- Volume (total instances)
|
|
- Rate (frequency)
|
|
- Distribution (categories)
|
|
- Impact (cost)
|
|
|
|
3. **Analyze patterns** (45 min):
|
|
- What's most common?
|
|
- What's most costly?
|
|
- What's preventable?
|
|
|
|
**Example** (Bootstrap-003):
|
|
```
|
|
Data source: meta-cc MCP server
|
|
Query: meta-cc query-tools --status error
|
|
|
|
Results:
|
|
- Volume: 1,336 errors
|
|
- Rate: 5.78% error rate
|
|
- Distribution: File-not-found 12.2%, Read-before-write 5.2%, etc.
|
|
- Impact: MTTD 15 min, MTTR 30 min
|
|
|
|
Analysis:
|
|
- Top 3 categories account for 23.7% of errors
|
|
- File path issues most preventable
|
|
- Clear automation opportunities
|
|
|
|
Result: V_effectiveness = 0.40 (baseline quantified)
|
|
```
|
|
|
|
### Strategy 3: Domain Universality Analysis (1-2 hours)
|
|
|
|
**When**: Domain is universal (errors, testing, CI/CD)
|
|
|
|
**Steps**:
|
|
|
|
1. **Identify universal patterns** (30 min):
|
|
- What applies to all projects?
|
|
- What's language-agnostic?
|
|
- What's platform-agnostic?
|
|
|
|
2. **Document transferability** (30 min):
|
|
- What % is reusable?
|
|
- What needs adaptation?
|
|
- What's project-specific?
|
|
|
|
3. **Create initial taxonomy** (30 min):
|
|
- Categorize patterns
|
|
- Identify gaps
|
|
- Estimate coverage
|
|
|
|
**Example** (Bootstrap-003):
|
|
```
|
|
Universal patterns:
|
|
- Errors affect all software (100% universal)
|
|
- Detection, diagnosis, recovery, prevention (universal workflow)
|
|
- File operations, API calls, data validation (universal categories)
|
|
|
|
Taxonomy (iteration 0):
|
|
- 10 categories identified
|
|
- 1,058 errors classified (79.1% coverage)
|
|
- Gaps: Edge cases, complex interactions
|
|
|
|
Result: V_reusability = 0.40 (universal patterns identified)
|
|
```
|
|
|
|
---
|
|
|
|
## Baseline Investment ROI
|
|
|
|
**Trade-off**: Spend more in iteration 0 to save overall time
|
|
|
|
**Data** (from experiments):
|
|
|
|
| Baseline | Iter 0 Time | Total Iterations | Total Time | Savings |
|
|
|----------|-------------|------------------|------------|---------|
|
|
| Minimal (<0.20) | 1-2h | 6-10 | 24-40h | Baseline |
|
|
| Basic (0.20-0.39) | 2-3h | 5-7 | 20-28h | 10-30% |
|
|
| Comprehensive (0.40-0.60) | 3-5h | 3-4 | 12-16h | 40-50% |
|
|
| Exceptional (>0.60) | 5-8h | 2-3 | 10-15h | 50-60% |
|
|
|
|
**Example** (Bootstrap-003):
|
|
```
|
|
Comprehensive baseline:
|
|
- Iteration 0: 3 hours (vs 1 hour minimal)
|
|
- Total: 10 hours, 3 iterations
|
|
- Savings: 15-25 hours vs minimal baseline (60-70%)
|
|
|
|
ROI: +2 hours investment → 15-25 hours saved
|
|
```
|
|
|
|
**Recommendation**: Target comprehensive (V_meta ≥0.40) when:
|
|
- Time constrained (need fast convergence)
|
|
- Prior art exists (can leverage quickly)
|
|
- Data available (can quantify immediately)
|
|
|
|
---
|
|
|
|
## Component-by-Component Guide
|
|
|
|
### Completeness (Documentation)
|
|
|
|
**0.00**: No documentation
|
|
|
|
**0.25**: Basic notes
|
|
- Informal observations
|
|
- Bullet points
|
|
- No structure
|
|
|
|
**0.50**: Partial documentation
|
|
- Some categories/patterns
|
|
- 40-60% coverage
|
|
- Basic structure
|
|
|
|
**0.75**: Most documentation
|
|
- Structured taxonomy
|
|
- 70-90% coverage
|
|
- Clear organization
|
|
|
|
**1.00**: Comprehensive
|
|
- Complete taxonomy
|
|
- 90%+ coverage
|
|
- Production-ready
|
|
|
|
**Target for V_meta ≥0.40**: Completeness ≥0.50
|
|
|
|
### Effectiveness (Quantification)
|
|
|
|
**0.00**: No baseline measurement
|
|
|
|
**0.25**: Informal estimates
|
|
- "Errors happen sometimes"
|
|
- No numbers
|
|
|
|
**0.50**: Some metrics
|
|
- Volume measured (e.g., 1,336 errors)
|
|
- Rate not calculated
|
|
|
|
**0.75**: Most metrics
|
|
- Volume, rate, distribution
|
|
- Missing impact (MTTD/MTTR)
|
|
|
|
**1.00**: Full quantification
|
|
- All metrics measured
|
|
- Baseline fully quantified
|
|
|
|
**Target for V_meta ≥0.40**: Effectiveness ≥0.30
|
|
|
|
### Reusability (Patterns)
|
|
|
|
**0.00**: No patterns
|
|
|
|
**0.25**: Ad-hoc solutions
|
|
- One-off fixes
|
|
- No generalization
|
|
|
|
**0.50**: Some patterns
|
|
- 3-5 patterns identified
|
|
- Partial universality
|
|
|
|
**0.75**: Most patterns
|
|
- 5-10 patterns codified
|
|
- High transferability
|
|
|
|
**1.00**: Universal patterns
|
|
- Complete pattern library
|
|
- 90%+ transferable
|
|
|
|
**Target for V_meta ≥0.40**: Reusability ≥0.40
|
|
|
|
### Validation (Evidence)
|
|
|
|
**0.00**: No validation
|
|
|
|
**0.25**: Anecdotal
|
|
- "Seems to work"
|
|
- No data
|
|
|
|
**0.50**: Some data
|
|
- Basic analysis
|
|
- Limited scope
|
|
|
|
**0.75**: Systematic
|
|
- Comprehensive analysis
|
|
- Clear evidence
|
|
|
|
**1.00**: Validated
|
|
- Multiple contexts
|
|
- Statistical confidence
|
|
|
|
**Target for V_meta ≥0.40**: Validation ≥0.30
|
|
|
|
---
|
|
|
|
## Iteration 0 Checklist (for V_meta ≥0.40)
|
|
|
|
**Documentation** (Target: Completeness ≥0.50):
|
|
- [ ] Create initial taxonomy (≥5 categories)
|
|
- [ ] Document 3-5 patterns/workflows
|
|
- [ ] Achieve 60-80% coverage
|
|
- [ ] Structured markdown documentation
|
|
|
|
**Quantification** (Target: Effectiveness ≥0.30):
|
|
- [ ] Measure volume (total instances)
|
|
- [ ] Calculate rate (frequency)
|
|
- [ ] Analyze distribution (category breakdown)
|
|
- [ ] Baseline quantified with numbers
|
|
|
|
**Patterns** (Target: Reusability ≥0.40):
|
|
- [ ] Identify 3-5 universal patterns
|
|
- [ ] Document transferability
|
|
- [ ] Estimate reusability %
|
|
- [ ] Distinguish universal vs domain-specific
|
|
|
|
**Validation** (Target: Validation ≥0.30):
|
|
- [ ] Analyze historical data
|
|
- [ ] Sample validation (≥30 instances)
|
|
- [ ] Evidence-based claims
|
|
- [ ] Data sources documented
|
|
|
|
**Time Investment**: 3-5 hours
|
|
|
|
**Expected V_meta(s₀)**: 0.40-0.50
|
|
|
|
---
|
|
|
|
## Success Criteria
|
|
|
|
Baseline quality assessment succeeded when:
|
|
|
|
1. **V_meta target met**: V_meta(s₀) ≥ 0.40 achieved
|
|
2. **Iteration reduction**: 3-4 iterations vs 5-7 (40-50% reduction)
|
|
3. **Time savings**: Total time ≤12-16 hours (comprehensive baseline)
|
|
4. **Gap clarity**: Clear objectives for iteration 1-2
|
|
5. **ROI positive**: Baseline investment <total time saved
|
|
|
|
**Bootstrap-003 Validation**:
|
|
- ✅ V_meta(s₀) = 0.48 (target met)
|
|
- ✅ 3 iterations (vs 6 for Bootstrap-002 with minimal baseline)
|
|
- ✅ 10 hours total (60% reduction)
|
|
- ✅ Gaps clear (top 3 automations identified)
|
|
- ✅ ROI: +2h investment → 15h saved
|
|
|
|
---
|
|
|
|
## Related Skills
|
|
|
|
**Parent framework**:
|
|
- [methodology-bootstrapping](../methodology-bootstrapping/SKILL.md) - Core OCA cycle
|
|
|
|
**Uses baseline for**:
|
|
- [rapid-convergence](../rapid-convergence/SKILL.md) - V_meta ≥0.40 is criterion #1
|
|
|
|
**Validation**:
|
|
- [retrospective-validation](../retrospective-validation/SKILL.md) - Data quantification
|
|
|
|
---
|
|
|
|
## References
|
|
|
|
**Core guide**:
|
|
- [Quality Levels](reference/quality-levels.md) - Detailed level definitions
|
|
- [Component Guide](reference/components.md) - V_meta calculation
|
|
- [Investment ROI](reference/roi.md) - Time savings analysis
|
|
|
|
**Examples**:
|
|
- [Bootstrap-003 Comprehensive](examples/error-recovery-comprehensive-baseline.md) - V_meta=0.48
|
|
- [Bootstrap-002 Minimal](examples/testing-strategy-minimal-baseline.md) - V_meta=0.04
|
|
|
|
---
|
|
|
|
**Status**: ✅ Validated | 40-50% iteration reduction | Positive ROI
|