# Three-Layer OCA Architecture
**Version**: 1.0
**Framework**: BAIME - Observe-Codify-Automate
**Layers**: 3 (Observe, Codify, Automate)
Complete architectural reference for the OCA cycle.
---
## Overview
The OCA (Observe-Codify-Automate) cycle is the core of BAIME, consisting of three iterative layers that transform ad-hoc development into systematic, reusable methodologies.
```
ITERATION N:
Observe → Codify → Automate → [Next Iteration]
   ↑                  ↓
   └──── Feedback ────┘
```
---
## Layer 1: Observe
**Purpose**: Gather empirical data through hands-on work
**Duration**: 30-40% of iteration time (~20-30 min)
**Activities**:
1. **Apply** existing patterns/tools (if any)
2. **Execute** actual work on project
3. **Measure** results and effectiveness
4. **Identify** problems and gaps
5. **Document** observations
**Outputs**:
- Baseline metrics
- Problem list (prioritized)
- Pattern usage data
- Time measurements
- Quality metrics
**Example** (Testing Strategy, Iteration 1):
```markdown
## Observations
**Applied**:
- Wrote 5 unit tests manually
- Tried different test structures
**Measured**:
- Time per test: 15-20 min
- Coverage increase: +2.3%
- Tests passing: 5/5 (100%)
**Problems Identified**:
1. Setup code duplicated across tests
2. Unclear which functions to test first
3. No standard test structure
4. Coverage analysis manual and slow
**Time Spent**: 90 min (5 tests × 18 min avg)
```
### Observation Techniques
#### 1. Baseline Measurement
**What to measure**:
- Current state metrics (coverage, build time, error rate)
- Time spent on tasks
- Pain points and blockers
- Quality indicators
**Tools**:
```bash
# Testing
go test -cover ./...
go tool cover -func=coverage.out
# CI/CD
time make build
grep "FAIL" ci-logs.txt | wc -l
# Errors
grep "error" session.jsonl | wc -l
```
#### 2. Work Sampling
**Technique**: Track time on representative tasks
**Example**:
```markdown
Task: Write 5 unit tests
Sample 1: TestFunction1 - 18 min
Sample 2: TestFunction2 - 15 min
Sample 3: TestFunction3 - 22 min (complex)
Sample 4: TestFunction4 - 12 min (simple)
Sample 5: TestFunction5 - 16 min
Average: 16.6 min per test
Range: 12-22 min
Variance: High (complexity-dependent)
```
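Summarizing samples like these takes only a few lines; a minimal sketch, with the sample values taken from the table above:
```python
from statistics import mean

# Sample times in minutes, from the five tasks above
times = [18, 15, 22, 12, 16]

print(f"Average: {mean(times):.1f} min per test")   # 16.6
print(f"Range: {min(times)}-{max(times)} min")      # 12-22
```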
#### 3. Problem Taxonomy
**Classify problems** by frequency and impact (see the sketch after this list):
- **High frequency, high impact**: Urgent patterns needed
- **High frequency, low impact**: Automation candidates
- **Low frequency, high impact**: Document workarounds
- **Low frequency, low impact**: Ignore
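This triage can be expressed as a simple lookup; a minimal sketch (the function name and action strings are illustrative, not part of the framework):
```python
def triage(frequency: str, impact: str) -> str:
    # Map a (frequency, impact) pair to the action from the taxonomy above
    actions = {
        ("high", "high"): "Urgent: create a pattern now",
        ("high", "low"): "Candidate for automation",
        ("low", "high"): "Document a workaround",
        ("low", "low"): "Ignore",
    }
    return actions[(frequency, impact)]

print(triage("high", "low"))  # Candidate for automation
```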
---
## Layer 2: Codify
**Purpose**: Transform observations into documented patterns
**Duration**: 35-45% of iteration time (~25-35 min)
**Activities**:
1. **Analyze** observations for patterns
2. **Design** reusable solutions
3. **Document** patterns with examples
4. **Test** patterns on 2-3 cases
5. **Refine** based on feedback
**Outputs**:
- Pattern documents (problem-solution pairs)
- Code examples
- Usage guidelines
- Time/quality metrics per pattern
**Example** (Testing Strategy, Iteration 1):
````markdown
## Pattern: Table-Driven Tests
**Problem**: Writing multiple similar test cases is repetitive
**Solution**: Use table-driven pattern with test struct
**Structure**:
```go
func TestFunction(t *testing.T) {
    tests := []struct {
        name     string
        input    Type
        expected Type
    }{
        {"case1", input1, output1},
        {"case2", input2, output2},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := Function(tt.input)
            assert.Equal(t, tt.expected, got)
        })
    }
}
```
**Time**: 12 min per test (vs 18 min manual)
**Savings**: 33% time reduction
**Validated**: 3 test functions, all passed
````
### Codification Techniques
#### 1. Pattern Template
```markdown
## Pattern: [Name]
**Category**: [Testing/CI/Error/etc.]
**Problem**:
[What problem does this solve?]
**Context**:
[When is this applicable?]
**Solution**:
[How to solve it? Step-by-step]
**Structure**:
[Code template or procedure]
**Example**:
[Real working example]
**Metrics**:
- Time: [X min]
- Quality: [metric]
- Reusability: [X%]
**Variations**:
[Alternative approaches]
**Anti-patterns**:
[Common mistakes]
```
#### 2. Pattern Hierarchy
**Level 1: Core Patterns** (6-8)
- Universal, high frequency
- Foundation for other patterns
- Example: Table-driven tests, Error classification
**Level 2: Composite Patterns** (2-4)
- Combine multiple core patterns
- Domain-specific
- Example: Coverage-driven gap closure (table-driven + prioritization)
**Level 3: Specialized Patterns** (0-2)
- Rare, specific use cases
- Optional extensions
- Example: Golden file testing for large outputs
#### 3. Progressive Refinement
- **Iteration 0**: Observe only (no patterns yet)
- **Iteration 1**: 2-3 core patterns (basics)
- **Iteration 2**: 4-6 patterns (expanded)
- **Iteration 3**: 6-8 patterns (refined)
- **Iteration 4+**: Consolidate, no new patterns
---
## Layer 3: Automate
**Purpose**: Create tools to accelerate pattern application
**Duration**: 20-30% of iteration time (~15-20 min)
**Activities**:
1. **Identify** repetitive tasks (>3 times)
2. **Design** automation approach
3. **Implement** scripts/tools
4. **Test** on real examples
5. **Measure** speedup
**Outputs**:
- Automation scripts
- Tool documentation
- Speedup metrics (Nx faster)
- ROI calculations
**Example** (Testing Strategy, Iteration 2):
````markdown
## Tool: Coverage Gap Analyzer
**Purpose**: Identify which functions need tests (automated)
**Implementation**:
```bash
#!/bin/bash
# scripts/analyze-coverage-gaps.sh
go tool cover -func=coverage.out |
  grep "0.0%" |
  awk '{print $1, $2}' |
  while read -r file func; do
    # Categorize function type by name
    if grep -q "Error\|Valid" <<< "$func"; then
      echo "P1: $file:$func (error handling)"
    elif grep -q "Parse\|Process" <<< "$func"; then
      echo "P2: $file:$func (business logic)"
    else
      echo "P3: $file:$func (utility)"
    fi
  done | sort
```
**Speedup**: 15 min manual → 5 sec automated (180x)
**ROI**: 30 min investment, 10 uses = 150 min saved = 5x ROI
**Validated**: Used in iterations 2-4, always accurate
````
### Automation Techniques
#### 1. ROI Calculation
```
ROI = (time_saved × uses) / time_invested
Example:
- Manual task: 10 min
- Automation time: 1 hour
- Break-even: 6 uses
- Expected uses: 20
- ROI = (10 × 20) / 60 = 3.3x
```
**Rules** (see the sketch after this list):
- ROI < 2x: Don't automate (not worth it)
- ROI 2-5x: Automate if frequently used
- ROI > 5x: Always automate
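A minimal Python sketch of the calculation and decision rule above (function names are illustrative):
```python
def roi(minutes_saved_per_use: float, expected_uses: int,
        minutes_invested: float) -> float:
    # ROI = (time_saved × uses) / time_invested
    return minutes_saved_per_use * expected_uses / minutes_invested

def should_automate(r: float, frequently_used: bool) -> bool:
    # < 2x: don't automate; 2-5x: automate if frequently used; > 5x: always
    return r >= 5 or (r >= 2 and frequently_used)

r = roi(10, 20, 60)  # the worked example above
print(f"{r:.1f}x", should_automate(r, frequently_used=True))  # 3.3x True
```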
#### 2. Automation Tiers
**Tier 1: Simple Scripts** (15-30 min)
- Bash/Python scripts
- Parse existing tool output
- Generate boilerplate
- Example: Coverage gap analyzer
**Tier 2: Workflow Tools** (1-2 hours)
- Multi-step automation
- Integrate multiple tools
- Smart suggestions
- Example: Test generator with pattern detection
**Tier 3: Full Integration** (>2 hours)
- IDE/editor plugins
- CI/CD integration
- Pre-commit hooks
- Example: Automated methodology guide
**Start with Tier 1**; progress to Tier 2/3 only if the ROI justifies it.
#### 3. Incremental Automation
- **Phase 1**: Manual process documented
- **Phase 2**: Script to assist (not fully automated)
- **Phase 3**: Fully automated with validation
- **Phase 4**: Integrated into workflow (hooks, CI)
**Example** (Test generation):
```
Phase 1: Copy-paste test template manually
Phase 2: Script generates template, manual fill-in
Phase 3: Script generates with smart defaults
Phase 4: Pre-commit hook suggests tests for new functions
```
---
## Dual-Layer Value Functions
### V_instance (Instance Quality)
**Measures**: Quality of work produced using methodology
**Formula**:
```
V_instance = Σ(w_i × metric_i)
Where:
- w_i = weight for metric i
- metric_i = normalized metric value (0-1)
- Σw_i = 1.0
```
**Example** (Testing):
```
V_instance = 0.5 × (coverage / target) +
             0.3 × (pass_rate) +
             0.2 × (maintainability)

Target: V_instance ≥ 0.80
```
**Convergence**: Stable for 2 consecutive iterations
### V_meta (Methodology Quality)
**Measures**: Quality and reusability of methodology itself
**Formula**:
```
V_meta = 0.4 × completeness +
         0.3 × transferability +
         0.3 × automation_effectiveness

Where:
- completeness = patterns_documented / patterns_needed
- transferability = cross_project_reuse_score (0-1)
- automation_effectiveness = 1 - (time_with_tools / time_manual)
```
**Example** (Testing):
```
V_meta = 0.4 × (8/8) +
         0.3 × (0.90) +
         0.3 × (1 - 4 min/20 min)
       = 0.40 + 0.27 + 0.24
       = 0.91

Target: V_meta ≥ 0.80
```
**Convergence**: Stable for 2 consecutive iterations
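For concreteness, a minimal Python sketch of both value functions with the weights above. Capping coverage/target at 1.0 is an added assumption, and the parameter names are illustrative:
```python
def v_instance(coverage: float, target: float, pass_rate: float,
               maintainability: float) -> float:
    # Testing-domain V_instance with weights 0.5 / 0.3 / 0.2
    return (0.5 * min(coverage / target, 1.0)
            + 0.3 * pass_rate
            + 0.2 * maintainability)

def v_meta(patterns_documented: int, patterns_needed: int,
           transferability: float, time_with_tools: float,
           time_manual: float) -> float:
    # V_meta with weights 0.4 / 0.3 / 0.3
    completeness = patterns_documented / patterns_needed
    automation_effectiveness = 1 - time_with_tools / time_manual
    return (0.4 * completeness
            + 0.3 * transferability
            + 0.3 * automation_effectiveness)

print(round(v_meta(8, 8, 0.90, 4, 20), 2))  # 0.91, the worked example above
```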
### Dual Convergence Criteria
**Both must be met**:
1. V_instance ≥ 0.80 for 2 consecutive iterations
2. V_meta ≥ 0.80 for 2 consecutive iterations
**Why dual-layer?**:
- V_instance alone: Could be good results with bad process
- V_meta alone: Could be great methodology with poor results
- Both together: Good results + reusable methodology
---
## Iteration Coordination
### Standard Flow
```
ITERATION N:
├─ Start (5 min)
│ ├─ Review previous iteration results
│ ├─ Set goals for this iteration
│ └─ Load context (patterns, tools, metrics)
├─ Observe (25 min)
│ ├─ Apply existing patterns
│ ├─ Work on project tasks
│ ├─ Measure results
│ └─ Document problems
├─ Codify (30 min)
│ ├─ Analyze observations
│ ├─ Create/refine patterns
│ ├─ Document with examples
│ └─ Validate on 2-3 cases
├─ Automate (20 min)
│ ├─ Identify automation opportunities
│ ├─ Create/improve tools
│ ├─ Measure speedup
│ └─ Calculate ROI
└─ Close (10 min)
├─ Calculate V_instance and V_meta
├─ Check convergence criteria
├─ Document iteration summary
└─ Plan next iteration (if needed)
```
### Convergence Detection
```python
def check_convergence(history):
    if len(history) < 2:
        return False
    # Check last 2 iterations
    last_two = history[-2:]
    # Both V_instance and V_meta must be ≥ 0.80
    instance_converged = all(v.instance >= 0.80 for v in last_two)
    meta_converged = all(v.meta >= 0.80 for v in last_two)
    # No significant gaps remaining
    no_critical_gaps = last_two[-1].critical_gaps == 0
    return instance_converged and meta_converged and no_critical_gaps
```
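A hypothetical usage, assuming each history entry exposes `instance`, `meta`, and `critical_gaps` fields:
```python
from dataclasses import dataclass

@dataclass
class IterationResult:
    instance: float
    meta: float
    critical_gaps: int

history = [IterationResult(0.82, 0.81, 0), IterationResult(0.85, 0.83, 0)]
print(check_convergence(history))  # True: both values ≥ 0.80 twice, no gaps
```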
---
## Best Practices
### Do's
- **Start with Observe** - Don't skip the baseline
- **Validate patterns** - Test on 2-3 real examples
- **Measure everything** - Time, quality, speedup
- **Iterate quickly** - 60-90 min per iteration
- **Focus on ROI** - Automate high-value tasks
- **Document continuously** - Don't wait until the end
### Don'ts
- **Don't skip Observe** - Patterns without data are guesses
- **Don't over-codify** - 6-8 patterns maximum
- **Don't automate prematurely** - Understand the problem first
- **Don't ignore transferability** - Aim for 80%+ reuse
- **Don't continue past convergence** - Stop at dual 0.80
---
## Architecture Variations
### Rapid Convergence (3-4 iterations)
**Modifications**:
- Strong Iteration 0 (comprehensive baseline)
- Borrow patterns from similar projects (70-90% reuse)
- Parallel pattern development
- Focus on high-impact only
### Slow Convergence (>6 iterations)
**Causes**:
- Weak Iteration 0 (insufficient baseline)
- Too many patterns (>10)
- Complex domain
- Insufficient automation
**Fixes**:
- Strengthen baseline analysis
- Consolidate patterns
- Increase automation investment
- Focus on critical paths only
---
**Source**: BAIME Framework
**Status**: Production-ready, validated across 13 methodologies
**Convergence Rate**: 100% (all experiments converged)
**Average Iterations**: 4.9 (median 5)