# Three-Layer OCA Architecture

**Version**: 1.0
**Framework**: BAIME - Observe-Codify-Automate
**Layers**: 3 (Observe, Codify, Automate)

Complete architectural reference for the OCA cycle.

---

## Overview

The OCA (Observe-Codify-Automate) cycle is the core of BAIME, consisting of three iterative layers that transform ad-hoc development into systematic, reusable methodologies.

```
ITERATION N:
   Observe → Codify → Automate → [Next Iteration]
      ↑                  ↓
      └───── Feedback ───┘
```

---

## Layer 1: Observe

**Purpose**: Gather empirical data through hands-on work
**Duration**: 30-40% of iteration time (~20-30 min)

**Activities**:
1. **Apply** existing patterns/tools (if any)
2. **Execute** actual work on the project
3. **Measure** results and effectiveness
4. **Identify** problems and gaps
5. **Document** observations

**Outputs**:
- Baseline metrics
- Problem list (prioritized)
- Pattern usage data
- Time measurements
- Quality metrics

**Example** (Testing Strategy, Iteration 1):

```markdown
## Observations

**Applied**:
- Wrote 5 unit tests manually
- Tried different test structures

**Measured**:
- Time per test: 15-20 min
- Coverage increase: +2.3%
- Tests passing: 5/5 (100%)

**Problems Identified**:
1. Setup code duplicated across tests
2. Unclear which functions to test first
3. No standard test structure
4. Coverage analysis manual and slow

**Time Spent**: 90 min (5 tests × 18 min avg)
```

### Observation Techniques

#### 1. Baseline Measurement

**What to measure**:
- Current state metrics (coverage, build time, error rate)
- Time spent on tasks
- Pain points and blockers
- Quality indicators

**Tools**:
```bash
# Testing
go test -cover ./...
go tool cover -func=coverage.out

# CI/CD time
make build
grep "FAIL" ci-logs.txt | wc -l

# Errors
grep "error" session.jsonl | wc -l
```

#### 2. Work Sampling

**Technique**: Track time on representative tasks

**Example**:
```markdown
Task: Write 5 unit tests

Sample 1: TestFunction1 - 18 min
Sample 2: TestFunction2 - 15 min
Sample 3: TestFunction3 - 22 min (complex)
Sample 4: TestFunction4 - 12 min (simple)
Sample 5: TestFunction5 - 16 min

Average: 16.6 min per test
Range: 12-22 min
Variance: High (complexity-dependent)
```

#### 3. Problem Taxonomy

**Classify problems** by frequency and impact (a sketch of this triage rule follows below):
- **High frequency, high impact**: Urgent patterns needed
- **High frequency, low impact**: Automation candidates
- **Low frequency, high impact**: Document workarounds
- **Low frequency, low impact**: Ignore
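
The quadrant rule above can be captured in a few lines of code. This is a minimal sketch only; the `triage` helper and its action labels are illustrative assumptions, not part of BAIME:

```python
# Minimal sketch of the problem-taxonomy triage rule.
# The function name and action labels are illustrative assumptions,
# not prescribed by BAIME.

def triage(frequency: str, impact: str) -> str:
    """Map a problem's frequency/impact ('high' or 'low') to a suggested action."""
    rules = {
        ("high", "high"): "Urgent: codify a pattern this iteration",
        ("high", "low"):  "Automation candidate",
        ("low", "high"):  "Document a workaround",
        ("low", "low"):   "Ignore",
    }
    return rules[(frequency, impact)]

if __name__ == "__main__":
    print(triage("high", "low"))  # -> "Automation candidate"
```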

---

## Layer 2: Codify

**Purpose**: Transform observations into documented patterns
**Duration**: 35-45% of iteration time (~25-35 min)

**Activities**:
1. **Analyze** observations for patterns
2. **Design** reusable solutions
3. **Document** patterns with examples
4. **Test** patterns on 2-3 cases
5. **Refine** based on feedback

**Outputs**:
- Pattern documents (problem-solution pairs)
- Code examples
- Usage guidelines
- Time/quality metrics per pattern

**Example** (Testing Strategy, Iteration 1):

```markdown
## Pattern: Table-Driven Tests

**Problem**: Writing multiple similar test cases is repetitive

**Solution**: Use the table-driven pattern with a test struct

**Structure**:
```go
func TestFunction(t *testing.T) {
    tests := []struct {
        name     string
        input    Type
        expected Type
    }{
        {"case1", input1, output1},
        {"case2", input2, output2},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := Function(tt.input)
            assert.Equal(t, tt.expected, got)
        })
    }
}
```

**Time**: 12 min per test (vs 18 min manual)
**Savings**: 33% time reduction
**Validated**: 3 test functions, all passed
```

### Codification Techniques

#### 1. Pattern Template

```markdown
## Pattern: [Name]

**Category**: [Testing/CI/Error/etc.]

**Problem**: [What problem does this solve?]

**Context**: [When is this applicable?]

**Solution**: [How to solve it? Step-by-step]

**Structure**: [Code template or procedure]

**Example**: [Real working example]

**Metrics**:
- Time: [X min]
- Quality: [metric]
- Reusability: [X%]

**Variations**: [Alternative approaches]

**Anti-patterns**: [Common mistakes]
```

#### 2. Pattern Hierarchy

**Level 1: Core Patterns** (6-8)
- Universal, high frequency
- Foundation for other patterns
- Example: Table-driven tests, Error classification

**Level 2: Composite Patterns** (2-4)
- Combine multiple core patterns
- Domain-specific
- Example: Coverage-driven gap closure (table-driven + prioritization)

**Level 3: Specialized Patterns** (0-2)
- Rare, specific use cases
- Optional extensions
- Example: Golden file testing for large outputs

#### 3. Progressive Refinement

**Iteration 0**: Observe only (no patterns yet)
**Iteration 1**: 2-3 core patterns (basics)
**Iteration 2**: 4-6 patterns (expanded)
**Iteration 3**: 6-8 patterns (refined)
**Iteration 4+**: Consolidate, no new patterns

---

## Layer 3: Automate

**Purpose**: Create tools to accelerate pattern application
**Duration**: 20-30% of iteration time (~15-20 min)

**Activities**:
1. **Identify** repetitive tasks (performed >3 times)
2. **Design** an automation approach
3. **Implement** scripts/tools
4. **Test** on real examples
5. **Measure** speedup

**Outputs**:
- Automation scripts
- Tool documentation
- Speedup metrics (Nx faster)
- ROI calculations

**Example** (Testing Strategy, Iteration 2):

```markdown
## Tool: Coverage Gap Analyzer

**Purpose**: Identify which functions need tests (automated)

**Implementation**:
```bash
#!/bin/bash
# scripts/analyze-coverage-gaps.sh
go tool cover -func=coverage.out | grep "0.0%" | awk '{print $1, $2}' |
while read file func; do
    # Categorize function type
    if grep -q "Error\|Valid" <<< "$func"; then
        echo "P1: $file:$func (error handling)"
    elif grep -q "Parse\|Process" <<< "$func"; then
        echo "P2: $file:$func (business logic)"
    else
        echo "P3: $file:$func (utility)"
    fi
done | sort
```

**Speedup**: 15 min manual → 5 sec automated (180x)
**ROI**: 30 min investment, 10 uses = 150 min saved = 5x ROI
**Validated**: Used in iterations 2-4, always accurate
```

### Automation Techniques

#### 1. ROI Calculation

```
ROI = (time_saved × uses) / time_invested

Example:
- Manual task: 10 min
- Automation time: 1 hour
- Break-even: 6 uses
- Expected uses: 20
- ROI = (10 × 20) / 60 = 3.3x
```

**Rules** (a small worked sketch follows below):
- ROI < 2x: Don't automate (not worth it)
- ROI 2-5x: Automate if frequently used
- ROI > 5x: Always automate
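
As referenced above, here is a minimal sketch of the ROI and break-even arithmetic. The helper names (`roi`, `break_even_uses`) and the example values are illustrative, not a BAIME API:

```python
# Minimal sketch of the ROI / break-even arithmetic described above.
# Function names and example values are illustrative only.

def roi(time_saved_min: float, uses: int, time_invested_min: float) -> float:
    """ROI = (time_saved × uses) / time_invested."""
    return (time_saved_min * uses) / time_invested_min

def break_even_uses(time_saved_min: float, time_invested_min: float) -> float:
    """Number of uses before the automation pays for itself."""
    return time_invested_min / time_saved_min

if __name__ == "__main__":
    # 10 min saved per use, 20 expected uses, 60 min to build the tool
    print(round(roi(10, 20, 60), 1))   # 3.3x -> automate if frequently used
    print(break_even_uses(10, 60))     # 6.0 uses to break even
```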

#### 2. Automation Tiers

**Tier 1: Simple Scripts** (15-30 min)
- Bash/Python scripts
- Parse existing tool output
- Generate boilerplate
- Example: Coverage gap analyzer

**Tier 2: Workflow Tools** (1-2 hours)
- Multi-step automation
- Integrate multiple tools
- Smart suggestions
- Example: Test generator with pattern detection

**Tier 3: Full Integration** (>2 hours)
- IDE/editor plugins
- CI/CD integration
- Pre-commit hooks
- Example: Automated methodology guide

**Start with Tier 1**; only progress to Tier 2/3 if the ROI justifies it.

#### 3. Incremental Automation

**Phase 1**: Manual process documented
**Phase 2**: Script to assist (not fully automated)
**Phase 3**: Fully automated with validation
**Phase 4**: Integrated into workflow (hooks, CI)

**Example** (Test generation):
```
Phase 1: Copy-paste test template manually
Phase 2: Script generates template, manual fill-in
Phase 3: Script generates with smart defaults
Phase 4: Pre-commit hook suggests tests for new functions
```

---

## Dual-Layer Value Functions

### V_instance (Instance Quality)

**Measures**: Quality of the work produced using the methodology

**Formula**:
```
V_instance = Σ(w_i × metric_i)

Where:
- w_i = weight for metric i
- metric_i = normalized metric value (0-1)
- Σw_i = 1.0
```

**Example** (Testing):
```
V_instance = 0.5 × (coverage/target)
           + 0.3 × (pass_rate)
           + 0.2 × (maintainability)

Target: V_instance ≥ 0.80
```

**Convergence**: Stable for 2 consecutive iterations

### V_meta (Methodology Quality)

**Measures**: Quality and reusability of the methodology itself

**Formula**:
```
V_meta = 0.4 × completeness
       + 0.3 × transferability
       + 0.3 × automation_effectiveness

Where:
- completeness = patterns_documented / patterns_needed
- transferability = cross_project_reuse_score (0-1)
- automation_effectiveness = time_with_tools / time_manual
```

**Example** (Testing):
```
V_meta = 0.4 × (8/8) + 0.3 × (0.90) + 0.3 × (4min/20min)
       = 0.40 + 0.27 + 0.06
       = 0.73

Target: V_meta ≥ 0.80
```

**Convergence**: Stable for 2 consecutive iterations
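
For concreteness, here is a minimal sketch of how the two scores could be computed in code, following the weights and formulas stated above. The function and parameter names, and the V_instance example inputs, are illustrative assumptions rather than a prescribed API:

```python
# Minimal sketch of the dual-layer value functions, using the weights
# stated above. Names and example inputs are illustrative assumptions.

def v_instance(coverage: float, target: float,
               pass_rate: float, maintainability: float) -> float:
    """Testing-domain example: 0.5/0.3/0.2 weighting, each metric normalized to [0, 1]."""
    return (0.5 * min(coverage / target, 1.0)
            + 0.3 * pass_rate
            + 0.2 * maintainability)

def v_meta(patterns_documented: int, patterns_needed: int,
           transferability: float,
           time_with_tools_min: float, time_manual_min: float) -> float:
    """V_meta = 0.4*completeness + 0.3*transferability + 0.3*automation_effectiveness."""
    completeness = patterns_documented / patterns_needed
    automation_effectiveness = time_with_tools_min / time_manual_min
    return (0.4 * completeness
            + 0.3 * transferability
            + 0.3 * automation_effectiveness)

if __name__ == "__main__":
    # Hypothetical instance metrics: 75% coverage against an 80% target
    print(round(v_instance(0.75, 0.80, 1.0, 0.8), 2))  # 0.93
    # Reproduces the worked V_meta example: 0.40 + 0.27 + 0.06 = 0.73
    print(round(v_meta(8, 8, 0.90, 4, 20), 2))          # 0.73
```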

### Dual Convergence Criteria

**Both must be met**:
1. V_instance ≥ 0.80 for 2 consecutive iterations
2. V_meta ≥ 0.80 for 2 consecutive iterations

**Why dual-layer?**
- V_instance alone: could mean good results from a bad process
- V_meta alone: could mean a great methodology with poor results
- Both together: good results + a reusable methodology

---

## Iteration Coordination

### Standard Flow

```
ITERATION N:
├─ Start (5 min)
│  ├─ Review previous iteration results
│  ├─ Set goals for this iteration
│  └─ Load context (patterns, tools, metrics)
│
├─ Observe (25 min)
│  ├─ Apply existing patterns
│  ├─ Work on project tasks
│  ├─ Measure results
│  └─ Document problems
│
├─ Codify (30 min)
│  ├─ Analyze observations
│  ├─ Create/refine patterns
│  ├─ Document with examples
│  └─ Validate on 2-3 cases
│
├─ Automate (20 min)
│  ├─ Identify automation opportunities
│  ├─ Create/improve tools
│  ├─ Measure speedup
│  └─ Calculate ROI
│
└─ Close (10 min)
   ├─ Calculate V_instance and V_meta
   ├─ Check convergence criteria
   ├─ Document iteration summary
   └─ Plan next iteration (if needed)
```

### Convergence Detection

```python
def check_convergence(history):
    if len(history) < 2:
        return False

    # Check the last 2 iterations
    last_two = history[-2:]

    # Both V_instance and V_meta must be ≥ 0.80
    instance_converged = all(v.instance >= 0.80 for v in last_two)
    meta_converged = all(v.meta >= 0.80 for v in last_two)

    # No significant gaps remaining
    no_critical_gaps = last_two[-1].critical_gaps == 0

    return instance_converged and meta_converged and no_critical_gaps
```

---

## Best Practices

### Do's

✅ **Start with Observe** - Don't skip the baseline
✅ **Validate patterns** - Test on 2-3 real examples
✅ **Measure everything** - Time, quality, speedup
✅ **Iterate quickly** - 60-90 min per iteration
✅ **Focus on ROI** - Automate high-value tasks
✅ **Document continuously** - Don't wait until the end

### Don'ts

❌ **Don't skip Observe** - Patterns without data are guesses
❌ **Don't over-codify** - 6-8 patterns maximum
❌ **Don't automate prematurely** - Understand the problem first
❌ **Don't ignore transferability** - Aim for 80%+ reuse
❌ **Don't continue past convergence** - Stop at dual 0.80

---

## Architecture Variations

### Rapid Convergence (3-4 iterations)

**Modifications**:
- Strong Iteration 0 (comprehensive baseline)
- Borrow patterns from similar projects (70-90% reuse)
- Parallel pattern development
- Focus on high-impact work only

### Slow Convergence (>6 iterations)

**Causes**:
- Weak Iteration 0 (insufficient baseline)
- Too many patterns (>10)
- Complex domain
- Insufficient automation

**Fixes**:
- Strengthen baseline analysis
- Consolidate patterns
- Increase automation investment
- Focus on critical paths only

---

**Source**: BAIME Framework
**Status**: Production-ready, validated across 13 methodologies
**Convergence Rate**: 100% (all experiments converged)
**Average Iterations**: 4.9 (median 5)