# Three-Layer OCA Architecture

**Version**: 1.0
**Framework**: BAIME (Observe-Codify-Automate)
**Layers**: 3 (Observe, Codify, Automate)
Complete architectural reference for the OCA cycle.
## Overview
The OCA (Observe-Codify-Automate) cycle is the core of BAIME, consisting of three iterative layers that transform ad-hoc development into systematic, reusable methodologies.
```
ITERATION N:

  Observe ──→ Codify ──→ Automate ──→ [Next Iteration]
     ↑                      │
     └──────── Feedback ────┘
```
## Layer 1: Observe

**Purpose**: Gather empirical data through hands-on work

**Duration**: 30-40% of iteration time (~20-30 min)

**Activities**:
- Apply existing patterns/tools (if any)
- Execute actual work on project
- Measure results and effectiveness
- Identify problems and gaps
- Document observations
**Outputs**:
- Baseline metrics
- Problem list (prioritized)
- Pattern usage data
- Time measurements
- Quality metrics
**Example** (Testing Strategy, Iteration 1):
## Observations
**Applied**:
- Wrote 5 unit tests manually
- Tried different test structures
**Measured**:
- Time per test: 15-20 min
- Coverage increase: +2.3%
- Tests passing: 5/5 (100%)
**Problems Identified**:
1. Setup code duplicated across tests
2. Unclear which functions to test first
3. No standard test structure
4. Coverage analysis manual and slow
**Time Spent**: 90 min (5 tests × 18 min avg)
### Observation Techniques

#### 1. Baseline Measurement

**What to measure**:
- Current state metrics (coverage, build time, error rate)
- Time spent on tasks
- Pain points and blockers
- Quality indicators
**Tools**:

```bash
# Testing
go test -cover ./...
go tool cover -func=coverage.out

# CI/CD
time make build
grep "FAIL" ci-logs.txt | wc -l

# Errors
grep "error" session.jsonl | wc -l
```
#### 2. Work Sampling

**Technique**: Track time on representative tasks

**Example**:

```
Task: Write 5 unit tests

Sample 1: TestFunction1 - 18 min
Sample 2: TestFunction2 - 15 min
Sample 3: TestFunction3 - 22 min (complex)
Sample 4: TestFunction4 - 12 min (simple)
Sample 5: TestFunction5 - 16 min

Average: 16.6 min per test
Range: 12-22 min
Variance: High (complexity-dependent)
```
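The summary statistics above are easy to reproduce in code; a minimal Python sketch (the sample names and durations are the values from the example):

```python
from statistics import mean, pstdev

# Work samples from the example above (minutes per test).
samples = {
    "TestFunction1": 18,
    "TestFunction2": 15,
    "TestFunction3": 22,  # complex
    "TestFunction4": 12,  # simple
    "TestFunction5": 16,
}

minutes = list(samples.values())
print(f"Average: {mean(minutes):.1f} min per test")  # 16.6
print(f"Range: {min(minutes)}-{max(minutes)} min")   # 12-22
print(f"Spread (std dev): {pstdev(minutes):.1f} min")
```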
#### 3. Problem Taxonomy

**Classify problems by frequency and impact** (a classifier sketch follows below):

| Frequency | Impact | Action                  |
|-----------|--------|-------------------------|
| High      | High   | Urgent: patterns needed |
| High      | Low    | Automation candidate    |
| Low       | High   | Document workaround     |
| Low       | Low    | Ignore                  |
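This mapping is mechanical enough to encode directly. A minimal sketch (the `classify_problem` helper and its labels are illustrative assumptions, not a BAIME API):

```python
def classify_problem(frequency: str, impact: str) -> str:
    """Map a (frequency, impact) pair to the recommended action."""
    actions = {
        ("high", "high"): "Urgent: pattern needed",
        ("high", "low"): "Automation candidate",
        ("low", "high"): "Document workaround",
        ("low", "low"): "Ignore",
    }
    return actions[(frequency, impact)]

print(classify_problem("high", "low"))  # Automation candidate
```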
## Layer 2: Codify

**Purpose**: Transform observations into documented patterns

**Duration**: 35-45% of iteration time (~25-35 min)

**Activities**:
- Analyze observations for patterns
- Design reusable solutions
- Document patterns with examples
- Test patterns on 2-3 cases
- Refine based on feedback
**Outputs**:
- Pattern documents (problem-solution pairs)
- Code examples
- Usage guidelines
- Time/quality metrics per pattern
**Example** (Testing Strategy, Iteration 1):
## Pattern: Table-Driven Tests
**Problem**: Writing multiple similar test cases is repetitive
**Solution**: Use table-driven pattern with test struct
**Structure**:
```go
package mypkg

import (
	"testing"

	"github.com/stretchr/testify/assert" // assumes testify for assertions
)

func TestFunction(t *testing.T) {
	tests := []struct {
		name     string
		input    Type
		expected Type
	}{
		{"case1", input1, output1},
		{"case2", input2, output2},
	}
	for _, tt := range tests {
		// Run each case as a named subtest so failures are isolated.
		t.Run(tt.name, func(t *testing.T) {
			got := Function(tt.input)
			assert.Equal(t, tt.expected, got)
		})
	}
}
```

**Time**: 12 min per test (vs 18 min manual)
**Savings**: 33% time reduction
**Validated**: 3 test functions, all passed
### Codification Techniques
#### 1. Pattern Template
```markdown
## Pattern: [Name]
**Category**: [Testing/CI/Error/etc.]
**Problem**:
[What problem does this solve?]
**Context**:
[When is this applicable?]
**Solution**:
[How to solve it? Step-by-step]
**Structure**:
[Code template or procedure]
**Example**:
[Real working example]
**Metrics**:
- Time: [X min]
- Quality: [metric]
- Reusability: [X%]
**Variations**:
[Alternative approaches]
**Anti-patterns**:
[Common mistakes]
```

#### 2. Pattern Hierarchy

**Level 1: Core Patterns (6-8)**
- Universal, high frequency
- Foundation for other patterns
- Example: Table-driven tests, Error classification

**Level 2: Composite Patterns (2-4)**
- Combine multiple core patterns
- Domain-specific
- Example: Coverage-driven gap closure (table-driven + prioritization)

**Level 3: Specialized Patterns (0-2)**
- Rare, specific use cases
- Optional extensions
- Example: Golden file testing for large outputs
#### 3. Progressive Refinement

- Iteration 0: Observe only (no patterns yet)
- Iteration 1: 2-3 core patterns (basics)
- Iteration 2: 4-6 patterns (expanded)
- Iteration 3: 6-8 patterns (refined)
- Iteration 4+: Consolidate, no new patterns
## Layer 3: Automate

**Purpose**: Create tools to accelerate pattern application

**Duration**: 20-30% of iteration time (~15-20 min)

**Activities**:
- Identify repetitive tasks (performed >3 times)
- Design automation approach
- Implement scripts/tools
- Test on real examples
- Measure speedup
**Outputs**:
- Automation scripts
- Tool documentation
- Speedup metrics (Nx faster)
- ROI calculations
**Example** (Testing Strategy, Iteration 2):
## Tool: Coverage Gap Analyzer
**Purpose**: Identify which functions need tests (automated)
**Implementation**:
```bash
#!/bin/bash
# scripts/analyze-coverage-gaps.sh
# List uncovered functions, prioritized by rough category.
go tool cover -func=coverage.out |
  grep "0.0%" |
  awk '{print $1, $2}' |
  while read -r file func; do
    # Categorize function type by name
    if grep -q "Error\|Valid" <<< "$func"; then
      echo "P1: $file:$func (error handling)"
    elif grep -q "Parse\|Process" <<< "$func"; then
      echo "P2: $file:$func (business logic)"
    else
      echo "P3: $file:$func (utility)"
    fi
  done | sort
```

**Speedup**: 15 min manual → 5 sec automated (180x)
**ROI**: 30 min investment, 10 uses = 150 min saved = 5x ROI
**Validated**: Used in iterations 2-4, always accurate
### Automation Techniques
#### 1. ROI Calculation
```
ROI = (time_saved × uses) / time_invested
```

**Example**:
- Manual task: 10 min
- Automation time: 1 hour
- Break-even: 6 uses
- Expected uses: 20
- ROI = (10 × 20) / 60 = 3.3x
**Rules**:
- ROI < 2x: Don't automate (not worth it)
- ROI 2-5x: Automate if frequently used
- ROI > 5x: Always automate
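These rules are straightforward to encode; a minimal sketch (the helper names are hypothetical):

```python
def roi(time_saved_min: float, uses: int, time_invested_min: float) -> float:
    """ROI = (time_saved × uses) / time_invested."""
    return (time_saved_min * uses) / time_invested_min

def should_automate(r: float, frequently_used: bool) -> bool:
    # Thresholds from the rules above: <2x no, 2-5x only if frequent, >5x always.
    if r < 2.0:
        return False
    if r <= 5.0:
        return frequently_used
    return True

r = roi(time_saved_min=10, uses=20, time_invested_min=60)
print(f"{r:.1f}x -> automate: {should_automate(r, frequently_used=True)}")  # 3.3x -> True
```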
#### 2. Automation Tiers
**Tier 1: Simple Scripts** (15-30 min)
- Bash/Python scripts
- Parse existing tool output
- Generate boilerplate
- Example: Coverage gap analyzer
**Tier 2: Workflow Tools** (1-2 hours)
- Multi-step automation
- Integrate multiple tools
- Smart suggestions
- Example: Test generator with pattern detection
**Tier 3: Full Integration** (>2 hours)
- IDE/editor plugins
- CI/CD integration
- Pre-commit hooks
- Example: Automated methodology guide
**Start with Tier 1**; only progress to Tier 2/3 if the ROI justifies it.
#### 3. Incremental Automation
**Phase 1**: Manual process documented
**Phase 2**: Script to assist (not fully automated)
**Phase 3**: Fully automated with validation
**Phase 4**: Integrated into workflow (hooks, CI)
**Example** (Test generation):
- Phase 1: Copy-paste test template manually
- Phase 2: Script generates template, manual fill-in
- Phase 3: Script generates with smart defaults
- Phase 4: Pre-commit hook suggests tests for new functions
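As an illustration of Phase 2, a template generator can be tiny. This sketch (the `generate_test` helper and its output shape are assumptions) emits a table-driven Go test skeleton for manual fill-in:

```python
# Emit a table-driven test skeleton for a named function; cases are
# left as a TODO for the developer (Phase 2: assist, not full automation).
TEMPLATE = """func Test{name}(t *testing.T) {{
	tests := []struct {{
		name     string
		input    Type
		expected Type
	}}{{
		// TODO: fill in cases
	}}
	for _, tt := range tests {{
		t.Run(tt.name, func(t *testing.T) {{
			got := {name}(tt.input)
			assert.Equal(t, tt.expected, got)
		}})
	}}
}}
"""

def generate_test(func_name: str) -> str:
    return TEMPLATE.format(name=func_name)

print(generate_test("ParseConfig"))
```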
---
## Dual-Layer Value Functions
### V_instance (Instance Quality)
**Measures**: Quality of work produced using methodology
**Formula**:
```
V_instance = Σ(w_i × metric_i)
```

Where:
- w_i = weight for metric i
- metric_i = normalized metric value (0-1)
- Σ w_i = 1.0
**Example** (Testing):

```
V_instance = 0.5 × (coverage/target) + 0.3 × (pass_rate) + 0.2 × (maintainability)
```

**Target**: V_instance ≥ 0.80
**Convergence**: Stable for 2 consecutive iterations
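In code, the testing formula is a one-liner. A minimal sketch (weights from the formula above; the input values are hypothetical):

```python
def v_instance(coverage: float, target: float,
               pass_rate: float, maintainability: float) -> float:
    # Cap the coverage ratio at 1.0 so every metric stays in 0-1.
    return (0.5 * min(coverage / target, 1.0)
            + 0.3 * pass_rate
            + 0.2 * maintainability)

print(v_instance(coverage=0.72, target=0.80, pass_rate=1.0, maintainability=0.75))
# 0.5 × 0.90 + 0.3 × 1.0 + 0.2 × 0.75 = 0.90
```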
### V_meta (Methodology Quality)
**Measures**: Quality and reusability of methodology itself
**Formula**:
```
V_meta = 0.4 × completeness + 0.3 × transferability + 0.3 × automation_effectiveness
```

Where:
- completeness = patterns_documented / patterns_needed
- transferability = cross_project_reuse_score (0-1)
- automation_effectiveness = 1 − (time_with_tools / time_manual), i.e. the fraction of manual time eliminated by tooling
**Example** (Testing):

```
V_meta = 0.4 × (8/8) + 0.3 × (0.90) + 0.3 × (1 − 4min/20min)
       = 0.40 + 0.27 + 0.24
       = 0.91
```

**Target**: V_meta ≥ 0.80

**Convergence**: Stable for 2 consecutive iterations
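The same computation as a sketch (values from the worked example; the function name is an assumption):

```python
def v_meta(patterns_documented: int, patterns_needed: int,
           transferability: float,
           time_with_tools: float, time_manual: float) -> float:
    completeness = patterns_documented / patterns_needed
    # Fraction of manual time eliminated by tooling (0 = no help, 1 = instant).
    automation_effectiveness = 1 - (time_with_tools / time_manual)
    return (0.4 * completeness
            + 0.3 * transferability
            + 0.3 * automation_effectiveness)

print(round(v_meta(8, 8, 0.90, time_with_tools=4, time_manual=20), 2))  # 0.91
```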
### Dual Convergence Criteria
**Both must be met**:
1. V_instance ≥ 0.80 for 2 consecutive iterations
2. V_meta ≥ 0.80 for 2 consecutive iterations
**Why dual-layer?**:
- V_instance alone: Could be good results with bad process
- V_meta alone: Could be great methodology with poor results
- Both together: Good results + reusable methodology
---
## Iteration Coordination
### Standard Flow
```
ITERATION N:
├─ Start (5 min)
│  ├─ Review previous iteration results
│  ├─ Set goals for this iteration
│  └─ Load context (patterns, tools, metrics)
│
├─ Observe (25 min)
│  ├─ Apply existing patterns
│  ├─ Work on project tasks
│  ├─ Measure results
│  └─ Document problems
│
├─ Codify (30 min)
│  ├─ Analyze observations
│  ├─ Create/refine patterns
│  ├─ Document with examples
│  └─ Validate on 2-3 cases
│
├─ Automate (20 min)
│  ├─ Identify automation opportunities
│  ├─ Create/improve tools
│  ├─ Measure speedup
│  └─ Calculate ROI
│
└─ Close (10 min)
   ├─ Calculate V_instance and V_meta
   ├─ Check convergence criteria
   ├─ Document iteration summary
   └─ Plan next iteration (if needed)
```
### Convergence Detection
```python
from dataclasses import dataclass

@dataclass
class IterationScore:
    # Hypothetical container; the original snippet assumes objects
    # carrying these three fields per iteration.
    instance: float       # V_instance for the iteration
    meta: float           # V_meta for the iteration
    critical_gaps: int    # unresolved high-impact problems

def check_convergence(history: list[IterationScore]) -> bool:
    if len(history) < 2:
        return False
    # Check the last 2 iterations
    last_two = history[-2:]
    # Both V_instance and V_meta must be >= 0.80
    instance_converged = all(v.instance >= 0.80 for v in last_two)
    meta_converged = all(v.meta >= 0.80 for v in last_two)
    # No significant gaps remaining
    no_critical_gaps = last_two[-1].critical_gaps == 0
    return instance_converged and meta_converged and no_critical_gaps
```
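A quick usage check (the scores are hypothetical):

```python
history = [
    IterationScore(instance=0.74, meta=0.68, critical_gaps=2),
    IterationScore(instance=0.83, meta=0.81, critical_gaps=1),
    IterationScore(instance=0.86, meta=0.84, critical_gaps=0),
]
print(check_convergence(history))  # True: last two iterations ≥ 0.80, no gaps left
```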
## Best Practices

### Do's

- ✅ **Start with Observe** - Don't skip the baseline
- ✅ **Validate patterns** - Test on 2-3 real examples
- ✅ **Measure everything** - Time, quality, speedup
- ✅ **Iterate quickly** - 60-90 min per iteration
- ✅ **Focus on ROI** - Automate high-value tasks
- ✅ **Document continuously** - Don't wait until the end

### Don'ts

- ❌ **Don't skip Observe** - Patterns without data are guesses
- ❌ **Don't over-codify** - 6-8 patterns maximum
- ❌ **Don't automate prematurely** - Understand the problem first
- ❌ **Don't ignore transferability** - Aim for 80%+ reuse
- ❌ **Don't continue past convergence** - Stop at dual 0.80
## Architecture Variations

### Rapid Convergence (3-4 iterations)

**Modifications**:
- Strong Iteration 0 (comprehensive baseline)
- Borrow patterns from similar projects (70-90% reuse)
- Parallel pattern development
- Focus on high-impact only
### Slow Convergence (>6 iterations)

**Causes**:
- Weak Iteration 0 (insufficient baseline)
- Too many patterns (>10)
- Complex domain
- Insufficient automation
**Fixes**:
- Strengthen baseline analysis
- Consolidate patterns
- Increase automation investment
- Focus on critical paths only
---

**Source**: BAIME Framework
**Status**: Production-ready, validated across 13 methodologies
**Convergence Rate**: 100% (all experiments converged)
**Average Iterations**: 4.9 (median 5)