| name | description | keywords | category | version | based_on | transferability | effectiveness |
|---|---|---|---|---|---|---|---|
| bootstrapped-se | Apply Bootstrapped Software Engineering (BSE) methodology to evolve project-specific development practices through systematic Observe-Codify-Automate cycles | bootstrapping, meta-methodology, OCA, observe, codify, automate, self-improvement, empirical, methodology-development | methodology | 1.0.0 | docs/methodology/bootstrapped-software-engineering.md | 95% | 10-50x methodology development speedup |
Bootstrapped Software Engineering
Evolve project-specific methodologies through systematic observation, codification, and automation.
The best methodologies are not designed but evolved through systematic observation, codification, and automation of successful practices.
Core Insight
Traditional methodologies are theory-driven and static. Bootstrapped Software Engineering (BSE) enables development processes to:
- Observe themselves through instrumentation and data collection
- Codify discovered patterns into reusable methodologies
- Automate methodology enforcement and validation
- Self-improve by applying the methodology to its own evolution
Three-Tuple Output
Every BSE process produces:
(O, Aₙ, Mₙ)
where:
O = Task output (code, documentation, system)
Aₙ = Converged agent set (reusable for similar tasks)
Mₙ = Converged meta-agent (transferable to new domains)
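For concreteness, the three-tuple can be modeled as a plain data structure. The sketch below is only illustrative; the class name, field names, and file paths are hypothetical, not part of the BSE specification.

```python
from dataclasses import dataclass

@dataclass
class BSEOutcome:
    """Illustrative container for the BSE three-tuple (O, A_n, M_n).

    Field names are hypothetical; the methodology only requires that
    all three artifacts are produced and kept reusable.
    """
    task_output: list[str]   # O: code, documentation, or system artifacts produced
    agent_set: list[str]     # A_n: converged, reusable agent definitions
    meta_agent: str          # M_n: converged meta-agent definition

outcome = BSEOutcome(
    task_output=["docs/methodology/role-based-documentation.md"],
    agent_set=["agents/doc-writer.md", "agents/data-analyst.md"],
    meta_agent="agents/meta-agent.md",
)
```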
The OCA Framework
Three-Phase Cycle: Observe → Codify → Automate
Phase 1: OBSERVE
Instrument your development process to collect data
Tools:
- Session history analysis (meta-cc)
- Git commit analysis
- Code metrics (coverage, complexity)
- Access pattern tracking
- Error rate monitoring
Example (from meta-cc):
# Analyze file access patterns
meta-cc query files --threshold 5
# Result: plan.md accessed 423 times (highest)
# Insight: Core reference document, needs optimization
Output: Empirical data about actual development patterns
Phase 2: CODIFY
Extract patterns and document as reusable methodologies
Process:
- Pattern Recognition: Identify recurring successful practices
- Hypothesis Formation: Formulate testable claims
- Documentation: Write methodology documents
- Validation: Test methodology on real scenarios
Example (from meta-cc):
# Discovered Pattern: Role-Based Documentation
Observation:
- plan.md: 423 accesses (Coordination role)
- CLAUDE.md: ~300 implicit loads (Entry Point role)
- features.md: 89 accesses (Reference role)
Methodology:
- Classify docs by actual access patterns
- Optimize high-access docs for token efficiency
- Create role-specific maintenance procedures
Validation:
- CLAUDE.md reduction: 607 → 278 lines (-54%)
- Token cost reduction: 47%
- Access efficiency: Maintained
Output: Documented methodology with empirical validation
Phase 3: AUTOMATE
Convert methodology into automated checks and tools
Automation Levels:
- Detection: Automated pattern detection
- Validation: Check compliance with methodology
- Enforcement: CI/CD integration, block violations
- Suggestion: Automated fix recommendations
Example (from meta-cc):
# Automation: /meta doc-health capability
# Checks:
- Role classification compliance
- Token efficiency (lines < threshold)
- Cross-reference completeness
- Update frequency
# Actions:
- Flag oversized documents
- Suggest restructuring
- Validate role assignments
Output: Automated tools enforcing methodology
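To illustrate the Validation level, a minimal doc-health checker might compare line counts against role-specific budgets and fail CI on violations. This is a hedged sketch, not the actual /meta doc-health implementation; the role map, thresholds, and file paths are assumptions.

```python
#!/usr/bin/env python3
"""Minimal sketch of the 'Validation' automation level: check that
documents stay within role-specific line budgets. The role map and
thresholds below are illustrative, not the project's actual config."""
import sys
from pathlib import Path

ROLE_LIMITS = {"entry_point": 300, "reference": 500}   # lines per role (assumed)
DOC_ROLES = {"README.md": "entry_point", "docs/ARCHITECTURE.md": "reference"}

def check_doc_health(repo_root: str = ".") -> int:
    violations = 0
    for rel_path, role in DOC_ROLES.items():
        path = Path(repo_root) / rel_path
        if not path.exists():
            continue
        lines = len(path.read_text(encoding="utf-8").splitlines())
        limit = ROLE_LIMITS[role]
        status = "OK" if lines <= limit else "OVERSIZED"
        print(f"{rel_path}: {lines} lines (limit {limit}, role {role}) -> {status}")
        violations += lines > limit
    return violations

if __name__ == "__main__":
    sys.exit(1 if check_doc_health() else 0)   # non-zero exit blocks CI
```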
Self-Referential Feedback Loop
The ultimate power of BSE: Apply the methodology to improve itself
Layer 0: Basic Functionality
→ Build tools (meta-cc CLI)
Layer 1: Self-Observation
→ Use tools on self (query own sessions)
→ Discovery: Usage patterns, bottlenecks
Layer 2: Pattern Recognition
→ Analyze data (R/E ratio, access density)
→ Discovery: Document roles, optimization opportunities
Layer 3: Methodology Extraction
→ Codify patterns (role-based-documentation.md)
→ Definition: Classification algorithm, maintenance procedures
Layer 4: Tool Automation
→ Implement checks (/meta doc-health)
→ Auto-validate: Methodology compliance
Layer 5: Continuous Evolution
→ Apply tools to self
→ Discover new patterns → Update methodology → Update tools
This creates a closed loop: Tools improve tools, methodologies optimize methodologies.
Parameters
- domain: documentation|testing|architecture|custom (default: custom)
- observation_period: number of days/commits to analyze (default: auto-detect)
- automation_level: detect|validate|enforce|suggest (default: validate)
- iteration_count: number of OCA cycles (default: 3)
Execution Flow
Phase 1: Observation Setup
1. Identify observation targets
- Code metrics (LOC, complexity, coverage)
- Development patterns (commits, PRs, errors)
- Access patterns (file reads, searches)
- Quality metrics (test results, build time)
2. Install instrumentation
- meta-cc integration (session analysis)
- Git hooks (commit tracking)
- Coverage tracking
- CI/CD metrics
3. Collect baseline data
- Run for observation_period
- Generate initial reports
- Identify data gaps
Phase 2: Pattern Analysis
4. Analyze collected data
- Statistical analysis (frequencies, correlations)
- Pattern recognition (recurring behaviors)
- Anomaly detection (outliers, inefficiencies)
5. Formulate hypotheses
- "High-access docs should be < 300 lines"
- "Test coverage gaps correlate with bugs"
- "Batch remediation is 5x more efficient"
6. Validate hypotheses (see the sketch below)
- Historical data validation
- A/B testing if possible
- Expert review
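A minimal sketch of step 6, assuming access and line counts have already been collected; the sample numbers and the 300-line threshold are illustrative, not measured values.

```python
"""Sketch of hypothesis validation against historical data:
'High-access docs should be < 300 lines'. The data below is assumed;
real numbers would come from session analysis (e.g., meta-cc output)."""

access_counts = {"plan.md": 423, "CLAUDE.md": 300, "features.md": 89}   # assumed data
line_counts = {"plan.md": 612, "CLAUDE.md": 607, "features.md": 240}    # assumed data

def flag_inefficient_docs(threshold_lines: int = 300, min_accesses: int = 100):
    """Return high-access docs that exceed the proposed line budget."""
    return [
        doc for doc, accesses in access_counts.items()
        if accesses >= min_accesses and line_counts.get(doc, 0) > threshold_lines
    ]

print(flag_inefficient_docs())   # e.g. ['plan.md', 'CLAUDE.md'] -> hypothesis worth acting on
```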
Phase 3: Codification
7. Document patterns
- Pattern name and description
- Context and applicability
- Implementation steps
- Validation criteria
- Examples and counter-examples
8. Create methodology
- Problem statement
- Solution approach
- Procedures and guidelines
- Metrics and validation
9. Peer review
- Team review
- Iterate based on feedback
Phase 4: Automation
10. Design automation
- Detection: Identify when pattern applies
- Validation: Check compliance
- Enforcement: Prevent violations
- Suggestion: Recommend fixes
11. Implement tools
- Scripts (bash, Python)
- CI/CD integration
- IDE plugins
- Bot automation
12. Deploy and monitor
- Gradual rollout
- Collect usage data
- Measure effectiveness
Phase 5: Evolution
13. Apply to self
- Use tools on development process
- Discover meta-patterns
- Optimize methodology
14. Iterate OCA cycle
- New observations → New patterns
- Refined hypotheses → Better validation
- Enhanced automation → Higher compliance
Usage Examples
Example 1: Documentation Optimization
# User: "Optimize project documentation"
bootstrapped-se domain=documentation
# Execution:
[OBSERVE Phase]
✓ Analyzing file access patterns (30 days)...
- README.md: 423 accesses (Entry Point)
- ARCHITECTURE.md: 89 accesses (Reference)
- API.md: 234 accesses (Reference)
✓ Measuring token efficiency...
- README.md: 1909 lines (HIGH - inefficient)
- ARCHITECTURE.md: 456 lines (OK)
- API.md: 789 lines (MEDIUM)
[CODIFY Phase]
✓ Pattern identified: Role-Based Documentation
- Entry Point docs: Should be < 300 lines
- Reference docs: Should be < 500 lines
- Specialized docs: Can be longer
✓ Methodology documented:
- Created: docs/methodology/role-based-documentation.md
- Includes: Classification algorithm, optimization guide
[AUTOMATE Phase]
✓ Automation created:
- Script: scripts/check-doc-health.sh
- CI check: .github/workflows/doc-validation.yml
- Capability: /meta doc-health
[EVOLUTION Phase]
✓ Applied to self:
- Optimized README.md: 1909 → 275 lines (-85%)
- Optimized CLAUDE.md: 607 → 278 lines (-54%)
- Token cost reduction: 47%
✓ Next iteration insights:
- New pattern: Cross-reference density
- New automation: Auto-linking checker
Example 2: Testing Strategy Evolution
# User: "Develop systematic testing methodology"
bootstrapped-se domain=testing observation_period=60
# Execution:
[OBSERVE Phase]
✓ Analyzing test patterns (60 days)...
- Coverage: 75% (good)
- Test distribution: Uneven (80% unit, 15% integration, 5% e2e)
- Flaky tests: 12% failure rate
- Slowest tests: integration (avg 45s)
[CODIFY Phase]
✓ Patterns identified:
1. Coverage-driven development works
2. Integration tests need fixtures
3. Flaky tests have environmental dependencies
✓ Methodology created:
- Test strategy matrix (when to write what)
- Fixture patterns for integration tests
- Flaky test elimination procedures
[AUTOMATE Phase]
✓ Automation implemented:
- Coverage gate: Block if < 75%
- Fixture generator: Auto-create from examples
- Flaky detector: Flag unstable tests
✓ Results:
- Coverage: 75% → 86%
- Flaky rate: 12% → 2%
- Test time: 45s → 12s (with parallel execution)
Example 3: Error Recovery Patterns
# User: "Extract error recovery methodology"
bootstrapped-se domain=custom
# Execution:
[OBSERVE Phase]
✓ Analyzing error patterns...
- Total errors: 423 (from sessions)
- Error rate: 6.06%
- Most common: Type errors (45%), Logic errors (30%), Deps (15%)
[CODIFY Phase]
✓ Pattern: Error Classification Taxonomy
- Categories: Type, Logic, Dependency, Integration, Infrastructure
- Recovery strategies per category
- Prevention guidelines
✓ Methodology: Systematic Error Recovery
- Detection: Error signature extraction
- Classification: Rule-based categorization
- Recovery: Strategy pattern application
- Prevention: Root cause analysis → Code patterns
[AUTOMATE Phase]
✓ Tools created:
- Error classifier (ML-based)
- Recovery strategy recommender
- Prevention linter (detect anti-patterns)
✓ CI/CD Integration:
- Auto-classify build failures
- Suggest recovery steps
- Track error trends
Validated Outcomes
From meta-cc project (8 experiments, 95% transferability):
Documentation Methodology
- Observation: 423 file access patterns analyzed
- Codification: Role-based documentation methodology
- Automation: /meta doc-health capability
- Result: 47% token cost reduction, maintained accessibility
Testing Strategy
- Observation: 75% coverage, uneven distribution
- Codification: Coverage-driven gap closure
- Automation: CI coverage gates, fixture generators
- Result: 75% → 86% coverage, 15x speedup vs ad-hoc
Error Recovery
- Observation: 6.06% error rate, 423 errors analyzed
- Codification: Error taxonomy, recovery patterns
- Automation: Error classifier, recovery recommender
- Result: 85% transferability, systematic recovery
Dependency Health
- Observation: 7 vulnerabilities, 11 outdated deps
- Codification: 6 patterns (vulnerability, update, license, etc.)
- Automation: 3 scripts + CI/CD workflow
- Result: 6x speedup (9h → 1.5h), 88% transferability
Observability
- Observation: 0 logs, 0 metrics, 0 traces (baseline)
- Codification: Three Pillars methodology (Logging + Metrics + Tracing)
- Automation: Code generators, instrumentation templates
- Result: 23-46x speedup, 90-95% transferability
Transferability
95% transferable across domains and projects:
What Transfers (95%+)
- OCA framework itself (universal process)
- Self-referential feedback loop pattern
- Observation → Pattern → Automation pipeline
- Empirical validation approach
- Continuous evolution mindset
What Needs Adaptation (5%)
- Specific observation tools (meta-cc → custom tools)
- Domain-specific patterns (docs vs testing vs architecture)
- Automation implementation details (language, platform)
Adaptation Effort
- Same project, new domain: 2-4 hours
- New project, same domain: 4-8 hours
- New project, new domain: 8-16 hours
Prerequisites
Tools Required
- Session analysis: meta-cc or equivalent
- Git analysis: Git installed, access to repository
- Metrics collection: Coverage tools, static analyzers
- Automation: CI/CD platform (GitHub Actions, GitLab CI, etc.)
Skills Required
- Basic data analysis (statistics, pattern recognition)
- Methodology documentation
- Scripting (bash, Python, or equivalent)
- CI/CD configuration
Implementation Guidance
Start Small
# Week 1: Observe
- Install meta-cc
- Track file accesses for 1 week
- Collect simple metrics
# Week 2: Codify
- Analyze top 10 access patterns
- Document 1-2 simple patterns
- Get team feedback
# Week 3: Automate
- Create 1 simple validation script
- Add to CI/CD
- Monitor compliance
# Week 4: Iterate
- Apply tools to development
- Discover new patterns
- Refine methodology
Scale Up
# Month 2: Expand domains
- Apply to testing
- Apply to architecture
- Cross-validate patterns
# Month 3: Deep automation
- Build sophisticated checkers
- Integrate with IDE
- Create dashboards
# Month 4: Evolution
- Meta-patterns emerge
- Methodology generator
- Cross-project application
Theoretical Foundation
The Convergence Theorem
Conjecture: For any domain D, there exists a methodology M* such that:
- M* is locally optimal for D (cannot be significantly improved)
- M* can be reached through bootstrapping (systematic self-improvement)
- Convergence speed increases with each iteration (learning effect)
Implication: We can automatically discover optimal methodologies for any domain.
Scientific Method Analogy
1. Observation = Instrumentation (meta-cc tools)
2. Hypothesis = "CLAUDE.md should be <300 lines"
3. Experiment = Implement constraint, measure effects
4. Data Collection = query-files, git log analysis
5. Analysis = Calculate R/E ratio, access density
6. Conclusion = "300-line limit effective: 47% reduction"
7. Publication = Codify as methodology document
8. Replication = Apply to other projects
Success Criteria
| Metric | Target | Validation |
|---|---|---|
| Pattern Discovery | ≥3 patterns per cycle | Documented patterns |
| Methodology Quality | Peer-reviewed | Team acceptance |
| Automation Coverage | ≥80% of patterns | CI integration |
| Effectiveness | ≥3x improvement | Before/after metrics |
| Transferability | ≥85% reusability | Cross-project validation |
Domain Adaptation Guide
Different domains have different complexity characteristics that affect iteration count, agent needs, and convergence patterns. This guide helps predict and adapt to domain-specific challenges.
Domain Complexity Classes
Based on 8 completed Bootstrap experiments, we've identified three complexity classes:
Simple Domains (3-4 iterations)
Characteristics:
- Well-defined problem space
- Clear success criteria
- Limited interdependencies
- Established best practices exist
- Straightforward automation
Examples:
- Bootstrap-010 (Dependency Health): 3 iterations
- Clear goals: vulnerabilities, freshness, licenses
- Existing tools: govulncheck, go-licenses
- Straightforward automation: CI/CD scripts
- Converged fastest in series
- Bootstrap-011 (Knowledge Transfer): 3-4 iterations
- Well-understood domain: onboarding paths
- Clear structure: Day-1, Week-1, Month-1
- Existing patterns: progressive disclosure
- High transferability (95%+)
Adaptation Strategy:
Simple Domain Approach:
1. Start with generic agents only (coder, data-analyst, doc-writer)
2. Focus on automation (tools, scripts, CI)
3. Expect fast convergence (3-4 iterations)
4. Prioritize transferability (aim for 85%+)
5. Minimal agent specialization needed
Expected Outcomes:
- Iterations: 3-4
- Duration: 6-8 hours
- Specialized agents: 0-1
- Transferability: 85-95%
- V_instance: Often exceeds 0.80 significantly (e.g., 0.92)
Medium Complexity Domains (4-6 iterations)
Characteristics:
- Multiple dimensions to optimize
- Some ambiguity in success criteria
- Moderate interdependencies
- Require domain expertise
- Automation has nuances
Examples:
- Bootstrap-001 (Documentation): 3 iterations (simple side of medium)
- Multiple roles to define
- Access patterns analysis needed
- Search infrastructure complexity
- 85% transferability
- Bootstrap-002 (Testing): 5 iterations
- Coverage vs quality trade-offs
- Multiple test types (unit, integration, e2e)
- Fixture patterns discovery
- 89% transferability
- Bootstrap-009 (Observability): 6 iterations
- Three pillars (logging, metrics, tracing)
- Performance vs verbosity trade-offs
- Integration complexity
- 90-95% transferability
Adaptation Strategy:
Medium Domain Approach:
1. Start with generic agents, add 1-2 specialized as needed
2. Expect iterative refinement of value functions
3. Plan for 4-6 iterations
4. Balance instance and meta objectives equally
5. Document trade-offs explicitly
Expected Outcomes:
- Iterations: 4-6
- Duration: 8-12 hours
- Specialized agents: 1-3
- Transferability: 85-90%
- V_instance: Typically 0.80-0.87
Complex Domains (6-8+ iterations)
Characteristics:
- High interdependency
- Emergent patterns (not obvious upfront)
- Multiple competing objectives
- Requires novel agent capabilities
- Automation is sophisticated
Examples:
- Bootstrap-013 (Cross-Cutting Concerns): 8 iterations
- Pattern extraction from existing code
- Convention definition ambiguity
- Automated enforcement complexity
- Large codebase scope (all modules)
- Longest experiment but highest ROI (16.7x)
- Bootstrap-003 (Error Recovery): 5 iterations (complex side)
- Error taxonomy creation
- Root cause diagnosis
- Recovery strategy patterns
- 85% transferability
- Bootstrap-012 (Technical Debt): 4 iterations (medium-complex)
- SQALE quantification
- Prioritization complexity
- Subjective vs objective debt
- 85% transferability
Adaptation Strategy:
Complex Domain Approach:
1. Expect agent evolution throughout
2. Plan for 6-8+ iterations
3. Accept lower initial V values (baseline often <0.35)
4. Focus on one dimension per iteration
5. Create specialized agents proactively when gaps identified
6. Document emergent patterns as discovered
Expected Outcomes:
- Iterations: 6-8+
- Duration: 12-18 hours
- Specialized agents: 3-5
- Transferability: 70-85%
- V_instance: Hard-earned 0.80-0.85
- Largest single-iteration gains possible (e.g., +27.3% in Bootstrap-013 Iteration 7)
Domain-Specific Considerations
Documentation-Heavy Domains
Examples: Documentation (001), Knowledge Transfer (011)
Key Adaptations:
- Prioritize clarity over completeness
- Role-based structuring
- Accessibility optimization
- Cross-referencing systems
Success Indicators:
- Access/line ratio > 1.0
- User satisfaction surveys
- Search effectiveness
Technical Implementation Domains
Examples: Observability (009), Dependency Health (010)
Key Adaptations:
- Performance overhead monitoring
- Automation-first approach
- Integration testing critical
- CI/CD pipeline emphasis
Success Indicators:
- Automated coverage %
- Performance impact < 10%
- CI/CD reliability
Quality/Analysis Domains
Examples: Testing (002), Error Recovery (003), Technical Debt (012)
Key Adaptations:
- Quantification frameworks essential
- Baseline measurement critical
- Before/after comparisons
- Statistical validation
Success Indicators:
- Coverage metrics
- Error rate reduction
- Time savings quantified
Systematic Enforcement Domains
Examples: Cross-Cutting Concerns (013), Code Review (008 planned)
Key Adaptations:
- Pattern extraction from existing code
- Linter/checker development
- Gradual enforcement rollout
- Exception handling
Success Indicators:
- Pattern consistency %
- Violation detection rate
- Developer adoption rate
Predicting Iteration Count
Based on empirical data from 8 experiments:
Base estimate: 5 iterations
Adjust based on:
- Well-defined domain: -2 iterations
- Existing tools available: -1 iteration
- High interdependency: +2 iterations
- Novel patterns needed: +1 iteration
- Large codebase scope: +1 iteration
- Multiple competing goals: +1 iteration
Examples:
Dependency Health: 5 - 2 - 1 = 2 → actual 3 ✓
Observability: 5 + 0 + 1 = 6 → actual 6 ✓
Cross-Cutting: 5 + 2 + 1 = 8 → actual 8 ✓
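The heuristic above can be written as a small function. The flag names are shorthand for the listed adjustments, and the mapping of each experiment to specific flags is an assumption on my part.

```python
"""The additive iteration-count heuristic, expressed as a function.
Flag names are shorthand for the adjustments listed in the text."""

def predict_iterations(*, well_defined=False, existing_tools=False,
                       high_interdependency=False, novel_patterns=False,
                       large_scope=False, competing_goals=False) -> int:
    estimate = 5                                   # base estimate
    estimate -= 2 if well_defined else 0
    estimate -= 1 if existing_tools else 0
    estimate += 2 if high_interdependency else 0
    estimate += 1 if novel_patterns else 0
    estimate += 1 if large_scope else 0
    estimate += 1 if competing_goals else 0
    return max(estimate, 1)

# Dependency Health: well-defined domain with existing tools
print(predict_iterations(well_defined=True, existing_tools=True))          # 2 (actual: 3)
# Cross-Cutting Concerns: high interdependency, large codebase scope
print(predict_iterations(high_interdependency=True, large_scope=True))     # 8 (actual: 8)
```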
Agent Specialization Prediction
Generic agents sufficient when:
- Domain has established patterns
- Clear best practices exist
- Automation is straightforward
→ Examples: Dependency Health, Knowledge Transfer
Specialized agents needed when:
- Novel pattern extraction required
- Domain-specific expertise needed
- Complex analysis algorithms
→ Examples: Observability (log-analyzer, metric-designer), Cross-Cutting (pattern-extractor, convention-definer)
Rule of thumb:
- Simple domains: 0-1 specialized agents
- Medium domains: 1-3 specialized agents
- Complex domains: 3-5 specialized agents
Meta-Agent Evolution Prediction
Key finding from 8 experiments: M₀ was sufficient in ALL cases
Meta-Agent M₀ capabilities (5):
1. observe: Pattern observation
2. plan: Iteration planning
3. execute: Agent orchestration
4. reflect: Value assessment
5. evolve: System evolution
No evolution needed because:
- M₀ capabilities cover full lifecycle
- Agent specialization handles domain gaps
- Modular design allows capability reuse
When to evolve Meta-Agent (theoretical, not yet observed):
- Novel coordination pattern needed
- Capability gap in lifecycle
- Cross-agent orchestration complexity
- New convergence pattern discovered
Convergence Pattern Prediction
Based on domain characteristics:
Standard Dual Convergence (most common):
- Both V_instance and V_meta reach 0.80+
- Examples: Observability (009), Dependency Health (010), Technical Debt (012), Cross-Cutting (013)
- Use when: Both objectives equally important
Meta-Focused Convergence:
- V_meta reaches 0.80+, V_instance practically sufficient
- Example: Knowledge Transfer (011) - V_meta = 0.877, V_instance = 0.585
- Use when: Methodology is primary goal, instance is vehicle
Practical Convergence:
- Combined quality exceeds metrics, justified partial criteria
- Example: Testing (002) - V_instance = 0.848, quality > coverage %
- Use when: Quality evidence exceeds raw numbers
Domain Transfer Considerations
Transferability varies by domain abstraction:
High (90-95%):
- Knowledge Transfer (95%+): Learning principles universal
- Observability (90-95%): Three Pillars apply everywhere
Medium-High (85-90%):
- Testing (89%): Test types similar across languages
- Dependency Health (88%): Package manager patterns similar
- Documentation (85%): Role-based structure universal
- Error Recovery (85%): Error taxonomy concepts transfer
- Technical Debt (85%): SQALE principles universal
Medium (70-85%):
- Cross-Cutting Concerns (70-80%): Language-specific patterns
- Refactoring (80% est.): Code smells vary by language
Adaptation effort:
Same language/ecosystem: 10-20% effort (adapt examples)
Similar language (Go→Rust): 30-40% effort (remap patterns)
Different paradigm (Go→JS): 50-60% effort (rethink patterns)
Context Management for LLM Execution
λ(iteration, context_state) → (work_output, context_optimized) | context < limit:
Context management is critical for LLM-based execution where token limits constrain iteration depth and agent effectiveness.
Context Allocation Protocol
context_allocation :: Phase → Percentage
context_allocation(phase) = match phase {
observation → 0.30, -- Data collection, pattern analysis
codification → 0.40, -- Documentation, methodology writing
automation → 0.20, -- Tool creation, CI integration
reflection → 0.10 -- Evaluation, planning
} where Σ = 1.0
Rationale: Based on 8 experiments, codification consumes most context (methodology docs, agent definitions), followed by observation (data analysis), automation (code writing), and reflection (evaluation).
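Applied to a concrete budget, the allocation looks like the sketch below. The 200k-token budget is arbitrary; only the percentages come from the allocation table above.

```python
"""Sketch of the context allocation protocol applied to a concrete
token budget. The budget figure is arbitrary; the phase percentages
come from the allocation table above."""

PHASE_ALLOCATION = {"observation": 0.30, "codification": 0.40,
                    "automation": 0.20, "reflection": 0.10}

def phase_budgets(total_tokens: int) -> dict[str, int]:
    assert abs(sum(PHASE_ALLOCATION.values()) - 1.0) < 1e-9   # Σ = 1.0
    return {phase: int(total_tokens * share) for phase, share in PHASE_ALLOCATION.items()}

print(phase_budgets(200_000))
# {'observation': 60000, 'codification': 80000, 'automation': 40000, 'reflection': 20000}
```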
Context Pressure Management
context_pressure :: State → Strategy
context_pressure(s) =
if usage(s) > 0.80 then overflow_protocol(s)
else if usage(s) > 0.50 then compression_protocol(s)
else standard_protocol(s)
Overflow Protocol (Context >80%)
overflow_protocol :: State → Action
overflow_protocol(s) = prioritize(
serialize_to_disk: save(s.knowledge/*) ∧ compress(s.history),
reference_compression: link(files) ∧ ¬inline(content),
session_split: checkpoint(s) ∧ continue(s_{n+1}, fresh_context)
) where preserve_critical ∧ drop_redundant
Actions:
- Serialize to disk: Save iteration state to knowledge/ directory
- Reference compression: Link to files instead of inlining content
- Session split: Complete current phase, start new session for next iteration
Example (from Bootstrap-013, 8 iterations):
- Iteration 4: Context 85% → Serialized analysis to knowledge/pattern-analysis.md
- Iteration 5: Started fresh session, loaded serialized state via file references
- Result: Continued 4 more iterations without context overflow
Compression Protocol (Context 50-80%)
compression_protocol :: State → Optimizations
compression_protocol(s) = apply(
deduplication: merge(similar_patterns) ∧ reference_once,
summarization: compress(historical_context) ∧ keep(structure),
lazy_loading: defer(load) ∧ fetch_on_demand
)
Optimizations:
- Deduplication: Merge similar patterns, reference once
- Summarization: Compress historical iterations while preserving structure
- Lazy loading: Load agent definitions only when invoked
Convergence Adjustment Under Context Pressure
convergence_adjustment :: (Context, V_i, V_m) → Threshold
convergence_adjustment(ctx, V_i, V_m) =
if usage(ctx) > 0.80 then
prefer(meta_focused) ∧ accept(V_i ≥ 0.55 ∧ V_m ≥ 0.80)
else if usage(ctx) > 0.50 then
standard_dual ∧ target(V_i ≥ 0.80 ∧ V_m ≥ 0.80)
else
extended_optimization ∧ pursue(V_i ≥ 0.90)
Principle: Under high context pressure, prioritize methodology quality (V_meta) over instance quality (V_instance), as methodology is more transferable and valuable long-term.
Validation (Bootstrap-011):
- Context pressure: High (95%+ transferability methodology)
- Converged with: V_meta = 0.877, V_instance = 0.585
- Pattern: Meta-Focused Convergence justified by context constraints
Context Tracking Metrics
context_metrics :: State → Metrics
context_metrics(s) = {
usage_percentage: tokens_used / tokens_limit,
phase_distribution: {obs: 0.30, cod: 0.40, aut: 0.20, ref: 0.10},
compression_ratio: compressed_size / original_size,
session_splits: count(checkpoints)
}
Track these metrics to predict when intervention is needed.
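A minimal sketch combining the pressure thresholds and tracking metrics from this section; the function names and sample numbers are mine, not part of the protocol.

```python
"""Sketch combining the pressure thresholds and tracking metrics above.
Thresholds (0.50, 0.80) come from this section; names and sample values are assumed."""

def choose_protocol(tokens_used: int, tokens_limit: int) -> str:
    usage = tokens_used / tokens_limit
    if usage > 0.80:
        return "overflow"      # serialize to disk, compress references, split session
    if usage > 0.50:
        return "compression"   # deduplicate, summarize history, lazy-load agents
    return "standard"

def context_metrics(tokens_used, tokens_limit, compressed_size, original_size, checkpoints):
    return {
        "usage_percentage": tokens_used / tokens_limit,
        "compression_ratio": compressed_size / original_size,
        "session_splits": len(checkpoints),
        "strategy": choose_protocol(tokens_used, tokens_limit),
    }

print(context_metrics(170_000, 200_000, 12_000, 48_000, ["iter-4"]))
# usage 0.85 -> 'overflow': checkpoint and continue in a fresh session
```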
Prompt Evolution Protocol
λ(agent, effectiveness_data) → agent' | ∀evolution: evidence_driven:
Systematic prompt engineering based on empirical effectiveness data, not intuition.
Core Prompt Patterns
prompt_pattern :: Pattern → Template
prompt_pattern(p) = match p {
context_bounded:
"Process {input} in chunks of {size}. For each chunk: {analysis}. Aggregate: {synthesis}.",
tool_orchestrating:
"Execute: {tool_sequence}. For each result: {validation}. If {condition}: {fallback}.",
iterative_refinement:
"Attempt: {approach_1}. Assess: {criteria}. If insufficient: {approach_2}. Repeat until: {threshold}.",
evidence_accumulation:
"Hypothesis: {H}. Seek confirming: {C}. Seek disconfirming: {D}. Weight: {W}. Decide: {decision}."
}
Usage:
- context_bounded: When processing large datasets (e.g., log analysis, file scanning)
- tool_orchestrating: When coordinating multiple MCP tools (e.g., query cascade)
- iterative_refinement: When solution quality improves through iteration (e.g., optimization)
- evidence_accumulation: When validating hypotheses (e.g., pattern discovery)
Prompt Effectiveness Measurement
prompt_effectiveness :: Prompt → Metrics
prompt_effectiveness(P) = measure(
convergence_contribution: ΔV_per_iteration,
token_efficiency: output_value / tokens_used,
error_rate: failures / total_invocations,
reusability: cross_domain_success_rate
) where empirical_data ∧ comparative_baseline
Metrics:
- Convergence contribution: How much does agent improve V_instance or V_meta per iteration?
- Token efficiency: Value delivered per token consumed (cost-effectiveness)
- Error rate: Percentage of invocations that fail or produce invalid output
- Reusability: Success rate when applied to different domains
Example (from Bootstrap-009):
- log-analyzer agent:
- ΔV_per_iteration: +0.12 average
- Token efficiency: 0.85 (high value, moderate tokens)
- Error rate: 3% (acceptable)
- Reusability: 90% (worked in 009, 010, 012)
- Result: Prompt kept, agent reused in subsequent experiments
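One way to aggregate these metrics from invocation logs is sketched below. The log schema and sample values are hypothetical; only the four metric definitions come from this section.

```python
"""Sketch of aggregating invocation logs into the four effectiveness
metrics. The log schema and numbers are hypothetical."""

def prompt_effectiveness(invocations: list[dict]) -> dict:
    n = len(invocations)
    failures = sum(1 for i in invocations if not i["success"])
    domains = {i["domain"] for i in invocations}
    successful_domains = {i["domain"] for i in invocations if i["success"]}
    return {
        "convergence_contribution": sum(i["delta_v"] for i in invocations) / n,
        "token_efficiency": sum(i["output_value"] for i in invocations)
                            / sum(i["tokens_used"] for i in invocations),
        "error_rate": failures / n,
        "reusability": len(successful_domains) / len(domains),
    }

logs = [
    {"success": True, "delta_v": 0.12, "output_value": 80, "tokens_used": 100, "domain": "observability"},
    {"success": True, "delta_v": 0.10, "output_value": 70, "tokens_used": 90, "domain": "dependency-health"},
    {"success": False, "delta_v": 0.0, "output_value": 0, "tokens_used": 40, "domain": "technical-debt"},
]
print(prompt_effectiveness(logs))
```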
Prompt Evolution Decision
prompt_evolution :: (P, Evidence) → P'
prompt_evolution(P, E) =
if improvement_demonstrated(E) ∧ generalization_validated(E) then
update(P → P') ∧ version(P'.version + 1) ∧ document(E.rationale)
else
maintain(P) ∧ log(evolution_rejected, E.reason)
where ¬premature_optimization ∧ n_samples ≥ 3
Evolution criteria:
- Improvement demonstrated: Evidence shows measurable improvement (ΔV > 0.05 or error_rate < 50%)
- Generalization validated: Works across ≥3 different scenarios
- n_samples ≥ 3: Avoid overfitting to single case
Example (theoretical - prompt evolution not yet observed in 8 experiments):
Original prompt: "Analyze logs for errors."
Evidence: Error detection rate 67%, false positives 23%
Evolved prompt: "Analyze {logs} for errors. For each: classify(type, severity, context). Filter: severity >= {threshold}. Output: structured_json."
Evidence: Error detection rate 89%, false positives 8%
Decision: Evolution accepted (improvement demonstrated, validated across 3 log types)
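The decision rule reduces to a small predicate built from the criteria above (ΔV > 0.05, validation across ≥3 scenarios, n_samples ≥ 3); the sample values passed in are illustrative.

```python
"""Sketch of the prompt-evolution decision rule using the criteria above."""

def should_evolve_prompt(delta_v: float, scenarios_validated: int, n_samples: int) -> bool:
    improvement_demonstrated = delta_v > 0.05
    generalization_validated = scenarios_validated >= 3
    enough_evidence = n_samples >= 3            # avoid overfitting to a single case
    return improvement_demonstrated and generalization_validated and enough_evidence

# Illustrative values loosely matching the theoretical log-analysis example
print(should_evolve_prompt(delta_v=0.22, scenarios_validated=3, n_samples=5))   # True
```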
Agent Prompt Protocol
agent_prompt_protocol :: Agent → Execution
agent_prompt_protocol(A) = ∀invocation:
read(agents/{A.name}.md) ∧
extract(prompt_latest_version) ∧
apply(prompt) ∧
track(effectiveness) ∧
¬cache_prompt
Critical: Always read agent definition fresh (no caching) to ensure latest prompt version used.
Tracking:
- Log each invocation: agent_name, prompt_version, input, output, success/failure
- Aggregate metrics: Calculate effectiveness scores periodically
- Trigger evolution: When n_samples ≥ 3 and improvement opportunity identified
Relationship to Other Methodologies
bootstrapped-se is the CORE framework that integrates and extends two complementary methodologies.
Relationship to empirical-methodology (Inclusion)
bootstrapped-se INCLUDES AND EXTENDS empirical-methodology:
empirical-methodology (5 phases):
Observe → Analyze → Codify → Automate → Evolve
bootstrapped-se (OCA cycle + extensions):
Observe ───────────→ Codify ────→ Automate
↑ ↓
└─────────────── Evolve ──────────┘
(Self-referential feedback loop)
What bootstrapped-se adds beyond empirical-methodology:
- Three-Tuple Output (O, Aₙ, Mₙ) - Reusable artifacts at system level
- Agent Framework - Specialized agents emerge from domain needs
- Meta-Agent System - Modular capabilities for coordination
- Self-Referential Loop - Framework applies to itself
- Formal Convergence - System stability criteria (M_n == M_{n-1}, A_n == A_{n-1})
When to use empirical-methodology explicitly:
- Need detailed scientific method guidance
- Require fine-grained observation tool selection
- Want explicit separation of Analyze phase
When to use bootstrapped-se:
- Always - It's the core framework
- All Bootstrap experiments use bootstrapped-se as foundation
- Provides complete OCA cycle with agent system
Relationship to value-optimization (Mutual Support)
value-optimization PROVIDES QUANTIFICATION for bootstrapped-se:
What bootstrapped-se needs → what value-optimization provides:
- Quality measurement → Dual-layer value functions
- Convergence detection → Formal convergence criteria
- Evolution decisions → ΔV calculations, trends
- Success validation → V_instance ≥ 0.80, V_meta ≥ 0.80
bootstrapped-se ENABLES value-optimization:
- OCA cycle generates state transitions (s_i → s_{i+1})
- Agent work produces V_instance improvements
- Meta-Agent work produces V_meta improvements
- Iteration framework implements optimization loop
When to use value-optimization:
- Always with bootstrapped-se - Provides evaluation framework
- Calculate V_instance and V_meta at every iteration
- Check convergence criteria formally
- Compare across experiments
Integration:
Every bootstrapped-se iteration:
1. Execute OCA cycle (Observe → Codify → Automate)
2. Calculate V(s_n) using value-optimization
3. Check convergence (system stable + dual threshold)
4. If not converged: Continue iteration
5. If converged: Generate (O, Aₙ, Mₙ)
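A sketch of this per-iteration loop, with all agents and value functions left as placeholder callables; the 0.80 thresholds and the stop condition come from the steps above, everything else is an assumption.

```python
"""Sketch of the per-iteration integration loop: run the OCA cycle,
score the state with value-optimization, and stop when the system is
stable and both values clear the 0.80 threshold. All callables are
placeholders for the real agents and value functions."""

def run_bse(observe, codify, automate, v_instance, v_meta, is_stable, max_iterations=8):
    state = None
    for n in range(1, max_iterations + 1):
        observations = observe(state)            # OCA: Observe
        methodology = codify(observations)       # OCA: Codify
        state = automate(methodology)            # OCA: Automate
        vi, vm = v_instance(state), v_meta(state)
        converged = is_stable(state) and vi >= 0.80 and vm >= 0.80
        print(f"iteration {n}: V_instance={vi:.2f}, V_meta={vm:.2f}, converged={converged}")
        if converged:
            break
    return state                                 # alongside converged agents and meta-agent (O, A_n, M_n)
```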
Three-Methodology Integration
Complete workflow (as used in all Bootstrap experiments):
┌─ methodology-framework ─────────────────────┐
│ │
│ ┌─ bootstrapped-se (CORE) ───────────────┐ │
│ │ │ │
│ │ ┌─ empirical-methodology ──────────┐ │ │
│ │ │ │ │ │
│ │ │ Observe + Analyze │ │ │
│ │ │ Codify (with evidence) │ │ │
│ │ │ Automate (CI/CD) │ │ │
│ │ │ Evolve (self-referential) │ │ │
│ │ │ │ │ │
│ │ └───────────────────────────────────┘ │ │
│ │ ↓ │ │
│ │ Produce: (O, Aₙ, Mₙ) │ │
│ │ ↓ │ │
│ │ ┌─ value-optimization ──────────────┐ │ │
│ │ │ │ │ │
│ │ │ V_instance(s_n) = domain quality │ │ │
│ │ │ V_meta(s_n) = methodology quality│ │ │
│ │ │ │ │ │
│ │ │ Convergence check: │ │ │
│ │ │ - System stable? │ │ │
│ │ │ - Dual threshold met? │ │ │
│ │ │ │ │ │
│ │ └───────────────────────────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────┘
Usage Recommendation:
- Start here: Read bootstrapped-se.md (this file)
- Add evaluation: Read value-optimization.md
- Add rigor: Read empirical-methodology.md (optional)
- See integration: Read bootstrapped-ai-methodology-engineering.md (BAIME framework)
Related Skills
- bootstrapped-ai-methodology-engineering: Unified BAIME framework integrating all three methodologies
- empirical-methodology: Scientific foundation (included in bootstrapped-se)
- value-optimization: Quantitative evaluation framework (used by bootstrapped-se)
- iteration-executor: Implementation agent (coordinates bootstrapped-se execution)
Knowledge Base
Source Documentation
- Core methodology: docs/methodology/bootstrapped-software-engineering.md
- Related: docs/methodology/empirical-methodology-development.md
- Examples: experiments/bootstrap-*/ (8 validated experiments)
Key Concepts
- OCA Framework (Observe-Codify-Automate)
- Three-Tuple Output (O, Aₙ, Mₙ)
- Self-Referential Feedback Loop
- Convergence Theorem
- Meta-Methodology
Version History
- v1.0.0 (2025-10-18): Initial release
- Based on meta-cc methodology development
- 8 experiments validated (95% transferability)
- OCA framework with 5-layer feedback loop
- Empirical validation from 277 commits, 11 days
Status: ✅ Production-ready
Validation: 8 experiments (Bootstrap-001 to -013)
Effectiveness: 10-50x methodology development speedup
Transferability: 95% (framework universal, tools adaptable)