--- name: Methodology Bootstrapping description: Apply Bootstrapped AI Methodology Engineering (BAIME) to develop project-specific methodologies through systematic Observe-Codify-Automate cycles with dual-layer value functions (instance quality + methodology quality). Use when creating testing strategies, CI/CD pipelines, error handling patterns, observability systems, or any reusable development methodology. Provides structured framework with convergence criteria, agent coordination, and empirical validation. Validated in 8 experiments with 100% success rate, 4.9 avg iterations, 10-50x speedup vs ad-hoc. Works for testing, CI/CD, error recovery, dependency management, documentation systems, knowledge transfer, technical debt, cross-cutting concerns. allowed-tools: Read, Grep, Glob, Edit, Write, Bash --- # Methodology Bootstrapping **Apply Bootstrapped AI Methodology Engineering (BAIME) to systematically develop and validate software engineering methodologies through observation, codification, and automation.** > The best methodologies are not designed but evolved through systematic observation, codification, and automation of successful practices. --- ## What is BAIME? **BAIME (Bootstrapped AI Methodology Engineering)** is a unified framework that integrates three complementary methodologies optimized for LLM-based development: 1. **OCA Cycle** (Observe-Codify-Automate) - Core iterative framework 2. **Empirical Validation** - Scientific method and data-driven decisions 3. **Value Optimization** - Dual-layer value functions for quantitative evaluation This skill provides the complete BAIME framework for systematic methodology development. The methodology is especially powerful when combined with AI agents (like Claude Code) that can execute the OCA cycle, coordinate specialized agents, and calculate value functions automatically. **Key Innovation**: BAIME treats methodology development like software developmentβ€”with empirical observation, automated testing, continuous iteration, and quantitative metrics. --- ## When to Use This Skill Use this skill when you need to: - 🎯 **Create systematic methodologies** for testing, CI/CD, error handling, observability, etc. - πŸ“Š **Validate methodologies empirically** with data-driven evidence - πŸ”„ **Evolve practices iteratively** using OCA (Observe-Codify-Automate) cycle - πŸ“ˆ **Measure methodology quality** with dual-layer value functions - πŸš€ **Achieve rapid convergence** (typically 3-7 iterations, 6-15 hours) - 🌍 **Create transferable methodologies** (70-95% reusable across projects) **Don't use this skill for**: - ❌ One-time ad-hoc tasks without reusability goals - ❌ Trivial processes (<100 lines of code/docs) - ❌ When established industry standards fully solve your problem --- ## Quick Start with BAIME (10 minutes) ### 1. Define Your Domain Choose what methodology you want to develop using BAIME: - Testing strategy (15x speedup example) - CI/CD pipeline (2.5-3.5x speedup example) - Error recovery patterns (80% error reduction example) - Observability system (23-46x speedup example) - Dependency management (6x speedup example) - Documentation system (47% token cost reduction example) - Knowledge transfer (3-8x speedup example) - Technical debt management - Cross-cutting concerns ### 2. Establish Baseline Measure current state: ```bash # Example: Testing domain - Current coverage: 65% - Test quality: Ad-hoc - No systematic approach - Bug rate: Baseline # Example: CI/CD domain - Build time: 5 minutes - No quality gates - Manual releases ``` ### 3. Set Dual Goals Define both layers: - **Instance goal** (domain-specific): "Reach 80% test coverage" - **Meta goal** (methodology): "Create reusable testing strategy with 85%+ transferability" ### 4. Start Iteration 0 Follow the OCA cycle (see [reference/observe-codify-automate.md](reference/observe-codify-automate.md)) --- ## Specialized Subagents BAIME provides two specialized Claude Code subagents to streamline experiment execution: ### iteration-prompt-designer **When to use**: At experiment start, to create comprehensive ITERATION-PROMPTS.md **What it does**: - Designs iteration templates tailored to your domain - Incorporates modular Meta-Agent architecture - Provides domain-specific guidance for each iteration - Creates structured prompts for baseline and subsequent iterations **How to invoke**: ``` Use the Task tool with subagent_type="iteration-prompt-designer" Example: "Design ITERATION-PROMPTS.md for refactoring methodology experiment" ``` **Benefits**: - βœ… Comprehensive iteration prompts (saves 2-3 hours setup time) - βœ… Domain-specific value function design - βœ… Proper baseline iteration structure - βœ… Evidence-driven evolution guidance --- ### iteration-executor **When to use**: For each iteration execution (Iteration 0, 1, 2, ...) **What it does**: - Executes iteration through lifecycle phases (Observe β†’ Codify β†’ Automate β†’ Evaluate) - Coordinates Meta-Agent capabilities and agent invocations - Tracks state transitions (M_{n-1} β†’ M_n, A_{n-1} β†’ A_n, s_{n-1} β†’ s_n) - Calculates dual-layer value functions (V_instance, V_meta) systematically - Evaluates convergence criteria rigorously - Generates complete iteration documentation **How to invoke**: ``` Use the Task tool with subagent_type="iteration-executor" Example: "Execute Iteration 2 of testing methodology experiment using iteration-executor" ``` **Benefits**: - βœ… Consistent iteration structure across experiments - βœ… Systematic value calculation (reduces bias, improves honesty) - βœ… Proper convergence evaluation (prevents premature convergence) - βœ… Complete artifact generation (data, knowledge, reflections) - βœ… Reduced iteration time (structured execution vs ad-hoc) **Important**: iteration-executor reads capability files fresh each iteration (no caching) to ensure latest guidance is applied. --- ### knowledge-extractor **When to use**: After experiment converges, to extract and transform knowledge into reusable artifacts **What it does**: - Extracts patterns, principles, templates from converged BAIME experiment - Transforms experiment artifacts into production-ready Claude Code skills - Creates knowledge base entries (patterns/*.md, principles/*.md) - Validates output quality with structured criteria (V_instance β‰₯ 0.85) - Achieves 195x speedup (2 min vs 390 min manual extraction) - Produces distributable, reusable artifacts for the community **How to invoke**: ``` Use the Task tool with subagent_type="knowledge-extractor" Example: "Extract knowledge from Bootstrap-004 refactoring experiment and create code-refactoring skill using knowledge-extractor" ``` **Benefits**: - βœ… Systematic knowledge preservation (vs ad-hoc documentation) - βœ… Reusable Claude Code skills (ready for distribution) - βœ… Quality validation (95% content equivalence to hand-crafted) - βœ… Fast extraction (2-5 min, 195x speedup) - βœ… Knowledge base population (patterns, principles, templates) - βœ… Automated artifact generation (43% workflow automation with 4 tools) **Lifecycle position**: Post-Convergence phase ``` Experiment Design β†’ iteration-prompt-designer β†’ ITERATION-PROMPTS.md ↓ Iterate β†’ iteration-executor (x N) β†’ iteration-0..N.md ↓ Converge β†’ Create results.md ↓ Extract β†’ knowledge-extractor β†’ .claude/skills/ + knowledge/ ↓ Distribute β†’ Claude Code users ``` **Validated performance** (Bootstrap-005): - Speedup: 195x (390 min β†’ 2 min) - Quality: V_instance = 0.87, 95% content equivalence - Reliability: 100% success across 3 experiments - Automation: 43% of workflow (6/14 steps) --- ## Core Framework ### The OCA Cycle ``` Observe β†’ Codify β†’ Automate ↑ ↓ └────── Evolve β”€β”€β”€β”€β”€β”€β”˜ ``` **Observe**: Collect empirical data about current practices - Use meta-cc MCP tools to analyze session history - Git analysis for commit patterns - Code metrics (coverage, complexity) - Access pattern tracking - Error rate monitoring **Codify**: Extract patterns and document methodologies - Pattern recognition from data - Hypothesis formation - Documentation as markdown - Validation with real scenarios **Automate**: Convert methodologies to automated checks - Detection: Identify when pattern applies - Validation: Check compliance - Enforcement: CI/CD gates - Suggestion: Automated fix recommendations **Evolve**: Apply methodology to itself for continuous improvement - Use tools on development process - Discover meta-patterns - Optimize methodology **Detailed guide**: [reference/observe-codify-automate.md](reference/observe-codify-automate.md) ### Dual-Layer Value Functions Every iteration calculates two scores: **V_instance(s)**: Domain-specific task quality - Example (testing): coverage Γ— quality Γ— stability Γ— performance - Example (CI/CD): speed Γ— reliability Γ— automation Γ— observability - Target: β‰₯0.80 **V_meta(s)**: Methodology transferability quality - Components: completeness Γ— effectiveness Γ— reusability Γ— validation - Completeness: Is methodology fully documented? - Effectiveness: What speedup does it provide? - Reusability: What % transferable across projects? - Validation: Is it empirically validated? - Target: β‰₯0.80 **Detailed guide**: [reference/dual-value-functions.md](reference/dual-value-functions.md) ### Convergence Criteria Methodology complete when: 1. βœ… **System stable**: Agent set unchanged for 2+ iterations 2. βœ… **Dual threshold**: V_instance β‰₯ 0.80 AND V_meta β‰₯ 0.80 3. βœ… **Objectives complete**: All planned work finished 4. βœ… **Diminishing returns**: Ξ”V < 0.02 for 2+ iterations **Alternative patterns**: - **Meta-Focused Convergence**: V_meta β‰₯ 0.80, V_instance β‰₯ 0.55 (when methodology is primary goal) - **Practical Convergence**: Combined quality exceeds metrics, justified partial criteria **Detailed guide**: [reference/convergence-criteria.md](reference/convergence-criteria.md) --- ## Iteration Documentation Structure Every BAIME iteration must produce a comprehensive iteration report following a standardized 10-section structure. This ensures consistent quality, complete knowledge capture, and reproducible methodology development. ### Required Sections **See complete example**: [examples/iteration-documentation-example.md](examples/iteration-documentation-example.md) **Use blank template**: [examples/iteration-structure-template.md](examples/iteration-structure-template.md) 1. **Executive Summary** (2-3 paragraphs) - Iteration focus and objectives - Key achievements - Key learnings - Value scores (V_instance, V_meta) 2. **Pre-Execution Context** - Previous state: M_{n-1}, A_{n-1}, s_{n-1} - Previous values: V_instance(s_{n-1}), V_meta(s_{n-1}) with component breakdowns - Primary objectives for this iteration 3. **Work Executed** (organized by BAIME phases) - **Phase 1: OBSERVE** - Data collection, measurements, gap identification - **Phase 2: CODIFY** - Pattern extraction, documentation, knowledge creation - **Phase 3: AUTOMATE** - Tool creation, script development, enforcement - **Phase 4: EVALUATE** - Metric calculation, value assessment 4. **Value Calculations** (detailed, evidence-based) - **V_instance(s_n)** with component breakdowns - Each component score with concrete evidence - Formula application with arithmetic - Final score calculation - Change from previous iteration (Ξ”V) - **V_meta(s_n)** with rubric assessments - Completeness score (checklist-based, with evidence) - Effectiveness score (speedup, quality gains, with evidence) - Reusability score (transferability estimate, with evidence) - Final score calculation - Change from previous iteration (Ξ”V) 5. **Gap Analysis** - **Instance layer gaps** (what's needed to reach V_instance β‰₯ 0.80) - Prioritized list with estimated effort - **Meta layer gaps** (what's needed to reach V_meta β‰₯ 0.80) - Prioritized list with estimated effort - Estimated work remaining 6. **Convergence Check** (systematic criteria evaluation) - **Dual threshold**: V_instance β‰₯ 0.80 AND V_meta β‰₯ 0.80 - **System stability**: M_n == M_{n-1} AND A_n == A_{n-1} - **Objectives completeness**: All planned work finished - **Diminishing returns**: Ξ”V < 0.02 for 2+ iterations - **Convergence decision**: YES/NO with detailed rationale 7. **Evolution Decisions** (evidence-driven) - **Agent sufficiency analysis** (A_n vs A_{n-1}) - Each agent's performance assessment - Decision: evolution needed or not - Rationale with evidence - **Meta-Agent sufficiency analysis** (M_n vs M_{n-1}) - Each capability's effectiveness assessment - Decision: evolution needed or not - Rationale with evidence 8. **Artifacts Created** - Data files (coverage reports, metrics, measurements) - Knowledge files (patterns, principles, methodology documents) - Code changes (implementation, tests, tools) - Other deliverables 9. **Reflections** - **What worked well** (successes to repeat) - **What didn't work** (failures to avoid) - **Learnings** (insights from this iteration) - **Insights for methodology** (meta-level learnings) 10. **Conclusion** - Iteration summary - Key metrics and improvements - Critical decisions made - Next steps - Confidence assessment ### File Naming Convention ``` iterations/iteration-N.md ``` Where N = 0, 1, 2, 3, ... (starting from 0 for baseline) ### Documentation Quality Standards **Evidence-based scores**: - Every value component score must have concrete evidence - Avoid vague assessments ("seems good" ❌, "72.3% coverage, +5% from baseline" βœ…) - Show arithmetic for all calculations **Honest assessment**: - Low scores early are expected and acceptable (baseline V_meta often 0.15-0.25) - Don't inflate scores to meet targets - Document gaps explicitly - Acknowledge when objectives are not met **Complete coverage**: - All 10 sections must be present - Don't skip reflections (valuable for meta-learning) - Don't skip gap analysis (critical for planning) - Don't skip convergence check (prevents premature convergence) ### Tools for Iteration Documentation **Recommended workflow**: 1. Copy [examples/iteration-structure-template.md](examples/iteration-structure-template.md) to `iterations/iteration-N.md` 2. Invoke `iteration-executor` subagent to execute iteration with structured documentation 3. Review [examples/iteration-documentation-example.md](examples/iteration-documentation-example.md) for quality reference **Automated generation**: Use `iteration-executor` subagent to ensure consistent structure and systematic value calculation. --- ## Three-Layer Architecture **BAIME** integrates three complementary methodologies into a unified framework: **Layer 1: Core Framework (OCA Cycle)** - Observe β†’ Codify β†’ Automate β†’ Evolve - Three-tuple output: (O, Aβ‚™, Mβ‚™) - Self-referential feedback loop - Agent coordination **Layer 2: Scientific Foundation (Empirical Methodology)** - Empirical observation tools - Data-driven pattern extraction - Hypothesis testing - Scientific validation **Layer 3: Quantitative Evaluation (Value Optimization)** - Dual-layer value functions (V_instance + V_meta) - Convergence mathematics - Agent as gradient, Meta-Agent as Hessian - Optimization perspective **Why "BAIME"?** The framework bootstraps itselfβ€”methodologies developed using BAIME can be applied to improve BAIME itself. This self-referential property, combined with AI-agent coordination, makes it uniquely suited for LLM-based development tools. **Detailed guide**: [reference/three-layer-architecture.md](reference/three-layer-architecture.md) --- ## Proven Results **Validated in 8 experiments**: - βœ… 100% success rate (8/8 converged) - ⏱️ Average: 4.9 iterations, 9.1 hours - πŸ“ˆ V_instance average: 0.784 (range: 0.585-0.92) - πŸ“ˆ V_meta average: 0.840 (range: 0.83-0.877) - 🌍 Transferability: 70-95%+ - πŸš€ Speedup: 3-46x vs ad-hoc **Example applications**: - **Testing strategy**: 15x speedup, 75%β†’86% coverage ([examples/testing-methodology.md](examples/testing-methodology.md)) - **CI/CD pipeline**: 2.5-3.5x speedup, 91.7% pattern validation ([examples/ci-cd-optimization.md](examples/ci-cd-optimization.md)) - **Error recovery**: 80% error reduction, 85% transferability - **Observability**: 23-46x speedup, 90-95% transferability - **Dependency health**: 6x speedup (9hβ†’1.5h), 88% transferability - **Knowledge transfer**: 3-8x onboarding speedup, 95%+ transferability - **Documentation**: 47% token cost reduction, 85% transferability - **Technical debt**: SQALE quantification, 85% transferability --- ## Usage Templates ### Experiment Template Use [templates/experiment-template.md](templates/experiment-template.md) to structure your methodology development: - README.md structure - Iteration prompts - Knowledge extraction format - Results documentation ### Iteration Prompt Template Use [templates/iteration-prompts-template.md](templates/iteration-prompts-template.md) to guide each iteration: - Iteration N objectives - OCA cycle execution steps - Value calculation rubrics - Convergence checks **Automated generation**: Use `iteration-prompt-designer` subagent to create domain-specific iteration prompts. ### Iteration Documentation Template **Structure template**: [examples/iteration-structure-template.md](examples/iteration-structure-template.md) - 10-section standardized structure - Blank template ready to copy and fill - Includes all required components **Complete example**: [examples/iteration-documentation-example.md](examples/iteration-documentation-example.md) - Real iteration from test strategy experiment - Shows proper value calculations with evidence - Demonstrates honest assessment and gap analysis - Illustrates quality reflections and insights **Automated execution**: Use `iteration-executor` subagent to ensure consistent structure and systematic value calculation. **Quality standards**: - Evidence-based scoring (concrete data, not vague assessments) - Honest evaluation (low scores acceptable, inflation harmful) - Complete coverage (all 10 sections required) - Arithmetic shown (all value calculations with steps) --- ## Common Pitfalls ❌ **Don't**: - Use only one methodology layer in isolation (except quick prototyping) - Predetermine agent evolution path (let specialization emerge from data) - Force convergence at target iteration count (trust the criteria) - Inflate value metrics to meet targets (honest assessment critical) - Skip empirical validation (data-driven decisions only) βœ… **Do**: - Start with OCA cycle, add evaluation and validation - Let agent specialization emerge from domain needs - Trust the convergence criteria (system knows when done) - Calculate V(s) honestly based on actual state - Complete all analysis thoroughly before codifying ### Iteration Documentation Pitfalls ❌ **Don't**: - Skip iteration documentation (every iteration needs iteration-N.md) - Calculate V-scores without component breakdowns and evidence - Use vague assessments ("seems good", "probably 0.7") - Omit gap analysis or convergence checks - Document only successes (failures provide valuable learnings) - Assume convergence without systematic criteria evaluation - Inflate scores to meet targets (honesty is critical) - Skip reflections section (meta-learning opportunity) βœ… **Do**: - Use `iteration-executor` subagent for consistent structure - Provide concrete evidence for each value component - Show arithmetic for all calculations - Document both instance and meta layer gaps explicitly - Include reflections (what worked, didn't work, learnings, insights) - Be honest about scores (baseline V_meta of 0.20 is normal and acceptable) - Follow the 10-section structure for every iteration - Reference iteration documentation example for quality standards --- ## Related Skills **Acceleration techniques** (achieve 3-4 iteration convergence): - [rapid-convergence](../rapid-convergence/SKILL.md) - Fast convergence patterns - [retrospective-validation](../retrospective-validation/SKILL.md) - Historical data validation - [baseline-quality-assessment](../baseline-quality-assessment/SKILL.md) - Strong iteration 0 **Supporting skills**: - [agent-prompt-evolution](../agent-prompt-evolution/SKILL.md) - Track agent specialization **Domain applications** (ready-to-use methodologies): - [testing-strategy](../testing-strategy/SKILL.md) - TDD, coverage-driven, fixtures - [error-recovery](../error-recovery/SKILL.md) - Error taxonomy, recovery patterns - [ci-cd-optimization](../ci-cd-optimization/SKILL.md) - Quality gates, automation - [observability-instrumentation](../observability-instrumentation/SKILL.md) - Logging, metrics, tracing - [dependency-health](../dependency-health/SKILL.md) - Security, freshness, compliance - [knowledge-transfer](../knowledge-transfer/SKILL.md) - Onboarding, learning paths - [technical-debt-management](../technical-debt-management/SKILL.md) - SQALE, prioritization - [cross-cutting-concerns](../cross-cutting-concerns/SKILL.md) - Pattern extraction, enforcement --- ## References **Core documentation**: - [Overview](reference/overview.md) - Architecture and philosophy - [OCA Cycle](reference/observe-codify-automate.md) - Detailed process - [Value Functions](reference/dual-value-functions.md) - Evaluation framework - [Convergence Criteria](reference/convergence-criteria.md) - When to stop - [Three-Layer Architecture](reference/three-layer-architecture.md) - Framework layers **Quick start**: - [Quick Start Guide](reference/quick-start-guide.md) - Step-by-step tutorial **Examples**: - [Testing Methodology](examples/testing-methodology.md) - Complete walkthrough - [CI/CD Optimization](examples/ci-cd-optimization.md) - Pipeline example - [Error Recovery](examples/error-recovery.md) - Error handling example **Templates**: - [Experiment Template](templates/experiment-template.md) - Structure your experiment - [Iteration Prompts](templates/iteration-prompts-template.md) - Guide each iteration --- **Status**: βœ… Production-ready | BAIME Framework | 8 experiments | 100% success rate | 95% transferable **Terminology**: This skill implements the **Bootstrapped AI Methodology Engineering (BAIME)** framework. Use "BAIME" when referring to this methodology in documentation, research, or when asking Claude Code for assistance with methodology development.