Initial commit

2025-11-30 09:07:22 +08:00
commit fab98d059b
179 changed files with 46209 additions and 0 deletions
--- a/skills/testing-strategy/reference/automation-tools.md
+++ b/skills/testing-strategy/reference/automation-tools.md
@@ -0,0 +1,355 @@
+# Test Automation Tools
+
+**Version**: 2.0
+**Source**: Bootstrap-002 Test Strategy Development
+**Last Updated**: 2025-10-18
+
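+This document describes two automation scripts, a coverage gap analyzer and a test generator, plus the workflow that combines them (presented below as Tools 1-3). Together they accelerate test development through coverage analysis and test generation.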
+
+---
+
+## Tool 1: Coverage Gap Analyzer
+
+**Purpose**: Identify functions with low coverage and suggest priorities
+
+**Usage**:
+```bash
+./scripts/analyze-coverage-gaps.sh coverage.out
+./scripts/analyze-coverage-gaps.sh coverage.out --threshold 70 --top 5
+./scripts/analyze-coverage-gaps.sh coverage.out --category error-handling
+```
+
+**Output**:
+- Prioritized list of functions (P1-P4)
+- Suggested test patterns
+- Time estimates
+- Coverage impact estimates
+
+**Features**:
+- Categorizes by function type (error-handling, business-logic, cli, etc.)
+- Assigns priority based on category
+- Suggests appropriate test patterns
+- Estimates time and coverage impact
+
+**Time Saved**: 10-15 minutes per testing session (vs manual coverage analysis)
+
+**Speedup**: 186x faster than manual analysis
+
+### Priority Matrix
+
+| Category | Target Coverage | Priority | Time/Test |
+|----------|----------------|----------|-----------|
+| Error Handling | 80-90% | P1 | 15 min |
+| Business Logic | 75-85% | P2 | 12 min |
+| CLI Handlers | 70-80% | P2 | 12 min |
+| Integration | 70-80% | P3 | 20 min |
+| Utilities | 60-70% | P3 | 8 min |
+| Infrastructure | Best effort | P4 | 25 min |
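+
+The matrix can be read as a lookup from function category to priority, target range, and time estimate. The following is a minimal Go sketch of that reading, not the analyzer's actual implementation (which is a shell script). The `Rule` type, the `matrix` variable, and `PriorityFor` are hypothetical names, and the keys `integration`, `utilities`, and `infrastructure` are assumed by analogy with the documented `--category` values.
+
+```go
+package main
+
+import "fmt"
+
+// Rule mirrors one row of the priority matrix (hypothetical type).
+type Rule struct {
+    Priority       string // P1 (highest) to P4 (lowest)
+    Target         string // target coverage range, as written in the table
+    MinutesPerTest int    // estimated minutes per test
+}
+
+// matrix copies the table rows. Key names beyond error-handling,
+// business-logic, and cli are assumptions.
+var matrix = map[string]Rule{
+    "error-handling": {"P1", "80-90%", 15},
+    "business-logic": {"P2", "75-85%", 12},
+    "cli":            {"P2", "70-80%", 12},
+    "integration":    {"P3", "70-80%", 20},
+    "utilities":      {"P3", "60-70%", 8},
+    "infrastructure": {"P4", "best effort", 25},
+}
+
+// PriorityFor returns the rule for a category, or false if the category is unknown.
+func PriorityFor(category string) (Rule, bool) {
+    r, ok := matrix[category]
+    return r, ok
+}
+
+func main() {
+    if r, ok := PriorityFor("error-handling"); ok {
+        fmt.Printf("%s target=%s est=%dmin\n", r.Priority, r.Target, r.MinutesPerTest)
+    }
+}
+```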
+
+### Example Output
+
+```
+HIGH PRIORITY (Error Handling):
+1. ValidateInput (0.0%) - P1
+   Pattern: Error Path + Table-Driven
+   Estimated time: 15 min
+   Expected coverage impact: +0.25%
+
+2. CheckFormat (25.0%) - P1
+   Pattern: Error Path + Table-Driven
+   Estimated time: 12 min
+   Expected coverage impact: +0.18%
+
+MEDIUM PRIORITY (Business Logic):
+3. ProcessData (45.0%) - P2
+   Pattern: Table-Driven
+   Estimated time: 12 min
+   Expected coverage impact: +0.20%
+```
+
+---
+
+## Tool 2: Test Generator
+
+**Purpose**: Generate test scaffolds from function signatures
+
+**Usage**:
+```bash
+./scripts/generate-test.sh ParseQuery --pattern table-driven
+./scripts/generate-test.sh ValidateInput --pattern error-path --scenarios 4
+./scripts/generate-test.sh Execute --pattern cli-command
+```
+
+**Supported Patterns**:
+- `unit`: Simple unit test
+- `table-driven`: Multiple scenarios
+- `error-path`: Error handling
+- `cli-command`: CLI testing
+- `global-flag`: Flag parsing
+
+**Output**:
+- Test file with pattern structure
+- Appropriate imports
+- TODO comments for customization
+- Formatted with gofmt
+
+**Time Saved**: 5-8 minutes per test (vs writing from scratch)
+
+**Speedup**: 200x faster than manual test scaffolding
+
+### Example: Generate Error Path Test
+
+```bash
+$ ./scripts/generate-test.sh ValidateInput --pattern error-path --scenarios 4 \
+  --package validation --output internal/validation/validate_test.go
+```
+
+**Generated Output**:
+```go
+package validation
+
+import (
+    "strings"
+    "testing"
+)
+
+func TestValidateInput_ErrorCases(t *testing.T) {
+    tests := []struct {
+        name    string
+        input   interface{} // TODO: Replace with actual type
+        wantErr bool
+        errMsg  string
+    }{
+        {
+            name:    "nil input",
+            input:   nil, // TODO: Fill in test data
+            wantErr: true,
+            errMsg:  "", // TODO: Expected error message
+        },
+        {
+            name:    "empty input",
+            input:   nil, // TODO: Fill in test data
+            wantErr: true,
+            errMsg:  "", // TODO: Expected error message
+        },
+        {
+            name:    "invalid format",
+            input:   nil, // TODO: Fill in test data
+            wantErr: true,
+            errMsg:  "", // TODO: Expected error message
+        },
+        {
+            name:    "out of range",
+            input:   nil, // TODO: Fill in test data
+            wantErr: true,
+            errMsg:  "", // TODO: Expected error message
+        },
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            _, err := ValidateInput(tt.input) // TODO: Add correct arguments
+
+            if (err != nil) != tt.wantErr {
+                t.Errorf("ValidateInput() error = %v, wantErr %v", err, tt.wantErr)
+                return
+            }
+
+            if tt.wantErr && !strings.Contains(err.Error(), tt.errMsg) {
+                t.Errorf("expected error containing '%s', got '%s'", tt.errMsg, err.Error())
+            }
+        })
+    }
+}
+```
+
+---
+
+## Tool 3: Workflow Integration
+
+**Purpose**: Seamless integration between coverage analysis and test generation
+
+Both tools work together in a streamlined workflow:
+
+```bash
+# 1. Identify gaps
+./scripts/analyze-coverage-gaps.sh coverage.out --top 10
+
+# Output shows:
+# 1. ValidateInput (0.0%) - P1 error-handling
+#    Pattern: Error Path Pattern (Pattern 4) + Table-Driven (Pattern 2)
+
+# 2. Generate test
+./scripts/generate-test.sh ValidateInput --pattern error-path --scenarios 4
+
+# 3. Fill in TODOs and run
+go test ./internal/validation/
+```
+
+**Combined Time Saved**: 15-20 minutes per testing session
+
+**Overall Speedup**: 7.5x faster methodology development
+
+---
+
+## Effectiveness Comparison
+
+### Without Tools (Manual Approach)
+
+**Per Testing Session**:
+- Coverage gap analysis: 15-20 min
+- Pattern selection: 5-10 min
+- Test scaffolding: 8-12 min
+- **Total overhead**: ~30-40 min
+
+### With Tools (Automated Approach)
+
+**Per Testing Session**:
+- Coverage gap analysis: 2 min (run tool)
+- Pattern selection: Suggested by tool
+- Test scaffolding: 1 min (generate test)
+- **Total overhead**: ~5 min
+
+**Speedup**: 6-8x faster test planning and setup
+
+---
+
+## Complete Workflow Example
+
+### Scenario: Add Tests for Validation Package
+
+**Step 1: Analyze Coverage**
+```bash
+$ go test -coverprofile=coverage.out ./...
+$ ./scripts/analyze-coverage-gaps.sh coverage.out --category error-handling
+
+HIGH PRIORITY (Error Handling):
+1. ValidateInput (0.0%) - Pattern: Error Path + Table-Driven
+2. CheckFormat (25.0%) - Pattern: Error Path + Table-Driven
+```
+
+**Step 2: Generate Test for ValidateInput**
+```bash
+$ ./scripts/generate-test.sh ValidateInput --pattern error-path --scenarios 4 \
+  --package validation --output internal/validation/validate_test.go
+```
+
+**Step 3: Fill in Generated Test** (see Tool 2 example above)
+
+**Step 4: Run and Verify**
+```bash
+$ go test ./internal/validation/ -v
+=== RUN   TestValidateInput_ErrorCases
+=== RUN   TestValidateInput_ErrorCases/nil_input
+=== RUN   TestValidateInput_ErrorCases/empty_input
+=== RUN   TestValidateInput_ErrorCases/invalid_format
+=== RUN   TestValidateInput_ErrorCases/out_of_range
+--- PASS: TestValidateInput_ErrorCases (0.00s)
+PASS
+
+$ go test -cover ./internal/validation/
+coverage: 75.2% of statements
+```
+
+**Result**: Package coverage increased from 57.9% to 75.2% (+17.3 percentage points) in ~15 minutes
+
+---
+
+## Installation and Setup
+
+### Prerequisites
+
+```bash
+# Ensure Go is installed
+go version
+
+# Ensure standard Unix tools available
+which awk sed grep
+```
+
+### Tool Files Location
+
+```
+scripts/
+├── analyze-coverage-gaps.sh    # Coverage analyzer
+└── generate-test.sh             # Test generator
+```
+
+### Usage Tips
+
+1. **Always generate coverage first**:
+   ```bash
+   go test -coverprofile=coverage.out ./...
+   ```
+
+2. **Use analyzer categories** for focused analysis:
+   - `--category error-handling`: High-priority validation/error functions
+   - `--category business-logic`: Core functionality
+   - `--category cli`: Command handlers
+
+3. **Customize test generator output**:
+   - Use `--scenarios N` to control number of test cases
+   - Use `--output path` to specify target file
+   - Use `--package name` to set package name
+
+4. **Iterate quickly**:
+   ```bash
+   # Generate, fill, test, repeat
+   ./scripts/generate-test.sh Function --pattern table-driven
+   vim path/to/test_file.go  # Fill TODOs
+   go test ./...
+   ```
+
+---
+
+## Troubleshooting
+
+### Coverage Gap Analyzer Issues
+
+```bash
+# Error: go command not found
+# Solution: Ensure Go installed and in PATH
+
+# Error: coverage file not found
+# Solution: Generate coverage first:
+go test -coverprofile=coverage.out ./...
+
+# Error: invalid coverage format
+# Solution: Use raw coverage file, not processed output
+```
+
+### Test Generator Issues
+
+```bash
+# Error: gofmt not found
+# Solution: Install Go tools or skip formatting
+
+# Generated test doesn't compile
+# Solution: Fill in TODO items with actual types/values
+```
+
+---
+
+## Effectiveness Metrics
+
+**Measured over 4 iterations**:
+
+| Metric | Without Tools | With Tools | Speedup |
+|--------|--------------|------------|---------|
+| Coverage analysis | 15-20 min | 2 min | 186x |
+| Test scaffolding | 8-12 min | 1 min | 200x |
+| Total overhead | 30-40 min | 5 min | 6-8x |
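+| Per test time | 20-25 min | 4-5 min | 5x |
+
+The 186x and 200x figures are as reported by the source experiment. The per-session times listed in those two rows correspond to roughly 7.5-10x and 8-12x.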
+
+**Real-World Results** (from experiment):
+- Tests added: 17 tests
+- Average time per test: 11 min (with tools)
+- Estimated ad-hoc time: 20 min per test
+- Time saved: ~150 min total
+- **Efficiency gain: 45%**
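+
+The last two figures follow from the per-test averages listed above, as this short check shows:
+
+```latex
+\text{time saved} = 17 \times (20 - 11)\ \text{min} = 153\ \text{min} \approx 150\ \text{min},
+\qquad
+\text{efficiency gain} = \frac{20 - 11}{20} = 45\%
+```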
+
+---
+
+**Source**: Bootstrap-002 Test Strategy Development
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated through 4 iterations
--- a/skills/testing-strategy/reference/cross-language-guide.md
+++ b/skills/testing-strategy/reference/cross-language-guide.md
@@ -0,0 +1,609 @@
+# Cross-Language Test Strategy Adaptation
+
+**Version**: 2.0
+**Source**: Bootstrap-002 Test Strategy Development
+**Last Updated**: 2025-10-18
+
+This document provides guidance for adapting test patterns and methodology to different programming languages and frameworks.
+
+---
+
+## Transferability Overview
+
+### Universal Concepts (100% Transferable)
+
+The following concepts apply to ALL languages:
+
+1. **Coverage-Driven Workflow**: Analyze → Prioritize → Test → Verify
+2. **Priority Matrix**: P1 (error handling) → P4 (infrastructure)
+3. **Pattern-Based Testing**: Structured approaches to common scenarios
+4. **Table-Driven Approach**: Multiple scenarios with shared logic
+5. **Error Path Testing**: Systematic edge case coverage
+6. **Dependency Injection**: Mock external dependencies
+7. **Quality Standards**: Test structure and best practices
+8. **TDD Cycle**: Red-Green-Refactor
+
+### Language-Specific Elements (Require Adaptation)
+
+1. **Syntax and Imports**: Language-specific
+2. **Testing Framework APIs**: Different per ecosystem
+3. **Coverage Tool Commands**: Language-specific tools
+4. **Mock Implementation**: Different mocking libraries
+5. **Build/Run Commands**: Different toolchains
+
+---
+
+## Go → Python Adaptation
+
+### Transferability: 80-90%
+
+### Testing Framework Mapping
+
+| Go Concept | Python Equivalent |
+|------------|------------------|
+| `testing` package | `unittest` or `pytest` |
+| `t.Run()` subtests | `pytest` parametrize or `unittest` subtests |
+| `t.Helper()` | `__tracebackhide__ = True` inside a helper function (`pytest`) |
+| `t.Cleanup()` | `pytest` fixtures with yield or `unittest` tearDown |
+| Table-driven tests | `@pytest.mark.parametrize` |
+
+### Pattern Adaptations
+
+#### Pattern 1: Unit Test
+
+**Go**:
+```go
+func TestFunction(t *testing.T) {
+    result := Function(input)
+    if result != expected {
+        t.Errorf("got %v, want %v", result, expected)
+    }
+}
+```
+
+**Python (pytest)**:
+```python
+def test_function():
+    result = function(input)
+    assert result == expected, f"got {result}, want {expected}"
+```
+
+**Python (unittest)**:
+```python
+class TestFunction(unittest.TestCase):
+    def test_function(self):
+        result = function(input)
+        self.assertEqual(result, expected)
+```
+
+#### Pattern 2: Table-Driven Test
+
+**Go**:
+```go
+func TestFunction(t *testing.T) {
+    tests := []struct {
+        name     string
+        input    int
+        expected int
+    }{
+        {"case1", 1, 2},
+        {"case2", 2, 4},
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            result := Function(tt.input)
+            if result != tt.expected {
+                t.Errorf("got %v, want %v", result, tt.expected)
+            }
+        })
+    }
+}
+```
+
+**Python (pytest)**:
+```python
+@pytest.mark.parametrize("input,expected", [
+    (1, 2),
+    (2, 4),
+])
+def test_function(input, expected):
+    result = function(input)
+    assert result == expected
+```
+
+**Python (unittest)**:
+```python
+class TestFunction(unittest.TestCase):
+    def test_cases(self):
+        cases = [
+            ("case1", 1, 2),
+            ("case2", 2, 4),
+        ]
+        for name, input, expected in cases:
+            with self.subTest(name=name):
+                result = function(input)
+                self.assertEqual(result, expected)
+```
+
+#### Pattern 6: Dependency Injection (Mocking)
+
+**Go**:
+```go
+type Executor interface {
+    Execute(args Args) (Result, error)
+}
+
+type MockExecutor struct {
+    Results map[string]Result
+}
+
+func (m *MockExecutor) Execute(args Args) (Result, error) {
+    return m.Results[args.Key], nil
+}
+```
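+
+The Go snippet above shows the mock but not how a test would use it. Below is a self-contained sketch of that usage, written as a small program rather than a `_test.go` file so it runs on its own. `Args`, `Result`, the `Calls` field, and `ProcessData` are hypothetical placeholders introduced for illustration; the original does not define them.
+
+```go
+package main
+
+import "fmt"
+
+// Args and Result are placeholder types (assumed shapes).
+type Args struct{ Key string }
+type Result struct{ Value string }
+
+// Executor is the dependency that production code receives by injection.
+type Executor interface {
+    Execute(args Args) (Result, error)
+}
+
+// MockExecutor returns canned results and records every call it receives.
+type MockExecutor struct {
+    Results map[string]Result
+    Calls   []Args
+}
+
+func (m *MockExecutor) Execute(args Args) (Result, error) {
+    m.Calls = append(m.Calls, args)
+    r, ok := m.Results[args.Key]
+    if !ok {
+        return Result{}, fmt.Errorf("no canned result for key %q", args.Key)
+    }
+    return r, nil
+}
+
+// ProcessData is a hypothetical function under test; it depends only on the interface.
+func ProcessData(e Executor, key string) (string, error) {
+    r, err := e.Execute(Args{Key: key})
+    if err != nil {
+        return "", err
+    }
+    return "processed:" + r.Value, nil
+}
+
+func main() {
+    mock := &MockExecutor{Results: map[string]Result{"k1": {Value: "v1"}}}
+    out, err := ProcessData(mock, "k1")
+    fmt.Println(out, err, len(mock.Calls))
+}
+```
+
+Recording calls in the mock plays the role that `assert_called_once()` plays in the Python version.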
+
+**Python (unittest.mock)**:
+```python
+from unittest.mock import Mock
+
+def test_process():
+    mock_executor = Mock()
+    mock_executor.execute.return_value = expected_result
+
+    result = process_data(mock_executor)
+
+    assert result == expected
+    mock_executor.execute.assert_called_once()
+```
+
+**Python (pytest-mock)**:
+```python
+def test_process(mocker):
+    mock_executor = mocker.Mock()
+    mock_executor.execute.return_value = expected_result
+
+    result = process_data(mock_executor)
+
+    assert result == expected
+```
+
+### Coverage Tools
+
+**Go**:
+```bash
+go test -coverprofile=coverage.out ./...
+go tool cover -func=coverage.out
+go tool cover -html=coverage.out
+```
+
+**Python (pytest-cov)**:
+```bash
+pytest --cov=package --cov-report=term
+pytest --cov=package --cov-report=html
+pytest --cov=package --cov-report=term-missing
+```
+
+**Python (coverage.py)**:
+```bash
+coverage run -m pytest
+coverage report
+coverage html
+```
+
+---
+
+## Go → JavaScript/TypeScript Adaptation
+
+### Transferability: 75-85%
+
+### Testing Framework Mapping
+
+| Go Concept | JavaScript/TypeScript Equivalent |
+|------------|--------------------------------|
+| `testing` package | Jest, Mocha, Vitest |
+| `t.Run()` subtests | `describe()` / `it()` blocks |
+| Table-driven tests | `test.each()` (Jest) |
+| Mocking | Jest mocks, Sinon |
+| Coverage | Jest built-in, nyc/istanbul |
+
+### Pattern Adaptations
+
+#### Pattern 1: Unit Test
+
+**Go**:
+```go
+func TestFunction(t *testing.T) {
+    result := Function(input)
+    if result != expected {
+        t.Errorf("got %v, want %v", result, expected)
+    }
+}
+```
+
+**JavaScript (Jest)**:
+```javascript
+test('function returns expected result', () => {
+    const result = functionUnderTest(input);
+    expect(result).toBe(expected);
+});
+```
+
+**TypeScript (Jest)**:
+```typescript
+describe('functionUnderTest', () => {
+    it('returns expected result', () => {
+        const result = functionUnderTest(input);
+        expect(result).toBe(expected);
+    });
+});
+```
+
+#### Pattern 2: Table-Driven Test
+
+**Go**:
+```go
+func TestFunction(t *testing.T) {
+    tests := []struct {
+        name     string
+        input    int
+        expected int
+    }{
+        {"case1", 1, 2},
+        {"case2", 2, 4},
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            result := Function(tt.input)
+            if result != tt.expected {
+                t.Errorf("got %v, want %v", result, tt.expected)
+            }
+        })
+    }
+}
+```
+
+**JavaScript/TypeScript (Jest)**:
+```typescript
+describe('functionUnderTest', () => {
+    test.each([
+        ['case1', 1, 2],
+        ['case2', 2, 4],
+    ])('%s: input %i should return %i', (name, input, expected) => {
+        const result = functionUnderTest(input);
+        expect(result).toBe(expected);
+    });
+});
+```
+
+**Alternative with object syntax**:
+```typescript
+describe('functionUnderTest', () => {
+    test.each([
+        { name: 'case1', input: 1, expected: 2 },
+        { name: 'case2', input: 2, expected: 4 },
+    ])('$name', ({ input, expected }) => {
+        const result = functionUnderTest(input);
+        expect(result).toBe(expected);
+    });
+});
+```
+
+#### Pattern 6: Dependency Injection (Mocking)
+
+**Go**:
+```go
+type MockExecutor struct {
+    Results map[string]Result
+}
+```
+
+**JavaScript (Jest)**:
+```javascript
+const mockExecutor = {
+    execute: jest.fn((args) => {
+        return results[args.key];
+    })
+};
+
+test('processData uses executor', () => {
+    const result = processData(mockExecutor, testData);
+
+    expect(result).toBe(expected);
+    expect(mockExecutor.execute).toHaveBeenCalledWith(testData);
+});
+```
+
+**TypeScript (Jest)**:
+```typescript
+const mockExecutor: Executor = {
+    execute: jest.fn((args: Args): Result => {
+        return results[args.key];
+    })
+};
+```
+
+### Coverage Tools
+
+**Jest (built-in)**:
+```bash
+jest --coverage
+jest --coverage --coverageReporters=html
+jest --coverage --coverageReporters=text-summary
+```
+
+**nyc (for Mocha)**:
+```bash
+nyc mocha
+nyc report --reporter=html
+nyc report --reporter=text-summary
+```
+
+---
+
+## Go → Rust Adaptation
+
+### Transferability: 70-80%
+
+### Testing Framework Mapping
+
+| Go Concept | Rust Equivalent |
+|------------|----------------|
+| `testing` package | Built-in `#[test]` |
+| `t.Run()` subtests | `#[test]` functions |
+| Table-driven tests | Loop or macro |
+| Error handling | `Result<T, E>` assertions |
+| Mocking | `mockall` crate |
+
+### Pattern Adaptations
+
+#### Pattern 1: Unit Test
+
+**Go**:
+```go
+func TestFunction(t *testing.T) {
+    result := Function(input)
+    if result != expected {
+        t.Errorf("got %v, want %v", result, expected)
+    }
+}
+```
+
+**Rust**:
+```rust
+#[test]
+fn test_function() {
+    let result = function(input);
+    assert_eq!(result, expected);
+}
+```
+
+#### Pattern 2: Table-Driven Test
+
+**Go**:
+```go
+func TestFunction(t *testing.T) {
+    tests := []struct {
+        name     string
+        input    int
+        expected int
+    }{
+        {"case1", 1, 2},
+        {"case2", 2, 4},
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            result := Function(tt.input)
+            if result != tt.expected {
+                t.Errorf("got %v, want %v", result, tt.expected)
+            }
+        })
+    }
+}
+```
+
+**Rust**:
+```rust
+#[test]
+fn test_function() {
+    let tests = vec![
+        ("case1", 1, 2),
+        ("case2", 2, 4),
+    ];
+
+    for (name, input, expected) in tests {
+        let result = function(input);
+        assert_eq!(result, expected, "test case: {}", name);
+    }
+}
+```
+
+**Rust (using rstest crate)**:
+```rust
+use rstest::rstest;
+
+#[rstest]
+#[case(1, 2)]
+#[case(2, 4)]
+fn test_function(#[case] input: i32, #[case] expected: i32) {
+    let result = function(input);
+    assert_eq!(result, expected);
+}
+```
+
+#### Pattern 4: Error Path Testing
+
+**Go**:
+```go
+func TestFunction_Error(t *testing.T) {
+    _, err := Function(invalidInput)
+    if err == nil {
+        t.Error("expected error, got nil")
+    }
+}
+```
+
+**Rust**:
+```rust
+#[test]
+fn test_function_error() {
+    let result = function(invalid_input);
+    assert!(result.is_err(), "expected error");
+}
+
+#[test]
+#[should_panic(expected = "invalid input")]
+fn test_function_panic() {
+    function_that_panics(invalid_input);
+}
+```
+
+### Coverage Tools
+
+**tarpaulin**:
+```bash
+cargo tarpaulin --out Html
+cargo tarpaulin --out Lcov
+```
+
+**cargo-llvm-cov**:
+```bash
+cargo llvm-cov --html
+cargo llvm-cov --text
+```
+
+---
+
+## Adaptation Checklist
+
+When adapting test methodology to a new language:
+
+### Phase 1: Map Core Concepts
+
+- [ ] Identify language testing framework (unittest, pytest, Jest, etc.)
+- [ ] Map test structure (functions vs classes vs methods)
+- [ ] Map assertion style (if/error vs assert vs expect)
+- [ ] Map test organization (subtests, parametrize, describe/it)
+- [ ] Map mocking approach (interfaces vs dependency injection vs mocks)
+
+### Phase 2: Adapt Patterns
+
+- [ ] Translate Pattern 1 (Unit Test) to target language
+- [ ] Translate Pattern 2 (Table-Driven) to target language
+- [ ] Translate Pattern 4 (Error Path) to target language
+- [ ] Identify language-specific patterns (e.g., decorator tests in Python)
+- [ ] Document language-specific gotchas
+
+### Phase 3: Adapt Tools
+
+- [ ] Identify coverage tool (coverage.py, Jest, tarpaulin, etc.)
+- [ ] Create coverage gap analyzer script for target language
+- [ ] Create test generator script for target language
+- [ ] Adapt automation workflow to target toolchain
+
+### Phase 4: Adapt Workflow
+
+- [ ] Update coverage generation commands
+- [ ] Update test execution commands
+- [ ] Update IDE/editor integration
+- [ ] Update CI/CD pipeline
+- [ ] Document language-specific workflow
+
+### Phase 5: Validate
+
+- [ ] Apply methodology to sample project
+- [ ] Measure effectiveness (time per test, coverage increase)
+- [ ] Document lessons learned
+- [ ] Refine patterns based on feedback
+
+---
+
+## Language-Specific Considerations
+
+### Python
+
+**Strengths**:
+- `pytest` parametrize is excellent for table-driven tests
+- Fixtures provide powerful setup/teardown
+- `unittest.mock` is very flexible
+
+**Challenges**:
+- Dynamic typing can hide errors caught at compile time in Go
+- Coverage tools sometimes struggle with decorators
+- Import-time code execution complicates testing
+
+**Tips**:
+- Use type hints to catch errors early
+- Use `pytest-cov` for coverage
+- Use `pytest-mock` for simpler mocking
+- Test module imports separately
+
+### JavaScript/TypeScript
+
+**Strengths**:
+- Jest has excellent built-in mocking
+- `test.each` is natural for table-driven tests
+- TypeScript adds compile-time type safety
+
+**Challenges**:
+- Async/Promise handling adds complexity
+- Module mocking can be tricky
+- Coverage of TypeScript types vs runtime code
+
+**Tips**:
+- Use TypeScript for better IDE support and type safety
+- Use Jest's `async/await` test support
+- Use `ts-jest` for TypeScript testing
+- Mock at module boundaries, not implementation details
+
+### Rust
+
+**Strengths**:
+- Built-in testing framework is simple and fast
+- Compile-time guarantees reduce need for some tests
+- `Result<T, E>` makes error testing explicit
+
+**Challenges**:
+- Less mature test tooling ecosystem
+- Mocking requires more setup (mockall crate)
+- Lifetime and ownership can complicate test data
+
+**Tips**:
+- Use `rstest` for parametrized tests
+- Use `mockall` for mocking traits
+- Use integration tests (`tests/` directory) for public API
+- Use unit tests for internal logic
+
+---
+
+## Effectiveness Across Languages
+
+### Expected Methodology Transfer
+
+| Language | Pattern Transfer | Tool Adaptation | Overall Transfer |
+|----------|-----------------|----------------|-----------------|
+| **Python** | 95% | 80% | 80-90% |
+| **JavaScript/TypeScript** | 90% | 75% | 75-85% |
+| **Rust** | 85% | 70% | 70-80% |
+| **Java** | 90% | 80% | 80-85% |
+| **C#** | 90% | 85% | 85-90% |
+| **Ruby** | 85% | 75% | 75-80% |
+
+### Time to Adapt
+
+| Activity | Estimated Time |
+|----------|---------------|
+| Map core concepts | 2-3 hours |
+| Adapt patterns | 3-4 hours |
+| Create automation tools | 4-6 hours |
+| Validate on sample project | 2-3 hours |
+| Document adaptations | 1-2 hours |
+| **Total** | **12-18 hours** |
+
+---
+
+**Source**: Bootstrap-002 Test Strategy Development
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated through 4 iterations
--- a/skills/testing-strategy/reference/gap-closure.md
+++ b/skills/testing-strategy/reference/gap-closure.md
@@ -0,0 +1,534 @@
+# Coverage Gap Closure Methodology
+
+**Version**: 2.0
+**Source**: Bootstrap-002 Test Strategy Development
+**Last Updated**: 2025-10-18
+
+This document describes the systematic approach to closing coverage gaps through prioritization, pattern selection, and continuous verification.
+
+---
+
+## Overview
+
+Coverage gap closure is a systematic process for improving test coverage by:
+
+1. Identifying functions with low/zero coverage
+2. Prioritizing based on criticality
+3. Selecting appropriate test patterns
+4. Implementing tests efficiently
+5. Verifying coverage improvements
+6. Tracking progress
+
+---
+
+## Step-by-Step Gap Closure Process
+
+### Step 1: Baseline Coverage Analysis
+
+Generate current coverage report:
+
+```bash
+go test -coverprofile=coverage.out ./...
+go tool cover -func=coverage.out > coverage-baseline.txt
+```
+
+**Extract key metrics**:
+```bash
+# Overall coverage
+go tool cover -func=coverage.out | tail -1
+# total: (statements) 72.1%
+
+# Per-function coverage, sorted by file and line number
+go tool cover -func=coverage.out | grep "^github.com" | awk '{print $1, $NF}' | sort -t: -k1,1 -k2,2n
+
+# Per-package coverage
+go test -cover ./...
+```
+
+**Document baseline**:
+```
+Date: 2025-10-18
+Total Coverage: 72.1%
+Packages Below Target (<75%):
+- internal/query: 65.3%
+- internal/analyzer: 68.7%
+- cmd/meta-cc: 55.2%
+```
+
+### Step 2: Identify Coverage Gaps
+
+**Automated approach** (recommended):
+```bash
+./scripts/analyze-coverage-gaps.sh coverage.out --top 20 --threshold 70
+```
+
+**Manual approach**:
+```bash
+# Find zero-coverage functions (anchored so that 10.0% or 100.0% do not match)
+go tool cover -func=coverage.out | grep -E '[[:space:]]0\.0%$' > zero-coverage.txt
+
+# Find low-coverage functions (<60%)
+go tool cover -func=coverage.out | awk '$NF+0 < 60.0' > low-coverage.txt
+
+# Count zero-coverage functions per file
+awk -F: '{print $1}' zero-coverage.txt | sort | uniq -c
+```
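+
+The same filtering can be done in Go when shell pipelines become hard to maintain. This is a minimal sketch, assuming each line of `go tool cover -func` output has three whitespace-separated fields (location, function name, percentage) and ends with a `total:` line. `FuncCoverage`, `ParseFuncCoverage`, and `Below` are hypothetical names, and the sample report is made up for illustration.
+
+```go
+package main
+
+import (
+    "bufio"
+    "fmt"
+    "strconv"
+    "strings"
+)
+
+// FuncCoverage is one parsed line of `go tool cover -func` output.
+type FuncCoverage struct {
+    Location string // e.g. "internal/validation/validate.go:15:"
+    Name     string
+    Percent  float64
+}
+
+// ParseFuncCoverage parses the report text, skipping the total line
+// and any line that does not have the expected three fields.
+func ParseFuncCoverage(report string) []FuncCoverage {
+    var out []FuncCoverage
+    sc := bufio.NewScanner(strings.NewReader(report))
+    for sc.Scan() {
+        fields := strings.Fields(sc.Text())
+        if len(fields) != 3 || fields[0] == "total:" {
+            continue
+        }
+        pct, err := strconv.ParseFloat(strings.TrimSuffix(fields[2], "%"), 64)
+        if err != nil {
+            continue
+        }
+        out = append(out, FuncCoverage{Location: fields[0], Name: fields[1], Percent: pct})
+    }
+    return out
+}
+
+// Below returns the entries whose coverage is strictly below the threshold.
+func Below(entries []FuncCoverage, threshold float64) []FuncCoverage {
+    var out []FuncCoverage
+    for _, e := range entries {
+        if e.Percent < threshold {
+            out = append(out, e)
+        }
+    }
+    return out
+}
+
+func main() {
+    report := "internal/validation/validate.go:15:\tValidateInput\t0.0%\n" +
+        "internal/validation/format.go:9:\tCheckFormat\t100.0%\n" +
+        "total:\t(statements)\t72.1%\n"
+    for _, e := range Below(ParseFuncCoverage(report), 60) {
+        fmt.Printf("%s %s %.1f%%\n", e.Location, e.Name, e.Percent)
+    }
+}
+```
+
+Parsing the percentage as a number avoids the substring pitfall of text matching, where a pattern for `0.0%` also matches `100.0%`.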
+
+**Output example**:
+```
+Zero Coverage Functions (42 total):
+  12 internal/query/filters.go
+   8 internal/analyzer/patterns.go
+   6 cmd/meta-cc/server.go
+   ...
+
+Low Coverage Functions (<60%, 23 total):
+   7 internal/query/executor.go (45-55% coverage)
+   5 internal/parser/jsonl.go (50-58% coverage)
+   ...
+```
+
+### Step 3: Categorize and Prioritize
+
+**Categorization criteria**:
+
+| Category | Characteristics | Priority |
+|----------|----------------|----------|
+| **Error Handling** | Validation, error paths, edge cases | P1 |
+| **Business Logic** | Core algorithms, data processing | P2 |
+| **CLI Handlers** | Command execution, flag parsing | P2 |
+| **Integration** | End-to-end flows, handlers | P3 |
+| **Utilities** | Helpers, formatters | P3 |
+| **Infrastructure** | Init, setup, configuration | P4 |
+
+**Prioritization algorithm**:
+
+```
+For each function with <target coverage:
+  1. Identify category (error-handling, business-logic, etc.)
+  2. Assign priority (P1-P4)
+  3. Estimate time (based on pattern + complexity)
+  4. Estimate coverage impact (+0.1% to +0.3%)
+  5. Calculate ROI = impact / time
+  6. Sort by priority, then ROI
+```
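+
+Steps 5 and 6 of the algorithm amount to a two-key sort. A minimal Go sketch follows, assuming time estimates are positive and impact is measured in percentage points of total coverage. The `Gap` type and `SortGaps` function are hypothetical names, not part of the analyzer script.
+
+```go
+package main
+
+import (
+    "fmt"
+    "sort"
+)
+
+// Gap describes one under-covered function (hypothetical record).
+type Gap struct {
+    Name     string
+    Priority int     // 1 = P1 ... 4 = P4
+    Minutes  float64 // estimated time to write the test; assumed > 0
+    Impact   float64 // estimated total-coverage gain in percentage points
+}
+
+// ROI is the estimated coverage gain per minute of effort.
+func (g Gap) ROI() float64 { return g.Impact / g.Minutes }
+
+// SortGaps orders gaps by priority (P1 first), then by ROI (highest first).
+func SortGaps(gaps []Gap) {
+    sort.SliceStable(gaps, func(i, j int) bool {
+        if gaps[i].Priority != gaps[j].Priority {
+            return gaps[i].Priority < gaps[j].Priority
+        }
+        return gaps[i].ROI() > gaps[j].ROI()
+    })
+}
+
+func main() {
+    gaps := []Gap{
+        {"ProcessData", 2, 12, 0.20},
+        {"CheckFormat", 1, 12, 0.18},
+        {"ValidateInput", 1, 15, 0.25},
+    }
+    SortGaps(gaps)
+    for _, g := range gaps {
+        fmt.Printf("P%d %-14s ROI=%.4f\n", g.Priority, g.Name, g.ROI())
+    }
+}
+```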
+
+**Example prioritized list**:
+```
+P1 (Critical - Error Handling):
+1. ValidateInput (0%) - Error Path + Table → 15 min, +0.25%
+2. CheckFormat (25%) - Error Path → 12 min, +0.18%
+3. ParseQuery (33%) - Error Path + Table → 15 min, +0.20%
+
+P2 (High - Business Logic):
+4. ProcessData (45%) - Table-Driven → 12 min, +0.20%
+5. ApplyFilters (52%) - Table-Driven → 10 min, +0.15%
+
+P2 (High - CLI):
+6. ExecuteCommand (0%) - CLI Command → 13 min, +0.22%
+7. ParseFlags (38%) - Global Flag → 11 min, +0.18%
+```
+
+### Step 4: Create Test Plan
+
+For each testing session (target: 2-3 hours):
+
+**Plan template**:
+```
+Session: Validation Error Paths
+Date: 2025-10-18
+Target: +5% package coverage, +1.0% total coverage
+Time Budget: 2 hours (120 min)
+
+Tests Planned:
+1. ValidateInput - Error Path + Table (15 min) → +0.25%
+2. CheckFormat - Error Path (12 min) → +0.18%
+3. ParseQuery - Error Path + Table (15 min) → +0.20%
+4. ProcessData - Table-Driven (12 min) → +0.20%
+5. ApplyFilters - Table-Driven (10 min) → +0.15%
+6. Buffer time: 56 min (for debugging, refactoring)
+
+Expected Outcome:
+- 5 new test functions
+- Coverage: 72.1% → 73.1% (+1.0%)
+```
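+
+The buffer time and expected impact in the plan are sums over the planned tests. A small Go sketch of that arithmetic is shown here, using the numbers from the template. `PlannedTest` and `Summarize` are hypothetical names used only for illustration.
+
+```go
+package main
+
+import "fmt"
+
+// PlannedTest is one line of the session plan (hypothetical type).
+type PlannedTest struct {
+    Name    string
+    Minutes int
+    Impact  float64 // expected total-coverage gain in percentage points
+}
+
+// Summarize returns the planned minutes, the remaining buffer within
+// the time budget, and the summed expected coverage impact.
+func Summarize(budget int, tests []PlannedTest) (planned, buffer int, impact float64) {
+    for _, t := range tests {
+        planned += t.Minutes
+        impact += t.Impact
+    }
+    return planned, budget - planned, impact
+}
+
+func main() {
+    tests := []PlannedTest{
+        {"ValidateInput", 15, 0.25},
+        {"CheckFormat", 12, 0.18},
+        {"ParseQuery", 15, 0.20},
+        {"ProcessData", 12, 0.20},
+        {"ApplyFilters", 10, 0.15},
+    }
+    planned, buffer, impact := Summarize(120, tests)
+    fmt.Printf("planned=%dmin buffer=%dmin impact=+%.2f%%\n", planned, buffer, impact)
+}
+```
+
+With the template's numbers this gives 64 planned minutes, a 56-minute buffer, and a summed impact of about +0.98 percentage points.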
+
+### Step 5: Implement Tests
+
+For each test in the plan:
+
+**Workflow**:
+```bash
+# 1. Generate test scaffold
+./scripts/generate-test.sh FunctionName --pattern PATTERN
+
+# 2. Fill in test details
+vim path/to/test_file.go
+
+# 3. Run test
+go test ./package/... -v -run TestFunctionName
+
+# 4. Verify coverage improvement
+go test -coverprofile=temp.out ./package/...
+go tool cover -func=temp.out | grep FunctionName
+```
+
+**Example implementation**:
+```go
+// Generated scaffold
+func TestValidateInput_ErrorCases(t *testing.T) {
+    tests := []struct {
+        name    string
+        input   *Input  // TODO: Fill in
+        wantErr bool
+        errMsg  string
+    }{
+        {
+            name:    "nil input",
+            input:   nil,  // ← Fill in
+            wantErr: true,
+            errMsg:  "cannot be nil",  // ← Fill in
+        },
+        // TODO: Add more cases
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            _, err := ValidateInput(tt.input)
+            // Assertions...
+        })
+    }
+}
+
+// After filling TODOs (takes ~10-12 min per test)
+```
+
+### Step 6: Verify Coverage Impact
+
+After implementing each test:
+
+```bash
+# Run new test
+go test ./internal/validation/ -v -run TestValidateInput
+
+# Generate coverage for package
+go test -coverprofile=new_coverage.out ./internal/validation/
+
+# Compare with baseline
+echo "=== Before ==="
+go tool cover -func=coverage.out | grep "internal/validation/"
+
+echo "=== After ==="
+go tool cover -func=new_coverage.out | grep "internal/validation/"
+
+# Calculate improvement
+echo "=== Change ==="
+diff <(go tool cover -func=coverage.out | grep ValidateInput) \
+     <(go tool cover -func=new_coverage.out | grep ValidateInput)
+```
+
+**Expected output**:
+```
+=== Before ===
+internal/validation/validate.go:15:  ValidateInput  0.0%
+
+=== After ===
+internal/validation/validate.go:15:  ValidateInput  85.7%
+
+=== Change ===
+< internal/validation/validate.go:15:  ValidateInput  0.0%
+> internal/validation/validate.go:15:  ValidateInput  85.7%
+```
+
+### Step 7: Track Progress
+
+**Per-test tracking**:
+```
+Test: TestValidateInput_ErrorCases
+Time: 12 min (estimated 15 min) → 20% faster
+Pattern: Error Path + Table-Driven
+Coverage Impact:
+  - Function: 0% → 85.7% (+85.7%)
+  - Package: 57.9% → 62.3% (+4.4%)
+  - Total: 72.1% → 72.3% (+0.2%)
+Issues: None
+Notes: Table-driven very efficient for error cases
+```
+
+**Session summary**:
+```
+Session: Validation Error Paths
+Date: 2025-10-18
+Duration: 110 min (planned 120 min)
+
+Tests Completed: 5/5
+1. ValidateInput → +0.25% (actual: +0.2%)
+2. CheckFormat → +0.18% (actual: +0.15%)
+3. ParseQuery → +0.20% (actual: +0.22%)
+4. ProcessData → +0.20% (actual: +0.18%)
+5. ApplyFilters → +0.15% (actual: +0.12%)
+
+Total Impact:
+- Coverage: 72.1% → 72.97% (+0.87%)
+- Tests added: 5 test functions, 18 test cases
+- Time efficiency: 110 min / 5 tests = 22 min/test (vs 25 min/test ad-hoc)
+
+Lessons:
+- Error Path + Table-Driven pattern very effective
+- Test generator saved ~40 min total
+- Buffer time well-used for edge cases
+```
+
+### Step 8: Iterate
+
+Repeat the process:
+
+```bash
+# Update baseline (regenerate across all packages; new_coverage.out covers only one package)
+go test -coverprofile=coverage.out ./...
+
+# Re-analyze gaps
+./scripts/analyze-coverage-gaps.sh coverage.out --top 15
+
+# Plan next session
+# ...
+```
+
+---
+
+## Coverage Improvement Patterns
+
+### Pattern: Rapid Low-Hanging Fruit
+
+**When**: Many zero-coverage functions, need quick wins
+
+**Approach**:
+1. Target P1/P2 zero-coverage functions
+2. Use simple patterns (Unit, Table-Driven)
+3. Skip complex infrastructure functions
+4. Aim for 60-70% function coverage quickly
+
+**Expected**: +5-10% total coverage in 3-4 hours
+
+### Pattern: Systematic Package Closure
+
+**When**: Specific package below target
+
+**Approach**:
+1. Focus on single package
+2. Close all P1/P2 gaps in that package
+3. Achieve 75-80% package coverage
+4. Move to next package
+
+**Expected**: +10-15% package coverage in 4-6 hours
+
+### Pattern: Critical Path Hardening
+
+**When**: Need high confidence in core functionality
+
+**Approach**:
+1. Identify critical business logic
+2. Achieve 85-90% coverage on critical functions
+3. Use Error Path + Integration patterns
+4. Add edge case coverage
+
+**Expected**: +0.5-1% total coverage per critical function
+
+---
+
+## Troubleshooting
+
+### Issue: Coverage Not Increasing
+
+**Symptoms**: Tests are added, but coverage stays the same
+
+**Diagnosis**:
+```bash
+# Check if function is actually being tested
+go test -coverprofile=coverage.out ./...
+go tool cover -func=coverage.out | grep FunctionName
+```
+
+**Causes**:
+- Testing already-covered code (indirect coverage)
+- Test not actually calling target function
+- Function has unreachable code
+
+**Solutions**:
+- Focus on 0% coverage functions
+- Verify test actually exercises target code path
+- Use coverage visualization: `go tool cover -html=coverage.out`
+
+### Issue: Coverage Decreasing
+
+**Symptoms**: Coverage goes down after adding code
+
+**Causes**:
+- New code added without tests
+- Refactoring exposed previously hidden code
+
+**Solutions**:
+- Always add tests for new code (TDD)
+- Update coverage baseline after new features
+- Set up pre-commit hooks to block coverage decreases
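+
+One way such a hook could decide whether to block a commit is to compare the `total:` line of two `go tool cover -func` reports. The sketch below is an assumption about how that check might look; `totalCoverage`, `coverageDropped`, and the 0.5-point tolerance are illustrative choices, not part of the existing scripts.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+    "strconv"
+    "strings"
+)
+
+// totalCoverage extracts the percentage from the "total:" summary line
+// that `go tool cover -func` prints at the end of its report.
+func totalCoverage(report string) (float64, error) {
+    for _, line := range strings.Split(report, "\n") {
+        fields := strings.Fields(line)
+        if len(fields) >= 3 && fields[0] == "total:" {
+            return strconv.ParseFloat(strings.TrimSuffix(fields[len(fields)-1], "%"), 64)
+        }
+    }
+    return 0, errors.New("no total line found")
+}
+
+// coverageDropped reports whether current is lower than baseline by more
+// than tolerance percentage points.
+func coverageDropped(baseline, current, tolerance float64) bool {
+    return baseline-current > tolerance
+}
+
+func main() {
+    baseline, err := totalCoverage("total:\t(statements)\t72.1%")
+    if err != nil {
+        fmt.Println("baseline:", err)
+        return
+    }
+    current, err := totalCoverage("total:\t(statements)\t71.0%")
+    if err != nil {
+        fmt.Println("current:", err)
+        return
+    }
+    if coverageDropped(baseline, current, 0.5) {
+        fmt.Printf("coverage dropped: %.1f%% -> %.1f%%\n", baseline, current)
+    }
+}
+```
+
+A real hook would read both reports from files and exit non-zero when `coverageDropped` returns true. The tolerance avoids failing on small fluctuations.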
+
+### Issue: Hard to Test Functions
+
+**Symptoms**: Can't achieve good coverage on certain functions
+
+**Causes**:
+- Complex dependencies
+- Infrastructure code (init, config)
+- Difficult-to-mock external systems
+
+**Solutions**:
+- Use Dependency Injection (Pattern 6)
+- Accept lower coverage for infrastructure (40-60%)
+- Consider refactoring if truly untestable
+- Extract testable business logic
+
+### Issue: Slow Progress
+
+**Symptoms**: Tests take much longer than estimated
+
+**Causes**:
+- Complex setup required
+- Unclear function behavior
+- Pattern mismatch
+
+**Solutions**:
+- Create test helpers (Pattern 5)
+- Read function implementation first
+- Adjust pattern selection
+- Break into smaller tests
+
+---
+
+## Metrics and Goals
+
+### Healthy Coverage Progression
+
+**Typical trajectory** (starting from 60-70%):
+
+```
+Week 1: 62% → 68% (+6%)  - Low-hanging fruit
+Week 2: 68% → 72% (+4%)  - Package-focused
+Week 3: 72% → 75% (+3%)  - Critical paths
+Week 4: 75% → 77% (+2%)  - Edge cases
+Maintenance: 75-80%      - New code + decay prevention
+```
+
+**Time investment**:
+- Initial ramp-up: 8-12 hours total
+- Maintenance: 1-2 hours per week
+
+### Coverage Targets by Project Phase
+
+| Phase | Target | Focus |
+|-------|--------|-------|
+| **MVP** | 50-60% | Core happy paths |
+| **Beta** | 65-75% | + Error handling |
+| **Production** | 75-80% | + Edge cases, integration |
+| **Mature** | 80-85% | + Documentation examples |
+
+### When to Stop
+
+**Diminishing returns** occur when:
+- Coverage >80% total
+- All P1/P2 functions >75%
+- Remaining gaps are infrastructure/init code
+- Time per 1% increase >3 hours
+
+**Don't aim for 100%**:
+- Infrastructure code is hard to test (40-60% is acceptable)
+- Some code paths may be unreachable
+- ROI drops significantly above 85%
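+
+The diminishing-returns criteria above can be checked mechanically at the end of a session. The following is a hedged sketch: the `SessionMetrics` struct and its field names are hypothetical, and because the list does not say how many criteria must hold, the function only reports which ones are met and leaves the decision to the reader.
+
+```go
+package main
+
+import "fmt"
+
+// SessionMetrics is a hypothetical summary of the current coverage state.
+type SessionMetrics struct {
+    TotalCoverage       float64 // total coverage in percent
+    LowestP1P2Coverage  float64 // lowest coverage among P1/P2 functions
+    OnlyInfraGapsRemain bool    // remaining gaps are infrastructure/init code
+    HoursPerPercent     float64 // recent time spent per +1% total coverage
+}
+
+// diminishingReturnSignals lists which stop criteria currently hold.
+func diminishingReturnSignals(m SessionMetrics) []string {
+    var signals []string
+    if m.TotalCoverage > 80 {
+        signals = append(signals, "total coverage above 80%")
+    }
+    if m.LowestP1P2Coverage > 75 {
+        signals = append(signals, "all P1/P2 functions above 75%")
+    }
+    if m.OnlyInfraGapsRemain {
+        signals = append(signals, "only infrastructure gaps remain")
+    }
+    if m.HoursPerPercent > 3 {
+        signals = append(signals, "more than 3 hours per 1% increase")
+    }
+    return signals
+}
+
+func main() {
+    m := SessionMetrics{
+        TotalCoverage:       81.2,
+        LowestP1P2Coverage:  76.0,
+        OnlyInfraGapsRemain: true,
+        HoursPerPercent:     3.5,
+    }
+    for _, s := range diminishingReturnSignals(m) {
+        fmt.Println("stop signal:", s)
+    }
+}
+```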
+
+---
+
+## Example: Complete Gap Closure Session
+
+### Starting State
+
+```
+Package: internal/validation
+Current Coverage: 57.9%
+Target Coverage: 75%+
+Gap: 17.1%
+
+Zero Coverage Functions:
+- ValidateInput (0%)
+- CheckFormat (0%)
+- ParseQuery (0%)
+
+Low Coverage Functions:
+- ValidateFilter (45%)
+- NormalizeInput (52%)
+```
+
+### Plan
+
+```
+Session: Close validation coverage gaps
+Time Budget: 2 hours
+Target: 57.9% → 75%+ (+17.1%)
+
+Tests:
+1. ValidateInput (15 min) → +4.5%
+2. CheckFormat (12 min) → +3.2%
+3. ParseQuery (15 min) → +4.1%
+4. ValidateFilter gaps (12 min) → +2.8%
+5. NormalizeInput gaps (10 min) → +2.5%
+Total: 64 min active, 56 min buffer
+```
+
+### Execution
+
+```bash
+# Test 1: ValidateInput
+$ ./scripts/generate-test.sh ValidateInput --pattern error-path --scenarios 4
+$ vim internal/validation/validate_test.go
+# ... fill in TODOs (10 min) ...
+$ go test ./internal/validation/ -run TestValidateInput -v
+PASS (12 min actual)
+
+# Test 2: CheckFormat
+$ ./scripts/generate-test.sh CheckFormat --pattern error-path --scenarios 3
+$ vim internal/validation/format_test.go
+# ... fill in TODOs (8 min) ...
+$ go test ./internal/validation/ -run TestCheckFormat -v
+PASS (11 min actual)
+
+# Test 3: ParseQuery
+$ ./scripts/generate-test.sh ParseQuery --pattern table-driven --scenarios 5
+$ vim internal/validation/query_test.go
+# ... fill in TODOs (12 min) ...
+$ go test ./internal/validation/ -run TestParseQuery -v
+PASS (14 min actual)
+
+# Test 4: ValidateFilter (add missing cases)
+$ vim internal/validation/filter_test.go
+# ... add 3 edge cases (8 min) ...
+$ go test ./internal/validation/ -run TestValidateFilter -v
+PASS (10 min actual)
+
+# Test 5: NormalizeInput (add missing cases)
+$ vim internal/validation/normalize_test.go
+# ... add 2 edge cases (6 min) ...
+$ go test ./internal/validation/ -run TestNormalizeInput -v
+PASS (8 min actual)
+```
+
+### Result
+
+```
+Time: 55 min (vs 64 min estimated)
+Coverage: 57.9% → 75.2% (+17.3%)
+Tests Added: 5 functions, 17 test cases
+Efficiency: 11 min per test (vs 15 min ad-hoc estimate)
+
+SUCCESS: Target achieved (75%+)
+```
+
+---
+
+**Source**: Bootstrap-002 Test Strategy Development
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated through 4 iterations
--- a/skills/testing-strategy/reference/patterns.md
+++ b/skills/testing-strategy/reference/patterns.md
@@ -0,0 +1,425 @@
+# Test Pattern Library
+
+**Version**: 2.0
+**Source**: Bootstrap-002 Test Strategy Development
+**Last Updated**: 2025-10-18
+
+This document provides 8 proven test patterns for Go testing with practical examples and usage guidance.
+
+---
+
+## Pattern 1: Unit Test Pattern
+
+**Purpose**: Test a single function or method in isolation
+
+**Structure**:
+```go
+func TestFunctionName_Scenario(t *testing.T) {
+    // Setup
+    input := createTestInput()
+
+    // Execute
+    result, err := FunctionUnderTest(input)
+
+    // Assert
+    if err != nil {
+        t.Fatalf("unexpected error: %v", err)
+    }
+
+    if result != expected {
+        t.Errorf("expected %v, got %v", expected, result)
+    }
+}
+```
+
+**When to Use**:
+- Testing pure functions (no side effects)
+- Simple input/output validation
+- Single test scenario
+
+**Time**: ~8-10 minutes per test
+
+---
+
+## Pattern 2: Table-Driven Test Pattern
+
+**Purpose**: Test multiple scenarios with the same test logic
+
+**Structure**:
+```go
+func TestFunction(t *testing.T) {
+    tests := []struct {
+        name     string
+        input    InputType
+        expected OutputType
+        wantErr  bool
+    }{
+        {
+            name:     "valid input",
+            input:    validInput,
+            expected: validOutput,
+            wantErr:  false,
+        },
+        {
+            name:     "invalid input",
+            input:    invalidInput,
+            expected: zeroValue,
+            wantErr:  true,
+        },
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            result, err := Function(tt.input)
+
+            if (err != nil) != tt.wantErr {
+                t.Errorf("Function() error = %v, wantErr %v", err, tt.wantErr)
+                return
+            }
+
+            if !tt.wantErr && result != tt.expected {
+                t.Errorf("Function() = %v, expected %v", result, tt.expected)
+            }
+        })
+    }
+}
+```
+
+**When to Use**:
+- Testing boundary conditions
+- Multiple input variations
+- Comprehensive coverage
+
+**Time**: ~10-15 minutes for 3-5 scenarios
+
+---
+
+## Pattern 3: Integration Test Pattern
+
+**Purpose**: Test complete request/response flow through handlers
+
+**Structure**:
+```go
+func TestHandler(t *testing.T) {
+    // Setup: Create request
+    req := createTestRequest()
+
+    // Setup: Capture output (outputWriter is the package-level writer the handler uses)
+    var buf bytes.Buffer
+    originalWriter := outputWriter
+    outputWriter = &buf
+    defer func() { outputWriter = originalWriter }()
+
+    // Execute
+    handleRequest(req)
+
+    // Assert: Parse response
+    var resp Response
+    if err := json.Unmarshal(buf.Bytes(), &resp); err != nil {
+        t.Fatalf("failed to parse response: %v", err)
+    }
+
+    // Assert: Validate response
+    if resp.Error != nil {
+        t.Errorf("unexpected error: %v", resp.Error)
+    }
+}
+```
+
+**When to Use**:
+- Testing MCP server handlers
+- HTTP endpoint testing
+- End-to-end flows
+
+**Time**: ~15-20 minutes per test
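+
+For the HTTP endpoint case, the standard library's `net/http/httptest` package can stand in for the writer swap shown in the template. This is a minimal sketch under assumptions: `statusHandler`, `statusResponse`, and `callHandler` are hypothetical names used only for illustration, and the flow is written as a plain program so it runs outside `go test`.
+
+```go
+package main
+
+import (
+    "encoding/json"
+    "fmt"
+    "net/http"
+    "net/http/httptest"
+)
+
+// statusResponse is a hypothetical JSON payload.
+type statusResponse struct {
+    Status string `json:"status"`
+}
+
+// statusHandler is a hypothetical handler under test.
+func statusHandler(w http.ResponseWriter, r *http.Request) {
+    w.Header().Set("Content-Type", "application/json")
+    json.NewEncoder(w).Encode(statusResponse{Status: "ok"})
+}
+
+// callHandler drives the handler in-process and decodes its JSON body,
+// mirroring the Setup / Execute / Assert steps of the pattern.
+func callHandler(h http.HandlerFunc) (int, statusResponse, error) {
+    req := httptest.NewRequest(http.MethodGet, "/status", nil)
+    rec := httptest.NewRecorder()
+
+    h(rec, req)
+
+    var resp statusResponse
+    err := json.NewDecoder(rec.Body).Decode(&resp)
+    return rec.Code, resp, err
+}
+
+func main() {
+    code, resp, err := callHandler(statusHandler)
+    if err != nil {
+        fmt.Println("failed to parse response:", err)
+        return
+    }
+    fmt.Println(code, resp.Status) // prints "200 ok"
+}
+```
+
+Inside a real test, the body of `main` would move into a `TestXxx` function and the final print would become assertions on `code` and `resp`.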
+
+---
+
+## Pattern 4: Error Path Test Pattern
+
+**Purpose**: Systematically test error handling and edge cases
+
+**Structure**:
+```go
+func TestFunction_ErrorCases(t *testing.T) {
+    tests := []struct {
+        name    string
+        input   *InputType // pointer so the nil case compiles
+        wantErr bool
+        errMsg  string
+    }{
+        {
+            name:    "nil input",
+            input:   nil,
+            wantErr: true,
+            errMsg:  "input cannot be nil",
+        },
+        {
+            name:    "empty input",
+            input:   &InputType{},
+            wantErr: true,
+            errMsg:  "input cannot be empty",
+        },
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            _, err := Function(tt.input)
+
+            if (err != nil) != tt.wantErr {
+                t.Errorf("Function() error = %v, wantErr %v", err, tt.wantErr)
+                return
+            }
+
+            if tt.wantErr && !strings.Contains(err.Error(), tt.errMsg) {
+                t.Errorf("expected error containing '%s', got '%s'", tt.errMsg, err.Error())
+            }
+        })
+    }
+}
+```
+
+**When to Use**:
+- Testing validation logic
+- Boundary condition testing
+- Error recovery
+
+**Time**: ~12-15 minutes for 3-4 error cases
+
+---
+
+## Pattern 5: Test Helper Pattern
+
+**Purpose**: Reduce duplication and improve maintainability
+
+**Structure**:
+```go
+// Test helper function
+func createTestInput(t *testing.T, options ...Option) *InputType {
+    t.Helper()  // Mark as helper for better error reporting
+
+    input := &InputType{
+        Field1: "default",
+        Field2: 42,
+    }
+
+    for _, opt := range options {
+        opt(input)
+    }
+
+    return input
+}
+
+// Usage
+func TestFunction(t *testing.T) {
+    input := createTestInput(t, WithField1("custom"))
+    result, err := Function(input)
+    // ...
+}
+```
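+
+The helper above relies on an `Option` type and a `WithField1` constructor that the template does not define. One common way to supply them is the functional options idiom, sketched below. The definitions are assumptions made for illustration (including `WithField2` and `newInput`), and the `*testing.T` parameter is dropped so the sketch runs as a plain program.
+
+```go
+package main
+
+import "fmt"
+
+// InputType mirrors the fields used by the helper in the template.
+type InputType struct {
+    Field1 string
+    Field2 int
+}
+
+// Option mutates an InputType after the defaults have been applied.
+type Option func(*InputType)
+
+// WithField1 overrides Field1.
+func WithField1(v string) Option {
+    return func(in *InputType) { in.Field1 = v }
+}
+
+// WithField2 overrides Field2 (hypothetical, added for symmetry).
+func WithField2(v int) Option {
+    return func(in *InputType) { in.Field2 = v }
+}
+
+// newInput builds the default fixture and applies options in order.
+func newInput(options ...Option) *InputType {
+    input := &InputType{Field1: "default", Field2: 42}
+    for _, opt := range options {
+        opt(input)
+    }
+    return input
+}
+
+func main() {
+    fmt.Printf("%+v\n", *newInput())                     // prints "{Field1:default Field2:42}"
+    fmt.Printf("%+v\n", *newInput(WithField1("custom"))) // prints "{Field1:custom Field2:42}"
+}
+```
+
+Each test then states only the fields it cares about, while the defaults stay in one place.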
+
+**When to Use**:
+- Complex test setup
+- Repeated fixture creation
+- Test data builders
+
+**Time**: ~5 minutes to create, saves 2-3 min per test using it
+
+---
+
+## Pattern 6: Dependency Injection Pattern
+
+**Purpose**: Test components that depend on external systems
+
+**Structure**:
+```go
+// 1. Define interface
+type Executor interface {
+    Execute(args Args) (Result, error)
+}
+
+// 2. Production implementation
+type RealExecutor struct{}
+func (e *RealExecutor) Execute(args Args) (Result, error) {
+    // Real implementation goes here; the placeholder return keeps the sketch compiling
+    return Result{}, nil
+}
+
+// 3. Mock implementation
+type MockExecutor struct {
+    Results map[string]Result
+    Errors  map[string]error
+}
+
+func (m *MockExecutor) Execute(args Args) (Result, error) {
+    if err, ok := m.Errors[args.Key]; ok {
+        return Result{}, err
+    }
+    return m.Results[args.Key], nil
+}
+
+// 4. Tests use mock
+func TestProcess(t *testing.T) {
+    mock := &MockExecutor{
+        Results: map[string]Result{"key": {Value: "expected"}},
+    }
+    err := ProcessData(mock, testData)
+    // ...
+}
+```
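+
+The template leaves the consumer side implicit. The sketch below shows how a function might accept the `Executor` interface so that the mock can replace the real implementation. `ProcessAll` and the shapes of `Args` and `Result` are hypothetical stand-ins, not the project's actual `ProcessData` signature.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+)
+
+// Args and Result are assumed shapes for illustration.
+type Args struct{ Key string }
+type Result struct{ Value string }
+
+// Executor is the seam that lets tests swap in a mock.
+type Executor interface {
+    Execute(args Args) (Result, error)
+}
+
+// MockExecutor returns canned results or errors per key.
+type MockExecutor struct {
+    Results map[string]Result
+    Errors  map[string]error
+}
+
+func (m *MockExecutor) Execute(args Args) (Result, error) {
+    if err, ok := m.Errors[args.Key]; ok {
+        return Result{}, err
+    }
+    return m.Results[args.Key], nil
+}
+
+// ProcessAll depends only on the interface, never on a concrete executor.
+func ProcessAll(exec Executor, keys []string) ([]string, error) {
+    values := make([]string, 0, len(keys))
+    for _, key := range keys {
+        res, err := exec.Execute(Args{Key: key})
+        if err != nil {
+            return nil, fmt.Errorf("execute %q: %w", key, err)
+        }
+        values = append(values, res.Value)
+    }
+    return values, nil
+}
+
+func main() {
+    mock := &MockExecutor{
+        Results: map[string]Result{"key": {Value: "expected"}},
+        Errors:  map[string]error{"bad": errors.New("boom")},
+    }
+
+    values, err := ProcessAll(mock, []string{"key"})
+    fmt.Println(values, err) // prints "[expected] <nil>"
+
+    _, err = ProcessAll(mock, []string{"key", "bad"})
+    fmt.Println(err) // prints `execute "bad": boom`
+}
+```
+
+Because the error map is consulted first, a single mock can cover both the success path and the failure path in one table-driven test.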
+
+**When to Use**:
+- Testing components that execute commands
+- Testing HTTP clients
+- Testing database operations
+
+**Time**: ~20-25 minutes (includes refactoring)
+
+---
+
+## Pattern 7: CLI Command Test Pattern
+
+**Purpose**: Test Cobra command execution with flags
+
+**Structure**:
+```go
+func TestCommand(t *testing.T) {
+    // Setup: Create command
+    cmd := &cobra.Command{
+        Use: "command",
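+        RunE: func(cmd *cobra.Command, args []string) error {
+            // Command logic: write via cmd.OutOrStdout() so SetOut can capture it
+            fmt.Fprintln(cmd.OutOrStdout(), "expected output")
+            return nil
+        },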
+    }
+
+    // Setup: Add flags
+    cmd.Flags().StringP("flag", "f", "default", "description")
+
+    // Setup: Set arguments
+    cmd.SetArgs([]string{"--flag", "value"})
+
+    // Setup: Capture output
+    var buf bytes.Buffer
+    cmd.SetOut(&buf)
+
+    // Execute
+    err := cmd.Execute()
+
+    // Assert
+    if err != nil {
+        t.Fatalf("command failed: %v", err)
+    }
+
+    // Verify output
+    if !strings.Contains(buf.String(), "expected") {
+        t.Errorf("unexpected output: %s", buf.String())
+    }
+}
+```
+
+**When to Use**:
+- Testing CLI command handlers
+- Flag parsing verification
+- Command composition testing
+
+**Time**: ~12-15 minutes per test
+
+---
+
+## Pattern 8: Global Flag Test Pattern
+
+**Purpose**: Test global flag parsing and propagation
+
+**Structure**:
+```go
+func TestGlobalFlags(t *testing.T) {
+    tests := []struct {
+        name     string
+        args     []string
+        expected GlobalOptions
+    }{
+        {
+            name: "default",
+            args: []string{},
+            expected: GlobalOptions{ProjectPath: getCwd()},
+        },
+        {
+            name: "with flag",
+            args: []string{"--session", "abc"},
+            expected: GlobalOptions{SessionID: "abc"},
+        },
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            resetGlobalFlags()  // Important: reset state
+            if err := rootCmd.ParseFlags(tt.args); err != nil {
+                t.Fatalf("ParseFlags(%v) failed: %v", tt.args, err)
+            }
+            opts := getGlobalOptions()
+
+            if opts.SessionID != tt.expected.SessionID {
+                t.Errorf("SessionID = %v, expected %v", opts.SessionID, tt.expected.SessionID)
+            }
+        })
+    }
+}
+```
+
+**When to Use**:
+- Testing global flag parsing
+- Flag interaction testing
+- Option struct population
+
+**Time**: ~10-12 minutes (table-driven, high efficiency)
+
+---
+
+## Pattern Selection Decision Tree
+
+```
+What are you testing?
+├─ CLI command with flags?
+│  ├─ Multiple flag combinations? → Pattern 8 (Global Flag)
+│  ├─ Integration test needed? → Pattern 7 (CLI Command)
+│  └─ Command execution? → Pattern 7 (CLI Command)
+├─ Error paths?
+│  ├─ Multiple error scenarios? → Pattern 4 (Error Path) + Pattern 2 (Table-Driven)
+│  └─ Single error case? → Pattern 4 (Error Path)
+├─ Unit function?
+│  ├─ Multiple inputs? → Pattern 2 (Table-Driven)
+│  └─ Single input? → Pattern 1 (Unit Test)
+├─ External dependency?
+│  └─ → Pattern 6 (Dependency Injection)
+└─ Integration flow?
+   └─ → Pattern 3 (Integration Test)
+```
+
+---
+
+## Pattern Efficiency Metrics
+
+**Time per Test** (measured):
+- Unit Test (Pattern 1): ~8 min
+- Table-Driven (Pattern 2): ~12 min (3-4 scenarios)
+- Integration Test (Pattern 3): ~18 min
+- Error Path (Pattern 4): ~14 min (4 scenarios)
+- Test Helper (Pattern 5): ~5 min to create
+- Dependency Injection (Pattern 6): ~22 min (includes refactoring)
+- CLI Command (Pattern 7): ~13 min
+- Global Flag (Pattern 8): ~11 min
+
+**Coverage Impact per Test**:
+- Table-Driven: 0.20-0.30% total coverage (high impact)
+- Error Path: 0.10-0.15% total coverage
+- CLI Command: 0.15-0.25% total coverage
+- Unit Test: 0.10-0.20% total coverage
+
+**Best ROI Patterns**:
+1. Global Flag Tests (Pattern 8): High coverage, fast execution
+2. Table-Driven Tests (Pattern 2): Multiple scenarios, efficient
+3. Error Path Tests (Pattern 4): Critical coverage, systematic
+
+---
+
+**Source**: Bootstrap-002 Test Strategy Development
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated through 4 iterations
--- a/skills/testing-strategy/reference/quality-criteria.md
+++ b/skills/testing-strategy/reference/quality-criteria.md
@@ -0,0 +1,442 @@
+# Test Quality Standards
+
+**Version**: 2.0
+**Source**: Bootstrap-002 Test Strategy Development
+**Last Updated**: 2025-10-18
+
+This document defines quality criteria, coverage targets, and best practices for test development.
+
+---
+
+## Test Quality Checklist
+
+For every test, ensure compliance with these quality standards:
+
+### Structure
+
+- [ ] Test name clearly describes scenario
+- [ ] Setup is minimal and focused
+- [ ] Single concept tested per test
+- [ ] Clear error messages with context
+
+### Execution
+
+- [ ] Cleanup handled (defer, t.Cleanup)
+- [ ] No hard-coded paths or values
+- [ ] Deterministic (no randomness)
+- [ ] Fast execution (<100ms for unit tests)
+
+### Coverage
+
+- [ ] Tests both happy and error paths
+- [ ] Uses test helpers where appropriate
+- [ ] Follows documented patterns
+- [ ] Includes edge cases
+
+---
+
+## CLI Test Additional Checklist
+
+When testing CLI commands, also ensure:
+
+- [ ] Command flags reset between tests
+- [ ] Output captured properly (stdout/stderr)
+- [ ] Environment variables reset (if used)
+- [ ] Working directory restored (if changed)
+- [ ] Temporary files cleaned up
+- [ ] No dependency on external binaries (unless integration test)
+- [ ] Tests both happy path and error cases
+- [ ] Help text validated (if command has help)
+
+---
+
+## Coverage Target Goals
+
+### By Category
+
+Different code categories require different coverage levels based on criticality:
+
+| Category | Target Coverage | Priority | Rationale |
+|----------|----------------|----------|-----------|
+| Error Handling | 80-90% | P1 | Critical for reliability |
+| Business Logic | 75-85% | P2 | Core functionality |
+| CLI Handlers | 70-80% | P2 | User-facing behavior |
+| Integration | 70-80% | P3 | End-to-end validation |
+| Utilities | 60-70% | P3 | Supporting functions |
+| Infrastructure | 40-60% | P4 | Best effort |
+
+**Overall Project Target**: 75-80%
+
+### Priority Decision Tree
+
+```
+Is function critical to core functionality?
+├─ YES: Is it error handling or validation?
+│  ├─ YES: Priority 1 (80%+ coverage target)
+│  └─ NO: Is it business logic?
+│     ├─ YES: Priority 2 (75%+ coverage)
+│     └─ NO: Priority 3 (60%+ coverage)
+└─ NO: Is it infrastructure/initialization?
+   ├─ YES: Priority 4 (test if easy, skip if hard)
+   └─ NO: Priority 5 (skip)
+```
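+
+The decision tree translates directly into a small function, which can be handy when tagging functions in a coverage report. The sketch below is one possible encoding; the `FunctionTraits` struct and its field names are hypothetical.
+
+```go
+package main
+
+import "fmt"
+
+// FunctionTraits captures the questions asked by the decision tree.
+type FunctionTraits struct {
+    Critical       bool // critical to core functionality
+    ErrorHandling  bool // error handling or validation
+    BusinessLogic  bool
+    Infrastructure bool // infrastructure or initialization
+}
+
+// priority walks the tree top to bottom and returns 1 (highest) to 5 (skip).
+func priority(f FunctionTraits) int {
+    if f.Critical {
+        switch {
+        case f.ErrorHandling:
+            return 1 // 80%+ coverage target
+        case f.BusinessLogic:
+            return 2 // 75%+ coverage
+        default:
+            return 3 // 60%+ coverage
+        }
+    }
+    if f.Infrastructure {
+        return 4 // test if easy, skip if hard
+    }
+    return 5 // skip
+}
+
+func main() {
+    fmt.Println(priority(FunctionTraits{Critical: true, ErrorHandling: true})) // prints "1"
+    fmt.Println(priority(FunctionTraits{Critical: true, BusinessLogic: true})) // prints "2"
+    fmt.Println(priority(FunctionTraits{Infrastructure: true}))                // prints "4"
+}
+```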
+
+---
+
+## Test Naming Conventions
+
+### Unit Tests
+
+```go
+// Format: TestFunctionName_Scenario
+TestValidateInput_NilInput
+TestValidateInput_EmptyInput
+TestProcessData_ValidFormat
+```
+
+### Table-Driven Tests
+
+```go
+// Format: TestFunctionName (scenarios in table)
+TestValidateInput  // Table contains: "nil input", "empty input", etc.
+TestProcessData    // Table contains: "valid format", "invalid format", etc.
+```
+
+### Integration Tests
+
+```go
+// Format: TestHandler_Scenario or TestIntegration_Feature
+TestQueryTools_SuccessfulQuery
+TestGetSessionStats_ErrorHandling
+TestIntegration_CompleteWorkflow
+```
+
+---
+
+## Test Structure Best Practices
+
+### Setup-Execute-Assert Pattern
+
+```go
+func TestFunction(t *testing.T) {
+    // Setup: Create test data and dependencies
+    input := createTestInput(t)
+    mock := createMockDependency()
+
+    // Execute: Call the function under test
+    result, err := Function(input, mock)
+
+    // Assert: Verify expected behavior
+    if err != nil {
+        t.Fatalf("unexpected error: %v", err)
+    }
+    if result != expected {
+        t.Errorf("expected %v, got %v", expected, result)
+    }
+}
+```
+
+### Cleanup Handling
+
+```go
+func TestFunction(t *testing.T) {
+    // Using defer for cleanup
+    originalValue := globalVar
+    defer func() { globalVar = originalValue }()
+
+    // Or using t.Cleanup (preferred)
+    t.Cleanup(func() {
+        globalVar = originalValue
+    })
+
+    // Test logic...
+}
+```
+
+### Helper Functions
+
+```go
+// Mark as helper for better error reporting
+func createTestInput(t *testing.T) *Input {
+    t.Helper()  // Errors will point to caller, not this line
+
+    return &Input{
+        Field1: "test",
+        Field2: 42,
+    }
+}
+```
+
+---
+
+## Error Message Guidelines
+
+### Good Error Messages
+
+```go
+// Include context and actual values
+if result != expected {
+    t.Errorf("Function() = %v, expected %v", result, expected)
+}
+
+// Include relevant state
+if len(results) != expectedCount {
+    t.Errorf("got %d results, expected %d: %+v",
+        len(results), expectedCount, results)
+}
+```
+
+### Poor Error Messages
+
+```go
+// Avoid: No context
+if err != nil {
+    t.Fatal("error occurred")
+}
+
+// Avoid: Missing actual values
+if !valid {
+    t.Error("validation failed")
+}
+```
+
+---
+
+## Test Performance Standards
+
+### Unit Tests
+
+- **Target**: <100ms per test
+- **Maximum**: <500ms per test
+- **If slower**: Consider mocking or refactoring
+
+### Integration Tests
+
+- **Target**: <1s per test
+- **Maximum**: <5s per test
+- **If slower**: Use `testing.Short()` to skip in short mode
+
+```go
+func TestIntegration_SlowOperation(t *testing.T) {
+    if testing.Short() {
+        t.Skip("skipping slow integration test in short mode")
+    }
+    // Test logic...
+}
+```
+
+### Running Tests
+
+```bash
+# Fast tests only
+go test -short ./...
+
+# All tests with timeout
+go test -timeout 5m ./...
+```
+
+---
+
+## Test Data Management
+
+### Inline Test Data
+
+For small, simple data:
+
+```go
+tests := []struct {
+    name  string
+    input string
+}{
+    {"empty", ""},
+    {"single", "a"},
+    {"multiple", "abc"},
+}
+```
+
+### Fixture Files
+
+For complex data structures:
+
+```go
+func loadTestFixture(t *testing.T, name string) []byte {
+    t.Helper()
+    data, err := os.ReadFile(filepath.Join("testdata", name))
+    if err != nil {
+        t.Fatalf("failed to load fixture %s: %v", name, err)
+    }
+    return data
+}
+```
+
+### Golden Files
+
+For output validation:
+
+```go
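+// update is set with `go test -update` to rewrite golden files
+var update = flag.Bool("update", false, "update golden files")
+
+func TestFormatOutput(t *testing.T) {
+    output := formatOutput(testData)
+
+    goldenPath := filepath.Join("testdata", "expected_output.golden")
+
+    if *update {
+        if err := os.WriteFile(goldenPath, []byte(output), 0644); err != nil {
+            t.Fatalf("failed to update golden file: %v", err)
+        }
+    }
+
+    expected, err := os.ReadFile(goldenPath)
+    if err != nil {
+        t.Fatalf("failed to read golden file: %v", err)
+    }
+    if string(expected) != output {
+        t.Errorf("output mismatch\ngot:\n%s\nwant:\n%s", output, expected)
+    }
+}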
+```
+
+---
+
+## Common Anti-Patterns to Avoid
+
+### 1. Testing Implementation Instead of Behavior
+
+```go
+// Bad: Tests internal implementation
+func TestFunction(t *testing.T) {
+    obj := New()
+    if obj.internalField != "expected" {  // Don't test internals
+        t.Error("internal field wrong")
+    }
+}
+
+// Good: Tests observable behavior
+func TestFunction(t *testing.T) {
+    obj := New()
+    result := obj.PublicMethod()  // Test public interface
+    if result != expected {
+        t.Errorf("PublicMethod() = %v, expected %v", result, expected)
+    }
+}
+```
+
+### 2. Overly Complex Test Setup
+
+```go
+// Bad: Complex setup obscures test intent
+func TestFunction(t *testing.T) {
+    // 50 lines of setup...
+    result := Function(complex, setup, params)
+    // Assert...
+}
+
+// Good: Use helper functions
+func TestFunction(t *testing.T) {
+    setup := createTestSetup(t)  // Helper abstracts complexity
+    result := Function(setup)
+    // Assert...
+}
+```
+
+### 3. Testing Multiple Concepts in One Test
+
+```go
+// Bad: Tests multiple unrelated things
+func TestValidation(t *testing.T) {
+    // Tests format validation
+    // Tests length validation
+    // Tests encoding validation
+    // Tests error handling
+}
+
+// Good: Separate tests for each concept
+func TestValidation_Format(t *testing.T) { /*...*/ }
+func TestValidation_Length(t *testing.T) { /*...*/ }
+func TestValidation_Encoding(t *testing.T) { /*...*/ }
+func TestValidation_ErrorHandling(t *testing.T) { /*...*/ }
+```
+
+### 4. Shared State Between Tests
+
+```go
+// Bad: Tests depend on execution order
+var sharedState string
+
+func TestFirst(t *testing.T) {
+    sharedState = "initialized"
+}
+
+func TestSecond(t *testing.T) {
+    // Breaks if TestFirst doesn't run first
+    if sharedState != "initialized" { /*...*/ }
+}
+
+// Good: Each test is independent
+func TestFirst(t *testing.T) {
+    state := "initialized"  // Local state
+    // Test...
+}
+
+func TestSecond(t *testing.T) {
+    state := setupState()  // Creates own state
+    // Test...
+}
+```
+
+---
+
+## Code Review Checklist for Tests
+
+When reviewing test code, verify:
+
+- [ ] Tests are independent (can run in any order)
+- [ ] Test names are descriptive
+- [ ] Happy path and error paths both covered
+- [ ] Edge cases included
+- [ ] No magic numbers or strings (use constants)
+- [ ] Cleanup handled properly
+- [ ] Error messages provide context
+- [ ] Tests are reasonably fast
+- [ ] No commented-out test code
+- [ ] Follows established patterns in codebase
+
+---
+
+## Continuous Improvement
+
+### Track Test Metrics
+
+Record for each test batch:
+
+```
+Date: 2025-10-18
+Batch: Validation error paths (4 tests)
+Pattern: Error Path + Table-Driven
+Time: 50 min (estimated 60 min) → 17% faster
+Coverage: internal/validation 57.9% → 75.2% (+17.3%)
+Total coverage: 72.3% → 73.5% (+1.2%)
+Efficiency: 0.3% per test
+Issues: None
+Lessons: Table-driven error tests very efficient
+```
+
+### Regular Coverage Analysis
+
+```bash
+# Weekly coverage review
+go test -coverprofile=coverage.out ./...
+go tool cover -func=coverage.out | tail -20
+
+# Identify degradation
+diff coverage-last-week.txt coverage-this-week.txt
+```
+
+### Test Suite Health
+
+Monitor:
+- Total test count (growing)
+- Test execution time (stable or decreasing)
+- Coverage percentage (stable or increasing)
+- Flaky test rate (near zero)
+- Test maintenance time (decreasing)
+
+---
+
+**Source**: Bootstrap-002 Test Strategy Development
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated through 4 iterations
--- a/skills/testing-strategy/reference/tdd-workflow.md
+++ b/skills/testing-strategy/reference/tdd-workflow.md
@@ -0,0 +1,545 @@
+# TDD Workflow and Coverage-Driven Development
+
+**Version**: 2.0
+**Source**: Bootstrap-002 Test Strategy Development
+**Last Updated**: 2025-10-18
+
+This document describes the Test-Driven Development (TDD) workflow and coverage-driven testing approach.
+
+---
+
+## Coverage-Driven Workflow
+
+### Step 1: Generate Coverage Report
+
+```bash
+go test -coverprofile=coverage.out ./...
+go tool cover -func=coverage.out > coverage-by-func.txt
+```
+
+### Step 2: Identify Gaps
+
+**Option A: Use automation tool**
+```bash
+./scripts/analyze-coverage-gaps.sh coverage.out --top 15
+```
+
+**Option B: Manual analysis**
+```bash
+# Find low-coverage functions (below 60%); "+0" forces a numeric comparison despite the "%" suffix
+go tool cover -func=coverage.out | grep "^github.com" | awk '$NF+0 < 60.0'
+
+# Find zero-coverage functions
+go tool cover -func=coverage.out | grep -E '[[:space:]]0\.0%$'
+```
+
+### Step 3: Prioritize Targets
+
+**Decision Tree**:
+```
+Is function critical to core functionality?
+├─ YES: Is it error handling or validation?
+│  ├─ YES: Priority 1 (80%+ coverage target)
+│  └─ NO: Is it business logic?
+│     ├─ YES: Priority 2 (75%+ coverage)
+│     └─ NO: Priority 3 (60%+ coverage)
+└─ NO: Is it infrastructure/initialization?
+   ├─ YES: Priority 4 (test if easy, skip if hard)
+   └─ NO: Priority 5 (skip)
+```
+
+**Priority Matrix**:
+| Category | Target Coverage | Priority | Time/Test |
+|----------|----------------|----------|-----------|
+| Error Handling | 80-90% | P1 | 15 min |
+| Business Logic | 75-85% | P2 | 12 min |
+| CLI Handlers | 70-80% | P2 | 12 min |
+| Integration | 70-80% | P3 | 20 min |
+| Utilities | 60-70% | P3 | 8 min |
+| Infrastructure | Best effort | P4 | 25 min |
+
+### Step 4: Select Pattern
+
+**Pattern Selection Decision Tree**:
+```
+What are you testing?
+├─ CLI command with flags?
+│  ├─ Multiple flag combinations? → Pattern 8 (Global Flag)
+│  ├─ Integration test needed? → Pattern 7 (CLI Command)
+│  └─ Command execution? → Pattern 7 (CLI Command)
+├─ Error paths?
+│  ├─ Multiple error scenarios? → Pattern 4 (Error Path) + Pattern 2 (Table-Driven)
+│  └─ Single error case? → Pattern 4 (Error Path)
+├─ Unit function?
+│  ├─ Multiple inputs? → Pattern 2 (Table-Driven)
+│  └─ Single input? → Pattern 1 (Unit Test)
+├─ External dependency?
+│  └─ → Pattern 6 (Dependency Injection)
+└─ Integration flow?
+   └─ → Pattern 3 (Integration Test)
+```
+
+### Step 5: Generate Test
+
+**Option A: Use automation tool**
+```bash
+./scripts/generate-test.sh FunctionName --pattern PATTERN --scenarios N
+```
+
+**Option B: Manual from template**
+- Copy pattern template from patterns.md
+- Adapt to function signature
+- Fill in test data
+
+### Step 6: Implement Test
+
+1. Fill in TODO comments
+2. Add test data (inputs, expected outputs)
+3. Customize assertions
+4. Add edge cases
+
+### Step 7: Verify Coverage Impact
+
+```bash
+# Run tests
+go test ./package/...
+
+# Generate new coverage
+go test -coverprofile=new_coverage.out ./...
+
+# Compare
+echo "Old coverage:"
+go tool cover -func=coverage.out | tail -1
+
+echo "New coverage:"
+go tool cover -func=new_coverage.out | tail -1
+
+# Show functions whose coverage changed (new values only)
+diff <(go tool cover -func=coverage.out) <(go tool cover -func=new_coverage.out) | grep "^>"
+```
+
+### Step 8: Track Metrics
+
+**Per Test Batch**:
+- Pattern(s) used
+- Time spent (actual)
+- Coverage increase achieved
+- Issues encountered
+
+**Example Log**:
+```
+Date: 2025-10-18
+Batch: Validation error paths (4 tests)
+Pattern: Error Path + Table-Driven
+Time: 50 min (estimated 60 min) → 17% faster
+Coverage: internal/validation 57.9% → 75.2% (+17.3%)
+Total coverage: 72.3% → 73.5% (+1.2%)
+Efficiency: 0.3% per test
+Issues: None
+Lessons: Table-driven error tests very efficient
+```
+
+---
+
+## Red-Green-Refactor TDD Cycle
+
+### Overview
+
+The classic TDD cycle consists of three phases:
+
+1. **Red**: Write a failing test
+2. **Green**: Write minimal code to make it pass
+3. **Refactor**: Improve code while keeping tests green
+
+### Phase 1: Red (Write Failing Test)
+
+**Goal**: Define expected behavior through a test that fails
+
+```go
+func TestValidateEmail_ValidFormat(t *testing.T) {
+    // Write test BEFORE implementation exists
+    email := "user@example.com"
+
+    err := ValidateEmail(email)  // Function doesn't exist yet
+
+    if err != nil {
+        t.Errorf("ValidateEmail(%s) returned error: %v", email, err)
+    }
+}
+```
+
+**Run test**:
+```bash
+$ go test ./...
+# Compilation error: ValidateEmail undefined
+```
+
+**Checklist for Red Phase**:
+- [ ] Test clearly describes expected behavior
+- [ ] Test compiles (stub function if needed)
+- [ ] Test fails for the right reason
+- [ ] Failure message is clear
+
+### Phase 2: Green (Make It Pass)
+
+**Goal**: Write simplest possible code to make test pass
+
+```go
+func ValidateEmail(email string) error {
+    // Minimal implementation
+    if !strings.Contains(email, "@") {
+        return fmt.Errorf("invalid email: missing @")
+    }
+    return nil
+}
+```
+
+**Run test**:
+```bash
+$ go test ./...
+PASS
+```
+
+**Checklist for Green Phase**:
+- [ ] Test passes
+- [ ] Implementation is minimal (no over-engineering)
+- [ ] No premature optimization
+- [ ] All existing tests still pass
+
+### Phase 3: Refactor (Improve Code)
+
+**Goal**: Improve code quality without changing behavior
+
+```go
+// Compile once at package level instead of on every call
+var emailRegex = regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)
+
+func ValidateEmail(email string) error {
+    // Refactor: Use regex for proper validation
+    if !emailRegex.MatchString(email) {
+        return fmt.Errorf("invalid email format: %s", email)
+    }
+    return nil
+}
+```
+
+**Run tests**:
+```bash
+$ go test ./...
+PASS  # All tests still pass after refactoring
+```
+
+**Checklist for Refactor Phase**:
+- [ ] Code is more readable
+- [ ] Duplication eliminated
+- [ ] All tests still pass
+- [ ] No new functionality added
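+
+One way to check the "No new functionality added" item is to compare the old and new implementations on a sample of inputs. The sketch below is illustrative only: `validContains`, `validRegex`, and `BehaviorDiffs` are hypothetical helpers, with the two validators reduced to a `bool` result so they can be compared directly.
+
+```go
+package main
+
+import (
+    "regexp"
+    "strings"
+)
+
+// validContains mirrors the Green-phase check (hypothetical helper).
+func validContains(email string) bool { return strings.Contains(email, "@") }
+
+var emailRegex = regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)
+
+// validRegex mirrors the regex-based check (hypothetical helper).
+func validRegex(email string) bool { return emailRegex.MatchString(email) }
+
+// BehaviorDiffs returns the inputs on which two implementations disagree.
+// An empty result over a representative sample suggests, but does not prove,
+// that the change is a pure refactor.
+func BehaviorDiffs(before, after func(string) bool, inputs []string) []string {
+    var diffs []string
+    for _, in := range inputs {
+        if before(in) != after(in) {
+            diffs = append(diffs, in)
+        }
+    }
+    return diffs
+}
+```
+
+For the sample `"user@example.com"`, `""`, `"userexample.com"`, `"user@"`, `"@example.com"`, the two validators disagree on `"user@"` and `"@example.com"`. That means moving from the `@` check to the regex is a behavior change, which is better driven by its own failing tests than treated as a refactor.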
+
+---
+
+## TDD for New Features
+
+### Example: Add Email Validation Feature
+
+**Iteration 1: Basic Structure**
+
+1. **Red**: Test for valid email
+```go
+func TestValidateEmail_ValidFormat(t *testing.T) {
+    err := ValidateEmail("user@example.com")
+    if err != nil {
+        t.Errorf("valid email rejected: %v", err)
+    }
+}
+```
+
+2. **Green**: Minimal implementation
+```go
+func ValidateEmail(email string) error {
+    if !strings.Contains(email, "@") {
+        return fmt.Errorf("invalid email")
+    }
+    return nil
+}
+```
+
+3. **Refactor**: Extract constant
+```go
+const emailPattern = "@"
+
+func ValidateEmail(email string) error {
+    if !strings.Contains(email, emailPattern) {
+        return fmt.Errorf("invalid email")
+    }
+    return nil
+}
+```
+
+**Iteration 2: Add Edge Cases**
+
+1. **Red**: Test for empty email
+```go
+func TestValidateEmail_Empty(t *testing.T) {
+    err := ValidateEmail("")
+    if err == nil {
+        t.Error("empty email should be invalid")
+    }
+}
+```
+
+2. **Green**: Add empty check
+```go
+func ValidateEmail(email string) error {
+    if email == "" {
+        return fmt.Errorf("email cannot be empty")
+    }
+    if !strings.Contains(email, "@") {
+        return fmt.Errorf("invalid email")
+    }
+    return nil
+}
+```
+
+3. **Refactor**: Use regex
+```go
+var emailRegex = regexp.MustCompile(`^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$`)
+
+func ValidateEmail(email string) error {
+    if email == "" {
+        return fmt.Errorf("email cannot be empty")
+    }
+    if !emailRegex.MatchString(email) {
+        return fmt.Errorf("invalid email format")
+    }
+    return nil
+}
+```
+
+**Iteration 3: Add More Cases**
+
+Convert to table-driven test:
+
+```go
+func TestValidateEmail(t *testing.T) {
+    tests := []struct {
+        name    string
+        email   string
+        wantErr bool
+    }{
+        {"valid", "user@example.com", false},
+        {"empty", "", true},
+        {"no @", "userexample.com", true},
+        {"no domain", "user@", true},
+        {"no user", "@example.com", true},
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            err := ValidateEmail(tt.email)
+            if (err != nil) != tt.wantErr {
+                t.Errorf("ValidateEmail(%q) error = %v, wantErr %v",
+                    tt.email, err, tt.wantErr)
+            }
+        })
+    }
+}
+```
+
+---
+
+## TDD for Bug Fixes
+
+### Workflow
+
+1. **Reproduce bug with test** (Red)
+2. **Fix bug** (Green)
+3. **Refactor if needed** (Refactor)
+4. **Verify bug doesn't regress** (Test stays green)
+
+### Example: Fix Nil Pointer Bug
+
+**Step 1: Write failing test that reproduces bug**
+
+```go
+func TestProcessData_NilInput(t *testing.T) {
+    // This currently crashes with nil pointer
+    _, err := ProcessData(nil)
+
+    if err == nil {
+        t.Error("ProcessData(nil) should return error, not crash")
+    }
+}
+```
+
+**Run test**:
+```bash
+$ go test ./...
+panic: runtime error: invalid memory address or nil pointer dereference
+FAIL
+```
+
+**Step 2: Fix the bug**
+
+```go
+func ProcessData(input *Input) (Result, error) {
+    // Add nil check
+    if input == nil {
+        return Result{}, fmt.Errorf("input cannot be nil")
+    }
+
+    // Original logic...
+    return result, nil
+}
+```
+
+**Run test**:
+```bash
+$ go test ./...
+PASS
+```
+
+**Step 3: Add more edge cases**
+
+```go
+func TestProcessData_ErrorCases(t *testing.T) {
+    tests := []struct {
+        name    string
+        input   *Input
+        wantErr bool
+        errMsg  string
+    }{
+        {
+            name:    "nil input",
+            input:   nil,
+            wantErr: true,
+            errMsg:  "cannot be nil",
+        },
+        {
+            name:    "empty input",
+            input:   &Input{},
+            wantErr: true,
+            errMsg:  "empty",
+        },
+    }
+
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            _, err := ProcessData(tt.input)
+
+            // Fatalf stops this subtest, so err is never nil in the check below.
+            if (err != nil) != tt.wantErr {
+                t.Fatalf("ProcessData() error = %v, wantErr %v", err, tt.wantErr)
+            }
+
+            if tt.wantErr && !strings.Contains(err.Error(), tt.errMsg) {
+                t.Errorf("expected error containing %q, got %q", tt.errMsg, err.Error())
+            }
+        })
+    }
+}
+```
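+
+Matching on error message substrings, as above, can break when wording changes. A possible alternative is sentinel errors. The sketch below is an assumption-laden illustration: `Input`, `Result`, `ErrNilInput`, and `ErrEmptyInput` are hypothetical, since the real types are not shown in this document.
+
+```go
+package main
+
+import (
+    "errors"
+    "fmt"
+)
+
+// Input and Result are hypothetical stand-ins for the real types.
+type Input struct{ Data string }
+type Result struct{ Length int }
+
+// Sentinel errors (hypothetical names) let tests match on identity
+// instead of on message text.
+var (
+    ErrNilInput   = errors.New("input cannot be nil")
+    ErrEmptyInput = errors.New("input is empty")
+)
+
+// ProcessData rejects nil and empty input before doing any work.
+func ProcessData(input *Input) (Result, error) {
+    if input == nil {
+        return Result{}, ErrNilInput
+    }
+    if input.Data == "" {
+        // %w wraps the sentinel so errors.Is still matches it.
+        return Result{}, fmt.Errorf("process data: %w", ErrEmptyInput)
+    }
+    return Result{Length: len(input.Data)}, nil
+}
+```
+
+In a test, `errors.Is(err, ErrEmptyInput)` would then replace the `strings.Contains` check, and the table's `errMsg string` column would become a `wantErr error` column.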
+
+---
+
+## Integration with Coverage-Driven Development
+
+TDD and coverage-driven approaches complement each other:
+
+### Pure TDD (New Feature Development)
+
+**When**: Building new features from scratch
+
+**Workflow**: Red → Green → Refactor (repeat)
+
+**Focus**: Design through tests, emergent architecture
+
+### Coverage-Driven (Existing Codebase)
+
+**When**: Improving test coverage of existing code
+
+**Workflow**: Analyze coverage → Prioritize → Write tests → Verify
+
+**Focus**: Systematic gap closure, efficiency
+
+### Hybrid Approach (Recommended)
+
+**For new features**:
+1. Use TDD to drive design
+2. Track coverage as you go
+3. Use coverage tools to identify blind spots
+
+**For existing code**:
+1. Use coverage-driven to systematically add tests
+2. Apply TDD for any refactoring
+3. Apply TDD for bug fixes
+
+---
+
+## Best Practices
+
+### Do's
+
+- ✅ Write test before code (for new features)
+- ✅ Keep Red phase short (minutes, not hours)
+- ✅ Make smallest possible change to get to Green
+- ✅ Refactor frequently
+- ✅ Run all tests after each change
+- ✅ Commit after each successful Red-Green-Refactor cycle
+
+### Don'ts
+
+- ❌ Skip the Red phase (writing tests for existing working code is not TDD)
+- ❌ Write multiple tests before making them pass
+- ❌ Write too much code in Green phase
+- ❌ Refactor while tests are red
+- ❌ Skip Refactor phase
+- ❌ Ignore test failures
+
+---
+
+## Common Challenges
+
+### Challenge 1: Test Takes Too Long to Write
+
+**Symptom**: Spending 30+ minutes on a single test
+
+**Causes**:
+- Testing too much at once
+- Complex setup required
+- Unclear requirements
+
+**Solutions**:
+- Break into smaller tests
+- Create test helpers for setup
+- Clarify requirements before writing test
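+
+As an illustration of "test helpers for setup", a fixture builder with sensible defaults keeps each test focused on the one field it cares about. This is a sketch under assumptions: `Order`, `NewTestOrder`, and the option functions are hypothetical names, not part of this project.
+
+```go
+package main
+
+// Order is a hypothetical type standing in for anything with complex setup.
+type Order struct {
+    ID       string
+    Customer string
+    Items    []string
+    Paid     bool
+}
+
+// OrderOption overrides one aspect of the default fixture.
+type OrderOption func(*Order)
+
+// WithItems replaces the default item list.
+func WithItems(items ...string) OrderOption {
+    return func(o *Order) { o.Items = items }
+}
+
+// Paid marks the fixture as already paid.
+func Paid() OrderOption {
+    return func(o *Order) { o.Paid = true }
+}
+
+// NewTestOrder returns a valid default Order and then applies the overrides,
+// so a test states only what differs from the default.
+func NewTestOrder(opts ...OrderOption) *Order {
+    o := &Order{ID: "order-1", Customer: "test-customer", Items: []string{"widget"}}
+    for _, opt := range opts {
+        opt(o)
+    }
+    return o
+}
+```
+
+A test that only cares about payment state would call `NewTestOrder(Paid())` and ignore every other field.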
+
+### Challenge 2: Can't Make Test Pass Without Large Changes
+
+**Symptom**: Green phase requires extensive code changes
+
+**Causes**:
+- Test is too ambitious
+- Existing code not designed for testability
+- Missing intermediate steps
+
+**Solutions**:
+- Write smaller test
+- Refactor existing code first (with existing tests passing)
+- Add intermediate tests to build up gradually
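+
+When existing code is not designed for testability, a common first refactoring is to introduce a seam: depend on a small interface instead of a concrete dependency. The sketch below uses hypothetical names throughout (`UserStore`, `Greeter`, `mapStore`) to show the shape of such a seam.
+
+```go
+package main
+
+// UserStore is the seam: production code can back it with a database,
+// tests can back it with an in-memory fake. Hypothetical interface.
+type UserStore interface {
+    Lookup(id string) (name string, ok bool)
+}
+
+// Greeter depends on the interface rather than on a concrete store.
+type Greeter struct {
+    Store UserStore
+}
+
+// Greet returns a personalized greeting, falling back to "guest"
+// when the user is unknown.
+func (g Greeter) Greet(id string) string {
+    name, ok := g.Store.Lookup(id)
+    if !ok {
+        return "Hello, guest"
+    }
+    return "Hello, " + name
+}
+
+// mapStore is an in-memory fake intended for tests.
+type mapStore map[string]string
+
+func (m mapStore) Lookup(id string) (string, bool) {
+    name, ok := m[id]
+    return name, ok
+}
+```
+
+Once the seam exists, the next Red test can be small: construct `Greeter{Store: mapStore{...}}` and assert on one greeting, with no external setup.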
+
+### Challenge 3: Tests Pass But Coverage Doesn't Improve
+
+**Symptom**: Writing tests but coverage metrics don't increase
+
+**Causes**:
+- Testing already-covered code paths
+- Tests not exercising target functions
+- Indirect coverage already exists
+
+**Solutions**:
+- Check per-function coverage: `go tool cover -func=coverage.out`
+- Focus on 0% coverage functions
+- Use coverage tools to identify true gaps
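+
+The per-function check can also be scripted. The following sketch filters the text printed by `go tool cover -func=coverage.out`, assuming each row has three whitespace-separated columns (location, function name, percentage) and that the summary row starts with `total:`; verify this against your Go version. `FuncCoverage` and `LowCoverage` are hypothetical names.
+
+```go
+package main
+
+import (
+    "strconv"
+    "strings"
+)
+
+// FuncCoverage is one parsed row of per-function coverage output.
+type FuncCoverage struct {
+    Location string  // e.g. "example.com/app/validate.go:10:"
+    Name     string  // function name
+    Percent  float64 // statement coverage in percent
+}
+
+// LowCoverage returns the functions whose coverage is below threshold
+// (in percent). Rows that do not have three columns, and the "total:"
+// summary row, are skipped.
+func LowCoverage(output string, threshold float64) ([]FuncCoverage, error) {
+    var low []FuncCoverage
+    for _, line := range strings.Split(output, "\n") {
+        fields := strings.Fields(line)
+        if len(fields) != 3 || fields[0] == "total:" {
+            continue
+        }
+        pct, err := strconv.ParseFloat(strings.TrimSuffix(fields[2], "%"), 64)
+        if err != nil {
+            return nil, err
+        }
+        if pct < threshold {
+            low = append(low, FuncCoverage{Location: fields[0], Name: fields[1], Percent: pct})
+        }
+    }
+    return low, nil
+}
+```
+
+Calling it with a low threshold (for example `LowCoverage(out, 1)`) isolates the 0% functions that the second solution above recommends focusing on.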
+
+---
+
+**Source**: Bootstrap-002 Test Strategy Development
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+**Status**: Production-ready, validated through 4 iterations