Initial commit

2025-11-30 09:07:22 +08:00
commit fab98d059b
179 changed files with 46209 additions and 0 deletions
--- a/skills/code-refactoring/templates/incremental-commit-protocol.md
+++ b/skills/code-refactoring/templates/incremental-commit-protocol.md
@@ -0,0 +1,589 @@
+# Incremental Commit Protocol
+
+**Purpose**: Ensure clean, revertible git history through disciplined incremental commits
+
+**When to Use**: During ALL refactoring work
+
+**Origin**: Iteration 1 - Problem E3 (No Incremental Commit Discipline)
+
+---
+
+## Core Principle
+
+**Every refactoring step = One commit with passing tests**
+
+**Benefits**:
+- **Rollback**: Can revert any single change easily
+- **Review**: Small commits easier to review
+- **Bisect**: Can use `git bisect` to find which change introduced an issue
+- **Collaboration**: Easy to cherry-pick or rebase individual changes
+- **Safety**: Never have large uncommitted work at risk of loss
+
+---
+
+## Commit Frequency Rule
+
+**COMMIT AFTER**:
+- Every refactoring step (Extract Method, Rename, Simplify Conditional)
+- Every test addition
+- Every passing test run after code change
+- Approximately every 5-10 minutes of work
+- Before taking a break or switching context
+
+**DO NOT COMMIT**:
+- While tests are failing (except for WIP commits on feature branches)
+- Large batches of changes (>200 lines in single commit)
+- Multiple unrelated changes together
+
+---
+
+## Commit Message Convention
+
+### Format
+
+```
+<type>(<scope>): <subject>
+
+[optional body]
+
+[optional footer]
+```
+
+### Types for Refactoring
+
+| Type | When to Use | Example |
+|------|-------------|---------|
+| `refactor` | Restructuring code without behavior change | `refactor(sequences): extract collectTimestamps helper` |
+| `test` | Adding or modifying tests | `test(sequences): add edge cases for calculateTimeSpan` |
+| `docs` | Adding/updating GoDoc comments | `docs(sequences): document calculateTimeSpan parameters` |
+| `style` | Formatting, naming (no logic change) | `style(sequences): rename ts to timestamp` |
+| `perf` | Performance improvement | `perf(sequences): optimize timestamp collection loop` |
+
+### Scope
+
+**Use package or file name**:
+- `sequences` (for internal/query/sequences.go)
+- `context` (for internal/query/context.go)
+- `file_access` (for internal/query/file_access.go)
+- `query` (for changes across multiple files in package)
+
+### Subject Line Rules
+
+**Format**: `<verb> <what> [<pattern>]`
+
+**Verbs**:
+- `extract`: Extract Method pattern
+- `inline`: Inline Method pattern
+- `simplify`: Simplify Conditionals pattern
+- `rename`: Rename pattern
+- `move`: Move Method/Field pattern
+- `add`: Add tests, documentation
+- `remove`: Remove dead code, duplication
+- `update`: Update existing code/tests
+
+**Examples**:
+- ✅ `refactor(sequences): extract collectTimestamps helper`
+- ✅ `refactor(sequences): simplify timestamp filtering logic`
+- ✅ `refactor(sequences): rename ts to timestamp for clarity`
+- ✅ `test(sequences): add edge cases for empty occurrences`
+- ✅ `docs(sequences): document calculateSequenceTimeSpan return value`
+
+**Avoid**:
+- ❌ `fix bugs` (vague, no scope)
+- ❌ `refactor calculateSequenceTimeSpan` (no scope, unclear what changed)
+- ❌ `WIP` (not descriptive, avoid on main branch)
+- ❌ `refactor: various changes` (not specific)
+
+### Body (Optional but Recommended)
+
+**When to add body**:
+- Change is not obvious from subject
+- Multiple related changes in one commit
+- Need to explain WHY (not WHAT)
+
+**Example**:
+```
+refactor(sequences): extract collectTimestamps helper
+
+Reduces complexity of calculateSequenceTimeSpan from 10 to 7.
+Extracted timestamp collection logic to dedicated helper for clarity.
+All tests pass, coverage maintained at 85%.
+```
+
+### Footer (For Tracking)
+
+**Pattern**: `Pattern: <pattern-name>`
+
+**Examples**:
+```
+refactor(sequences): extract collectTimestamps helper
+
+Pattern: Extract Method
+```
+
+```
+test(sequences): add edge cases for calculateTimeSpan
+
+Pattern: Characterization Tests
+```
+
+---
+
+## Commit Workflow (Step-by-Step)
+
+### Before Starting Refactoring
+
+**1. Ensure Clean Baseline**
+
+```bash
+git status
+```
+
+**Checklist**:
+- [ ] No uncommitted changes: `nothing to commit, working tree clean`
+- [ ] If dirty: Stash or commit before starting: `git stash` or `git commit`
+
+**2. Create Refactoring Branch** (optional but recommended)
+
+```bash
+git checkout -b refactor/calculate-sequence-timespan
+```
+
+**Checklist**:
+- [ ] Branch created: `refactor/<descriptive-name>`
+- [ ] On correct branch: `git branch` shows current branch
+
+---
+
+### During Refactoring (Per Step)
+
+**For Each Refactoring Step**:
+
+#### 1. Make Single Change
+
+- Focused, minimal change (e.g., extract one helper method)
+- No unrelated changes in same commit
+
+#### 2. Run Tests
+
+```bash
+go test ./internal/query/... -v
+```
+
+**Checklist**:
+- [ ] All tests pass: PASS / FAIL
+- [ ] If FAIL: Fix issue before committing
+
+#### 3. Stage Changes
+
+```bash
+git add internal/query/sequences.go internal/query/sequences_test.go
+```
+
+**Checklist**:
+- [ ] Only relevant files staged: `git status` shows green files
+- [ ] No unintended files: Review `git diff --cached`
+
+**Review Staged Changes**:
+```bash
+git diff --cached
+```
+
+**Verify**:
+- [ ] Changes are what you intended
+- [ ] No debug code, commented code, or temporary changes
+- [ ] No unrelated changes sneaked in
+
+#### 4. Commit with Descriptive Message
+
+```bash
+git commit -m "refactor(sequences): extract collectTimestamps helper"
+```
+
+**Or with body**:
+```bash
+git commit -m "refactor(sequences): extract collectTimestamps helper
+
+Reduces complexity from 10 to 7.
+Extracts timestamp collection logic to dedicated helper.
+
+Pattern: Extract Method"
+```
+
+**Checklist**:
+- [ ] Commit message follows convention
+- [ ] Commit hash: _______________ (from `git log -1 --oneline`)
+- [ ] Commit is small (<200 lines): `git show --stat`
+
+#### 5. Verify Commit
+
+```bash
+git log -1 --stat
+```
+
+**Checklist**:
+- [ ] Commit message correct
+- [ ] Files changed correct
+- [ ] Line count reasonable (<200 insertions + deletions)
+
+**Repeat for each refactoring step**
+
+---
+
+### After Refactoring Complete
+
+**1. Review Commit History**
+
+```bash
+git log --oneline
+```
+
+**Checklist**:
+- [ ] Each commit is small, focused
+- [ ] Each commit message is descriptive
+- [ ] Commits tell a story of refactoring progression
+- [ ] No "fix typo" or "oops" commits (if any, squash them)
+
+**2. Run Final Test Suite**
+
+```bash
+go test ./... -v
+```
+
+**Checklist**:
+- [ ] All tests pass
+- [ ] Test coverage: `go test -cover ./internal/query/...`
+- [ ] Coverage ≥85%: YES / NO
+
+**3. Verify Each Commit Independently** (optional but good practice)
+
+```bash
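+# Run the tests at every commit in the range (N = number of commits):
+git rebase --exec "go test ./internal/query/..." HEAD~N
+# The rebase stops at the first commit whose tests fail;
+# fix it and run `git rebase --continue`, or back out with `git rebase --abort`.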
+```
+
+**Checklist**:
+- [ ] Each commit has passing tests: YES / NO
+- [ ] Each commit is a valid state: YES / NO
+- [ ] If any commit fails tests: Reorder or squash commits
+
+---
+
+## Commit Size Guidelines
+
+### Ideal Commit Size
+
+| Metric | Target | Max |
+|--------|--------|-----|
+| **Lines changed** | 20-50 | 200 |
+| **Files changed** | 1-2 | 5 |
+| **Time to review** | 2-5 min | 15 min |
+| **Complexity change** | -1 to -3 | -5 |
+
+**Rationale**:
+- Small commits easier to review
+- Small commits easier to revert
+- Small commits easier to understand in history
+
+### When Commit is Too Large
+
+**Signs**:
+- >200 lines changed
+- >5 files changed
+- Commit message says "and" (doing multiple things)
+- Hard to write descriptive subject (too complex)
+
+**Fix**:
+- Break into multiple smaller commits:
+  ```bash
+  git reset HEAD~1  # Undo last commit, keep changes
+  # Stage and commit parts separately
+  git add <file1>
+  git commit -m "refactor: <first change>"
+  git add <file2>
+  git commit -m "refactor: <second change>"
+  ```
+
+- Or use interactive staging:
+  ```bash
+  git add -p <file>  # Stage hunks interactively
+  git commit -m "refactor: <specific change>"
+  ```
+
+---
+
+## Rollback Scenarios
+
+### Scenario 1: Last Commit Was Mistake
+
+**Undo last commit, keep changes**:
+```bash
+git reset HEAD~1
+```
+
+**Checklist**:
+- [ ] Commit removed from history: `git log`
+- [ ] Changes still in working directory: `git status`
+- [ ] Can re-commit differently: `git add` + `git commit`
+
+**Undo last commit, discard changes**:
+```bash
+git reset --hard HEAD~1
+```
+
+**WARNING**: This drops the commit from the branch and permanently discards any uncommitted changes to tracked files
+- [ ] Confirm you want to lose changes: YES / NO
+- [ ] Backup created if needed: YES / NO / N/A
+
+---
+
+### Scenario 2: Need to Revert Specific Commit
+
+**Revert a commit** (keeps history, creates new commit undoing changes):
+```bash
+git revert <commit-hash>
+```
+
+**Checklist**:
+- [ ] Commit hash identified: _______________
+- [ ] Revert commit created: `git log -1`
+- [ ] Tests pass after revert: PASS / FAIL
+
+**Example**:
+```bash
+# Revert the "extract helper" commit
+git log --oneline  # Find commit hash
+git revert --no-commit abc123  # Stage the revert without committing yet
+git commit -m "revert: extract collectTimestamps helper
+
+Tests failed due to nil pointer. Rolling back to investigate.
+
+Pattern: Rollback"
+```
+
+---
+
+### Scenario 3: Multiple Commits Need Rollback
+
+**Revert range of commits**:
+```bash
+git revert <oldest-commit>^..<newest-commit>  # ^ makes the range include <oldest-commit>
+```
+
+**Or reset to earlier state**:
+```bash
+git reset --hard <commit-hash>
+```
+
+**Checklist**:
+- [ ] Identified rollback point: <commit-hash>
+- [ ] Confirmed losing commits OK: YES / NO
+- [ ] Branch backed up if needed: `git branch backup-$(date +%Y%m%d)`
+- [ ] Tests pass after rollback: PASS / FAIL
+
+---
+
+## Clean History Practices
+
+### Practice 1: Squash Fixup Commits
+
+**Scenario**: Made small "oops" commits (typo fix, forgot file)
+
+**Before Pushing** (local history only):
+```bash
+git rebase -i HEAD~N  # N = number of commits to review
+# Mark fixup commits as "fixup" or "squash"
+# Save and close
+```
+
+**Example**:
+```
+pick abc123 refactor: extract collectTimestamps helper
+fixup def456 fix: forgot to commit test file
+pick ghi789 refactor: extract findMinMax helper
+fixup jkl012 fix: typo in variable name
+```
+
+**After rebase** (the rewritten commits get new hashes; the originals are shown for readability):
+```
+abc123 refactor: extract collectTimestamps helper
+ghi789 refactor: extract findMinMax helper
+```
+
+**Checklist**:
+- [ ] Fixup commits squashed: YES / NO
+- [ ] History clean: `git log --oneline`
+- [ ] Tests still pass: PASS / FAIL
+
+---
+
+### Practice 2: Reorder Commits Logically
+
+**Scenario**: Commits out of logical order (e.g., characterization tests committed after the refactoring they protect)
+
+**Reorder with Interactive Rebase**:
+```bash
+git rebase -i HEAD~N
+# Reorder lines to desired sequence
+# Save and close
+```
+
+**Example**:
+```
+# Before:
+pick abc123 refactor: extract helper
+pick def456 test: add edge case tests
+pick ghi789 docs: add GoDoc comments
+
+# After (logical order):
+pick def456 test: add edge case tests
+pick abc123 refactor: extract helper
+pick ghi789 docs: add GoDoc comments
+```
+
+**Checklist**:
+- [ ] Commits reordered logically: YES / NO
+- [ ] Each commit still has passing tests: VERIFY
+- [ ] History makes sense: `git log --oneline`
+
+---
+
+## Git Hooks for Enforcement
+
+### Pre-Commit Hook (Prevent Committing Failing Tests)
+
+**Create `.git/hooks/pre-commit`**:
+```bash
+#!/bin/bash
+# Run tests before allowing commit
+go test ./... > /dev/null 2>&1
+if [ $? -ne 0 ]; then
+    echo "❌ Tests failing. Fix tests before committing."
+    echo "Run 'go test ./...' to see failures."
+    echo ""
+    echo "To commit anyway (NOT RECOMMENDED):"
+    echo "  git commit --no-verify"
+    exit 1
+fi
+
+echo "✅ Tests pass. Proceeding with commit."
+exit 0
+```
+
+**Make executable**:
+```bash
+chmod +x .git/hooks/pre-commit
+```
+
+**Checklist**:
+- [ ] Pre-commit hook installed: YES / NO
+- [ ] Hook prevents failing test commits: VERIFY
+- [ ] Hook can be bypassed if needed: `--no-verify` works
+
+---
+
+### Commit-Msg Hook (Enforce Commit Message Convention)
+
+**Create `.git/hooks/commit-msg`**:
+```bash
+#!/bin/bash
+# Validate commit message format
+commit_msg_file=$1
+commit_msg=$(cat "$commit_msg_file")
+
+# Pattern: type(scope): subject (checked against the first line only)
+pattern="^(refactor|test|docs|style|perf)\([a-z_]+\): .{10,}"
+
+if ! head -n 1 "$commit_msg_file" | grep -qE "$pattern"; then
+    echo "❌ Invalid commit message format."
+    echo ""
+    echo "Required format: type(scope): subject"
+    echo "  Types: refactor, test, docs, style, perf"
+    echo "  Scope: package or file name (lowercase)"
+    echo "  Subject: descriptive (min 10 chars)"
+    echo ""
+    echo "Example: refactor(sequences): extract collectTimestamps helper"
+    echo ""
+    echo "Your message:"
+    echo "$commit_msg"
+    exit 1
+fi
+
+echo "✅ Commit message format valid."
+exit 0
+```
+
+**Make executable**:
+```bash
+chmod +x .git/hooks/commit-msg
+```
+
+**Checklist**:
+- [ ] Commit-msg hook installed: YES / NO
+- [ ] Hook enforces convention: VERIFY
+- [ ] Can be bypassed if needed: `--no-verify` works
+
+---
+
+## Commit Statistics (Track Over Time)
+
+**Refactoring Session**: ___ (e.g., calculateSequenceTimeSpan - 2025-10-19)
+
+| Metric | Value |
+|--------|-------|
+| **Total commits** | ___ |
+| **Commits with passing tests** | ___ |
+| **Average commit size** | ___ lines |
+| **Largest commit** | ___ lines |
+| **Smallest commit** | ___ lines |
+| **Rollbacks needed** | ___ |
+| **Fixup commits** | ___ |
+| **Commits per hour** | ___ |
+
+**Commit Discipline Score**: (Commits with passing tests) / (Total commits) × 100% = ___%
+
+**Target**: 100% commit discipline (every commit has passing tests)
+
+---
+
+## Example Commit Sequence
+
+**Refactoring**: calculateSequenceTimeSpan (Complexity 10 → <8)
+
+```bash
+# Baseline
+abc123 test(sequences): add edge cases for calculateSequenceTimeSpan
+def456 refactor(sequences): extract collectOccurrenceTimestamps helper
+ghi789 test(sequences): add unit tests for collectOccurrenceTimestamps
+jkl012 refactor(sequences): extract findMinMaxTimestamps helper
+mno345 test(sequences): add unit tests for findMinMaxTimestamps
+pqr678 refactor(sequences): simplify calculateSequenceTimeSpan using helpers
+stu901 docs(sequences): add GoDoc for calculateSequenceTimeSpan
+vwx234 test(sequences): verify complexity reduced to 6
+```
+
+**Statistics**:
+- Total commits: 8
+- Average size: ~30 lines
+- Largest commit: def456 (extract helper, 45 lines)
+- All commits with passing tests: 8/8 (100%)
+- Complexity progression: 10 → 7 (def456) → 6 (pqr678)
+
+---
+
+## Notes
+
+- **Discipline**: Commit after EVERY refactoring step
+- **Small**: Keep commits <200 lines
+- **Passing**: Every commit must have passing tests
+- **Descriptive**: Subject line tells what changed
+- **Revertible**: Each commit can be reverted independently
+- **Story**: Commit history tells story of refactoring progression
+
+---
+
+**Version**: 1.0 (Iteration 1)
+**Next Review**: Iteration 2 (refine based on usage data)
+**Automation**: See git hooks section for automated enforcement
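The subject-line check behind the commit-msg hook above can be tried outside of git. The following is a minimal sketch, assuming a POSIX shell with `grep -E`; the function name `is_valid_subject` is hypothetical, and the pattern is copied from the hook.

```shell
# Hypothetical helper: succeeds when the first line of a commit message
# follows the type(scope): subject convention from the protocol.
is_valid_subject() (
    # Only the subject (first) line is checked; the body is free-form.
    subject=$(printf '%s\n' "$1" | head -n 1)
    pattern='^(refactor|test|docs|style|perf)\([a-z_]+\): .{10,}'
    printf '%s\n' "$subject" | grep -qE "$pattern"
)

# Example usage with a message taken from the protocol's examples.
if is_valid_subject "refactor(sequences): extract collectTimestamps helper"; then
    echo "valid"
else
    echo "invalid"
fi
```

Keeping the check in a function makes it easy to run the protocol's "Examples" and "Avoid" lists through it before installing the hook.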
--- a/skills/code-refactoring/templates/iteration-template.md
+++ b/skills/code-refactoring/templates/iteration-template.md
@@ -0,0 +1,64 @@
+# Iteration {{NUM}}: {{TITLE}}
+
+**Date**: {{DATE}}
+**Duration**: ~{{DURATION}}
+**Status**: {{STATUS}}
+**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
+
+---
+
+## 1. Executive Summary
+- Focus:
+- Achievements:
+- Learnings:
+- Value Scores: V_instance(s_{{NUM}}) = {{V_INSTANCE}}, V_meta(s_{{NUM}}) = {{V_META}}
+
+---
+
+## 2. Pre-Execution Context
+- Previous State Summary:
+- Key Gaps:
+- Objectives:
+
+---
+
+## 3. Work Executed
+### Observe
+- Metrics:
+- Findings:
+- Gaps:
+
+### Codify
+- Deliverables:
+- Decisions:
+- Rationale:
+
+### Automate
+- Changes:
+- Tests:
+- Evidence:
+
+---
+
+## 4. Evaluation
+- V_instance Components:
+- V_meta Components:
+- Evidence Links:
+
+---
+
+## 5. Convergence & Next Steps
+- Gap Analysis:
+- Next Iteration Focus:
+
+---
+
+## 6. Reflections
+- What Worked:
+- What Didn’t Work:
+- Methodology Insights:
+
+---
+
+**Status**: {{STATUS}}
+**Next**: {{NEXT_FOCUS}}
--- a/skills/code-refactoring/templates/refactoring-safety-checklist.md
+++ b/skills/code-refactoring/templates/refactoring-safety-checklist.md
@@ -0,0 +1,275 @@
+# Refactoring Safety Checklist
+
+**Purpose**: Ensure safe, behavior-preserving refactoring through systematic verification
+
+**When to Use**: Before starting ANY refactoring work
+
+**Origin**: Iteration 1 - Problem P1 (No Refactoring Safety Checklist)
+
+---
+
+## Pre-Refactoring Checklist
+
+### 1. Baseline Verification
+
+- [ ] **All tests passing**: Run full test suite (`go test ./...`)
+  - Status: PASS / FAIL
+  - If FAIL: Fix failing tests BEFORE refactoring
+
+- [ ] **No uncommitted changes**: Check git status
+  - Status: CLEAN / DIRTY
+  - If DIRTY: Commit or stash before refactoring
+
+- [ ] **Baseline metrics recorded**: Capture current complexity, coverage, duplication
+  - Complexity: `gocyclo -over 1 <target-package>/`
+  - Coverage: `go test -cover <target-package>/...`
+  - Duplication: `dupl -threshold 15 <target-package>/`
+  - Saved to: `data/iteration-N/baseline-<target>.txt`
+
+### 2. Test Coverage Verification
+
+- [ ] **Target code has tests**: Verify tests exist for code being refactored
+  - Test file: `<target>_test.go`
+  - Coverage: ___% (from `go test -cover`)
+  - If <75%: Write tests FIRST (TDD)
+
+- [ ] **Tests cover current behavior**: Run tests and verify they pass
+  - Characterization tests: Tests that document current behavior
+  - Edge cases covered: Empty inputs, nil checks, error conditions
+  - If gaps found: Write additional tests FIRST
+
+### 3. Refactoring Plan
+
+- [ ] **Refactoring pattern selected**: Choose appropriate pattern
+  - Pattern: _______________ (e.g., Extract Method, Simplify Conditionals)
+  - Reference: `knowledge/patterns/<pattern>.md`
+
+- [ ] **Incremental steps defined**: Break into small, verifiable steps
+  - Step 1: _______________
+  - Step 2: _______________
+  - Step 3: _______________
+  - (Each step should take <10 minutes, pass tests)
+
+- [ ] **Rollback plan documented**: Define how to undo if problems occur
+  - Rollback method: Git revert / Git reset / Manual
+  - Rollback triggers: Tests fail, complexity increases, coverage decreases >5%
+
+---
+
+## During Refactoring Checklist (Per Step)
+
+### Step N: <Step Description>
+
+#### Before Making Changes
+
+- [ ] **Tests pass**: `go test ./...`
+  - Status: PASS / FAIL
+  - Time: ___s
+
+#### Making Changes
+
+- [ ] **One change at a time**: Make minimal, focused change
+  - Files modified: _______________
+  - Lines changed: ___
+  - Scope: Single function / Multiple functions / Cross-file
+
+- [ ] **No behavioral changes**: Only restructure, don't change logic
+  - Verified: Code does same thing, just organized differently
+
+#### After Making Changes
+
+- [ ] **Tests still pass**: `go test ./...`
+  - Status: PASS / FAIL
+  - Time: ___s
+  - If FAIL: Rollback immediately
+
+- [ ] **Coverage maintained or improved**: `go test -cover ./...`
+  - Before: ___%
+  - After: ___%
+  - Change: +/- ___%
+  - If decreased >1%: Investigate and add tests
+
+- [ ] **No new complexity**: `gocyclo -over 10 <target-file>`
+  - Functions >10: ___
+  - If increased: Rollback or simplify further
+
+- [ ] **Commit incremental progress**: `git add <changed-files> && git commit -m "refactor(<scope>): <description>"`
+  - Commit hash: _______________
+  - Message: "refactor(<scope>): <verb> <what changed>"
+  - Safe rollback point: Can revert this specific change
+
+---
+
+## Post-Refactoring Checklist
+
+### 1. Final Verification
+
+- [ ] **All tests pass**: `go test ./...`
+  - Status: PASS
+  - Duration: ___s
+
+- [ ] **Coverage improved or maintained**: `go test -cover ./...`
+  - Baseline: ___%
+  - Final: ___%
+  - Change: +___%
+  - Target: ≥85% overall, ≥95% for refactored code
+
+- [ ] **Complexity reduced**: `gocyclo -avg <target-package>/`
+  - Baseline: ___
+  - Final: ___
+  - Reduction: ___%
+  - Target function: <10 complexity
+
+- [ ] **No duplication introduced**: `dupl -threshold 15 <target-package>/`
+  - Baseline groups: ___
+  - Final groups: ___
+  - Change: -___ groups
+
+- [ ] **No new static warnings**: `go vet <target-package>/...`
+  - Warnings: 0
+  - If >0: Fix before finalizing
+
+### 2. Behavior Preservation
+
+- [ ] **Integration tests pass** (if applicable)
+  - Status: PASS / N/A
+
+- [ ] **Manual verification** (for critical code)
+  - Test scenario 1: _______________
+  - Test scenario 2: _______________
+  - Result: Behavior unchanged
+
+- [ ] **Performance not regressed** (if applicable)
+  - Benchmark: `go test -bench . <target-package>/...`
+  - Change: +/- ___%
+  - Acceptable: <10% regression
+
+### 3. Documentation
+
+- [ ] **Code documented**: Add/update GoDoc comments
+  - Public functions: ___ documented / ___ total
+  - Target: 100% of public APIs
+
+- [ ] **Refactoring logged**: Document refactoring in session log
+  - File: `data/iteration-N/refactoring-log.md`
+  - Logged: Pattern, time, issues, lessons
+
+### 4. Final Commit
+
+- [ ] **Clean git history**: All incremental commits made
+  - Total commits: ___
+  - Clean messages: YES / NO
+  - Revertible: YES / NO
+
+- [ ] **Final metrics recorded**: Save post-refactoring metrics
+  - File: `data/iteration-N/final-<target>.txt`
+  - Metrics: Complexity, coverage, duplication saved
+
+---
+
+## Rollback Protocol
+
+**When to Rollback**:
+- Tests fail after a refactoring step
+- Coverage decreases >5%
+- Complexity increases
+- New static analysis errors
+- Refactoring taking >2x estimated time
+- Uncertainty about correctness
+
+**How to Rollback**:
+1. **Immediate**: Stop making changes
+2. **Assess**: Identify which commit introduced problem
+3. **Revert**: `git revert <commit-hash>` or `git reset --hard <last-good-commit>`
+4. **Verify**: Run tests to confirm rollback successful
+5. **Document**: Log why rollback was needed
+6. **Re-plan**: Choose different approach or break into smaller steps
+
+**Rollback Checklist**:
+- [ ] Identified problem commit: _______________
+- [ ] Reverted changes: `git revert _______________`
+- [ ] Tests pass after rollback: PASS / FAIL
+- [ ] Documented rollback reason: _______________
+- [ ] New plan documented: _______________
+
+---
+
+## Safety Statistics (Track Over Time)
+
+**Refactoring Session**: ___ (e.g., calculateSequenceTimeSpan - 2025-10-19)
+
+| Metric | Value |
+|--------|-------|
+| **Steps completed** | ___ |
+| **Rollbacks needed** | ___ |
+| **Tests failed** | ___ times |
+| **Coverage regression** | YES / NO |
+| **Complexity regression** | YES / NO |
+| **Total time** | ___ minutes |
+| **Average time per step** | ___ minutes |
+| **Safety incidents** | ___ (breaking changes, lost work, etc.) |
+
+**Safety Score**: (Steps completed - Rollbacks - Safety incidents) / Steps completed × 100% = ___%
+
+**Target**: ≥95% safety score (≤5% incidents)
+
+---
+
+## Checklist Usage Example
+
+**Refactoring**: `calculateSequenceTimeSpan` (Complexity 10 → <8)
+**Pattern**: Extract Method (collectOccurrenceTimestamps, findMinMaxTimestamps)
+**Date**: 2025-10-19
+
+### Pre-Refactoring
+- [x] All tests passing: PASS (0.008s)
+- [x] No uncommitted changes: CLEAN
+- [x] Baseline metrics: Saved to `data/iteration-1/baseline-sequences.txt`
+  - Complexity: 10
+  - Coverage: 85%
+  - Duplication: 0 groups in this file
+- [x] Target has tests: `sequences_test.go` exists
+- [x] Coverage: 85% (need to add edge case tests)
+- [x] Pattern: Extract Method
+- [x] Steps: 1) Write edge case tests, 2) Extract collectTimestamps, 3) Extract findMinMax
+- [x] Rollback: Git revert if tests fail
+
+### During Refactoring - Step 1: Write Edge Case Tests
+- [x] Tests pass before: PASS
+- [x] Added tests for empty timestamps, single timestamp
+- [x] Tests pass after: PASS
+- [x] Coverage: 85% → 95%
+- [x] Commit: `git commit -m "test: add edge cases for calculateSequenceTimeSpan"`
+
+### During Refactoring - Step 2: Extract collectTimestamps
+- [x] Tests pass before: PASS
+- [x] Extracted helper, updated main function
+- [x] Tests pass after: PASS
+- [x] Coverage: 95% (maintained)
+- [x] Complexity: 10 → 7
+- [x] Commit: `git commit -m "refactor: extract collectTimestamps helper"`
+
+### Post-Refactoring
+- [x] All tests pass: PASS
+- [x] Coverage: 85% → 95% (+10%)
+- [x] Complexity: 10 → 6 (-40%)
+- [x] Duplication: 0 (no change)
+- [x] Documentation: Added GoDoc to calculateSequenceTimeSpan
+- [x] Logged: `data/iteration-1/refactoring-log.md`
+
+**Safety Score**: 3 steps, 0 rollbacks, 0 incidents = 100%
+
+---
+
+## Notes
+
+- **Honesty**: Mark actual status, not desired status
+- **Discipline**: Don't skip checks "because it seems fine"
+- **Speed**: Checks should be quick (<1 minute total per step)
+- **Automation**: Use scripts to automate metric collection (see Problem V1)
+- **Adaptation**: Adjust checklist based on project needs, but maintain core safety principles
+
+---
+
+**Version**: 1.0 (Iteration 1)
+**Next Review**: Iteration 2 (refine based on usage data)
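The Safety Score formula from the checklist above is simple enough to script. This is a minimal sketch, assuming a POSIX shell; the function name `safety_score` and its argument order are hypothetical, and integer division truncates the result to a whole percentage.

```shell
# Hypothetical helper: prints the safety score as a whole percentage.
# Arguments: steps completed, rollbacks needed, safety incidents.
safety_score() (
    steps=$1
    rollbacks=$2
    incidents=$3
    # Guard against division by zero when no steps were completed.
    if [ "$steps" -le 0 ]; then
        echo "steps must be positive" >&2
        exit 1
    fi
    # (Steps completed - Rollbacks - Safety incidents) / Steps completed x 100
    echo $(( (steps - rollbacks - incidents) * 100 / steps ))
)

# Values from the usage example: 3 steps, 0 rollbacks, 0 incidents.
safety_score 3 0 0
```

A session with 20 steps and one rollback would land exactly on the 95% target stated in the checklist.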
--- a/skills/code-refactoring/templates/tdd-refactoring-workflow.md
+++ b/skills/code-refactoring/templates/tdd-refactoring-workflow.md
@@ -0,0 +1,516 @@
+# TDD Refactoring Workflow
+
+**Purpose**: Enforce test-driven discipline during refactoring to ensure behavior preservation and quality
+
+**When to Use**: During ALL refactoring work
+
+**Origin**: Iteration 1 - Problem E1 (No TDD Enforcement)
+
+---
+
+## TDD Principle for Refactoring
+
+**Red-Green-Refactor Cycle** (adapted for refactoring existing code):
+
+1. **Green** (Baseline): Ensure existing tests pass
+2. **Red** (Add Tests): Write tests for uncovered behavior (unlike classic TDD, these tests should pass immediately because the code already exists)
+3. **Refactor**: Restructure code while maintaining green tests
+4. **Green** (Verify): Confirm all tests still pass after refactoring
+
+**Key Difference from New Development TDD**:
+- **New Development**: Write failing test → Make it pass → Refactor
+- **Refactoring**: Ensure passing tests → Add missing tests (passing) → Refactor → Keep tests passing
+
+---
+
+## Workflow Steps
+
+### Phase 1: Baseline Green (Ensure Safety Net)
+
+**Goal**: Verify existing tests provide safety net for refactoring
+
+#### Step 1: Run Existing Tests
+
+```bash
+go test -v ./internal/query/... > tests-baseline.txt
+```
+
+**Checklist**:
+- [ ] All existing tests pass: YES / NO
+- [ ] Test count: ___ tests
+- [ ] Duration: ___s
+- [ ] If any fail: FIX BEFORE PROCEEDING
+
+#### Step 2: Check Coverage
+
+```bash
+go test -cover ./internal/query/...
+go test -coverprofile=coverage.out ./internal/query/...
+go tool cover -html=coverage.out -o coverage.html
+```
+
+**Checklist**:
+- [ ] Overall coverage: ___%
+- [ ] Target function coverage: ___%
+- [ ] Uncovered lines identified: YES / NO
+- [ ] Coverage file: `coverage.html` (review in browser)
+
+#### Step 3: Identify Coverage Gaps
+
+**Review `coverage.html` and identify**:
+- [ ] Uncovered branches: _______________
+- [ ] Uncovered error paths: _______________
+- [ ] Uncovered edge cases: _______________
+- [ ] Missing edge case examples:
+  - Empty inputs: ___ (e.g., empty slice, nil, zero)
+  - Boundary conditions: ___ (e.g., single element, max value)
+  - Error conditions: ___ (e.g., invalid input, out of range)
+
+**Decision Point**:
+- If coverage ≥95% on target code: Proceed to Phase 2 (Refactor)
+- If coverage <95%: Proceed to Phase 1b (Write Missing Tests)
+
+---
+
+### Phase 1b: Write Missing Tests (Red → Immediate Green)
+
+**Goal**: Add tests for uncovered code paths BEFORE refactoring
+
+#### For Each Coverage Gap:
+
+**1. Write Characterization Test** (documents current behavior):
+
+```go
+func TestCalculateSequenceTimeSpan_<EdgeCase>(t *testing.T) {
+    // Setup: Create input that triggers uncovered path
+    // ...
+
+    // Execute: Call function
+    result := calculateSequenceTimeSpan(occurrences, entries, toolCalls)
+
+    // Verify: Document current behavior (even if it's wrong)
+    assert.Equal(t, <expected>, result, "current behavior")
+}
+```
+
+**Test Naming Convention**:
+- `Test<FunctionName>_<EdgeCase>` (e.g., `TestCalculateTimeSpan_EmptyOccurrences`)
+- `Test<FunctionName>_<Scenario>` (e.g., `TestCalculateTimeSpan_SingleOccurrence`)
+
+**2. Verify Test Passes** (should pass immediately since code exists):
+
+```bash
+go test -v -run Test<FunctionName>_<EdgeCase> ./...
+```
+
+**Checklist**:
+- [ ] Test written: `Test<FunctionName>_<EdgeCase>`
+- [ ] Test passes immediately: YES / NO
+- [ ] If NO: Bug in test or unexpected current behavior → Fix test
+- [ ] Coverage increased: ___% → ___%
+
+**3. Commit Test**:
+
+```bash
+git add <test_file>
+git commit -m "test: add <edge-case> test for <function>"
+```
+
+**Repeat for all coverage gaps until target coverage ≥95%**
+
+#### Coverage Target
+
+- [ ] **Overall coverage**: ≥85% (project minimum)
+- [ ] **Target function coverage**: ≥95% (refactoring requirement)
+- [ ] **New test pass rate**: 100% (all new tests pass)
+
+**Checkpoint**: Before proceeding to refactoring:
+- [ ] All tests pass: PASS
+- [ ] Target function coverage: ≥95%
+- [ ] Coverage gaps documented if <95%: _______________
+
+---
+
+### Phase 2: Refactor (Maintain Green)
+
+**Goal**: Restructure code while keeping all tests passing
+
+#### For Each Refactoring Step:
+
+**1. Plan Single Refactoring Transformation**:
+
+- [ ] Transformation type: _______________ (Extract Method, Inline, Rename, etc.)
+- [ ] Target code: _______________ (function, lines, scope)
+- [ ] Expected outcome: _______________ (complexity reduction, clarity, etc.)
+- [ ] Estimated time: ___ minutes
+
+**2. Make Minimal Change**:
+
+**Examples**:
+- Extract Method: Move lines X-Y to new function `<name>`
+- Simplify Conditional: Replace nested if with guard clause
+- Rename: Change `<oldName>` to `<newName>`
+
+**Checklist**:
+- [ ] Single, focused change: YES / NO
+- [ ] No behavioral changes: Only structural / organizational
+- [ ] Files modified: _______________
+- [ ] Lines changed: ~___
+
+**3. Run Tests Immediately**:
+
+```bash
+go test -v ./internal/query/... | tee test-results-step-N.txt
+```
+
+**Checklist**:
+- [ ] All tests pass: PASS / FAIL
+- [ ] Duration: ___s (should be quick, <10s)
+- [ ] If FAIL: **ROLLBACK IMMEDIATELY**
+
+**4. Verify Coverage Maintained**:
+
+```bash
+go test -cover ./internal/query/...
+```
+
+**Checklist**:
+- [ ] Coverage: Before ___% → After ___%
+- [ ] Change: +/- ___%
+- [ ] If decreased >1%: Investigate (might need to update tests)
+- [ ] If decreased >5%: **ROLLBACK**
+
+**5. Verify Complexity**:
+
+```bash
+gocyclo -over 10 internal/query/<target-file>.go
+```
+
+**Checklist**:
+- [ ] Target function complexity: ___
+- [ ] Change from previous: +/- ___
+- [ ] If increased: Not a valid refactoring step → ROLLBACK
+
+**6. Commit Incremental Progress**:
+
+```bash
+git add <changed-files>
+git commit -m "refactor(<scope>): <verb> <what changed>"
+```
+
+**Example Commit Messages**:
+- `refactor(sequences): extract collectTimestamps helper`
+- `refactor(sequences): simplify min/max calculation`
+- `refactor(sequences): rename ts to timestamp for clarity`
+
+**Checklist**:
+- [ ] Commit hash: _______________
+- [ ] Message follows convention: YES / NO
+- [ ] Commit is small, focused: YES / NO
+
+**Repeat refactoring steps until refactoring complete or target achieved**
+
+---
+
+### Phase 3: Final Verification (Confirm Green)
+
+**Goal**: Comprehensive verification that refactoring succeeded
+
+#### 1. Run Full Test Suite
+
+```bash
+go test -v ./... | tee test-results-final.txt
+```
+
+**Checklist**:
+- [ ] All tests pass: PASS / FAIL
+- [ ] Test count: ___ (should match baseline or increase)
+- [ ] Duration: ___s
+- [ ] No flaky tests: All consistent
+
+#### 2. Verify Coverage Improved or Maintained
+
+```bash
+go test -cover ./internal/query/...
+go test -coverprofile=coverage-final.out ./internal/query/...
+go tool cover -func=coverage-final.out | grep total
+```
+
+**Checklist**:
+- [ ] Baseline coverage: ___%
+- [ ] Final coverage: ___%
+- [ ] Change: +___%
+- [ ] Target met (≥85% overall, ≥95% refactored code): YES / NO
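+
+If the overall target should be checked mechanically rather than by eye, the `total:` line can be parsed. The sketch below assumes that line has the shape `total: (statements) NN.N%` as printed by `go tool cover -func`. The helper name `totalCoverage` and the sample input are hypothetical.
+
+```go
+package main
+
+import (
+    "fmt"
+    "strconv"
+    "strings"
+)
+
+// totalCoverage extracts the percentage from the "total:" line of
+// `go tool cover -func` output. It returns an error if no such line exists.
+func totalCoverage(funcOutput string) (float64, error) {
+    for _, line := range strings.Split(funcOutput, "\n") {
+        fields := strings.Fields(line)
+        if len(fields) < 2 || fields[0] != "total:" {
+            continue
+        }
+        // The percentage is the last whitespace-separated field, e.g. "87.5%".
+        pct := strings.TrimSuffix(fields[len(fields)-1], "%")
+        return strconv.ParseFloat(pct, 64)
+    }
+    return 0, fmt.Errorf("no total line found")
+}
+
+func main() {
+    // Made-up sample output; the package path is for illustration only.
+    sample := "example.com/app/internal/query/sequences.go:42:\tcalculateSequenceTimeSpan\t100.0%\n" +
+        "total:\t\t\t\t(statements)\t\t87.5%\n"
+    total, err := totalCoverage(sample)
+    if err != nil {
+        fmt.Println("error:", err)
+        return
+    }
+    fmt.Printf("total coverage: %.1f%%, overall target (>=85%%) met: %v\n", total, total >= 85)
+}
+```
+
+The ≥95% target for the refactored code still needs the per-function lines of the same output, which this sketch does not inspect.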
+
+#### 3. Compare Baseline and Final Metrics
+
+| Metric | Baseline | Final | Change | Target Met |
+|--------|----------|-------|--------|------------|
+| **Complexity** | ___ | ___ | ___% | YES / NO |
+| **Coverage** | ___% | ___% | +___% | YES / NO |
+| **Test count** | ___ | ___ | +___ | N/A |
+| **Test duration** | ___s | ___s | ___s | N/A |
+
+**Checklist**:
+- [ ] All targets met: YES / NO
+- [ ] If NO: Document gaps and plan next iteration
+
+#### 4. Update Documentation
+
+```go
+// Add/update GoDoc comments for refactored code.
+// Example:
+
+// calculateSequenceTimeSpan calculates the time span in minutes between
+// the first and last occurrence of a sequence pattern across turns.
+// Returns 0 if no valid timestamps found.
+```
+
+**Checklist**:
+- [ ] GoDoc added/updated: YES / NO
+- [ ] Public functions documented: ___ / ___ (target: 100%)
+- [ ] Parameter descriptions clear: YES / NO
+- [ ] Return value documented: YES / NO
+
+---
+
+## TDD Metrics (Track Over Time)
+
+**Refactoring Session**: ___ (e.g., calculateSequenceTimeSpan - 2025-10-19)
+
+| Metric | Value |
+|--------|-------|
+| **Baseline coverage** | ___% |
+| **Final coverage** | ___% |
+| **Coverage improvement** | +___% |
+| **Tests added** | ___ |
+| **Test failures during refactoring** | ___ |
+| **Rollbacks due to test failures** | ___ |
+| **Time spent writing tests** | ___ min |
+| **Time spent refactoring** | ___ min |
+| **Test writing : Refactoring ratio** | ___:1 |
+
+**TDD Discipline Score**: (Steps after which all tests passed) / (Total steps) × 100% = ___%
+
+**Target**: 100% TDD discipline (tests pass after EVERY step)
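+
+As a worked example of the score, the sketch below computes it for a hypothetical session. The function name `disciplineScore` and the step counts are illustrative assumptions.
+
+```go
+package main
+
+import "fmt"
+
+// disciplineScore returns the percentage of refactoring steps that ended
+// with a fully passing test run. It returns 0 when no steps were recorded.
+func disciplineScore(passingSteps, totalSteps int) float64 {
+    if totalSteps == 0 {
+        return 0
+    }
+    return float64(passingSteps) / float64(totalSteps) * 100
+}
+
+func main() {
+    // Hypothetical session: 12 steps, one of which ended with a failing run.
+    fmt.Printf("%.1f%%\n", disciplineScore(11, 12)) // prints "91.7%"
+}
+```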
+
+---
+
+## Common TDD Refactoring Patterns
+
+### Pattern 1: Extract Method with Tests
+
+**Scenario**: Function too complex, need to extract helper
+
+**Steps**:
+1. ✅ Ensure tests pass
+2. ✅ Write test for behavior to be extracted (if not covered)
+3. ✅ Extract method
+4. ✅ Tests still pass
+5. ✅ Write direct test for new extracted method
+6. ✅ Tests pass
+7. ✅ Commit
+
+**Example**:
+```go
+// Before:
+func calculate() int {
+    // ... 20 lines of timestamp collection
+    // ... 15 lines of min/max finding
+}
+
+// After:
+func calculate() int {
+    timestamps := collectTimestamps()
+    return findMinMax(timestamps)
+}
+
+// Bodies elided; only the signatures matter for this sketch.
+func collectTimestamps() []int64 { /* extracted */ }
+func findMinMax(timestamps []int64) int { /* extracted */ }
+```
+
+**Tests**:
+- Existing: `TestCalculate` (still passes)
+- New: `TestCollectTimestamps` (covers extracted logic)
+- New: `TestFindMinMax` (covers min/max logic)
+
+---
+
+### Pattern 2: Simplify Conditionals with Tests
+
+**Scenario**: Nested conditionals hard to read, need to simplify
+
+**Steps**:
+1. ✅ Ensure tests pass (covering all branches)
+2. ✅ If branches uncovered: Add tests for all paths
+3. ✅ Simplify conditionals (guard clauses, early returns)
+4. ✅ Tests still pass
+5. ✅ Commit
+
+**Example**:
+```go
+// Before: Nested conditionals
+if len(timestamps) > 0 {
+    minTs := timestamps[0]
+    maxTs := timestamps[0]
+    for _, ts := range timestamps[1:] {
+        if ts < minTs {
+            minTs = ts
+        }
+        if ts > maxTs {
+            maxTs = ts
+        }
+    }
+    return int((maxTs - minTs) / 60)
+} else {
+    return 0
+}
+
+// After: Guard clause
+if len(timestamps) == 0 {
+    return 0
+}
+minTs := timestamps[0]
+maxTs := timestamps[0]
+for _, ts := range timestamps[1:] {
+    if ts < minTs {
+        minTs = ts
+    }
+    if ts > maxTs {
+        maxTs = ts
+    }
+}
+return int((maxTs - minTs) / 60)
+```
+
+**Tests**: No new tests needed (behavior unchanged), existing tests verify correctness
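+
+One way to gain confidence that the guard-clause version really is behavior-preserving is to run both versions side by side on the same inputs before deleting the old one. The sketch below wraps the two snippets above in functions. The names `spanNested` and `spanGuard` and the sample inputs are hypothetical.
+
+```go
+package main
+
+import "fmt"
+
+// spanNested mirrors the "Before" version: the main logic sits inside the if branch.
+func spanNested(timestamps []int64) int {
+    if len(timestamps) > 0 {
+        minTs, maxTs := timestamps[0], timestamps[0]
+        for _, ts := range timestamps[1:] {
+            if ts < minTs {
+                minTs = ts
+            }
+            if ts > maxTs {
+                maxTs = ts
+            }
+        }
+        return int((maxTs - minTs) / 60)
+    } else {
+        return 0
+    }
+}
+
+// spanGuard mirrors the "After" version: the empty case exits early.
+func spanGuard(timestamps []int64) int {
+    if len(timestamps) == 0 {
+        return 0
+    }
+    minTs, maxTs := timestamps[0], timestamps[0]
+    for _, ts := range timestamps[1:] {
+        if ts < minTs {
+            minTs = ts
+        }
+        if ts > maxTs {
+            maxTs = ts
+        }
+    }
+    return int((maxTs - minTs) / 60)
+}
+
+func main() {
+    // Empty, single-element, ordered, and unordered inputs.
+    inputs := [][]int64{nil, {100}, {60, 180}, {600, 0, 300}}
+    for _, in := range inputs {
+        fmt.Println(in, "same result:", spanNested(in) == spanGuard(in), "span:", spanGuard(in))
+    }
+}
+```
+
+Once the comparison holds for every branch, the nested version can be removed and the existing tests remain the only safety net.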
+
+---
+
+### Pattern 3: Remove Duplication with Tests
+
+**Scenario**: Duplicated code blocks, need to extract to shared helper
+
+**Steps**:
+1. ✅ Ensure tests pass
+2. ✅ Identify duplication: Lines X-Y in File A same as Lines M-N in File B
+3. ✅ Extract to shared helper
+4. ✅ Replace first occurrence with helper call
+5. ✅ Tests pass
+6. ✅ Replace second occurrence
+7. ✅ Tests pass
+8. ✅ Commit
+
+**Example**:
+```go
+// Before: Duplication
+// File A:
+if startTs > 0 {
+    timestamps = append(timestamps, startTs)
+}
+
+// File B:
+if endTs > 0 {
+    timestamps = append(timestamps, endTs)
+}
+
+// After: Shared helper
+func appendIfValid(timestamps []int64, ts int64) []int64 {
+    if ts > 0 {
+        return append(timestamps, ts)
+    }
+    return timestamps
+}
+
+// File A: timestamps = appendIfValid(timestamps, startTs)
+// File B: timestamps = appendIfValid(timestamps, endTs)
+```
+
+**Tests**:
+- Existing tests for Files A and B (still pass)
+- New: `TestAppendIfValid` (covers helper)
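+
+A direct check of the helper might look like the sketch below. It is written as a small runnable program rather than a `_test.go` file. The case names and values are illustrative assumptions, and in a real suite the same table would live in `TestAppendIfValid`.
+
+```go
+package main
+
+import "fmt"
+
+// appendIfValid appends ts only when it is positive, as in the example above.
+func appendIfValid(timestamps []int64, ts int64) []int64 {
+    if ts > 0 {
+        return append(timestamps, ts)
+    }
+    return timestamps
+}
+
+func main() {
+    // Table of cases: each starts from a slice holding one existing timestamp.
+    cases := []struct {
+        name    string
+        ts      int64
+        wantLen int
+    }{
+        {"positive timestamp is appended", 200, 2},
+        {"zero is skipped", 0, 1},
+        {"negative is skipped", -5, 1},
+    }
+    for _, c := range cases {
+        got := appendIfValid([]int64{100}, c.ts)
+        fmt.Printf("%s: len=%d pass=%v\n", c.name, len(got), len(got) == c.wantLen)
+    }
+}
+```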
+
+---
+
+## TDD Anti-Patterns (Avoid These)
+
+### ❌ Anti-Pattern 1: "Skip Tests, Code Seems Fine"
+
+**Problem**: Refactor without running tests
+**Risk**: Break behavior without noticing
+**Fix**: ALWAYS run tests after each change
+
+### ❌ Anti-Pattern 2: "Write Tests After Refactoring"
+
+**Problem**: Tests written to match new code (not verify behavior)
+**Risk**: Tests pass but behavior changed
+**Fix**: Write tests BEFORE refactoring (characterization tests)
+
+### ❌ Anti-Pattern 3: "Batch Multiple Changes Before Testing"
+
+**Problem**: Make 3-4 changes, then run tests
+**Risk**: If tests fail, hard to identify which change broke it
+**Fix**: Test after EACH change
+
+### ❌ Anti-Pattern 4: "Update Tests to Match New Code"
+
+**Problem**: Tests fail after refactoring, so "fix" tests
+**Risk**: Masking behavioral changes
+**Fix**: If tests fail, roll back the refactoring and fix the code, not the tests
+
+### ❌ Anti-Pattern 5: "Low Coverage is OK for Refactoring"
+
+**Problem**: Refactor code with <75% coverage
+**Risk**: Behavioral changes not caught by tests
+**Fix**: Achieve ≥95% coverage BEFORE refactoring
+
+---
+
+## Automation Support
+
+**Continuous Testing** (automatically run tests on file save):
+
+### Option 1: File Watcher (entr)
+
+```bash
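+# Install entr with a system package manager (it is a C utility, not a Go module)
+brew install entr            # macOS (Homebrew)
+# sudo apt-get install entr  # Debian/Ubuntu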
+
+# Auto-run tests on file change
+find internal/query -name '*.go' | entr -c go test ./internal/query/...
+```
+
+### Option 2: IDE Integration
+
+- **VS Code**: The Go extension can run tests on save when the `go.testOnSave` setting is enabled
+- **GoLand**: Configure test auto-run in settings
+- **Vim**: Use vim-go and trigger `:GoTest` or `:GoTestFunc` from a `BufWritePost` autocommand
+
+### Option 3: Pre-Commit Hook
+
+```bash
+#!/bin/bash
+# Save as .git/hooks/pre-commit and make it executable (chmod +x).
+# Abort the commit if any test fails; -cover also prints per-package coverage.
+go test -cover ./... || exit 1
+```
+
+**Checklist**:
+- [ ] Automation setup: YES / NO
+- [ ] Tests run automatically: YES / NO
+- [ ] Feedback time: ___s (target <5s)
+
+---
+
+## Notes
+
+- **TDD Discipline**: Tests must pass after EVERY single change
+- **Small Steps**: Each refactoring step should take <10 minutes
+- **Fast Tests**: Test suite should run in <10 seconds for fast feedback
+- **No Guessing**: If unsure about behavior, write test to document it
+- **Coverage Goal**: ≥95% for code being refactored, ≥85% overall
+
+---
+
+**Version**: 1.0 (Iteration 1)
+**Next Review**: Iteration 2 (refine based on usage data)
+**Automation**: See Problem V1 for automated complexity checking integration