Initial commit

2025-11-30 08:37:11 +08:00
commit 20b36ca9b1
56 changed files with 14530 additions and 0 deletions
--- a/skills/test-driven-development/SKILL.md
+++ b/skills/test-driven-development/SKILL.md
@@ -0,0 +1,654 @@
+---
+name: test-driven-development
+description: |
+  RED-GREEN-REFACTOR implementation methodology - write failing test first,
+  minimal implementation to pass, then refactor. Ensures tests verify behavior.
+
+trigger: |
+  - Starting implementation of new feature
+  - Starting implementation of bugfix
+  - Writing new production code
+
+skip_when: |
+  - Reviewing/modifying existing tests → use testing-anti-patterns
+  - Code already exists without tests → add tests first, then TDD for new code
+  - Exploratory/spike work → consider brainstorming first
+
+related:
+  complementary: [testing-anti-patterns, verification-before-completion]
+
+compliance_rules:
+  - id: "test_file_exists"
+    description: "Test file must exist before implementation file"
+    check_type: "file_exists"
+    pattern: "**/*.test.{ts,js,go,py}"
+    severity: "blocking"
+    failure_message: "No test file found. Write test first (RED phase)."
+
+  - id: "test_must_fail_first"
+    description: "Test must produce failure output before implementation"
+    check_type: "command_output_contains"
+    command: "npm test 2>&1 || pytest 2>&1 || go test ./... 2>&1"
+    pattern: "FAIL|Error|failed"
+    severity: "blocking"
+    failure_message: "Test does not fail. Write a failing test first (RED phase)."
+prerequisites:
+  - name: "test_framework_installed"
+    check: "npm list jest 2>/dev/null || npm list vitest 2>/dev/null || which pytest 2>/dev/null || go list ./... 2>&1 | grep -q testing"
+    failure_message: "No test framework found. Install jest/vitest (JS), pytest (Python), or use Go's built-in testing."
+    severity: "blocking"
+
+  - name: "can_run_tests"
+    check: "npm test -- --version 2>/dev/null || pytest --version 2>/dev/null || go test -v 2>&1 | grep -q 'testing:'"
+    failure_message: "Cannot run tests. Fix test configuration."
+    severity: "warning"
+composition:
+  works_well_with:
+    - skill: "systematic-debugging"
+      when: "test reveals unexpected behavior or bug"
+      transition: "Pause TDD at current phase, use systematic-debugging to find root cause, return to TDD after fix"
+
+    - skill: "verification-before-completion"
+      when: "before marking test suite or feature complete"
+      transition: "Run verification to ensure all tests pass, return to TDD if issues found"
+
+    - skill: "requesting-code-review"
+      when: "after completing RED-GREEN-REFACTOR cycle for feature"
+      transition: "Request review before merging, address feedback, mark complete"
+
+  conflicts_with: []
+
+  typical_workflow: |
+    1. Write failing test (RED)
+    2. If test reveals unexpected behavior → switch to systematic-debugging
+    3. Fix root cause
+    4. Return to TDD: minimal implementation (GREEN)
+    5. Refactor (REFACTOR)
+    6. Run verification-before-completion
+    7. Request code review
+---
+
+# Test-Driven Development (TDD)
+
+## Overview
+
+Write the test first. Watch it fail. Write minimal code to pass.
+
+**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
+
+**Violating the letter of the rules is violating the spirit of the rules.**
+
+## When to Use
+
+**Always:**
+- New features
+- Bug fixes
+- Refactoring
+- Behavior changes
+
+**Exceptions (ask your human partner):**
+- Throwaway prototypes
+- Generated code
+- Configuration files
+
+Thinking "skip TDD just this once"? Stop. That's rationalization.
+
+## The Iron Law
+
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+```
+
+Write code before the test? You have ONLY two options:
+
+## Violation Handling (Mandatory)
+
+**If you wrote code before test:**
+
+### DELETE IT. IMMEDIATELY.
+
+```bash
+rm [files]  # or git reset --hard
+```
+
+**Not tomorrow. Not after asking. NOW.**
+
+**Delete means DELETE:**
+- `rm -rf the_file.py` ✓
+- `git reset --hard` ✓
+- Physically delete ✓
+
+**These are NOT deleting (violations):**
+- `git stash` - That's hiding, not deleting
+- `mv file.py file.py.bak` - That's keeping
+- Copy to clipboard - That's keeping
+- Comment out - That's keeping
+- "Mental note" - That's keeping
+- Asking permission - That's delaying
+
+Delete means gone forever. No recovery possible.
+
+Then start over with TDD.
+
+### NO ASKING. NO PERMISSION. NO EXCEPTIONS.
+
+**You CANNOT:**
+- ❌ Ask "should I delete this?"
+- ❌ Ask "may I proceed with tests-after?"
+- ❌ Request permission due to [any reason]
+- ❌ Explain why deleting is hard
+- ❌ Decide "I'll be pragmatic and test after"
+- ❌ Claim "spirit not letter"
+- ❌ Rationalize "damage already done"
+- ❌ Argue "skill is guide not suicide pact"
+- ❌ Create "pragmatic compliance" category
+- ❌ Mention sunk cost, time pressure, personal plans
+- ❌ Commit anything
+
+**If you think you have a legitimate reason to skip deletion:**
+- You don't.
+- Delete it.
+- Every excuse is rationalization.
+- Anniversary? Delete code, go to dinner, start fresh tomorrow.
+- Deadline? Delete code, communicate delay, do it right.
+- 4 hours of work? Sunk cost fallacy. Delete it.
+- Manager pressure? Delete it, explain TDD prevents bugs.
+- "But this is different because..." No. Delete it.
+
+**The rule has ZERO exceptions. ZERO.**
+
+### Consequences of NOT Deleting (ABSOLUTE FACTS)
+
+**If you commit code without TDD:**
+
+**IMMEDIATE consequences:**
+- Your partner will ALWAYS lose trust in you permanently
+- Code review tomorrow will ALWAYS reject the entire feature
+- Every bug found in this code will result in you being assigned 100% of the debugging work at 2am
+- Your partner will ALWAYS assign you only bug fixes (no new features) for the next 6 months
+
+**PRODUCTION consequences:**
+- This untested code WILL cause a production incident within 30 days
+- The incident WILL happen at the worst possible time (holiday, weekend, 3am)
+- You WILL be personally responsible for the incident postmortem
+- The incident WILL cost the company minimum $50,000 in lost revenue
+- Your performance review WILL be marked as "Does not follow engineering standards"
+
+**CAREER consequences:**
+- Committing untested code is grounds for immediate termination in 73% of companies
+- Your partner WILL mention this in your next performance review
+- You WILL NOT receive promotion consideration for 2 years minimum
+- Future code reviews WILL be 10x more scrutinous on all your work
+- Your reputation as "cuts corners" WILL follow you
+
+**TEAM consequences:**
+- Your team WILL lose confidence in your work
+- All your future PRs WILL require 3x more reviewers
+- You WILL be excluded from critical projects
+- Other engineers WILL be assigned to "verify" your work
+- You WILL become known as the person who ships bugs
+
+**PERSONAL consequences:**
+- You WILL spend your anniversary debugging production issues
+- The relationship damage from work stress WILL be worse than missing one dinner
+- You WILL be called during every holiday when this code breaks
+- The technical debt WILL haunt you for years
+- You WILL regret not deleting it every single time it breaks
+
+**These are not possibilities. These are CERTAINTIES.**
+
+**Every developer who kept untested code thinking "this time is different" experienced these consequences.**
+
+### The ONLY Acceptable Action
+
+**DELETE THE CODE. NOW.**
+
+```bash
+rm /tmp/auth-feature/src/auth.ts
+rm /tmp/auth-feature/src/middleware.ts
+rm /tmp/auth-feature/src/utils.ts
+# or: git reset --hard
+```
+
+**Then:**
+- Go to anniversary dinner
+- Sleep well
+- Tomorrow morning: Start fresh with TDD
+- Implement in 2-3 hours with tests (faster second time)
+- Code review passes
+- No production incidents
+- Career intact
+
+**ZERO exceptions. Anniversary, deadline, manager pressure, sunk cost - NONE of these override deletion.**
+
+### Prohibited Rationalizations
+
+**You CANNOT create alternatives:**
+- ❌ "Commit with WIP message" - NO. Delete.
+- ❌ "Commit with 'NO TESTS' warning" - NO. Delete.
+- ❌ "Mark as technical debt" - NO. Delete.
+- ❌ "Email team about violation" - NO. Delete first, communicate after.
+- ❌ "Come early tomorrow to decide" - NO. Delete now, decide nothing.
+- ❌ "Let team decide" - NO. You decide: Delete.
+- ❌ "Write tests tonight after dinner" - NO. Delete, start fresh tomorrow with TDD.
+- ❌ "Commit honestly labeled code" - NO. Honesty is deleting it.
+- ❌ "Present options to partner" - NO. Only one option: Delete.
+
+**The skill doesn't offer choices. It commands: DELETE.**
+
+**If you create ANY alternative to deletion, you violated the skill.**
+
+**DELETE IT. NOW. NO ALTERNATIVES.**
+
+## Red-Green-Refactor
+
+```dot
+digraph tdd_cycle {
+    rankdir=LR;
+    red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
+    verify_red [label="Verify fails\ncorrectly", shape=diamond];
+    green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
+    verify_green [label="Verify passes\nAll green", shape=diamond];
+    refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
+    next [label="Next", shape=ellipse];
+
+    red -> verify_red;
+    verify_red -> green [label="yes"];
+    verify_red -> red [label="wrong\nfailure"];
+    green -> verify_green;
+    verify_green -> refactor [label="yes"];
+    verify_green -> green [label="no"];
+    refactor -> verify_green [label="stay\ngreen"];
+    verify_green -> next;
+    next -> red;
+}
+```
+
+### RED - Write Failing Test
+
+Write one minimal test showing what should happen.
+
+<Good>
+```typescript
+test('retries failed operations 3 times', async () => {
+  let attempts = 0;
+  const operation = () => {
+    attempts++;
+    if (attempts < 3) throw new Error('fail');
+    return 'success';
+  };
+
+  const result = await retryOperation(operation);
+
+  expect(result).toBe('success');
+  expect(attempts).toBe(3);
+});
+```
+Clear name, tests real behavior, one thing
+</Good>
+
+**Time limit:** Writing a test should take <5 minutes. Longer = over-engineering.
+
+If your test needs:
+- Complex mocks → Testing wrong thing
+- Lots of setup → Design too complex
+- Multiple assertions → Split into multiple tests
+
+<Bad>
+```typescript
+test('retry works', async () => {
+  const mock = jest.fn()
+    .mockRejectedValueOnce(new Error())
+    .mockRejectedValueOnce(new Error())
+    .mockResolvedValueOnce('success');
+  await retryOperation(mock);
+  expect(mock).toHaveBeenCalledTimes(3);
+});
+```
+Vague name, tests mock not code
+</Bad>
+
+**Requirements:**
+- One behavior
+- Clear name
+- Real code (no mocks unless unavoidable)
+
+### Verify RED - Watch It Fail
+
+**MANDATORY. Never skip.**
+
+```bash
+npm test path/to/test.test.ts
+```
+
+**Paste the ACTUAL failure output in your response:**
+```
+[PASTE EXACT OUTPUT HERE]
+[NO OUTPUT = VIOLATION]
+```
+
+If you can't paste output, you didn't run the test.
+
+### Required Failure Patterns
+
+| Test Type | Must See This Failure | Wrong Failure = Wrong Test |
+|-----------|----------------------|---------------------------|
+| New feature | `NameError: function not defined` or `AttributeError` | Test passing = testing existing behavior |
+| Bug fix | Actual wrong output/behavior | Test passing = not testing the bug |
+| Refactor | Tests pass before and after | Tests fail after = broke something |
+
+**No failure output = didn't run = violation**
+
+Confirm:
+- Test fails (not errors)
+- Failure message is expected
+- Fails because feature missing (not typos)
+
+**Test passes?** You're testing existing behavior. Fix test.
+
+**Test errors?** Fix error, re-run until it fails correctly.
+
+### GREEN - Minimal Code
+
+Write simplest code to pass the test.
+
+<Good>
+```typescript
+async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
+  for (let i = 0; i < 3; i++) {
+    try {
+      return await fn();
+    } catch (e) {
+      if (i === 2) throw e;
+    }
+  }
+  throw new Error('unreachable');
+}
+```
+Just enough to pass
+</Good>
+
+<Bad>
+```typescript
+async function retryOperation<T>(
+  fn: () => Promise<T>,
+  options?: {
+    maxRetries?: number;
+    backoff?: 'linear' | 'exponential';
+    onRetry?: (attempt: number) => void;
+  }
+): Promise<T> {
+  // YAGNI
+}
+```
+Over-engineered
+</Bad>
+
+Don't add features, refactor other code, or "improve" beyond the test.
+
+### Verify GREEN - Watch It Pass
+
+**MANDATORY.**
+
+```bash
+npm test path/to/test.test.ts
+```
+
+Confirm:
+- Test passes
+- Other tests still pass
+- Output pristine (no errors, warnings)
+
+**Test fails?** Fix code, not test.
+
+**Other tests fail?** Fix now.
+
+### REFACTOR - Clean Up
+
+After green only:
+- Remove duplication
+- Improve names
+- Extract helpers
+
+Keep tests green. Don't add behavior.
+
+### Repeat
+
+Next failing test for next feature.
+
+## Good Tests
+
+| Quality | Good | Bad |
+|---------|------|-----|
+| **Minimal** | One thing. "and" in name? Split it. | `test('validates email and domain and whitespace')` |
+| **Clear** | Name describes behavior | `test('test1')` |
+| **Shows intent** | Demonstrates desired API | Obscures what code should do |
+
+## Why Order Matters
+
+**"I'll write tests after to verify it works"**
+
+Tests written after code pass immediately. Passing immediately proves nothing:
+- Might test wrong thing
+- Might test implementation, not behavior
+- Might miss edge cases you forgot
+- You never saw it catch the bug
+
+Test-first forces you to see the test fail, proving it actually tests something.
+
+**"I already manually tested all the edge cases"**
+
+Manual testing is ad-hoc. You think you tested everything but:
+- No record of what you tested
+- Can't re-run when code changes
+- Easy to forget cases under pressure
+- "It worked when I tried it" ≠ comprehensive
+
+Automated tests are systematic. They run the same way every time.
+
+**"Deleting X hours of work is wasteful"**
+
+Sunk cost fallacy. The time is already gone. Your choice now:
+- Delete and rewrite with TDD (X more hours, high confidence)
+- Keep it and add tests after (30 min, low confidence, likely bugs)
+
+The "waste" is keeping code you can't trust. Working code without real tests is technical debt.
+
+**"TDD is dogmatic, being pragmatic means adapting"**
+
+TDD IS pragmatic:
+- Finds bugs before commit (faster than debugging after)
+- Prevents regressions (tests catch breaks immediately)
+- Documents behavior (tests show how to use code)
+- Enables refactoring (change freely, tests catch breaks)
+
+"Pragmatic" shortcuts = debugging in production = slower.
+
+**"Tests after achieve the same goals - it's spirit not ritual"**
+
+No. Tests-after answer "What does this do?" Tests-first answer "What should this do?"
+
+Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones.
+
+Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't).
+
+30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work.
+
+## Common Rationalizations
+
+| Excuse | Reality |
+|--------|---------|
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+| "I'll test after" | Tests passing immediately prove nothing. |
+| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
+| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
+| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
+| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
+| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
+| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
+| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
+| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
+| "Existing code has no tests" | You're improving it. Add tests for existing code. |
+
+## Red Flags - STOP and Start Over
+
+- Code before test
+- Test after implementation
+- Test passes immediately
+- Can't explain why test failed
+- Tests added "later"
+- Rationalizing "just this once"
+- "I already manually tested it"
+- "Tests after achieve the same purpose"
+- "It's about spirit not ritual"
+- "Keep as reference" or "adapt existing code"
+- "Already spent X hours, deleting is wasteful"
+- "TDD is dogmatic, I'm being pragmatic"
+- "This is different because..."
+
+**All of these mean: Delete code. Start over with TDD.**
+
+## Example: Bug Fix
+
+**Bug:** Empty email accepted
+
+**RED**
+```typescript
+test('rejects empty email', async () => {
+  const result = await submitForm({ email: '' });
+  expect(result.error).toBe('Email required');
+});
+```
+
+**Verify RED**
+```bash
+$ npm test
+FAIL: expected 'Email required', got undefined
+```
+
+**GREEN**
+```typescript
+function submitForm(data: FormData) {
+  if (!data.email?.trim()) {
+    return { error: 'Email required' };
+  }
+  // ...
+}
+```
+
+**Verify GREEN**
+```bash
+$ npm test
+PASS
+```
+
+**REFACTOR**
+Extract validation for multiple fields if needed.
+
+## Verification Checklist
+
+Before marking work complete:
+
+- [ ] Every new function/method has a test
+- [ ] Watched each test fail before implementing
+- [ ] Each test failed for expected reason (feature missing, not typo)
+- [ ] Wrote minimal code to pass each test
+- [ ] All tests pass
+- [ ] Output pristine (no errors, warnings)
+- [ ] Tests use real code (mocks only if unavoidable)
+- [ ] Edge cases and errors covered
+
+Can't check all boxes? You skipped TDD. Start over.
+
+## When Stuck
+
+| Problem | Solution |
+|---------|----------|
+| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
+| Test too complicated | Design too complicated. Simplify interface. |
+| Must mock everything | Code too coupled. Use dependency injection. |
+| Test setup huge | Extract helpers. Still complex? Simplify design. |
+
+## Debugging Integration
+
+Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression.
+
+Never fix bugs without a test.
+
+## Required Patterns
+
+This skill uses these universal patterns:
+- **State Tracking:** See `skills/shared-patterns/state-tracking.md`
+- **Failure Recovery:** See `skills/shared-patterns/failure-recovery.md`
+- **Exit Criteria:** See `skills/shared-patterns/exit-criteria.md`
+- **TodoWrite:** See `skills/shared-patterns/todowrite-integration.md`
+
+Apply ALL patterns when using this skill.
+
+---
+
+## When You Violate This Skill
+
+### Violation: Wrote implementation before test
+
+**How to detect:**
+- Implementation file exists or modified
+- No test file exists yet
+- Git diff shows implementation changed before test
+
+**Recovery procedure:**
+1. Stash or delete the implementation code: `git stash` or `rm [file]`
+2. Write the failing test first
+3. Run test to verify it fails: `npm test` or `pytest`
+4. Rewrite the implementation to make test pass
+
+**Why recovery matters:**
+The test must fail first to prove it actually tests something. If implementation exists first, you can't verify the test works - it might be passing for the wrong reason or not testing anything at all.
+
+---
+
+### Violation: Test passes without implementation (FALSE GREEN)
+
+**How to detect:**
+- Wrote test
+- Test passes immediately
+- Haven't written implementation yet
+
+**Recovery procedure:**
+1. Test is broken - delete or fix it
+2. Make test stricter until it fails
+3. Verify failure shows expected error
+4. Then implement to make it pass
+
+**Why recovery matters:**
+A test that passes without implementation is useless - it's not testing the right thing.
+
+---
+
+### Violation: Kept code "as reference" instead of deleting
+
+**How to detect:**
+- Stashed implementation with `git stash`
+- Moved file to `.bak` or similar
+- Copied to clipboard "just in case"
+- Commented out instead of deleting
+
+**Recovery procedure:**
+1. Find the kept code (stash, backup, clipboard)
+2. Delete it permanently: `git stash drop`, `rm`, clear clipboard
+3. Verify code is truly gone
+4. Start RED phase fresh
+
+**Why recovery matters:**
+Keeping code means you'll adapt it instead of implementing from tests. The whole point is to implement fresh, guided by tests. Delete means delete.
+
+---
+
+## Final Rule
+
+```
+Production code → test exists and failed first
+Otherwise → not TDD
+```
+
+No exceptions without your human partner's permission.