Initial commit

Zhongwei Li
2025-11-30 09:06:38 +08:00
commit ed3e4c84c3
76 changed files with 20449 additions and 0 deletions

---
name: test-driven-development
description: Use when implementing features or fixing bugs - enforces RED-GREEN-REFACTOR cycle requiring tests to fail before writing code
---
<skill_overview>
Write the test first, watch it fail, write minimal code to pass. If you didn't watch the test fail, you don't know if it tests the right thing.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow these exact steps in order. Do not adapt.
Violating the letter of the rules is violating the spirit of the rules.
</rigidity_level>
<quick_reference>
| Phase | Action | Command Example | Expected Result |
|-------|--------|-----------------|-----------------|
| **RED** | Write failing test | `cargo test test_name` | FAIL (feature missing) |
| **Verify RED** | Confirm correct failure | Check error message | "function not found" or assertion fails |
| **GREEN** | Write minimal code | Implement feature | Test passes |
| **Verify GREEN** | All tests pass | `cargo test` | All green, no warnings |
| **REFACTOR** | Clean up code | Improve while green | Tests still pass |
**Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
</quick_reference>
<when_to_use>
**Always use for:**
- New features
- Bug fixes
- Refactoring with behavior changes
- Any production code
**Ask your human partner for exceptions:**
- Throwaway prototypes (will be deleted)
- Generated code
- Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization.
</when_to_use>
<the_process>
## 1. RED - Write Failing Test
Write one minimal test showing what should happen.
**Requirements:**
- Test one behavior only ("and" in the name? Split it - see the example below)
- Clear name describing behavior
- Use real code (no mocks unless unavoidable)
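For example, here is a sketch of splitting an "and" test into two single-behavior tests. `registerUser` and `sentEmails` are hypothetical names used only for illustration; in RED they do not exist yet, so both tests fail:
```typescript
// Too broad - the "and" hides two behaviors:
//   it('creates the user and sends a welcome email', ...)

// Split: one behavior per test (registerUser doesn't exist yet, so both fail in RED)
it('creates a user record for a valid registration', () => {
  const user = registerUser({ email: 'user@example.com' });
  expect(user.email).toBe('user@example.com');
});

it('queues a welcome email after registration', () => {
  registerUser({ email: 'user@example.com' });
  expect(sentEmails()).toContain('user@example.com');
});
```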
See [resources/language-examples.md](resources/language-examples.md) for Rust, Swift, TypeScript examples.
## 2. Verify RED - Watch It Fail
**MANDATORY. Never skip.**
Run the test and confirm:
- ✓ Test **fails** (rather than erroring out on a syntax issue)
- ✓ Failure message is expected ("function not found" or assertion fails)
- ✓ Fails because feature missing (not typos)
**If test passes:** You're testing existing behavior. Fix the test.
**If test errors:** Fix syntax error, re-run until it fails correctly.
## 3. GREEN - Write Minimal Code
Write simplest code to pass the test. Nothing more.
**Key principle:** Don't add features the test doesn't require. Don't refactor other code. Don't "improve" beyond the test.
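For instance, if the only failing test is `expect(slugify('Hello World')).toBe('hello-world')`, minimal GREEN code handles exactly that case and nothing else (hypothetical `slugify`, TypeScript sketch):
```typescript
// Minimal GREEN: just enough to pass the one failing test
function slugify(title: string): string {
  return title.toLowerCase().replace(/\s+/g, '-');
}

// Not yet: trimming, punctuation stripping, unicode handling.
// Each of those waits for its own failing test.
```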
## 4. Verify GREEN - Watch It Pass
**MANDATORY.**
Run tests and confirm:
- ✓ New test passes
- ✓ All other tests still pass
- ✓ No errors or warnings
**If test fails:** Fix code, not test.
**If other tests fail:** Fix now before proceeding.
## 5. REFACTOR - Clean Up
**Only after green:**
- Remove duplication
- Improve names
- Extract helpers
Keep tests green. Don't add behavior.
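A small sketch of a refactor that stays green, assuming two hypothetical validators with duplicated logic (TypeScript):
```typescript
// Before (duplication):
//   validateEmail: return email.trim() === '' ? 'Email required' : null
//   validateName:  return name.trim() === ''  ? 'Name required'  : null

// After: extract a helper; behavior is unchanged, so existing tests stay green
function requireNonEmpty(value: string, message: string): string | null {
  return value.trim() === '' ? message : null;
}

function validateEmail(email: string): string | null {
  return requireNonEmpty(email, 'Email required');
}

function validateName(name: string): string | null {
  return requireNonEmpty(name, 'Name required');
}
```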
## 6. Repeat
Next failing test for next feature.
</the_process>
<examples>
<example>
<scenario>Developer writes implementation first, then adds test that passes immediately</scenario>
<code>
# Code written FIRST
def validate_email(email):
    return "@" in email  # Bug: accepts "@@"

# Test written AFTER
def test_validate_email():
    assert validate_email("user@example.com")  # Passes immediately!
    # Missing edge case: assert not validate_email("@@")
</code>
<why_it_fails>
When test passes immediately:
- Never proved the test catches bugs
- Only tested happy path you remembered
- Forgot edge cases (like "@@")
- Bug ships to production
Tests written after the fact verify the cases you remembered, not the behavior that is actually required.
</why_it_fails>
<correction>
**TDD approach:**
1. **RED** - Write test first (including edge case):
```python
def test_validate_email():
assert validate_email("user@example.com") # Will fail - function doesn't exist
assert not validate_email("@@") # Edge case up front
```
2. **Verify RED** - Run test, watch it fail:
```bash
NameError: name 'validate_email' is not defined
```
3. **GREEN** - Implement to pass both cases:
```python
def validate_email(email):
return "@" in email and email.count("@") == 1
```
4. **Verify GREEN** - Both assertions pass, bug prevented.
**Result:** Test failed first, proving it works. Edge case discovered during test writing, not in production.
</correction>
</example>
<example>
<scenario>Developer has already written 3 hours of code without tests. Wants to keep it as "reference" while writing tests.</scenario>
<code>
// 200 lines of untested code exists
// Developer thinks: "I'll keep this and write tests that match it"
// Or: "I'll use it as reference to speed up TDD"
</code>
<why_it_fails>
**Keeping code as "reference":**
- You'll copy it (that's testing after, with extra steps)
- You'll adapt it (biased by implementation)
- Tests will match code, not requirements
- You'll justify shortcuts: "I already know this works"
**Result:** All the problems of test-after, none of the benefits of TDD.
</why_it_fails>
<correction>
**Delete it. Completely.**
```bash
git stash # Or delete the file
```
**Then start TDD:**
1. Write first failing test from requirements (not from code)
2. Watch it fail
3. Implement fresh (might be different from original, that's OK)
4. Watch it pass
**Why delete:**
- Sunk cost is already gone
- 3 hours implementing ≠ 3 hours with TDD (TDD might be 2 hours total)
- Code without tests is technical debt
- Fresh implementation from tests is usually better
**What you gain:**
- Tests that actually verify behavior
- Confidence code works
- Ability to refactor safely
- No bugs from untested edge cases
</correction>
</example>
<example>
<scenario>Test is hard to write. Developer thinks "design must be unclear, but I'll implement first to explore."</scenario>
<code>
// Test attempt:
func testUserServiceCreatesAccount() {
// Need to mock database, email service, payment gateway, logger...
// This is getting complicated, maybe I should just implement first
}
</code>
<why_it_fails>
**"Test is hard" is valuable signal:**
- Hard to test = hard to use
- Too many dependencies = coupling too tight
- Complex setup = design needs simplification
**Implementing first ignores this signal:**
- Build the complex design
- Lock in the coupling
- Now forced to write complex tests (or skip them)
</why_it_fails>
<correction>
**Listen to the test.**
Hard to test? Simplify the interface:
```swift
// Instead of:
class UserService {
init(db: Database, email: EmailService, payments: PaymentGateway, logger: Logger) { }
func createAccount(email: String, password: String, paymentToken: String) throws { }
}
// Make testable:
class UserService {
func createAccount(request: CreateAccountRequest) -> Result<Account, Error> {
// Dependencies injected through request or passed separately
}
}
```
**Test becomes simple:**
```swift
func testCreatesAccountFromRequest() throws {
    let service = UserService()
    let request = CreateAccountRequest(email: "user@example.com")
    let account = try service.createAccount(request: request).get()
    XCTAssertEqual(account.email, "user@example.com")
}
```
**TDD forces good design.** If test is hard, fix design before implementing.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Write code before test?** → Delete it. Start over.
- Never keep as "reference"
- Never "adapt" while writing tests
- Delete means delete
2. **Test passes immediately?** → Not TDD. Fix the test or delete the code.
- Passing immediately proves nothing
- You're testing existing behavior, not required behavior
3. **Can't explain why test failed?** → Fix until failure makes sense.
- "function not found" = good (feature doesn't exist)
- Weird error = bad (fix test, re-run)
4. **Want to skip "just this once"?** → That's rationalization. Stop.
- TDD is faster than debugging in production
- "Too simple to test" = test takes 30 seconds
- "Already manually tested" = not systematic, not repeatable
## Common Excuses
Each of these means the same thing: stop and follow TDD:
- "This is different because..."
- "I'm being pragmatic, not dogmatic"
- "It's about spirit not ritual"
- "Tests after achieve the same goals"
- "Deleting X hours of work is wasteful"
</critical_rules>
<verification_checklist>
Before marking work complete:
- [ ] Every new function/method has a test
- [ ] Watched each test **fail** before implementing
- [ ] Each test failed for expected reason (feature missing, not typo)
- [ ] Wrote minimal code to pass each test
- [ ] All tests pass with no warnings
- [ ] Tests use real code (mocks only if unavoidable)
- [ ] Edge cases and errors covered
**Can't check all boxes?** You skipped TDD. Start over.
</verification_checklist>
<integration>
**This skill calls:**
- verification-before-completion (running tests to verify)
**This skill is called by:**
- fixing-bugs (write failing test reproducing bug)
- executing-plans (when implementing bd tasks)
- refactoring-safely (keep tests green while refactoring)
**Agents used:**
- hyperpowers:test-runner (run tests, return summary only)
</integration>
<resources>
**Detailed language-specific examples:**
- [Rust, Swift, TypeScript examples](resources/language-examples.md) - Complete RED-GREEN-REFACTOR cycles
- [Language-specific test commands](resources/language-examples.md#verification-commands-by-language)
**When stuck:**
- Test too complicated? → Design too complicated, simplify interface
- Must mock everything? → Code too coupled, use dependency injection (see the sketch below)
- Test setup huge? → Extract helpers, or simplify design
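As a sketch of the dependency-injection case: a hypothetical `isExpired` function takes a narrow `Clock` dependency instead of reading global time, so the test needs only an inline stub rather than a mocking framework (TypeScript):
```typescript
interface Clock {
  now(): Date;
}

function isExpired(expiresAt: Date, clock: Clock): boolean {
  return clock.now().getTime() > expiresAt.getTime();
}

// Test needs no mocking library - just a plain object with a fixed time
const fixedClock: Clock = { now: () => new Date('2025-01-01T00:00:00Z') };
// expect(isExpired(new Date('2024-12-31T00:00:00Z'), fixedClock)).toBe(true);
```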
</resources>

# TDD Workflow Examples
This guide shows complete TDD workflows for common scenarios: bug fixes, feature additions, and refactoring with existing tests.
## Example: Bug Fix
### Bug
Empty email is accepted when it should be rejected.
### RED Phase: Write Failing Test
**Swift:**
```swift
func testRejectsEmptyEmail() async throws {
let result = try await submitForm(FormData(email: ""))
XCTAssertEqual(result.error, "Email required")
}
```
### Verify RED: Watch It Fail
```bash
$ swift test --filter FormTests.testRejectsEmptyEmail
FAIL: XCTAssertEqual failed: ("nil") is not equal to ("Optional("Email required")")
```
**Confirms:**
- Test fails (not errors)
- Failure message shows email not being validated
- Fails because feature missing (not typos)
### GREEN Phase: Minimal Code
```swift
struct FormResult {
var error: String?
}
func submitForm(_ data: FormData) async throws -> FormResult {
if data.email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return FormResult(error: "Email required")
}
// ... rest of form processing
return FormResult()
}
```
### Verify GREEN: Watch It Pass
```bash
$ swift test --filter FormTests.testRejectsEmptyEmail
Test Case '-[FormTests testRejectsEmptyEmail]' passed
```
**Confirms:**
- Test passes
- Other tests still pass
- No errors or warnings
### REFACTOR: Clean Up
If multiple fields need validation:
```swift
extension FormData {
func validate() -> String? {
if email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return "Email required"
}
// Add other validations...
return nil
}
}
func submitForm(_ data: FormData) async throws -> FormResult {
if let error = data.validate() {
return FormResult(error: error)
}
// ... rest of form processing
return FormResult()
}
```
Run tests again to confirm still green.
---
## Example: Feature Addition
### Feature
Calculate average of non-empty list.
### RED Phase: Write Failing Test
**TypeScript:**
```typescript
describe('average', () => {
it('calculates average of non-empty list', () => {
expect(average([1, 2, 3])).toBe(2);
});
});
```
### Verify RED: Watch It Fail
```bash
$ npm test -- --testNamePattern="average"
FAIL: ReferenceError: average is not defined
```
**Confirms:**
- Function doesn't exist yet
- Test will verify the behavior once the function exists
### GREEN Phase: Minimal Code
```typescript
function average(numbers: number[]): number {
const sum = numbers.reduce((acc, n) => acc + n, 0);
return sum / numbers.length;
}
```
### Verify GREEN: Watch It Pass
```bash
$ npm test -- --testNamePattern="average"
PASS: calculates average of non-empty list
```
### Add Edge Case: Empty List
**RED:**
```typescript
it('returns 0 for empty list', () => {
expect(average([])).toBe(0);
});
```
**Verify RED:**
```bash
$ npm test -- --testNamePattern="average.*empty"
FAIL: Expected: 0, Received: NaN
```
**GREEN:**
```typescript
function average(numbers: number[]): number {
if (numbers.length === 0) return 0;
const sum = numbers.reduce((acc, n) => acc + n, 0);
return sum / numbers.length;
}
```
**Verify GREEN:**
```bash
$ npm test -- --testNamePattern="average"
PASS: 2 tests passed
```
### REFACTOR: Clean Up
No duplication or unclear naming, so no refactoring needed. Move to next feature.
---
## Example: Refactoring with Tests
### Scenario
Existing function works but is hard to read. Tests exist and pass.
### Current Code
```rust
fn process(data: Vec<i32>) -> i32 {
let mut result = 0;
for item in data {
if item > 0 {
result += item * 2;
}
}
result
}
```
### Existing Tests (Already Green)
```rust
#[test]
fn processes_positive_numbers() {
assert_eq!(process(vec![1, 2, 3]), 12); // (1*2) + (2*2) + (3*2) = 12
}
#[test]
fn ignores_negative_numbers() {
assert_eq!(process(vec![1, -2, 3]), 8); // (1*2) + (3*2) = 8
}
#[test]
fn handles_empty_list() {
assert_eq!(process(vec![]), 0);
}
```
### REFACTOR: Improve Clarity
```rust
fn process(data: Vec<i32>) -> i32 {
data.iter()
.filter(|&&n| n > 0)
.map(|&n| n * 2)
.sum()
}
```
### Verify Still Green
```bash
$ cargo test
running 3 tests
test processes_positive_numbers ... ok
test ignores_negative_numbers ... ok
test handles_empty_list ... ok
test result: ok. 3 passed; 0 failed
```
**Key:** Tests prove refactoring didn't break behavior.
---
## Common Patterns
### Pattern: Adding Validation
1. **RED:** Test that invalid input is rejected
2. **Verify RED:** Confirm invalid input currently accepted
3. **GREEN:** Add validation check
4. **Verify GREEN:** Confirm validation works
5. **REFACTOR:** Extract validation if reusable
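A minimal TypeScript sketch of this pattern, using a hypothetical `createUser` function and error message:
```typescript
// RED: assert that invalid input is rejected (fails while createUser accepts anything)
it('rejects a negative age', () => {
  expect(() => createUser({ name: 'Ada', age: -1 })).toThrow('age must be >= 0');
});

// GREEN: add only the check the test demands
function createUser(input: { name: string; age: number }): { name: string; age: number } {
  if (input.age < 0) {
    throw new Error('age must be >= 0');
  }
  return { ...input };
}
```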
### Pattern: Adding Error Handling
1. **RED:** Test that error condition is caught
2. **Verify RED:** Confirm error currently unhandled
3. **GREEN:** Add error handling
4. **Verify GREEN:** Confirm error handled correctly
5. **REFACTOR:** Consolidate error handling if duplicated
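A matching sketch for error handling, with a hypothetical `parseConfig` that should fall back to a default on bad input (TypeScript):
```typescript
// RED: the error is currently unhandled, so this test fails with a thrown SyntaxError
it('returns the default config when the contents are not valid JSON', () => {
  expect(parseConfig('{not json')).toEqual({ retries: 3 });
});

// GREEN: catch the specific failure and fall back
function parseConfig(raw: string): { retries: number } {
  try {
    return JSON.parse(raw);
  } catch {
    return { retries: 3 };
  }
}
```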
### Pattern: Optimizing Performance
1. **Ensure tests exist and pass** (if not, add tests first)
2. **REFACTOR:** Optimize implementation
3. **Verify GREEN:** Confirm tests still pass
4. **Measure:** Confirm performance improved
**Note:** Never optimize without tests. You can't prove optimization didn't break behavior.
---
## Workflow Checklist
### For Each New Feature
- [ ] Write one failing test
- [ ] Run test, confirm it fails correctly
- [ ] Write minimal code to pass
- [ ] Run test, confirm it passes
- [ ] Run all tests, confirm no regressions
- [ ] Refactor if needed (staying green)
- [ ] Commit
### For Each Bug Fix
- [ ] Write test reproducing the bug
- [ ] Run test, confirm it fails (reproduces bug)
- [ ] Fix the bug (minimal change)
- [ ] Run test, confirm it passes (bug fixed)
- [ ] Run all tests, confirm no regressions
- [ ] Commit
### For Each Refactoring
- [ ] Confirm tests exist and pass
- [ ] Make one small refactoring change
- [ ] Run tests, confirm still green
- [ ] Repeat until refactoring complete
- [ ] Commit
---
## Anti-Patterns to Avoid
### ❌ Writing Multiple Tests Before Implementing
**Why bad:** You can't tell which failing test your change addresses, and you lose the one-test-at-a-time feedback loop. Write one, implement, repeat.
### ❌ Changing Test to Make It Pass
**Why bad:** Test should define correct behavior. If test is wrong, fix test first, then re-run RED phase.
### ❌ Adding Features Not Covered by Tests
**Why bad:** Untested code. If you need a feature, write test first.
### ❌ Skipping RED Verification
**Why bad:** Test might pass immediately, meaning it doesn't test anything new.
### ❌ Skipping GREEN Verification
**Why bad:** Test might fail for unexpected reason. Always verify expected pass.
---
## Remember
- **One test at a time:** Write test, implement, repeat
- **Watch it fail:** Proves test actually tests something
- **Watch it pass:** Proves implementation works
- **Stay green:** All tests pass before moving on
- **Refactor freely:** Tests catch breaks

# TDD Language-Specific Examples
This guide provides concrete TDD examples in multiple programming languages, showing the RED-GREEN-REFACTOR cycle.
## RED Phase Examples
Write one minimal test showing what should happen.
### Rust
```rust
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn retries_failed_operations_3_times() {
let mut attempts = 0;
let operation = || -> Result<&str, &str> {
attempts += 1;
if attempts < 3 {
Err("fail")
} else {
Ok("success")
}
};
let result = retry_operation(operation);
assert_eq!(result, Ok("success"));
assert_eq!(attempts, 3);
}
}
```
**Running the test:**
```bash
cargo test tests::retries_failed_operations_3_times
```
### Swift
```swift
func testRetriesFailedOperations3Times() async throws {
    var attempts = 0
    let operation: () async throws -> String = {
        attempts += 1
        if attempts < 3 {
            throw RetryError.failed
        }
        return "success"
    }
    let result = try await retryOperation(operation)
    XCTAssertEqual(result, "success")
    XCTAssertEqual(attempts, 3)
}
```
**Running the test:**
```bash
swift test --filter RetryTests.testRetriesFailedOperations3Times
```
### TypeScript
```typescript
describe('retryOperation', () => {
  it('retries failed operations 3 times', async () => {
    let attempts = 0;
    const operation = async () => {
      attempts++;
      if (attempts < 3) {
        throw new Error('fail');
      }
      return 'success';
    };
    const result = await retryOperation(operation);
    expect(result).toBe('success');
    expect(attempts).toBe(3);
  });
});
```
**Running the test (Jest):**
```bash
npm test -- --testNamePattern="retries failed operations"
```
**Running the test (Vitest):**
```bash
npm test -- -t "retries failed operations"
```
### Why These Are Good
- Clear names describing the behavior
- Test real behavior, not mocks
- One thing per test
- Shows desired API
### Bad Example
```typescript
test('retry', () => {
let mockCalls = 0;
const mock = () => {
mockCalls++;
return 'success';
};
retryOperation(mock);
expect(mockCalls).toBe(1); // Tests mock, not behavior
});
```
**Why this is bad:**
- Vague name
- Tests mock behavior, not real retry logic
## GREEN Phase Examples
Write simplest code to pass the test.
### Rust
```rust
fn retry_operation<F, T, E>(mut operation: F) -> Result<T, E>
where
F: FnMut() -> Result<T, E>,
{
for i in 0..3 {
match operation() {
Ok(result) => return Ok(result),
Err(e) => {
if i == 2 {
return Err(e);
}
}
}
}
unreachable!()
}
```
### Swift
```swift
func retryOperation<T>(_ operation: () async throws -> T) async throws -> T {
var lastError: Error?
for attempt in 0..<3 {
do {
return try await operation()
} catch {
lastError = error
if attempt == 2 {
throw error
}
}
}
throw lastError!
}
```
### TypeScript
```typescript
async function retryOperation<T>(
operation: () => Promise<T>
): Promise<T> {
let lastError: Error | undefined;
for (let i = 0; i < 3; i++) {
try {
return await operation();
} catch (error) {
lastError = error as Error;
if (i === 2) {
throw error;
}
}
}
throw lastError;
}
```
### Bad Example - Over-engineered (YAGNI)
```typescript
async function retryOperation<T>(
operation: () => Promise<T>,
options: {
maxRetries?: number;
backoff?: 'linear' | 'exponential';
onRetry?: (attempt: number) => void;
shouldRetry?: (error: Error) => boolean;
} = {}
): Promise<T> {
// Don't add features the test doesn't require!
}
```
**Why this is bad:** Test only requires 3 retries. Don't add:
- Configurable retries
- Backoff strategies
- Callbacks
- Error filtering
...until a test requires them.
## Test Requirements
**Every test should:**
- Test one behavior
- Have a clear name
- Use real code (no mocks unless unavoidable)
## Verification Commands by Language
### Rust
```bash
# Single test
cargo test tests::test_name
# All tests
cargo test
# With output
cargo test -- --nocapture
```
### Swift
```bash
# Single test
swift test --filter TestClass.testName
# All tests
swift test
# With output
swift test --verbose
```
### TypeScript (Jest)
```bash
# Single test
npm test -- --testNamePattern="test name"
# All tests
npm test
# With coverage
npm test -- --coverage
```
### TypeScript (Vitest)
```bash
# Single test
npm test -- -t "test name"
# All tests
npm test
# With coverage
npm test -- --coverage
```