Initial commit

Zhongwei Li
2025-11-30 09:06:38 +08:00
commit ed3e4c84c3
76 changed files with 20449 additions and 0 deletions

---
name: test-driven-development
description: Use when implementing features or fixing bugs - enforces RED-GREEN-REFACTOR cycle requiring tests to fail before writing code
---
<skill_overview>
Write the test first, watch it fail, write minimal code to pass. If you didn't watch the test fail, you don't know if it tests the right thing.
</skill_overview>
<rigidity_level>
LOW FREEDOM - Follow these exact steps in order. Do not adapt.
Violating the letter of the rules is violating the spirit of the rules.
</rigidity_level>
<quick_reference>
| Phase | Action | Command Example | Expected Result |
|-------|--------|-----------------|-----------------|
| **RED** | Write failing test | `cargo test test_name` | FAIL (feature missing) |
| **Verify RED** | Confirm correct failure | Check error message | "function not found" or assertion fails |
| **GREEN** | Write minimal code | Implement feature | Test passes |
| **Verify GREEN** | All tests pass | `cargo test` | All green, no warnings |
| **REFACTOR** | Clean up code | Improve while green | Tests still pass |
**Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
</quick_reference>
<when_to_use>
**Always use for:**
- New features
- Bug fixes
- Refactoring with behavior changes
- Any production code
**Ask your human partner for exceptions:**
- Throwaway prototypes (will be deleted)
- Generated code
- Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization.
</when_to_use>
<the_process>
## 1. RED - Write Failing Test
Write one minimal test showing what should happen.
**Requirements:**
- Test one behavior only ("and" in the name? Split it - see the example below)
- Clear name describing behavior
- Use real code (no mocks unless unavoidable)
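For example, here is a sketch of splitting an "and" test into two single-behavior tests. `registerUser` and `sentEmails` are hypothetical names used only for illustration; in RED they do not exist yet, so both tests fail:
```typescript
// Too broad - the "and" hides two behaviors:
//   it('creates the user and sends a welcome email', ...)

// Split: one behavior per test (registerUser doesn't exist yet, so both fail in RED)
it('creates a user record for a valid registration', () => {
  const user = registerUser({ email: 'user@example.com' });
  expect(user.email).toBe('user@example.com');
});

it('queues a welcome email after registration', () => {
  registerUser({ email: 'user@example.com' });
  expect(sentEmails()).toContain('user@example.com');
});
```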
See [resources/language-examples.md](resources/language-examples.md) for Rust, Swift, TypeScript examples.
## 2. Verify RED - Watch It Fail
**MANDATORY. Never skip.**
Run the test and confirm:
- ✓ Test **fails** (rather than erroring out on a syntax issue)
- ✓ Failure message is expected ("function not found" or assertion fails)
- ✓ Fails because feature missing (not typos)
**If test passes:** You're testing existing behavior. Fix the test.
**If test errors:** Fix syntax error, re-run until it fails correctly.
## 3. GREEN - Write Minimal Code
Write simplest code to pass the test. Nothing more.
**Key principle:** Don't add features the test doesn't require. Don't refactor other code. Don't "improve" beyond the test.
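For instance, if the only failing test is `expect(slugify('Hello World')).toBe('hello-world')`, minimal GREEN code handles exactly that case and nothing else (hypothetical `slugify`, TypeScript sketch):
```typescript
// Minimal GREEN: just enough to pass the one failing test
function slugify(title: string): string {
  return title.toLowerCase().replace(/\s+/g, '-');
}

// Not yet: trimming, punctuation stripping, unicode handling.
// Each of those waits for its own failing test.
```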
## 4. Verify GREEN - Watch It Pass
**MANDATORY.**
Run tests and confirm:
- ✓ New test passes
- ✓ All other tests still pass
- ✓ No errors or warnings
**If test fails:** Fix code, not test.
**If other tests fail:** Fix now before proceeding.
## 5. REFACTOR - Clean Up
**Only after green:**
- Remove duplication
- Improve names
- Extract helpers
Keep tests green. Don't add behavior.
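A small sketch of a refactor that stays green, assuming two hypothetical validators with duplicated logic (TypeScript):
```typescript
// Before (duplication):
//   validateEmail: return email.trim() === '' ? 'Email required' : null
//   validateName:  return name.trim() === ''  ? 'Name required'  : null

// After: extract a helper; behavior is unchanged, so existing tests stay green
function requireNonEmpty(value: string, message: string): string | null {
  return value.trim() === '' ? message : null;
}

function validateEmail(email: string): string | null {
  return requireNonEmpty(email, 'Email required');
}

function validateName(name: string): string | null {
  return requireNonEmpty(name, 'Name required');
}
```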
## 6. Repeat
Next failing test for next feature.
</the_process>
<examples>
<example>
<scenario>Developer writes implementation first, then adds test that passes immediately</scenario>
<code>
# Code written FIRST
def validate_email(email):
    return "@" in email  # Bug: accepts "@@"

# Test written AFTER
def test_validate_email():
    assert validate_email("user@example.com")  # Passes immediately!
    # Missing edge case: assert not validate_email("@@")
</code>
<why_it_fails>
When test passes immediately:
- Never proved the test catches bugs
- Only tested happy path you remembered
- Forgot edge cases (like "@@")
- Bug ships to production
Tests written after the fact verify the cases you remembered, not the behavior that is actually required.
</why_it_fails>
<correction>
**TDD approach:**
1. **RED** - Write test first (including edge case):
```python
def test_validate_email():
assert validate_email("user@example.com") # Will fail - function doesn't exist
assert not validate_email("@@") # Edge case up front
```
2. **Verify RED** - Run test, watch it fail:
```bash
NameError: name 'validate_email' is not defined
```
3. **GREEN** - Implement to pass both cases:
```python
def validate_email(email):
return "@" in email and email.count("@") == 1
```
4. **Verify GREEN** - Both assertions pass, bug prevented.
**Result:** Test failed first, proving it works. Edge case discovered during test writing, not in production.
</correction>
</example>
<example>
<scenario>Developer has already written 3 hours of code without tests. Wants to keep it as "reference" while writing tests.</scenario>
<code>
// 200 lines of untested code exists
// Developer thinks: "I'll keep this and write tests that match it"
// Or: "I'll use it as reference to speed up TDD"
</code>
<why_it_fails>
**Keeping code as "reference":**
- You'll copy it (that's testing after, with extra steps)
- You'll adapt it (biased by implementation)
- Tests will match code, not requirements
- You'll justify shortcuts: "I already know this works"
**Result:** All the problems of test-after, none of the benefits of TDD.
</why_it_fails>
<correction>
**Delete it. Completely.**
```bash
git stash # Or delete the file
```
**Then start TDD:**
1. Write first failing test from requirements (not from code)
2. Watch it fail
3. Implement fresh (might be different from original, that's OK)
4. Watch it pass
**Why delete:**
- Sunk cost is already gone
- 3 hours implementing ≠ 3 hours with TDD (TDD might be 2 hours total)
- Code without tests is technical debt
- Fresh implementation from tests is usually better
**What you gain:**
- Tests that actually verify behavior
- Confidence code works
- Ability to refactor safely
- No bugs from untested edge cases
</correction>
</example>
<example>
<scenario>Test is hard to write. Developer thinks "design must be unclear, but I'll implement first to explore."</scenario>
<code>
// Test attempt:
func testUserServiceCreatesAccount() {
// Need to mock database, email service, payment gateway, logger...
// This is getting complicated, maybe I should just implement first
}
</code>
<why_it_fails>
**"Test is hard" is valuable signal:**
- Hard to test = hard to use
- Too many dependencies = coupling too tight
- Complex setup = design needs simplification
**Implementing first ignores this signal:**
- Build the complex design
- Lock in the coupling
- Now forced to write complex tests (or skip them)
</why_it_fails>
<correction>
**Listen to the test.**
Hard to test? Simplify the interface:
```swift
// Instead of:
class UserService {
init(db: Database, email: EmailService, payments: PaymentGateway, logger: Logger) { }
func createAccount(email: String, password: String, paymentToken: String) throws { }
}
// Make testable:
class UserService {
func createAccount(request: CreateAccountRequest) -> Result<Account, Error> {
// Dependencies injected through request or passed separately
}
}
```
**Test becomes simple:**
```swift
func testCreatesAccountFromRequest() throws {
    let service = UserService()
    let request = CreateAccountRequest(email: "user@example.com")
    let account = try service.createAccount(request: request).get()
    XCTAssertEqual(account.email, "user@example.com")
}
```
**TDD forces good design.** If test is hard, fix design before implementing.
</correction>
</example>
</examples>
<critical_rules>
## Rules That Have No Exceptions
1. **Write code before test?** → Delete it. Start over.
- Never keep as "reference"
- Never "adapt" while writing tests
- Delete means delete
2. **Test passes immediately?** → Not TDD. Fix the test or delete the code.
- Passing immediately proves nothing
- You're testing existing behavior, not required behavior
3. **Can't explain why test failed?** → Fix until failure makes sense.
- "function not found" = good (feature doesn't exist)
- Weird error = bad (fix test, re-run)
4. **Want to skip "just this once"?** → That's rationalization. Stop.
- TDD is faster than debugging in production
- "Too simple to test" = test takes 30 seconds
- "Already manually tested" = not systematic, not repeatable
## Common Excuses
Each of these means the same thing: stop and follow TDD:
- "This is different because..."
- "I'm being pragmatic, not dogmatic"
- "It's about spirit not ritual"
- "Tests after achieve the same goals"
- "Deleting X hours of work is wasteful"
</critical_rules>
<verification_checklist>
Before marking work complete:
- [ ] Every new function/method has a test
- [ ] Watched each test **fail** before implementing
- [ ] Each test failed for expected reason (feature missing, not typo)
- [ ] Wrote minimal code to pass each test
- [ ] All tests pass with no warnings
- [ ] Tests use real code (mocks only if unavoidable)
- [ ] Edge cases and errors covered
**Can't check all boxes?** You skipped TDD. Start over.
</verification_checklist>
<integration>
**This skill calls:**
- verification-before-completion (running tests to verify)
**This skill is called by:**
- fixing-bugs (write failing test reproducing bug)
- executing-plans (when implementing bd tasks)
- refactoring-safely (keep tests green while refactoring)
**Agents used:**
- hyperpowers:test-runner (run tests, return summary only)
</integration>
<resources>
**Detailed language-specific examples:**
- [Rust, Swift, TypeScript examples](resources/language-examples.md) - Complete RED-GREEN-REFACTOR cycles
- [Language-specific test commands](resources/language-examples.md#verification-commands-by-language)
**When stuck:**
- Test too complicated? → Design too complicated, simplify interface
- Must mock everything? → Code too coupled, use dependency injection (see the sketch below)
- Test setup huge? → Extract helpers, or simplify design
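As a sketch of the dependency-injection case: a hypothetical `isExpired` function takes a narrow `Clock` dependency instead of reading global time, so the test needs only an inline stub rather than a mocking framework (TypeScript):
```typescript
interface Clock {
  now(): Date;
}

function isExpired(expiresAt: Date, clock: Clock): boolean {
  return clock.now().getTime() > expiresAt.getTime();
}

// Test needs no mocking library - just a plain object with a fixed time
const fixedClock: Clock = { now: () => new Date('2025-01-01T00:00:00Z') };
// expect(isExpired(new Date('2024-12-31T00:00:00Z'), fixedClock)).toBe(true);
```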
</resources>

# TDD Workflow Examples
This guide shows complete TDD workflows for common scenarios: bug fixes, feature additions, and refactoring with existing tests.
## Example: Bug Fix
### Bug
Empty email is accepted when it should be rejected.
### RED Phase: Write Failing Test
**Swift:**
```swift
func testRejectsEmptyEmail() async throws {
let result = try await submitForm(FormData(email: ""))
XCTAssertEqual(result.error, "Email required")
}
```
### Verify RED: Watch It Fail
```bash
$ swift test --filter FormTests.testRejectsEmptyEmail
FAIL: XCTAssertEqual failed: ("nil") is not equal to ("Optional("Email required")")
```
**Confirms:**
- Test fails (not errors)
- Failure message shows email not being validated
- Fails because feature missing (not typos)
### GREEN Phase: Minimal Code
```swift
struct FormResult {
var error: String?
}
func submitForm(_ data: FormData) async throws -> FormResult {
if data.email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return FormResult(error: "Email required")
}
// ... rest of form processing
return FormResult()
}
```
### Verify GREEN: Watch It Pass
```bash
$ swift test --filter FormTests.testRejectsEmptyEmail
Test Case '-[FormTests testRejectsEmptyEmail]' passed
```
**Confirms:**
- Test passes
- Other tests still pass
- No errors or warnings
### REFACTOR: Clean Up
If multiple fields need validation:
```swift
extension FormData {
func validate() -> String? {
if email.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return "Email required"
}
// Add other validations...
return nil
}
}
func submitForm(_ data: FormData) async throws -> FormResult {
if let error = data.validate() {
return FormResult(error: error)
}
// ... rest of form processing
return FormResult()
}
```
Run tests again to confirm still green.
---
## Example: Feature Addition
### Feature
Calculate average of non-empty list.
### RED Phase: Write Failing Test
**TypeScript:**
```typescript
describe('average', () => {
it('calculates average of non-empty list', () => {
expect(average([1, 2, 3])).toBe(2);
});
});
```
### Verify RED: Watch It Fail
```bash
$ npm test -- --testNamePattern="average"
FAIL: ReferenceError: average is not defined
```
**Confirms:**
- Function doesn't exist yet
- Test will verify the behavior once the function exists
### GREEN Phase: Minimal Code
```typescript
function average(numbers: number[]): number {
const sum = numbers.reduce((acc, n) => acc + n, 0);
return sum / numbers.length;
}
```
### Verify GREEN: Watch It Pass
```bash
$ npm test -- --testNamePattern="average"
PASS: calculates average of non-empty list
```
### Add Edge Case: Empty List
**RED:**
```typescript
it('returns 0 for empty list', () => {
expect(average([])).toBe(0);
});
```
**Verify RED:**
```bash
$ npm test -- --testNamePattern="average.*empty"
FAIL: Expected: 0, Received: NaN
```
**GREEN:**
```typescript
function average(numbers: number[]): number {
if (numbers.length === 0) return 0;
const sum = numbers.reduce((acc, n) => acc + n, 0);
return sum / numbers.length;
}
```
**Verify GREEN:**
```bash
$ npm test -- --testNamePattern="average"
PASS: 2 tests passed
```
### REFACTOR: Clean Up
No duplication or unclear naming, so no refactoring needed. Move to next feature.
---
## Example: Refactoring with Tests
### Scenario
Existing function works but is hard to read. Tests exist and pass.
### Current Code
```rust
fn process(data: Vec<i32>) -> i32 {
let mut result = 0;
for item in data {
if item > 0 {
result += item * 2;
}
}
result
}
```
### Existing Tests (Already Green)
```rust
#[test]
fn processes_positive_numbers() {
assert_eq!(process(vec![1, 2, 3]), 12); // (1*2) + (2*2) + (3*2) = 12
}
#[test]
fn ignores_negative_numbers() {
assert_eq!(process(vec![1, -2, 3]), 8); // (1*2) + (3*2) = 8
}
#[test]
fn handles_empty_list() {
assert_eq!(process(vec![]), 0);
}
```
### REFACTOR: Improve Clarity
```rust
fn process(data: Vec<i32>) -> i32 {
data.iter()
.filter(|&&n| n > 0)
.map(|&n| n * 2)
.sum()
}
```
### Verify Still Green
```bash
$ cargo test
running 3 tests
test processes_positive_numbers ... ok
test ignores_negative_numbers ... ok
test handles_empty_list ... ok
test result: ok. 3 passed; 0 failed
```
**Key:** Tests prove refactoring didn't break behavior.
---
## Common Patterns
### Pattern: Adding Validation
1. **RED:** Test that invalid input is rejected
2. **Verify RED:** Confirm invalid input currently accepted
3. **GREEN:** Add validation check
4. **Verify GREEN:** Confirm validation works
5. **REFACTOR:** Extract validation if reusable
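A minimal TypeScript sketch of this pattern, using a hypothetical `createUser` function and error message:
```typescript
// RED: assert that invalid input is rejected (fails while createUser accepts anything)
it('rejects a negative age', () => {
  expect(() => createUser({ name: 'Ada', age: -1 })).toThrow('age must be >= 0');
});

// GREEN: add only the check the test demands
function createUser(input: { name: string; age: number }): { name: string; age: number } {
  if (input.age < 0) {
    throw new Error('age must be >= 0');
  }
  return { ...input };
}
```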
### Pattern: Adding Error Handling
1. **RED:** Test that error condition is caught
2. **Verify RED:** Confirm error currently unhandled
3. **GREEN:** Add error handling
4. **Verify GREEN:** Confirm error handled correctly
5. **REFACTOR:** Consolidate error handling if duplicated
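A matching sketch for error handling, with a hypothetical `parseConfig` that should fall back to a default on bad input (TypeScript):
```typescript
// RED: the error is currently unhandled, so this test fails with a thrown SyntaxError
it('returns the default config when the contents are not valid JSON', () => {
  expect(parseConfig('{not json')).toEqual({ retries: 3 });
});

// GREEN: catch the specific failure and fall back
function parseConfig(raw: string): { retries: number } {
  try {
    return JSON.parse(raw);
  } catch {
    return { retries: 3 };
  }
}
```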
### Pattern: Optimizing Performance
1. **Ensure tests exist and pass** (if not, add tests first)
2. **REFACTOR:** Optimize implementation
3. **Verify GREEN:** Confirm tests still pass
4. **Measure:** Confirm performance improved
**Note:** Never optimize without tests. You can't prove optimization didn't break behavior.
---
## Workflow Checklist
### For Each New Feature
- [ ] Write one failing test
- [ ] Run test, confirm it fails correctly
- [ ] Write minimal code to pass
- [ ] Run test, confirm it passes
- [ ] Run all tests, confirm no regressions
- [ ] Refactor if needed (staying green)
- [ ] Commit
### For Each Bug Fix
- [ ] Write test reproducing the bug
- [ ] Run test, confirm it fails (reproduces bug)
- [ ] Fix the bug (minimal change)
- [ ] Run test, confirm it passes (bug fixed)
- [ ] Run all tests, confirm no regressions
- [ ] Commit
### For Each Refactoring
- [ ] Confirm tests exist and pass
- [ ] Make one small refactoring change
- [ ] Run tests, confirm still green
- [ ] Repeat until refactoring complete
- [ ] Commit
---
## Anti-Patterns to Avoid
### ❌ Writing Multiple Tests Before Implementing
**Why bad:** You can't tell which failing test your change addresses, and you lose the one-test-at-a-time feedback loop. Write one, implement, repeat.
### ❌ Changing Test to Make It Pass
**Why bad:** Test should define correct behavior. If test is wrong, fix test first, then re-run RED phase.
### ❌ Adding Features Not Covered by Tests
**Why bad:** Untested code. If you need a feature, write test first.
### ❌ Skipping RED Verification
**Why bad:** Test might pass immediately, meaning it doesn't test anything new.
### ❌ Skipping GREEN Verification
**Why bad:** Test might fail for unexpected reason. Always verify expected pass.
---
## Remember
- **One test at a time:** Write test, implement, repeat
- **Watch it fail:** Proves test actually tests something
- **Watch it pass:** Proves implementation works
- **Stay green:** All tests pass before moving on
- **Refactor freely:** Tests catch breaks

# TDD Language-Specific Examples
This guide provides concrete TDD examples in multiple programming languages, showing the RED-GREEN-REFACTOR cycle.
## RED Phase Examples
Write one minimal test showing what should happen.
### Rust
```rust
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn retries_failed_operations_3_times() {
let mut attempts = 0;
let operation = || -> Result<&str, &str> {
attempts += 1;
if attempts < 3 {
Err("fail")
} else {
Ok("success")
}
};
let result = retry_operation(operation);
assert_eq!(result, Ok("success"));
assert_eq!(attempts, 3);
}
}
```
**Running the test:**
```bash
cargo test tests::retries_failed_operations_3_times
```
### Swift
```swift
func testRetriesFailedOperations3Times() async throws {
    var attempts = 0
    let operation: () async throws -> String = {
        attempts += 1
        if attempts < 3 {
            throw RetryError.failed
        }
        return "success"
    }
    let result = try await retryOperation(operation)
    XCTAssertEqual(result, "success")
    XCTAssertEqual(attempts, 3)
}
```
**Running the test:**
```bash
swift test --filter RetryTests.testRetriesFailedOperations3Times
```
### TypeScript
```typescript
describe('retryOperation', () => {
  it('retries failed operations 3 times', async () => {
    let attempts = 0;
    const operation = async () => {
      attempts++;
      if (attempts < 3) {
        throw new Error('fail');
      }
      return 'success';
    };
    const result = await retryOperation(operation);
    expect(result).toBe('success');
    expect(attempts).toBe(3);
  });
});
```
**Running the test (Jest):**
```bash
npm test -- --testNamePattern="retries failed operations"
```
**Running the test (Vitest):**
```bash
npm test -- -t "retries failed operations"
```
### Why These Are Good
- Clear names describing the behavior
- Test real behavior, not mocks
- One thing per test
- Shows desired API
### Bad Example
```typescript
test('retry', () => {
let mockCalls = 0;
const mock = () => {
mockCalls++;
return 'success';
};
retryOperation(mock);
expect(mockCalls).toBe(1); // Tests mock, not behavior
});
```
**Why this is bad:**
- Vague name
- Tests mock behavior, not real retry logic
## GREEN Phase Examples
Write simplest code to pass the test.
### Rust
```rust
fn retry_operation<F, T, E>(mut operation: F) -> Result<T, E>
where
F: FnMut() -> Result<T, E>,
{
for i in 0..3 {
match operation() {
Ok(result) => return Ok(result),
Err(e) => {
if i == 2 {
return Err(e);
}
}
}
}
unreachable!()
}
```
### Swift
```swift
func retryOperation<T>(_ operation: () async throws -> T) async throws -> T {
var lastError: Error?
for attempt in 0..<3 {
do {
return try await operation()
} catch {
lastError = error
if attempt == 2 {
throw error
}
}
}
throw lastError!
}
```
### TypeScript
```typescript
async function retryOperation<T>(
operation: () => Promise<T>
): Promise<T> {
let lastError: Error | undefined;
for (let i = 0; i < 3; i++) {
try {
return await operation();
} catch (error) {
lastError = error as Error;
if (i === 2) {
throw error;
}
}
}
throw lastError;
}
```
### Bad Example - Over-engineered (YAGNI)
```typescript
async function retryOperation<T>(
operation: () => Promise<T>,
options: {
maxRetries?: number;
backoff?: 'linear' | 'exponential';
onRetry?: (attempt: number) => void;
shouldRetry?: (error: Error) => boolean;
} = {}
): Promise<T> {
// Don't add features the test doesn't require!
}
```
**Why this is bad:** Test only requires 3 retries. Don't add:
- Configurable retries
- Backoff strategies
- Callbacks
- Error filtering
...until a test requires them.
## Test Requirements
**Every test should:**
- Test one behavior
- Have a clear name
- Use real code (no mocks unless unavoidable)
## Verification Commands by Language
### Rust
```bash
# Single test
cargo test tests::test_name
# All tests
cargo test
# With output
cargo test -- --nocapture
```
### Swift
```bash
# Single test
swift test --filter TestClass.testName
# All tests
swift test
# With output
swift test --verbose
```
### TypeScript (Jest)
```bash
# Single test
npm test -- --testNamePattern="test name"
# All tests
npm test
# With coverage
npm test -- --coverage
```
### TypeScript (Vitest)
```bash
# Single test
npm test -- -t "test name"
# All tests
npm test
# With coverage
npm test -- --coverage
```