---
name: test-driven-development
description: Use when implementing features or fixing bugs - enforces RED-GREEN-REFACTOR cycle requiring tests to fail before writing code
---

<skill_overview> Write the test first, watch it fail, write minimal code to pass. If you didn't watch the test fail, you don't know if it tests the right thing. </skill_overview>

<rigidity_level> LOW FREEDOM - Follow these exact steps in order. Do not adapt.

Violating the letter of the rules is violating the spirit of the rules. </rigidity_level>

<quick_reference>

| Phase | Action | Command Example | Expected Result |
|-------|--------|-----------------|-----------------|
| RED | Write failing test | `cargo test test_name` | FAIL (feature missing) |
| Verify RED | Confirm correct failure | Check error message | "function not found" or assertion fails |
| GREEN | Write minimal code | Implement feature | Test passes |
| Verify GREEN | All tests pass | `cargo test` | All green, no warnings |
| REFACTOR | Clean up code | Improve while green | Tests still pass |

Iron Law: NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST

</quick_reference>

<when_to_use> Always use for:

  • New features
  • Bug fixes
  • Refactoring with behavior changes
  • Any production code

Ask your human partner for exceptions:

  • Throwaway prototypes (will be deleted)
  • Generated code
  • Configuration files

Thinking "skip TDD just this once"? Stop. That's rationalization. </when_to_use>

<the_process>

1. RED - Write Failing Test

Write one minimal test showing what should happen.

Requirements:

  • Test one behavior only ("and" in name? Split it)
  • Clear name describing behavior
  • Use real code (no mocks unless unavoidable)

See resources/language-examples.md for Rust, Swift, TypeScript examples.
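A minimal sketch of a RED test in Rust (the `slugify` function and its expected behavior are hypothetical, for illustration):

```rust
// RED: written before slugify() exists anywhere in the codebase.
#[test]
fn lowercases_and_hyphenates() {
    assert_eq!(slugify("Hello World"), "hello-world");
}
```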

2. Verify RED - Watch It Fail

MANDATORY. Never skip.

Run the test and confirm:

  • ✓ Test fails (rather than erroring out on syntax issues)
  • ✓ Failure message is expected ("function not found" or assertion fails)
  • ✓ Fails because feature missing (not typos)

If test passes: you're testing existing behavior. Fix the test.
If test errors: fix the syntax error, re-run until it fails correctly.
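Continuing the Rust sketch from step 1: the first `cargo test` run should fail for the right reason. Here that means a compile error because `slugify` doesn't exist yet, roughly:

```
error[E0425]: cannot find function `slugify` in this scope
```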

3. GREEN - Write Minimal Code

Write simplest code to pass the test. Nothing more.

Key principle: Don't add features the test doesn't require. Don't refactor other code. Don't "improve" beyond the test.
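Continuing the sketch, the minimal GREEN implementation. No Unicode handling, no punctuation stripping - the test doesn't demand them yet:

```rust
// GREEN: just enough to make the failing test pass, nothing more.
fn slugify(input: &str) -> String {
    input.to_lowercase().replace(' ', "-")
}
```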

4. Verify GREEN - Watch It Pass

MANDATORY.

Run tests and confirm:

  • ✓ New test passes
  • ✓ All other tests still pass
  • ✓ No errors or warnings

If test fails: fix the code, not the test.
If other tests fail: fix them now before proceeding.
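For the running sketch, a healthy `cargo test` run looks something like this (exact formatting varies by toolchain version):

```
running 1 test
test lowercases_and_hyphenates ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```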

5. REFACTOR - Clean Up

Only after green:

  • Remove duplication
  • Improve names
  • Extract helpers

Keep tests green. Don't add behavior.
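For the running sketch, a refactor might be as small as introducing intention-revealing names - same behavior, test stays green (the constant names are illustrative):

```rust
// REFACTOR: same behavior, clearer names; the test stays green.
const WORD_SEPARATOR: char = ' ';
const SLUG_SEPARATOR: &str = "-";

fn slugify(title: &str) -> String {
    title.to_lowercase().replace(WORD_SEPARATOR, SLUG_SEPARATOR)
}
```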

6. Repeat

Next failing test for next feature.

</the_process>

Scenario: A developer writes the implementation first, then adds a test that passes immediately.

```python
# Code written FIRST
def validate_email(email):
    return "@" in email  # Bug: accepts "@@"

# Test written AFTER
def test_validate_email():
    assert validate_email("user@example.com")  # Passes immediately!
    # Missing edge case: assert not validate_email("@@")
```

<why_it_fails> When test passes immediately:

  • Never proved the test catches bugs
  • Only tested happy path you remembered
  • Forgot edge cases (like "@@")
  • Bug ships to production

Tests written after verify remembered cases, not required behavior. </why_it_fails>

**TDD approach:**

1. RED - Write the test first (including the edge case):

```python
def test_validate_email():
    assert validate_email("user@example.com")  # Will fail - function doesn't exist
    assert not validate_email("@@")            # Edge case up front
```

2. Verify RED - Run the test, watch it fail:

```
NameError: name 'validate_email' is not defined
```

3. GREEN - Implement to pass both cases:

```python
def validate_email(email):
    return "@" in email and email.count("@") == 1
```

4. Verify GREEN - Both assertions pass; the bug is prevented.

Result: Test failed first, proving it works. Edge case discovered during test writing, not in production.

Scenario: A developer has already written 3 hours of code without tests and wants to keep it as "reference" while writing tests. 200 lines of untested code exist, and the developer thinks: "I'll keep this and write tests that match it," or "I'll use it as reference to speed up TDD."

<why_it_fails> Keeping code as "reference":

  • You'll copy it (that's testing after, with extra steps)
  • You'll adapt it (biased by implementation)
  • Tests will match code, not requirements
  • You'll justify shortcuts: "I already know this works"

Result: All the problems of test-after, none of the benefits of TDD. </why_it_fails>

**Delete it. Completely.**

```bash
git stash  # Or delete the file
```

Then start TDD:

  1. Write first failing test from requirements (not from code)
  2. Watch it fail
  3. Implement fresh (might be different from original, that's OK)
  4. Watch it pass

Why delete:

  • Sunk cost is already gone
  • 3 hours implementing ≠ 3 hours with TDD (TDD might be 2 hours total)
  • Code without tests is technical debt
  • Fresh implementation from tests is usually better

What you gain:

  • Tests that actually verify behavior
  • Confidence code works
  • Ability to refactor safely
  • No bugs from untested edge cases

Scenario: A test is hard to write. The developer thinks: "The design must be unclear, but I'll implement first to explore."

```swift
// Test attempt:
func testUserServiceCreatesAccount() {
    // Need to mock database, email service, payment gateway, logger...
    // This is getting complicated, maybe I should just implement first
}
```

<why_it_fails> "Test is hard" is valuable signal:

  • Hard to test = hard to use
  • Too many dependencies = coupling too tight
  • Complex setup = design needs simplification

Implementing first ignores this signal:

  • Build the complex design
  • Lock in the coupling
  • Now forced to write complex tests (or skip them) </why_it_fails>
**Listen to the test.**

Hard to test? Simplify the interface:

```swift
// Instead of:
class UserService {
    init(db: Database, email: EmailService, payments: PaymentGateway, logger: Logger) { }
    func createAccount(email: String, password: String, paymentToken: String) throws { }
}

// Make testable:
class UserService {
    func createAccount(request: CreateAccountRequest) -> Result<Account, Error> {
        // Dependencies injected through request or passed separately
    }
}
```

Test becomes simple:

```swift
func testCreatesAccountFromRequest() throws {
    let service = UserService()
    let request = CreateAccountRequest(email: "user@example.com")
    let result = service.createAccount(request: request)
    let account = try result.get()  // Unwrap Result<Account, Error>
    XCTAssertEqual(account.email, "user@example.com")
}
```

TDD forces good design. If test is hard, fix design before implementing.

<critical_rules>

Rules That Have No Exceptions

  1. Write code before test? → Delete it. Start over.

    • Never keep as "reference"
    • Never "adapt" while writing tests
    • Delete means delete
  2. Test passes immediately? → Not TDD. Fix the test or delete the code.

    • Passing immediately proves nothing
    • You're testing existing behavior, not required behavior
  3. Can't explain why test failed? → Fix until failure makes sense.

    • "function not found" = good (feature doesn't exist)
    • Weird error = bad (fix test, re-run)
  4. Want to skip "just this once"? → That's rationalization. Stop.

    • TDD is faster than debugging in production
    • "Too simple to test" = test takes 30 seconds
    • "Already manually tested" = not systematic, not repeatable

Common Excuses

All of these mean: Stop, follow TDD:

  • "This is different because..."
  • "I'm being pragmatic, not dogmatic"
  • "It's about spirit not ritual"
  • "Tests after achieve the same goals"
  • "Deleting X hours of work is wasteful"

</critical_rules>

<verification_checklist>

Before marking work complete:

  • Every new function/method has a test
  • Watched each test fail before implementing
  • Each test failed for expected reason (feature missing, not typo)
  • Wrote minimal code to pass each test
  • All tests pass with no warnings
  • Tests use real code (mocks only if unavoidable)
  • Edge cases and errors covered

Can't check all boxes? You skipped TDD. Start over.

</verification_checklist>

This skill calls:

  • verification-before-completion (running tests to verify)

This skill is called by:

  • fixing-bugs (write failing test reproducing bug)
  • executing-plans (when implementing bd tasks)
  • refactoring-safely (keep tests green while refactoring)

Agents used:

  • hyperpowers:test-runner (run tests, return summary only)

Detailed language-specific examples:

  • resources/language-examples.md (Rust, Swift, TypeScript)

When stuck:

  • Test too complicated? → Design too complicated, simplify interface
  • Must mock everything? → Code too coupled, use dependency injection
  • Test setup huge? → Extract helpers, or simplify design
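
On the dependency-injection point above, a minimal Rust sketch (all names hypothetical): depend on a narrow trait rather than a concrete service, so tests can pass a real in-memory implementation instead of mocking a whole stack.

```rust
// Hypothetical sketch: a narrow trait makes the dependency injectable.
trait UserStore {
    fn save(&mut self, email: &str) -> Result<(), String>;
}

// Test double: a real, trivial implementation - no mocking framework needed.
struct InMemoryStore {
    saved: Vec<String>,
}

impl UserStore for InMemoryStore {
    fn save(&mut self, email: &str) -> Result<(), String> {
        self.saved.push(email.to_string());
        Ok(())
    }
}

// Production code depends on the trait, not on a concrete database.
fn create_account(store: &mut impl UserStore, email: &str) -> Result<(), String> {
    store.save(email)
}

#[test]
fn creates_account_with_in_memory_store() {
    let mut store = InMemoryStore { saved: vec![] };
    assert!(create_account(&mut store, "user@example.com").is_ok());
    assert_eq!(store.saved, vec!["user@example.com".to_string()]);
}
```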