| name | description |
|---|---|
| test-driven-development | Use when implementing features or fixing bugs - enforces RED-GREEN-REFACTOR cycle requiring tests to fail before writing code |
<skill_overview> Write the test first, watch it fail, write minimal code to pass. If you didn't watch the test fail, you don't know if it tests the right thing. </skill_overview>
<rigidity_level> LOW FREEDOM - Follow these exact steps in order. Do not adapt.
Violating the letter of the rules is violating the spirit of the rules. </rigidity_level>
<quick_reference>
| Phase | Action | Command Example | Expected Result |
|---|---|---|---|
| RED | Write failing test | cargo test test_name | FAIL (feature missing) |
| Verify RED | Confirm correct failure | Check error message | "function not found" or assertion fails |
| GREEN | Write minimal code | Implement feature | Test passes |
| Verify GREEN | All tests pass | cargo test | All green, no warnings |
| REFACTOR | Clean up code | Improve while green | Tests still pass |
**Iron Law:** NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
</quick_reference>
<when_to_use> Always use for:
- New features
- Bug fixes
- Refactoring with behavior changes
- Any production code
Ask your human partner for exceptions:
- Throwaway prototypes (will be deleted)
- Generated code
- Configuration files
Thinking "skip TDD just this once"? Stop. That's rationalization. </when_to_use>
<the_process>
1. RED - Write Failing Test
Write one minimal test showing what should happen.
Requirements:
- Test one behavior only (an "and" in the name means two behaviors - split it; see the sketch after this process)
- Clear name describing behavior
- Use real code (no mocks unless unavoidable)
See resources/language-examples.md for Rust, Swift, TypeScript examples.
2. Verify RED - Watch It Fail
MANDATORY. Never skip.
Run the test and confirm:
- ✓ Test fails (rather than erroring out on a syntax problem)
- ✓ Failure message is the expected one ("function not found" or assertion fails)
- ✓ Fails because the feature is missing (not because of a typo)
If the test passes: you're testing existing behavior - fix the test. If the test errors: fix the syntax error and re-run until it fails for the right reason.
3. GREEN - Write Minimal Code
Write simplest code to pass the test. Nothing more.
Key principle: Don't add features the test doesn't require. Don't refactor other code. Don't "improve" beyond the test.
4. Verify GREEN - Watch It Pass
MANDATORY.
Run tests and confirm:
- ✓ New test passes
- ✓ All other tests still pass
- ✓ No errors or warnings
If test fails: Fix code, not test. If other tests fail: Fix now before proceeding.
5. REFACTOR - Clean Up
Only after green:
- Remove duplication
- Improve names
- Extract helpers
Keep tests green. Don't add behavior.
6. Repeat
Next failing test for next feature.
</the_process>
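To make the "one behavior per test" rule concrete, here is a minimal sketch in Python (the helper and names are hypothetical; the bundled resources cover Rust, Swift, and TypeScript):

```python
# Too broad: the "and" in the name hides two behaviors - split it.
def test_parses_and_validates_config():
    ...

# Two focused tests instead. At the RED step each fails with
# "NameError: name 'load_config' is not defined" because nothing exists yet.
def test_parses_port_from_config():
    assert load_config("app.toml").port == 8080   # hypothetical helper under test

def test_rejects_config_without_port():
    assert load_config("no_port.toml") is None    # hypothetical required behavior
```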
**Developer writes the implementation first, then adds a test that passes immediately.**
```python
# Code written FIRST
def validate_email(email):
    return "@" in email  # Bug: accepts "@@"

# Test written AFTER
def test_validate_email():
    assert validate_email("user@example.com")  # Passes immediately!
    # Missing edge case: assert not validate_email("@@")
```
<why_it_fails> When test passes immediately:
- Never proved the test catches bugs
- Only tested the happy paths you remembered
- Forgot edge cases (like "@@")
- Bug ships to production
Tests written after verify remembered cases, not required behavior. </why_it_fails>
**TDD approach:**
- RED - Write test first (including edge case):

  ```python
  def test_validate_email():
      assert validate_email("user@example.com")  # Will fail - function doesn't exist
      assert not validate_email("@@")  # Edge case up front
  ```
- Verify RED - Run test, watch it fail:

  ```
  NameError: name 'validate_email' is not defined
  ```
- GREEN - Implement to pass both cases:

  ```python
  def validate_email(email):
      return "@" in email and email.count("@") == 1
  ```
- Verify GREEN - Both assertions pass, bug prevented.
Result: Test failed first, proving it works. Edge case discovered during test writing, not in production.
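To keep the cycle going (step 6), the next requirement gets its own failing test before any further implementation. A hypothetical continuation of the example above, assuming the next requirement is "reject addresses without a domain":

```python
# Next RED: run this and watch it fail before touching validate_email again.
def test_rejects_email_without_domain():
    assert not validate_email("user@")

# Next GREEN: the minimal change that passes the new test and keeps the old ones green.
def validate_email(email):
    local, _, domain = email.partition("@")
    return bool(local) and bool(domain) and email.count("@") == 1
```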
**Developer has already written 3 hours of code without tests and wants to keep it as a "reference" while writing tests.**

```
// 200 lines of untested code exists
// Developer thinks: "I'll keep this and write tests that match it"
// Or: "I'll use it as reference to speed up TDD"
```
<why_it_fails> Keeping code as "reference":
- You'll copy it (that's testing after, with extra steps)
- You'll adapt it (biased by implementation)
- Tests will match code, not requirements
- You'll justify shortcuts: "I already know this works"
Result: All the problems of test-after, none of the benefits of TDD. </why_it_fails>
**Delete it. Completely.**

```bash
git stash  # Or delete the file
```
Then start TDD:
- Write first failing test from requirements (not from code)
- Watch it fail
- Implement fresh (might be different from original, that's OK)
- Watch it pass
Why delete:
- Sunk cost is already gone
- 3 hours implementing ≠ 3 hours with TDD (TDD might be 2 hours total)
- Code without tests is technical debt
- Fresh implementation from tests is usually better
What you gain:
- Tests that actually verify behavior
- Confidence code works
- Ability to refactor safely
- No bugs from untested edge cases
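As an illustration, a minimal sketch (Python; the requirement, class, and names are hypothetical) of restarting from the requirement rather than from the deleted code:

```python
# Requirement: "duplicate usernames are rejected" - written as a test first,
# without consulting the stashed implementation.
def test_rejects_duplicate_username():
    registry = UserRegistry()                    # hypothetical class, doesn't exist yet
    registry.register("alice")
    assert registry.register("alice") is False   # second registration refused

# At RED this fails with "NameError: name 'UserRegistry' is not defined" -
# exactly the expected failure for a missing feature.
```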
**Developer finds the test hard to write and is tempted to implement first.**

```swift
// Test attempt:
func testUserServiceCreatesAccount() {
    // Need to mock database, email service, payment gateway, logger...
    // This is getting complicated, maybe I should just implement first
}
```
<why_it_fails> "Test is hard" is valuable signal:
- Hard to test = hard to use
- Too many dependencies = coupling too tight
- Complex setup = design needs simplification
Implementing first ignores this signal:
- Build the complex design
- Lock in the coupling
- Now forced to write complex tests (or skip them) </why_it_fails>
Hard to test? Simplify the interface:
```swift
// Instead of:
class UserService {
    init(db: Database, email: EmailService, payments: PaymentGateway, logger: Logger) { }
    func createAccount(email: String, password: String, paymentToken: String) throws { }
}

// Make testable:
class UserService {
    func createAccount(request: CreateAccountRequest) -> Result<Account, Error> {
        // Dependencies injected through request or passed separately
    }
}
```
Test becomes simple:
```swift
func testCreatesAccountFromRequest() throws {
    let service = UserService()
    let request = CreateAccountRequest(email: "user@example.com")
    let account = try service.createAccount(request: request).get()
    XCTAssertEqual(account.email, "user@example.com")
}
```
TDD forces good design. If test is hard, fix design before implementing.
<critical_rules>
**Rules That Have No Exceptions**

- **Write code before test?** → Delete it. Start over.
  - Never keep as "reference"
  - Never "adapt" while writing tests
  - Delete means delete
- **Test passes immediately?** → Not TDD. Fix the test or delete the code.
  - Passing immediately proves nothing
  - You're testing existing behavior, not required behavior
- **Can't explain why test failed?** → Fix until failure makes sense.
  - "function not found" = good (feature doesn't exist)
  - Weird error = bad (fix test, re-run)
- **Want to skip "just this once"?** → That's rationalization. Stop.
  - TDD is faster than debugging in production
  - "Too simple to test" = test takes 30 seconds
  - "Already manually tested" = not systematic, not repeatable

**Common Excuses**
All of these mean: Stop, follow TDD:
- "This is different because..."
- "I'm being pragmatic, not dogmatic"
- "It's about spirit not ritual"
- "Tests after achieve the same goals"
- "Deleting X hours of work is wasteful"
</critical_rules>
<verification_checklist>
Before marking work complete:
- Every new function/method has a test
- Watched each test fail before implementing
- Each test failed for expected reason (feature missing, not typo)
- Wrote minimal code to pass each test
- All tests pass with no warnings
- Tests use real code (mocks only if unavoidable)
- Edge cases and errors covered
Can't check all boxes? You skipped TDD. Start over.
</verification_checklist>
This skill calls:
- verification-before-completion (running tests to verify)
This skill is called by:
- fixing-bugs (write failing test reproducing bug)
- executing-plans (when implementing bd tasks)
- refactoring-safely (keep tests green while refactoring)
Agents used:
- hyperpowers:test-runner (run tests, return summary only)
Detailed language-specific examples:
- Rust, Swift, TypeScript examples - Complete RED-GREEN-REFACTOR cycles
- Language-specific test commands
When stuck:
- Test too complicated? → Design too complicated, simplify interface
- Must mock everything? → Code too coupled, use dependency injection (see the sketch below)
- Test setup huge? → Extract helpers, or simplify design
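For the dependency-injection case, a minimal sketch (Python; the class and names are hypothetical, not part of this skill's resources) of making a collaborator injectable so the test needs only a tiny in-memory fake instead of a mock framework:

```python
class Notifier:
    def __init__(self, mailer):          # dependency injected, not constructed inside
        self.mailer = mailer

    def welcome(self, address):
        self.mailer.send(address, "Welcome!")

class FakeMailer:
    """In-memory stand-in that records what would have been sent."""
    def __init__(self):
        self.sent = []

    def send(self, address, body):
        self.sent.append((address, body))

def test_welcome_sends_one_message():
    mailer = FakeMailer()
    Notifier(mailer).welcome("user@example.com")
    assert mailer.sent == [("user@example.com", "Welcome!")]
```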