---
name: Writing Tests
description: "Principles for writing effective, maintainable tests. Covers naming conventions, assertion best practices, and comprehensive edge case checklists. Based on BugMagnet by Gojko Adzic."
version: 1.0.0
---

# Writing Tests

How to write tests that catch bugs, document behavior, and remain maintainable.

> Based on [BugMagnet](https://github.com/gojko/bugmagnet-ai-assistant) by Gojko Adzic. Adapted with attribution.

## Critical Rules

🚨 **Test names describe outcomes, not actions.** "returns empty array when input is null" not "test null input". The name IS the specification.

🚨 **Assertions must match test titles.** If the test claims to verify "different IDs", assert on the actual ID values, not just count or existence.

🚨 **Assert specific values, not types.** `expect(result).toEqual(['First.', ' Second.'])` not `expect(result).toBeDefined()`. Specific assertions catch specific bugs.

🚨 **One concept per test.** Each test verifies one behavior. If you need "and" in your test name, split it.

🚨 **Bugs cluster together.** When you find one bug, test related scenarios. The same misunderstanding often causes multiple failures.

## When This Applies

- Writing new tests
- Reviewing test quality
- During TDD RED phase (writing the failing test)
- Expanding test coverage
- Investigating discovered bugs

## Test Naming

**Pattern:** `[outcome] when [condition]`

### Good Names (Describe Outcomes)

```
returns empty array when input is null
throws ValidationError when email format invalid
returns zero tax for tax-exempt items
preserves original order when duplicates removed
```

### Bad Names (Describe Actions)

```
test null input           // What about null input?
should work               // What does "work" mean?
handles edge cases        // Which edge cases?
email validation test     // What's being validated?
```

### The Specification Test

Your test name should read like a specification.
If someone reads ONLY the test names, they should understand the complete behavior of the system.

## Assertion Best Practices

### Assert Specific Values

```typescript
// ❌ WEAK - passes even if completely wrong data
expect(result).toBeDefined()
expect(result.items).toHaveLength(2)
expect(user).toBeTruthy()

// ✅ STRONG - catches actual bugs
expect(result).toEqual({ status: 'success', items: ['a', 'b'] })
expect(user.email).toBe('test@example.com')
```

### Match Assertions to Test Title

```typescript
// ❌ TEST SAYS "different IDs" BUT ASSERTS COUNT
it('generates different IDs for each call', () => {
  const id1 = generateId()
  const id2 = generateId()
  expect([id1, id2]).toHaveLength(2) // WRONG: doesn't check they're different!
})

// ✅ ACTUALLY VERIFIES DIFFERENT IDs
it('generates different IDs for each call', () => {
  const id1 = generateId()
  const id2 = generateId()
  expect(id1).not.toBe(id2) // RIGHT: verifies the claim
})
```

### Avoid Implementation Coupling

```typescript
// ❌ BRITTLE - tests implementation details
expect(mockDatabase.query).toHaveBeenCalledWith('SELECT * FROM users WHERE id = 1')

// ✅ FLEXIBLE - tests behavior
expect(result.user.name).toBe('Alice')
```

## Test Structure

### Arrange-Act-Assert

```typescript
it('calculates total with tax for non-exempt items', () => {
  // Arrange: Set up test data
  const item = { price: 100, taxExempt: false }
  const taxRate = 0.1

  // Act: Execute the behavior
  const total = calculateTotal(item, taxRate)

  // Assert: Verify the outcome
  expect(total).toBe(110)
})
```

### One Concept Per Test

```typescript
// ❌ MULTIPLE CONCEPTS - hard to diagnose failures
it('validates and processes order', () => {
  expect(validate(order)).toBe(true)
  expect(process(order).status).toBe('complete')
  expect(sendEmail).toHaveBeenCalled()
})

// ✅ SINGLE CONCEPT - clear failures
it('accepts valid orders', () => {
  expect(validate(validOrder)).toBe(true)
})

it('rejects orders with negative quantities', () => {
  expect(validate(negativeQuantityOrder)).toBe(false)
})

it('sends confirmation email after processing', () => {
  process(order)
  expect(sendEmail).toHaveBeenCalledWith(order.customerEmail)
})
```

## Edge Case Checklists

When testing a function, systematically consider these edge cases based on input types.

### Numbers

- [ ] Zero
- [ ] Negative numbers
- [ ] Very large numbers (near `Number.MAX_SAFE_INTEGER`)
- [ ] Very large negative numbers (near `Number.MIN_SAFE_INTEGER`)
- [ ] Decimal precision (0.1 + 0.2)
- [ ] NaN
- [ ] Infinity / -Infinity
- [ ] Boundary values (off-by-one at limits)

### Strings

- [ ] Empty string `""`
- [ ] Whitespace only `" "`
- [ ] Very long strings (10K+ characters)
- [ ] Unicode: emojis 👨‍👩‍👧‍👦, RTL text, combining characters
- [ ] Special characters: quotes, backslashes, null bytes
- [ ] SQL/HTML/script injection patterns
- [ ] Leading/trailing whitespace
- [ ] Case sensitivity and mixed case

### Collections (Arrays, Objects, Maps)

- [ ] Empty collection `[]`, `{}`
- [ ] Single element
- [ ] Duplicates
- [ ] Nested structures
- [ ] Circular references
- [ ] Very large collections (performance)
- [ ] Sparse arrays
- [ ] Mixed types in arrays

### Dates and Times

- [ ] Leap years (Feb 29)
- [ ] Daylight saving transitions
- [ ] Timezone boundaries
- [ ] Midnight (00:00:00)
- [ ] End of day (23:59:59)
- [ ] Year boundaries (Dec 31 → Jan 1)
- [ ] Invalid dates (Feb 30, Month 13)
- [ ] Unix epoch edge cases
- [ ] Far future/past dates

### Null and Undefined

- [ ] `null` input
- [ ] `undefined` input
- [ ] Missing optional properties
- [ ] Explicit `undefined` vs missing key

### Domain-Specific

- [ ] Email: valid formats, edge cases (plus signs, subdomains)
- [ ] URLs: protocols, ports, special characters, relative paths
- [ ] Phone numbers: international formats, extensions
- [ ] Addresses: Unicode, multi-line, missing components
- [ ] Currency: rounding, different currencies, zero amounts
- [ ] Percentages: 0%, 100%, over 100%

### Violated Domain Constraints

These test implicit
assumptions in your domain:

- [ ] Uniqueness violations (duplicate IDs, emails)
- [ ] Missing required relationships (orphaned records)
- [ ] Ordering violations (events out of sequence)
- [ ] Range breaches (age -1, quantity 1000000)
- [ ] State inconsistencies (shipped but not paid)
- [ ] Format mismatches (expected JSON, got XML)
- [ ] Temporal ordering (end before start)

## Bug Clustering

When you discover a bug, don't stop. Explore related scenarios:

1. **Same function, similar inputs** - If null fails, test undefined, empty string
2. **Same pattern, different locations** - If one endpoint mishandles auth, check others
3. **Same developer assumption** - If off-by-one here, check other boundaries
4. **Same data type** - If dates fail at DST, check other time edge cases

## When Tempted to Cut Corners

- If your test name says "test" or "should work": STOP. What outcome are you actually verifying? Name it specifically.
- If you're asserting `toBeDefined()` or `toBeTruthy()`: STOP. What value do you actually expect? Assert that instead.
- If your assertion doesn't match your test title: STOP. Either fix the assertion or rename the test. They must agree.
- If you're testing multiple concepts in one test: STOP. Split it. Future you debugging a failure will thank you.
- If you found a bug and wrote one test: STOP. Bugs cluster. What related scenarios might have the same problem?
- If you're skipping edge cases because "that won't happen": STOP. It will happen. In production. At 3 AM.

## Integration with Other Skills

**With TDD Process:** This skill guides the RED phase: how to write the failing test well.

**With Software Design Principles:** Testable code follows design principles. Hard-to-test code often has design problems.
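
## Worked Example

The sketch below ties the `[outcome] when [condition]` naming pattern to the Numbers checklist, with one concept per check and an assertion on a specific value each time. It is a minimal illustration under stated assumptions, not a prescribed implementation: the body of `calculateTotal`, its rounding-to-cents and `RangeError` behavior, and the `check`, `expectEqual`, and `expectRangeError` helpers are all hypothetical. The helpers stand in for a test runner's `it` and `expect` so the block runs on its own.

```typescript
// Hypothetical function under test. The name calculateTotal comes from the
// Arrange-Act-Assert example; the body and error behavior are assumptions.
type Item = { price: number; taxExempt: boolean }

function calculateTotal(item: Item, taxRate: number): number {
  if (!isFinite(item.price) || item.price < 0) {
    throw new RangeError('price must be a non-negative finite number')
  }
  if (!isFinite(taxRate) || taxRate < 0) {
    throw new RangeError('taxRate must be a non-negative finite number')
  }
  const tax = item.taxExempt ? 0 : item.price * taxRate
  // Assumed rule: round to cents so floating-point noise stays out of totals.
  return Math.round((item.price + tax) * 100) / 100
}

// Minimal stand-ins for a test runner (assumed helpers, not a real library).
const failures: string[] = []

function check(title: string, body: () => void): void {
  try {
    body()
  } catch (error) {
    failures.push(`${title}: ${(error as Error).message}`)
  }
}

function expectEqual(actual: number, expected: number): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, got ${String(actual)}`)
  }
}

function expectRangeError(body: () => void): void {
  let thrown: unknown
  try {
    body()
  } catch (error) {
    thrown = error
  }
  if (!(thrown instanceof RangeError)) {
    throw new Error(`expected RangeError, got ${String(thrown)}`)
  }
}

// Each title names an outcome and a condition; each body asserts that outcome.
check('returns price plus tax when item is not tax-exempt', () => {
  expectEqual(calculateTotal({ price: 100, taxExempt: false }, 0.1), 110)
})

check('returns price unchanged when item is tax-exempt', () => {
  expectEqual(calculateTotal({ price: 100, taxExempt: true }, 0.1), 100)
})

check('returns zero when price is zero', () => {
  expectEqual(calculateTotal({ price: 0, taxExempt: false }, 0.1), 0)
})

check('returns 0.3 when price is 0.1 and tax adds 0.2', () => {
  // Decimal precision case from the checklist: 0.1 + 0.2 is not 0.3 unrounded.
  expectEqual(calculateTotal({ price: 0.1, taxExempt: false }, 2), 0.3)
})

check('throws RangeError when price is negative', () => {
  expectRangeError(() => calculateTotal({ price: -1, taxExempt: false }, 0.1))
})

check('throws RangeError when price is NaN', () => {
  expectRangeError(() => calculateTotal({ price: NaN, taxExempt: false }, 0.1))
})

check('throws RangeError when tax rate is Infinity', () => {
  expectRangeError(() => calculateTotal({ price: 100, taxExempt: false }, Infinity))
})

console.log(failures.length === 0 ? 'all checks passed' : failures.join('\n'))
```

Read top to bottom, the check titles alone describe what this hypothetical `calculateTotal` does. The negative, NaN, and Infinity cases follow the bug-clustering idea: once one invalid number is rejected, its neighbors on the checklist get their own checks. In a real suite the same titles and assertions would move into `it` and `expect` calls.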