Files
gh-ntcoding-claude-skillz-d…/skills/writing-tests/SKILL.md
2025-11-30 08:44:41 +08:00

250 lines
7.6 KiB
Markdown

---
name: Writing Tests
description: "Principles for writing effective, maintainable tests. Covers naming conventions, assertion best practices, and comprehensive edge case checklists. Based on BugMagnet by Gojko Adzic."
version: 1.0.0
---
# Writing Tests
How to write tests that catch bugs, document behavior, and remain maintainable.
> Based on [BugMagnet](https://github.com/gojko/bugmagnet-ai-assistant) by Gojko Adzic. Adapted with attribution.
## Critical Rules
🚨 **Test names describe outcomes, not actions.** "returns empty array when input is null" not "test null input". The name IS the specification.
🚨 **Assertions must match test titles.** If the test claims to verify "different IDs", assert on the actual ID values—not just count or existence.
🚨 **Assert specific values, not types.** `expect(result).toEqual(['First.', ' Second.'])` not `expect(result).toBeDefined()`. Specific assertions catch specific bugs.
🚨 **One concept per test.** Each test verifies one behavior. If you need "and" in your test name, split it.
🚨 **Bugs cluster together.** When you find one bug, test related scenarios. The same misunderstanding often causes multiple failures.
## When This Applies
- Writing new tests
- Reviewing test quality
- During TDD RED phase (writing the failing test)
- Expanding test coverage
- Investigating discovered bugs
## Test Naming
**Pattern:** `[outcome] when [condition]`
### Good Names (Describe Outcomes)
```
returns empty array when input is null
throws ValidationError when email format invalid
calculates tax correctly for tax-exempt items
preserves original order when duplicates removed
```
### Bad Names (Describe Actions)
```
test null input // What about null input?
should work // What does "work" mean?
handles edge cases // Which edge cases?
email validation test // What's being validated?
```
### The Specification Test
Your test name should read like a specification. If someone reads ONLY the test names, they should understand the complete behavior of the system.
## Assertion Best Practices
### Assert Specific Values
```typescript
// ❌ WEAK - passes even if completely wrong data
expect(result).toBeDefined()
expect(result.items).toHaveLength(2)
expect(user).toBeTruthy()
// ✅ STRONG - catches actual bugs
expect(result).toEqual({ status: 'success', items: ['a', 'b'] })
expect(user.email).toBe('test@example.com')
```
### Match Assertions to Test Title
```typescript
// ❌ TEST SAYS "different IDs" BUT ASSERTS COUNT
it('generates different IDs for each call', () => {
const id1 = generateId()
const id2 = generateId()
expect([id1, id2]).toHaveLength(2) // WRONG: doesn't check they're different!
})
// ✅ ACTUALLY VERIFIES DIFFERENT IDs
it('generates different IDs for each call', () => {
const id1 = generateId()
const id2 = generateId()
expect(id1).not.toBe(id2) // RIGHT: verifies the claim
})
```
### Avoid Implementation Coupling
```typescript
// ❌ BRITTLE - tests implementation details
expect(mockDatabase.query).toHaveBeenCalledWith('SELECT * FROM users WHERE id = 1')
// ✅ FLEXIBLE - tests behavior
expect(result.user.name).toBe('Alice')
```
## Test Structure
### Arrange-Act-Assert
```typescript
it('calculates total with tax for non-exempt items', () => {
// Arrange: Set up test data
const item = { price: 100, taxExempt: false }
const taxRate = 0.1
// Act: Execute the behavior
const total = calculateTotal(item, taxRate)
// Assert: Verify the outcome
expect(total).toBe(110)
})
```
### One Concept Per Test
```typescript
// ❌ MULTIPLE CONCEPTS - hard to diagnose failures
it('validates and processes order', () => {
expect(validate(order)).toBe(true)
expect(process(order).status).toBe('complete')
expect(sendEmail).toHaveBeenCalled()
})
// ✅ SINGLE CONCEPT - clear failures
it('accepts valid orders', () => {
expect(validate(validOrder)).toBe(true)
})
it('rejects orders with negative quantities', () => {
expect(validate(negativeQuantityOrder)).toBe(false)
})
it('sends confirmation email after processing', () => {
process(order)
expect(sendEmail).toHaveBeenCalledWith(order.customerEmail)
})
```
## Edge Case Checklists
When testing a function, systematically consider these edge cases based on input types.
### Numbers
- [ ] Zero
- [ ] Negative numbers
- [ ] Very large numbers (near MAX_SAFE_INTEGER)
- [ ] Very small numbers (near MIN_SAFE_INTEGER)
- [ ] Decimal precision (0.1 + 0.2)
- [ ] NaN
- [ ] Infinity / -Infinity
- [ ] Boundary values (off-by-one at limits)
### Strings
- [ ] Empty string `""`
- [ ] Whitespace only `" "`
- [ ] Very long strings (10K+ characters)
- [ ] Unicode: emojis 👨‍👩‍👧‍👦, RTL text, combining characters
- [ ] Special characters: quotes, backslashes, null bytes
- [ ] SQL/HTML/script injection patterns
- [ ] Leading/trailing whitespace
- [ ] Mixed case sensitivity
### Collections (Arrays, Objects, Maps)
- [ ] Empty collection `[]`, `{}`
- [ ] Single element
- [ ] Duplicates
- [ ] Nested structures
- [ ] Circular references
- [ ] Very large collections (performance)
- [ ] Sparse arrays
- [ ] Mixed types in arrays
### Dates and Times
- [ ] Leap years (Feb 29)
- [ ] Daylight saving transitions
- [ ] Timezone boundaries
- [ ] Midnight (00:00:00)
- [ ] End of day (23:59:59)
- [ ] Year boundaries (Dec 31 → Jan 1)
- [ ] Invalid dates (Feb 30, Month 13)
- [ ] Unix epoch edge cases
- [ ] Far future/past dates
### Null and Undefined
- [ ] `null` input
- [ ] `undefined` input
- [ ] Missing optional properties
- [ ] Explicit `undefined` vs missing key
### Domain-Specific
- [ ] Email: valid formats, edge cases (plus signs, subdomains)
- [ ] URLs: protocols, ports, special characters, relative paths
- [ ] Phone numbers: international formats, extensions
- [ ] Addresses: Unicode, multi-line, missing components
- [ ] Currency: rounding, different currencies, zero amounts
- [ ] Percentages: 0%, 100%, over 100%
### Violated Domain Constraints
These test implicit assumptions in your domain:
- [ ] Uniqueness violations (duplicate IDs, emails)
- [ ] Missing required relationships (orphaned records)
- [ ] Ordering violations (events out of sequence)
- [ ] Range breaches (age -1, quantity 1000000)
- [ ] State inconsistencies (shipped but not paid)
- [ ] Format mismatches (expected JSON, got XML)
- [ ] Temporal ordering (end before start)
## Bug Clustering
When you discover a bug, don't stop—explore related scenarios:
1. **Same function, similar inputs** - If null fails, test undefined, empty string
2. **Same pattern, different locations** - If one endpoint mishandles auth, check others
3. **Same developer assumption** - If off-by-one here, check other boundaries
4. **Same data type** - If dates fail at DST, check other time edge cases
## When Tempted to Cut Corners
- If your test name says "test" or "should work": STOP. What outcome are you actually verifying? Name it specifically.
- If you're asserting `toBeDefined()` or `toBeTruthy()`: STOP. What value do you actually expect? Assert that instead.
- If your assertion doesn't match your test title: STOP. Either fix the assertion or rename the test. They must agree.
- If you're testing multiple concepts in one test: STOP. Split it. Future you debugging a failure will thank you.
- If you found a bug and wrote one test: STOP. Bugs cluster. What related scenarios might have the same problem?
- If you're skipping edge cases because "that won't happen": STOP. It will happen. In production. At 3 AM.
## Integration with Other Skills
**With TDD Process:** This skill guides the RED phase—how to write the failing test well.
**With Software Design Principles:** Testable code follows design principles. Hard-to-test code often has design problems.