zhongwei/gh-levnikolaevich-claude-code-skills-full-development-workflow-skills

Zhongwei Li 37774aa937 Initial commit

2025-11-30 08:37:27 +08:00

Risk-Based Testing Guide

Purpose

This guide replaces the traditional Test Pyramid (70/20/10 ratio) with a Value-Based Testing Framework that prioritizes business risk and practical test limits. The goal is to write tests that matter, not to chase coverage metrics.

Problem solved: The traditional Test Pyramid approach generates excessive tests (~200 per Story) by mechanically testing every conditional branch. This creates a maintenance burden without proportional business value.

Solution: Risk-Based Testing with clear prioritization criteria and enforced limits (2 baseline tests, a realistic goal of 2-7, and a hard maximum of 28 per Story).

Core Philosophy

Guiding Principle

"Write tests. Not too many. Mostly integration." (Guillermo Rauch, popularized by Kent C. Dodds)

Key Insights

Test business value, not code coverage - 80% coverage means nothing if the critical payment flow isn't tested
Manual testing has value - Not every scenario needs automated test duplication
Each test has maintenance cost - More tests = more refactoring overhead
Integration tests catch real bugs - Unit tests catch edge cases in isolation
E2E tests validate user value - Only E2E proves the feature actually works end-to-end

Minimum Viable Testing Philosophy

Start Minimal, Justify Additions

Baseline for every Story:

2 E2E tests per endpoint: Positive scenario (happy path) + Negative scenario (critical error)
0 Integration tests (E2E covers full stack by default)
0 Unit tests (E2E covers simple logic by default)

Realistic goal: 2-7 tests per Story (not 10-28!)

Additional tests ONLY with critical justification:

Test #3 and beyond: Each requires a documented answer to "Why does this test OUR business logic (not framework/library/database)?"
Priority ≥15 required for all additional tests
Auto-trim to 7 tests if plan exceeds realistic goal
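
The auto-trim rule can be sketched as a small function. This is only an illustration under stated assumptions: the plan entry shape (`name`, `level`, `priority`, `baseline`) and the function name `autoTrim` are hypothetical, and the "keep the 2 baseline E2E tests plus the top 5 by Priority" behavior follows the Quick Reference flowchart in this guide.

```javascript
// Hypothetical plan entry: { name, level, priority, baseline }.
// Keeps every baseline test, then fills the remaining slots with the
// highest-priority additional tests until the realistic goal is reached.
function autoTrim(plan, realisticGoal = 7) {
  if (plan.length <= realisticGoal) return plan.slice();

  const baseline = plan.filter((t) => t.baseline);
  const extras = plan
    .filter((t) => !t.baseline) // filter() returns a copy, so sort() does not mutate `plan`
    .sort((a, b) => b.priority - a.priority)
    .slice(0, Math.max(0, realisticGoal - baseline.length));

  return [...baseline, ...extras];
}

// Example: 2 baseline E2E tests plus 8 additional candidates (10 in total).
const plan = [
  { name: 'checkout succeeds', level: 'e2e', priority: 25, baseline: true },
  { name: 'checkout rejects declined card', level: 'e2e', priority: 25, baseline: true },
  ...[15, 16, 17, 18, 19, 20, 21, 22].map((priority) => ({
    name: `extra scenario (priority ${priority})`,
    level: 'unit',
    priority,
    baseline: false,
  })),
];

const trimmed = autoTrim(plan);
console.log(trimmed.length); // 7
console.log(trimmed.filter((t) => !t.baseline).map((t) => t.priority)); // [ 22, 21, 20, 19, 18 ]
```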

Critical Justification Questions

Before adding ANY test beyond 2 baseline E2E, answer:

❓ Does this test OUR business logic?
- ✅ YES: Tax calculation with country-specific rules (OUR algorithm)
- ❌ NO: bcrypt hashing (library behavior)
- ❌ NO: Prisma query execution (framework behavior)
- ❌ NO: PostgreSQL LIKE operator (database behavior)
❓ Is this already covered by 2 baseline E2E tests?
- ✅ NO: E2E doesn't exercise all branches of complex calculation
- ❌ YES: E2E test validates full flow end-to-end
❓ Priority ≥15?
- ✅ YES: Money, security, data integrity
- ❌ NO: Skip, manual testing sufficient
❓ Unique business value?
- ✅ YES: Tests different scenario than existing tests
- ❌ NO: Duplicate coverage

If ANY answer is ❌ → SKIP this test
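
As a rough sketch, the four questions can be treated as a gate that a candidate test must pass in full. The field names (`testsOwnLogic`, `coveredByBaselineE2E`, `priority`, `uniqueValue`) and the function name `shouldAddTest` are assumptions made for illustration, not part of any tool.

```javascript
// Hypothetical candidate shape; each field mirrors one justification question.
function shouldAddTest(candidate) {
  const checks = [
    ['tests our own business logic', candidate.testsOwnLogic === true],
    ['not already covered by baseline E2E', candidate.coveredByBaselineE2E === false],
    ['priority is at least 15', candidate.priority >= 15],
    ['adds unique business value', candidate.uniqueValue === true],
  ];
  // Collect the labels of every failed check so the skip reason can be documented.
  const failed = checks.filter(([, ok]) => !ok).map(([label]) => label);
  return { add: failed.length === 0, failed };
}

const taxRules = shouldAddTest({
  testsOwnLogic: true,
  coveredByBaselineE2E: false,
  priority: 25,
  uniqueValue: true,
});
const bcryptHashing = shouldAddTest({
  testsOwnLogic: false,
  coveredByBaselineE2E: true,
  priority: 10,
  uniqueValue: false,
});

console.log(taxRules.add); // true
console.log(bcryptHashing.add, bcryptHashing.failed.length); // false 4
```

Returning the failed labels rather than a bare boolean makes it easy to record why a scenario was skipped.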

Risk Priority Matrix

Calculation Formula

Priority = Business Impact (1-5) × Probability of Failure (1-5)

Result ranges:

Priority ≥15 (15-25): MUST test - critical scenarios
Priority 9-14: SHOULD test if not already covered
Priority ≤8 (1-8): SKIP - manual testing sufficient
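
The formula and result ranges translate directly into code. The following is a minimal sketch; the function names `riskPriority` and `classifyPriority` are hypothetical, while the thresholds are the ones listed above.

```javascript
// Priority = Business Impact (1-5) x Probability of Failure (1-5).
function riskPriority(impact, probability) {
  for (const score of [impact, probability]) {
    if (!Number.isInteger(score) || score < 1 || score > 5) {
      throw new RangeError(`Score must be an integer from 1 to 5, got ${score}`);
    }
  }
  return impact * probability;
}

// Maps a priority to the result ranges: >=15 MUST, 9-14 SHOULD, <=8 SKIP.
function classifyPriority(priority) {
  if (priority >= 15) return 'MUST';
  if (priority >= 9) return 'SHOULD';
  return 'SKIP';
}

// Example: money-related flow (impact 5) with multiple dependencies (probability 4).
const priority = riskPriority(5, 4);
console.log(priority, classifyPriority(priority)); // 20 MUST
```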

Business Impact Scoring (1-5)

Score	Impact Level	Examples
5	Critical	Money loss, security breach, data corruption, legal liability
4	High	Core business flow breaks (cannot complete purchase, cannot login)
3	Medium	Feature partially broken (search works but pagination fails)
2	Low	Minor UX issue (button disabled state wrong, tooltip missing)
1	Trivial	Cosmetic bug (color slightly off, spacing issue)

Probability of Failure Scoring (1-5)

Score	Probability	Indicators
5	Very High (>50%)	Complex algorithm, external API, new technology, no existing tests
4	High (25-50%)	Multiple dependencies, concurrency, state management
3	Medium (10-25%)	Standard CRUD, framework defaults, well-tested patterns
2	Low (5-10%)	Simple logic, established library, copy-paste from working code
1	Very Low (<5%)	Trivial assignment, framework-generated code

Priority Matrix Table

	Probability 1	Probability 2	Probability 3	Probability 4	Probability 5
Impact 5	5 (SKIP)	10 (SHOULD)	15 (MUST)	20 (MUST)	25 (MUST)
Impact 4	4 (SKIP)	8 (SKIP)	12 (SHOULD)	16 (MUST)	20 (MUST)
Impact 3	3 (SKIP)	6 (SKIP)	9 (SHOULD)	12 (SHOULD)	15 (MUST)
Impact 2	2 (SKIP)	4 (SKIP)	6 (SKIP)	8 (SKIP)	10 (SHOULD)
Impact 1	1 (SKIP)	2 (SKIP)	3 (SKIP)	4 (SKIP)	5 (SKIP)

Test Type Decision Tree

Step 1: Calculate Risk Priority

Use Risk Priority Matrix above.

Step 2: Select Test Type

IF Priority ≥15 → Proceed to Step 3
ELSE IF Priority 9-14 → Check Anti-Duplication (Step 4), then Step 3
ELSE Priority ≤8 → SKIP (manual testing sufficient)

Step 3: Choose Test Level

E2E Test (2-5 max per Story):

BASELINE (ALWAYS): 2 E2E tests per endpoint
- Test 1: Positive scenario (happy path validating main AC)
- Test 2: Negative scenario (critical error handling)
ADDITIONAL (3-5): ONLY if Priority ≥15 AND justified
- Critical edge case from manual testing
- Second endpoint (if Story implements multiple endpoints)
Examples:
- User registers → receives email → confirms → can login
- User adds product → proceeds to checkout → pays → sees confirmation
- User uploads file → sees progress → file appears in list

Integration Test (0-8 max per Story):

DEFAULT: 0 Integration tests (2 E2E tests cover full stack by default)
ADD ONLY if: E2E doesn't cover interaction completely AND Priority ≥15 AND justified
Examples:
- Transaction rollback on error (E2E tests happy path only)
- Concurrent request handling (E2E tests single request)
- External API error scenarios (500, timeout) with Priority ≥15
MANDATORY SKIP:
- ❌ Simple pass-through calls (E2E already validates end-to-end)
- ❌ Testing framework integrations (Prisma client, TypeORM repository, Express app)
- ❌ Testing database query execution (database engine behavior)

Unit Test (0-15 max per Story):

DEFAULT: 0 Unit tests (2 E2E tests cover simple logic by default)
ADD ONLY for complex business logic with Priority ≥15:
- Financial calculations (tax, discount, currency conversion) WITH COMPLEX RULES
- Security algorithms (password strength, permission matrix) WITH CUSTOM LOGIC
- Complex business algorithms (scoring, matching, ranking) WITH MULTIPLE FACTORS
MANDATORY SKIP - DO NOT create unit tests for:
- ❌ Simple CRUD operations (already covered by E2E)
- ❌ Framework code (Express middleware, React hooks, FastAPI dependencies)
- ❌ Library functions (bcrypt hashing, jsonwebtoken signing, axios requests)
- ❌ Database queries (Prisma findMany, TypeORM query builder, SQL joins)
- ❌ Getters/setters or simple property access
- ❌ Trivial conditionals (if (user) return user.name, status === 'active')
- ❌ Pass-through functions (wrappers without logic)
- ❌ Performance/load testing (benchmarks, stress tests, scalability validation)

Step 4: Anti-Duplication Check

Before writing ANY test, verify:

Is this scenario already covered by E2E?
- E2E tests payment flow → SKIP unit test for calculateTotal()
- E2E tests login → SKIP unit test for validateEmail()
Is this testing framework code?
- Testing Express app.use() → SKIP
- Testing React useState → SKIP
- Testing Prisma findMany() → SKIP
Does this add unique business value?
- E2E tests happy path → Unit test for edge case (negative price) → KEEP
- Integration test already validates DB transaction → SKIP duplicate unit test
Is this a one-line function?
- getFullName() { return firstName + lastName } → SKIP (E2E covers it)

Test Limits Per Story

Enforced Limits with Realistic Goals

Test Type	Minimum	Realistic Goal	Maximum	Purpose
E2E	2	2	5	Baseline: positive + negative per endpoint
Integration	0	0-2	8	ONLY if E2E doesn't cover interaction
Unit	0	0-3	15	ONLY complex business logic (financial/security/algorithms)
TOTAL	2	2-7	28	Start minimal, add only with justification

Key Change: Test limits are now CEILINGS (maximum allowed), NOT targets to fill. Start with 2 E2E tests, add more only with critical justification.
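
One way to make the ceilings checkable is a small plan validator. This is a sketch only: the limits are copied from the table above, while the `counts` shape and the name `checkTestPlan` are assumptions for illustration.

```javascript
// Per-type limits from the table above.
const LIMITS = {
  e2e: { min: 2, max: 5 },
  integration: { min: 0, max: 8 },
  unit: { min: 0, max: 15 },
};
const REALISTIC_TOTAL = 7;
const MAX_TOTAL = 28;

// `counts` is a hypothetical object such as { e2e: 2, integration: 0, unit: 1 }.
function checkTestPlan(counts) {
  const problems = [];
  let total = 0;
  for (const [type, { min, max }] of Object.entries(LIMITS)) {
    const n = counts[type] ?? 0; // missing types count as zero tests
    total += n;
    if (n < min) problems.push(`${type}: ${n} is below the minimum of ${min}`);
    if (n > max) problems.push(`${type}: ${n} exceeds the maximum of ${max}`);
  }
  if (total > MAX_TOTAL) problems.push(`total: ${total} exceeds the maximum of ${MAX_TOTAL}`);
  return { total, withinRealisticGoal: total <= REALISTIC_TOTAL, problems };
}

// Example: 2 baseline E2E tests plus 1 unit test for complex logic.
console.log(checkTestPlan({ e2e: 2, unit: 1 }));
```

A plan that passes with `withinRealisticGoal: false` is still allowed, but each test beyond the realistic goal should carry its documented justification.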

Rationale for Limits

Why maximum 5 E2E?

E2E tests are slow (10-60 seconds each)
Each Story typically has 2-4 Acceptance Criteria
1-2 E2E per AC is sufficient
Edge cases covered by Integration/Unit tests

Why maximum 8 Integration?

Integration tests validate layer interactions
Typical Story has 3-5 integration points (API → Service → DB)
1-2 tests per integration point + error scenarios

Why maximum 15 Unit?

Only test complex business logic
Typical Story has 2-4 complex functions
3-5 tests per function (happy path + edge cases)

Why total maximum 28?

Industry data: Stories with >30 tests rarely have proportional bug prevention
Maintenance cost grows quadratically beyond this point
Focus on quality over quantity

Common Over-Testing Anti-Patterns

Anti-Pattern 1: "Every if/else needs a test"

Bad:

// Function with 10 if/else branches
function processOrder(order) {
  if (!order) return null;           // Test 1
  if (!order.items) return null;      // Test 2
  if (order.items.length === 0) return null; // Test 3
  // ... 7 more conditionals
}

Problem: 10 unit tests for trivial validation logic already covered by E2E test that calls processOrder().

Good:

1 E2E test: User submits valid order → success
1 E2E test: User submits invalid order → error message
1 Unit test: Complex tax calculation inside processOrder() (if exists)

Total: 3 tests instead of 12 (10 unit tests plus the 2 E2E tests)

Anti-Pattern 2: "Testing framework code"

Bad:

// Testing Express middleware
test('CORS middleware sets headers', () => {
  // Testing Express, not OUR code
});

// Testing React hook
test('useState updates component', () => {
  // Testing React, not OUR code
});

Good:

Trust framework tests (Express/React have thousands of tests)
Test OUR business logic that USES framework

Anti-Pattern 3: "Duplicating E2E coverage with Unit tests"

Bad:

// E2E already tests: POST /api/orders → creates order in DB
test('E2E: User can create order', ...);          // E2E
test('Unit: createOrder() inserts to database', ...); // Duplicate!
test('Unit: createOrder() returns order object', ...); // Duplicate!

Good:

// E2E tests full flow
test('E2E: User can create order', ...);

// Unit tests ONLY complex calculation NOT fully exercised by E2E
test('Unit: Bulk discount applied when quantity > 100', ...);
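
To make the bulk-discount case concrete, here is a sketch of the kind of logic such a unit test would pin down. The discount rate, the function name `orderTotalCents`, and the use of integer cents are all assumptions for illustration; only the "quantity > 100" threshold comes from the example above.

```javascript
// Hypothetical pricing rule: 10% off when quantity is strictly greater than 100.
// Amounts are integer cents to avoid floating-point rounding issues.
const BULK_THRESHOLD = 100;
const BULK_DISCOUNT_PERCENT = 10;

function orderTotalCents(unitPriceCents, quantity) {
  if (!Number.isInteger(unitPriceCents) || unitPriceCents < 0) {
    throw new RangeError('unitPriceCents must be a non-negative integer');
  }
  if (!Number.isInteger(quantity) || quantity < 1) {
    throw new RangeError('quantity must be a positive integer');
  }
  const gross = unitPriceCents * quantity;
  if (quantity <= BULK_THRESHOLD) return gross;
  return gross - Math.round((gross * BULK_DISCOUNT_PERCENT) / 100);
}

// Boundary cases that a single E2E happy path is unlikely to exercise.
const cases = [
  { quantity: 100, expected: 100000 }, // at the threshold: no discount
  { quantity: 101, expected: 90900 }, // just above: 101000 minus 10100
];
for (const { quantity, expected } of cases) {
  const actual = orderTotalCents(1000, quantity);
  console.log(quantity, actual === expected ? 'ok' : `expected ${expected}, got ${actual}`);
}
```

In a Jest suite each entry in `cases` would become an `expect(...).toBe(...)` assertion inside the unit test named above.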

Anti-Pattern 4: "Aiming for 80% coverage"

Bad mindset:

"We have 75% coverage, need 5 more tests to hit 80%"
Writes tests for trivial getters/setters to inflate coverage

Good mindset:

"Payment flow is critical (Priority 25) but only has 1 E2E test"
"We have 60% coverage but all critical paths tested - DONE"

Anti-Pattern 5: "Testing framework integration"

Bad:

// Testing Express framework behavior
test('Express middleware chain works', () => {
  // Testing Express.js, not OUR code
});

// Testing Prisma client behavior
test('Prisma findMany returns array', () => {
  // Testing Prisma, not OUR code
});

// Testing React hook behavior
test('useState triggers rerender', () => {
  // Testing React, not OUR code
});

Why bad: Frameworks have thousands of tests. Trust the framework, test OUR business logic that USES the framework.

Good:

// Test OUR business logic that uses framework
test('E2E: User can create order', () => {
  // Tests OUR endpoint logic (which happens to use Express + Prisma)
  // But we're validating OUR business rules, not framework behavior
});

Anti-Pattern 6: "Testing database query syntax"

Bad:

// Testing database query execution
test('findByEmail() returns user from database', async () => {
  await prisma.user.findUnique({ where: { email: 'test@example.com' }});
  // Testing Prisma query builder, not OUR logic
});

// Testing SQL JOIN behavior
test('getUserWithOrders() joins tables correctly', () => {
  // Testing PostgreSQL JOIN semantics, not OUR logic
});

Why bad: Database engines have extensive test suites. We're testing PostgreSQL/MySQL, not our code.

Good:

// E2E test already validates query works
test('E2E: User can view order history', () => {
  // Implicitly validates that JOIN query works correctly
  // We test the USER OUTCOME, not the database mechanism
});

// Unit test ONLY for complex query construction logic
test('buildSearchQuery() with multiple filters generates correct WHERE clause', () => {
  // ONLY if we have complex query building logic with business rules
  // NOT testing database execution, testing OUR query builder logic
});
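
For illustration, a query builder worth unit testing might look like the sketch below. The filter names, the `orders` table, and its columns are hypothetical; the point is that the function contains our own branching logic and can be tested without touching a database.

```javascript
// Hypothetical builder: turns optional filters into a parameterized query.
// Values go into `params` and are referenced as $1, $2, ... in the SQL text,
// so user input is never concatenated into the statement.
function buildSearchQuery(filters) {
  const clauses = [];
  const params = [];
  const add = (column, operator, value) => {
    params.push(value);
    clauses.push(`${column} ${operator} $${params.length}`);
  };

  if (filters.status) add('status', '=', filters.status);
  if (filters.minTotal != null) add('total', '>=', filters.minTotal); // 0 is a valid filter
  if (filters.customerId != null) add('customer_id', '=', filters.customerId);

  const where = clauses.length > 0 ? ` WHERE ${clauses.join(' AND ')}` : '';
  return { text: `SELECT * FROM orders${where}`, params };
}

const query = buildSearchQuery({ status: 'paid', minTotal: 5000 });
console.log(query.text); // SELECT * FROM orders WHERE status = $1 AND total >= $2
console.log(query.params); // [ 'paid', 5000 ]
```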

Anti-Pattern 7: "Testing library behavior"

Bad:

// Testing bcrypt library
test('bcrypt hashes password correctly', async () => {
  const hash = await bcrypt.hash('password', 10);
  const valid = await bcrypt.compare('password', hash);
  expect(valid).toBe(true);
  // Testing bcrypt library, not OUR code
});

// Testing jsonwebtoken library
test('JWT token is valid', () => {
  const token = jwt.sign({ userId: 1 }, SECRET);
  const decoded = jwt.verify(token, SECRET);
  // Testing jsonwebtoken library, not OUR code
});

// Testing axios library
test('axios makes HTTP request', async () => {
  await axios.get('https://api.example.com');
  // Testing axios library, not OUR code
});

Why bad: Libraries are already tested by their maintainers. We're duplicating their test suite.

Good:

// E2E test validates full authentication flow
test('E2E: User can login and access protected endpoint', () => {
  // Implicitly validates that bcrypt comparison works
  // AND that JWT token generation/validation works
  // But we test the USER FLOW, not library internals
});

// Unit test ONLY for custom password rules (OUR business logic)
test('validatePasswordStrength() requires 12+ chars with special symbols', () => {
  // Testing OUR custom password policy, not bcrypt itself
});
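
As a sketch of what "OUR custom password policy" could look like, the function below implements the 12+ characters and special symbol rule from the test name above. The exact definition of "special symbol" (any character that is not an ASCII letter, digit, or whitespace) and the error messages are assumptions for illustration.

```javascript
// Assumed policy: at least 12 characters and at least one special symbol.
const MIN_LENGTH = 12;
// Assumption: "special" means anything other than ASCII letters, digits, or whitespace.
const SPECIAL_SYMBOL = /[^A-Za-z0-9\s]/;

function validatePasswordStrength(password) {
  if (typeof password !== 'string') {
    return { valid: false, errors: ['must be a string'] };
  }
  const errors = [];
  if (password.length < MIN_LENGTH) errors.push(`must be at least ${MIN_LENGTH} characters`);
  if (!SPECIAL_SYMBOL.test(password)) errors.push('must contain a special symbol');
  return { valid: errors.length === 0, errors };
}

console.log(validatePasswordStrength('correct-horse-battery').valid); // true
console.log(validatePasswordStrength('short!').errors); // [ 'must be at least 12 characters' ]
```

None of this touches bcrypt: the unit test covers the policy, while the E2E login test covers hashing and comparison implicitly.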

When to Break the Rules

Scenario 1: Regulatory Compliance

Financial/Healthcare applications:

May need >28 tests for audit trail
Document WHY each test exists (regulation reference)

Scenario 2: Bug-Prone Legacy Code

If Story modifies legacy code with history of bugs:

Increase Unit test limit to 20
Add characterization tests

Scenario 3: Public API

If Story creates API consumed by 3rd parties:

Increase Integration test limit to 12
Test all error codes (400, 401, 403, 404, 429, 500)

Scenario 4: Security-Critical Features

Authentication, authorization, encryption:

All scenarios Priority ≥15
May reach 28 test maximum legitimately

Quick Reference

Decision Flowchart (Minimum Viable Testing)

1. Start with 2 baseline E2E tests (positive + negative) - ALWAYS
   ↓
2. For test #3 and beyond, calculate Risk Priority (Impact × Probability)
   ↓
3. Priority ≥15?
   NO (≤14) → SKIP (manual testing sufficient)
   YES → Proceed to Step 4
   ↓
4. Critical Justification Check (ALL must be YES):
   ❓ Tests OUR business logic? (not framework/library/database)
   ❓ Not already covered by 2 baseline E2E?
   ❓ Unique business value?
   ANY NO? → SKIP
   ALL YES? → Proceed to Step 5
   ↓
5. Select Test Type:
   - User flow? → E2E #3-5 (with justification)
   - E2E doesn't cover interaction? → Integration 0-8 (with justification)
   - Complex OUR algorithm? → Unit 0-15 (with justification)
   ↓
6. Verify total ≤7 (realistic goal) or ≤28 (hard limit)
   > 7 tests? → Auto-trim by Priority, keep 2 baseline E2E + top 5 Priority

Red Flags (Stop and Reconsider)

❌ "I need to test every branch for coverage" → Focus on business risk, not coverage
❌ "This E2E already tests it, but I'll add unit test anyway" → Duplication
❌ "Need to test Express middleware behavior" → Testing framework, not OUR code
❌ "Need to test Prisma query execution" → Testing database/ORM, not OUR code
❌ "Need to test bcrypt hashing" → Testing library, not OUR code
❌ "Story has 45 tests" → Exceeds limit, prioritize and trim
❌ "Story has 15 tests but includes Prisma/bcrypt/Express tests" → Testing framework/library, remove
❌ "Testing getter/setter" → Trivial code, E2E covers it
❌ "Need more tests to hit 10 minimum" → Minimum is 2, not 10!

Green Lights (Good Test)

✅ "2 E2E tests: positive + negative for main endpoint" → Baseline (ALWAYS)
✅ "Tax calculation with country-specific rules, Priority 25" → Unit test (OUR complex logic)
✅ "User must complete checkout, Priority 20" → E2E test (user value)
✅ "Story has 3 tests: 2 E2E + 1 Unit for OUR tax logic" → Minimum viable!
✅ "Story has 5 tests, all test OUR business logic, all Priority ≥15" → Justified and minimal
✅ "Skipped 8 scenarios - all were framework/library behavior" → Good filtering!

References

Kent Beck, "Test Desiderata" (2018)
Martin Fowler, "Practical Test Pyramid" (2018)
Kent C. Dodds, "The Testing Trophy" (2020)
Google Testing Blog, "Code Coverage Best Practices" (2020)
Netflix Tech Blog, "Testing Strategy at Scale" (2021)
Michael Feathers, "Working Effectively with Legacy Code" (2004)
OWASP Testing Guide v4.2 (2023)

Version History

Version	Date	Changes
1.0	2025-10-31	Initial Risk-Based Testing framework to replace Test Pyramid (10-28 tests per Story)
2.0.0	2025-11-11	Minimum Viable Testing philosophy: Start with 2 E2E baseline, realistic goal 2-7 tests. Critical justification required for each test beyond baseline. New anti-patterns (5-7) for framework/library/database testing. Updated examples (Login 6→3, Search 7→2, Payment 13→5)

Version: 2.0.0
Last Updated: 2025-11-11
