Initial commit

2025-11-30 08:37:27 +08:00
commit 37774aa937
131 changed files with 31137 additions and 0 deletions
--- a/skills/ln-116-test-docs-creator/references/testing_strategy_template.md
+++ b/skills/ln-116-test-docs-creator/references/testing_strategy_template.md
@@ -0,0 +1,465 @@
+# Testing Strategy
+
+Universal testing philosophy and strategy for modern software projects: principles, organization, and best practices.
+
+> **SCOPE:** Testing philosophy, risk-based strategy, test organization, isolation patterns, what to test. **NOT IN SCOPE:** Project structure, framework-specific patterns, CI/CD configuration, test tooling setup.
+
+## Quick Navigation
+
+- **Tests Organization:** [tests/README.md](../../tests/README.md) - Directory structure, Story-Level Pattern, running tests
+- **Test Inventory:** [tests/unit/REGISTRY.md](../../tests/unit/REGISTRY.md), [tests/integration/REGISTRY.md](../../tests/integration/REGISTRY.md), [tests/e2e/REGISTRY.md](../../tests/e2e/REGISTRY.md)
+
+---
+
+## Core Philosophy
+
+### Test YOUR Code, Not Frameworks
+
+**Focus testing effort on YOUR business logic and integration usage.** Do not retest database constraints, ORM internals, framework validation, or third-party library mechanics.
+
+**Rule of thumb:** If deleting your code wouldn't fail the test, you're testing someone else's code.
+
+### Examples
+
+| Verdict | Test Description | Rationale |
+|---------|-----------------|-----------|
+| ✅ **GOOD** | Custom validation logic raises exception for invalid input | Tests YOUR validation rules |
+| ✅ **GOOD** | Repository query returns filtered results based on business criteria | Tests YOUR query construction |
+| ✅ **GOOD** | API endpoint returns correct HTTP status for error scenarios | Tests YOUR error handling |
+| ❌ **BAD** | Database enforces UNIQUE constraint on email column | Tests database, not your code |
+| ❌ **BAD** | ORM model has correct column types and lengths | Tests ORM configuration, not logic |
+| ❌ **BAD** | Framework validates request body matches schema | Tests framework validation |
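+
+As a minimal sketch of the first GOOD row, the Python test below exercises a custom validation rule. The names `ValidationError` and `validate_username`, and the rule itself, are hypothetical and only illustrate the idea.
+
+```python
+import re
+
+
+class ValidationError(ValueError):
+    """Raised when input violates a business rule (hypothetical)."""
+
+
+def validate_username(name: str) -> str:
+    """Hypothetical rule: 3-20 characters, letters, digits, or underscore."""
+    if not re.fullmatch(r"[A-Za-z0-9_]{3,20}", name):
+        raise ValidationError(f"invalid username: {name!r}")
+    return name
+
+
+def test_rejects_invalid_username() -> None:
+    # Removing the check in validate_username makes this test fail,
+    # so it tests YOUR code rather than a framework.
+    try:
+        validate_username("a b")
+    except ValidationError:
+        return
+    raise AssertionError("expected ValidationError")
+
+
+test_rejects_invalid_username()
+```
+
+This passes the rule of thumb above: delete the validation logic and the test fails.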
+
+---
+
+## Risk-Based Testing Strategy
+
+### Priority Matrix
+
+**Automate only high-value scenarios.** Score each scenario as Priority = Business Impact (1-5) × Probability (1-5), which gives a value from 1 to 25.
+
+| Priority Score | Action | Example Scenarios |
+|----------------|--------|-------------------|
+| **≥15** | MUST test | Payment processing, authentication, data loss scenarios |
+| **10-14** | Consider testing | Edge cases with moderate impact |
+| **<10** | Skip automated tests | Low-probability edge cases, framework behavior |
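+
+The scoring and thresholds in the table can be sketched in Python as follows. The function names `priority_score` and `action_for` are assumptions made for illustration; only the formula and the thresholds come from the matrix above.
+
+```python
+def priority_score(impact: int, probability: int) -> int:
+    """Business Impact (1-5) x Probability (1-5)."""
+    for name, value in (("impact", impact), ("probability", probability)):
+        if not 1 <= value <= 5:
+            raise ValueError(f"{name} must be between 1 and 5, got {value}")
+    return impact * probability
+
+
+def action_for(score: int) -> str:
+    """Map a priority score to the action in the table."""
+    if score >= 15:
+        return "MUST test"
+    if score >= 10:
+        return "Consider testing"
+    return "Skip automated tests"
+
+
+print(action_for(priority_score(5, 4)))  # MUST test
+print(action_for(priority_score(3, 4)))  # Consider testing
+print(action_for(priority_score(2, 3)))  # Skip automated tests
+```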
+
+### Test Caps (per Story)
+
+**Enforce caps to prevent test bloat:**
+
+- **E2E:** 2-5 tests
+- **Integration:** 3-8 tests
+- **Unit:** 5-15 tests
+- **Total:** 10-28 tests per Story
+
+**Key principles:**
+- **No minimum limits** - The ranges above are caps, not quotas; a Story can have 0 tests if no Priority ≥15 scenarios exist
+- **No test pyramids** - Test distribution is based on risk, not arbitrary ratios
+- **Every test must add value** - Each test should validate a unique Priority ≥15 scenario
+
+**Exception:** ML/GPU/Hardware-dependent workloads may favor more E2E (5-10), fewer Integration (2-5), minimal Unit (1-3) because behavior is hardware-dependent and mocks lack fidelity. Same 10-28 total cap applies.
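+
+One possible way to enforce the default caps is a small check over per-level test counts, sketched below in Python. `CAPS`, `TOTAL_CAP`, and `cap_violations` are hypothetical names. Only upper bounds are checked because the strategy sets no minimum, and the hardware-dependent exception would need different per-level values.
+
+```python
+# Upper bounds taken from the default caps; lower bounds are not enforced.
+CAPS = {"e2e": 5, "integration": 8, "unit": 15}
+TOTAL_CAP = 28
+
+
+def cap_violations(counts: dict[str, int]) -> list[str]:
+    """Return one message per cap that a Story's test counts exceed."""
+    problems = [
+        f"{level}: {counts.get(level, 0)} > {cap}"
+        for level, cap in CAPS.items()
+        if counts.get(level, 0) > cap
+    ]
+    total = sum(counts.get(level, 0) for level in CAPS)
+    if total > TOTAL_CAP:
+        problems.append(f"total: {total} > {TOTAL_CAP}")
+    return problems
+
+
+print(cap_violations({"e2e": 2, "integration": 3, "unit": 8}))  # []
+print(cap_violations({"e2e": 6, "integration": 3, "unit": 8}))  # ['e2e: 6 > 5']
+```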
+
+---
+
+## Story-Level Testing Pattern
+
+### When to Write Tests
+
+**Consolidate ALL tests in Story's final test task** AFTER implementation + manual verification.
+
+| Task Type | Contains Tests? | Rationale |
+|-----------|----------------|-----------|
+| **Implementation Tasks** | ❌ NO tests | Focus on implementation only |
+| **Final Test Task** | ✅ ALL tests | Complete Story coverage after manual verification |
+
+### Benefits
+
+1. **Complete context** - Tests are written once all code is implemented
+2. **No duplication** - E2E tests already cover integration paths, so lower levels need not retest the same code
+3. **Better prioritization** - Manual testing identifies Priority ≥15 scenarios before automation
+4. **Atomic delivery** - Story delivers working code + comprehensive tests together
+
+### Anti-Pattern Example
+
+| ❌ Wrong Approach | ✅ Correct Approach |
+|-------------------|---------------------|
+| Task 1: Implement feature X + write unit tests<br>Task 2: Update integration + write integration tests<br>Task 3: Add logging + write E2E tests | Task 1: Implement feature X<br>Task 2: Update integration points<br>Task 3: Add logging<br>**Task 4 (Final): Write ALL tests (2 E2E, 3 Integration, 8 Unit)** |
+| **Result:** Tests scattered, duplication, incomplete coverage | **Result:** Tests consolidated, no duplication, complete coverage |
+
+---
+
+## Test Organization
+
+### Directory Structure
+
+```
+tests/
+├── e2e/              # End-to-end tests (full system, real services)
+│   ├── test_user_journey.ext
+│   └── REGISTRY.md   # E2E test inventory
+├── integration/      # Integration tests (multiple components, real dependencies)
+│   ├── api/
+│   ├── services/
+│   ├── db/
+│   └── REGISTRY.md   # Integration test inventory
+├── unit/             # Unit tests (single component, mocked dependencies)
+│   ├── api/
+│   ├── services/
+│   ├── db/
+│   └── REGISTRY.md   # Unit test inventory
+└── README.md         # Test documentation
+```
+
+### Test Inventory (REGISTRY.md)
+
+**Each test category has a REGISTRY.md** with detailed test descriptions.
+
+**Purpose:**
+- Document what each test validates
+- Track test counts per Epic/Story
+- Provide navigation for test maintenance
+
+**Format example:**
+
+```markdown
+# E2E Test Registry
+
+## Quality Estimation (Epic 6 - API-69)
+
+**File:** tests/e2e/test_quality_estimation.ext
+
+**Tests (4):**
+1. **evaluate_endpoint_batch_splitting** - MetricX batch splitting (segments >128 split into batches)
+2. **evaluate_endpoint_gpu_integration** - MetricX-24 GPU service integration
+3. **evaluate_endpoint_error_handling** - Service timeout handling (503 status)
+4. **evaluate_endpoint_response_format** - Response schema validation
+
+**Total:** 4 E2E tests | **Coverage:** 100% Priority ≥15 scenarios
+```
+
+---
+
+## Test Levels
+
+### E2E (End-to-End) Tests
+
+**Definition:** Full system tests with real external services and complete data flow.
+
+**Characteristics:**
+- Real external APIs/services
+- Real database
+- Full request-response cycle
+- Validates complete user journeys
+
+**When to write:**
+- Critical user workflows (authentication, payments, core features)
+- Integration with external services
+- Priority ≥15 scenarios that span multiple systems
+
+**Example:** User registration flow (E2E) vs individual validation function (Unit)
+
+### Integration Tests
+
+**Definition:** Tests multiple components together with real dependencies (database, cache, file system).
+
+**Characteristics:**
+- Real database/cache/file system
+- Multiple components interact
+- May mock external APIs
+- Validates component integration
+
+**When to write:**
+- Database query behavior
+- Service orchestration
+- Component interaction
+- API endpoint behavior (without external services)
+
+**Example:** Repository query with real database vs service logic with mocked repository
+
+### Unit Tests
+
+**Definition:** Tests a single component in isolation with mocked dependencies.
+
+**Characteristics:**
+- Fast execution (<1ms per test)
+- No external dependencies
+- Mocked collaborators
+- Validates single responsibility
+
+**When to write:**
+- Business logic validation
+- Complex calculations
+- Error handling logic
+- Custom transformations
+
+**Example:** Validation function with mocked data vs endpoint with real database
+
+---
+
+## Isolation Patterns
+
+### Pattern Comparison
+
+| Pattern | Speed | Complexity | Best For |
+|---------|-------|------------|----------|
+| **Data Deletion** | ⚡⚡⚡ Fastest | Simple | Default choice (90% of projects) |
+| **Transaction Rollback** | ⚡⚡ Fast | Moderate | Transaction semantics testing |
+| **Database Recreation** | ⚡ Slow | Simple | Maximum isolation paranoia |
+
+### Data Deletion (Default)
+
+**How it works:**
+1. Create schema once at test session start
+2. Delete data after each test
+3. Drop schema at test session end
+
+**Benefits:**
+- Fast (5-8s for 50 tests)
+- Simple implementation
+- Full isolation between tests
+
+**When to use:** Default choice for most projects
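+
+The three steps can be sketched with Python's standard `unittest` and `sqlite3` modules. This is only an illustration under assumptions: the `users` table is hypothetical, an in-memory SQLite database stands in for the real one, and "session" is narrowed to one test class.
+
+```python
+import sqlite3
+import unittest
+
+
+class DataDeletionTestCase(unittest.TestCase):
+    """Schema is created once; rows are deleted after every test."""
+
+    @classmethod
+    def setUpClass(cls) -> None:
+        # 1. Create schema once (here: once per test class).
+        cls.conn = sqlite3.connect(":memory:")
+        cls.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
+
+    def tearDown(self) -> None:
+        # 2. Delete data after each test.
+        self.conn.execute("DELETE FROM users")
+        self.conn.commit()
+
+    @classmethod
+    def tearDownClass(cls) -> None:
+        # 3. Drop schema at the end.
+        cls.conn.execute("DROP TABLE users")
+        cls.conn.close()
+
+    def _count(self) -> int:
+        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
+
+    def test_first_insert(self) -> None:
+        self.conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
+        self.assertEqual(self._count(), 1)
+
+    def test_second_insert(self) -> None:
+        # Would see 2 rows if the previous test's data had not been deleted.
+        self.conn.execute("INSERT INTO users (email) VALUES ('b@example.com')")
+        self.assertEqual(self._count(), 1)
+```
+
+Running the file with `python -m unittest` executes both tests; each sees an empty table regardless of execution order.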
+
+### Transaction Rollback
+
+**How it works:**
+1. Start transaction before each test
+2. Run test code
+3. Rollback transaction after test
+
+**Benefits:**
+- Good for testing transaction semantics
+- Faster than DB recreation
+
+**When to use:** Testing transaction behavior, savepoints, isolation levels
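+
+A minimal Python sketch of the rollback pattern, again assuming an in-memory SQLite database and a hypothetical `users` table. It relies on the code under test not committing on its own; otherwise the rollback cannot undo the changes.
+
+```python
+import sqlite3
+import unittest
+
+
+class RollbackTestCase(unittest.TestCase):
+    """Each test runs inside a transaction that is rolled back afterwards."""
+
+    @classmethod
+    def setUpClass(cls) -> None:
+        # isolation_level=None disables the module's implicit transactions,
+        # so the BEGIN and ROLLBACK statements below are fully explicit.
+        cls.conn = sqlite3.connect(":memory:", isolation_level=None)
+        cls.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
+
+    @classmethod
+    def tearDownClass(cls) -> None:
+        cls.conn.close()
+
+    def setUp(self) -> None:
+        self.conn.execute("BEGIN")  # 1. Start transaction before each test.
+
+    def tearDown(self) -> None:
+        self.conn.execute("ROLLBACK")  # 3. Roll back after the test.
+
+    def _count(self) -> int:
+        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
+
+    def test_first_insert(self) -> None:
+        # 2. Run test code inside the open transaction.
+        self.conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
+        self.assertEqual(self._count(), 1)
+
+    def test_second_insert(self) -> None:
+        self.conn.execute("INSERT INTO users (email) VALUES ('b@example.com')")
+        self.assertEqual(self._count(), 1)
+```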
+
+### Database Recreation
+
+**How it works:**
+1. Drop and recreate database before each test
+2. Apply migrations
+3. Run test
+
+**Benefits:**
+- Maximum isolation
+- Catches migration issues
+
+**When to use:** Paranoia about shared state, testing migrations
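+
+As a rough Python sketch, a fresh in-memory SQLite database per test can stand in for dropping and recreating a real database. The `MIGRATIONS` list is hypothetical; a real project would invoke its own migration tool at that step.
+
+```python
+import sqlite3
+import unittest
+
+# Hypothetical migrations, applied in order for every test.
+MIGRATIONS = [
+    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)",
+    "ALTER TABLE users ADD COLUMN active INTEGER NOT NULL DEFAULT 1",
+]
+
+
+class RecreationTestCase(unittest.TestCase):
+    """Every test gets a brand-new database with all migrations applied."""
+
+    def setUp(self) -> None:
+        # 1. A new in-memory database replaces drop + recreate.
+        self.conn = sqlite3.connect(":memory:")
+        # 2. Apply migrations; a broken migration fails every test here.
+        for statement in MIGRATIONS:
+            self.conn.execute(statement)
+
+    def tearDown(self) -> None:
+        self.conn.close()
+
+    def _count(self) -> int:
+        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
+
+    def test_first_insert(self) -> None:
+        # 3. Run the test against the fresh database.
+        self.conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
+        self.assertEqual(self._count(), 1)
+
+    def test_second_insert(self) -> None:
+        self.conn.execute("INSERT INTO users (email) VALUES ('b@example.com')")
+        self.assertEqual(self._count(), 1)
+```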
+
+---
+
+## What To Test vs NOT Test
+
+### ✅ Test (GOOD)
+
+**Test YOUR code and integration usage:**
+
+| Category | Examples |
+|----------|----------|
+| **Business logic** | Validation rules, orchestration, error handling, computed properties |
+| **Query construction** | Filters, joins, aggregations, pagination |
+| **API behavior** | Request validation, response shape, HTTP status codes |
+| **Custom validators** | Complex validation logic, transformations |
+| **Integration smoke** | Database connectivity, basic CRUD, configuration |
+
+### ❌ Avoid (BAD)
+
+**Don't test framework internals and third-party libraries:**
+
+| Category | Examples |
+|----------|----------|
+| **Database constraints** | UNIQUE, FOREIGN KEY, NOT NULL, CHECK constraints |
+| **ORM internals** | Column types, table creation, metadata, relationships |
+| **Framework validation** | Request body validation, dependency injection, routing |
+| **Third-party libraries** | HTTP client behavior, serialization libraries, cryptography |
+
+---
+
+## Testing Patterns
+
+### Arrange-Act-Assert
+
+**Structure tests clearly:**
+
+```
+test_example:
+    # ARRANGE: Set up test data and dependencies
+    setup_data()
+    mock_dependencies()
+
+    # ACT: Execute code under test
+    result = execute_operation()
+
+    # ASSERT: Verify outcomes
+    assert result == expected
+    verify_side_effects()
+```
+
+**Benefits:**
+- Clear test structure
+- Easy to read and maintain
+- Explicit test phases
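+
+For a concrete instance of the three phases, here is a short Python sketch. The function `apply_discount` is a hypothetical example of code under test, not part of any real project.
+
+```python
+def apply_discount(total: float, percent: float) -> float:
+    """Hypothetical function under test."""
+    if not 0 <= percent <= 100:
+        raise ValueError("percent must be between 0 and 100")
+    return round(total * (1 - percent / 100), 2)
+
+
+def test_apply_discount_reduces_total() -> None:
+    # ARRANGE: set up inputs
+    total, percent = 200.0, 25.0
+
+    # ACT: execute the code under test
+    result = apply_discount(total, percent)
+
+    # ASSERT: verify the outcome
+    assert result == 150.0
+
+
+test_apply_discount_reduces_total()
+```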
+
+### Mock at the Seam
+
+**Mock at component boundaries, not internals:**
+
+| Test Type | What to Mock | What to Use Real |
+|-----------|--------------|------------------|
+| **Unit tests** | External dependencies (repositories, APIs, file system) | Business logic |
+| **Integration tests** | External APIs, slow services | Database, cache, your code |
+| **E2E tests** | Nothing (or minimal external services) | Everything |
+
+**Anti-pattern:** Over-mocking your own code defeats the purpose of integration tests.
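+
+The unit-test row can be illustrated with Python's `unittest.mock`. `SignupService` and its repository methods `exists` and `add` are hypothetical; the point of the sketch is that only the boundary is mocked while the business logic runs for real.
+
+```python
+from unittest.mock import Mock
+
+
+class SignupService:
+    """Hypothetical service; the repository is its seam to the outside."""
+
+    def __init__(self, user_repository) -> None:
+        self._users = user_repository
+
+    def register(self, email: str) -> str:
+        normalized = email.strip().lower()
+        if self._users.exists(normalized):
+            raise ValueError("email already registered")
+        self._users.add(normalized)
+        return normalized
+
+
+def test_register_normalizes_and_stores_email() -> None:
+    repo = Mock()  # mocked boundary
+    repo.exists.return_value = False
+    service = SignupService(repo)  # real business logic
+
+    result = service.register("  Alice@Example.com ")
+
+    assert result == "alice@example.com"
+    repo.add.assert_called_once_with("alice@example.com")
+
+
+test_register_normalizes_and_stores_email()
+```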
+
+### Test Data Builders
+
+**Create readable test data:**
+
+```
+# Builder pattern for test data
+user = build_user(
+    email="test@example.com",
+    role="admin",
+    active=True
+)
+
+# Easy to create edge cases
+inactive_user = build_user(active=False)
+guest_user = build_user(role="guest")
+```
+
+**Benefits:**
+- Readable test setup
+- Easy edge case creation
+- Reusable across tests
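+
+One way `build_user` could be implemented in Python is sketched below. The `User` fields mirror the snippet above, while the default values (for example `role="member"`) are assumptions chosen for illustration.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass(frozen=True)
+class User:
+    email: str
+    role: str
+    active: bool
+
+
+def build_user(**overrides) -> User:
+    """Return a valid User; keyword arguments override the defaults."""
+    defaults = {"email": "test@example.com", "role": "member", "active": True}
+    return User(**{**defaults, **overrides})
+
+
+inactive_user = build_user(active=False)
+guest_user = build_user(role="guest")
+print(inactive_user)  # User(email='test@example.com', role='member', active=False)
+```
+
+Each test states only the fields that matter to it, so the edge case under test stays visible in the setup.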
+
+---
+
+## Common Issues
+
+### Flaky Tests
+
+**Symptom:** Tests pass/fail randomly without code changes
+
+**Common causes:**
+- Shared state between tests (global variables, cached data)
+- Time-dependent logic (timestamps, delays)
+- External service instability
+- Improper cleanup between tests
+
+**Solutions:**
+- Isolate test data (per-test creation, cleanup)
+- Mock time-dependent code
+- Use test-specific configurations
+- Implement proper teardown
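+
+For time-dependent logic, one common approach is to inject the clock so that tests can freeze it. The Python sketch below assumes a hypothetical `is_token_expired` function; the `now` parameter is the injection point.
+
+```python
+from datetime import datetime, timedelta, timezone
+from typing import Callable
+
+
+def is_token_expired(
+    issued_at: datetime,
+    ttl: timedelta,
+    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
+) -> bool:
+    """Hypothetical time-dependent logic with an injectable clock."""
+    return now() >= issued_at + ttl
+
+
+ISSUED = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
+
+
+def test_token_expires_after_ttl() -> None:
+    frozen = lambda: ISSUED + timedelta(minutes=31)  # fixed clock
+    assert is_token_expired(ISSUED, timedelta(minutes=30), now=frozen)
+
+
+def test_token_valid_before_ttl() -> None:
+    frozen = lambda: ISSUED + timedelta(minutes=29)
+    assert not is_token_expired(ISSUED, timedelta(minutes=30), now=frozen)
+
+
+test_token_expires_after_ttl()
+test_token_valid_before_ttl()
+```
+
+Because the tests never read the real clock, their results do not depend on when or how fast they run.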
+
+### Slow Tests
+
+**Symptom:** Test suite takes too long (>30s for 50 tests)
+
+**Common causes:**
+- Database recreation per test
+- Running migrations per test
+- No connection pooling
+- Too many E2E tests
+
+**Solutions:**
+- Use Data Deletion pattern
+- Run migrations once per session
+- Optimize test data creation
+- Rebalance test levels (replace an E2E test with a Unit or Integration test where it covers the same risk)
+
+### Test Coupling
+
+**Symptom:** Changing one component breaks many unrelated tests
+
+**Common causes:**
+- Tests depend on implementation details
+- Shared test fixtures across unrelated tests
+- Testing framework internals instead of behavior
+
+**Solutions:**
+- Test behavior, not implementation
+- Use independent test data per test
+- Focus on public APIs, not internal state
+
+---
+
+## Coverage Guidelines
+
+### Targets
+
+| Layer | Target | Priority |
+|-------|--------|----------|
+| **Critical business logic** | 100% branch coverage | HIGH |
+| **Repositories/Data access** | 90%+ line coverage | HIGH |
+| **API endpoints** | 80%+ line coverage | MEDIUM |
+| **Utilities/Helpers** | 80%+ line coverage | MEDIUM |
+| **Overall** | 80%+ line coverage | MEDIUM |
+
+### What Coverage Means
+
+**Coverage is a tool, not a goal:**
+- ✅ High coverage + focused tests = good quality signal
+- ❌ High coverage + meaningless tests = false confidence
+- ❌ Low coverage = blind spots in testing
+
+**Focus on:**
+- Critical paths covered
+- Edge cases tested
+- Error handling validated
+
+**Not on:**
+- Arbitrary percentage targets
+- Testing getters/setters
+- Framework code
+
+---
+
+## Verification Checklist
+
+### Strategy
+
+- [ ] Risk-based selection (Priority ≥15)
+- [ ] Test caps enforced (E2E 2-5, Integration 3-8, Unit 5-15)
+- [ ] Total 10-28 tests per Story
+- [ ] Tests target YOUR code, not framework internals
+- [ ] E2E smoke tests for critical integrations
+
+### Organization
+
+- [ ] Story-Level Test Task Pattern followed
+- [ ] Tests consolidated in final Story task
+- [ ] REGISTRY.md files maintained for all test categories
+- [ ] Test directory structure follows conventions
+
+### Isolation
+
+- [ ] Isolation pattern chosen (Data Deletion recommended)
+- [ ] Each test creates own data
+- [ ] Proper cleanup between tests
+- [ ] No shared state between tests
+
+### Quality
+
+- [ ] Tests are order-independent
+- [ ] Tests run fast (<10s for 50 integration tests)
+- [ ] No flaky tests
+- [ ] Coverage ≥80% overall, 100% for critical logic
+- [ ] Meaningful test names and descriptions
+
+---
+
+## Maintenance
+
+**Update Triggers:**
+- New testing patterns discovered
+- Framework version changes affecting tests
+- Significant changes to test architecture
+- New isolation issues identified
+
+**Verification:** Review this strategy when starting new projects or experiencing test quality issues.
+
+**Last Updated:** [CURRENT_DATE] - Initial universal testing strategy