Initial commit

skills/ln-350-story-test-planner/SKILL.md (new file, 301 lines)
@@ -0,0 +1,301 @@
|
||||
---
|
||||
name: ln-350-story-test-planner
|
||||
description: Plans Story test task by Risk-Based Testing after manual testing. Calculates priorities, selects E2E/Integration/Unit, delegates to ln-311-task-creator. Invoked by ln-340-story-quality-gate.
|
||||
---
|
||||
|
||||
# Test Task Planner
|
||||
|
||||
Creates final Story task with comprehensive test coverage (Unit/Integration/E2E) PLUS existing test fixes, infrastructure updates, documentation, and legacy cleanup based on REAL manual testing results.
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
This skill should be used when:
|
||||
- **Invoked by ln-340-story-quality-gate Pass 1** after manual functional testing PASSED
|
||||
- **Invocation method:** Use Skill tool with command: `Skill(command: "ln-350-story-test-planner")`
|
||||
- All implementation tasks in Story are Done
|
||||
- Manual testing results documented in Linear comment
|
||||
- **Goal:** Create the final Story task covering tests, existing test fixes, infrastructure updates, documentation, and legacy cleanup
|
||||
|
||||
**Prerequisites:**
|
||||
- All implementation Tasks in Story status = Done
|
||||
- ln-340-story-quality-gate Pass 1 completed manual testing
|
||||
- Manual test results in Linear comment (created by ln-340-story-quality-gate Phase 3 step 4)
|
||||
|
||||
**Automation:** Supports `autoApprove: true` (default when invoked by ln-340-story-quality-gate) to skip manual confirmation and run unattended.
|
||||
|
||||
## When NOT to Use
|
||||
|
||||
Do NOT use if:
|
||||
- Manual testing NOT completed → Wait for ln-340-story-quality-gate Pass 1
|
||||
- Manual test results NOT in Linear comment → ln-340-story-quality-gate must document first
|
||||
- Implementation tasks NOT all Done → Complete impl tasks first
|
||||
|
||||
## How It Works
|
||||
|
||||
### Phase 1: Discovery (Automated)
|
||||
|
||||
Auto-discovers Team ID from `docs/tasks/kanban_board.md` (see CLAUDE.md "Configuration Auto-Discovery").
|
||||
|
||||
**Input:** Story ID from user (e.g., US001, API-42)
|
||||
|
||||
### Phase 2: Story + Tasks Analysis (NO Dialog)
|
||||
|
||||
**Step 0: Study Project Test Files**
|
||||
1. Scan for test-related files:
|
||||
- tests/README.md (commands, setup, environment)
|
||||
- Test configs (jest.config.js, vitest.config.ts, pytest.ini)
|
||||
- Existing test structure (tests/, __tests__/ directories)
|
||||
- Coverage config (.coveragerc, coverage.json)
|
||||
2. Extract: test commands, framework, patterns, coverage thresholds
|
||||
3. Ensure test planning aligns with existing project practices (a discovery sketch follows below)
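
A minimal discovery sketch, assuming a Node.js project and limited to the file names listed above (the paths and helper name are illustrative, not part of the skill's contract):

```javascript
// Hypothetical sketch: locate test-related files so the plan follows project conventions.
const fs = require('fs');
const path = require('path');

const CANDIDATE_FILES = [
  'tests/README.md',
  'jest.config.js',
  'vitest.config.ts',
  'pytest.ini',
  '.coveragerc',
  'coverage.json',
];

function discoverTestSetup(projectRoot) {
  const configFiles = CANDIDATE_FILES
    .map((rel) => path.join(projectRoot, rel))
    .filter((abs) => fs.existsSync(abs));

  const testDirs = ['tests', '__tests__']
    .map((rel) => path.join(projectRoot, rel))
    .filter((abs) => fs.existsSync(abs) && fs.statSync(abs).isDirectory());

  return { configFiles, testDirs };
}

// Example: discoverTestSetup(process.cwd())
```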
|
||||
|
||||
**Step 1: Load Manual Test Results**
|
||||
1. Fetch Story from Linear (must have label "user-story")
|
||||
2. Extract Story.id (UUID) - ⚠️ Use UUID, NOT short ID (required for Linear API)
|
||||
3. Load manual test results comment (format: ln-343-manual-tester Format v1.0)
|
||||
- Search for the header containing "Manual Testing Results" (see `ln-343-manual-tester/references/test_result_format_v1.md`)
|
||||
- If not found → ERROR: Run ln-340-story-quality-gate Pass 1 first
|
||||
4. Parse sections: AC results (PASS/FAIL), Edge Cases, Error Handling, Integration flows
|
||||
5. Map to test design: PASSED AC → E2E, Edge cases → Unit, Errors → Error handling, Flows → Integration
|
||||
|
||||
**Step 2: Analyze Story + Tasks**
|
||||
1. Parse Story: Goal, Test Strategy, Technical Notes
|
||||
2. Fetch **all child Tasks** (parentId = Story.id, status = Done) from Linear
|
||||
3. Analyze each Task:
|
||||
- Components implemented
|
||||
- Business logic added
|
||||
- Integration points created
|
||||
- Conditional branches (if/else/switch)
|
||||
4. Identify what needs testing
|
||||
|
||||
### Phase 3: Parsing Strategy for Manual Test Results
|
||||
|
||||
**Process:** Locate Linear comment with the "Manual Testing Results" header described in `ln-343-manual-tester/references/test_result_format_v1.md` → Verify Format Version 1.0 → Extract structured sections (Acceptance Criteria, Test Results by AC, Edge Cases, Error Handling, Integration Testing) using regex → Validate (at least 1 PASSED AC, AC count matches Story, completeness check) → Map parsed data to test design structure
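
A rough parsing sketch, assuming the comment is plain Markdown and that the section headings match the names above (the exact heading strings are defined in `test_result_format_v1.md`, so the regexes below are placeholders):

```javascript
// Hypothetical parser sketch for the Manual Testing Results comment (Format v1.0).
function parseManualTestResults(commentBody) {
  if (!/Manual Testing Results/.test(commentBody)) {
    throw new Error('Manual Testing Results comment not found - run ln-340-story-quality-gate Pass 1 first');
  }

  const formatMatch = commentBody.match(/Format Version:?\s*([\d.]+)/i);
  const formatVersion = formatMatch ? formatMatch[1] : null; // null → WARNING, try legacy parsing

  // Slice the comment into sections by Markdown headings (heading names assumed).
  const headings = [];
  const headingRegex = /^#{2,}\s*(.+?)\s*$/gm;
  let match;
  while ((match = headingRegex.exec(commentBody)) !== null) {
    headings.push({ name: match[1], headingStart: match.index, contentStart: match.index + match[0].length });
  }
  const sections = {};
  headings.forEach((h, i) => {
    const end = i + 1 < headings.length ? headings[i + 1].headingStart : commentBody.length;
    sections[h.name] = commentBody.slice(h.contentStart, end).trim();
  });

  // Validate: at least one PASSED AC (marker format assumed).
  const acResults = sections['Test Results by AC'] || '';
  const passedCount = (acResults.match(/PASS/g) || []).length;
  if (passedCount === 0) {
    throw new Error('No PASSED AC found - fix implementation before planning tests');
  }

  return { formatVersion, sections, passedCount };
}
```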
|
||||
|
||||
**Error Handling:** Missing comment → ERROR (run ln-340-story-quality-gate Pass 1 first), Missing format version → WARNING (try legacy parsing), Required section missing → ERROR (re-run ln-340-story-quality-gate), No PASSED AC → ERROR (fix implementation)
|
||||
|
||||
### Phase 4: Risk-Based Test Planning (Automated)
|
||||
|
||||
**Reference:** See `references/risk_based_testing_guide.md` for complete methodology (Business Impact/Probability tables, detailed decision trees, anti-patterns with code examples).
|
||||
|
||||
**E2E-First Approach:** Prioritize by business risk (Priority = Impact × Probability), not coverage metrics.
|
||||
|
||||
**Workflow:**
|
||||
|
||||
**Step 1: Risk Assessment**
|
||||
|
||||
Calculate Priority for each scenario from manual testing:
|
||||
|
||||
```
|
||||
Priority = Business Impact (1-5) × Probability (1-5)
|
||||
```
|
||||
|
||||
**Decision Criteria:**
|
||||
- Priority ≥15 → **MUST test**
|
||||
- Priority 9-14 → **SHOULD test** if not covered
|
||||
- Priority ≤8 → **SKIP** (manual testing sufficient)
|
||||
|
||||
*See guide: Business Impact Table, Probability Table, Priority Matrix 5×5*
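
As a sketch, the scoring and decision criteria above reduce to a few lines (the scenario shape is illustrative):

```javascript
// Hypothetical sketch of Step 1 scoring.
function scoreScenario(scenario) {
  // scenario = { name, businessImpact: 1-5, probability: 1-5 }
  const priority = scenario.businessImpact * scenario.probability;
  let decision;
  if (priority >= 15) decision = 'MUST test';
  else if (priority >= 9) decision = 'SHOULD test (if not already covered)';
  else decision = 'SKIP (manual testing sufficient)';
  return { ...scenario, priority, decision };
}

// scoreScenario({ name: 'Tax calculation by region', businessImpact: 5, probability: 5 })
// → priority 25, 'MUST test'
```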
|
||||
|
||||
**Step 2: E2E Test Selection (2-5):** Baseline 2 (positive + negative) ALWAYS + 0-3 additional (Priority ≥15 only)
|
||||
|
||||
**Step 3: Unit Test Selection (0-15):** DEFAULT 0. Add ONLY for complex business logic (Priority ≥15): financial, security, algorithms
|
||||
|
||||
**Step 4: Integration Test Selection (0-8):** DEFAULT 0. Add ONLY if E2E gaps AND Priority ≥15: rollback, concurrency, external API errors
|
||||
|
||||
**Step 5: Validation:** Limits 2-28 total (realistic goal: 2-7). Auto-trim if >7 (keep 2 baseline + top 5 by Priority)
|
||||
|
||||
**Decision criteria details in guide:** Justification Checks, MANDATORY SKIP lists, Anti-Framework Rule
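
A minimal sketch of the Step 5 validation, assuming each candidate test carries a `priority` score and a `baseline` flag marking the two mandatory E2E tests:

```javascript
// Hypothetical sketch of Step 5: enforce the realistic goal of 7 tests.
function validateAndTrim(tests) {
  const baseline = tests.filter((t) => t.baseline);                 // 2 E2E: positive + negative
  const extras = tests
    .filter((t) => !t.baseline)
    .sort((a, b) => b.priority - a.priority);                       // highest Priority first

  if (baseline.length < 2) {
    throw new Error('Baseline of 2 E2E tests (positive + negative) is mandatory');
  }
  if (tests.length <= 7) return { kept: tests, trimmed: [] };

  return {
    kept: [...baseline, ...extras.slice(0, 5)],                     // 2 baseline + top 5 by Priority
    trimmed: extras.slice(5),                                       // document as "Covered by manual testing"
  };
}
```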
|
||||
|
||||
### Phase 5: Test Task Generation (Automated)
|
||||
|
||||
Generates complete Story Finalizer test task per `test_task_template.md` (11 sections):
|
||||
|
||||
**Sections 1-7:** Context, Risk Matrix, E2E/Integration/Unit Tests (with Priority scores + justifications), Coverage, DoD
|
||||
|
||||
**Section 8:** Existing Tests to Fix (analysis of affected tests from implementation tasks)
|
||||
|
||||
**Section 9:** Infrastructure Changes (packages, Docker, configs - based on test dependencies)
|
||||
|
||||
**Section 10:** Documentation Updates (README, CHANGELOG, tests/README, config docs)
|
||||
|
||||
**Section 11:** Legacy Code Cleanup (deprecated patterns, backward compat, dead code)
|
||||
|
||||
Shows preview for review.
|
||||
|
||||
### Phase 6: Confirmation & Delegation
|
||||
|
||||
**Step 1:** Preview generated test plan (always displayed for transparency)
|
||||
|
||||
**Step 2:** Confirmation logic:
|
||||
- **autoApprove: true** (default for ln-340-story-quality-gate) → proceed automatically with no user input
|
||||
- **Manual run** → prompt user to type "confirm" after reviewing the preview
|
||||
|
||||
**Step 3:** Check for existing test task
|
||||
|
||||
Query Linear: `list_issues(parentId=Story.id, labels=["tests"])`
|
||||
|
||||
**Decision:**
|
||||
- **Count = 0** → **CREATE MODE** (Step 4a)
|
||||
- **Count ≥ 1** → **REPLAN MODE** (Step 4b)
|
||||
|
||||
**Step 4a: CREATE MODE** (if Count = 0)
|
||||
|
||||
Invoke ln-311-task-creator worker with taskType: "test"
|
||||
|
||||
**Pass to worker:**
|
||||
- taskType, teamId, storyData (Story.id, title, AC, Technical Notes, Context)
|
||||
- manualTestResults (parsed from Linear comment)
|
||||
- testPlan (e2eTests, integrationTests, unitTests, riskPriorityMatrix)
|
||||
- infrastructureChanges, documentationUpdates, legacyCleanup
|
||||
|
||||
**Worker returns:** Task URL + summary
|
||||
|
||||
**Step 4b: REPLAN MODE** (if Count ≥ 1)
|
||||
|
||||
Invoke ln-312-task-replanner worker with taskType: "test"
|
||||
|
||||
**Pass to worker:**
|
||||
- Same data as CREATE MODE + existingTaskIds
|
||||
|
||||
**Worker returns:** Operations summary + warnings
|
||||
|
||||
**Step 5:** Return summary to user
|
||||
- CREATE MODE: "Test task created. Linear URL: [...]"
|
||||
- REPLAN MODE: "Test task updated. Operations executed."
|
||||
|
||||
---
|
||||
|
||||
## Definition of Done
|
||||
|
||||
Before completing work, verify ALL checkpoints:
|
||||
|
||||
**✅ Manual Testing Results Parsed:**
|
||||
- [ ] Linear comment "## 🧪 Manual Testing Results" found and parsed successfully
|
||||
- [ ] Format Version 1.0 validated
|
||||
- [ ] All required sections extracted: AC, Test Results by AC, Edge Cases, Error Handling, Integration Testing
|
||||
- [ ] At least 1 AC marked as PASSED (cannot create test task if all AC failed)
|
||||
|
||||
**✅ Risk-Based Test Plan Generated:**
|
||||
- [ ] Risk Priority Matrix calculated for all scenarios (Business Impact × Probability)
|
||||
- [ ] E2E tests (2-5): Baseline 2 (positive/negative) + additional 0-3 with Priority ≥15 AND justification
|
||||
- [ ] Integration tests (0-8): ONLY if E2E doesn't cover AND Priority ≥15 AND justification provided
|
||||
- [ ] Unit tests (0-15): ONLY complex business logic with Priority ≥15 AND justification for each test
|
||||
- [ ] **Total tests: 2-7 realistic goal** (hard limit: 2-28) - auto-trimmed if exceeds 7
|
||||
- [ ] No test duplication: Each test adds unique business value
|
||||
- [ ] No framework/library testing: Each test validates OUR business logic only
|
||||
|
||||
**✅ Story Finalizer Task Description Complete (11 sections):**
|
||||
- [ ] Section 1 - Context: Story link, why final task needed
|
||||
- [ ] Section 2 - Risk Priority Matrix: All scenarios with calculated Priority (Impact × Probability)
|
||||
- [ ] Section 3 - E2E Tests (2-5 max): Baseline 2 + additional 0-3 with Priority ≥15 AND justification, based on ACTUAL manual testing
|
||||
- [ ] Section 4 - Integration Tests (0-8 max): ONLY if E2E doesn't cover AND Priority ≥15 AND justification
|
||||
- [ ] Section 5 - Unit Tests (0-15 max): ONLY complex business logic with Priority ≥15 AND justification for EACH test
|
||||
- [ ] Section 6 - Critical Path Coverage: What MUST be tested (Priority ≥15) vs what skipped (≤14)
|
||||
- [ ] Section 7 - Definition of Done: All tests pass, Priority ≥15 scenarios tested, **realistic goal 2-7 tests** (max 28), no flaky tests, each test beyond baseline 2 justified
|
||||
- [ ] Section 8 - Existing Tests to Fix/Update: Affected tests + reasons + required fixes
|
||||
- [ ] Section 9 - Infrastructure Changes: Packages, Docker, configs to update
|
||||
- [ ] Section 10 - Documentation Updates: tests/README, README, CHANGELOG, other docs
|
||||
- [ ] Section 11 - Legacy Code Cleanup: Workarounds, backward compat, deprecated patterns, dead code
|
||||
|
||||
**✅ Worker Delegation Executed:**
|
||||
- [ ] Checked for existing test task in Linear (labels=["tests"])
|
||||
- [ ] CREATE MODE (if count = 0): Delegated to ln-311-task-creator with taskType: "test"
|
||||
- [ ] REPLAN MODE (if count ≥ 1): Delegated to ln-312-task-replanner with taskType: "test"
|
||||
- [ ] All required data passed to worker:
|
||||
- taskType, teamId, storyData (AC, Technical Notes, Context)
|
||||
- manualTestResults (parsed from Linear comment)
|
||||
- testPlan (e2eTests, integrationTests, unitTests, riskPriorityMatrix)
|
||||
- infrastructureChanges, documentationUpdates, legacyCleanup
|
||||
|
||||
**✅ Worker Completed Successfully:**
|
||||
- [ ] ln-311-task-creator: Test task created in Linear + kanban_board.md updated
|
||||
- [ ] ln-312-task-replanner: Operations executed + kanban_board.md updated
|
||||
- [ ] Linear Issue URL returned from worker
|
||||
|
||||
**✅ Confirmation Handling:**
|
||||
- [ ] Preview displayed (automation logs still capture full plan)
|
||||
- [ ] Confirmation satisfied: autoApprove: true supplied or user typed "confirm" after manual review
|
||||
|
||||
**Output:**
|
||||
- **CREATE MODE:** Linear Issue URL + confirmation message ("Created test task for Story US00X")
|
||||
- **REPLAN MODE:** Operations summary + URLs ("Test task updated. X operations executed.")
|
||||
|
||||
---
|
||||
|
||||
## Example Usage
|
||||
|
||||
**Context:**
|
||||
- ln-340-story-quality-gate Pass 1 completed manual testing
|
||||
- Manual test results in Linear comment (3 AC PASSED, 2 edge cases discovered, error handling verified)
|
||||
|
||||
**Invocation (by ln-340-story-quality-gate via Skill tool):**
|
||||
```
|
||||
Skill(skill: "ln-350-story-test-planner", storyId: "US001")
|
||||
```
|
||||
|
||||
**Execution (NO questions):**
|
||||
1. Discovery → Team "API", Story: US001
|
||||
2. Load Manual Test Results → Parse Linear comment (3 scenarios PASSED, 2 edge cases, 1 error scenario)
|
||||
3. Analysis → Story + 5 Done implementation Tasks
|
||||
4. Risk-Based Test Planning with Minimum Viable Testing:
|
||||
- Calculate Priority for each scenario (Business Impact × Probability)
|
||||
- E2E: 2 baseline tests (positive + negative for main endpoint) + 1 additional (critical edge case with Priority 20)
|
||||
- Integration: 0 tests (2 baseline E2E cover full stack)
|
||||
- Unit: 2 tests (tax calculation + discount logic with Priority ≥15)
|
||||
- **Total: 5 tests (within realistic goal 2-7)**
|
||||
- Skipped: 3 scenarios with Priority ≤14 (manual testing sufficient; no auto-trim needed at 5 tests)
|
||||
5. Impact Analysis:
|
||||
- Existing Tests: 2 test files need updates (mock responses changed)
|
||||
- Infrastructure: Add Playwright for UI E2E tests
|
||||
- Documentation: Update tests/README.md, main README.md
|
||||
- Legacy Cleanup: Remove deprecated API v1 compatibility shim
|
||||
6. Generation → Complete story finalizer task (11 sections) with Risk Priority Matrix + justification for each test beyond baseline 2
|
||||
7. Confirmation / autoApprove → Creates final Task with parentId=US001, label "tests"
|
||||
|
||||
## Reference Files
|
||||
|
||||
### risk_based_testing_guide.md (Skill-Specific)
|
||||
|
||||
**Purpose**: Risk-Based Testing methodology for test task planning
|
||||
|
||||
**Contents**: Risk Priority Matrix (Business Impact × Probability), test limits (E2E 2-5, Integration 0-8, Unit 0-15), decision tree, anti-patterns, test selection examples
|
||||
|
||||
**Location**: [ln-350-story-test-planner/references/risk_based_testing_guide.md](references/risk_based_testing_guide.md)
|
||||
|
||||
**Ownership**: ln-350-story-test-planner (orchestrator-specific logic)
|
||||
|
||||
**Usage**: ln-350-story-test-planner uses this guide in Phase 4 (Risk-Based Test Planning)
|
||||
|
||||
### test_task_template.md (MOVED)
|
||||
|
||||
**Purpose**: Story finalizer task structure (11 sections: tests + fixes + infrastructure + docs + cleanup)
|
||||
|
||||
**Location**: Moved to [ln-311-task-creator/references/test_task_template.md](../ln-311-task-creator/references/test_task_template.md)
|
||||
|
||||
**Ownership**: ln-311-task-creator (universal factory owns all product templates)
|
||||
|
||||
**Rationale**: Templates moved to universal factory (ln-311-task-creator) which creates ALL 3 task types (implementation, refactoring, test). ln-311-task-creator owns all product templates.
|
||||
|
||||
**Usage**: Workers (ln-311-task-creator, ln-312-task-replanner) read this template when generating test task documents (via `taskType: "test"`)
|
||||
|
||||
### linear_integration.md (Shared Reference)
|
||||
|
||||
**Location**: [ln-210-epic-coordinator/references/linear_integration.md](../ln-210-epic-coordinator/references/linear_integration.md)
|
||||
|
||||
**Purpose**: Linear API reference and integration patterns
|
||||
|
||||
## Best Practices
|
||||
|
||||
**Minimum Viable Testing Philosophy:** Start with 2 E2E tests per endpoint (positive + negative). Add more tests ONLY with critical justification. **Realistic goal: 2-7 tests per Story** (not 10-28). Each test beyond baseline 2 MUST justify: "Why does this test OUR business logic (not framework/library/database)?"
|
||||
|
||||
**Risk-Based Testing:** Prioritize by Business Impact × Probability (not coverage metrics). Test limits: 2-5 E2E (baseline 2 + additional 0-3), 0-8 Integration (default 0), 0-15 Unit (default 0). E2E-first from ACTUAL manual testing results. Priority ≥15 scenarios covered by tests, Priority ≤14 covered by manual testing.
|
||||
|
||||
**Anti-Duplication:** Each test validates unique business value. If 2 baseline E2E cover it, SKIP unit test. Test OUR code only (not frameworks/libraries/database queries). Focus on complex business logic ONLY (financial calculations, security algorithms, complex business rules). MANDATORY SKIP: CRUD, getters/setters, trivial conditionals, framework code, library functions, database queries.
|
||||
|
||||
**Auto-Trim:** If test plan exceeds 7 tests → auto-trim to 7 by Priority. Keep 2 baseline E2E always, trim lowest Priority tests. Document trimmed scenarios: "Covered by manual testing".
|
||||
|
||||
---
|
||||
|
||||
**Version:** 7.2.0
|
||||
**Last Updated:** 2025-11-14
|
||||
skills/ln-350-story-test-planner/diagram.html (new file, 110 lines)
@@ -0,0 +1,110 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>ln-350-story-test-planner - State Diagram</title>
|
||||
<script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
|
||||
<link rel="stylesheet" href="../shared/css/diagram.css">
|
||||
</head>
|
||||
<body>
|
||||
<div class="container">
|
||||
<header>
|
||||
<h1>🎯 ln-350-story-test-planner</h1>
|
||||
<p class="subtitle">Test Task Planner - State Diagram</p>
|
||||
</header>
|
||||
<div class="info-box">
|
||||
<h3>📋 Overview</h3>
|
||||
<ul>
|
||||
<li><strong>Purpose:</strong> Create final Story task after manual testing passes (invoked by ln-340-story-quality-gate Pass 1)</li>
|
||||
<li><strong>Philosophy:</strong> Minimum Viable Testing - Start with 2 baseline E2E tests (positive + negative), add more ONLY with critical justification</li>
|
||||
<li><strong>Realistic Goal:</strong> 2-7 tests per Story (hard limit: 28)</li>
|
||||
<li><strong>Output:</strong> Comprehensive task with 11 sections: E2E-first Risk-Based Testing (Priority ≥15, tests OUR logic ONLY)</li>
|
||||
<li><strong>Risk Priority:</strong> Business Impact × Probability, Priority ≥15 MUST be tested</li>
|
||||
<li><strong>Critical Justification:</strong> Each test beyond baseline 2 requires documented answer: "Why does this test OUR business logic (not framework/library/database)?"</li>
|
||||
<li><strong>Delegation:</strong> Delegates task creation to ln-311-task-creator (CREATE mode) or ln-312-task-replanner (REPLAN mode) with taskType: "test"</li>
|
||||
</ul>
|
||||
</div>
|
||||
<div class="diagram-container">
|
||||
<div class="mermaid">
|
||||
graph TD
|
||||
Start([Start: Create Story Finalizer Task<br/>Invoked by ln-340-story-quality-gate Pass 1]) --> Phase1[Phase 1: Discovery<br/>Team ID + Parent Story]
|
||||
|
||||
Phase1 --> Phase2[Phase 2: Load Context<br/>Step 0-2 Combined]
|
||||
|
||||
subgraph Context [Phase 2 Steps]
|
||||
Step0[Step 0: Load Manual Test Results<br/>Parse Linear comment Format v1.0<br/>AC + Test Results + Edge Cases + Errors + Integration]
|
||||
Step0 --> Step1[Step 1: Analyze Story<br/>Load full Story description 8 sections]
|
||||
Step1 --> Step2[Step 2: Analyze Tasks<br/>Load all Done implementation tasks]
|
||||
end
|
||||
|
||||
Phase2 --> Step0
|
||||
Step2 --> Phase3
|
||||
|
||||
Phase3[Phase 3: Risk-Based Test Planning]
|
||||
|
||||
subgraph RiskPlanning [Minimum Viable Testing - Risk-Based]
|
||||
Risk1[Step 1: Risk Assessment<br/>Priority = Business Impact × Probability]
|
||||
Risk1 --> Risk2[Step 2: E2E Test Selection<br/>2 baseline ALWAYS + 0-3 additional Priority ≥15]
|
||||
Risk2 --> Risk2_5{Step 2.5: Critical Justification<br/>Tests OUR business logic?<br/>Not framework/library/database?}
|
||||
Risk2_5 -->|Pass| Risk3[Step 3: Unit Test Selection<br/>0-15 tests ONLY complex logic Priority ≥15]
|
||||
Risk2_5 -->|Fail| Risk2
|
||||
Risk3 --> Risk3_5{Critical Justification<br/>Tests OUR logic?}
|
||||
Risk3_5 -->|Pass| Risk4[Step 4: Integration Test Selection<br/>0-8 tests ONLY if E2E doesn't cover Priority ≥15]
|
||||
Risk3_5 -->|Fail| Risk3
|
||||
Risk4 --> Risk4_5{Critical Justification<br/>Tests OUR logic?}
|
||||
Risk4_5 -->|Pass| Risk5[Step 5: Validation<br/>2-7 realistic goal max 28 auto-trim]
|
||||
Risk4_5 -->|Fail| Risk4
|
||||
end
|
||||
|
||||
Phase3 --> Risk1
|
||||
Risk5 --> Phase4
|
||||
|
||||
Phase4[Phase 4: Impact Analysis]
|
||||
|
||||
subgraph Impact [5 Impact Areas]
|
||||
Impact1[Step 1: Existing Tests to Fix/Update]
|
||||
Impact2[Step 2: Infrastructure Changes<br/>package.json Docker configs]
|
||||
Impact2_5[Step 3: Configuration Management<br/>Environment variables secrets configs]
|
||||
Impact3[Step 4: Documentation Updates<br/>README tests/README CHANGELOG]
|
||||
Impact4[Step 5: Legacy Code Cleanup<br/>workarounds backward compat deprecated]
|
||||
end
|
||||
|
||||
Phase4 --> Impact1
|
||||
Impact1 --> Impact2
|
||||
Impact2 --> Impact2_5
|
||||
Impact2_5 --> Impact3
|
||||
Impact3 --> Impact4
|
||||
Impact4 --> Phase5
|
||||
|
||||
Phase5[Phase 5: Generate Complete Story Finalizer Task<br/>11 sections: Context Risk Matrix E2E Integration Unit<br/>Coverage DoD Existing Tests Infra Docs Cleanup]
|
||||
|
||||
Phase5 --> Confirm{User confirms?}
|
||||
Confirm -->|No| Phase5
|
||||
Confirm -->|Yes| CheckExisting{Check existing test task}
|
||||
CheckExisting -->|Exists| Replan[Phase 6: Delegate to ln-312-task-replanner<br/>REPLAN mode with taskType: test]
|
||||
CheckExisting -->|None| Create[Phase 6: Delegate to ln-311-task-creator<br/>CREATE mode with taskType: test]
|
||||
Replan --> End([End])
|
||||
Create --> End([End])
|
||||
|
||||
%% Styling
|
||||
classDef discovery fill:#E3F2FD,stroke:#1976D2,stroke-width:2px
|
||||
classDef analysis fill:#FFF9C4,stroke:#F57C00,stroke-width:2px
|
||||
classDef decision fill:#FFE0B2,stroke:#E64A19,stroke-width:2px
|
||||
classDef action fill:#C8E6C9,stroke:#388E3C,stroke-width:2px
|
||||
|
||||
class Phase1,Phase2,Step0,Step1,Step2 discovery
|
||||
class Phase3,Risk1,Risk2,Risk2_5,Risk3,Risk3_5,Risk4,Risk4_5,Risk5,Phase4,Impact1,Impact2,Impact2_5,Impact3,Impact4,Phase5 analysis
|
||||
class Confirm,CheckExisting decision
|
||||
class Create,Replan action
|
||||
</div>
|
||||
</div>
|
||||
<footer>
|
||||
<p>ln-350-story-test-planner v7.0.0 | Minimum Viable Testing | Mermaid.js</p>
|
||||
</footer>
|
||||
</div>
|
||||
<script>
|
||||
mermaid.initialize({ startOnLoad: true, theme: 'default', flowchart: { useMaxWidth: true, htmlLabels: true, curve: 'basis' } });
|
||||
</script>
|
||||
</body>
|
||||
</html>
@@ -0,0 +1,187 @@
|
||||
# Risk-Based Testing - Practical Examples
|
||||
|
||||
This file contains detailed examples of applying Minimum Viable Testing philosophy to real Stories.
|
||||
|
||||
**Purpose:** Learning and reference (not loaded during skill execution).
|
||||
|
||||
**When to use:** Study these examples to understand how to trim test plans from excessive coverage-driven testing to minimal risk-based testing.
|
||||
|
||||
---
|
||||
|
||||
## Example 1: User Login Story (Minimal Approach)
|
||||
|
||||
**Acceptance Criteria:**
|
||||
1. User can login with valid credentials → JWT token returned
|
||||
2. Invalid credentials rejected → 401 error
|
||||
3. Rate limiting after 5 failed attempts → 429 error
|
||||
|
||||
**Risk Assessment:**
|
||||
|
||||
| Scenario | Business Impact | Probability | Priority | Test Type |
|
||||
|----------|-----------------|-------------|----------|-----------|
|
||||
| Valid login works | 4 (core flow) | 3 (standard auth) | **12** | E2E (baseline) |
|
||||
| Invalid credentials rejected | 5 (security) | 3 | **15** | E2E (baseline) |
|
||||
| Rate limiting works | 5 (security, brute force) | 4 (concurrency) | **20** | SKIP - E2E negative covers auth error |
|
||||
| SQL injection attempt blocked | 5 (security breach) | 2 (Prisma escapes) | 10 | SKIP - framework behavior |
|
||||
| JWT token format valid | 4 (breaks API calls) | 2 (library tested) | 8 | SKIP - library behavior |
|
||||
| Password hashing uses bcrypt | 5 (security) | 1 (copy-paste code) | 5 | SKIP - library behavior |
|
||||
| Custom password strength rules | 5 (security policy) | 4 (complex regex) | **20** | Unit (OUR logic) |
|
||||
|
||||
**Test Plan (Minimum Viable Testing):**
|
||||
|
||||
**E2E Tests (2 baseline):**
|
||||
1. **Positive:** User enters valid email/password → 200 OK + JWT token → token works for protected API call
|
||||
2. **Negative:** User enters invalid password → 401 Unauthorized → clear error message shown
|
||||
|
||||
**Integration Tests (0):**
|
||||
- None needed - 2 baseline E2E tests cover full stack (endpoint → service → database)
|
||||
|
||||
**Unit Tests (1 - OUR business logic only):**
|
||||
1. `validatePasswordStrength()` - OUR custom regex (12+ chars, special symbols, numbers) with 5 edge cases
|
||||
|
||||
**Total: 3 tests (within realistic goal 2-7)**
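
As an illustration only, that single unit test might look roughly like this in Jest; the `validatePasswordStrength` module path and the 12-character / special-symbol / number rules come from this hypothetical scenario, not from a real codebase:

```javascript
// Hypothetical Jest sketch for OUR custom password policy (Priority 20).
const { validatePasswordStrength } = require('../src/auth/passwordPolicy'); // assumed module path

describe('validatePasswordStrength()', () => {
  test('accepts 12+ chars with a number and a special symbol', () => {
    expect(validatePasswordStrength('Str0ng!Passw0rd')).toBe(true);
  });

  test('rejects passwords shorter than 12 characters', () => {
    expect(validatePasswordStrength('Sh0rt!pw')).toBe(false);
  });

  test('rejects passwords without a special symbol', () => {
    expect(validatePasswordStrength('NoSymbolPass01')).toBe(false);
  });

  test('rejects passwords without a number', () => {
    expect(validatePasswordStrength('NoNumberPass!!')).toBe(false);
  });

  test('rejects empty input', () => {
    expect(validatePasswordStrength('')).toBe(false);
  });
});
```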
|
||||
|
||||
**What changed from 6 → 3 tests:**
|
||||
- ❌ E2E rate limiting test - REMOVED (Priority 20 but tests Redis library, not OUR logic)
|
||||
- ❌ Integration SQL injection test - REMOVED (testing Prisma escaping, not OUR code)
|
||||
- ❌ Integration rate limiter test - REMOVED (testing Redis counter, not OUR code)
|
||||
|
||||
**Why 3 tests sufficient:**
|
||||
- 2 baseline E2E cover all Acceptance Criteria (valid login + error handling)
|
||||
- 1 Unit test covers OUR custom password policy (not library behavior)
|
||||
- Rate limiting, SQL escaping, JWT generation = framework/library behavior (trust the library)
|
||||
|
||||
**Avoided tests (with rationale):**
|
||||
- ❌ Unit test `hashPassword()` - bcrypt library behavior, Priority 5
|
||||
- ❌ Unit test `generateJWT()` - jsonwebtoken library behavior, Priority 8
|
||||
- ❌ Unit test `validateEmail()` format - covered by E2E negative test
|
||||
- ❌ Integration test JWT token decoding - jsonwebtoken library behavior
|
||||
- ❌ Integration test rate limiting - Redis library behavior
|
||||
- ❌ Integration test SQL injection - Prisma library behavior
|
||||
|
||||
---
|
||||
|
||||
## Example 2: Product Search Story (Minimal Approach)
|
||||
|
||||
**Acceptance Criteria:**
|
||||
1. User can search products by name → results displayed
|
||||
2. User can filter by category → filtered results
|
||||
3. Empty search returns all products
|
||||
|
||||
**Risk Assessment:**
|
||||
|
||||
| Scenario | Business Impact | Probability | Priority | Test Type |
|
||||
|----------|-----------------|-------------|----------|-----------|
|
||||
| Search returns correct results | 4 (core feature) | 3 (SQL query) | **12** | E2E (baseline positive) |
|
||||
| Invalid search returns empty | 3 (UX feedback) | 3 | 9 | E2E (baseline negative) |
|
||||
| Category filter works | 3 (partial feature) | 3 | 9 | SKIP - covered by positive E2E |
|
||||
| Empty search shows all | 2 (minor UX) | 2 | 4 | SKIP - Priority too low |
|
||||
| Pagination works | 3 (UX issue if breaks) | 4 (off-by-one errors) | 12 | SKIP - UI pagination, not business logic |
|
||||
| Search handles special chars | 3 (breaks search) | 4 (SQL injection risk) | 12 | SKIP - Prisma/PostgreSQL behavior |
|
||||
| Results sorted by relevance | 2 (minor UX) | 3 | 6 | SKIP - Priority too low |
|
||||
| Unicode search | 3 (breaks for non-EN) | 4 | 12 | SKIP - database engine behavior |
|
||||
|
||||
**Test Plan (Minimum Viable Testing):**
|
||||
|
||||
**E2E Tests (2 baseline):**
|
||||
1. **Positive:** User types "laptop" in search → sees products with "laptop" in name/description
|
||||
2. **Negative:** User types "nonexistent999" → sees "No results found" message
|
||||
|
||||
**Integration Tests (0):**
|
||||
- None needed - special character escaping is Prisma/PostgreSQL behavior, not OUR logic
|
||||
|
||||
**Unit Tests (0):**
|
||||
- No complex business logic - simple database search query
|
||||
|
||||
**Total: 2 tests (minimum baseline)**
|
||||
|
||||
**What changed from 7 → 2 tests:**
|
||||
- ❌ E2E pagination test - REMOVED (UI pagination library, not OUR business logic)
|
||||
- ❌ Integration special chars test - REMOVED (Prisma query builder escaping, not OUR code)
|
||||
- ❌ Integration Unicode test - REMOVED (PostgreSQL LIKE operator, not OUR code)
|
||||
- ❌ Integration 1000-char string test - REMOVED (input validation middleware, not search logic)
|
||||
- ❌ Integration 500 error test - REMOVED (error handling middleware, not search logic)
|
||||
|
||||
**Why 2 tests sufficient:**
|
||||
- 2 baseline E2E cover both Acceptance Criteria (successful search + no results case)
|
||||
- No complex business logic to isolate - just database query (trust Prisma + PostgreSQL)
|
||||
- Pagination, special characters, Unicode, error handling = framework/library/database behavior
|
||||
|
||||
**Avoided tests (with rationale):**
|
||||
- ❌ E2E empty search - Priority 4 (manual testing sufficient)
|
||||
- ❌ E2E category filter - covered by baseline positive test (can search + filter simultaneously)
|
||||
- ❌ E2E pagination - testing UI pagination library, not OUR code
|
||||
- ❌ Unit test `buildSearchQuery()` - covered by E2E that executes query
|
||||
- ❌ Unit test sorting - Priority 6 (nice-to-have, not critical)
|
||||
- ❌ Integration test database `LIKE` query - testing PostgreSQL, not OUR code
|
||||
- ❌ Integration test special character escaping - testing Prisma, not OUR code
|
||||
|
||||
---
|
||||
|
||||
## Example 3: Payment Processing Story (Minimal Approach)
|
||||
|
||||
**Acceptance Criteria:**
|
||||
1. User can pay with credit card → order confirmed
|
||||
2. Failed payment shows error message
|
||||
3. Payment amount matches cart total
|
||||
|
||||
**Risk Assessment:**
|
||||
|
||||
| Scenario | Business Impact | Probability | Priority | Test Type |
|
||||
|----------|-----------------|-------------|----------|-----------|
|
||||
| Successful payment flow | 5 (money) | 3 (Stripe API) | **15** | E2E (baseline positive) |
|
||||
| Failed payment handled | 5 (money) | 4 (network issues) | **20** | E2E (baseline negative) |
|
||||
| Amount calculation correct | 5 (money) | 4 (complex math) | **20** | Unit (OUR calculation logic) |
|
||||
| Tax calculation by region | 5 (money) | 5 (complex rules) | **25** | Unit (OUR tax rules) |
|
||||
| Discount calculation | 5 (money) | 4 (business rules) | **20** | Unit (OUR discount logic) |
|
||||
| Currency conversion | 5 (money) | 5 (API + math) | **25** | SKIP - E2E covers, no complex OUR logic |
|
||||
| Refund processing | 5 (money) | 3 | **15** | SKIP - E2E positive covers payment flow |
|
||||
| Duplicate payment prevented | 5 (money) | 4 (race condition) | **20** | SKIP - Stripe API idempotency, not OUR code |
|
||||
| Transaction rollback on error | 5 (data corruption) | 4 (distributed transaction) | **20** | SKIP - database transaction manager, not OUR code |
|
||||
| Stripe API 500 error | 5 (money) | 3 | **15** | SKIP - E2E negative covers error handling |
|
||||
| Webhook processing | 5 (money) | 3 | **15** | SKIP - Stripe webhook mechanism, not complex OUR logic |
|
||||
|
||||
**Test Plan (Minimum Viable Testing):**
|
||||
|
||||
**E2E Tests (2 baseline):**
|
||||
1. **Positive:** User adds items to cart → proceeds to checkout → enters valid card → payment succeeds → order created in DB
|
||||
2. **Negative:** User enters invalid card → Stripe rejects → error message shown → order NOT created
|
||||
|
||||
**Integration Tests (0):**
|
||||
- None needed - currency conversion uses external API (trust API), transaction rollback is database behavior, Stripe idempotency is Stripe behavior
|
||||
|
||||
**Unit Tests (3 - OUR complex business logic only):**
|
||||
1. `calculateTotal()` - OUR calculation: items total + tax (by region) + shipping - discount → correct amount (5 edge cases)
|
||||
2. `calculateTax()` - OUR tax rules: different rates by country/state, special product categories (5 edge cases)
|
||||
3. `applyDiscount()` - OUR discount logic: percentage discount, fixed amount discount, minimum order threshold (5 edge cases)
|
||||
|
||||
**Total: 5 tests (within realistic goal 2-7)**
|
||||
|
||||
**What changed from 13 → 5 tests:**
|
||||
- ❌ E2E refund test - REMOVED (Stripe API refund mechanism, covered by positive E2E)
|
||||
- ❌ Integration Stripe 500 error test - REMOVED (covered by baseline negative E2E)
|
||||
- ❌ Integration duplicate payment test - REMOVED (Stripe idempotency keys, not OUR code)
|
||||
- ❌ Integration currency conversion test - REMOVED (external API behavior, not complex OUR logic)
|
||||
- ❌ Integration transaction rollback test - REMOVED (database transaction manager, not OUR code)
|
||||
- ❌ Integration webhook test - REMOVED (Stripe webhook mechanism, not complex OUR logic)
|
||||
- ❌ Unit test `convertCurrency()` - REMOVED (external API call, no complex OUR calculation)
|
||||
- ❌ Unit test shipping calculation - MERGED into `calculateTotal()` (part of same calculation)
|
||||
|
||||
**Why 5 tests sufficient:**
|
||||
- 2 baseline E2E cover all Acceptance Criteria (successful payment + failed payment)
|
||||
- 3 Unit tests cover OUR complex financial calculations (money = Priority 25)
|
||||
- Currency conversion, transaction rollback, Stripe idempotency, webhooks = external services/framework behavior (trust them)
|
||||
|
||||
**Avoided tests (with rationale):**
|
||||
- ❌ Integration test currency conversion - external API behavior, not OUR math
|
||||
- ❌ Integration test transaction rollback - database transaction manager behavior
|
||||
- ❌ Integration test Stripe idempotency - Stripe API feature, not OUR code
|
||||
- ❌ Integration test Stripe 500 error - covered by baseline E2E negative test
|
||||
- ❌ Integration test webhook - Stripe mechanism, not complex OUR logic
|
||||
- ❌ E2E refund test - Stripe API refund, not different from payment flow
|
||||
- ❌ Unit test free shipping threshold - part of `calculateTotal()` unit test
|
||||
|
||||
---
|
||||
|
||||
**Version:** 1.0.0
|
||||
**Last Updated:** 2025-11-14
skills/ln-350-story-test-planner/references/risk_based_testing_guide.md (new file, 492 lines)
@@ -0,0 +1,492 @@
|
||||
# Risk-Based Testing Guide
|
||||
|
||||
## Purpose
|
||||
|
||||
This guide replaces the traditional Test Pyramid (70/20/10 ratio) with a **Value-Based Testing Framework** that prioritizes business risk and practical test limits. The goal is to write tests that matter, not to chase coverage metrics.
|
||||
|
||||
**Problem solved:** Traditional Test Pyramid approach generates excessive tests (~200 per Story) by mechanically testing every conditional branch. This creates maintenance burden without proportional business value.
|
||||
|
||||
**Solution:** Risk-Based Testing with clear prioritization criteria and enforced limits (realistic goal 2-7 tests, hard maximum 28 per Story).
|
||||
|
||||
## Core Philosophy
|
||||
|
||||
### Guiding Principle

> "Write tests. Not too many. Mostly integration." — Guillermo Rauch
|
||||
|
||||
### Key Insights
|
||||
|
||||
1. **Test business value, not code coverage** - 80% coverage means nothing if critical payment flow isn't tested
|
||||
2. **Manual testing has value** - Not every scenario needs automated test duplication
|
||||
3. **Each test has maintenance cost** - More tests = more refactoring overhead
|
||||
4. **Integration tests catch real bugs** - Unit tests catch edge cases in isolation
|
||||
5. **E2E tests validate user value** - Only E2E proves the feature actually works end-to-end
|
||||
|
||||
## Minimum Viable Testing Philosophy
|
||||
|
||||
### Start Minimal, Justify Additions
|
||||
|
||||
**Baseline for every Story:**
|
||||
- **2 E2E tests** per endpoint: Positive scenario (happy path) + Negative scenario (critical error)
|
||||
- **0 Integration tests** (E2E covers full stack by default)
|
||||
- **0 Unit tests** (E2E covers simple logic by default)
|
||||
|
||||
**Realistic goal: 2-7 tests per Story** (not 10-28!)
|
||||
|
||||
**Additional tests ONLY with critical justification:**
|
||||
- Test #3 and beyond: Each requires documented answer to "Why does this test OUR business logic (not framework/library/database)?"
|
||||
- Priority ≥15 required for all additional tests
|
||||
- Auto-trim to 7 tests if plan exceeds realistic goal
|
||||
|
||||
### Critical Justification Questions
|
||||
|
||||
Before adding ANY test beyond 2 baseline E2E, answer:
|
||||
|
||||
1. ❓ **Does this test OUR business logic?**
|
||||
- ✅ YES: Tax calculation with country-specific rules (OUR algorithm)
|
||||
- ❌ NO: bcrypt hashing (library behavior)
|
||||
- ❌ NO: Prisma query execution (framework behavior)
|
||||
- ❌ NO: PostgreSQL LIKE operator (database behavior)
|
||||
|
||||
2. ❓ **Is this already covered by 2 baseline E2E tests?**
|
||||
- ✅ NO: E2E doesn't exercise all branches of complex calculation
|
||||
- ❌ YES: E2E test validates full flow end-to-end
|
||||
|
||||
3. ❓ **Priority ≥15?**
|
||||
- ✅ YES: Money, security, data integrity
|
||||
- ❌ NO: Skip, manual testing sufficient
|
||||
|
||||
4. ❓ **Unique business value?**
|
||||
- ✅ YES: Tests different scenario than existing tests
|
||||
- ❌ NO: Duplicate coverage
|
||||
|
||||
**If ANY answer is ❌ NO → SKIP this test**
|
||||
|
||||
## Risk Priority Matrix
|
||||
|
||||
### Calculation Formula
|
||||
|
||||
```
|
||||
Priority = Business Impact (1-5) × Probability of Failure (1-5)
|
||||
```
|
||||
|
||||
**Result ranges:**
|
||||
- **Priority ≥15 (15-25):** MUST test - critical scenarios
|
||||
- **Priority 9-14:** SHOULD test if not already covered
|
||||
- **Priority ≤8 (1-8):** SKIP - manual testing sufficient
|
||||
|
||||
### Business Impact Scoring (1-5)
|
||||
|
||||
| Score | Impact Level | Examples |
|
||||
|-------|--------------|----------|
|
||||
| **5** | **Critical** | Money loss, security breach, data corruption, legal liability |
|
||||
| **4** | **High** | Core business flow breaks (cannot complete purchase, cannot login) |
|
||||
| **3** | **Medium** | Feature partially broken (search works but pagination fails) |
|
||||
| **2** | **Low** | Minor UX issue (button disabled state wrong, tooltip missing) |
|
||||
| **1** | **Trivial** | Cosmetic bug (color slightly off, spacing issue) |
|
||||
|
||||
### Probability of Failure Scoring (1-5)
|
||||
|
||||
| Score | Probability | Indicators |
|
||||
|-------|-------------|------------|
|
||||
| **5** | **Very High (>50%)** | Complex algorithm, external API, new technology, no existing tests |
|
||||
| **4** | **High (25-50%)** | Multiple dependencies, concurrency, state management |
|
||||
| **3** | **Medium (10-25%)** | Standard CRUD, framework defaults, well-tested patterns |
|
||||
| **2** | **Low (5-10%)** | Simple logic, established library, copy-paste from working code |
|
||||
| **1** | **Very Low (<5%)** | Trivial assignment, framework-generated code |
|
||||
|
||||
### Priority Matrix Table
|
||||
|
||||
| | Probability 1 | Probability 2 | Probability 3 | Probability 4 | Probability 5 |
|
||||
|---|---------------|---------------|---------------|---------------|---------------|
|
||||
| **Impact 5** | 5 (SKIP) | 10 (SHOULD) | **15 (MUST)** | **20 (MUST)** | **25 (MUST)** |
|
||||
| **Impact 4** | 4 (SKIP) | 8 (SKIP) | 12 (SHOULD) | **16 (MUST)** | **20 (MUST)** |
|
||||
| **Impact 3** | 3 (SKIP) | 6 (SKIP) | 9 (SHOULD) | 12 (SHOULD) | **15 (MUST)** |
|
||||
| **Impact 2** | 2 (SKIP) | 4 (SKIP) | 6 (SKIP) | 8 (SKIP) | 10 (SHOULD) |
|
||||
| **Impact 1** | 1 (SKIP) | 2 (SKIP) | 3 (SKIP) | 4 (SKIP) | 5 (SKIP) |
|
||||
|
||||
## Test Type Decision Tree
|
||||
|
||||
### Step 1: Calculate Risk Priority
|
||||
|
||||
Use Risk Priority Matrix above.
|
||||
|
||||
### Step 2: Select Test Type
|
||||
|
||||
```
|
||||
IF Priority ≥15 → Proceed to Step 3
|
||||
ELSE IF Priority 9-14 → Check Anti-Duplication (Step 4), then Step 3
|
||||
ELSE Priority ≤8 → SKIP (manual testing sufficient)
|
||||
```
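
The same gate as a tiny function (a sketch; the Step 4 anti-duplication result is passed in as a boolean):

```javascript
// Hypothetical sketch of Step 2: route a scenario by its Risk Priority.
function testDecision(priority, alreadyCovered) {
  if (priority >= 15) return 'CHOOSE_TEST_LEVEL';            // Step 3
  if (priority >= 9) {
    return alreadyCovered ? 'SKIP' : 'CHOOSE_TEST_LEVEL';    // Step 4, then Step 3
  }
  return 'SKIP';                                             // manual testing sufficient
}
```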
|
||||
|
||||
### Step 3: Choose Test Level
|
||||
|
||||
**E2E Test (2-5 max per Story):**
|
||||
- **BASELINE (ALWAYS): 2 E2E tests per endpoint**
|
||||
- Test 1: Positive scenario (happy path validating main AC)
|
||||
- Test 2: Negative scenario (critical error handling)
|
||||
- **ADDITIONAL (3-5): ONLY if Priority ≥15 AND justified**
|
||||
- Critical edge case from manual testing
|
||||
- Second endpoint (if Story implements multiple endpoints)
|
||||
- **Examples:**
|
||||
- User registers → receives email → confirms → can login
|
||||
- User adds product → proceeds to checkout → pays → sees confirmation
|
||||
- User uploads file → sees progress → file appears in list
|
||||
|
||||
**Integration Test (0-8 max per Story):**
|
||||
- **DEFAULT: 0 Integration tests** (2 E2E tests cover full stack by default)
|
||||
- **ADD ONLY if:** E2E doesn't cover interaction completely AND Priority ≥15 AND justified
|
||||
- **Examples:**
|
||||
- Transaction rollback on error (E2E tests happy path only)
|
||||
- Concurrent request handling (E2E tests single request)
|
||||
- External API error scenarios (500, timeout) with Priority ≥15
|
||||
- **MANDATORY SKIP:**
|
||||
- ❌ Simple pass-through calls (E2E already validates end-to-end)
|
||||
- ❌ Testing framework integrations (Prisma client, TypeORM repository, Express app)
|
||||
- ❌ Testing database query execution (database engine behavior)
|
||||
|
||||
**Unit Test (0-15 max per Story):**
|
||||
- **DEFAULT: 0 Unit tests** (2 E2E tests cover simple logic by default)
|
||||
- **ADD ONLY for complex business logic with Priority ≥15:**
|
||||
- Financial calculations (tax, discount, currency conversion) **WITH COMPLEX RULES**
|
||||
- Security algorithms (password strength, permission matrix) **WITH CUSTOM LOGIC**
|
||||
- Complex business algorithms (scoring, matching, ranking) **WITH MULTIPLE FACTORS**
|
||||
- **MANDATORY SKIP - DO NOT create unit tests for:**
|
||||
- ❌ Simple CRUD operations (already covered by E2E)
|
||||
- ❌ Framework code (Express middleware, React hooks, FastAPI dependencies)
|
||||
- ❌ Library functions (bcrypt hashing, jsonwebtoken signing, axios requests)
|
||||
- ❌ Database queries (Prisma findMany, TypeORM query builder, SQL joins)
|
||||
- ❌ Getters/setters or simple property access
|
||||
- ❌ Trivial conditionals (`if (user) return user.name`, `status === 'active'`)
|
||||
- ❌ Pass-through functions (wrappers without logic)
|
||||
- ❌ Performance/load testing (benchmarks, stress tests, scalability validation)
|
||||
|
||||
### Step 4: Anti-Duplication Check
|
||||
|
||||
Before writing ANY test, verify:
|
||||
|
||||
1. **Is this scenario already covered by E2E?**
|
||||
- E2E tests payment flow → SKIP unit test for `calculateTotal()`
|
||||
- E2E tests login → SKIP unit test for `validateEmail()`
|
||||
|
||||
2. **Is this testing framework code?**
|
||||
- Testing Express `app.use()` → SKIP
|
||||
- Testing React `useState` → SKIP
|
||||
- Testing Prisma `findMany()` → SKIP
|
||||
|
||||
3. **Does this add unique business value?**
|
||||
- E2E tests happy path → Unit test for edge case (negative price) → KEEP
|
||||
- Integration test already validates DB transaction → SKIP duplicate unit test
|
||||
|
||||
4. **Is this a one-line function?**
|
||||
- `getFullName() { return firstName + lastName }` → SKIP (E2E covers it)
|
||||
|
||||
## Test Limits Per Story
|
||||
|
||||
### Enforced Limits with Realistic Goals
|
||||
|
||||
| Test Type | Minimum | Realistic Goal | Maximum | Purpose |
|
||||
|-----------|---------|----------------|---------|---------|
|
||||
| **E2E** | 2 | 2 | 5 | Baseline: positive + negative per endpoint |
|
||||
| **Integration** | 0 | 0-2 | 8 | ONLY if E2E doesn't cover interaction |
|
||||
| **Unit** | 0 | 0-3 | 15 | ONLY complex business logic (financial/security/algorithms) |
|
||||
| **TOTAL** | 2 | **2-7** | 28 | Start minimal, add only with justification |
|
||||
|
||||
**Key Change:** Test limits are now CEILINGS (maximum allowed), NOT targets to fill. Start with 2 E2E tests, add more only with critical justification.
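
A sketch of how the ceilings could be checked mechanically (the per-type counts are assumed to be tallied from the generated plan; names are illustrative):

```javascript
// Hypothetical sketch: verify a plan respects the ceilings (maxima, not targets).
const LIMITS = {
  e2e: { min: 2, max: 5 },
  integration: { min: 0, max: 8 },
  unit: { min: 0, max: 15 },
};

function checkCeilings(plan) {
  // plan = { e2e: n, integration: n, unit: n }
  const problems = [];
  for (const [type, { min, max }] of Object.entries(LIMITS)) {
    const count = plan[type] || 0;
    if (count < min) problems.push(`${type}: ${count} below minimum ${min}`);
    if (count > max) problems.push(`${type}: ${count} exceeds ceiling ${max}`);
  }
  const total = (plan.e2e || 0) + (plan.integration || 0) + (plan.unit || 0);
  if (total > 28) problems.push(`total ${total} exceeds hard limit 28`);
  else if (total > 7) problems.push(`total ${total} exceeds realistic goal 7 - consider auto-trim`);
  return problems;
}

// checkCeilings({ e2e: 2, integration: 0, unit: 3 }) → []
```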
|
||||
|
||||
### Rationale for Limits
|
||||
|
||||
**Why maximum 5 E2E?**
|
||||
- E2E tests are slow (10-60 seconds each)
|
||||
- Each Story typically has 2-4 Acceptance Criteria
|
||||
- 1-2 E2E per AC is sufficient
|
||||
- Edge cases covered by Integration/Unit tests
|
||||
|
||||
**Why maximum 8 Integration?**
|
||||
- Integration tests validate layer interactions
|
||||
- Typical Story has 3-5 integration points (API → Service → DB)
|
||||
- 1-2 tests per integration point + error scenarios
|
||||
|
||||
**Why maximum 15 Unit?**
|
||||
- Only test complex business logic
|
||||
- Typical Story has 2-4 complex functions
|
||||
- 3-5 tests per function (happy path + edge cases)
|
||||
|
||||
**Why total maximum 28?**
|
||||
- In practice, Stories with more than ~30 tests rarely prevent proportionally more bugs
- Maintenance cost grows steeply beyond this point
|
||||
- Focus on quality over quantity
|
||||
|
||||
## Common Over-Testing Anti-Patterns
|
||||
|
||||
### Anti-Pattern 1: "Every if/else needs a test"
|
||||
|
||||
**Bad:**
|
||||
```javascript
|
||||
// Function with 10 if/else branches
|
||||
function processOrder(order) {
|
||||
if (!order) return null; // Test 1
|
||||
if (!order.items) return null; // Test 2
|
||||
if (order.items.length === 0) return null; // Test 3
|
||||
// ... 7 more conditionals
|
||||
}
|
||||
```
|
||||
**Problem:** 10 unit tests for trivial validation logic already covered by E2E test that calls `processOrder()`.
|
||||
|
||||
**Good:**
|
||||
- 1 E2E test: User submits valid order → success
|
||||
- 1 E2E test: User submits invalid order → error message
|
||||
- 1 Unit test: Complex tax calculation inside `processOrder()` (if exists)
|
||||
|
||||
**Total: 3 tests instead of 12**
|
||||
|
||||
### Anti-Pattern 2: "Testing framework code"
|
||||
|
||||
**Bad:**
|
||||
```javascript
|
||||
// Testing Express middleware
|
||||
test('CORS middleware sets headers', () => {
|
||||
// Testing Express, not OUR code
|
||||
});
|
||||
|
||||
// Testing React hook
|
||||
test('useState updates component', () => {
|
||||
// Testing React, not OUR code
|
||||
});
|
||||
```
|
||||
|
||||
**Good:**
|
||||
- Trust framework tests (Express/React have thousands of tests)
|
||||
- Test OUR business logic that USES framework
|
||||
|
||||
### Anti-Pattern 3: "Duplicating E2E coverage with Unit tests"
|
||||
|
||||
**Bad:**
|
||||
```javascript
|
||||
// E2E already tests: POST /api/orders → creates order in DB
|
||||
test('E2E: User can create order', ...); // E2E
|
||||
test('Unit: createOrder() inserts to database', ...); // Duplicate!
|
||||
test('Unit: createOrder() returns order object', ...); // Duplicate!
|
||||
```
|
||||
|
||||
**Good:**
|
||||
```javascript
|
||||
// E2E tests full flow
|
||||
test('E2E: User can create order', ...);
|
||||
|
||||
// Unit tests ONLY complex calculation NOT fully exercised by E2E
|
||||
test('Unit: Bulk discount applied when quantity > 100', ...);
|
||||
```
|
||||
|
||||
### Anti-Pattern 4: "Aiming for 80% coverage"
|
||||
|
||||
**Bad mindset:**
|
||||
- "We have 75% coverage, need 5 more tests to hit 80%"
|
||||
- Writes tests for trivial getters/setters to inflate coverage
|
||||
|
||||
**Good mindset:**
|
||||
- "Payment flow is critical (Priority 25) but only has 1 E2E test"
|
||||
- "We have 60% coverage but all critical paths tested - DONE"
|
||||
|
||||
### Anti-Pattern 5: "Testing framework integration"
|
||||
|
||||
**Bad:**
|
||||
```javascript
|
||||
// Testing Express framework behavior
|
||||
test('Express middleware chain works', () => {
|
||||
// Testing Express.js, not OUR code
|
||||
});
|
||||
|
||||
// Testing Prisma client behavior
|
||||
test('Prisma findMany returns array', () => {
|
||||
// Testing Prisma, not OUR code
|
||||
});
|
||||
|
||||
// Testing React hook behavior
|
||||
test('useState triggers rerender', () => {
|
||||
// Testing React, not OUR code
|
||||
});
|
||||
```
|
||||
|
||||
**Why bad:** Frameworks have thousands of tests. Trust the framework, test OUR business logic that USES the framework.
|
||||
|
||||
**Good:**
|
||||
```javascript
|
||||
// Test OUR business logic that uses framework
|
||||
test('E2E: User can create order', () => {
|
||||
// Tests OUR endpoint logic (which happens to use Express + Prisma)
|
||||
// But we're validating OUR business rules, not framework behavior
|
||||
});
|
||||
```
|
||||
|
||||
### Anti-Pattern 6: "Testing database query syntax"
|
||||
|
||||
**Bad:**
|
||||
```javascript
|
||||
// Testing database query execution
|
||||
test('findByEmail() returns user from database', () => {
|
||||
await prisma.user.findUnique({ where: { email: 'test@example.com' }});
|
||||
// Testing Prisma query builder, not OUR logic
|
||||
});
|
||||
|
||||
// Testing SQL JOIN behavior
|
||||
test('getUserWithOrders() joins tables correctly', () => {
|
||||
// Testing PostgreSQL JOIN semantics, not OUR logic
|
||||
});
|
||||
```
|
||||
|
||||
**Why bad:** Database engines have extensive test suites. We're testing PostgreSQL/MySQL, not our code.
|
||||
|
||||
**Good:**
|
||||
```javascript
|
||||
// E2E test already validates query works
|
||||
test('E2E: User can view order history', () => {
|
||||
// Implicitly validates that JOIN query works correctly
|
||||
// We test the USER OUTCOME, not the database mechanism
|
||||
});
|
||||
|
||||
// Unit test ONLY for complex query construction logic
|
||||
test('buildSearchQuery() with multiple filters generates correct WHERE clause', () => {
|
||||
// ONLY if we have complex query building logic with business rules
|
||||
// NOT testing database execution, testing OUR query builder logic
|
||||
});
|
||||
```
|
||||
|
||||
### Anti-Pattern 7: "Testing library behavior"
|
||||
|
||||
**Bad:**
|
||||
```javascript
|
||||
// Testing bcrypt library
|
||||
test('bcrypt hashes password correctly', () => {
|
||||
const hash = await bcrypt.hash('password', 10);
|
||||
const valid = await bcrypt.compare('password', hash);
|
||||
expect(valid).toBe(true);
|
||||
// Testing bcrypt library, not OUR code
|
||||
});
|
||||
|
||||
// Testing jsonwebtoken library
|
||||
test('JWT token is valid', () => {
|
||||
const token = jwt.sign({ userId: 1 }, SECRET);
|
||||
const decoded = jwt.verify(token, SECRET);
|
||||
// Testing jsonwebtoken library, not OUR code
|
||||
});
|
||||
|
||||
// Testing axios library
|
||||
test('axios makes HTTP request', () => {
|
||||
await axios.get('https://api.example.com');
|
||||
// Testing axios library, not OUR code
|
||||
});
|
||||
```
|
||||
|
||||
**Why bad:** Libraries are already tested by their maintainers. We're duplicating their test suite.
|
||||
|
||||
**Good:**
|
||||
```javascript
|
||||
// E2E test validates full authentication flow
|
||||
test('E2E: User can login and access protected endpoint', () => {
|
||||
// Implicitly validates that bcrypt comparison works
|
||||
// AND that JWT token generation/validation works
|
||||
// But we test the USER FLOW, not library internals
|
||||
});
|
||||
|
||||
// Unit test ONLY for custom password rules (OUR business logic)
|
||||
test('validatePasswordStrength() requires 12+ chars with special symbols', () => {
|
||||
// Testing OUR custom password policy, not bcrypt itself
|
||||
});
|
||||
```
|
||||
|
||||
## When to Break the Rules
|
||||
|
||||
### Scenario 1: Regulatory Compliance
|
||||
|
||||
**Financial/Healthcare applications:**
|
||||
- May need >28 tests for audit trail
|
||||
- Document WHY each test exists (regulation reference)
|
||||
|
||||
### Scenario 2: Bug-Prone Legacy Code
|
||||
|
||||
**If Story modifies legacy code with history of bugs:**
|
||||
- Increase Unit test limit to 20
|
||||
- Add characterization tests
|
||||
|
||||
### Scenario 3: Public API
|
||||
|
||||
**If Story creates API consumed by 3rd parties:**
|
||||
- Increase Integration test limit to 12
|
||||
- Test all error codes (400, 401, 403, 404, 429, 500)
|
||||
|
||||
### Scenario 4: Security-Critical Features
|
||||
|
||||
**Authentication, authorization, encryption:**
|
||||
- All scenarios Priority ≥15
|
||||
- May reach 28 test maximum legitimately
|
||||
|
||||
## Quick Reference
|
||||
|
||||
### Decision Flowchart (Minimum Viable Testing)
|
||||
|
||||
```
|
||||
1. Start with 2 baseline E2E tests (positive + negative) - ALWAYS
|
||||
↓
|
||||
2. For test #3 and beyond, calculate Risk Priority (Impact × Probability)
|
||||
↓
|
||||
3. Priority ≥15?
|
||||
NO (≤14) → SKIP (manual testing sufficient)
|
||||
YES → Proceed to Step 4
|
||||
↓
|
||||
4. Critical Justification Check (ALL must be YES):
|
||||
❓ Tests OUR business logic? (not framework/library/database)
|
||||
❓ Not already covered by 2 baseline E2E?
|
||||
❓ Unique business value?
|
||||
ANY NO? → SKIP
|
||||
ALL YES? → Proceed to Step 5
|
||||
↓
|
||||
5. Select Test Type:
|
||||
- User flow? → E2E #3-5 (with justification)
|
||||
- E2E doesn't cover interaction? → Integration 0-8 (with justification)
|
||||
- Complex OUR algorithm? → Unit 0-15 (with justification)
|
||||
↓
|
||||
6. Verify total ≤7 (realistic goal) or ≤28 (hard limit)
|
||||
> 7 tests? → Auto-trim by Priority, keep 2 baseline E2E + top 5 Priority
|
||||
```
|
||||
|
||||
### Red Flags (Stop and Reconsider)
|
||||
|
||||
❌ **"I need to test every branch for coverage"** → Focus on business risk, not coverage
|
||||
❌ **"This E2E already tests it, but I'll add unit test anyway"** → Duplication
|
||||
❌ **"Need to test Express middleware behavior"** → Testing framework, not OUR code
|
||||
❌ **"Need to test Prisma query execution"** → Testing database/ORM, not OUR code
|
||||
❌ **"Need to test bcrypt hashing"** → Testing library, not OUR code
|
||||
❌ **"Story has 45 tests"** → Exceeds limit, prioritize and trim
|
||||
❌ **"Story has 15 tests but includes Prisma/bcrypt/Express tests"** → Testing framework/library, remove
|
||||
❌ **"Testing getter/setter"** → Trivial code, E2E covers it
|
||||
❌ **"Need more tests to hit 10 minimum"** → Minimum is 2, not 10!
|
||||
|
||||
### Green Lights (Good Test)
|
||||
|
||||
✅ **"2 E2E tests: positive + negative for main endpoint"** → Baseline (ALWAYS)
|
||||
✅ **"Tax calculation with country-specific rules, Priority 25"** → Unit test (OUR complex logic)
|
||||
✅ **"User must complete checkout, Priority 20"** → E2E test (user value)
|
||||
✅ **"Story has 3 tests: 2 E2E + 1 Unit for OUR tax logic"** → Minimum viable!
|
||||
✅ **"Story has 5 tests, all test OUR business logic, all Priority ≥15"** → Justified and minimal
|
||||
✅ **"Skipped 8 scenarios - all were framework/library behavior"** → Good filtering!
|
||||
|
||||
## References
|
||||
|
||||
- Kent Beck, "Test Desiderata" (2018)
|
||||
- Martin Fowler, "Practical Test Pyramid" (2018)
|
||||
- Kent C. Dodds, "The Testing Trophy" (2020)
|
||||
- Google Testing Blog, "Code Coverage Best Practices" (2020)
|
||||
- Netflix Tech Blog, "Testing Strategy at Scale" (2021)
|
||||
- Michael Feathers, "Working Effectively with Legacy Code" (2004)
|
||||
- OWASP Testing Guide v4.2 (2023)
|
||||
|
||||
## Version History
|
||||
|
||||
| Version | Date | Changes |
|
||||
|---------|------|---------|
|
||||
| 1.0 | 2025-10-31 | Initial Risk-Based Testing framework to replace Test Pyramid (10-28 tests per Story) |
|
||||
| 2.0.0 | 2025-11-11 | Minimum Viable Testing philosophy: Start with 2 E2E baseline, realistic goal 2-7 tests. Critical justification required for each test beyond baseline. New anti-patterns (5-7) for framework/library/database testing. Updated examples (Login 6→3, Search 7→2, Payment 13→5) |
|
||||
|
||||
**Version:** 2.0.0
|
||||
**Last Updated:** 2025-11-11
|
||||