---
name: regression-testing
description: Evaluates and implements regression tests after bug fixes based on severity, code complexity, and coverage. Use when bugs are fixed to prevent future regressions.
---

# Regression Testing Skill

**Purpose**: Automatically evaluate and implement regression tests after bug fixes to prevent future regressions.

**When to Trigger**: This skill activates after bug fixes are implemented, allowing Claude (the orchestrator) to decide if regression tests would be valuable based on context.

---

## Decision Criteria (Orchestrator Evaluation)

Before implementing regression tests, evaluate these factors:

### High Value Scenarios (Implement Regression Tests)

- **Critical Bugs**: Security, data loss, or production-impacting issues
- **Subtle Bugs**: Edge cases, race conditions, timing issues that are easy to miss
- **Complex Logic**: Multi-step workflows, state machines, intricate business rules
- **Low Coverage Areas**: Bug occurred in under-tested code (<70% coverage)
- **Recurring Patterns**: Similar bugs fixed before in related code
- **Integration Points**: Bugs at module/service boundaries

### Lower Value Scenarios (Skip or Defer)

- **Trivial Fixes**: Typos, obvious logic errors with existing tests
- **Already Well-Tested**: Bug area has >90% coverage with comprehensive tests
- **One-Time Anomalies**: Environmental issues, config errors (not code bugs)
- **Rapid Prototyping**: Early-stage features expected to change significantly
- **UI-Only Changes**: Purely cosmetic fixes with no logic impact

---

## Regression Test Strategy

### 1. Bug Analysis Phase

**Understand the Bug:**

```markdown
## Bug Context
- **What broke**: [Symptom/error]
- **Root cause**: [Why it happened]
- **Fix applied**: [What changed]
- **Failure scenario**: [Steps to reproduce original bug]
```

**Evaluate Test Value:**

```python
def should_add_regression_test(bug_context: dict) -> tuple[bool, str]:
    """
    Decide if regression test is valuable.

    Returns:
        (add_test: bool, reason: str)
    """
    severity = bug_context.get("severity")      # critical, high, medium, low
    complexity = bug_context.get("complexity")  # high, medium, low
    coverage = bug_context.get("coverage_pct", 0)

    # Critical bugs always get regression tests
    if severity == "critical":
        return True, "Critical bug requires regression test"

    # Complex bugs with low coverage
    if complexity == "high" and coverage < 70:
        return True, "Complex logic with insufficient coverage"

    # Already well-tested
    if coverage > 90:
        return False, "Area already has comprehensive tests"

    # Default: add test for medium+ severity
    if severity in {"high", "medium"}:
        return True, f"Bug severity {severity} warrants regression test"

    return False, "Low-value regression test, skipping"
```

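
A quick usage sketch of the helper above; the `bug_context` values are illustrative, not a fixed schema:

```python
# Hypothetical bug context assembled by the orchestrator after a fix
bug_context = {
    "severity": "high",       # production crash
    "complexity": "medium",
    "coverage_pct": 65,       # coverage of the affected module
}

add_test, reason = should_add_regression_test(bug_context)
print(add_test, "-", reason)  # True - Bug severity high warrants regression test
```
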
### 2. Regression Test Implementation

**Test Structure:**

```python
# test_<module>_regression.py

import pytest
from concurrent.futures import ThreadPoolExecutor, wait

# Payment, User, PaymentProcessor, and ThreadSafeCache stand in for the project's own code.


class TestRegressions:
    """Regression tests for fixed bugs."""

    def test_regression_issue_123_null_pointer_in_payment(self):
        """
        Regression test for GitHub issue #123.

        Bug: NullPointerException when processing payment with missing user email.
        Fixed: 2025-10-30
        Root cause: Missing null check in payment processor

        This test ensures the fix remains in place and prevents regression.
        """
        # Arrange: Set up the scenario that caused the original bug
        payment = Payment(amount=100.0, user=User(email=None))
        processor = PaymentProcessor()

        # Act: Execute the previously failing code path
        result = processor.process(payment)

        # Assert: Verify fix works (no exception, proper error handling)
        assert result.status == "failed"
        assert "invalid user email" in result.error_message.lower()

    def test_regression_pr_456_race_condition_in_cache(self):
        """
        Regression test for PR #456.

        Bug: Race condition in cache invalidation caused stale reads
        Fixed: 2025-10-30
        Root cause: Non-atomic read-modify-write operation

        This test simulates concurrent cache access to verify thread safety.
        """
        # Arrange: Set up the concurrent scenario
        cache = ThreadSafeCache()
        cache.set("key", "value1")

        # Act: Simulate the race condition with threads
        with ThreadPoolExecutor(max_workers=10) as executor:
            futures = [
                executor.submit(cache.update, "key", f"value{i}")
                for i in range(100)
            ]
            wait(futures)

        # Assert: Verify no stale reads or corruption
        final_value = cache.get("key")
        assert final_value.startswith("value")
        assert cache.consistency_check()  # Internal consistency
```

**Test Naming Convention:**
- `test_regression_<issue_id>_<short_description>`
- Include the issue/PR number for traceability
- Add a short description of what broke

**Test Documentation:**
- **Bug description**: What failed
- **Date fixed**: When the fix was applied
- **Root cause**: Why it happened
- **Test purpose**: What regression is prevented

### 3. Regression Test Coverage

**What to Test:**
1. **Exact Failure Scenario**: Reproduce the original bug conditions
2. **Edge Cases Around Fix**: Test boundaries near the bug
3. **Integration Impact**: Test how the fix affects dependent code
4. **Performance**: If the bug was performance-related, add a benchmark (see the sketch after this list)

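
A minimal sketch of such a performance guard, assuming a hypothetical `build_report` function and a deliberately generous time budget; prefer a dedicated tool such as `pytest-benchmark` if the project already uses one:

```python
import time


def test_regression_issue_xxx_report_generation_slowdown():
    """
    Regression guard for a performance fix (issue number illustrative).

    Bug: report generation regressed from ~0.2s to >5s on large inputs.
    The threshold below is loose on purpose to avoid flaky CI failures.
    """
    # Arrange: input size that exposed the original slowdown
    rows = [{"id": i, "value": i * 2} for i in range(50_000)]

    # Act: time the previously slow code path (build_report is a hypothetical project function)
    start = time.perf_counter()
    report = build_report(rows)
    elapsed = time.perf_counter() - start

    # Assert: stays well under the regressed timing, with headroom for CI noise
    assert report is not None
    assert elapsed < 2.0, f"Report generation took {elapsed:.2f}s (possible regression)"
```
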
**What NOT to Test:**
- Don't duplicate existing unit tests
- Don't test obvious behavior already covered
- Don't over-specify implementation details (brittle tests; see the contrast sketch below)

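
An illustration of the last point, using a hypothetical `parse_config` helper: the brittle version pins internal representation, while the robust version asserts only the behavior the bug fix actually guarantees:

```python
# Brittle: couples the test to internal state and exact structure
def test_regression_brittle_example():
    cfg = parse_config("timeout=5")  # hypothetical helper
    assert cfg.__dict__ == {"timeout": 5, "_raw": "timeout=5"}  # breaks on any refactor


# Robust: asserts the observable behavior the fix guarantees
def test_regression_issue_xxx_timeout_parsed_as_int():
    cfg = parse_config("timeout=5")
    assert cfg.timeout == 5  # the actual regression being guarded
```
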
---

## Workflow Integration

### Standard Bug Fix Flow

```bash
# 1. Fix the bug
/lazy code "fix: null pointer in payment processor"

# ✓ Bug fixed and committed

# 2. Regression testing skill evaluates
# (Automatic trigger after bug fix commit)

# Decision: Add regression test?
#   - Severity: HIGH (production crash)
#   - Coverage: 65% (medium)
#   - Complexity: MEDIUM
#   → YES, add regression test

# 3. Implement regression test
# ✓ test_regression_issue_123_null_pointer_in_payment() added
# ✓ Coverage increased to 78%
# ✓ Test passes (bug is fixed)

# 4. Commit regression test
git add tests/test_payment_regression.py
git commit -m "test: add regression test for issue #123 null pointer"
```

### Quick Bug Fix (Skip Regression)

```bash
# 1. Fix trivial bug
/lazy code "fix: typo in error message"

# ✓ Bug fixed

# 2. Regression testing skill evaluates
# Decision: Add regression test?
#   - Severity: LOW (cosmetic)
#   - Coverage: 95% (excellent)
#   - Complexity: LOW (trivial)
#   → NO, skip regression test (low value, already well-tested)

# 3. Commit fix only
# No additional test needed
```

---

## Regression Test Suite Management

### Organization

```
tests/
├── test_module.py              # Regular unit tests
├── test_module_integration.py  # Integration tests
└── test_module_regression.py   # Regression tests (this skill)
```

**Keep regression tests separate** so you can:
- Track historical bug fixes
- Quickly identify which tests prevent regressions
- Run them as a separate CI job for faster feedback (a pytest-marker alternative is sketched below)

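
If filename-based separation does not fit the project layout, a pytest marker works as well. A minimal sketch; the marker name and registration location are a suggestion, not part of this skill:

```python
import pytest

# Register the marker once in pytest.ini (or [tool.pytest.ini_options] in pyproject.toml):
#   markers =
#       regression: tests that guard previously fixed bugs
# Then run only the regression suite with:  pytest -m regression -v


@pytest.mark.regression
def test_regression_issue_123_null_pointer_in_payment():
    ...  # same test as above, just tagged instead of living in a separate file
```
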
### CI/CD Integration

```yaml
# .github/workflows/ci.yml
# Minimal job sketch; adjust the Python version and install step to your project.

jobs:
  regression-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies
        run: pip install -r requirements.txt pytest
      - name: Run regression test suite
        run: pytest tests/*_regression.py -v --tb=short

# Fast feedback: run regression tests first.
# If they fail, a regression has likely been reintroduced.
```

### Regression Test Metrics

**Track Over Time:**
- Total regression test count (a counting sketch follows this list)
- Bug recurrence rate (goal: 0%)
- Coverage increase from regression tests
- Time to detect a regression (it should be caught in CI, not production)

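
A small sketch for the first metric, assuming the `tests/*_regression.py` and `test_regression_*` conventions described above (both are conventions of this skill, not hard requirements):

```python
import re
from pathlib import Path


def count_regression_tests(tests_dir: str = "tests") -> int:
    """Count test functions in *_regression.py files (a rough headcount, not coverage)."""
    pattern = re.compile(r"^\s*def (test_regression_\w+)\(", re.MULTILINE)
    total = 0
    for path in Path(tests_dir).glob("**/*_regression.py"):
        total += len(pattern.findall(path.read_text(encoding="utf-8")))
    return total


if __name__ == "__main__":
    print(f"Regression tests: {count_regression_tests()}")
```
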
---

## Examples

### Example 1: Critical Bug (Add Regression Test)

**Bug**: Authentication bypass when session token is malformed
**Fix**: Added token validation
**Decision**: ✅ **Add regression test** (security critical)

```python
def test_regression_issue_789_auth_bypass_malformed_token():
    """
    Regression test for security issue #789.

    Bug: Malformed session tokens bypassed authentication
    Fixed: 2025-10-30
    Severity: CRITICAL (security)
    Root cause: Missing token format validation
    """
    # Arrange: Malformed token that bypassed auth
    malformed_token = "invalid||format||token"

    # Act: Attempt authentication
    result = AuthService.validate_token(malformed_token)

    # Assert: Should reject malformed token
    assert result.is_valid is False
    assert result.error == "invalid_token_format"
```

### Example 2: Complex Bug (Add Regression Test)

**Bug**: Race condition in distributed lock causes duplicate job execution
**Fix**: Atomic compare-and-swap operation
**Decision**: ✅ **Add regression test** (complex concurrency issue)

```python
from concurrent.futures import ThreadPoolExecutor


def test_regression_pr_234_race_condition_duplicate_jobs():
    """
    Regression test for PR #234.

    Bug: Race condition allowed duplicate job execution
    Fixed: 2025-10-30
    Complexity: HIGH (concurrency)
    Root cause: Non-atomic lock acquisition
    """
    # Arrange: Simulate concurrent job submissions
    job_queue = DistributedJobQueue()
    job_id = "test-job-123"

    # Act: 100 threads try to acquire the same job
    with ThreadPoolExecutor(max_workers=100) as executor:
        futures = [
            executor.submit(job_queue.try_acquire_job, job_id)
            for _ in range(100)
        ]
        results = [f.result() for f in futures]

    # Assert: Only ONE thread should acquire the job
    acquired = [r for r in results if r.acquired]
    assert len(acquired) == 1, "Race condition: multiple threads acquired same job"
```

### Example 3: Trivial Bug (Skip Regression Test)

**Bug**: Typo in log message "Usre authenticated" → "User authenticated"
**Fix**: Corrected spelling
**Decision**: ❌ **Skip regression test** (cosmetic, no logic impact)

```
No test needed. Fix is obvious and has no functional impact.
Existing tests already cover authentication logic.
```

### Example 4: Well-Tested Area (Skip Regression Test)

**Bug**: Off-by-one error in pagination (page 1 showed 0 results)
**Fix**: Changed `offset = page * size` to `offset = (page - 1) * size`
**Coverage**: 95% (pagination thoroughly tested)
**Decision**: ❌ **Skip regression test** (area already has comprehensive tests)

```python
# Existing test already covers this:
def test_pagination_first_page_shows_results():
    results = api.get_users(page=1, size=10)
    assert len(results) == 10  # This test would have caught the bug
```

---

## Best Practices

### DO:
✅ Add regression tests for **critical and complex bugs**
✅ Include **issue/PR number** in test name for traceability
✅ Document **what broke, why, and when** in test docstring
✅ Test the **exact failure scenario** that caused the bug
✅ Keep regression tests **separate** from unit tests (easier tracking)
✅ Run regression tests in **CI/CD** for early detection

### DON'T:
❌ Add regression tests for **trivial or cosmetic bugs**
❌ Duplicate **existing comprehensive tests**
❌ Write **brittle tests** that pin implementation details
❌ Skip **root cause analysis** (understand why it broke)
❌ Forget to **verify the test fails** without the fix (it should reproduce the original bug; see the sketch below)

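
One way to check that the new test really reproduces the bug, assuming the fix is the most recent commit and touches `src/payment/` (a hypothetical path; adapt the revision and paths to your layout):

```bash
git checkout HEAD~1 -- src/payment/         # temporarily restore the pre-fix code
pytest tests/test_payment_regression.py -x  # expected: the regression test FAILS here
git checkout HEAD -- src/payment/           # put the fix back
pytest tests/test_payment_regression.py -x  # expected: the regression test PASSES
```
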
---

## Output Format

When this skill triggers, provide:

````markdown
## Regression Test Evaluation

**Bug Fixed**: [Brief description]
**Issue/PR**: #[number]
**Severity**: [critical/high/medium/low]
**Complexity**: [high/medium/low]
**Current Coverage**: [X%]

**Decision**: [✅ Add Regression Test | ❌ Skip Regression Test]

**Reason**: [Why regression test is/isn't valuable]

---

[If adding test]
## Regression Test Implementation

**File**: `tests/test_<module>_regression.py`

```python
def test_regression_<issue>_<description>():
    """
    [Docstring with bug context]
    """
    # Test implementation
```

**Coverage Impact**: +X% (before: Y%, after: Z%)
````

---

## Integration with Other Skills

- **Works with**: `test-driven-development` (adds tests post-fix)
- **Complements**: `code-review-request` (reviewer checks for regression tests)
- **Used by**: `/lazy fix` command (auto-evaluates regression test need)

---

## Configuration

**Environment Variables:**

```bash
# Force regression tests for all bugs (strict mode)
export LAZYDEV_FORCE_REGRESSION_TESTS=1

# Disable regression test skill
export LAZYDEV_DISABLE_REGRESSION_SKILL=1

# Minimum coverage threshold to skip regression test (default: 90)
export LAZYDEV_REGRESSION_SKIP_COVERAGE_THRESHOLD=90
```

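
A sketch of how these variables could be read when evaluating a fix; the wiring into `should_add_regression_test` is illustrative, not part of the skill's contract:

```python
import os


def regression_config() -> dict:
    """Read the LAZYDEV_* environment variables with their documented defaults."""
    return {
        "force": os.getenv("LAZYDEV_FORCE_REGRESSION_TESTS") == "1",
        "disabled": os.getenv("LAZYDEV_DISABLE_REGRESSION_SKILL") == "1",
        "skip_coverage_threshold": int(
            os.getenv("LAZYDEV_REGRESSION_SKIP_COVERAGE_THRESHOLD", "90")
        ),
    }


# Illustrative use:
#   cfg = regression_config()
#   if cfg["disabled"]:  skip the evaluation entirely
#   elif cfg["force"]:   always add a regression test
#   else:                fall back to should_add_regression_test(bug_context)
```
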
---

**Version**: 1.0.0
**Created**: 2025-10-30
**Anthropic Best Practice**: Model-invoked, autonomous trigger after bug fixes