---
name: test-isolation-fundamentals
description: Use when tests fail together but pass alone, diagnosing test pollution, ensuring test independence and idempotence, managing shared state, or designing parallel-safe tests - provides isolation principles, database/file/service patterns, and cleanup strategies
---

# Test Isolation Fundamentals

## Overview

**Core principle:** Each test must work independently, regardless of execution order or parallel execution.

**Rule:** If a test fails when run with other tests but passes alone, you have an isolation problem. Fix it before adding more tests.

## When You Have Isolation Problems

**Symptoms:**
- Tests pass individually: `pytest test_checkout.py` ✓
- Tests fail in full suite: `pytest` ✗
- Errors like "User already exists", "Expected empty but found data"
- Tests fail randomly or only in CI
- Different results when tests run in different orders

**Root cause:** Tests share mutable state without cleanup.

## The Five Principles

### 1. Order-Independence

**Tests must pass regardless of execution order.**

```bash
# All of these must produce identical results
pytest tests/                 # default (alphabetical) order
pytest tests/ --random-order  # random order (requires pytest-random-order)
pytest tests/ --reverse       # reverse order (requires pytest-reverse)
```

**Anti-pattern:**
```python
# ❌ BAD: Test B depends on Test A running first
def test_create_user():
    db.users.insert({"id": 1, "name": "Alice"})

def test_update_user():
    db.users.update({"id": 1}, {"name": "Bob"})  # Assumes Alice exists!
```

**Fix:** Each test creates its own data.
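A corrected version, sketched with a minimal in-memory stand-in for the `db.users` collection above (the real suite would use its actual database fixture):

```python
# Minimal in-memory stand-in for the `db.users` collection above
# (assumption: the real suite would use its actual database fixture).
class Users:
    def __init__(self):
        self.rows = {}

    def insert(self, row):
        self.rows[row["id"]] = dict(row)

    def update(self, match, fields):
        self.rows[match["id"]].update(fields)

    def get(self, user_id):
        return self.rows[user_id]

# ✅ GOOD: the test creates its own data instead of assuming another test ran
def test_update_user():
    db = Users()                           # fresh state, owned by this test
    db.insert({"id": 1, "name": "Alice"})  # its own setup
    db.update({"id": 1}, {"name": "Bob"})
    assert db.get(1)["name"] == "Bob"
```

Because the test owns both its setup and its state, it passes in any order, any number of times.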

---

### 2. Idempotence

**Running a test twice produces the same result both times.**

```bash
# Both runs must pass
pytest test_checkout.py  # First run
pytest test_checkout.py  # Second run (same result)
```

**Anti-pattern:**
```python
# ❌ BAD: Second run fails on unique constraint
def test_signup():
    user = create_user(email="test@example.com")
    assert user.id is not None
    # No cleanup - second run fails: "email already exists"
```

**Fix:** Clean up data after the test OR use unique data per run.

---

### 3. Fresh State

**Each test starts with a clean slate.**

**What needs to be fresh:**
- Database records
- Files and directories
- In-memory caches
- Global variables
- Module-level state
- Environment variables
- Network sockets/ports
- Background processes

**Anti-pattern:**
```python
# ❌ BAD: Shared mutable global state
cache = {}  # Module-level global

def test_cache_miss():
    assert get_from_cache("key1") is None  # Passes first time
    cache["key1"] = "value"  # Pollutes global state

def test_cache_lookup():
    assert get_from_cache("key1") is None  # Fails if previous test ran!
```
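The structural fix is dependency injection: pass the cache in rather than sharing a module-level global. A sketch:

```python
# ✅ GOOD: the cache is injected, so each test owns a fresh one
def get_from_cache(cache, key):
    return cache.get(key)

def test_cache_miss():
    cache = {}                                    # fresh, owned by this test
    assert get_from_cache(cache, "key1") is None
    cache["key1"] = "value"                       # pollutes nothing shared

def test_cache_lookup():
    cache = {}                                    # unaffected by other tests
    assert get_from_cache(cache, "key1") is None  # passes in any order
```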

---

### 4. Explicit Scope

**Know what state is shared vs isolated.**

**Test scopes (pytest):**
- `scope="function"` - Fresh per test (default, safest)
- `scope="class"` - Shared across test class
- `scope="module"` - Shared across file
- `scope="session"` - Shared across entire test run

**Rule:** Default to `scope="function"`. Only use broader scopes for expensive resources that are READ-ONLY.

```python
# ✅ GOOD: Expensive read-only data can be shared
@pytest.fixture(scope="session")
def large_config_file():
    return load_config("data.json")  # Expensive, never modified

# ❌ BAD: Mutable data shared across tests
@pytest.fixture(scope="session")
def database():
    return Database()  # Tests will pollute each other!

# ✅ GOOD: Mutable data fresh per test
@pytest.fixture(scope="function")
def database():
    db = Database()
    yield db
    db.cleanup()  # Fresh per test
```

---

### 5. Parallel Safety

**Tests must work when run concurrently.**

```bash
pytest -n 4  # Run tests across 4 parallel workers (requires pytest-xdist)
```

**Parallel-unsafe patterns:**
- Shared files without unique names
- Fixed network ports
- Singleton databases
- Global module state
- Fixed temp directories

**Fix:** Use unique identifiers per test (UUIDs, process IDs, random ports).
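A small helper along these lines (a sketch; the name is illustrative) covers files, database names, and queues alike:

```python
import os
import uuid

def unique_name(prefix="test"):
    """Name that is unique per call and per worker process.

    The pid distinguishes parallel workers (e.g. under pytest-xdist);
    the uuid guards against collisions within one worker.
    """
    return f"{prefix}-{os.getpid()}-{uuid.uuid4().hex[:8]}"
```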

---

## Isolation Patterns by Resource Type

### Database Isolation

**Pattern 1: Transactions with Rollback (Fastest, Recommended)**

```python
import pytest
from sqlalchemy.orm import Session

@pytest.fixture
def db_session(db_engine):
    """Each test gets a fresh DB session that rolls back automatically."""
    connection = db_engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)

    yield session

    transaction.rollback()  # Undo all changes
    connection.close()
```

**Why it works:**
- No cleanup code needed - rollback is automatic
- Fast (<1ms per test)
- Works with ANY database (PostgreSQL, MySQL, SQLite, Oracle)
- Handles FK relationships automatically

**When NOT to use:**
- Testing actual commits
- Testing transaction isolation levels
- Multi-database transactions

---

**Pattern 2: Unique Data Per Test**

```python
import uuid
import pytest

@pytest.fixture
def unique_user():
    """Each test gets a unique user."""
    email = f"test-{uuid.uuid4()}@example.com"
    user = create_user(email=email, name="Test User")

    yield user

    # Optional cleanup (or rely on test DB being dropped)
    delete_user(user.id)
```

**Why it works:**
- Tests don't interfere (different users)
- Can run in parallel
- Idempotent (UUID ensures uniqueness)

**When to use:**
- Testing with real databases
- Parallel test execution
- Integration tests that need real commits

---

**Pattern 3: Test Database Per Test**

```python
import uuid
import pytest

@pytest.fixture
def isolated_db():
    """Each test gets its own temporary database."""
    db_name = f"test_db_{uuid.uuid4().hex}"
    create_database(db_name)

    yield get_connection(db_name)

    drop_database(db_name)
```

**Why it works:**
- Complete isolation
- Can test schema migrations
- No cross-test pollution

**When NOT to use:**
- Unit tests (too slow)
- Large test suites (overhead adds up)

---

### File System Isolation

**Pattern: Temporary Directories**

```python
import pytest
import tempfile
import shutil

@pytest.fixture
def temp_workspace():
    """Each test gets a fresh temporary directory."""
    tmpdir = tempfile.mkdtemp(prefix="test_")

    yield tmpdir

    shutil.rmtree(tmpdir)  # Clean up
```

**Parallel-safe version:**

```python
@pytest.fixture
def temp_workspace(tmp_path):
    """pytest's tmp_path is automatically unique per test."""
    workspace = tmp_path / "workspace"
    workspace.mkdir()

    yield workspace

    # No cleanup needed - pytest handles it
```

**Why it works:**
- Each test writes to a different directory
- Parallel-safe (unique paths)
- Automatic cleanup

---

### Service/API Isolation

**Pattern: Mocking External Services**

```python
import pytest
from unittest.mock import patch, MagicMock

@pytest.fixture
def mock_stripe():
    """Mock the Stripe API for all tests."""
    with patch('stripe.Charge.create') as mock:
        mock.return_value = MagicMock(id="ch_test123", status="succeeded")
        yield mock
```

**When to use:**
- External APIs (Stripe, Twilio, SendGrid)
- Slow services
- Non-deterministic responses
- Services that cost money per call

**When NOT to use:**
- Testing integration with the real service (use a separate integration test suite)

---

### In-Memory Cache Isolation

**Pattern: Clear Cache Before Each Test**

```python
import pytest

@pytest.fixture(autouse=True)
def clear_cache():
    """Automatically clear the cache before each test."""
    cache.clear()
    yield
    # Optional: clear after the test too
    cache.clear()
```

**Why `autouse=True`:** Runs automatically for every test without explicit declaration.

---

### Process/Port Isolation

**Pattern: Dynamic Port Allocation**

```python
import socket
import pytest

def get_free_port():
    """Find an available port."""
    sock = socket.socket()
    sock.bind(('', 0))
    port = sock.getsockname()[1]
    sock.close()
    return port

@pytest.fixture
def test_server():
    """Each test gets a server on a unique port."""
    port = get_free_port()
    server = start_server(port=port)

    yield f"http://localhost:{port}"

    server.stop()
```

**Why it works:**
- Tests can run in parallel (different ports)
- No port conflicts

(Caveat: there is a small window between closing the probe socket and the server binding in which another process could grab the port; if your server can bind to port 0 itself and report back the port it chose, that is more robust.)

---

## Test Doubles: When to Use What

| Type | Purpose | Example |
|------|---------|---------|
| **Stub** | Returns hardcoded values | `getUser() → {id: 1, name: "Alice"}` |
| **Mock** | Verifies calls were made | `assert emailService.send.called` |
| **Fake** | Working implementation, simplified | In-memory database instead of PostgreSQL |
| **Spy** | Records calls for later inspection | Logs all method calls |

**Decision tree:**

```
Do you need to verify the call was made?
  YES → Use Mock
  NO → Do you need a working implementation?
    YES → Use Fake
    NO → Use Stub
```
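The three main doubles in miniature, using `unittest.mock` (the service names are illustrative):

```python
from unittest.mock import MagicMock

# Stub: returns hardcoded values; we only care about the data it hands back
get_user = MagicMock(return_value={"id": 1, "name": "Alice"})
assert get_user(1)["name"] == "Alice"

# Mock: verifies the call was made with the right arguments
email_service = MagicMock()
email_service.send("alice@example.com", "Welcome!")
email_service.send.assert_called_once_with("alice@example.com", "Welcome!")

# Fake: a working but simplified implementation (in-memory "database")
class FakeUserStore:
    def __init__(self):
        self._rows = {}

    def save(self, user):
        self._rows[user["id"]] = user

    def get(self, user_id):
        return self._rows.get(user_id)

store = FakeUserStore()
store.save({"id": 1, "name": "Alice"})
assert store.get(1)["name"] == "Alice"
assert store.get(2) is None
```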

---

## Diagnosing Isolation Problems

### Step 1: Identify Flaky Tests

```bash
# Run tests 100 times to find flakiness (requires pytest-repeat)
pytest --count=100 test_checkout.py

# Run in random order (requires pytest-random-order)
pytest --random-order
```

**Interpretation:**
- Passes 100/100 → Not flaky
- Passes 95/100 → Flaky (5% failure rate)
- Failures are random → Parallel-unsafe OR order-dependent

---

### Step 2: Find Which Tests Interfere

**Run tests in isolation:**

```bash
# Test A alone
pytest test_a.py  # ✓ Passes

# Test B alone
pytest test_b.py  # ✓ Passes

# Both together
pytest test_a.py test_b.py  # ✗ Test B fails

# Conclusion: Test A pollutes state that Test B depends on
```

**Reverse the order:**

```bash
pytest test_b.py test_a.py  # Does Test A fail now?
```

- If YES: Bidirectional pollution
- If NO: Test A pollutes, Test B is the victim

---

### Step 3: Identify Shared State

**Add diagnostic logging:**

```python
import pytest

@pytest.fixture(autouse=True)
def log_state():
    """Log state before/after each test."""
    print(f"Before: DB has {db.count()} records")
    yield
    print(f"After: DB has {db.count()} records")
```

**Look for:**
- Record count increasing over time (no cleanup)
- Files accumulating
- Cache growing
- Ports in use

---

### Step 4: Audit for Global State

**Search the codebase for isolation violations:**

```bash
# Module-level globals
grep -r "^[A-Z_]* = " app/

# Global caches
grep -r "cache = " app/

# Singletons
grep -r "@singleton" app/
grep -r "class.*Singleton" app/
```
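A complementary, more precise check is to parse each module with the standard-library `ast` module and list top-level assignments (a sketch; the function name is illustrative):

```python
import ast

def module_level_assignments(source):
    """Report names assigned at module level -- candidates for shared state."""
    tree = ast.parse(source)
    names = []
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.append(target.id)
    return names

code = "cache = {}\nRETRIES = 3\ndef f():\n    local = 1\n"
assert module_level_assignments(code) == ["cache", "RETRIES"]
```

Unlike grep, this ignores assignments inside functions, so it only flags true module-level state.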

---

## Anti-Patterns Catalog

### ❌ Cleanup Code Instead of Structural Isolation

**Symptom:** Every test has teardown code to clean up

```python
def test_checkout():
    user = create_user()
    cart = create_cart(user)

    checkout(cart)

    # Teardown
    delete_cart(cart.id)
    delete_user(user.id)
```

**Why bad:**
- If the test fails before cleanup, state pollutes
- If the cleanup has bugs, state pollutes
- Forces sequential execution (no parallelism)

**Fix:** Use transactions, unique IDs, or dependency injection
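Dependency injection is the most structural of the three fixes: the test passes in its own collaborators, so there is nothing shared to tear down. A sketch (the `InMemoryOrders`/`checkout` names are illustrative, not from the source):

```python
class InMemoryOrders:
    """Fake order store; each test builds a fresh one, so no teardown."""
    def __init__(self):
        self.placed = []

    def place(self, cart):
        self.placed.append(cart)

def checkout(cart, orders):
    # the business logic depends on an injected store, not a global one
    orders.place(cart)
    return len(orders.placed)

def test_checkout():
    orders = InMemoryOrders()  # state owned by this test
    assert checkout({"items": ["book"]}, orders) == 1
```

No `delete_cart`/`delete_user` calls are needed; the state dies with the test.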

---

### ❌ Shared Test Fixtures

**Symptom:** Fixtures modify mutable state

```python
@pytest.fixture(scope="module")
def user():
    return create_user(email="test@example.com", name="Test User")

def test_update_name(user):
    user.name = "Alice"  # Modifies shared fixture!
    save(user)

def test_update_email(user):
    # Expects the original name, but the previous test changed it!
    assert user.name == "Test User"  # FAILS
```

**Why bad:** Tests interfere when the fixture is modified

**Fix:** Use `scope="function"` for mutable fixtures

---

### ❌ Hidden Dependencies on Execution Order

**Symptom:** Test suite has an implicit execution order

```python
# test_a.py
def test_create_admin():
    create_user(email="admin@example.com", role="admin")

# test_b.py
def test_admin_permissions():
    admin = get_user("admin@example.com")  # Assumes test_a ran!
    assert admin.has_permission("delete_users")
```

**Why bad:** Breaks when tests run in a different order or in parallel

**Fix:** Each test creates its own dependencies

---

### ❌ Testing on Production-Like State

**Symptom:** Tests run against a shared database with existing data

```python
def test_user_count():
    assert db.users.count() == 100  # Assumes specific state!
```

**Why bad:**
- Tests fail when data changes
- Can't run in parallel
- Can't run idempotently

**Fix:** Use an isolated test database or count relative to the test's own data

---

## Common Scenarios

### Scenario 1: "Tests pass locally, fail in CI"

**Likely causes:**
1. **Timing issues** - CI is slower/faster, race conditions exposed
2. **Parallel execution** - CI runs tests in parallel, local doesn't
3. **Missing cleanup** - Local has leftover state, CI is fresh

**Diagnosis:**
```bash
# Test parallel execution locally (requires pytest-xdist)
pytest -n 4

# Test with clean state
rm -rf .pytest_cache && pytest
```

---

### Scenario 2: "Random test failures that disappear on retry"

**Likely causes:**
1. **Race conditions** - Async operations not awaited
2. **Shared mutable state** - Global variables polluted
3. **External service flakiness** - Real APIs being called

**Diagnosis:**
```bash
# Run the same test 100 times (requires pytest-repeat)
pytest --count=100 test_flaky.py

# If the failure rate is consistent (e.g., 5/100), it's likely shared state
# If the failure rate varies wildly, it's likely a race condition
```

---

### Scenario 3: "Database unique constraint violations"

**Symptom:** `IntegrityError: duplicate key value violates unique constraint`

**Cause:** Tests reuse the same email/username/ID

**Fix:**
```python
import uuid
import pytest

@pytest.fixture
def unique_user():
    email = f"test-{uuid.uuid4()}@example.com"
    return create_user(email=email)
```

---

## Quick Reference: Isolation Strategy Decision Tree

```
What resource needs isolation?

DATABASE
├─ Can you use transactions? → Transaction Rollback (fastest)
├─ Need real commits? → Unique Data Per Test
└─ Need schema changes? → Test Database Per Test

FILES
├─ Few files? → pytest's tmp_path
└─ Complex directories? → tempfile.mkdtemp()

EXTERNAL SERVICES
├─ Testing integration? → Separate integration test suite
└─ Testing business logic? → Mock the service

IN-MEMORY STATE
├─ Caches → Clear before each test (autouse fixture)
├─ Globals → Dependency injection (refactor)
└─ Module-level → Reset in fixture or avoid entirely

PROCESSES/PORTS
└─ Dynamic port allocation per test
```

---

## Bottom Line

**Test isolation is structural, not reactive.**

- ❌ **Reactive:** Write cleanup code after each test
- ✅ **Structural:** Design tests so cleanup isn't needed

**The hierarchy:**
1. **Best:** Dependency injection (no shared state)
2. **Good:** Transactions/tmp_path (automatic cleanup)
3. **Acceptable:** Unique data per test (explicit isolation)
4. **Last resort:** Manual cleanup (fragile, error-prone)

**If your tests fail together but pass alone, you have an isolation problem. Stop adding tests and fix isolation first.**