---
name: test-isolation-fundamentals
description: Use when tests fail together but pass alone, diagnosing test pollution, ensuring test independence and idempotence, managing shared state, or designing parallel-safe tests - provides isolation principles, database/file/service patterns, and cleanup strategies
---

# Test Isolation Fundamentals

## Overview

**Core principle:** Each test must work independently, regardless of execution order or parallel execution.

**Rule:** If a test fails when run with other tests but passes alone, you have an isolation problem. Fix it before adding more tests.

## When You Have Isolation Problems

**Symptoms:**

- Tests pass individually: `pytest test_checkout.py` ✓
- Tests fail in the full suite: `pytest` ✗
- Errors like "User already exists", "Expected empty but found data"
- Tests fail randomly or only in CI
- Different results when tests run in different orders

**Root cause:** Tests share mutable state without cleanup.

## The Five Principles

### 1. Order-Independence

**Tests must pass regardless of execution order.**

```bash
# All of these must produce identical results
pytest tests/                  # default collection order
pytest tests/ --random-order   # random order (pytest-random-order plugin)
pytest tests/ --reverse        # reverse order (pytest-reverse plugin)
```

**Anti-pattern:**

```python
# ❌ BAD: Test B depends on Test A running first
def test_create_user():
    db.users.insert({"id": 1, "name": "Alice"})

def test_update_user():
    db.users.update({"id": 1}, {"name": "Bob"})  # Assumes Alice exists!
```

**Fix:** Each test creates its own data.

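The fix can be sketched with a hypothetical in-memory store standing in for the real `db` (names are illustrative, not from any real library):

```python
# ✅ GOOD: each test arranges its own data, so order doesn't matter.
# FakeUsers is a hypothetical in-memory stand-in for db.users.
class FakeUsers:
    def __init__(self):
        self.rows = {}

    def insert(self, row):
        self.rows[row["id"]] = row

    def update(self, match, fields):
        self.rows[match["id"]].update(fields)

def test_create_user():
    users = FakeUsers()  # fresh state owned by this test
    users.insert({"id": 1, "name": "Alice"})
    assert users.rows[1]["name"] == "Alice"

def test_update_user():
    users = FakeUsers()
    users.insert({"id": 1, "name": "Alice"})  # creates its own Alice
    users.update({"id": 1}, {"name": "Bob"})
    assert users.rows[1]["name"] == "Bob"
```

Either test can now run first, alone, or in parallel, because neither reaches for state the other created.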

---

### 2. Idempotence

**Running a test twice produces the same result both times.**

```bash
# Both runs must pass
pytest test_checkout.py  # First run
pytest test_checkout.py  # Second run (same result)
```

**Anti-pattern:**

```python
# ❌ BAD: Second run fails on a unique constraint
def test_signup():
    user = create_user(email="test@example.com")
    assert user.id is not None
    # No cleanup - second run fails: "email already exists"
```

**Fix:** Clean up data after the test OR use unique data per run.


---

### 3. Fresh State

**Each test starts with a clean slate.**

**What needs to be fresh:**

- Database records
- Files and directories
- In-memory caches
- Global variables
- Module-level state
- Environment variables
- Network sockets/ports
- Background processes

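For environment variables specifically, pytest's built-in `monkeypatch` fixture sets a value for one test and restores the original afterwards (the variable name below is illustrative):

```python
import os

def test_feature_flag(monkeypatch):
    # Set only for the duration of this test; pytest undoes the change
    monkeypatch.setenv("FEATURE_FLAG", "on")
    assert os.environ["FEATURE_FLAG"] == "on"
```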

**Anti-pattern:**

```python
# ❌ BAD: Shared mutable global state
cache = {}  # Module-level global

def get_from_cache(key):
    return cache.get(key)

def test_cache_miss():
    assert get_from_cache("key1") is None  # Passes first time
    cache["key1"] = "value"  # Pollutes global state

def test_cache_lookup():
    assert get_from_cache("key1") is None  # Fails if the previous test ran!
```

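One structural fix: make the cache a function-scoped fixture so each test owns a fresh one. A sketch, with `get_from_cache` changed to take the cache explicitly:

```python
import pytest

@pytest.fixture
def cache():
    return {}  # fresh dict per test, nothing shared at module level

def get_from_cache(cache, key):
    return cache.get(key)

def test_cache_miss(cache):
    assert get_from_cache(cache, "key1") is None
    cache["key1"] = "value"  # only touches this test's cache

def test_cache_lookup(cache):
    assert get_from_cache(cache, "key1") is None  # passes in any order
```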

---

### 4. Explicit Scope

**Know what state is shared vs isolated.**

**Test scopes (pytest):**

- `scope="function"` - Fresh per test (default, safest)
- `scope="class"` - Shared across a test class
- `scope="module"` - Shared across a file
- `scope="session"` - Shared across the entire test run

**Rule:** Default to `scope="function"`. Only use broader scopes for expensive resources that are READ-ONLY.

```python
# ✅ GOOD: Expensive read-only data can be shared
@pytest.fixture(scope="session")
def large_config_file():
    return load_config("data.json")  # Expensive, never modified

# ❌ BAD: Mutable data shared across tests
@pytest.fixture(scope="session")
def database():
    return Database()  # Tests will pollute each other!

# ✅ GOOD: Mutable data fresh per test
@pytest.fixture(scope="function")
def database():
    db = Database()
    yield db
    db.cleanup()  # Fresh per test
```


---

### 5. Parallel Safety

**Tests must work when run concurrently.**

```bash
pytest -n 4  # Run the suite on 4 parallel workers (pytest-xdist plugin)
```

**Parallel-unsafe patterns:**

- Shared files without unique names
- Fixed network ports
- Singleton databases
- Global module state
- Fixed temp directories

**Fix:** Use unique identifiers per test (UUIDs, process IDs, random ports).

---

## Isolation Patterns by Resource Type

### Database Isolation

**Pattern 1: Transactions with Rollback (Fastest, Recommended)**

```python
import pytest
from sqlalchemy.orm import Session

@pytest.fixture
def db_session(db_engine):
    """Each test gets a fresh DB session whose changes are rolled back."""
    connection = db_engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)

    yield session

    session.close()
    transaction.rollback()  # Undo all changes
    connection.close()
```

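The fixture above assumes a `db_engine` fixture exists elsewhere in your conftest. A minimal sketch using an in-memory SQLite URL (swap in your real database URL):

```python
import pytest
from sqlalchemy import create_engine

@pytest.fixture(scope="session")
def db_engine():
    # Engine creation is the expensive, read-only part -> session scope is safe
    engine = create_engine("sqlite://")  # in-memory SQLite; use your own URL
    yield engine
    engine.dispose()
```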

**Why it works:**

- No cleanup code needed - rollback is automatic
- Fast (<1ms per test)
- Works with any transactional database (PostgreSQL, MySQL, SQLite, Oracle)
- Handles FK relationships automatically

**When NOT to use:**

- Testing actual commits
- Testing transaction isolation levels
- Multi-database transactions

---

**Pattern 2: Unique Data Per Test**

```python
import uuid
import pytest

@pytest.fixture
def unique_user():
    """Each test gets a unique user."""
    email = f"test-{uuid.uuid4()}@example.com"
    user = create_user(email=email, name="Test User")

    yield user

    # Optional cleanup (or rely on the test DB being dropped)
    delete_user(user.id)
```

**Why it works:**

- Tests don't interfere (different users)
- Can run in parallel
- Idempotent (UUID ensures uniqueness)

**When to use:**

- Testing with real databases
- Parallel test execution
- Integration tests that need real commits

---


**Pattern 3: Test Database Per Test**

```python
import uuid
import pytest

@pytest.fixture
def isolated_db():
    """Each test gets its own temporary database."""
    db_name = f"test_db_{uuid.uuid4().hex}"
    create_database(db_name)

    yield get_connection(db_name)

    drop_database(db_name)
```

**Why it works:**

- Complete isolation
- Can test schema migrations
- No cross-test pollution

**When NOT to use:**

- Unit tests (too slow)
- Large test suites (the overhead adds up)

---

### File System Isolation

**Pattern: Temporary Directories**

```python
import pytest
import tempfile
import shutil

@pytest.fixture
def temp_workspace():
    """Each test gets a fresh temporary directory."""
    tmpdir = tempfile.mkdtemp(prefix="test_")

    yield tmpdir

    shutil.rmtree(tmpdir)  # Clean up
```

**Parallel-safe version:**

```python
@pytest.fixture
def temp_workspace(tmp_path):
    """pytest's tmp_path is automatically unique per test."""
    workspace = tmp_path / "workspace"
    workspace.mkdir()

    yield workspace

    # No cleanup needed - pytest handles it
```

**Why it works:**

- Each test writes to a different directory
- Parallel-safe (unique paths)
- Automatic cleanup

---

### Service/API Isolation

**Pattern: Mocking External Services**

```python
import pytest
from unittest.mock import patch, MagicMock

@pytest.fixture
def mock_stripe():
    """Mock the Stripe API for any test that requests this fixture."""
    with patch('stripe.Charge.create') as mock:
        mock.return_value = MagicMock(id="ch_test123", status="succeeded")
        yield mock
```

**When to use:**

- External APIs (Stripe, Twilio, SendGrid)
- Slow services
- Non-deterministic responses
- Services that cost money per call

**When NOT to use:**

- Testing integration with the real service (use a separate integration test suite)

---

### In-Memory Cache Isolation

**Pattern: Clear Cache Before Each Test**

```python
import pytest

@pytest.fixture(autouse=True)
def clear_cache():
    """Automatically clear the cache before each test."""
    cache.clear()
    yield
    # Optional: clear after the test too
    cache.clear()
```

**Why `autouse=True`:** The fixture runs automatically for every test without being declared explicitly.

---

### Process/Port Isolation

**Pattern: Dynamic Port Allocation**

```python
import socket
import pytest

def get_free_port():
    """Find an available port (note: there is a small race window
    between closing this socket and the server binding the port)."""
    sock = socket.socket()
    sock.bind(('', 0))  # the OS assigns a free port
    port = sock.getsockname()[1]
    sock.close()
    return port

@pytest.fixture
def test_server():
    """Each test gets a server on a unique port."""
    port = get_free_port()
    server = start_server(port=port)

    yield f"http://localhost:{port}"

    server.stop()
```

**Why it works:**

- Tests can run in parallel (different ports)
- No port conflicts

---

## Test Doubles: When to Use What

| Type | Purpose | Example |
|------|---------|---------|
| **Stub** | Returns hardcoded values | `getUser() → {id: 1, name: "Alice"}` |
| **Mock** | Verifies calls were made | `assert emailService.send.called` |
| **Fake** | Working implementation, simplified | In-memory database instead of PostgreSQL |
| **Spy** | Records calls for later inspection | Logs all method calls |

**Decision tree:**

```
Do you need to verify the call was made?
  YES → Use Mock
  NO → Do you need a working implementation?
    YES → Use Fake
    NO → Use Stub
```

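The table's rows, sketched in Python with `unittest.mock` (names are illustrative):

```python
from unittest.mock import MagicMock

# Stub: hardcoded return value, no behavior, no verification
def get_user(user_id):
    return {"id": 1, "name": "Alice"}

# Mock: records calls so the test can verify they happened
email_service = MagicMock()
email_service.send("alice@example.com")
assert email_service.send.called

# Fake: simplified but genuinely working implementation
class FakeDatabase:
    def __init__(self):
        self.rows = {}

    def insert(self, key, row):
        self.rows[key] = row

    def get(self, key):
        return self.rows.get(key)
```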

---

## Diagnosing Isolation Problems

### Step 1: Identify Flaky Tests

```bash
# Run tests 100 times to find flakiness (pytest-repeat plugin)
pytest --count=100 test_checkout.py

# Run in random order (pytest-random-order plugin)
pytest --random-order
```

**Interpretation:**

- Passes 100/100 → Not flaky
- Passes 95/100 → Flaky (5% failure rate)
- Failures at random points → Parallel-unsafe OR order-dependent

---


### Step 2: Find Which Tests Interfere

**Run tests in isolation:**

```bash
# Test A alone
pytest test_a.py  # ✓ Passes

# Test B alone
pytest test_b.py  # ✓ Passes

# Both together
pytest test_a.py test_b.py  # ✗ Test B fails

# Conclusion: Test A pollutes state that Test B depends on
```

**Reverse the order:**

```bash
pytest test_b.py test_a.py  # Does Test A fail now?
```

- If YES: Bidirectional pollution
- If NO: Test A pollutes, Test B is the victim

---

### Step 3: Identify Shared State

**Add diagnostic logging:**

```python
@pytest.fixture(autouse=True)
def log_state():
    """Log state before/after each test."""
    print(f"Before: DB has {db.count()} records")
    yield
    print(f"After: DB has {db.count()} records")
```

**Look for:**

- Record count increasing over time (no cleanup)
- Files accumulating
- Cache growing
- Ports in use

---

### Step 4: Audit for Global State

**Search the codebase for isolation violations:**

```bash
# Module-level globals
grep -r "^[A-Z_]* = " app/

# Global caches
grep -r "cache = " app/

# Singletons
grep -r "@singleton" app/
grep -r "class.*Singleton" app/
```

---

## Anti-Patterns Catalog

### ❌ Cleanup Code Instead of Structural Isolation

**Symptom:** Every test ends with teardown code to clean up

```python
def test_checkout():
    user = create_user()
    cart = create_cart(user)

    checkout(cart)

    # Teardown
    delete_cart(cart.id)
    delete_user(user.id)
```

**Why bad:**

- If the test fails before cleanup, state pollutes
- If the cleanup has bugs, state pollutes
- Forces sequential execution (no parallelism)

**Fix:** Use transactions, unique IDs, or dependency injection

---

### ❌ Shared Test Fixtures

**Symptom:** Tests modify a fixture that is shared across them

```python
@pytest.fixture(scope="module")
def user():
    return create_user(email="test@example.com", name="Test User")

def test_update_name(user):
    user.name = "Alice"  # Modifies the shared fixture!
    save(user)

def test_update_email(user):
    # Expects the original name, but the previous test changed it!
    assert user.name == "Test User"  # FAILS
```

**Why bad:** Tests interfere when the shared fixture is modified

**Fix:** Use `scope="function"` for mutable fixtures

---

### ❌ Hidden Dependencies on Execution Order

**Symptom:** The test suite has an implicit execution order

```python
# test_a.py
def test_create_admin():
    create_user(email="admin@example.com", role="admin")

# test_b.py
def test_admin_permissions():
    admin = get_user("admin@example.com")  # Assumes test_a ran!
    assert admin.has_permission("delete_users")
```

**Why bad:** Breaks when tests run in a different order or in parallel

**Fix:** Each test creates its own dependencies

---

### ❌ Testing on Production-Like State

**Symptom:** Tests run against a shared database with existing data

```python
def test_user_count():
    assert db.users.count() == 100  # Assumes specific state!
```

**Why bad:**

- Tests fail when the data changes
- Can't run in parallel
- Can't run idempotently

**Fix:** Use an isolated test database, or count relative to the test's own data


---

## Common Scenarios

### Scenario 1: "Tests pass locally, fail in CI"

**Likely causes:**

1. **Timing issues** - CI is slower/faster, race conditions exposed
2. **Parallel execution** - CI runs tests in parallel, the local run doesn't
3. **Missing cleanup** - The local run benefits from leftover state, CI starts fresh

**Diagnosis:**

```bash
# Test parallel execution locally
pytest -n 4

# Test with clean state
rm -rf .pytest_cache && pytest
```

---

### Scenario 2: "Random test failures that disappear on retry"

**Likely causes:**

1. **Race conditions** - Async operations not awaited
2. **Shared mutable state** - Global variables polluted
3. **External service flakiness** - Real APIs being called

**Diagnosis:**

```bash
# Run the same test 100 times (pytest-repeat plugin)
pytest --count=100 test_flaky.py

# If the failure rate is consistent (e.g., 5/100), suspect shared state
# If the failure rate varies wildly, suspect a race condition
```

---

### Scenario 3: "Database unique constraint violations"

**Symptom:** `IntegrityError: duplicate key value violates unique constraint`

**Cause:** Tests reuse the same email/username/ID

**Fix:**

```python
import uuid

@pytest.fixture
def unique_user():
    email = f"test-{uuid.uuid4()}@example.com"
    return create_user(email=email)
```

---

## Quick Reference: Isolation Strategy Decision Tree

```
What resource needs isolation?

DATABASE
├─ Can you use transactions? → Transaction Rollback (fastest)
├─ Need real commits? → Unique Data Per Test
└─ Need schema changes? → Test Database Per Test

FILES
├─ Few files? → pytest's tmp_path
└─ Complex directories? → tempfile.mkdtemp()

EXTERNAL SERVICES
├─ Testing integration? → Separate integration test suite
└─ Testing business logic? → Mock the service

IN-MEMORY STATE
├─ Caches → Clear before each test (autouse fixture)
├─ Globals → Dependency injection (refactor)
└─ Module-level → Reset in a fixture or avoid entirely

PROCESSES/PORTS
└─ Dynamic port allocation per test
```

---

## Bottom Line

**Test isolation is structural, not reactive.**

- ❌ **Reactive:** Write cleanup code after each test
- ✅ **Structural:** Design tests so cleanup isn't needed

**The hierarchy:**

1. **Best:** Dependency injection (no shared state)
2. **Good:** Transactions/tmp_path (automatic cleanup)
3. **Acceptable:** Unique data per test (explicit isolation)
4. **Last resort:** Manual cleanup (fragile, error-prone)

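What "dependency injection" means here, in miniature (names hypothetical): the state is a parameter, so each test supplies its own and there is nothing shared to clean up.

```python
# ❌ global state reached from inside the function
_cache = {}

def lookup_global(key):
    return _cache.get(key)

# ✅ dependency injected: the caller owns the state
def lookup(cache, key):
    return cache.get(key)

def test_lookup_hit():
    assert lookup({"k": 1}, "k") == 1  # test owns its state entirely

def test_lookup_miss():
    assert lookup({}, "k") is None  # nothing shared, nothing to clean up
```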

**If your tests fail together but pass alone, you have an isolation problem. Stop adding tests and fix isolation first.**