---
name: test-isolation-fundamentals
description: Use when tests fail together but pass alone, diagnosing test pollution, ensuring test independence and idempotence, managing shared state, or designing parallel-safe tests - provides isolation principles, database/file/service patterns, and cleanup strategies
---

# Test Isolation Fundamentals

## Overview

**Core principle:** Each test must work independently, regardless of execution order or parallel execution.

**Rule:** If a test fails when run with other tests but passes alone, you have an isolation problem. Fix it before adding more tests.

## When You Have Isolation Problems

**Symptoms:**
- Tests pass individually: `pytest test_checkout.py` ✓
- Tests fail in full suite: `pytest` ✗
- Errors like "User already exists", "Expected empty but found data"
- Tests fail randomly or only in CI
- Different results when tests run in different orders

**Root cause:** Tests share mutable state without cleanup.

## The Five Principles

### 1. Order-Independence

**Tests must pass regardless of execution order.**

```bash
# All of these must produce identical results
pytest tests/                 # default (alphabetical) order
pytest tests/ --random-order  # random order (requires pytest-random-order)
pytest tests/ --reverse       # reverse order (requires pytest-reverse)
```

**Anti-pattern:**
```python
# ❌ BAD: Test B depends on Test A running first
def test_create_user():
    db.users.insert({"id": 1, "name": "Alice"})

def test_update_user():
    db.users.update({"id": 1}, {"name": "Bob"})  # Assumes Alice exists!
```

**Fix:** Each test creates its own data.
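A corrected version, sketched with a minimal in-memory stand-in for the `db.users` collection above (the real suite would use its actual database fixture):

```python
# Minimal in-memory stand-in for the `db.users` collection above
# (assumption: the real suite would use its actual database fixture).
class Users:
    def __init__(self):
        self.rows = {}

    def insert(self, row):
        self.rows[row["id"]] = dict(row)

    def update(self, match, fields):
        self.rows[match["id"]].update(fields)

    def get(self, user_id):
        return self.rows[user_id]

# ✅ GOOD: the test creates its own data instead of assuming another test ran
def test_update_user():
    db = Users()                           # fresh state, owned by this test
    db.insert({"id": 1, "name": "Alice"})  # its own setup
    db.update({"id": 1}, {"name": "Bob"})
    assert db.get(1)["name"] == "Bob"
```

Because the test owns both its setup and its state, it passes in any order, any number of times.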

---

### 2. Idempotence

**Running a test twice produces the same result both times.**

```bash
# Both runs must pass
pytest test_checkout.py  # First run
pytest test_checkout.py  # Second run (same result)
```

**Anti-pattern:**
```python
# ❌ BAD: Second run fails on unique constraint
def test_signup():
    user = create_user(email="test@example.com")
    assert user.id is not None
    # No cleanup - second run fails: "email already exists"
```

**Fix:** Clean up data after the test OR use unique data per run.

---

### 3. Fresh State

**Each test starts with a clean slate.**

**What needs to be fresh:**
- Database records
- Files and directories
- In-memory caches
- Global variables
- Module-level state
- Environment variables
- Network sockets/ports
- Background processes

**Anti-pattern:**
```python
# ❌ BAD: Shared mutable global state
cache = {}  # Module-level global

def test_cache_miss():
    assert get_from_cache("key1") is None  # Passes first time
    cache["key1"] = "value"  # Pollutes global state

def test_cache_lookup():
    assert get_from_cache("key1") is None  # Fails if previous test ran!
```
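The structural fix is dependency injection: pass the cache in rather than sharing a module-level global. A sketch:

```python
# ✅ GOOD: the cache is injected, so each test owns a fresh one
def get_from_cache(cache, key):
    return cache.get(key)

def test_cache_miss():
    cache = {}                                    # fresh, owned by this test
    assert get_from_cache(cache, "key1") is None
    cache["key1"] = "value"                       # pollutes nothing shared

def test_cache_lookup():
    cache = {}                                    # unaffected by other tests
    assert get_from_cache(cache, "key1") is None  # passes in any order
```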

---

### 4. Explicit Scope

**Know what state is shared vs isolated.**

**Test scopes (pytest):**
- `scope="function"` - Fresh per test (default, safest)
- `scope="class"` - Shared across test class
- `scope="module"` - Shared across file
- `scope="session"` - Shared across entire test run

**Rule:** Default to `scope="function"`. Only use broader scopes for expensive resources that are READ-ONLY.

```python
# ✅ GOOD: Expensive read-only data can be shared
@pytest.fixture(scope="session")
def large_config_file():
    return load_config("data.json")  # Expensive, never modified

# ❌ BAD: Mutable data shared across tests
@pytest.fixture(scope="session")
def database():
    return Database()  # Tests will pollute each other!

# ✅ GOOD: Mutable data fresh per test
@pytest.fixture(scope="function")
def database():
    db = Database()
    yield db
    db.cleanup()  # Fresh per test
```

---

### 5. Parallel Safety

**Tests must work when run concurrently.**

```bash
pytest -n 4  # Run tests across 4 parallel workers (requires pytest-xdist)
```

**Parallel-unsafe patterns:**
- Shared files without unique names
- Fixed network ports
- Singleton databases
- Global module state
- Fixed temp directories

**Fix:** Use unique identifiers per test (UUIDs, process IDs, random ports).
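A small helper along these lines (a sketch; the name is illustrative) covers files, database names, and queues alike:

```python
import os
import uuid

def unique_name(prefix="test"):
    """Name that is unique per call and per worker process.

    The pid distinguishes parallel workers (e.g. under pytest-xdist);
    the uuid guards against collisions within one worker.
    """
    return f"{prefix}-{os.getpid()}-{uuid.uuid4().hex[:8]}"
```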

---

## Isolation Patterns by Resource Type

### Database Isolation

**Pattern 1: Transactions with Rollback (Fastest, Recommended)**

```python
import pytest
from sqlalchemy.orm import Session

@pytest.fixture
def db_session(db_engine):
    """Each test gets a fresh DB session that rolls back automatically."""
    connection = db_engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)

    yield session

    transaction.rollback()  # Undo all changes
    connection.close()
```

**Why it works:**
- No cleanup code needed - rollback is automatic
- Fast (<1ms per test)
- Works with ANY database (PostgreSQL, MySQL, SQLite, Oracle)
- Handles FK relationships automatically

**When NOT to use:**
- Testing actual commits
- Testing transaction isolation levels
- Multi-database transactions

---

**Pattern 2: Unique Data Per Test**

```python
import uuid
import pytest

@pytest.fixture
def unique_user():
    """Each test gets a unique user."""
    email = f"test-{uuid.uuid4()}@example.com"
    user = create_user(email=email, name="Test User")

    yield user

    # Optional cleanup (or rely on test DB being dropped)
    delete_user(user.id)
```

**Why it works:**
- Tests don't interfere (different users)
- Can run in parallel
- Idempotent (UUID ensures uniqueness)

**When to use:**
- Testing with real databases
- Parallel test execution
- Integration tests that need real commits

---

**Pattern 3: Test Database Per Test**

```python
import uuid
import pytest

@pytest.fixture
def isolated_db():
    """Each test gets its own temporary database."""
    db_name = f"test_db_{uuid.uuid4().hex}"
    create_database(db_name)

    yield get_connection(db_name)

    drop_database(db_name)
```

**Why it works:**
- Complete isolation
- Can test schema migrations
- No cross-test pollution

**When NOT to use:**
- Unit tests (too slow)
- Large test suites (overhead adds up)

---

### File System Isolation

**Pattern: Temporary Directories**

```python
import pytest
import tempfile
import shutil

@pytest.fixture
def temp_workspace():
    """Each test gets a fresh temporary directory."""
    tmpdir = tempfile.mkdtemp(prefix="test_")

    yield tmpdir

    shutil.rmtree(tmpdir)  # Clean up
```

**Parallel-safe version:**

```python
@pytest.fixture
def temp_workspace(tmp_path):
    """pytest's tmp_path is automatically unique per test."""
    workspace = tmp_path / "workspace"
    workspace.mkdir()

    yield workspace

    # No cleanup needed - pytest handles it
```

**Why it works:**
- Each test writes to a different directory
- Parallel-safe (unique paths)
- Automatic cleanup

---

### Service/API Isolation

**Pattern: Mocking External Services**

```python
import pytest
from unittest.mock import patch, MagicMock

@pytest.fixture
def mock_stripe():
    """Mock the Stripe API for all tests."""
    with patch('stripe.Charge.create') as mock:
        mock.return_value = MagicMock(id="ch_test123", status="succeeded")
        yield mock
```

**When to use:**
- External APIs (Stripe, Twilio, SendGrid)
- Slow services
- Non-deterministic responses
- Services that cost money per call

**When NOT to use:**
- Testing integration with the real service (use a separate integration test suite)

---

### In-Memory Cache Isolation

**Pattern: Clear Cache Before Each Test**

```python
import pytest

@pytest.fixture(autouse=True)
def clear_cache():
    """Automatically clear the cache before each test."""
    cache.clear()
    yield
    # Optional: clear after the test too
    cache.clear()
```

**Why `autouse=True`:** Runs automatically for every test without explicit declaration.

---

### Process/Port Isolation

**Pattern: Dynamic Port Allocation**

```python
import socket
import pytest

def get_free_port():
    """Find an available port."""
    sock = socket.socket()
    sock.bind(('', 0))
    port = sock.getsockname()[1]
    sock.close()
    return port

@pytest.fixture
def test_server():
    """Each test gets a server on a unique port."""
    port = get_free_port()
    server = start_server(port=port)

    yield f"http://localhost:{port}"

    server.stop()
```

**Why it works:**
- Tests can run in parallel (different ports)
- No port conflicts

(Caveat: there is a small window between closing the probe socket and the server binding in which another process could grab the port; if your server can bind to port 0 itself and report back the port it chose, that is more robust.)

---

## Test Doubles: When to Use What

| Type | Purpose | Example |
|------|---------|---------|
| **Stub** | Returns hardcoded values | `getUser() → {id: 1, name: "Alice"}` |
| **Mock** | Verifies calls were made | `assert emailService.send.called` |
| **Fake** | Working implementation, simplified | In-memory database instead of PostgreSQL |
| **Spy** | Records calls for later inspection | Logs all method calls |

**Decision tree:**

```
Do you need to verify the call was made?
  YES → Use Mock
  NO → Do you need a working implementation?
    YES → Use Fake
    NO → Use Stub
```
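The three main doubles in miniature, using `unittest.mock` (the service names are illustrative):

```python
from unittest.mock import MagicMock

# Stub: returns hardcoded values; we only care about the data it hands back
get_user = MagicMock(return_value={"id": 1, "name": "Alice"})
assert get_user(1)["name"] == "Alice"

# Mock: verifies the call was made with the right arguments
email_service = MagicMock()
email_service.send("alice@example.com", "Welcome!")
email_service.send.assert_called_once_with("alice@example.com", "Welcome!")

# Fake: a working but simplified implementation (in-memory "database")
class FakeUserStore:
    def __init__(self):
        self._rows = {}

    def save(self, user):
        self._rows[user["id"]] = user

    def get(self, user_id):
        return self._rows.get(user_id)

store = FakeUserStore()
store.save({"id": 1, "name": "Alice"})
assert store.get(1)["name"] == "Alice"
assert store.get(2) is None
```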

---

## Diagnosing Isolation Problems

### Step 1: Identify Flaky Tests

```bash
# Run tests 100 times to find flakiness (requires pytest-repeat)
pytest --count=100 test_checkout.py

# Run in random order (requires pytest-random-order)
pytest --random-order
```

**Interpretation:**
- Passes 100/100 → Not flaky
- Passes 95/100 → Flaky (5% failure rate)
- Failures are random → Parallel-unsafe OR order-dependent

---

### Step 2: Find Which Tests Interfere

**Run tests in isolation:**

```bash
# Test A alone
pytest test_a.py  # ✓ Passes

# Test B alone
pytest test_b.py  # ✓ Passes

# Both together
pytest test_a.py test_b.py  # ✗ Test B fails

# Conclusion: Test A pollutes state that Test B depends on
```

**Reverse the order:**

```bash
pytest test_b.py test_a.py  # Does Test A fail now?
```

- If YES: Bidirectional pollution
- If NO: Test A pollutes, Test B is the victim

---

### Step 3: Identify Shared State

**Add diagnostic logging:**

```python
import pytest

@pytest.fixture(autouse=True)
def log_state():
    """Log state before/after each test."""
    print(f"Before: DB has {db.count()} records")
    yield
    print(f"After: DB has {db.count()} records")
```

**Look for:**
- Record count increasing over time (no cleanup)
- Files accumulating
- Cache growing
- Ports in use

---

### Step 4: Audit for Global State

**Search the codebase for isolation violations:**

```bash
# Module-level globals
grep -r "^[A-Z_]* = " app/

# Global caches
grep -r "cache = " app/

# Singletons
grep -r "@singleton" app/
grep -r "class.*Singleton" app/
```
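A complementary, more precise check is to parse each module with the standard-library `ast` module and list top-level assignments (a sketch; the function name is illustrative):

```python
import ast

def module_level_assignments(source):
    """Report names assigned at module level -- candidates for shared state."""
    tree = ast.parse(source)
    names = []
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.append(target.id)
    return names

code = "cache = {}\nRETRIES = 3\ndef f():\n    local = 1\n"
assert module_level_assignments(code) == ["cache", "RETRIES"]
```

Unlike grep, this ignores assignments inside functions, so it only flags true module-level state.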

---

## Anti-Patterns Catalog

### ❌ Cleanup Code Instead of Structural Isolation

**Symptom:** Every test has teardown code to clean up

```python
def test_checkout():
    user = create_user()
    cart = create_cart(user)

    checkout(cart)

    # Teardown
    delete_cart(cart.id)
    delete_user(user.id)
```

**Why bad:**
- If the test fails before cleanup, state pollutes
- If the cleanup has bugs, state pollutes
- Forces sequential execution (no parallelism)

**Fix:** Use transactions, unique IDs, or dependency injection
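Dependency injection is the most structural of the three fixes: the test passes in its own collaborators, so there is nothing shared to tear down. A sketch (the `InMemoryOrders`/`checkout` names are illustrative, not from the source):

```python
class InMemoryOrders:
    """Fake order store; each test builds a fresh one, so no teardown."""
    def __init__(self):
        self.placed = []

    def place(self, cart):
        self.placed.append(cart)

def checkout(cart, orders):
    # the business logic depends on an injected store, not a global one
    orders.place(cart)
    return len(orders.placed)

def test_checkout():
    orders = InMemoryOrders()  # state owned by this test
    assert checkout({"items": ["book"]}, orders) == 1
```

No `delete_cart`/`delete_user` calls are needed; the state dies with the test.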

---

### ❌ Shared Test Fixtures

**Symptom:** Fixtures modify mutable state

```python
@pytest.fixture(scope="module")
def user():
    return create_user(email="test@example.com", name="Test User")

def test_update_name(user):
    user.name = "Alice"  # Modifies shared fixture!
    save(user)

def test_update_email(user):
    # Expects the original name, but the previous test changed it!
    assert user.name == "Test User"  # FAILS
```

**Why bad:** Tests interfere when the fixture is modified

**Fix:** Use `scope="function"` for mutable fixtures

---

### ❌ Hidden Dependencies on Execution Order

**Symptom:** Test suite has an implicit execution order

```python
# test_a.py
def test_create_admin():
    create_user(email="admin@example.com", role="admin")

# test_b.py
def test_admin_permissions():
    admin = get_user("admin@example.com")  # Assumes test_a ran!
    assert admin.has_permission("delete_users")
```

**Why bad:** Breaks when tests run in a different order or in parallel

**Fix:** Each test creates its own dependencies

---

### ❌ Testing on Production-Like State

**Symptom:** Tests run against a shared database with existing data

```python
def test_user_count():
    assert db.users.count() == 100  # Assumes specific state!
```

**Why bad:**
- Tests fail when data changes
- Can't run in parallel
- Can't run idempotently

**Fix:** Use an isolated test database or count relative to the test's own data

---

## Common Scenarios

### Scenario 1: "Tests pass locally, fail in CI"

**Likely causes:**
1. **Timing issues** - CI is slower/faster, race conditions exposed
2. **Parallel execution** - CI runs tests in parallel, local doesn't
3. **Missing cleanup** - Local has leftover state, CI is fresh

**Diagnosis:**
```bash
# Test parallel execution locally (requires pytest-xdist)
pytest -n 4

# Test with clean state
rm -rf .pytest_cache && pytest
```

---

### Scenario 2: "Random test failures that disappear on retry"

**Likely causes:**
1. **Race conditions** - Async operations not awaited
2. **Shared mutable state** - Global variables polluted
3. **External service flakiness** - Real APIs being called

**Diagnosis:**
```bash
# Run the same test 100 times (requires pytest-repeat)
pytest --count=100 test_flaky.py

# If the failure rate is consistent (e.g., 5/100), it's likely shared state
# If the failure rate varies wildly, it's likely a race condition
```

---

### Scenario 3: "Database unique constraint violations"

**Symptom:** `IntegrityError: duplicate key value violates unique constraint`

**Cause:** Tests reuse the same email/username/ID

**Fix:**
```python
import uuid
import pytest

@pytest.fixture
def unique_user():
    email = f"test-{uuid.uuid4()}@example.com"
    return create_user(email=email)
```

---

## Quick Reference: Isolation Strategy Decision Tree

```
What resource needs isolation?

DATABASE
├─ Can you use transactions? → Transaction Rollback (fastest)
├─ Need real commits? → Unique Data Per Test
└─ Need schema changes? → Test Database Per Test

FILES
├─ Few files? → pytest's tmp_path
└─ Complex directories? → tempfile.mkdtemp()

EXTERNAL SERVICES
├─ Testing integration? → Separate integration test suite
└─ Testing business logic? → Mock the service

IN-MEMORY STATE
├─ Caches → Clear before each test (autouse fixture)
├─ Globals → Dependency injection (refactor)
└─ Module-level → Reset in fixture or avoid entirely

PROCESSES/PORTS
└─ Dynamic port allocation per test
```

---

## Bottom Line

**Test isolation is structural, not reactive.**

- ❌ **Reactive:** Write cleanup code after each test
- ✅ **Structural:** Design tests so cleanup isn't needed

**The hierarchy:**
1. **Best:** Dependency injection (no shared state)
2. **Good:** Transactions/tmp_path (automatic cleanup)
3. **Acceptable:** Unique data per test (explicit isolation)
4. **Last resort:** Manual cleanup (fragile, error-prone)

**If your tests fail together but pass alone, you have an isolation problem. Stop adding tests and fix isolation first.**