zhongwei/gh-aazbeltran-claude-code-plugins-plugins-testing

Files

Zhongwei Li d601a38b28 Initial commit

2025-11-29 17:50:49 +08:00

20 KiB

Raw Permalink Blame History

Test Design Best Practices

Building Stable, Deterministic Tests

The foundation of a reliable test suite is deterministic tests that produce consistent results. Non-deterministic or "flaky" tests erode trust and waste time debugging phantom failures.

Eliminating Timing Dependencies

The Problem: Fixed sleep statements create unreliable tests that either waste time or fail intermittently.

# ❌ BAD: Fixed sleep
def test_async_operation():
    start_async_task()
    time.sleep(5)  # Hope 5 seconds is enough
    assert task_completed()

# ✅ GOOD: Conditional wait
def test_async_operation():
    start_async_task()
    wait_until(lambda: task_completed(), timeout=10, interval=0.1)
    assert task_completed()

Strategies:

Use explicit waits with conditions, not arbitrary sleeps
Implement timeout-based polling for async operations
Use testing framework wait utilities (e.g., Selenium WebDriverWait)
Mock time-dependent code to make tests deterministic
Use fast fake timers in tests instead of actual delays

Example: Polling with Timeout

def wait_until(condition, timeout=10, interval=0.1):
    """Wait until condition is true or timeout expires."""
    end_time = time.time() + timeout
    while time.time() < end_time:
        if condition():
            return True
        time.sleep(interval)
    raise TimeoutError(f"Condition not met within {timeout} seconds")

def test_message_processing():
    queue.publish(message)

    wait_until(lambda: queue.is_empty(), timeout=5)

    assert message_handler.processed_count == 1

Avoiding Shared State

The Problem: Tests that share state create order dependencies and cascading failures.

# ❌ BAD: Shared state
shared_user = None

def test_create_user():
    global shared_user
    shared_user = User.create(email="test@example.com")
    assert shared_user.id is not None

def test_user_can_login():
    # Depends on previous test
    result = login(shared_user.email, "password")
    assert result.success

# ✅ GOOD: Independent tests
@pytest.fixture
def user():
    """Each test gets its own user."""
    user = User.create(email=f"test-{uuid4()}@example.com")
    yield user
    user.delete()

def test_create_user(user):
    assert user.id is not None

def test_user_can_login(user):
    result = login(user.email, "password")
    assert result.success

Strategies:

Use test fixtures that create fresh instances
Employ database transactions that rollback after each test
Use in-memory databases that reset between tests
Avoid global variables and singletons in tests
Make each test completely self-contained

Database Isolation Pattern

@pytest.fixture
def db_transaction():
    """Each test runs in a transaction that rolls back."""
    connection = engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)

    yield session

    session.close()
    transaction.rollback()
    connection.close()

def test_user_creation(db_transaction):
    user = User(email="test@example.com")
    db_transaction.add(user)
    db_transaction.commit()

    assert db_transaction.query(User).count() == 1
    # Transaction rolls back after test - no cleanup needed

Handling External Dependencies

The Problem: Tests depending on external systems (APIs, databases, file systems) become non-deterministic and slow.

Strategies:

Mock external HTTP APIs: Use libraries like responses, httpretty, or requests-mock
Use in-memory databases: SQLite in-memory for SQL databases
Fake file systems: Use pyfakefs or similar for file operations
Container-based dependencies: Use Testcontainers for isolated, real dependencies
Test doubles: Stubs, mocks, and fakes for external services

Mocking HTTP Calls

import responses

@responses.activate
def test_fetch_user_data():
    responses.add(
        responses.GET,
        'https://api.example.com/users/123',
        json={'id': 123, 'name': 'Test User'},
        status=200
    )

    user_data = api_client.get_user(123)

    assert user_data['name'] == 'Test User'

Controlling Randomness and Time

Random Data: Use fixed seeds for reproducible randomness

# ❌ BAD: Non-deterministic
def test_shuffle():
    items = [1, 2, 3, 4, 5]
    random.shuffle(items)  # Different result each run
    assert items[0] < items[-1]

# ✅ GOOD: Deterministic
def test_shuffle():
    random.seed(42)
    items = [1, 2, 3, 4, 5]
    random.shuffle(items)
    assert items == [2, 4, 5, 3, 1]  # Always same order

Time-Dependent Code: Mock current time

from datetime import datetime
from unittest.mock import patch

# ❌ BAD: Depends on actual current time
def test_is_expired():
    expiry = datetime(2024, 1, 1)
    assert is_expired(expiry)  # Fails after 2024-01-01

# ✅ GOOD: Freezes time
@patch('mymodule.datetime')
def test_is_expired(mock_datetime):
    mock_datetime.now.return_value = datetime(2024, 6, 1)

    expiry = datetime(2024, 1, 1)

    assert is_expired(expiry)

Boundary Analysis and Equivalence Partitioning

Systematic approaches to test input selection ensure comprehensive coverage without exhaustive testing.

Boundary Value Analysis

Principle: Bugs often occur at the edges of input ranges. Test boundaries explicitly.

For a function accepting ages 0-120:

Test: -1 (below minimum)
Test: 0 (minimum boundary)
Test: 1 (just above minimum)
Test: 119 (just below maximum)
Test: 120 (maximum boundary)
Test: 121 (above maximum)

Example Implementation

def validate_age(age):
    if age < 0 or age > 120:
        raise ValueError("Age must be between 0 and 120")
    return True

@pytest.mark.parametrize("age,should_pass", [
    (-1, False),   # Below minimum
    (0, True),     # Minimum boundary
    (1, True),     # Just above minimum
    (60, True),    # Typical value
    (119, True),   # Just below maximum
    (120, True),   # Maximum boundary
    (121, False),  # Above maximum
])
def test_age_validation(age, should_pass):
    if should_pass:
        assert validate_age(age) == True
    else:
        with pytest.raises(ValueError):
            validate_age(age)

Equivalence Partitioning

Principle: Divide input space into classes that should behave identically. Test one representative from each class.

For a discount function based on purchase amount:

Partition 1: $0-100 (no discount)
Partition 2: $101-500 (10% discount)
Partition 3: $501+ (20% discount)

Test representatives: $50, $300, $600

Example Implementation

def calculate_discount(amount):
    if amount <= 100:
        return 0
    elif amount <= 500:
        return amount * 0.10
    else:
        return amount * 0.20

@pytest.mark.parametrize("amount,expected_discount", [
    (50, 0),         # Partition 1: 0-100
    (100, 0),        # Boundary
    (101, 10.10),    # Just into partition 2
    (300, 30),       # Partition 2: 101-500
    (500, 50),       # Boundary
    (501, 100.20),   # Just into partition 3
    (1000, 200),     # Partition 3: 501+
])
def test_discount_calculation(amount, expected_discount):
    assert calculate_discount(amount) == pytest.approx(expected_discount)

Property-Based Testing

Principle: Define invariants that should always hold, then generate hundreds of test cases automatically.

Example: Reversing a List

from hypothesis import given
from hypothesis.strategies import lists, integers

@given(lists(integers()))
def test_reverse_twice_returns_original(items):
    """Reversing a list twice should return the original."""
    reversed_once = list(reversed(items))
    reversed_twice = list(reversed(reversed_once))
    assert reversed_twice == items

@given(lists(integers()))
def test_sorted_list_has_same_elements(items):
    """Sorted list should have same elements as original."""
    sorted_items = sorted(items)
    assert sorted(sorted_items) == sorted(items)
    assert len(sorted_items) == len(items)

Benefits:

Finds edge cases you didn't think to test
Tests many more scenarios than manual tests
Shrinks failing examples to minimal reproducible cases
Documents properties/invariants of the code

Writing Expressive, Maintainable Tests

Arrange-Act-Assert (AAA) Pattern

Structure: Organize tests into three clear sections for maximum readability.

def test_user_registration():
    # Arrange: Set up test data and dependencies
    email = "newuser@example.com"
    password = "SecurePass123!"
    user_service = UserService(database, email_service)

    # Act: Execute the behavior being tested
    result = user_service.register(email, password)

    # Assert: Verify expected outcomes
    assert result.success == True
    assert result.user.email == email
    assert result.user.is_verified == False

Benefits:

Immediately clear what the test does
Easy to identify what's being tested
Simple to understand failure location
Consistent structure across test suite

Variations:

Use blank lines to separate sections
Use comments to label sections
BDD Given-When-Then is equivalent structure

Descriptive Test Names

Principle: Test names should describe scenario and expected outcome without reading test code.

Naming Patterns:

# Pattern 1: test_[function]_[scenario]_[expected_result]
def test_calculate_discount_for_vip_customer_returns_ten_percent():
    pass

# Pattern 2: test_[scenario]_[expected_result]
def test_invalid_email_raises_validation_error():
    pass

# Pattern 3: Natural language with underscores
def test_user_cannot_login_with_expired_password():
    pass

# Pattern 4: BDD-style
def test_given_vip_customer_when_ordering_then_receives_discount():
    pass

Good Test Names:

✅ test_empty_cart_checkout_raises_error
✅ test_duplicate_email_registration_rejected
✅ test_expired_token_authentication_fails
✅ test_concurrent_order_updates_handled_correctly

Bad Test Names:

❌ test_1, test_2, test_3 (meaningless)
❌ test_order (what about orders?)
❌ test_the_thing_works (what thing? how?)
❌ test_bug_fix (what bug?)

Clear, Actionable Assertions

Principle: Assertion failures should immediately communicate what went wrong.

# ❌ BAD: Unclear failure message
def test_discount_calculation():
    assert calculate_discount(100) == 10

# Output: AssertionError: assert 0 == 10
# (Why did we expect 10? What was the input context?)

# ✅ GOOD: Clear failure message
def test_discount_calculation():
    customer = Customer(type="VIP")
    order_total = 100

    discount = calculate_discount(customer, order_total)

    assert discount == 10, \
        f"VIP customer with ${order_total} order should get $10 discount, got ${discount}"

# Output: AssertionError: VIP customer with $100 order should get $10 discount, got $0

Custom Assertion Messages:

# Complex conditions benefit from explanatory messages
assert user.is_active, \
    f"User {user.email} should be active after verification, but is_active={user.is_active}"

assert len(results) > 0, \
    f"Search for '{search_term}' should return results, got empty list"

assert response.status_code == 200, \
    f"GET /api/users/{user_id} should return 200, got {response.status_code}: {response.text}"

Keep Tests Focused and Small

Principle: Each test should verify one behavior or logical assertion.

# ❌ BAD: Testing too much
def test_user_lifecycle():
    # Create user
    user = User.create(email="test@example.com")
    assert user.id is not None

    # Activate user
    user.activate()
    assert user.is_active

    # Update profile
    user.update_profile(name="Test User")
    assert user.name == "Test User"

    # Deactivate user
    user.deactivate()
    assert not user.is_active

    # Delete user
    user.delete()
    assert User.find(user.id) is None

# ✅ GOOD: Focused tests
def test_create_user_assigns_id():
    user = User.create(email="test@example.com")
    assert user.id is not None

def test_activate_user_sets_active_flag():
    user = User.create(email="test@example.com")

    user.activate()

    assert user.is_active == True

def test_update_profile_changes_name():
    user = User.create(email="test@example.com")

    user.update_profile(name="Test User")

    assert user.name == "Test User"

When Multiple Assertions Are OK:

Verifying different properties of the same object
Checking related side effects of single action
Asserting preconditions and postconditions

# ✅ GOOD: Multiple related assertions
def test_order_creation():
    items = [Item("Widget", 10), Item("Gadget", 20)]

    order = Order.create(items)

    assert order.id is not None
    assert order.total == 30
    assert len(order.items) == 2
    assert order.status == "pending"

Test Data Management

Fixtures: Providing Test Dependencies

Fixtures provide reusable test dependencies and setup/teardown logic.

Pytest Fixtures

@pytest.fixture
def database():
    """Provide a test database."""
    db = create_test_database()
    yield db
    db.close()

@pytest.fixture
def user(database):
    """Provide a test user."""
    user = User.create(email="test@example.com")
    yield user
    user.delete()

def test_user_can_place_order(user, database):
    order = Order.create(user=user, items=[Item("Widget", 10)])
    assert order.user_id == user.id

Fixture Scopes:

function: New instance per test (default)
class: Shared across test class
module: Shared across test file
session: Shared across entire test run

@pytest.fixture(scope="session")
def app():
    """Start application once for entire test session."""
    app = create_app()
    app.start()
    yield app
    app.stop()

Factories: Flexible Test Data Creation

Factories create test objects with sensible defaults that can be customized.

# Factory pattern
class UserFactory:
    _counter = 0

    @classmethod
    def create(cls, **kwargs):
        cls._counter += 1
        defaults = {
            'email': f'user{cls._counter}@example.com',
            'name': f'Test User {cls._counter}',
            'role': 'user',
            'is_active': True
        }
        defaults.update(kwargs)
        return User(**defaults)

# Usage
def test_admin_user():
    admin = UserFactory.create(role='admin', name='Admin User')
    assert admin.role == 'admin'

def test_inactive_user():
    inactive = UserFactory.create(is_active=False)
    assert not inactive.is_active

FactoryBoy Library

import factory

class UserFactory(factory.Factory):
    class Meta:
        model = User

    email = factory.Sequence(lambda n: f'user{n}@example.com')
    name = factory.Faker('name')
    role = 'user'
    is_active = True

# Usage
def test_user_creation():
    user = UserFactory()
    assert user.email.endswith('@example.com')

def test_admin_user():
    admin = UserFactory(role='admin')
    assert admin.role == 'admin'

Builders: Constructing Complex Objects

Builders provide fluent APIs for creating complex test objects.

class OrderBuilder:
    def __init__(self):
        self._user = None
        self._items = []
        self._status = 'pending'
        self._shipping_address = None

    def with_user(self, user):
        self._user = user
        return self

    def with_item(self, name, price, quantity=1):
        self._items.append(Item(name, price, quantity))
        return self

    def with_status(self, status):
        self._status = status
        return self

    def with_shipping(self, address):
        self._shipping_address = address
        return self

    def build(self):
        return Order(
            user=self._user,
            items=self._items,
            status=self._status,
            shipping_address=self._shipping_address
        )

# Usage
def test_order_total_calculation():
    order = (OrderBuilder()
        .with_item("Widget", 10, quantity=2)
        .with_item("Gadget", 20)
        .build())

    assert order.total == 40

Snapshot/Golden Master Testing

Principle: Capture complex output as baseline, verify future runs match.

Use Cases:

Testing rendered HTML or UI components
Verifying generated reports or documents
Validating complex JSON/XML output
Testing data transformations with many fields

def test_user_profile_rendering(snapshot):
    user = User(name="John Doe", email="john@example.com", age=30)

    html = render_user_profile(user)

    snapshot.assert_match(html, 'user_profile.html')

# First run: Captures output as baseline
# Subsequent runs: Compares against baseline
# To update baseline: pytest --snapshot-update

Trade-offs:

Pro: Easy to test complex outputs
Pro: Catches unintended changes
Con: Changes to output format require updating all snapshots
Con: Can hide real regressions if snapshots updated without review

Mocking, Stubbing, and Fakes

Understanding the Distinctions

Stub: Provides canned responses to method calls

class StubEmailService:
    def send_email(self, to, subject, body):
        return True  # Always succeeds, doesn't actually send

Mock: Records interactions and verifies they occurred as expected

from unittest.mock import Mock

email_service = Mock()
user_service.register(user)

email_service.send_welcome_email.assert_called_once_with(user.email)

Fake: Simplified working implementation

class FakeDatabase:
    def __init__(self):
        self._data = {}

    def save(self, id, value):
        self._data[id] = value

    def find(self, id):
        return self._data.get(id)

When to Use Each Type

Use Stubs for query dependencies that provide data

def test_calculate_user_discount():
    stub_repository = StubUserRepository()
    stub_repository.set_return_value(User(type="VIP"))

    discount_service = DiscountService(stub_repository)

    discount = discount_service.calculate_for_user(user_id=123)

    assert discount == 10

Use Mocks for command dependencies where interaction matters

def test_order_completion_sends_confirmation():
    mock_email_service = Mock()
    order_service = OrderService(mock_email_service)

    order_service.complete_order(order_id=123)

    mock_email_service.send_confirmation.assert_called_once()

Use Fakes for complex dependencies needing realistic behavior

def test_user_repository_operations():
    fake_db = FakeDatabase()
    repository = UserRepository(fake_db)

    user = User(email="test@example.com")
    repository.save(user)

    found = repository.find_by_email("test@example.com")
    assert found.email == user.email

Avoiding Over-Mocking

The Problem: Mocking every dependency couples tests to implementation.

# ❌ BAD: Over-mocking
def test_order_creation():
    mock_validator = Mock()
    mock_repository = Mock()
    mock_calculator = Mock()
    mock_logger = Mock()

    service = OrderService(mock_validator, mock_repository, mock_calculator, mock_logger)

    service.create_order(data)

    mock_validator.validate.assert_called_once()
    mock_calculator.calculate_total.assert_called_once()
    mock_repository.save.assert_called_once()
    mock_logger.log.assert_called()

# ✅ GOOD: Mock only external boundaries
def test_order_creation():
    repository = InMemoryOrderRepository()
    service = OrderService(repository)

    order = service.create_order(data)

    assert order.total == 100
    assert repository.find(order.id) is not None

Guidelines:

Mock external systems (databases, APIs, file systems)
Use real implementations for internal collaborators
Don't verify every method call
Focus on observable behavior, not interactions

Summary: Test Design Principles

Deterministic tests: Eliminate timing issues, shared state, and external dependencies
Boundary testing: Test edges systematically where bugs hide
Expressive tests: Clear names, AAA structure, actionable failures
Focused tests: One behavior per test, small and understandable
Flexible test data: Use factories and builders for maintainable test data
Strategic mocking: Mock at system boundaries, use real implementations internally
Fast execution: Keep unit tests fast for rapid feedback loops

These practices work together to create test suites that catch bugs, enable refactoring, and remain maintainable over time.

20 KiB Raw Permalink Blame History

Test Design Best Practices

Building Stable, Deterministic Tests

Eliminating Timing Dependencies

Avoiding Shared State

Handling External Dependencies

Controlling Randomness and Time

Boundary Analysis and Equivalence Partitioning

Boundary Value Analysis

Equivalence Partitioning

Property-Based Testing

Writing Expressive, Maintainable Tests

Arrange-Act-Assert (AAA) Pattern

Descriptive Test Names

Clear, Actionable Assertions

Keep Tests Focused and Small

Test Data Management

Fixtures: Providing Test Dependencies

Factories: Flexible Test Data Creation

Builders: Constructing Complex Objects

Snapshot/Golden Master Testing

Mocking, Stubbing, and Fakes

Understanding the Distinctions

When to Use Each Type

Avoiding Over-Mocking

Summary: Test Design Principles

20 KiB

Raw Permalink Blame History