Initial commit by Zhongwei Li, 2025-11-29 17:50:49 +08:00 (commit d601a38b28): 12 changed files with 5068 additions and 0 deletions.

# Test Design Best Practices
## Building Stable, Deterministic Tests
The foundation of a reliable test suite is deterministic tests that produce consistent results. Non-deterministic or "flaky" tests erode trust and waste time debugging phantom failures.
### Eliminating Timing Dependencies
**The Problem**: Fixed sleep statements create unreliable tests that either waste time or fail intermittently.
```python
# ❌ BAD: Fixed sleep
def test_async_operation():
    start_async_task()
    time.sleep(5)  # Hope 5 seconds is enough
    assert task_completed()

# ✅ GOOD: Conditional wait
def test_async_operation():
    start_async_task()
    wait_until(lambda: task_completed(), timeout=10, interval=0.1)
    assert task_completed()
```
**Strategies**:
- Use explicit waits with conditions, not arbitrary sleeps
- Implement timeout-based polling for async operations
- Use testing framework wait utilities (e.g., Selenium WebDriverWait)
- Mock time-dependent code to make tests deterministic
- Use fast fake timers in tests instead of actual delays
**Example: Polling with Timeout**
```python
import time

def wait_until(condition, timeout=10, interval=0.1):
    """Wait until condition is true or timeout expires."""
    end_time = time.time() + timeout
    while time.time() < end_time:
        if condition():
            return True
        time.sleep(interval)
    raise TimeoutError(f"Condition not met within {timeout} seconds")

def test_message_processing():
    queue.publish(message)
    wait_until(lambda: queue.is_empty(), timeout=5)
    assert message_handler.processed_count == 1
```
### Avoiding Shared State
**The Problem**: Tests that share state create order dependencies and cascading failures.
```python
# ❌ BAD: Shared state
shared_user = None

def test_create_user():
    global shared_user
    shared_user = User.create(email="test@example.com")
    assert shared_user.id is not None

def test_user_can_login():
    # Depends on the previous test having run first
    result = login(shared_user.email, "password")
    assert result.success

# ✅ GOOD: Independent tests
from uuid import uuid4

@pytest.fixture
def user():
    """Each test gets its own user."""
    user = User.create(email=f"test-{uuid4()}@example.com")
    yield user
    user.delete()

def test_create_user(user):
    assert user.id is not None

def test_user_can_login(user):
    result = login(user.email, "password")
    assert result.success
```
**Strategies**:
- Use test fixtures that create fresh instances
- Employ database transactions that rollback after each test
- Use in-memory databases that reset between tests
- Avoid global variables and singletons in tests
- Make each test completely self-contained
**Database Isolation Pattern**
```python
@pytest.fixture
def db_transaction():
    """Each test runs in a transaction that rolls back."""
    connection = engine.connect()
    transaction = connection.begin()
    session = Session(bind=connection)
    yield session
    session.close()
    transaction.rollback()
    connection.close()

def test_user_creation(db_transaction):
    user = User(email="test@example.com")
    db_transaction.add(user)
    db_transaction.commit()
    assert db_transaction.query(User).count() == 1
    # Transaction rolls back after the test - no cleanup needed
```
### Handling External Dependencies
**The Problem**: Tests depending on external systems (APIs, databases, file systems) become non-deterministic and slow.
**Strategies**:
- **Mock external HTTP APIs**: Use libraries like `responses`, `httpretty`, or `requests-mock`
- **Use in-memory databases**: SQLite in-memory for SQL databases
- **Fake file systems**: Use `pyfakefs` or similar for file operations
- **Container-based dependencies**: Use Testcontainers for isolated, real dependencies
- **Test doubles**: Stubs, mocks, and fakes for external services
**Mocking HTTP Calls**
```python
import responses

@responses.activate
def test_fetch_user_data():
    responses.add(
        responses.GET,
        'https://api.example.com/users/123',
        json={'id': 123, 'name': 'Test User'},
        status=200
    )
    user_data = api_client.get_user(123)
    assert user_data['name'] == 'Test User'
```
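**In-Memory Database**

The in-memory database strategy can be sketched with the standard library's `sqlite3` module; the `users` table and the queries here are illustrative, not from any specific project:

```python
import sqlite3

def make_test_db():
    """Create a fresh in-memory database; it vanishes when closed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    return conn

def test_insert_and_query_user():
    conn = make_test_db()
    conn.execute("INSERT INTO users (email) VALUES (?)", ("test@example.com",))
    row = conn.execute("SELECT email FROM users").fetchone()
    assert row[0] == "test@example.com"
    conn.close()  # Database is gone - nothing to clean up
```

Each test that calls `make_test_db()` gets a completely isolated database, so tests cannot leak state into each other.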
### Controlling Randomness and Time
**Random Data**: Use fixed seeds for reproducible randomness
```python
import random

# ❌ BAD: Non-deterministic
def test_shuffle():
    items = [1, 2, 3, 4, 5]
    random.shuffle(items)  # Different result each run
    assert items[0] < items[-1]

# ✅ GOOD: Deterministic
def test_shuffle():
    random.seed(42)
    items = [1, 2, 3, 4, 5]
    random.shuffle(items)
    assert items == [2, 4, 5, 3, 1]  # Always the same order for this seed
```
**Time-Dependent Code**: Mock current time
```python
from datetime import datetime
from unittest.mock import patch

# ❌ BAD: Depends on the actual current time
def test_is_expired():
    expiry = datetime(2024, 1, 1)
    assert is_expired(expiry)  # Result depends on when the test runs

# ✅ GOOD: Freezes time
@patch('mymodule.datetime')
def test_is_expired(mock_datetime):
    mock_datetime.now.return_value = datetime(2024, 6, 1)
    expiry = datetime(2024, 1, 1)
    assert is_expired(expiry)
```
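Patching a module's `datetime` works, but injecting a clock avoids reaching into module internals. Here is a minimal sketch; the `FakeClock` class and the `is_expired(expiry, clock)` signature are illustrative assumptions, not an existing API:

```python
from datetime import datetime

class FakeClock:
    """Test double for a clock; time only moves when the test says so."""
    def __init__(self, start):
        self._now = start

    def now(self):
        return self._now

    def advance_to(self, moment):
        self._now = moment

def is_expired(expiry, clock):
    """Illustrative production function that takes the clock as a dependency."""
    return clock.now() > expiry

def test_is_expired_with_fake_clock():
    clock = FakeClock(datetime(2024, 6, 1))
    assert is_expired(datetime(2024, 1, 1), clock)      # past expiry
    assert not is_expired(datetime(2025, 1, 1), clock)  # future expiry
```

Because the clock is an ordinary constructor argument, the same test can advance time explicitly with `advance_to` instead of sleeping or patching.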
---
## Boundary Analysis and Equivalence Partitioning
Systematic approaches to test input selection ensure comprehensive coverage without exhaustive testing.
### Boundary Value Analysis
**Principle**: Bugs often occur at the edges of input ranges. Test boundaries explicitly.
**For a function accepting ages 0-120:**
- Test: -1 (below minimum)
- Test: 0 (minimum boundary)
- Test: 1 (just above minimum)
- Test: 119 (just below maximum)
- Test: 120 (maximum boundary)
- Test: 121 (above maximum)
**Example Implementation**
```python
def validate_age(age):
    if age < 0 or age > 120:
        raise ValueError("Age must be between 0 and 120")
    return True

@pytest.mark.parametrize("age,should_pass", [
    (-1, False),   # Below minimum
    (0, True),     # Minimum boundary
    (1, True),     # Just above minimum
    (60, True),    # Typical value
    (119, True),   # Just below maximum
    (120, True),   # Maximum boundary
    (121, False),  # Above maximum
])
def test_age_validation(age, should_pass):
    if should_pass:
        assert validate_age(age) is True
    else:
        with pytest.raises(ValueError):
            validate_age(age)
```
### Equivalence Partitioning
**Principle**: Divide input space into classes that should behave identically. Test one representative from each class.
**For a discount function based on purchase amount:**
- Partition 1: $0-100 (no discount)
- Partition 2: $101-500 (10% discount)
- Partition 3: $501+ (20% discount)
Test representatives: $50, $300, $600
**Example Implementation**
```python
def calculate_discount(amount):
    if amount <= 100:
        return 0
    elif amount <= 500:
        return amount * 0.10
    else:
        return amount * 0.20

@pytest.mark.parametrize("amount,expected_discount", [
    (50, 0),        # Partition 1: 0-100
    (100, 0),       # Boundary
    (101, 10.10),   # Just into partition 2
    (300, 30),      # Partition 2: 101-500
    (500, 50),      # Boundary
    (501, 100.20),  # Just into partition 3
    (1000, 200),    # Partition 3: 501+
])
def test_discount_calculation(amount, expected_discount):
    assert calculate_discount(amount) == pytest.approx(expected_discount)
```
### Property-Based Testing
**Principle**: Define invariants that should always hold, then generate hundreds of test cases automatically.
**Example: Reversing a List**
```python
from hypothesis import given
from hypothesis.strategies import lists, integers

@given(lists(integers()))
def test_reverse_twice_returns_original(items):
    """Reversing a list twice should return the original."""
    reversed_once = list(reversed(items))
    reversed_twice = list(reversed(reversed_once))
    assert reversed_twice == items

@given(lists(integers()))
def test_sorted_list_has_same_elements(items):
    """A sorted list should have the same elements as the original."""
    sorted_items = sorted(items)
    assert sorted(sorted_items) == sorted(items)
    assert len(sorted_items) == len(items)
```
**Benefits**:
- Finds edge cases you didn't think to test
- Tests many more scenarios than manual tests
- Shrinks failing examples to minimal reproducible cases
- Documents properties/invariants of the code
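The same idea can be sketched without a library: generate many inputs from a seeded random generator and check the invariant on each. This hand-rolled loop illustrates the principle only; it has none of Hypothesis's input shrinking or edge-case heuristics:

```python
import random

def test_reverse_twice_property():
    rng = random.Random(0)  # Fixed seed keeps the generated cases reproducible
    for _ in range(200):
        n = rng.randrange(0, 20)
        items = [rng.randrange(-1000, 1000) for _ in range(n)]
        # Invariant: reversing twice restores the original list
        assert list(reversed(list(reversed(items)))) == items
```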
---
## Writing Expressive, Maintainable Tests
### Arrange-Act-Assert (AAA) Pattern
**Structure**: Organize tests into three clear sections for maximum readability.
```python
def test_user_registration():
    # Arrange: Set up test data and dependencies
    email = "newuser@example.com"
    password = "SecurePass123!"
    user_service = UserService(database, email_service)

    # Act: Execute the behavior being tested
    result = user_service.register(email, password)

    # Assert: Verify expected outcomes
    assert result.success is True
    assert result.user.email == email
    assert result.user.is_verified is False
```
**Benefits**:
- Immediately clear what the test does
- Easy to identify what's being tested
- Simple to understand failure location
- Consistent structure across test suite
**Variations**:
- Use blank lines to separate sections
- Use comments to label sections
- BDD Given-When-Then is equivalent structure
### Descriptive Test Names
**Principle**: Test names should describe scenario and expected outcome without reading test code.
**Naming Patterns**:
```python
# Pattern 1: test_[function]_[scenario]_[expected_result]
def test_calculate_discount_for_vip_customer_returns_ten_percent():
    pass

# Pattern 2: test_[scenario]_[expected_result]
def test_invalid_email_raises_validation_error():
    pass

# Pattern 3: Natural language with underscores
def test_user_cannot_login_with_expired_password():
    pass

# Pattern 4: BDD-style
def test_given_vip_customer_when_ordering_then_receives_discount():
    pass
```
**Good Test Names**:
- `test_empty_cart_checkout_raises_error`
- `test_duplicate_email_registration_rejected`
- `test_expired_token_authentication_fails`
- `test_concurrent_order_updates_handled_correctly`
**Bad Test Names**:
- `test_1`, `test_2`, `test_3` (meaningless)
- `test_order` (what about orders?)
- `test_the_thing_works` (what thing? how?)
- `test_bug_fix` (what bug?)
### Clear, Actionable Assertions
**Principle**: Assertion failures should immediately communicate what went wrong.
```python
# ❌ BAD: Unclear failure message
def test_discount_calculation():
    assert calculate_discount(100) == 10
    # Output: AssertionError: assert 0 == 10
    # (Why did we expect 10? What was the input context?)

# ✅ GOOD: Clear failure message
def test_discount_calculation():
    customer = Customer(type="VIP")
    order_total = 100
    discount = calculate_discount(customer, order_total)
    assert discount == 10, \
        f"VIP customer with ${order_total} order should get $10 discount, got ${discount}"
    # Output: AssertionError: VIP customer with $100 order should get $10 discount, got $0
```
**Custom Assertion Messages**:
```python
# Complex conditions benefit from explanatory messages
assert user.is_active, \
    f"User {user.email} should be active after verification, but is_active={user.is_active}"

assert len(results) > 0, \
    f"Search for '{search_term}' should return results, got empty list"

assert response.status_code == 200, \
    f"GET /api/users/{user_id} should return 200, got {response.status_code}: {response.text}"
```
### Keep Tests Focused and Small
**Principle**: Each test should verify one behavior or logical assertion.
```python
# ❌ BAD: Testing too much
def test_user_lifecycle():
    # Create user
    user = User.create(email="test@example.com")
    assert user.id is not None
    # Activate user
    user.activate()
    assert user.is_active
    # Update profile
    user.update_profile(name="Test User")
    assert user.name == "Test User"
    # Deactivate user
    user.deactivate()
    assert not user.is_active
    # Delete user
    user.delete()
    assert User.find(user.id) is None

# ✅ GOOD: Focused tests
def test_create_user_assigns_id():
    user = User.create(email="test@example.com")
    assert user.id is not None

def test_activate_user_sets_active_flag():
    user = User.create(email="test@example.com")
    user.activate()
    assert user.is_active

def test_update_profile_changes_name():
    user = User.create(email="test@example.com")
    user.update_profile(name="Test User")
    assert user.name == "Test User"
```
**When Multiple Assertions Are OK**:
- Verifying different properties of the same object
- Checking related side effects of single action
- Asserting preconditions and postconditions
```python
# ✅ GOOD: Multiple related assertions
def test_order_creation():
    items = [Item("Widget", 10), Item("Gadget", 20)]
    order = Order.create(items)
    assert order.id is not None
    assert order.total == 30
    assert len(order.items) == 2
    assert order.status == "pending"
```
---
## Test Data Management
### Fixtures: Providing Test Dependencies
**Fixtures** provide reusable test dependencies and setup/teardown logic.
**Pytest Fixtures**
```python
@pytest.fixture
def database():
    """Provide a test database."""
    db = create_test_database()
    yield db
    db.close()

@pytest.fixture
def user(database):
    """Provide a test user."""
    user = User.create(email="test@example.com")
    yield user
    user.delete()

def test_user_can_place_order(user, database):
    order = Order.create(user=user, items=[Item("Widget", 10)])
    assert order.user_id == user.id
```
**Fixture Scopes**:
- `function`: New instance per test (default)
- `class`: Shared across test class
- `module`: Shared across test file
- `session`: Shared across entire test run
```python
@pytest.fixture(scope="session")
def app():
    """Start the application once for the entire test session."""
    app = create_app()
    app.start()
    yield app
    app.stop()
```
### Factories: Flexible Test Data Creation
**Factories** create test objects with sensible defaults that can be customized.
```python
# Factory pattern
class UserFactory:
    _counter = 0

    @classmethod
    def create(cls, **kwargs):
        cls._counter += 1
        defaults = {
            'email': f'user{cls._counter}@example.com',
            'name': f'Test User {cls._counter}',
            'role': 'user',
            'is_active': True
        }
        defaults.update(kwargs)
        return User(**defaults)

# Usage
def test_admin_user():
    admin = UserFactory.create(role='admin', name='Admin User')
    assert admin.role == 'admin'

def test_inactive_user():
    inactive = UserFactory.create(is_active=False)
    assert not inactive.is_active
```
**FactoryBoy Library**
```python
import factory

class UserFactory(factory.Factory):
    class Meta:
        model = User

    email = factory.Sequence(lambda n: f'user{n}@example.com')
    name = factory.Faker('name')
    role = 'user'
    is_active = True

# Usage
def test_user_creation():
    user = UserFactory()
    assert user.email.endswith('@example.com')

def test_admin_user():
    admin = UserFactory(role='admin')
    assert admin.role == 'admin'
```
### Builders: Constructing Complex Objects
**Builders** provide fluent APIs for creating complex test objects.
```python
class OrderBuilder:
    def __init__(self):
        self._user = None
        self._items = []
        self._status = 'pending'
        self._shipping_address = None

    def with_user(self, user):
        self._user = user
        return self

    def with_item(self, name, price, quantity=1):
        self._items.append(Item(name, price, quantity))
        return self

    def with_status(self, status):
        self._status = status
        return self

    def with_shipping(self, address):
        self._shipping_address = address
        return self

    def build(self):
        return Order(
            user=self._user,
            items=self._items,
            status=self._status,
            shipping_address=self._shipping_address
        )

# Usage
def test_order_total_calculation():
    order = (OrderBuilder()
             .with_item("Widget", 10, quantity=2)
             .with_item("Gadget", 20)
             .build())
    assert order.total == 40
```
### Snapshot/Golden Master Testing
**Principle**: Capture complex output as baseline, verify future runs match.
**Use Cases**:
- Testing rendered HTML or UI components
- Verifying generated reports or documents
- Validating complex JSON/XML output
- Testing data transformations with many fields
```python
def test_user_profile_rendering(snapshot):
    user = User(name="John Doe", email="john@example.com", age=30)
    html = render_user_profile(user)
    snapshot.assert_match(html, 'user_profile.html')

# First run: captures the output as a baseline
# Subsequent runs: compares against the baseline
# To update the baseline: pytest --snapshot-update
```
**Trade-offs**:
- **Pro**: Easy to test complex outputs
- **Pro**: Catches unintended changes
- **Con**: Changes to output format require updating all snapshots
- **Con**: Can hide real regressions if snapshots updated without review
---
## Mocking, Stubbing, and Fakes
### Understanding the Distinctions
**Stub**: Provides canned responses to method calls
```python
class StubEmailService:
    def send_email(self, to, subject, body):
        return True  # Always succeeds, doesn't actually send
```
**Mock**: Records interactions and verifies they occurred as expected
```python
from unittest.mock import Mock

email_service = Mock()
user_service = UserService(email_service)  # Inject the mock
user_service.register(user)
email_service.send_welcome_email.assert_called_once_with(user.email)
```
**Fake**: Simplified working implementation
```python
class FakeDatabase:
    def __init__(self):
        self._data = {}

    def save(self, id, value):
        self._data[id] = value

    def find(self, id):
        return self._data.get(id)
```
### When to Use Each Type
**Use Stubs** for query dependencies that provide data
```python
def test_calculate_user_discount():
    stub_repository = StubUserRepository()
    stub_repository.set_return_value(User(type="VIP"))
    discount_service = DiscountService(stub_repository)
    discount = discount_service.calculate_for_user(user_id=123)
    assert discount == 10
```
**Use Mocks** for command dependencies where interaction matters
```python
def test_order_completion_sends_confirmation():
    mock_email_service = Mock()
    order_service = OrderService(mock_email_service)
    order_service.complete_order(order_id=123)
    mock_email_service.send_confirmation.assert_called_once()
```
**Use Fakes** for complex dependencies needing realistic behavior
```python
def test_user_repository_operations():
    fake_db = FakeDatabase()
    repository = UserRepository(fake_db)
    user = User(email="test@example.com")
    repository.save(user)
    found = repository.find_by_email("test@example.com")
    assert found.email == user.email
```
### Avoiding Over-Mocking
**The Problem**: Mocking every dependency couples tests to implementation.
```python
# ❌ BAD: Over-mocking
def test_order_creation():
    mock_validator = Mock()
    mock_repository = Mock()
    mock_calculator = Mock()
    mock_logger = Mock()
    service = OrderService(mock_validator, mock_repository, mock_calculator, mock_logger)
    service.create_order(data)
    mock_validator.validate.assert_called_once()
    mock_calculator.calculate_total.assert_called_once()
    mock_repository.save.assert_called_once()
    mock_logger.log.assert_called()

# ✅ GOOD: Mock only external boundaries
def test_order_creation():
    repository = InMemoryOrderRepository()
    service = OrderService(repository)
    order = service.create_order(data)
    assert order.total == 100
    assert repository.find(order.id) is not None
```
**Guidelines**:
- Mock external systems (databases, APIs, file systems)
- Use real implementations for internal collaborators
- Don't verify every method call
- Focus on observable behavior, not interactions
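An in-memory repository fake like the one used above takes only a few lines. This is a sketch; the `save`/`find` interface and id-assignment behavior are illustrative assumptions about what an `OrderService` might expect:

```python
import itertools

class InMemoryOrderRepository:
    """Fake repository: real save/find behavior, no real database."""
    def __init__(self):
        self._orders = {}
        self._ids = itertools.count(1)

    def save(self, order):
        if getattr(order, "id", None) is None:
            order.id = next(self._ids)  # Assign an id, as a real DB would
        self._orders[order.id] = order
        return order

    def find(self, order_id):
        return self._orders.get(order_id)
```

Because the fake behaves like a tiny database, tests can assert on observable outcomes (`repository.find(order.id)`) instead of verifying which methods were called.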
---
## Summary: Test Design Principles
1. **Deterministic tests**: Eliminate timing issues, shared state, and external dependencies
2. **Boundary testing**: Test edges systematically where bugs hide
3. **Expressive tests**: Clear names, AAA structure, actionable failures
4. **Focused tests**: One behavior per test, small and understandable
5. **Flexible test data**: Use factories and builders for maintainable test data
6. **Strategic mocking**: Mock at system boundaries, use real implementations internally
7. **Fast execution**: Keep unit tests fast for rapid feedback loops
These practices work together to create test suites that catch bugs, enable refactoring, and remain maintainable over time.