# Replay Testing for Determinism and Compatibility

Comprehensive guide for validating workflow determinism and ensuring safe code changes using replay testing.

## What is Replay Testing?

**Purpose**: Verify that workflow code changes are backward-compatible with existing workflow executions.

**How it works**:
1. Temporal records every workflow decision as Event History (sketched below)
2. Replay testing re-executes the workflow code against the recorded history
3. If the new code makes the same decisions → deterministic (safe to deploy)
4. If decisions differ → non-deterministic (breaking change)
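
Concretely, a history is an ordered list of events, and replay checks that the current code re-issues commands matching them. An abridged illustration (the event names are real Temporal history event types; the `charge_payment` activity is a hypothetical example):

```
1  WorkflowExecutionStarted
2  WorkflowTaskScheduled
3  WorkflowTaskStarted
4  WorkflowTaskCompleted
5  ActivityTaskScheduled   (charge_payment)
6  ActivityTaskStarted
7  ActivityTaskCompleted
8  WorkflowTaskScheduled
...
```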

**Critical Use Cases**:
- Deploying workflow code changes to production
- Validating that refactoring doesn't break running workflows
- Automated compatibility checks in CI/CD
- Version migration validation

## Basic Replay Testing

### Replayer Setup

```python
from temporalio.worker import Replayer
from temporalio.client import Client

async def test_workflow_replay():
    """Test workflow against production history"""

    # Connect to Temporal server
    client = await Client.connect("localhost:7233")

    # Create replayer with current workflow code
    replayer = Replayer(
        workflows=[OrderWorkflow, PaymentWorkflow]
    )

    # Fetch workflow history from production
    handle = client.get_workflow_handle("order-123")
    history = await handle.fetch_history()

    # Replay history with current code
    await replayer.replay_workflow(history)
    # Success = deterministic, Exception = breaking change
```

### Testing Against Multiple Histories

```python
import pytest
from temporalio.client import Client
from temporalio.worker import Replayer

@pytest.mark.asyncio
async def test_replay_multiple_workflows():
    """Replay against multiple production histories"""

    client = await Client.connect("localhost:7233")
    replayer = Replayer(workflows=[OrderWorkflow])

    # Test against different workflow executions
    workflow_ids = [
        "order-success-123",
        "order-cancelled-456",
        "order-retry-789",
    ]

    for workflow_id in workflow_ids:
        handle = client.get_workflow_handle(workflow_id)
        history = await handle.fetch_history()

        # Replay should succeed for all variants
        await replayer.replay_workflow(history)
```
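
When the histories come straight from the server, they can also be streamed in bulk instead of fetched one ID at a time, using the SDK's `list_workflows(...).map_histories()` together with `Replayer.replay_workflows()`. A minimal sketch (the visibility query and server address are assumptions):

```python
from temporalio.client import Client
from temporalio.worker import Replayer

async def replay_recent_order_workflows():
    """Replay every matching OrderWorkflow execution in one pass"""

    client = await Client.connect("localhost:7233")

    # Stream matching executions; map_histories() lazily fetches each history
    executions = client.list_workflows("WorkflowType = 'OrderWorkflow'")
    histories = executions.map_histories()

    # Raises as soon as the current code cannot reproduce a history
    replayer = Replayer(workflows=[OrderWorkflow])
    await replayer.replay_workflows(histories)
```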

## Determinism Validation

### Common Non-Deterministic Patterns

**Problem: Random Number Generation**
```python
# ❌ Non-deterministic (breaks replay)
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> int:
        return random.randint(1, 100)  # Different on replay!

# ✅ Deterministic (safe for replay)
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> int:
        return workflow.random().randint(1, 100)  # Deterministic random
```

**Problem: Current Time**
```python
# ❌ Non-deterministic
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> str:
        now = datetime.now()  # Different on replay!
        return now.isoformat()

# ✅ Deterministic
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> str:
        now = workflow.now()  # Deterministic time
        return now.isoformat()
```

**Problem: Direct External Calls**
```python
# ❌ Non-deterministic
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> dict:
        response = requests.get("https://api.example.com/data")  # External call!
        return response.json()

# ✅ Deterministic
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> dict:
        # Use activity for external calls
        return await workflow.execute_activity(
            fetch_external_data,
            start_to_close_timeout=timedelta(seconds=30),
        )
```
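
ID generation is another frequent offender in the same spirit; the SDK provides `workflow.uuid4()` as the deterministic counterpart:

**Problem: Generated IDs**
```python
# ❌ Non-deterministic
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> str:
        return str(uuid.uuid4())  # New value on every replay!

# ✅ Deterministic
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> str:
        return str(workflow.uuid4())  # Stable across replays
```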

### Testing Determinism

```python
import pytest
from temporalio import workflow
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Worker

@pytest.mark.asyncio
async def test_workflow_determinism():
    """Verify workflow produces same output on multiple runs"""

    @workflow.defn
    class DeterministicWorkflow:
        @workflow.run
        async def run(self, seed: int) -> list[int]:
            # Use workflow.random() for determinism
            rng = workflow.random()
            rng.seed(seed)
            return [rng.randint(1, 100) for _ in range(10)]

    env = await WorkflowEnvironment.start_time_skipping()

    # Run workflow twice with same input
    results = []
    for i in range(2):
        async with Worker(
            env.client,
            task_queue="test",
            workflows=[DeterministicWorkflow],
        ):
            result = await env.client.execute_workflow(
                DeterministicWorkflow.run,
                42,  # Same seed
                id=f"determinism-test-{i}",
                task_queue="test",
            )
            results.append(result)

    await env.shutdown()

    # Verify identical outputs
    assert results[0] == results[1]
```
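
The time-skipping environment can also close the loop with an actual replay: run the workflow once, fetch its recorded history, and feed it back through the `Replayer`. A sketch that assumes `DeterministicWorkflow` from the previous example is defined at module scope and that the test server serves history fetches:

```python
import pytest
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Replayer, Worker

@pytest.mark.asyncio
async def test_workflow_replays_its_own_history():
    """Execute once, then replay the recorded history with the same code"""

    env = await WorkflowEnvironment.start_time_skipping()

    async with Worker(
        env.client,
        task_queue="test",
        workflows=[DeterministicWorkflow],
    ):
        handle = await env.client.start_workflow(
            DeterministicWorkflow.run,
            42,
            id="replay-own-history",
            task_queue="test",
        )
        await handle.result()

        # Fetch the just-recorded history and replay it against the current code
        history = await handle.fetch_history()
        await Replayer(workflows=[DeterministicWorkflow]).replay_workflow(history)

    await env.shutdown()
```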

## Production History Replay

### Exporting Workflow History

```python
from temporalio.client import Client

async def export_workflow_history(workflow_id: str, output_file: str):
    """Export workflow history for replay testing"""

    client = await Client.connect("production.temporal.io:7233")

    # Fetch workflow history
    handle = client.get_workflow_handle(workflow_id)
    history = await handle.fetch_history()

    # Save as JSON so it can be reloaded with WorkflowHistory.from_json()
    with open(output_file, "w") as f:
        f.write(history.to_json())

    print(f"Exported history to {output_file}")
```

### Replaying from File

```python
from temporalio.client import WorkflowHistory
from temporalio.worker import Replayer

async def test_replay_from_file():
    """Replay workflow from exported history file"""

    # Load history from file
    with open("workflow_histories/order-123.json") as f:
        history = WorkflowHistory.from_json("order-123", f.read())

    # Replay with current workflow code
    replayer = Replayer(workflows=[OrderWorkflow])
    await replayer.replay_workflow(history)
    # Success = safe to deploy
```

## CI/CD Integration Patterns

### GitHub Actions Example

```yaml
# .github/workflows/replay-tests.yml
name: Replay Tests

on:
  pull_request:
    branches: [main]

jobs:
  replay-tests:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest pytest-asyncio

      - name: Download production histories
        run: |
          # Fetch recent workflow histories from production
          python scripts/export_histories.py

      - name: Run replay tests
        run: |
          pytest tests/replay/ --verbose

      - name: Upload results
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: replay-failures
          path: replay-failures/
```

### Automated History Export

```python
# scripts/export_histories.py
import asyncio
from datetime import datetime, timedelta, timezone

from temporalio.client import Client

async def export_recent_histories():
    """Export recent production workflow histories"""

    client = await Client.connect("production.temporal.io:7233")

    # Query workflows closed in the last 7 days (visibility queries take RFC3339 timestamps)
    since = (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
    workflows = client.list_workflows(
        query=f"WorkflowType='OrderWorkflow' AND CloseTime > '{since}'"
    )

    count = 0
    async for execution in workflows:
        # list_workflows yields execution metadata only; fetch the full
        # history through a handle
        handle = client.get_workflow_handle(execution.id, run_id=execution.run_id)
        history = await handle.fetch_history()

        # Save to file
        filename = f"workflow_histories/{execution.id}.json"
        with open(filename, "w") as f:
            f.write(history.to_json())

        count += 1
        if count >= 100:  # Limit to 100 most recent
            break

    print(f"Exported {count} workflow histories")

if __name__ == "__main__":
    asyncio.run(export_recent_histories())
```

### Replay Test Suite

```python
# tests/replay/test_workflow_replay.py
import glob

import pytest
from temporalio.client import WorkflowHistory
from temporalio.worker import Replayer

from workflows import OrderWorkflow, PaymentWorkflow

@pytest.mark.asyncio
async def test_replay_all_histories():
    """Replay all exported production histories"""

    replayer = Replayer(
        workflows=[OrderWorkflow, PaymentWorkflow]
    )

    # Load all history files
    history_files = glob.glob("workflow_histories/*.json")

    failures = []
    for history_file in history_files:
        try:
            with open(history_file) as f:
                history = WorkflowHistory.from_json(history_file, f.read())

            await replayer.replay_workflow(history)
            print(f"✓ {history_file}")

        except Exception as e:
            failures.append((history_file, str(e)))
            print(f"✗ {history_file}: {e}")

    # Report failures
    if failures:
        pytest.fail(
            f"Replay failed for {len(failures)} workflows:\n"
            + "\n".join(f"  {file}: {error}" for file, error in failures)
        )
```

## Version Compatibility Testing

### Testing Code Evolution

The Python SDK expresses workflow versioning with patch markers (`workflow.patched()` and `workflow.deprecate_patch()`) rather than a `get_version()` call:

```python
import pytest
from temporalio import workflow
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Worker

@pytest.mark.asyncio
async def test_workflow_version_compatibility():
    """Test workflow behavior across a patched code change"""

    @workflow.defn
    class EvolvingWorkflow:
        @workflow.run
        async def run(self) -> str:
            # Use patching for safe code evolution
            if workflow.patched("new-return-value"):
                # New behavior (new executions record the patch marker)
                return "version-2"
            else:
                # Old behavior (replays of pre-patch histories)
                return "version-1"

    env = await WorkflowEnvironment.start_time_skipping()

    # New executions take the patched code path
    async with Worker(
        env.client,
        task_queue="test",
        workflows=[EvolvingWorkflow],
    ):
        result = await env.client.execute_workflow(
            EvolvingWorkflow.run,
            id="evolving-v2",
            task_queue="test",
        )
        assert result == "version-2"

    # The old code path is exercised by replaying pre-patch histories
    # with the Replayer (see the replay sections above)

    await env.shutdown()
```

### Migration Strategy

```python
# Each phase below is the same workflow class at a different point in time;
# distinct names are used here only so the phases can sit side by side.

# Phase 1: Gate the new logic behind a patch marker
@workflow.defn
class MigratingWorkflow:
    @workflow.run
    async def run(self) -> dict:
        if workflow.patched("new-logic"):
            # New logic (new workflows)
            return await self._new_implementation()
        else:
            # Old logic (workflows started before the patch)
            return await self._old_implementation()

# Phase 2: Once no pre-patch workflows can still replay, deprecate the patch
# (histories that recorded the marker remain compatible)
@workflow.defn
class DeprecatingWorkflow:
    @workflow.run
    async def run(self) -> dict:
        workflow.deprecate_patch("new-logic")
        return await self._new_implementation()

# Phase 3: After histories carrying the marker are no longer replayable, drop it
@workflow.defn
class MigratedWorkflow:
    @workflow.run
    async def run(self) -> dict:
        # Only new logic remains
        return await self._new_implementation()
```

## Best Practices

1. **Replay Before Deploy**: Always run replay tests before deploying workflow changes
2. **Export Regularly**: Continuously export production histories for testing
3. **CI/CD Integration**: Run automated replay tests in pull request checks
4. **Version Tracking**: Use workflow.patched() / workflow.deprecate_patch() for safe code evolution
5. **History Retention**: Keep representative workflow histories for regression testing (see the sketch after this list)
6. **Determinism**: Never call random(), datetime.now(), or external services directly in workflow code; use workflow.random(), workflow.now(), and activities instead
7. **Comprehensive Testing**: Test against various workflow execution paths
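
For practices 5 and 7, parametrizing a test over the retained history files gives one test case (and one failure report) per history. A sketch that assumes the JSON export layout used above:

```python
# tests/replay/test_history_regression.py
import glob

import pytest
from temporalio.client import WorkflowHistory
from temporalio.worker import Replayer

from workflows import OrderWorkflow, PaymentWorkflow

HISTORY_FILES = sorted(glob.glob("workflow_histories/*.json"))

@pytest.mark.asyncio
@pytest.mark.parametrize("history_file", HISTORY_FILES)
async def test_replay_history(history_file: str):
    """Each retained history becomes its own regression test case"""
    with open(history_file) as f:
        history = WorkflowHistory.from_json(history_file, f.read())

    await Replayer(workflows=[OrderWorkflow, PaymentWorkflow]).replay_workflow(history)
```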

## Common Replay Errors

**Non-Deterministic Error**:
```
WorkflowNonDeterministicError: Workflow command mismatch at position 5
Expected: ScheduleActivityTask(activity_id='activity-1')
Got: ScheduleActivityTask(activity_id='activity-2')
```

**Solution**: The code change altered the workflow's command sequence; gate the change behind a patch or revert it
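
The most common trigger is adding or reordering a command-producing call (activity, timer, child workflow) in code that running workflows will still replay. Gating the new call behind a patch preserves the old command sequence; a short sketch (the `charge_payment` and `send_receipt` activities are hypothetical):

```python
@workflow.defn
class OrderWorkflow:
    @workflow.run
    async def run(self) -> None:
        await workflow.execute_activity(
            charge_payment,
            start_to_close_timeout=timedelta(seconds=30),
        )

        # Activity added after executions already recorded history: only
        # patched (new) executions schedule it, so old histories still match
        if workflow.patched("send-receipt"):
            await workflow.execute_activity(
                send_receipt,
                start_to_close_timeout=timedelta(seconds=30),
            )
```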

**Patch Marker Mismatch**: Removing a workflow.patched() call (or changing its patch ID) while histories that recorded the marker can still replay also surfaces as a non-determinism error

**Solution**: Follow the patched() → deprecate_patch() → removal sequence shown in the Migration Strategy section

## Additional Resources

- Replay Testing: docs.temporal.io/develop/python/testing-suite#replay-testing
- Workflow Versioning: docs.temporal.io/workflows#versioning
- Determinism Guide: docs.temporal.io/workflows#deterministic-constraints
- CI/CD Integration: github.com/temporalio/samples-python/tree/main/.github/workflows