# Replay Testing for Determinism and Compatibility

Comprehensive guide for validating workflow determinism and ensuring safe code changes using replay testing.

## What is Replay Testing?

**Purpose**: Verify that workflow code changes are backward-compatible with existing workflow executions.

**How it works**:
1. Temporal records every workflow decision as Event History (sketched below)
2. Replay testing re-executes the workflow code against the recorded history
3. If the new code makes the same decisions → deterministic (safe to deploy)
4. If decisions differ → non-deterministic (breaking change)
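
Concretely, a history is an ordered list of events, and replay checks that the current code re-issues commands matching them. An abridged illustration (the event names are real Temporal history event types; the `charge_payment` activity is a hypothetical example):

```
1  WorkflowExecutionStarted
2  WorkflowTaskScheduled
3  WorkflowTaskStarted
4  WorkflowTaskCompleted
5  ActivityTaskScheduled   (charge_payment)
6  ActivityTaskStarted
7  ActivityTaskCompleted
8  WorkflowTaskScheduled
...
```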

**Critical Use Cases**:
- Deploying workflow code changes to production
- Validating that refactoring doesn't break running workflows
- Automated compatibility checks in CI/CD
- Version migration validation

## Basic Replay Testing

### Replayer Setup

```python
from temporalio.worker import Replayer
from temporalio.client import Client

async def test_workflow_replay():
    """Test workflow against production history"""

    # Connect to Temporal server
    client = await Client.connect("localhost:7233")

    # Create replayer with current workflow code
    replayer = Replayer(
        workflows=[OrderWorkflow, PaymentWorkflow]
    )

    # Fetch workflow history from production
    handle = client.get_workflow_handle("order-123")
    history = await handle.fetch_history()

    # Replay history with current code
    await replayer.replay_workflow(history)
    # Success = deterministic, Exception = breaking change
```

### Testing Against Multiple Histories

```python
import pytest
from temporalio.client import Client
from temporalio.worker import Replayer

@pytest.mark.asyncio
async def test_replay_multiple_workflows():
    """Replay against multiple production histories"""

    client = await Client.connect("localhost:7233")
    replayer = Replayer(workflows=[OrderWorkflow])

    # Test against different workflow executions
    workflow_ids = [
        "order-success-123",
        "order-cancelled-456",
        "order-retry-789",
    ]

    for workflow_id in workflow_ids:
        handle = client.get_workflow_handle(workflow_id)
        history = await handle.fetch_history()

        # Replay should succeed for all variants
        await replayer.replay_workflow(history)
```
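
When the histories come straight from the server, they can also be streamed in bulk instead of fetched one ID at a time, using the SDK's `list_workflows(...).map_histories()` together with `Replayer.replay_workflows()`. A minimal sketch (the visibility query and server address are assumptions):

```python
from temporalio.client import Client
from temporalio.worker import Replayer

async def replay_recent_order_workflows():
    """Replay every matching OrderWorkflow execution in one pass"""

    client = await Client.connect("localhost:7233")

    # Stream matching executions; map_histories() lazily fetches each history
    executions = client.list_workflows("WorkflowType = 'OrderWorkflow'")
    histories = executions.map_histories()

    # Raises as soon as the current code cannot reproduce a history
    replayer = Replayer(workflows=[OrderWorkflow])
    await replayer.replay_workflows(histories)
```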

## Determinism Validation

### Common Non-Deterministic Patterns

**Problem: Random Number Generation**
```python
# ❌ Non-deterministic (breaks replay)
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> int:
        return random.randint(1, 100)  # Different on replay!

# ✅ Deterministic (safe for replay)
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> int:
        return workflow.random().randint(1, 100)  # Deterministic random
```

**Problem: Current Time**
```python
# ❌ Non-deterministic
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> str:
        now = datetime.now()  # Different on replay!
        return now.isoformat()

# ✅ Deterministic
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> str:
        now = workflow.now()  # Deterministic time
        return now.isoformat()
```

**Problem: Direct External Calls**
```python
# ❌ Non-deterministic
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> dict:
        response = requests.get("https://api.example.com/data")  # External call!
        return response.json()

# ✅ Deterministic
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> dict:
        # Use activity for external calls
        return await workflow.execute_activity(
            fetch_external_data,
            start_to_close_timeout=timedelta(seconds=30),
        )
```
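
ID generation is another frequent offender in the same spirit; the SDK provides `workflow.uuid4()` as the deterministic counterpart:

**Problem: Generated IDs**
```python
# ❌ Non-deterministic
@workflow.defn
class BadWorkflow:
    @workflow.run
    async def run(self) -> str:
        return str(uuid.uuid4())  # New value on every replay!

# ✅ Deterministic
@workflow.defn
class GoodWorkflow:
    @workflow.run
    async def run(self) -> str:
        return str(workflow.uuid4())  # Stable across replays
```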

### Testing Determinism

```python
import pytest
from temporalio import workflow
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Worker

@pytest.mark.asyncio
async def test_workflow_determinism():
    """Verify workflow produces same output on multiple runs"""

    @workflow.defn
    class DeterministicWorkflow:
        @workflow.run
        async def run(self, seed: int) -> list[int]:
            # Use workflow.random() for determinism
            rng = workflow.random()
            rng.seed(seed)
            return [rng.randint(1, 100) for _ in range(10)]

    env = await WorkflowEnvironment.start_time_skipping()

    # Run workflow twice with same input
    results = []
    for i in range(2):
        async with Worker(
            env.client,
            task_queue="test",
            workflows=[DeterministicWorkflow],
        ):
            result = await env.client.execute_workflow(
                DeterministicWorkflow.run,
                42,  # Same seed
                id=f"determinism-test-{i}",
                task_queue="test",
            )
            results.append(result)

    await env.shutdown()

    # Verify identical outputs
    assert results[0] == results[1]
```
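
The time-skipping environment can also close the loop with an actual replay: run the workflow once, fetch its recorded history, and feed it back through the `Replayer`. A sketch that assumes `DeterministicWorkflow` from the previous example is defined at module scope and that the test server serves history fetches:

```python
import pytest
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Replayer, Worker

@pytest.mark.asyncio
async def test_workflow_replays_its_own_history():
    """Execute once, then replay the recorded history with the same code"""

    env = await WorkflowEnvironment.start_time_skipping()

    async with Worker(
        env.client,
        task_queue="test",
        workflows=[DeterministicWorkflow],
    ):
        handle = await env.client.start_workflow(
            DeterministicWorkflow.run,
            42,
            id="replay-own-history",
            task_queue="test",
        )
        await handle.result()

        # Fetch the just-recorded history and replay it against the current code
        history = await handle.fetch_history()
        await Replayer(workflows=[DeterministicWorkflow]).replay_workflow(history)

    await env.shutdown()
```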

## Production History Replay

### Exporting Workflow History

```python
from temporalio.client import Client

async def export_workflow_history(workflow_id: str, output_file: str):
    """Export workflow history for replay testing"""

    client = await Client.connect("production.temporal.io:7233")

    # Fetch workflow history
    handle = client.get_workflow_handle(workflow_id)
    history = await handle.fetch_history()

    # Save as JSON so it can be reloaded with WorkflowHistory.from_json()
    with open(output_file, "w") as f:
        f.write(history.to_json())

    print(f"Exported history to {output_file}")
```

### Replaying from File

```python
from temporalio.client import WorkflowHistory
from temporalio.worker import Replayer

async def test_replay_from_file():
    """Replay workflow from exported history file"""

    # Load history from file
    with open("workflow_histories/order-123.json") as f:
        history = WorkflowHistory.from_json("order-123", f.read())

    # Replay with current workflow code
    replayer = Replayer(workflows=[OrderWorkflow])
    await replayer.replay_workflow(history)
    # Success = safe to deploy
```

## CI/CD Integration Patterns

### GitHub Actions Example

```yaml
# .github/workflows/replay-tests.yml
name: Replay Tests

on:
  pull_request:
    branches: [main]

jobs:
  replay-tests:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest pytest-asyncio

      - name: Download production histories
        run: |
          # Fetch recent workflow histories from production
          python scripts/export_histories.py

      - name: Run replay tests
        run: |
          pytest tests/replay/ --verbose

      - name: Upload results
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: replay-failures
          path: replay-failures/
```

### Automated History Export

```python
# scripts/export_histories.py
import asyncio
from datetime import datetime, timedelta, timezone

from temporalio.client import Client

async def export_recent_histories():
    """Export recent production workflow histories"""

    client = await Client.connect("production.temporal.io:7233")

    # Query workflows closed in the last 7 days (visibility queries take RFC3339 timestamps)
    since = (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
    workflows = client.list_workflows(
        query=f"WorkflowType='OrderWorkflow' AND CloseTime > '{since}'"
    )

    count = 0
    async for execution in workflows:
        # list_workflows yields execution metadata only; fetch the full
        # history through a handle
        handle = client.get_workflow_handle(execution.id, run_id=execution.run_id)
        history = await handle.fetch_history()

        # Save to file
        filename = f"workflow_histories/{execution.id}.json"
        with open(filename, "w") as f:
            f.write(history.to_json())

        count += 1
        if count >= 100:  # Limit to 100 most recent
            break

    print(f"Exported {count} workflow histories")

if __name__ == "__main__":
    asyncio.run(export_recent_histories())
```

### Replay Test Suite

```python
# tests/replay/test_workflow_replay.py
import glob

import pytest
from temporalio.client import WorkflowHistory
from temporalio.worker import Replayer

from workflows import OrderWorkflow, PaymentWorkflow

@pytest.mark.asyncio
async def test_replay_all_histories():
    """Replay all exported production histories"""

    replayer = Replayer(
        workflows=[OrderWorkflow, PaymentWorkflow]
    )

    # Load all history files
    history_files = glob.glob("workflow_histories/*.json")

    failures = []
    for history_file in history_files:
        try:
            with open(history_file) as f:
                history = WorkflowHistory.from_json(history_file, f.read())

            await replayer.replay_workflow(history)
            print(f"✓ {history_file}")

        except Exception as e:
            failures.append((history_file, str(e)))
            print(f"✗ {history_file}: {e}")

    # Report failures
    if failures:
        pytest.fail(
            f"Replay failed for {len(failures)} workflows:\n"
            + "\n".join(f"  {file}: {error}" for file, error in failures)
        )
```

## Version Compatibility Testing

### Testing Code Evolution

The Python SDK expresses workflow versioning with patch markers (`workflow.patched()` and `workflow.deprecate_patch()`) rather than a `get_version()` call:

```python
import pytest
from temporalio import workflow
from temporalio.testing import WorkflowEnvironment
from temporalio.worker import Worker

@pytest.mark.asyncio
async def test_workflow_version_compatibility():
    """Test workflow behavior across a patched code change"""

    @workflow.defn
    class EvolvingWorkflow:
        @workflow.run
        async def run(self) -> str:
            # Use patching for safe code evolution
            if workflow.patched("new-return-value"):
                # New behavior (new executions record the patch marker)
                return "version-2"
            else:
                # Old behavior (replays of pre-patch histories)
                return "version-1"

    env = await WorkflowEnvironment.start_time_skipping()

    # New executions take the patched code path
    async with Worker(
        env.client,
        task_queue="test",
        workflows=[EvolvingWorkflow],
    ):
        result = await env.client.execute_workflow(
            EvolvingWorkflow.run,
            id="evolving-v2",
            task_queue="test",
        )
        assert result == "version-2"

    # The old code path is exercised by replaying pre-patch histories
    # with the Replayer (see the replay sections above)

    await env.shutdown()
```

### Migration Strategy

```python
# Each phase below is the same workflow class at a different point in time;
# distinct names are used here only so the phases can sit side by side.

# Phase 1: Gate the new logic behind a patch marker
@workflow.defn
class MigratingWorkflow:
    @workflow.run
    async def run(self) -> dict:
        if workflow.patched("new-logic"):
            # New logic (new workflows)
            return await self._new_implementation()
        else:
            # Old logic (workflows started before the patch)
            return await self._old_implementation()

# Phase 2: Once no pre-patch workflows can still replay, deprecate the patch
# (histories that recorded the marker remain compatible)
@workflow.defn
class DeprecatingWorkflow:
    @workflow.run
    async def run(self) -> dict:
        workflow.deprecate_patch("new-logic")
        return await self._new_implementation()

# Phase 3: After histories carrying the marker are no longer replayable, drop it
@workflow.defn
class MigratedWorkflow:
    @workflow.run
    async def run(self) -> dict:
        # Only new logic remains
        return await self._new_implementation()
```

## Best Practices

1. **Replay Before Deploy**: Always run replay tests before deploying workflow changes
2. **Export Regularly**: Continuously export production histories for testing
3. **CI/CD Integration**: Run automated replay tests in pull request checks
4. **Version Tracking**: Use workflow.patched() / workflow.deprecate_patch() for safe code evolution
5. **History Retention**: Keep representative workflow histories for regression testing (see the sketch after this list)
6. **Determinism**: Never call random(), datetime.now(), or external services directly in workflow code; use workflow.random(), workflow.now(), and activities instead
7. **Comprehensive Testing**: Test against various workflow execution paths
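
For practices 5 and 7, parametrizing a test over the retained history files gives one test case (and one failure report) per history. A sketch that assumes the JSON export layout used above:

```python
# tests/replay/test_history_regression.py
import glob

import pytest
from temporalio.client import WorkflowHistory
from temporalio.worker import Replayer

from workflows import OrderWorkflow, PaymentWorkflow

HISTORY_FILES = sorted(glob.glob("workflow_histories/*.json"))

@pytest.mark.asyncio
@pytest.mark.parametrize("history_file", HISTORY_FILES)
async def test_replay_history(history_file: str):
    """Each retained history becomes its own regression test case"""
    with open(history_file) as f:
        history = WorkflowHistory.from_json(history_file, f.read())

    await Replayer(workflows=[OrderWorkflow, PaymentWorkflow]).replay_workflow(history)
```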

## Common Replay Errors

**Non-Deterministic Error**:
```
WorkflowNonDeterministicError: Workflow command mismatch at position 5
Expected: ScheduleActivityTask(activity_id='activity-1')
Got: ScheduleActivityTask(activity_id='activity-2')
```

**Solution**: The code change altered the workflow's command sequence; gate the change behind a patch or revert it
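
The most common trigger is adding or reordering a command-producing call (activity, timer, child workflow) in code that running workflows will still replay. Gating the new call behind a patch preserves the old command sequence; a short sketch (the `charge_payment` and `send_receipt` activities are hypothetical):

```python
@workflow.defn
class OrderWorkflow:
    @workflow.run
    async def run(self) -> None:
        await workflow.execute_activity(
            charge_payment,
            start_to_close_timeout=timedelta(seconds=30),
        )

        # Activity added after executions already recorded history: only
        # patched (new) executions schedule it, so old histories still match
        if workflow.patched("send-receipt"):
            await workflow.execute_activity(
                send_receipt,
                start_to_close_timeout=timedelta(seconds=30),
            )
```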

**Patch Marker Mismatch**: Removing a workflow.patched() call (or changing its patch ID) while histories that recorded the marker can still replay also surfaces as a non-determinism error

**Solution**: Follow the patched() → deprecate_patch() → removal sequence shown in the Migration Strategy section

## Additional Resources

- Replay Testing: docs.temporal.io/develop/python/testing-suite#replay-testing
- Workflow Versioning: docs.temporal.io/workflows#versioning
- Determinism Guide: docs.temporal.io/workflows#deterministic-constraints
- CI/CD Integration: github.com/temporalio/samples-python/tree/main/.github/workflows