Initial commit
This commit is contained in:
200
skills/spring-boot/spring-boot-saga-pattern/SKILL.md
Normal file
200
skills/spring-boot/spring-boot-saga-pattern/SKILL.md
Normal file
@@ -0,0 +1,200 @@
|
||||
---
|
||||
name: spring-boot-saga-pattern
|
||||
description: Implement distributed transactions using the Saga Pattern in Spring Boot microservices. Use when building microservices requiring transaction management across multiple services, handling compensating transactions, ensuring eventual consistency, or implementing choreography or orchestration-based sagas with Spring Boot, Kafka, or Axon Framework.
|
||||
allowed-tools: Read, Write, Bash
|
||||
category: backend
|
||||
tags: [spring-boot, saga, distributed-transactions, choreography, orchestration, microservices]
|
||||
version: 1.1.0
|
||||
---
|
||||
|
||||
# Spring Boot Saga Pattern
|
||||
|
||||
## When to Use
|
||||
|
||||
Implement this skill when:
|
||||
|
||||
- Building distributed transactions across multiple microservices
|
||||
- Needing to replace two-phase commit (2PC) with a more scalable solution
|
||||
- Handling transaction rollback when a service fails in multi-service workflows
|
||||
- Ensuring eventual consistency in microservices architecture
|
||||
- Implementing compensating transactions for failed operations
|
||||
- Coordinating complex business processes spanning multiple services
|
||||
- Choosing between choreography-based and orchestration-based saga approaches
|
||||
|
||||
**Trigger phrases**: distributed transactions, saga pattern, compensating transactions, microservices transaction, eventual consistency, rollback across services, orchestration pattern, choreography pattern
|
||||
|
||||
## Overview
|
||||
|
||||
The **Saga Pattern** is an architectural pattern for managing distributed transactions in microservices. Instead of using a single ACID transaction across multiple databases, a saga breaks the transaction into a sequence of local transactions. Each local transaction updates its database and publishes an event or message to trigger the next step. If a step fails, the saga executes **compensating transactions** to undo the changes made by previous steps.
|
||||
|
||||
### Key Architectural Decisions
|
||||
|
||||
When implementing a saga, make these decisions:
|
||||
|
||||
1. **Approach Selection**: Choose between **choreography-based** (event-driven, decoupled) or **orchestration-based** (centralized control, easier to track)
|
||||
2. **Messaging Platform**: Select Kafka, RabbitMQ, or Spring Cloud Stream
|
||||
3. **Framework**: Use Axon Framework, Eventuate Tram, Camunda, or Apache Camel
|
||||
4. **State Persistence**: Store saga state in database for recovery and debugging
|
||||
5. **Idempotency**: Ensure all operations (especially compensations) are idempotent and retryable
|
||||
|
||||
## Two Approaches to Implement Saga
|
||||
|
||||
### Choreography-Based Saga
|
||||
|
||||
Each microservice publishes events and listens to events from other services. **No central coordinator**.
|
||||
|
||||
**Best for**: Greenfield microservice applications with few participants
|
||||
|
||||
**Advantages**:
|
||||
- Simple for small number of services
|
||||
- Loose coupling between services
|
||||
- No single point of failure
|
||||
|
||||
**Disadvantages**:
|
||||
- Difficult to track workflow state
|
||||
- Hard to troubleshoot and maintain
|
||||
- Complexity grows with number of services
|
||||
|
||||
### Orchestration-Based Saga
|
||||
|
||||
A **central orchestrator** manages the entire transaction flow and tells services what to do.
|
||||
|
||||
**Best for**: Brownfield applications, complex workflows, or when centralized control is needed
|
||||
|
||||
**Advantages**:
|
||||
- Centralized visibility and monitoring
|
||||
- Easier to troubleshoot and maintain
|
||||
- Clear transaction flow
|
||||
- Simplified error handling
|
||||
- Better for complex workflows
|
||||
|
||||
**Disadvantages**:
|
||||
- Orchestrator can become single point of failure
|
||||
- Additional infrastructure component
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
### Step 1: Define Transaction Flow
|
||||
|
||||
Identify the sequence of operations and corresponding compensating transactions:
|
||||
|
||||
```
|
||||
Order → Payment → Inventory → Shipment → Notification
|
||||
↓ ↓ ↓ ↓ ↓
|
||||
Cancel Refund Release Cancel Cancel
|
||||
```
|
||||
|
||||
### Step 2: Choose Implementation Approach
|
||||
|
||||
- **Choreography**: Spring Cloud Stream with Kafka or RabbitMQ
|
||||
- **Orchestration**: Axon Framework, Eventuate Tram, Camunda, or Apache Camel
|
||||
|
||||
### Step 3: Implement Services with Local Transactions
|
||||
|
||||
Each service handles its local ACID transaction and publishes events or responds to commands.
|
||||
|
||||
### Step 4: Implement Compensating Transactions
|
||||
|
||||
Every forward transaction must have a corresponding compensating transaction. Ensure **idempotency** and **retryability**.
|
||||
|
||||
### Step 5: Handle Failure Scenarios
|
||||
|
||||
Implement retry logic, timeouts, and dead-letter queues for failed messages.
|
||||
|
||||
## Best Practices
|
||||
|
||||
### Design Principles
|
||||
|
||||
1. **Idempotency**: Ensure compensating transactions execute safely multiple times
|
||||
2. **Retryability**: Design operations to handle retries without side effects
|
||||
3. **Atomicity**: Each local transaction must be atomic within its service
|
||||
4. **Isolation**: Handle concurrent saga executions properly
|
||||
5. **Eventual Consistency**: Accept that data becomes consistent over time
|
||||
|
||||
### Service Design
|
||||
|
||||
- Use **constructor injection** exclusively (never field injection)
|
||||
- Implement services as **stateless** components
|
||||
- Store saga state in persistent store (database or event store)
|
||||
- Use **immutable DTOs** (Java records preferred)
|
||||
- Separate domain logic from infrastructure concerns
|
||||
|
||||
### Error Handling
|
||||
|
||||
- Implement **circuit breakers** for service calls
|
||||
- Use **dead-letter queues** for failed messages
|
||||
- Log all saga events for debugging and monitoring
|
||||
- Implement **timeout mechanisms** for long-running sagas
|
||||
- Design **semantic locks** to prevent concurrent updates
|
||||
|
||||
### Testing
|
||||
|
||||
- Test happy path scenarios
|
||||
- Test each failure scenario and its compensation
|
||||
- Test concurrent saga executions
|
||||
- Test idempotency of compensating transactions
|
||||
- Use Testcontainers for integration testing
|
||||
|
||||
### Monitoring and Observability
|
||||
|
||||
- Track saga execution status and duration
|
||||
- Monitor compensation transaction execution
|
||||
- Alert on stuck or failed sagas
|
||||
- Use distributed tracing (Spring Cloud Sleuth, Zipkin)
|
||||
- Implement health checks for saga coordinators
|
||||
|
||||
## Technology Stack
|
||||
|
||||
**Spring Boot 3.x** with dependencies:
|
||||
|
||||
**Messaging**: Spring Cloud Stream, Apache Kafka, RabbitMQ, Spring AMQP
|
||||
|
||||
**Saga Frameworks**: Axon Framework (4.9.0), Eventuate Tram Sagas, Camunda, Apache Camel
|
||||
|
||||
**Persistence**: Spring Data JPA, Event Sourcing (optional), Transactional Outbox Pattern
|
||||
|
||||
**Monitoring**: Spring Boot Actuator, Micrometer, Distributed Tracing (Sleuth + Zipkin)
|
||||
|
||||
## Anti-Patterns to Avoid
|
||||
|
||||
❌ **Tight Coupling**: Services directly calling each other instead of using events
|
||||
❌ **Missing Compensations**: Not implementing compensating transactions for every step
|
||||
❌ **Non-Idempotent Operations**: Compensations that cannot be safely retried
|
||||
❌ **Synchronous Sagas**: Waiting synchronously for each step (defeats the purpose)
|
||||
❌ **Lost Messages**: Not handling message delivery failures
|
||||
❌ **No Monitoring**: Running sagas without visibility into their status
|
||||
❌ **Shared Database**: Using same database across multiple services
|
||||
❌ **Ignoring Network Failures**: Not handling partial failures gracefully
|
||||
|
||||
## When NOT to Use Saga Pattern
|
||||
|
||||
Do not implement this pattern when:
|
||||
|
||||
- Single service transactions (use local ACID transactions instead)
|
||||
- Strong consistency is required (consider monolith or shared database)
|
||||
- Simple CRUD operations without cross-service dependencies
|
||||
- Low transaction volume with simple flows
|
||||
- Team lacks experience with distributed systems
|
||||
|
||||
## References
|
||||
|
||||
For detailed information, consult the following resources:
|
||||
|
||||
- [Saga Pattern Definition](references/01-saga-pattern-definition.md)
|
||||
- [Choreography-Based Implementation](references/02-choreography-implementation.md)
|
||||
- [Orchestration-Based Implementation](references/03-orchestration-implementation.md)
|
||||
- [Event-Driven Architecture](references/04-event-driven-architecture.md)
|
||||
- [Compensating Transactions](references/05-compensating-transactions.md)
|
||||
- [State Management](references/06-state-management.md)
|
||||
- [Error Handling and Retry](references/07-error-handling-retry.md)
|
||||
- [Testing Strategies](references/08-testing-strategies.md)
|
||||
- [Common Pitfalls and Solutions](references/09-pitfalls-solutions.md)
|
||||
|
||||
See also [examples.md](references/examples.md) for complete implementation examples:
|
||||
|
||||
- E-Commerce Order Processing (orchestration with Axon Framework)
|
||||
- Food Delivery Application (choreography with Kafka and Spring Cloud Stream)
|
||||
- Travel Booking System (complex orchestration with multiple compensations)
|
||||
- Banking Transfer System
|
||||
- Real-world microservices patterns
|
||||
|
||||
@@ -0,0 +1,68 @@
|
||||
# Saga Pattern Definition
|
||||
|
||||
## What is a Saga?
|
||||
|
||||
A **Saga** is a sequence of local transactions where each transaction updates data within a single service. Each local transaction publishes an event or message that triggers the next local transaction in the saga. If a local transaction fails, the saga executes compensating transactions to undo the changes made by preceding transactions.
|
||||
|
||||
## Key Characteristics
|
||||
|
||||
**Distributed Transactions**: Spans multiple microservices, each with its own database.
|
||||
|
||||
**Local Transactions**: Each service performs its own ACID transaction.
|
||||
|
||||
**Event-Driven**: Services communicate through events or commands.
|
||||
|
||||
**Compensations**: Rollback mechanism using compensating transactions.
|
||||
|
||||
**Eventual Consistency**: System reaches a consistent state over time.
|
||||
|
||||
## Saga vs Two-Phase Commit (2PC)
|
||||
|
||||
| Feature | Saga Pattern | Two-Phase Commit |
|
||||
|---------|-------------|------------------|
|
||||
| Locking | No distributed locks | Requires locks during commit |
|
||||
| Performance | Better performance | Performance bottleneck |
|
||||
| Scalability | Highly scalable | Limited scalability |
|
||||
| Complexity | Business logic complexity | Protocol complexity |
|
||||
| Failure Handling | Compensating transactions | Automatic rollback |
|
||||
| Isolation | Lower isolation | Full isolation |
|
||||
| NoSQL Support | Yes | No |
|
||||
| Microservices Fit | Excellent | Poor |
|
||||
|
||||
## ACID vs BASE
|
||||
|
||||
**ACID** (Traditional Databases):
|
||||
- **A**tomicity: All or nothing
|
||||
- **C**onsistency: Valid state transitions
|
||||
- **I**solation: Concurrent transactions don't interfere
|
||||
- **D**urability: Committed data persists
|
||||
|
||||
**BASE** (Saga Pattern):
|
||||
- **B**asically **A**vailable: System is available most of the time
|
||||
- **S**oft state: State may change over time
|
||||
- **E**ventual consistency: System becomes consistent eventually
|
||||
|
||||
## When to Use Saga Pattern
|
||||
|
||||
Use the saga pattern when:
|
||||
- Building distributed transactions across multiple microservices
|
||||
- Needing to replace 2PC with a more scalable solution
|
||||
- Services need to maintain eventual consistency
|
||||
- Handling long-running processes spanning multiple services
|
||||
- Implementing compensating transactions for failed operations
|
||||
|
||||
## When NOT to Use Saga Pattern
|
||||
|
||||
Avoid the saga pattern when:
|
||||
- Single service transactions (use local ACID transactions)
|
||||
- Strong consistency is required immediately
|
||||
- Simple CRUD operations without cross-service dependencies
|
||||
- Low transaction volume with simple flows
|
||||
- Team lacks experience with distributed systems
|
||||
|
||||
## Migration Path
|
||||
|
||||
Many organizations migrate from traditional monolithic systems or 2PC-based systems to sagas:
|
||||
|
||||
1. **From Monolith to Saga**: Identify transaction boundaries, extract services gradually, implement sagas incrementally
|
||||
2. **From 2PC to Saga**: Analyze existing 2PC transactions, design compensating transactions, implement sagas in parallel, monitor and compare results before full migration
|
||||
@@ -0,0 +1,153 @@
|
||||
# Choreography-Based Saga Implementation
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
In choreography-based sagas, each service produces and listens to events. Services know what to do when they receive an event. **No central coordinator** manages the flow.
|
||||
|
||||
```
|
||||
Service A → Event → Service B → Event → Service C
|
||||
↓ ↓ ↓
|
||||
Event Event Event
|
||||
↓ ↓ ↓
|
||||
Compensation Compensation Compensation
|
||||
```
|
||||
|
||||
## Event Flow
|
||||
|
||||
### Success Path
|
||||
|
||||
1. **Order Service** creates order → publishes `OrderCreated` event
|
||||
2. **Payment Service** listens → processes payment → publishes `PaymentProcessed` event
|
||||
3. **Inventory Service** listens → reserves inventory → publishes `InventoryReserved` event
|
||||
4. **Shipment Service** listens → prepares shipment → publishes `ShipmentPrepared` event
|
||||
|
||||
### Failure Path (When Payment Fails)
|
||||
|
||||
1. **Payment Service** publishes `PaymentFailed` event
|
||||
2. **Order Service** listens → cancels order → publishes `OrderCancelled` event
|
||||
3. All other services respond to cancellation with cleanup
|
||||
|
||||
## Event Publisher
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class OrderEventPublisher {
|
||||
private final StreamBridge streamBridge;
|
||||
|
||||
public OrderEventPublisher(StreamBridge streamBridge) {
|
||||
this.streamBridge = streamBridge;
|
||||
}
|
||||
|
||||
public void publishOrderCreatedEvent(String orderId, BigDecimal amount, String itemId) {
|
||||
OrderCreatedEvent event = new OrderCreatedEvent(orderId, amount, itemId);
|
||||
streamBridge.send("orderCreated-out-0",
|
||||
MessageBuilder
|
||||
.withPayload(event)
|
||||
.setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
|
||||
.build());
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Listener
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class PaymentEventListener {
|
||||
|
||||
@Bean
|
||||
public Consumer<OrderCreatedEvent> handleOrderCreatedEvent() {
|
||||
return event -> processPayment(event.getOrderId());
|
||||
}
|
||||
|
||||
private void processPayment(String orderId) {
|
||||
// Payment processing logic
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Classes
|
||||
|
||||
```java
|
||||
public record OrderCreatedEvent(
|
||||
String orderId,
|
||||
BigDecimal amount,
|
||||
String itemId
|
||||
) {}
|
||||
|
||||
public record PaymentProcessedEvent(
|
||||
String paymentId,
|
||||
String orderId,
|
||||
String itemId
|
||||
) {}
|
||||
|
||||
public record PaymentFailedEvent(
|
||||
String paymentId,
|
||||
String orderId,
|
||||
String itemId,
|
||||
String reason
|
||||
) {}
|
||||
```
|
||||
|
||||
## Spring Cloud Stream Configuration
|
||||
|
||||
```yaml
|
||||
spring:
|
||||
cloud:
|
||||
stream:
|
||||
bindings:
|
||||
orderCreated-out-0:
|
||||
destination: order-events
|
||||
paymentProcessed-out-0:
|
||||
destination: payment-events
|
||||
paymentFailed-out-0:
|
||||
destination: payment-events
|
||||
kafka:
|
||||
binder:
|
||||
brokers: localhost:9092
|
||||
```
|
||||
|
||||
## Maven Dependencies
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>org.springframework.cloud</groupId>
|
||||
<artifactId>spring-cloud-stream</artifactId>
|
||||
</dependency>
|
||||
<dependency>
|
||||
<groupId>org.springframework.cloud</groupId>
|
||||
<artifactId>spring-cloud-stream-binder-kafka</artifactId>
|
||||
</dependency>
|
||||
```
|
||||
|
||||
## Gradle Dependencies
|
||||
|
||||
```groovy
|
||||
implementation 'org.springframework.cloud:spring-cloud-stream'
|
||||
implementation 'org.springframework.cloud:spring-cloud-stream-binder-kafka'
|
||||
```
|
||||
|
||||
## Advantages and Disadvantages
|
||||
|
||||
### Advantages
|
||||
|
||||
- **Simple** for small number of services
|
||||
- **Loose coupling** between services
|
||||
- **No single point of failure**
|
||||
- Each service is independently deployable
|
||||
|
||||
### Disadvantages
|
||||
|
||||
- **Difficult to track workflow state** - distributed across services
|
||||
- **Hard to troubleshoot** - following event flow is complex
|
||||
- **Complexity grows** with number of services
|
||||
- **Distributed source of truth** - saga state not centralized
|
||||
|
||||
## When to Use Choreography
|
||||
|
||||
Use choreography-based sagas when:
|
||||
- Microservices are few in number (< 5 services per saga)
|
||||
- Loose coupling is critical
|
||||
- Team is experienced with event-driven architecture
|
||||
- System can handle eventual consistency
|
||||
- Workflow doesn't need centralized monitoring
|
||||
@@ -0,0 +1,232 @@
|
||||
# Orchestration-Based Saga Implementation
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
A **central orchestrator** (Saga Coordinator) manages the entire transaction flow, sending commands to services and handling responses.
|
||||
|
||||
```
|
||||
Saga Orchestrator
|
||||
/ | \
|
||||
Service A Service B Service C
|
||||
```
|
||||
|
||||
## Orchestrator Responsibilities
|
||||
|
||||
1. **Command Dispatch**: Send commands to services
|
||||
2. **Response Handling**: Process service responses
|
||||
3. **State Management**: Track saga execution state
|
||||
4. **Compensation Coordination**: Trigger compensating transactions on failure
|
||||
5. **Timeout Management**: Handle service timeouts
|
||||
6. **Retry Logic**: Manage retry attempts
|
||||
|
||||
## Axon Framework Implementation
|
||||
|
||||
### Saga Class
|
||||
|
||||
```java
|
||||
@Saga
|
||||
public class OrderSaga {
|
||||
|
||||
@Autowired
|
||||
private transient CommandGateway commandGateway;
|
||||
|
||||
@StartSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCreatedEvent event) {
|
||||
String paymentId = UUID.randomUUID().toString();
|
||||
ProcessPaymentCommand command = new ProcessPaymentCommand(
|
||||
paymentId,
|
||||
event.getOrderId(),
|
||||
event.getAmount(),
|
||||
event.getItemId()
|
||||
);
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentProcessedEvent event) {
|
||||
ReserveInventoryCommand command = new ReserveInventoryCommand(
|
||||
event.getOrderId(),
|
||||
event.getItemId()
|
||||
);
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentFailedEvent event) {
|
||||
CancelOrderCommand command = new CancelOrderCommand(event.getOrderId());
|
||||
commandGateway.send(command);
|
||||
end();
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(InventoryReservedEvent event) {
|
||||
PrepareShipmentCommand command = new PrepareShipmentCommand(
|
||||
event.getOrderId(),
|
||||
event.getItemId()
|
||||
);
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@EndSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCompletedEvent event) {
|
||||
// Saga completed successfully
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Aggregate for Order Service
|
||||
|
||||
```java
|
||||
@Aggregate
|
||||
public class OrderAggregate {
|
||||
|
||||
@AggregateIdentifier
|
||||
private String orderId;
|
||||
|
||||
private OrderStatus status;
|
||||
|
||||
public OrderAggregate() {
|
||||
}
|
||||
|
||||
@CommandHandler
|
||||
public OrderAggregate(CreateOrderCommand command) {
|
||||
apply(new OrderCreatedEvent(
|
||||
command.getOrderId(),
|
||||
command.getAmount(),
|
||||
command.getItemId()
|
||||
));
|
||||
}
|
||||
|
||||
@EventSourcingHandler
|
||||
public void on(OrderCreatedEvent event) {
|
||||
this.orderId = event.getOrderId();
|
||||
this.status = OrderStatus.PENDING;
|
||||
}
|
||||
|
||||
@CommandHandler
|
||||
public void handle(CancelOrderCommand command) {
|
||||
apply(new OrderCancelledEvent(command.getOrderId()));
|
||||
}
|
||||
|
||||
@EventSourcingHandler
|
||||
public void on(OrderCancelledEvent event) {
|
||||
this.status = OrderStatus.CANCELLED;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Aggregate for Payment Service
|
||||
|
||||
```java
|
||||
@Aggregate
|
||||
public class PaymentAggregate {
|
||||
|
||||
@AggregateIdentifier
|
||||
private String paymentId;
|
||||
|
||||
public PaymentAggregate() {
|
||||
}
|
||||
|
||||
@CommandHandler
|
||||
public PaymentAggregate(ProcessPaymentCommand command) {
|
||||
this.paymentId = command.getPaymentId();
|
||||
|
||||
if (command.getAmount().compareTo(BigDecimal.ZERO) <= 0) {
|
||||
apply(new PaymentFailedEvent(
|
||||
command.getPaymentId(),
|
||||
command.getOrderId(),
|
||||
command.getItemId(),
|
||||
"Payment amount must be greater than zero"
|
||||
));
|
||||
} else {
|
||||
apply(new PaymentProcessedEvent(
|
||||
command.getPaymentId(),
|
||||
command.getOrderId(),
|
||||
command.getItemId()
|
||||
));
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Axon Configuration
|
||||
|
||||
```yaml
|
||||
axon:
|
||||
serializer:
|
||||
general: jackson
|
||||
events: jackson
|
||||
messages: jackson
|
||||
eventhandling:
|
||||
processors:
|
||||
order-processor:
|
||||
mode: tracking
|
||||
source: eventBus
|
||||
axonserver:
|
||||
enabled: false
|
||||
```
|
||||
|
||||
## Maven Dependencies for Axon
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>org.axonframework</groupId>
|
||||
<artifactId>axon-spring-boot-starter</artifactId>
|
||||
<version>4.9.0</version> // Use latest stable version
|
||||
</dependency>
|
||||
```
|
||||
|
||||
## Advantages and Disadvantages
|
||||
|
||||
### Advantages
|
||||
|
||||
- **Centralized visibility** - easy to see workflow status
|
||||
- **Easier to troubleshoot** - single place to analyze flow
|
||||
- **Clear transaction flow** - orchestrator defines sequence
|
||||
- **Simplified error handling** - centralized compensation logic
|
||||
- **Better for complex workflows** - easier to manage many steps
|
||||
|
||||
### Disadvantages
|
||||
|
||||
- **Orchestrator becomes single point of failure** - can be mitigated with clustering
|
||||
- **Additional infrastructure component** - more complexity in deployment
|
||||
- **Potential tight coupling** - if orchestrator knows too much about services
|
||||
|
||||
## Eventuate Tram Sagas
|
||||
|
||||
Eventuate Tram is an alternative to Axon for orchestration-based sagas:
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>io.eventuate.tram.sagas</groupId>
|
||||
<artifactId>eventuate-tram-sagas-spring-starter</artifactId>
|
||||
<version>0.28.0</version> // Use latest stable version
|
||||
</dependency>
|
||||
```
|
||||
|
||||
## Camunda for BPMN-Based Orchestration
|
||||
|
||||
Use Camunda when visual workflow design is beneficial:
|
||||
|
||||
**Features**:
|
||||
- Visual workflow design
|
||||
- BPMN 2.0 standard
|
||||
- Human tasks support
|
||||
- Complex workflow modeling
|
||||
|
||||
**Use When**:
|
||||
- Business process modeling needed
|
||||
- Visual workflow design preferred
|
||||
- Human approval steps required
|
||||
- Complex orchestration logic
|
||||
|
||||
## When to Use Orchestration
|
||||
|
||||
Use orchestration-based sagas when:
|
||||
- Building brownfield applications with existing microservices
|
||||
- Handling complex workflows with many steps
|
||||
- Centralized control and monitoring is critical
|
||||
- Organization wants clear visibility into saga execution
|
||||
- Need for human intervention in workflow
|
||||
@@ -0,0 +1,262 @@
|
||||
# Event-Driven Architecture in Sagas
|
||||
|
||||
## Event Types
|
||||
|
||||
### Domain Events
|
||||
|
||||
Represent business facts that happened within a service:
|
||||
|
||||
```java
|
||||
public record OrderCreatedEvent(
|
||||
String orderId,
|
||||
Instant createdAt,
|
||||
BigDecimal amount
|
||||
) implements DomainEvent {}
|
||||
```
|
||||
|
||||
### Integration Events
|
||||
|
||||
Communication between bounded contexts (microservices):
|
||||
|
||||
```java
|
||||
public record PaymentRequestedEvent(
|
||||
String orderId,
|
||||
String paymentId,
|
||||
BigDecimal amount
|
||||
) implements IntegrationEvent {}
|
||||
```
|
||||
|
||||
### Command Events
|
||||
|
||||
Request for action by another service:
|
||||
|
||||
```java
|
||||
public record ProcessPaymentCommand(
|
||||
String paymentId,
|
||||
String orderId,
|
||||
BigDecimal amount
|
||||
) {}
|
||||
```
|
||||
|
||||
## Event Versioning
|
||||
|
||||
Handle event schema evolution using versioning:
|
||||
|
||||
```java
|
||||
public record OrderCreatedEventV1(
|
||||
String orderId,
|
||||
BigDecimal amount
|
||||
) {}
|
||||
|
||||
public record OrderCreatedEventV2(
|
||||
String orderId,
|
||||
BigDecimal amount,
|
||||
String customerId,
|
||||
Instant timestamp
|
||||
) {}
|
||||
|
||||
// Event Upcaster
|
||||
public class OrderEventUpcaster implements EventUpcaster {
|
||||
@Override
|
||||
public Stream<IntermediateEventRepresentation> upcast(
|
||||
Stream<IntermediateEventRepresentation> eventStream) {
|
||||
|
||||
return eventStream.map(event -> {
|
||||
if (event.getType().getName().equals("OrderCreatedEventV1")) {
|
||||
return upcastV1ToV2(event);
|
||||
}
|
||||
return event;
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Store
|
||||
|
||||
Store all events for audit trail and recovery:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
@Table(name = "saga_events")
|
||||
public class SagaEvent {
|
||||
|
||||
@Id
|
||||
@GeneratedValue(strategy = GenerationType.IDENTITY)
|
||||
private Long id;
|
||||
|
||||
@Column(nullable = false)
|
||||
private String sagaId;
|
||||
|
||||
@Column(nullable = false)
|
||||
private String eventType;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String payload;
|
||||
|
||||
@Column(nullable = false)
|
||||
private Instant timestamp;
|
||||
|
||||
@Column(nullable = false)
|
||||
private Integer version;
|
||||
}
|
||||
```
|
||||
|
||||
## Event Publishing Patterns
|
||||
|
||||
### Outbox Pattern (Transactional)
|
||||
|
||||
Ensure atomic update of database and event publishing:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
private final OrderRepository orderRepository;
|
||||
private final OutboxRepository outboxRepository;
|
||||
|
||||
@Transactional
|
||||
public void createOrder(CreateOrderRequest request) {
|
||||
// 1. Create and save order
|
||||
Order order = new Order(...);
|
||||
orderRepository.save(order);
|
||||
|
||||
// 2. Create outbox entry in same transaction
|
||||
OutboxEntry entry = new OutboxEntry(
|
||||
"OrderCreated",
|
||||
order.getId(),
|
||||
new OrderCreatedEvent(...)
|
||||
);
|
||||
outboxRepository.save(entry);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class OutboxPoller {
|
||||
|
||||
@Scheduled(fixedDelay = 1000)
|
||||
public void pollAndPublish() {
|
||||
List<OutboxEntry> unpublished = outboxRepository.findUnpublished();
|
||||
|
||||
unpublished.forEach(entry -> {
|
||||
eventPublisher.publish(entry.getEvent());
|
||||
outboxRepository.markAsPublished(entry.getId());
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Direct Publishing Pattern
|
||||
|
||||
Publish events immediately after transaction:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
private final OrderRepository orderRepository;
|
||||
private final EventPublisher eventPublisher;
|
||||
|
||||
@Transactional
|
||||
public void createOrder(CreateOrderRequest request) {
|
||||
Order order = new Order(...);
|
||||
orderRepository.save(order);
|
||||
|
||||
// Publish event after transaction commits
|
||||
TransactionSynchronizationManager.registerSynchronization(
|
||||
new TransactionSynchronization() {
|
||||
@Override
|
||||
public void afterCommit() {
|
||||
eventPublisher.publish(new OrderCreatedEvent(...));
|
||||
}
|
||||
}
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Sourcing
|
||||
|
||||
Store all state changes as events instead of current state:
|
||||
|
||||
**Benefits**:
|
||||
- Complete audit trail
|
||||
- Time-travel debugging
|
||||
- Natural fit for sagas
|
||||
- Event replay for recovery
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
public class Order {
|
||||
|
||||
@Id
|
||||
private String orderId;
|
||||
|
||||
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
|
||||
private List<DomainEvent> events = new ArrayList<>();
|
||||
|
||||
public void createOrder(...) {
|
||||
apply(new OrderCreatedEvent(...));
|
||||
}
|
||||
|
||||
protected void apply(DomainEvent event) {
|
||||
if (event instanceof OrderCreatedEvent e) {
|
||||
this.orderId = e.orderId();
|
||||
this.status = OrderStatus.PENDING;
|
||||
}
|
||||
events.add(event);
|
||||
}
|
||||
|
||||
public List<DomainEvent> getUncommittedEvents() {
|
||||
return new ArrayList<>(events);
|
||||
}
|
||||
|
||||
public void clearUncommittedEvents() {
|
||||
events.clear();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Ordering and Consistency
|
||||
|
||||
### Maintain Event Order
|
||||
|
||||
Use partitioning to maintain order within a saga:
|
||||
|
||||
```java
|
||||
@Bean
|
||||
public ProducerFactory<String, Object> producerFactory() {
|
||||
Map<String, Object> config = new HashMap<>();
|
||||
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
|
||||
StringSerializer.class);
|
||||
return new DefaultKafkaProducerFactory<>(config);
|
||||
}
|
||||
|
||||
@Service
|
||||
public class EventPublisher {
|
||||
|
||||
private final KafkaTemplate<String, Object> kafkaTemplate;
|
||||
|
||||
public void publish(DomainEvent event) {
|
||||
// Use sagaId as key to maintain order
|
||||
kafkaTemplate.send("events", event.getSagaId(), event);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Handle Out-of-Order Events
|
||||
|
||||
Use saga state to detect and handle out-of-order events:
|
||||
|
||||
```java
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentProcessedEvent event) {
|
||||
if (saga.getStatus() != SagaStatus.AWAITING_PAYMENT) {
|
||||
// Out of order event, ignore or queue for retry
|
||||
logger.warn("Unexpected event in state: {}", saga.getStatus());
|
||||
return;
|
||||
}
|
||||
// Process event
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,299 @@
|
||||
# Compensating Transactions
|
||||
|
||||
## Design Principles
|
||||
|
||||
### Idempotency
|
||||
|
||||
Execute multiple times with same result:
|
||||
|
||||
```java
|
||||
public void cancelPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId)
|
||||
.orElse(null);
|
||||
|
||||
if (payment == null) {
|
||||
// Already cancelled or doesn't exist
|
||||
return;
|
||||
}
|
||||
|
||||
if (payment.getStatus() == PaymentStatus.CANCELLED) {
|
||||
// Already cancelled, idempotent
|
||||
return;
|
||||
}
|
||||
|
||||
payment.setStatus(PaymentStatus.CANCELLED);
|
||||
paymentRepository.save(payment);
|
||||
|
||||
// Refund logic here
|
||||
}
|
||||
```
|
||||
|
||||
### Retryability
|
||||
|
||||
Design operations to handle retries without side effects:
|
||||
|
||||
```java
|
||||
@Retryable(
|
||||
value = {TransientException.class},
|
||||
maxAttempts = 3,
|
||||
backoff = @Backoff(delay = 1000, multiplier = 2)
|
||||
)
|
||||
public void releaseInventory(String itemId, int quantity) {
|
||||
// Use set operations for idempotency
|
||||
InventoryItem item = inventoryRepository.findById(itemId)
|
||||
.orElseThrow();
|
||||
|
||||
item.increaseAvailableQuantity(quantity);
|
||||
inventoryRepository.save(item);
|
||||
}
|
||||
```
|
||||
|
||||
## Compensation Strategies
|
||||
|
||||
### Backward Recovery
|
||||
|
||||
Undo completed steps in reverse order:
|
||||
|
||||
```java
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentFailedEvent event) {
|
||||
logger.error("Payment failed, initiating compensation");
|
||||
|
||||
// Step 1: Cancel shipment preparation
|
||||
commandGateway.send(new CancelShipmentCommand(event.getOrderId()));
|
||||
|
||||
// Step 2: Release inventory
|
||||
commandGateway.send(new ReleaseInventoryCommand(event.getOrderId()));
|
||||
|
||||
// Step 3: Cancel order
|
||||
commandGateway.send(new CancelOrderCommand(event.getOrderId()));
|
||||
|
||||
end();
|
||||
}
|
||||
```
|
||||
|
||||
### Forward Recovery
|
||||
|
||||
Retry failed operation with exponential backoff:
|
||||
|
||||
```java
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentTransientFailureEvent event) {
|
||||
if (event.getRetryCount() < MAX_RETRIES) {
|
||||
// Retry payment with backoff
|
||||
ProcessPaymentCommand retryCommand = new ProcessPaymentCommand(
|
||||
event.getPaymentId(),
|
||||
event.getOrderId(),
|
||||
event.getAmount()
|
||||
);
|
||||
commandGateway.send(retryCommand);
|
||||
} else {
|
||||
// After max retries, compensate
|
||||
handlePaymentFailure(event);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Semantic Lock Pattern
|
||||
|
||||
Prevent concurrent modifications during saga execution:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
public class Order {
|
||||
@Id
|
||||
private String orderId;
|
||||
|
||||
@Enumerated(EnumType.STRING)
|
||||
private OrderStatus status;
|
||||
|
||||
@Version
|
||||
private Long version;
|
||||
|
||||
private Instant lockedUntil;
|
||||
|
||||
public boolean tryLock(Duration lockDuration) {
|
||||
if (isLocked()) {
|
||||
return false;
|
||||
}
|
||||
this.lockedUntil = Instant.now().plus(lockDuration);
|
||||
return true;
|
||||
}
|
||||
|
||||
public boolean isLocked() {
|
||||
return lockedUntil != null &&
|
||||
Instant.now().isBefore(lockedUntil);
|
||||
}
|
||||
|
||||
public void unlock() {
|
||||
this.lockedUntil = null;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Compensation in Axon Framework
|
||||
|
||||
```java
|
||||
@Saga
|
||||
public class OrderSaga {
|
||||
|
||||
private String orderId;
|
||||
private String paymentId;
|
||||
private String inventoryId;
|
||||
private boolean compensating = false;
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(InventoryReservationFailedEvent event) {
|
||||
logger.error("Inventory reservation failed");
|
||||
compensating = true;
|
||||
|
||||
// Compensate: refund payment
|
||||
RefundPaymentCommand refundCommand = new RefundPaymentCommand(
|
||||
paymentId,
|
||||
event.getOrderId(),
|
||||
event.getReservedAmount(),
|
||||
"Inventory unavailable"
|
||||
);
|
||||
|
||||
commandGateway.send(refundCommand);
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentRefundedEvent event) {
|
||||
if (!compensating) return;
|
||||
|
||||
logger.info("Payment refunded, cancelling order");
|
||||
|
||||
// Compensate: cancel order
|
||||
CancelOrderCommand command = new CancelOrderCommand(
|
||||
event.getOrderId(),
|
||||
"Inventory unavailable - payment refunded"
|
||||
);
|
||||
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@EndSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCancelledEvent event) {
|
||||
logger.info("Saga completed with compensation");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Handling Compensation Failures
|
||||
|
||||
Handle cases where compensation itself fails:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class CompensationService {
|
||||
|
||||
private final DeadLetterQueueService dlqService;
|
||||
|
||||
public void handleCompensationFailure(String sagaId, String step, Exception cause) {
|
||||
logger.error("Compensation failed for saga {} at step {}", sagaId, step, cause);
|
||||
|
||||
// Send to dead letter queue for manual intervention
|
||||
dlqService.send(new FailedCompensation(
|
||||
sagaId,
|
||||
step,
|
||||
cause.getMessage(),
|
||||
Instant.now()
|
||||
));
|
||||
|
||||
// Create alert for operations team
|
||||
alertingService.alert(
|
||||
"Compensation Failure",
|
||||
"Saga " + sagaId + " failed compensation at " + step
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Compensation
|
||||
|
||||
Verify that compensation produces expected results:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldCompensateWhenPaymentFails() {
|
||||
String orderId = "order-123";
|
||||
String paymentId = "payment-456";
|
||||
|
||||
// Arrange: execute payment
|
||||
Payment payment = new Payment(paymentId, orderId, BigDecimal.TEN);
|
||||
paymentRepository.save(payment);
|
||||
orderRepository.save(new Order(orderId, OrderStatus.PENDING));
|
||||
|
||||
// Act: compensate
|
||||
paymentService.cancelPayment(paymentId);
|
||||
|
||||
// Assert: verify idempotency
|
||||
paymentService.cancelPayment(paymentId);
|
||||
|
||||
Payment result = paymentRepository.findById(paymentId).orElseThrow();
|
||||
assertThat(result.getStatus()).isEqualTo(PaymentStatus.CANCELLED);
|
||||
}
|
||||
```
|
||||
|
||||
## Common Compensation Patterns
|
||||
|
||||
### Inventory Release
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class InventoryService {
|
||||
|
||||
public void releaseInventory(String orderId) {
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
|
||||
order.getItems().forEach(item -> {
|
||||
InventoryItem inventoryItem = inventoryRepository
|
||||
.findById(item.getProductId())
|
||||
.orElseThrow();
|
||||
|
||||
inventoryItem.increaseAvailableQuantity(item.getQuantity());
|
||||
inventoryRepository.save(inventoryItem);
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Payment Refund
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class PaymentService {
|
||||
|
||||
public void refundPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId)
|
||||
.orElseThrow();
|
||||
|
||||
if (payment.getStatus() == PaymentStatus.PROCESSED) {
|
||||
payment.setStatus(PaymentStatus.REFUNDED);
|
||||
paymentGateway.refund(payment.getTransactionId());
|
||||
paymentRepository.save(payment);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Order Cancellation
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
public void cancelOrder(String orderId, String reason) {
|
||||
Order order = orderRepository.findById(orderId)
|
||||
.orElseThrow();
|
||||
|
||||
order.setStatus(OrderStatus.CANCELLED);
|
||||
order.setCancellationReason(reason);
|
||||
order.setCancelledAt(Instant.now());
|
||||
|
||||
orderRepository.save(order);
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,294 @@
|
||||
# State Management in Sagas
|
||||
|
||||
## Saga State Entity
|
||||
|
||||
Persist saga state for recovery and monitoring:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
@Table(name = "saga_state")
|
||||
public class SagaState {
|
||||
|
||||
@Id
|
||||
private String sagaId;
|
||||
|
||||
@Enumerated(EnumType.STRING)
|
||||
private SagaStatus status;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String currentStep;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String compensationSteps;
|
||||
|
||||
private Instant startedAt;
|
||||
private Instant completedAt;
|
||||
|
||||
@Version
|
||||
private Long version;
|
||||
}
|
||||
|
||||
public enum SagaStatus {
|
||||
STARTED,
|
||||
PROCESSING,
|
||||
COMPENSATING,
|
||||
COMPLETED,
|
||||
FAILED,
|
||||
CANCELLED
|
||||
}
|
||||
```
|
||||
|
||||
## Saga State Machine with Spring Statemachine
|
||||
|
||||
Define saga state transitions explicitly:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
@EnableStateMachine
|
||||
public class SagaStateMachineConfig
|
||||
extends StateMachineConfigurerAdapter<SagaStatus, SagaEvent> {
|
||||
|
||||
@Override
|
||||
public void configure(
|
||||
StateMachineStateConfigurer<SagaStatus, SagaEvent> states)
|
||||
throws Exception {
|
||||
|
||||
states
|
||||
.withStates()
|
||||
.initial(SagaStatus.STARTED)
|
||||
.states(EnumSet.allOf(SagaStatus.class))
|
||||
.end(SagaStatus.COMPLETED)
|
||||
.end(SagaStatus.FAILED);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void configure(
|
||||
StateMachineTransitionConfigurer<SagaStatus, SagaEvent> transitions)
|
||||
throws Exception {
|
||||
|
||||
transitions
|
||||
.withExternal()
|
||||
.source(SagaStatus.STARTED)
|
||||
.target(SagaStatus.PROCESSING)
|
||||
.event(SagaEvent.ORDER_CREATED)
|
||||
.and()
|
||||
.withExternal()
|
||||
.source(SagaStatus.PROCESSING)
|
||||
.target(SagaStatus.COMPLETED)
|
||||
.event(SagaEvent.ALL_STEPS_COMPLETED)
|
||||
.and()
|
||||
.withExternal()
|
||||
.source(SagaStatus.PROCESSING)
|
||||
.target(SagaStatus.COMPENSATING)
|
||||
.event(SagaEvent.STEP_FAILED)
|
||||
.and()
|
||||
.withExternal()
|
||||
.source(SagaStatus.COMPENSATING)
|
||||
.target(SagaStatus.FAILED)
|
||||
.event(SagaEvent.COMPENSATION_COMPLETED);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## State Transitions
|
||||
|
||||
### Successful Saga Flow
|
||||
|
||||
```
|
||||
STARTED → PROCESSING → COMPLETED
|
||||
```
|
||||
|
||||
### Failed Saga with Compensation
|
||||
|
||||
```
|
||||
STARTED → PROCESSING → COMPENSATING → FAILED
|
||||
```
|
||||
|
||||
### Saga with Retry
|
||||
|
||||
```
|
||||
STARTED → PROCESSING → PROCESSING (retry) → COMPLETED
|
||||
```
|
||||
|
||||
## Persisting Saga Context
|
||||
|
||||
Store context data for saga execution:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
@Table(name = "saga_context")
|
||||
public class SagaContext {
|
||||
|
||||
@Id
|
||||
private String sagaId;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String contextData; // JSON-serialized
|
||||
|
||||
private Instant createdAt;
|
||||
private Instant updatedAt;
|
||||
|
||||
public <T> T getContextData(Class<T> type) {
|
||||
return JsonUtils.fromJson(contextData, type);
|
||||
}
|
||||
|
||||
public void setContextData(Object data) {
|
||||
this.contextData = JsonUtils.toJson(data);
|
||||
}
|
||||
}
|
||||
|
||||
@Service
|
||||
public class SagaContextService {
|
||||
|
||||
private final SagaContextRepository repository;
|
||||
|
||||
public void saveContext(String sagaId, Object context) {
|
||||
SagaContext sagaContext = new SagaContext(sagaId);
|
||||
sagaContext.setContextData(context);
|
||||
repository.save(sagaContext);
|
||||
}
|
||||
|
||||
public <T> T loadContext(String sagaId, Class<T> type) {
|
||||
return repository.findById(sagaId)
|
||||
.map(ctx -> ctx.getContextData(type))
|
||||
.orElseThrow(() -> new SagaContextNotFoundException(sagaId));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Handling Saga Timeouts
|
||||
|
||||
Detect and handle sagas that exceed expected duration:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class SagaTimeoutHandler {
|
||||
|
||||
private final SagaStateRepository repository;
|
||||
private static final Duration MAX_SAGA_DURATION = Duration.ofMinutes(30);
|
||||
|
||||
@Scheduled(fixedDelay = 60000) // Check every minute
|
||||
public void detectTimeouts() {
|
||||
Instant timeout = Instant.now().minus(MAX_SAGA_DURATION);
|
||||
|
||||
List<SagaState> timedOutSagas = repository
|
||||
.findByStatusAndStartedAtBefore(SagaStatus.PROCESSING, timeout);
|
||||
|
||||
timedOutSagas.forEach(saga -> {
|
||||
logger.warn("Saga {} timed out", saga.getSagaId());
|
||||
compensateSaga(saga);
|
||||
});
|
||||
}
|
||||
|
||||
private void compensateSaga(SagaState saga) {
|
||||
saga.setStatus(SagaStatus.COMPENSATING);
|
||||
repository.save(saga);
|
||||
// Trigger compensation logic
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Saga Recovery
|
||||
|
||||
Recover sagas from failures:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class SagaRecoveryService {
|
||||
|
||||
private final SagaStateRepository stateRepository;
|
||||
private final CommandGateway commandGateway;
|
||||
|
||||
@Scheduled(fixedDelay = 30000) // Check every 30 seconds
|
||||
public void recoverFailedSagas() {
|
||||
List<SagaState> failedSagas = stateRepository
|
||||
.findByStatus(SagaStatus.FAILED);
|
||||
|
||||
failedSagas.forEach(saga -> {
|
||||
if (canBeRetried(saga)) {
|
||||
logger.info("Retrying saga {}", saga.getSagaId());
|
||||
retrySaga(saga);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
private boolean canBeRetried(SagaState saga) {
|
||||
return saga.getRetryCount() < 3;
|
||||
}
|
||||
|
||||
private void retrySaga(SagaState saga) {
|
||||
saga.setStatus(SagaStatus.STARTED);
|
||||
saga.setRetryCount(saga.getRetryCount() + 1);
|
||||
stateRepository.save(saga);
|
||||
// Send retry command
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Saga State Query
|
||||
|
||||
Query sagas for monitoring:
|
||||
|
||||
```java
|
||||
@Repository
|
||||
public interface SagaStateRepository extends JpaRepository<SagaState, String> {
|
||||
|
||||
List<SagaState> findByStatus(SagaStatus status);
|
||||
|
||||
List<SagaState> findByStatusAndStartedAtBefore(
|
||||
SagaStatus status, Instant before);
|
||||
|
||||
Page<SagaState> findByStatus(SagaStatus status, Pageable pageable);
|
||||
|
||||
long countByStatus(SagaStatus status);
|
||||
|
||||
long countByStatusAndStartedAtBefore(SagaStatus status, Instant before);
|
||||
}
|
||||
|
||||
@RestController
|
||||
@RequestMapping("/api/sagas")
|
||||
public class SagaMonitoringController {
|
||||
|
||||
private final SagaStateRepository repository;
|
||||
|
||||
@GetMapping("/status/{status}")
|
||||
public List<SagaState> getSagasByStatus(
|
||||
@PathVariable SagaStatus status) {
|
||||
return repository.findByStatus(status);
|
||||
}
|
||||
|
||||
@GetMapping("/stuck")
|
||||
public List<SagaState> getStuckSagas() {
|
||||
Instant oneHourAgo = Instant.now().minus(Duration.ofHours(1));
|
||||
return repository.findByStatusAndStartedAtBefore(
|
||||
SagaStatus.PROCESSING, oneHourAgo);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Database Schema for State Management
|
||||
|
||||
```sql
|
||||
CREATE TABLE saga_state (
|
||||
saga_id VARCHAR(255) PRIMARY KEY,
|
||||
status VARCHAR(50) NOT NULL,
|
||||
current_step TEXT,
|
||||
compensation_steps TEXT,
|
||||
started_at TIMESTAMP NOT NULL,
|
||||
completed_at TIMESTAMP,
|
||||
version BIGINT,
|
||||
INDEX idx_status (status),
|
||||
INDEX idx_started_at (started_at)
|
||||
);
|
||||
|
||||
CREATE TABLE saga_context (
|
||||
saga_id VARCHAR(255) PRIMARY KEY,
|
||||
context_data LONGTEXT,
|
||||
created_at TIMESTAMP NOT NULL,
|
||||
updated_at TIMESTAMP,
|
||||
FOREIGN KEY (saga_id) REFERENCES saga_state(saga_id)
|
||||
);
|
||||
|
||||
CREATE INDEX idx_saga_state_status_started
|
||||
ON saga_state(status, started_at);
|
||||
```
|
||||
@@ -0,0 +1,323 @@
|
||||
# Error Handling and Retry Strategies
|
||||
|
||||
## Retry Configuration
|
||||
|
||||
Use Spring Retry for automatic retry logic:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
@EnableRetry
|
||||
public class RetryConfig {
|
||||
|
||||
@Bean
|
||||
public RetryTemplate retryTemplate() {
|
||||
RetryTemplate retryTemplate = new RetryTemplate();
|
||||
|
||||
FixedBackOffPolicy backOffPolicy = new FixedBackOffPolicy();
|
||||
backOffPolicy.setBackOffPeriod(2000L); // 2 second delay
|
||||
|
||||
ExponentialBackOffPolicy exponentialBackOff = new ExponentialBackOffPolicy();
|
||||
exponentialBackOff.setInitialInterval(1000L);
|
||||
exponentialBackOff.setMultiplier(2.0);
|
||||
exponentialBackOff.setMaxInterval(10000L);
|
||||
|
||||
SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
|
||||
retryPolicy.setMaxAttempts(3);
|
||||
|
||||
retryTemplate.setBackOffPolicy(exponentialBackOff);
|
||||
retryTemplate.setRetryPolicy(retryPolicy);
|
||||
|
||||
return retryTemplate;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Retry with @Retryable
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
@Retryable(
|
||||
value = {TransientException.class},
|
||||
maxAttempts = 3,
|
||||
backoff = @Backoff(delay = 1000, multiplier = 2)
|
||||
)
|
||||
public void processOrder(String orderId) {
|
||||
// Order processing logic
|
||||
}
|
||||
|
||||
@Recover
|
||||
public void recover(TransientException ex, String orderId) {
|
||||
logger.error("Order processing failed after retries: {}", orderId, ex);
|
||||
// Fallback logic
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Circuit Breaker with Resilience4j
|
||||
|
||||
Prevent cascading failures:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class CircuitBreakerConfig {
|
||||
|
||||
@Bean
|
||||
public CircuitBreakerRegistry circuitBreakerRegistry() {
|
||||
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
|
||||
.failureRateThreshold(50) // Open after 50% failures
|
||||
.waitDurationInOpenState(Duration.ofMillis(1000))
|
||||
.slidingWindowSize(2) // Check last 2 calls
|
||||
.build();
|
||||
|
||||
return CircuitBreakerRegistry.of(config);
|
||||
}
|
||||
}
|
||||
|
||||
@Service
|
||||
public class PaymentService {
|
||||
|
||||
private final CircuitBreaker circuitBreaker;
|
||||
|
||||
public PaymentService(CircuitBreakerRegistry registry) {
|
||||
this.circuitBreaker = registry.circuitBreaker("payment");
|
||||
}
|
||||
|
||||
public PaymentResult processPayment(PaymentRequest request) {
|
||||
return circuitBreaker.executeSupplier(
|
||||
() -> callPaymentGateway(request)
|
||||
);
|
||||
}
|
||||
|
||||
private PaymentResult callPaymentGateway(PaymentRequest request) {
|
||||
// Call external payment gateway
|
||||
return new PaymentResult(...);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Dead Letter Queue
|
||||
|
||||
Handle failed messages:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class DeadLetterQueueConfig {
|
||||
|
||||
@Bean
|
||||
public NewTopic deadLetterTopic() {
|
||||
return new NewTopic("saga-dlq", 1, (short) 1);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class SagaErrorHandler implements ConsumerAwareErrorHandler {
|
||||
|
||||
private final KafkaTemplate<String, Object> kafkaTemplate;
|
||||
|
||||
@Override
|
||||
public void handle(Exception thrownException,
|
||||
List<ConsumerRecord<?, ?>> records,
|
||||
Consumer<?, ?> consumer,
|
||||
MessageListenerContainer container) {
|
||||
|
||||
records.forEach(record -> {
|
||||
logger.error("Processing failed for message: {}", record.key());
|
||||
kafkaTemplate.send("saga-dlq", record.key(), record.value());
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Timeout Handling
|
||||
|
||||
Define and enforce timeout policies:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class TimeoutHandler {
|
||||
|
||||
private final SagaStateRepository sagaStateRepository;
|
||||
private static final Duration STEP_TIMEOUT = Duration.ofSeconds(30);
|
||||
|
||||
@Scheduled(fixedDelay = 5000)
|
||||
public void checkForTimeouts() {
|
||||
Instant timeoutThreshold = Instant.now().minus(STEP_TIMEOUT);
|
||||
|
||||
List<SagaState> timedOutSagas = sagaStateRepository
|
||||
.findByStatusAndUpdatedAtBefore(SagaStatus.PROCESSING, timeoutThreshold);
|
||||
|
||||
timedOutSagas.forEach(saga -> {
|
||||
logger.warn("Saga {} timed out at step {}",
|
||||
saga.getSagaId(), saga.getCurrentStep());
|
||||
compensateSaga(saga);
|
||||
});
|
||||
}
|
||||
|
||||
private void compensateSaga(SagaState saga) {
|
||||
saga.setStatus(SagaStatus.COMPENSATING);
|
||||
sagaStateRepository.save(saga);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Exponential Backoff
|
||||
|
||||
Prevent overwhelming downstream services:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class BackoffService {
|
||||
|
||||
public Duration calculateBackoff(int attemptNumber) {
|
||||
long baseDelay = 1000; // 1 second
|
||||
long delay = baseDelay * (long) Math.pow(2, attemptNumber - 1);
|
||||
long maxDelay = 30000; // 30 seconds
|
||||
|
||||
return Duration.ofMillis(Math.min(delay, maxDelay));
|
||||
}
|
||||
|
||||
@Retryable(
|
||||
value = {ServiceUnavailableException.class},
|
||||
maxAttempts = 5,
|
||||
backoff = @Backoff(
|
||||
delay = 1000,
|
||||
multiplier = 2.0,
|
||||
maxDelay = 30000
|
||||
)
|
||||
)
|
||||
public void callExternalService() {
|
||||
// External service call
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Idempotent Retry
|
||||
|
||||
Ensure retries don't cause duplicate processing:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class IdempotentPaymentService {
|
||||
|
||||
private final PaymentRepository paymentRepository;
|
||||
private final Map<String, PaymentResult> processedPayments = new ConcurrentHashMap<>();
|
||||
|
||||
public PaymentResult processPayment(String paymentId, BigDecimal amount) {
|
||||
// Check if already processed
|
||||
if (processedPayments.containsKey(paymentId)) {
|
||||
return processedPayments.get(paymentId);
|
||||
}
|
||||
|
||||
// Check database
|
||||
Optional<Payment> existing = paymentRepository.findById(paymentId);
|
||||
if (existing.isPresent()) {
|
||||
return new PaymentResult(existing.get());
|
||||
}
|
||||
|
||||
// Process payment
|
||||
PaymentResult result = callPaymentGateway(paymentId, amount);
|
||||
|
||||
// Cache and persist
|
||||
processedPayments.put(paymentId, result);
|
||||
paymentRepository.save(new Payment(paymentId, amount, result.getStatus()));
|
||||
|
||||
return result;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Global Exception Handler
|
||||
|
||||
Centralize error handling:
|
||||
|
||||
```java
|
||||
@RestControllerAdvice
|
||||
public class GlobalExceptionHandler {
|
||||
|
||||
@ExceptionHandler(SagaExecutionException.class)
|
||||
public ResponseEntity<ErrorResponse> handleSagaError(
|
||||
SagaExecutionException ex) {
|
||||
|
||||
return ResponseEntity
|
||||
.status(HttpStatus.UNPROCESSABLE_ENTITY)
|
||||
.body(new ErrorResponse(
|
||||
"SAGA_EXECUTION_FAILED",
|
||||
ex.getMessage(),
|
||||
ex.getSagaId()
|
||||
));
|
||||
}
|
||||
|
||||
@ExceptionHandler(ServiceUnavailableException.class)
|
||||
public ResponseEntity<ErrorResponse> handleServiceUnavailable(
|
||||
ServiceUnavailableException ex) {
|
||||
|
||||
return ResponseEntity
|
||||
.status(HttpStatus.SERVICE_UNAVAILABLE)
|
||||
.body(new ErrorResponse(
|
||||
"SERVICE_UNAVAILABLE",
|
||||
"Required service is temporarily unavailable"
|
||||
));
|
||||
}
|
||||
|
||||
@ExceptionHandler(TimeoutException.class)
|
||||
public ResponseEntity<ErrorResponse> handleTimeout(
|
||||
TimeoutException ex) {
|
||||
|
||||
return ResponseEntity
|
||||
.status(HttpStatus.REQUEST_TIMEOUT)
|
||||
.body(new ErrorResponse(
|
||||
"REQUEST_TIMEOUT",
|
||||
"Request timed out after " + ex.getDuration()
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
public record ErrorResponse(
|
||||
String code,
|
||||
String message,
|
||||
String details
|
||||
) {
|
||||
public ErrorResponse(String code, String message) {
|
||||
this(code, message, null);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Monitoring Error Rates
|
||||
|
||||
Track failure metrics:
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class SagaErrorMetrics {
|
||||
|
||||
private final MeterRegistry meterRegistry;
|
||||
|
||||
public SagaErrorMetrics(MeterRegistry meterRegistry) {
|
||||
this.meterRegistry = meterRegistry;
|
||||
}
|
||||
|
||||
public void recordSagaFailure(String sagaType) {
|
||||
Counter.builder("saga.failure")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
|
||||
public void recordRetry(String sagaType) {
|
||||
Counter.builder("saga.retry")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
|
||||
public void recordTimeout(String sagaType) {
|
||||
Counter.builder("saga.timeout")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,320 @@
|
||||
# Testing Strategies for Sagas
|
||||
|
||||
## Unit Testing Saga Logic
|
||||
|
||||
Test saga behavior with Axon test fixtures:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldDispatchPaymentCommandWhenOrderCreated() {
|
||||
// Arrange
|
||||
String orderId = UUID.randomUUID().toString();
|
||||
String paymentId = UUID.randomUUID().toString();
|
||||
|
||||
SagaTestFixture<OrderSaga> fixture = new SagaTestFixture<>(OrderSaga.class);
|
||||
|
||||
// Act & Assert
|
||||
fixture
|
||||
.givenNoPriorActivity()
|
||||
.whenPublishingA(new OrderCreatedEvent(orderId, BigDecimal.TEN, "item-1"))
|
||||
.expectDispatchedCommands(new ProcessPaymentCommand(paymentId, orderId, BigDecimal.TEN));
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldCompensateWhenPaymentFails() {
|
||||
String orderId = UUID.randomUUID().toString();
|
||||
String paymentId = UUID.randomUUID().toString();
|
||||
|
||||
SagaTestFixture<OrderSaga> fixture = new SagaTestFixture<>(OrderSaga.class);
|
||||
|
||||
fixture
|
||||
.givenNoPriorActivity()
|
||||
.whenPublishingA(new OrderCreatedEvent(orderId, BigDecimal.TEN, "item-1"))
|
||||
.whenPublishingA(new PaymentFailedEvent(paymentId, orderId, "item-1", "Insufficient funds"))
|
||||
.expectDispatchedCommands(new CancelOrderCommand(orderId))
|
||||
.expectScheduledEventOfType(OrderSaga.class, null);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Event Publishing
|
||||
|
||||
Verify events are published correctly:
|
||||
|
||||
```java
|
||||
@SpringBootTest
|
||||
@WebMvcTest
|
||||
class OrderServiceTest {
|
||||
|
||||
@MockBean
|
||||
private EventPublisher eventPublisher;
|
||||
|
||||
@InjectMocks
|
||||
private OrderService orderService;
|
||||
|
||||
@Test
|
||||
void shouldPublishOrderCreatedEvent() {
|
||||
// Arrange
|
||||
CreateOrderRequest request = new CreateOrderRequest("cust-1", BigDecimal.TEN);
|
||||
|
||||
// Act
|
||||
String orderId = orderService.createOrder(request);
|
||||
|
||||
// Assert
|
||||
verify(eventPublisher).publish(
|
||||
argThat(event -> event instanceof OrderCreatedEvent &&
|
||||
((OrderCreatedEvent) event).orderId().equals(orderId))
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Integration Testing with Testcontainers
|
||||
|
||||
Test complete saga flow with real services:
|
||||
|
||||
```java
|
||||
@SpringBootTest
|
||||
@Testcontainers
|
||||
class SagaIntegrationTest {
|
||||
|
||||
@Container
|
||||
static KafkaContainer kafka = new KafkaContainer(
|
||||
DockerImageName.parse("confluentinc/cp-kafka:7.4.0")
|
||||
);
|
||||
|
||||
@Container
|
||||
static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>(
|
||||
"postgres:15-alpine"
|
||||
);
|
||||
|
||||
@DynamicPropertySource
|
||||
static void overrideProperties(DynamicPropertyRegistry registry) {
|
||||
registry.add("spring.kafka.bootstrap-servers", kafka::getBootstrapServers);
|
||||
registry.add("spring.datasource.url", postgres::getJdbcUrl);
|
||||
registry.add("spring.datasource.username", postgres::getUsername);
|
||||
registry.add("spring.datasource.password", postgres::getPassword);
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldCompleteOrderSagaSuccessfully(@Autowired OrderService orderService,
|
||||
@Autowired OrderRepository orderRepository,
|
||||
@Autowired EventPublisher eventPublisher) {
|
||||
// Arrange
|
||||
CreateOrderRequest request = new CreateOrderRequest("cust-1", BigDecimal.TEN);
|
||||
|
||||
// Act
|
||||
String orderId = orderService.createOrder(request);
|
||||
|
||||
// Wait for async processing
|
||||
Thread.sleep(2000);
|
||||
|
||||
// Assert
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
assertThat(order.getStatus()).isEqualTo(OrderStatus.COMPLETED);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Idempotency
|
||||
|
||||
Verify operations produce same results on retry:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void compensationShouldBeIdempotent() {
|
||||
// Arrange
|
||||
String paymentId = "payment-123";
|
||||
Payment payment = new Payment(paymentId, "order-1", BigDecimal.TEN);
|
||||
paymentRepository.save(payment);
|
||||
|
||||
// Act - First compensation
|
||||
paymentService.cancelPayment(paymentId);
|
||||
Payment firstResult = paymentRepository.findById(paymentId).orElseThrow();
|
||||
|
||||
// Act - Second compensation (should be idempotent)
|
||||
paymentService.cancelPayment(paymentId);
|
||||
Payment secondResult = paymentRepository.findById(paymentId).orElseThrow();
|
||||
|
||||
// Assert
|
||||
assertThat(firstResult).isEqualTo(secondResult);
|
||||
assertThat(secondResult.getStatus()).isEqualTo(PaymentStatus.CANCELLED);
|
||||
assertThat(secondResult.getVersion()).isEqualTo(firstResult.getVersion());
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Concurrent Sagas
|
||||
|
||||
Verify saga isolation under concurrent execution:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldHandleConcurrentSagaExecutions() throws InterruptedException {
|
||||
// Arrange
|
||||
int numThreads = 10;
|
||||
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
|
||||
CountDownLatch latch = new CountDownLatch(numThreads);
|
||||
|
||||
// Act
|
||||
for (int i = 0; i < numThreads; i++) {
|
||||
final int index = i;
|
||||
executor.submit(() -> {
|
||||
try {
|
||||
CreateOrderRequest request = new CreateOrderRequest(
|
||||
"cust-" + index,
|
||||
BigDecimal.TEN.multiply(BigDecimal.valueOf(index))
|
||||
);
|
||||
orderService.createOrder(request);
|
||||
} finally {
|
||||
latch.countDown();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
latch.await(10, TimeUnit.SECONDS);
|
||||
|
||||
// Assert
|
||||
long createdOrders = orderRepository.count();
|
||||
assertThat(createdOrders).isEqualTo(numThreads);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Failure Scenarios
|
||||
|
||||
Test each failure path and compensation:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldCompensateWhenInventoryUnavailable() {
|
||||
// Arrange
|
||||
String orderId = UUID.randomUUID().toString();
|
||||
inventoryService.setAvailability("item-1", 0); // No inventory
|
||||
|
||||
// Act
|
||||
String result = orderService.createOrder(
|
||||
new CreateOrderRequest("cust-1", BigDecimal.TEN)
|
||||
);
|
||||
|
||||
// Wait for saga completion
|
||||
Thread.sleep(2000);
|
||||
|
||||
// Assert
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
assertThat(order.getStatus()).isEqualTo(OrderStatus.CANCELLED);
|
||||
|
||||
// Verify payment was refunded
|
||||
Payment payment = paymentRepository.findByOrderId(orderId).orElseThrow();
|
||||
assertThat(payment.getStatus()).isEqualTo(PaymentStatus.REFUNDED);
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldHandlePaymentGatewayFailure() {
|
||||
// Arrange
|
||||
paymentGateway.setFailureRate(1.0); // 100% failure
|
||||
|
||||
// Act
|
||||
String orderId = orderService.createOrder(
|
||||
new CreateOrderRequest("cust-1", BigDecimal.TEN)
|
||||
);
|
||||
|
||||
// Wait for saga completion
|
||||
Thread.sleep(2000);
|
||||
|
||||
// Assert
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
assertThat(order.getStatus()).isEqualTo(OrderStatus.CANCELLED);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing State Machine
|
||||
|
||||
Verify state transitions:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldTransitionStatesProperly() {
|
||||
// Arrange
|
||||
String sagaId = UUID.randomUUID().toString();
|
||||
SagaState sagaState = new SagaState(sagaId, SagaStatus.STARTED);
|
||||
sagaStateRepository.save(sagaState);
|
||||
|
||||
// Act & Assert
|
||||
assertThat(sagaState.getStatus()).isEqualTo(SagaStatus.STARTED);
|
||||
|
||||
sagaState.setStatus(SagaStatus.PROCESSING);
|
||||
sagaStateRepository.save(sagaState);
|
||||
assertThat(sagaStateRepository.findById(sagaId).get().getStatus())
|
||||
.isEqualTo(SagaStatus.PROCESSING);
|
||||
|
||||
sagaState.setStatus(SagaStatus.COMPLETED);
|
||||
sagaStateRepository.save(sagaState);
|
||||
assertThat(sagaStateRepository.findById(sagaId).get().getStatus())
|
||||
.isEqualTo(SagaStatus.COMPLETED);
|
||||
}
|
||||
```
|
||||
|
||||
## Test Data Builders
|
||||
|
||||
Use builders for cleaner test code:
|
||||
|
||||
```java
|
||||
public class OrderRequestBuilder {
|
||||
|
||||
private String customerId = "cust-default";
|
||||
private BigDecimal totalAmount = BigDecimal.TEN;
|
||||
private List<OrderItem> items = new ArrayList<>();
|
||||
|
||||
public OrderRequestBuilder withCustomerId(String customerId) {
|
||||
this.customerId = customerId;
|
||||
return this;
|
||||
}
|
||||
|
||||
public OrderRequestBuilder withAmount(BigDecimal amount) {
|
||||
this.totalAmount = amount;
|
||||
return this;
|
||||
}
|
||||
|
||||
public OrderRequestBuilder withItem(String productId, int quantity) {
|
||||
items.add(new OrderItem(productId, "Product", quantity, BigDecimal.TEN));
|
||||
return this;
|
||||
}
|
||||
|
||||
public CreateOrderRequest build() {
|
||||
return new CreateOrderRequest(customerId, totalAmount, items);
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldCreateOrderWithCustomization() {
|
||||
CreateOrderRequest request = new OrderRequestBuilder()
|
||||
.withCustomerId("customer-123")
|
||||
.withAmount(BigDecimal.valueOf(50))
|
||||
.withItem("product-1", 2)
|
||||
.withItem("product-2", 1)
|
||||
.build();
|
||||
|
||||
String orderId = orderService.createOrder(request);
|
||||
assertThat(orderId).isNotNull();
|
||||
}
|
||||
```
|
||||
|
||||
## Performance Testing
|
||||
|
||||
Measure saga execution time:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldCompleteOrderSagaWithinTimeLimit() {
|
||||
// Arrange
|
||||
CreateOrderRequest request = new CreateOrderRequest("cust-1", BigDecimal.TEN);
|
||||
long maxDurationMs = 5000; // 5 seconds
|
||||
|
||||
// Act
|
||||
Instant start = Instant.now();
|
||||
String orderId = orderService.createOrder(request);
|
||||
Instant end = Instant.now();
|
||||
|
||||
// Assert
|
||||
long duration = Duration.between(start, end).toMillis();
|
||||
assertThat(duration).isLessThan(maxDurationMs);
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,471 @@
|
||||
# Common Pitfalls and Solutions
|
||||
|
||||
## Pitfall 1: Lost Messages
|
||||
|
||||
### Problem
|
||||
Messages get lost due to broker failures, network issues, or consumer crashes before acknowledgment.
|
||||
|
||||
### Solution
|
||||
Use persistent messages with acknowledgments:
|
||||
|
||||
```java
|
||||
@Bean
|
||||
public ProducerFactory<String, Object> producerFactory() {
|
||||
Map<String, Object> config = new HashMap<>();
|
||||
config.put(ProducerConfig.ACKS_CONFIG, "all"); // All replicas must acknowledge
|
||||
config.put(ProducerConfig.RETRIES_CONFIG, 3); // Retry failed sends
|
||||
config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // Prevent duplicates
|
||||
return new DefaultKafkaProducerFactory<>(config);
|
||||
}
|
||||
|
||||
@Bean
|
||||
public ConsumerFactory<String, Object> consumerFactory() {
|
||||
Map<String, Object> config = new HashMap<>();
|
||||
config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // Manual commit
|
||||
return new DefaultKafkaConsumerFactory<>(config);
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Configure producer to wait for all replicas (`acks=all`)
|
||||
- ✓ Enable idempotence to prevent duplicate messages
|
||||
- ✓ Use manual commit for consumers
|
||||
- ✓ Monitor message lag and broker health
|
||||
- ✓ Use transactional outbox pattern
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 2: Duplicate Processing
|
||||
|
||||
### Problem
|
||||
Same message processed multiple times due to failed acknowledgments or retries, causing side effects.
|
||||
|
||||
### Solution
|
||||
Implement idempotency with deduplication:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class DeduplicationService {
|
||||
|
||||
private final DeduplicationRepository repository;
|
||||
|
||||
public boolean isDuplicate(String messageId) {
|
||||
return repository.existsById(messageId);
|
||||
}
|
||||
|
||||
public void recordProcessed(String messageId) {
|
||||
DeduplicatedMessage entry = new DeduplicatedMessage(
|
||||
messageId,
|
||||
Instant.now()
|
||||
);
|
||||
repository.save(entry);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class PaymentEventListener {
|
||||
|
||||
private final DeduplicationService deduplicationService;
|
||||
private final PaymentService paymentService;
|
||||
|
||||
@Bean
|
||||
public Consumer<PaymentEvent> handlePaymentEvent() {
|
||||
return event -> {
|
||||
String messageId = event.getMessageId();
|
||||
|
||||
if (deduplicationService.isDuplicate(messageId)) {
|
||||
logger.info("Duplicate message ignored: {}", messageId);
|
||||
return;
|
||||
}
|
||||
|
||||
paymentService.processPayment(event);
|
||||
deduplicationService.recordProcessed(messageId);
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Add unique message ID to all events
|
||||
- ✓ Implement deduplication cache/database
|
||||
- ✓ Make all operations idempotent
|
||||
- ✓ Use version control for entity updates
|
||||
- ✓ Test with message replay
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 3: Saga State Inconsistency
|
||||
|
||||
### Problem
|
||||
Saga state in database doesn't match actual service states, leading to orphaned or stuck sagas.
|
||||
|
||||
### Solution
|
||||
Use event sourcing or state reconciliation:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class SagaStateReconciler {
|
||||
|
||||
private final SagaStateRepository stateRepository;
|
||||
private final OrderRepository orderRepository;
|
||||
private final PaymentRepository paymentRepository;
|
||||
|
||||
@Scheduled(fixedDelay = 60000) // Run every minute
|
||||
public void reconcileSagaStates() {
|
||||
List<SagaState> processingSagas = stateRepository
|
||||
.findByStatus(SagaStatus.PROCESSING);
|
||||
|
||||
processingSagas.forEach(saga -> {
|
||||
if (isActuallyCompleted(saga)) {
|
||||
logger.info("Reconciling saga {} - marking as completed", saga.getSagaId());
|
||||
saga.setStatus(SagaStatus.COMPLETED);
|
||||
saga.setCompletedAt(Instant.now());
|
||||
stateRepository.save(saga);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
private boolean isActuallyCompleted(SagaState saga) {
|
||||
String orderId = saga.getSagaId();
|
||||
|
||||
Order order = orderRepository.findById(orderId).orElse(null);
|
||||
if (order == null || order.getStatus() != OrderStatus.COMPLETED) {
|
||||
return false;
|
||||
}
|
||||
|
||||
Payment payment = paymentRepository.findByOrderId(orderId).orElse(null);
|
||||
if (payment == null || payment.getStatus() != PaymentStatus.PROCESSED) {
|
||||
return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Use event sourcing for complete audit trail
|
||||
- ✓ Implement state reconciliation job
|
||||
- ✓ Add health checks for saga coordinator
|
||||
- ✓ Monitor saga state transitions
|
||||
- ✓ Persist compensation steps
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 4: Orchestrator Single Point of Failure
|
||||
|
||||
### Problem
|
||||
Orchestration-based saga fails when orchestrator is down, blocking all sagas.
|
||||
|
||||
### Solution
|
||||
Implement clustering and failover:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class SagaOrchestratorClusterConfig {
|
||||
|
||||
@Bean
|
||||
public SagaStateRepository sagaStateRepository() {
|
||||
// Use shared database for cluster-wide state
|
||||
return new DatabaseSagaStateRepository();
|
||||
}
|
||||
|
||||
@Bean
|
||||
@Primary
|
||||
public CommandGateway clusterAwareCommandGateway(
|
||||
CommandBus commandBus) {
|
||||
|
||||
return new ClusterAwareCommandGateway(commandBus);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class OrchestratorHealthCheck extends AbstractHealthIndicator {
|
||||
|
||||
private final SagaStateRepository repository;
|
||||
|
||||
@Override
|
||||
protected void doHealthCheck(Health.Builder builder) {
|
||||
long stuckSagas = repository.countStuckSagas(Duration.ofMinutes(30));
|
||||
|
||||
if (stuckSagas > 100) {
|
||||
builder.down()
|
||||
.withDetail("stuckSagas", stuckSagas)
|
||||
.withDetail("severity", "critical");
|
||||
} else if (stuckSagas > 10) {
|
||||
builder.degraded()
|
||||
.withDetail("stuckSagas", stuckSagas)
|
||||
.withDetail("severity", "warning");
|
||||
} else {
|
||||
builder.up()
|
||||
.withDetail("stuckSagas", stuckSagas);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Deploy orchestrator in cluster with shared state
|
||||
- ✓ Use distributed coordination (ZooKeeper, Consul)
|
||||
- ✓ Implement heartbeat monitoring
|
||||
- ✓ Set up automatic failover
|
||||
- ✓ Use circuit breakers for service calls
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 5: Non-Idempotent Compensations
|
||||
|
||||
### Problem
|
||||
Compensation logic fails on retry because it's not idempotent, leaving system in inconsistent state.
|
||||
|
||||
### Solution
|
||||
Design all compensations to be idempotent:
|
||||
|
||||
```java
|
||||
// Bad - Not idempotent
|
||||
@Service
|
||||
public class BadPaymentService {
|
||||
public void refundPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId).orElseThrow();
|
||||
payment.setStatus(PaymentStatus.REFUNDED);
|
||||
paymentRepository.save(payment);
|
||||
|
||||
// If this fails partway, retry causes problems
|
||||
externalPaymentGateway.refund(payment.getTransactionId());
|
||||
}
|
||||
}
|
||||
|
||||
// Good - Idempotent
|
||||
@Service
|
||||
public class GoodPaymentService {
|
||||
public void refundPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId)
|
||||
.orElse(null);
|
||||
|
||||
if (payment == null) {
|
||||
// Already deleted or doesn't exist
|
||||
logger.info("Payment {} not found, skipping refund", paymentId);
|
||||
return;
|
||||
}
|
||||
|
||||
if (payment.getStatus() == PaymentStatus.REFUNDED) {
|
||||
// Already refunded
|
||||
logger.info("Payment {} already refunded", paymentId);
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
externalPaymentGateway.refund(payment.getTransactionId());
|
||||
payment.setStatus(PaymentStatus.REFUNDED);
|
||||
paymentRepository.save(payment);
|
||||
} catch (Exception e) {
|
||||
logger.error("Refund failed, will retry", e);
|
||||
throw e;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Check current state before making changes
|
||||
- ✓ Use status flags to track compensation completion
|
||||
- ✓ Make database updates idempotent
|
||||
- ✓ Test compensation with replays
|
||||
- ✓ Document compensation logic
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 6: Missing Timeouts
|
||||
|
||||
### Problem
|
||||
Sagas hang indefinitely waiting for events that never arrive due to service failures.
|
||||
|
||||
### Solution
|
||||
Implement timeout mechanisms:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class SagaTimeoutConfig {
|
||||
|
||||
@Bean
|
||||
public SagaLifecycle sagaLifecycle(SagaStateRepository repository) {
|
||||
return new SagaLifecycle() {
|
||||
@Override
|
||||
public void onSagaFinished(Saga saga) {
|
||||
// Update saga state
|
||||
}
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
@Saga
|
||||
public class OrderSaga {
|
||||
|
||||
@Autowired
|
||||
private transient CommandGateway commandGateway;
|
||||
|
||||
private String orderId;
|
||||
private String paymentId;
|
||||
private DeadlineManager deadlineManager;
|
||||
|
||||
@StartSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCreatedEvent event) {
|
||||
this.orderId = event.orderId();
|
||||
|
||||
// Schedule timeout for payment processing
|
||||
deadlineManager.scheduleDeadline(
|
||||
Duration.ofSeconds(30),
|
||||
"PaymentTimeout",
|
||||
orderId
|
||||
);
|
||||
|
||||
commandGateway.send(new ProcessPaymentCommand(...));
|
||||
}
|
||||
|
||||
@DeadlineHandler(deadlineName = "PaymentTimeout")
|
||||
public void handlePaymentTimeout() {
|
||||
logger.warn("Payment processing timed out for order {}", orderId);
|
||||
|
||||
// Compensate
|
||||
commandGateway.send(new CancelOrderCommand(orderId));
|
||||
end();
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentProcessedEvent event) {
|
||||
// Cancel timeout
|
||||
deadlineManager.cancelDeadline("PaymentTimeout", orderId);
|
||||
// Continue saga...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Set timeout for each saga step
|
||||
- ✓ Use deadline manager to track timeouts
|
||||
- ✓ Cancel timeouts when step completes
|
||||
- ✓ Log timeout events
|
||||
- ✓ Alert operations on repeated timeouts
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 7: Tight Coupling Between Services
|
||||
|
||||
### Problem
|
||||
Saga logic couples services tightly, making independent deployment impossible.
|
||||
|
||||
### Solution
|
||||
Use event-driven communication:
|
||||
|
||||
```java
|
||||
// Bad - Tight coupling
|
||||
@Service
|
||||
public class TightlyAgedOrderService {
|
||||
public void createOrder(OrderRequest request) {
|
||||
Order order = orderRepository.save(new Order(...));
|
||||
|
||||
// Direct coupling to payment service
|
||||
paymentService.processPayment(order.getId(), request.getAmount());
|
||||
}
|
||||
}
|
||||
|
||||
// Good - Event-driven
|
||||
@Service
|
||||
public class LooselyAgedOrderService {
|
||||
public void createOrder(OrderRequest request) {
|
||||
Order order = orderRepository.save(new Order(...));
|
||||
|
||||
// Publish event - services listen independently
|
||||
eventPublisher.publish(new OrderCreatedEvent(
|
||||
order.getId(),
|
||||
request.getAmount()
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class PaymentServiceListener {
|
||||
|
||||
@Bean
|
||||
public Consumer<OrderCreatedEvent> handleOrderCreated() {
|
||||
return event -> {
|
||||
// Payment service can be deployed independently
|
||||
paymentService.processPayment(
|
||||
event.orderId(),
|
||||
event.amount()
|
||||
);
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Use events for inter-service communication
|
||||
- ✓ Avoid direct service-to-service calls
|
||||
- ✓ Define clear contracts for events
|
||||
- ✓ Version events for backward compatibility
|
||||
- ✓ Deploy services independently
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 8: Inadequate Monitoring
|
||||
|
||||
### Problem
|
||||
Sagas fail silently or get stuck without visibility, making troubleshooting impossible.
|
||||
|
||||
### Solution
|
||||
Implement comprehensive monitoring:
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class SagaMonitoring {
|
||||
|
||||
private final MeterRegistry meterRegistry;
|
||||
|
||||
@Bean
|
||||
public MeterBinder sagaMetrics(SagaStateRepository repository) {
|
||||
return (registry) -> {
|
||||
Gauge.builder("saga.active", repository::countByStatus)
|
||||
.description("Number of active sagas")
|
||||
.register(registry);
|
||||
|
||||
Gauge.builder("saga.stuck", () ->
|
||||
repository.countStuckSagas(Duration.ofMinutes(30)))
|
||||
.description("Number of stuck sagas")
|
||||
.register(registry);
|
||||
};
|
||||
}
|
||||
|
||||
public void recordSagaStart(String sagaType) {
|
||||
Counter.builder("saga.started")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
|
||||
public void recordSagaCompletion(String sagaType, long durationMs) {
|
||||
Timer.builder("saga.duration")
|
||||
.tag("type", sagaType)
|
||||
.publishPercentiles(0.5, 0.95, 0.99)
|
||||
.register(meterRegistry)
|
||||
.record(Duration.ofMillis(durationMs));
|
||||
}
|
||||
|
||||
public void recordSagaFailure(String sagaType, String reason) {
|
||||
Counter.builder("saga.failed")
|
||||
.tag("type", sagaType)
|
||||
.tag("reason", reason)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Track saga state transitions
|
||||
- ✓ Monitor step execution times
|
||||
- ✓ Alert on stuck sagas
|
||||
- ✓ Log all failures with details
|
||||
- ✓ Use distributed tracing (Sleuth, Zipkin)
|
||||
- ✓ Create dashboards for visibility
|
||||
1537
skills/spring-boot/spring-boot-saga-pattern/references/examples.md
Normal file
1537
skills/spring-boot/spring-boot-saga-pattern/references/examples.md
Normal file
File diff suppressed because it is too large
Load Diff
1264
skills/spring-boot/spring-boot-saga-pattern/references/reference.md
Normal file
1264
skills/spring-boot/spring-boot-saga-pattern/references/reference.md
Normal file
File diff suppressed because it is too large
Load Diff
Reference in New Issue
Block a user