Initial commit
This commit is contained in:
@@ -0,0 +1,68 @@
|
||||
# Saga Pattern Definition
|
||||
|
||||
## What is a Saga?
|
||||
|
||||
A **Saga** is a sequence of local transactions where each transaction updates data within a single service. Each local transaction publishes an event or message that triggers the next local transaction in the saga. If a local transaction fails, the saga executes compensating transactions to undo the changes made by preceding transactions.
|
||||
|
||||
## Key Characteristics
|
||||
|
||||
**Distributed Transactions**: Spans multiple microservices, each with its own database.
|
||||
|
||||
**Local Transactions**: Each service performs its own ACID transaction.
|
||||
|
||||
**Event-Driven**: Services communicate through events or commands.
|
||||
|
||||
**Compensations**: Rollback mechanism using compensating transactions.
|
||||
|
||||
**Eventual Consistency**: System reaches a consistent state over time.
|
||||
|
||||
## Saga vs Two-Phase Commit (2PC)
|
||||
|
||||
| Feature | Saga Pattern | Two-Phase Commit |
|
||||
|---------|-------------|------------------|
|
||||
| Locking | No distributed locks | Requires locks during commit |
|
||||
| Performance | Better performance | Performance bottleneck |
|
||||
| Scalability | Highly scalable | Limited scalability |
|
||||
| Complexity | Business logic complexity | Protocol complexity |
|
||||
| Failure Handling | Compensating transactions | Automatic rollback |
|
||||
| Isolation | Lower isolation | Full isolation |
|
||||
| NoSQL Support | Yes | No |
|
||||
| Microservices Fit | Excellent | Poor |
|
||||
|
||||
## ACID vs BASE
|
||||
|
||||
**ACID** (Traditional Databases):
|
||||
- **A**tomicity: All or nothing
|
||||
- **C**onsistency: Valid state transitions
|
||||
- **I**solation: Concurrent transactions don't interfere
|
||||
- **D**urability: Committed data persists
|
||||
|
||||
**BASE** (Saga Pattern):
|
||||
- **B**asically **A**vailable: System is available most of the time
|
||||
- **S**oft state: State may change over time
|
||||
- **E**ventual consistency: System becomes consistent eventually
|
||||
|
||||
## When to Use Saga Pattern
|
||||
|
||||
Use the saga pattern when:
|
||||
- Building distributed transactions across multiple microservices
|
||||
- Needing to replace 2PC with a more scalable solution
|
||||
- Services need to maintain eventual consistency
|
||||
- Handling long-running processes spanning multiple services
|
||||
- Implementing compensating transactions for failed operations
|
||||
|
||||
## When NOT to Use Saga Pattern
|
||||
|
||||
Avoid the saga pattern when:
|
||||
- Single service transactions (use local ACID transactions)
|
||||
- Strong consistency is required immediately
|
||||
- Simple CRUD operations without cross-service dependencies
|
||||
- Low transaction volume with simple flows
|
||||
- Team lacks experience with distributed systems
|
||||
|
||||
## Migration Path
|
||||
|
||||
Many organizations migrate from traditional monolithic systems or 2PC-based systems to sagas:
|
||||
|
||||
1. **From Monolith to Saga**: Identify transaction boundaries, extract services gradually, implement sagas incrementally
|
||||
2. **From 2PC to Saga**: Analyze existing 2PC transactions, design compensating transactions, implement sagas in parallel, monitor and compare results before full migration
|
||||
@@ -0,0 +1,153 @@
|
||||
# Choreography-Based Saga Implementation
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
In choreography-based sagas, each service produces and listens to events. Services know what to do when they receive an event. **No central coordinator** manages the flow.
|
||||
|
||||
```
|
||||
Service A → Event → Service B → Event → Service C
|
||||
↓ ↓ ↓
|
||||
Event Event Event
|
||||
↓ ↓ ↓
|
||||
Compensation Compensation Compensation
|
||||
```
|
||||
|
||||
## Event Flow
|
||||
|
||||
### Success Path
|
||||
|
||||
1. **Order Service** creates order → publishes `OrderCreated` event
|
||||
2. **Payment Service** listens → processes payment → publishes `PaymentProcessed` event
|
||||
3. **Inventory Service** listens → reserves inventory → publishes `InventoryReserved` event
|
||||
4. **Shipment Service** listens → prepares shipment → publishes `ShipmentPrepared` event
|
||||
|
||||
### Failure Path (When Payment Fails)
|
||||
|
||||
1. **Payment Service** publishes `PaymentFailed` event
|
||||
2. **Order Service** listens → cancels order → publishes `OrderCancelled` event
|
||||
3. All other services respond to cancellation with cleanup
|
||||
|
||||
## Event Publisher
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class OrderEventPublisher {
|
||||
private final StreamBridge streamBridge;
|
||||
|
||||
public OrderEventPublisher(StreamBridge streamBridge) {
|
||||
this.streamBridge = streamBridge;
|
||||
}
|
||||
|
||||
public void publishOrderCreatedEvent(String orderId, BigDecimal amount, String itemId) {
|
||||
OrderCreatedEvent event = new OrderCreatedEvent(orderId, amount, itemId);
|
||||
streamBridge.send("orderCreated-out-0",
|
||||
MessageBuilder
|
||||
.withPayload(event)
|
||||
.setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
|
||||
.build());
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Listener
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class PaymentEventListener {
|
||||
|
||||
@Bean
|
||||
public Consumer<OrderCreatedEvent> handleOrderCreatedEvent() {
|
||||
return event -> processPayment(event.getOrderId());
|
||||
}
|
||||
|
||||
private void processPayment(String orderId) {
|
||||
// Payment processing logic
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Classes
|
||||
|
||||
```java
|
||||
public record OrderCreatedEvent(
|
||||
String orderId,
|
||||
BigDecimal amount,
|
||||
String itemId
|
||||
) {}
|
||||
|
||||
public record PaymentProcessedEvent(
|
||||
String paymentId,
|
||||
String orderId,
|
||||
String itemId
|
||||
) {}
|
||||
|
||||
public record PaymentFailedEvent(
|
||||
String paymentId,
|
||||
String orderId,
|
||||
String itemId,
|
||||
String reason
|
||||
) {}
|
||||
```
|
||||
|
||||
## Spring Cloud Stream Configuration
|
||||
|
||||
```yaml
|
||||
spring:
|
||||
cloud:
|
||||
stream:
|
||||
bindings:
|
||||
orderCreated-out-0:
|
||||
destination: order-events
|
||||
paymentProcessed-out-0:
|
||||
destination: payment-events
|
||||
paymentFailed-out-0:
|
||||
destination: payment-events
|
||||
kafka:
|
||||
binder:
|
||||
brokers: localhost:9092
|
||||
```
|
||||
|
||||
## Maven Dependencies
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>org.springframework.cloud</groupId>
|
||||
<artifactId>spring-cloud-stream</artifactId>
|
||||
</dependency>
|
||||
<dependency>
|
||||
<groupId>org.springframework.cloud</groupId>
|
||||
<artifactId>spring-cloud-stream-binder-kafka</artifactId>
|
||||
</dependency>
|
||||
```
|
||||
|
||||
## Gradle Dependencies
|
||||
|
||||
```groovy
|
||||
implementation 'org.springframework.cloud:spring-cloud-stream'
|
||||
implementation 'org.springframework.cloud:spring-cloud-stream-binder-kafka'
|
||||
```
|
||||
|
||||
## Advantages and Disadvantages
|
||||
|
||||
### Advantages
|
||||
|
||||
- **Simple** for small number of services
|
||||
- **Loose coupling** between services
|
||||
- **No single point of failure**
|
||||
- Each service is independently deployable
|
||||
|
||||
### Disadvantages
|
||||
|
||||
- **Difficult to track workflow state** - distributed across services
|
||||
- **Hard to troubleshoot** - following event flow is complex
|
||||
- **Complexity grows** with number of services
|
||||
- **Distributed source of truth** - saga state not centralized
|
||||
|
||||
## When to Use Choreography
|
||||
|
||||
Use choreography-based sagas when:
|
||||
- Microservices are few in number (< 5 services per saga)
|
||||
- Loose coupling is critical
|
||||
- Team is experienced with event-driven architecture
|
||||
- System can handle eventual consistency
|
||||
- Workflow doesn't need centralized monitoring
|
||||
@@ -0,0 +1,232 @@
|
||||
# Orchestration-Based Saga Implementation
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
A **central orchestrator** (Saga Coordinator) manages the entire transaction flow, sending commands to services and handling responses.
|
||||
|
||||
```
|
||||
Saga Orchestrator
|
||||
/ | \
|
||||
Service A Service B Service C
|
||||
```
|
||||
|
||||
## Orchestrator Responsibilities
|
||||
|
||||
1. **Command Dispatch**: Send commands to services
|
||||
2. **Response Handling**: Process service responses
|
||||
3. **State Management**: Track saga execution state
|
||||
4. **Compensation Coordination**: Trigger compensating transactions on failure
|
||||
5. **Timeout Management**: Handle service timeouts
|
||||
6. **Retry Logic**: Manage retry attempts
|
||||
|
||||
## Axon Framework Implementation
|
||||
|
||||
### Saga Class
|
||||
|
||||
```java
|
||||
@Saga
|
||||
public class OrderSaga {
|
||||
|
||||
@Autowired
|
||||
private transient CommandGateway commandGateway;
|
||||
|
||||
@StartSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCreatedEvent event) {
|
||||
String paymentId = UUID.randomUUID().toString();
|
||||
ProcessPaymentCommand command = new ProcessPaymentCommand(
|
||||
paymentId,
|
||||
event.getOrderId(),
|
||||
event.getAmount(),
|
||||
event.getItemId()
|
||||
);
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentProcessedEvent event) {
|
||||
ReserveInventoryCommand command = new ReserveInventoryCommand(
|
||||
event.getOrderId(),
|
||||
event.getItemId()
|
||||
);
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentFailedEvent event) {
|
||||
CancelOrderCommand command = new CancelOrderCommand(event.getOrderId());
|
||||
commandGateway.send(command);
|
||||
end();
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(InventoryReservedEvent event) {
|
||||
PrepareShipmentCommand command = new PrepareShipmentCommand(
|
||||
event.getOrderId(),
|
||||
event.getItemId()
|
||||
);
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@EndSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCompletedEvent event) {
|
||||
// Saga completed successfully
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Aggregate for Order Service
|
||||
|
||||
```java
|
||||
@Aggregate
|
||||
public class OrderAggregate {
|
||||
|
||||
@AggregateIdentifier
|
||||
private String orderId;
|
||||
|
||||
private OrderStatus status;
|
||||
|
||||
public OrderAggregate() {
|
||||
}
|
||||
|
||||
@CommandHandler
|
||||
public OrderAggregate(CreateOrderCommand command) {
|
||||
apply(new OrderCreatedEvent(
|
||||
command.getOrderId(),
|
||||
command.getAmount(),
|
||||
command.getItemId()
|
||||
));
|
||||
}
|
||||
|
||||
@EventSourcingHandler
|
||||
public void on(OrderCreatedEvent event) {
|
||||
this.orderId = event.getOrderId();
|
||||
this.status = OrderStatus.PENDING;
|
||||
}
|
||||
|
||||
@CommandHandler
|
||||
public void handle(CancelOrderCommand command) {
|
||||
apply(new OrderCancelledEvent(command.getOrderId()));
|
||||
}
|
||||
|
||||
@EventSourcingHandler
|
||||
public void on(OrderCancelledEvent event) {
|
||||
this.status = OrderStatus.CANCELLED;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Aggregate for Payment Service
|
||||
|
||||
```java
|
||||
@Aggregate
|
||||
public class PaymentAggregate {
|
||||
|
||||
@AggregateIdentifier
|
||||
private String paymentId;
|
||||
|
||||
public PaymentAggregate() {
|
||||
}
|
||||
|
||||
@CommandHandler
|
||||
public PaymentAggregate(ProcessPaymentCommand command) {
|
||||
this.paymentId = command.getPaymentId();
|
||||
|
||||
if (command.getAmount().compareTo(BigDecimal.ZERO) <= 0) {
|
||||
apply(new PaymentFailedEvent(
|
||||
command.getPaymentId(),
|
||||
command.getOrderId(),
|
||||
command.getItemId(),
|
||||
"Payment amount must be greater than zero"
|
||||
));
|
||||
} else {
|
||||
apply(new PaymentProcessedEvent(
|
||||
command.getPaymentId(),
|
||||
command.getOrderId(),
|
||||
command.getItemId()
|
||||
));
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Axon Configuration
|
||||
|
||||
```yaml
|
||||
axon:
|
||||
serializer:
|
||||
general: jackson
|
||||
events: jackson
|
||||
messages: jackson
|
||||
eventhandling:
|
||||
processors:
|
||||
order-processor:
|
||||
mode: tracking
|
||||
source: eventBus
|
||||
axonserver:
|
||||
enabled: false
|
||||
```
|
||||
|
||||
## Maven Dependencies for Axon
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>org.axonframework</groupId>
|
||||
<artifactId>axon-spring-boot-starter</artifactId>
|
||||
<version>4.9.0</version> // Use latest stable version
|
||||
</dependency>
|
||||
```
|
||||
|
||||
## Advantages and Disadvantages
|
||||
|
||||
### Advantages
|
||||
|
||||
- **Centralized visibility** - easy to see workflow status
|
||||
- **Easier to troubleshoot** - single place to analyze flow
|
||||
- **Clear transaction flow** - orchestrator defines sequence
|
||||
- **Simplified error handling** - centralized compensation logic
|
||||
- **Better for complex workflows** - easier to manage many steps
|
||||
|
||||
### Disadvantages
|
||||
|
||||
- **Orchestrator becomes single point of failure** - can be mitigated with clustering
|
||||
- **Additional infrastructure component** - more complexity in deployment
|
||||
- **Potential tight coupling** - if orchestrator knows too much about services
|
||||
|
||||
## Eventuate Tram Sagas
|
||||
|
||||
Eventuate Tram is an alternative to Axon for orchestration-based sagas:
|
||||
|
||||
```xml
|
||||
<dependency>
|
||||
<groupId>io.eventuate.tram.sagas</groupId>
|
||||
<artifactId>eventuate-tram-sagas-spring-starter</artifactId>
|
||||
<version>0.28.0</version> // Use latest stable version
|
||||
</dependency>
|
||||
```
|
||||
|
||||
## Camunda for BPMN-Based Orchestration
|
||||
|
||||
Use Camunda when visual workflow design is beneficial:
|
||||
|
||||
**Features**:
|
||||
- Visual workflow design
|
||||
- BPMN 2.0 standard
|
||||
- Human tasks support
|
||||
- Complex workflow modeling
|
||||
|
||||
**Use When**:
|
||||
- Business process modeling needed
|
||||
- Visual workflow design preferred
|
||||
- Human approval steps required
|
||||
- Complex orchestration logic
|
||||
|
||||
## When to Use Orchestration
|
||||
|
||||
Use orchestration-based sagas when:
|
||||
- Building brownfield applications with existing microservices
|
||||
- Handling complex workflows with many steps
|
||||
- Centralized control and monitoring is critical
|
||||
- Organization wants clear visibility into saga execution
|
||||
- Need for human intervention in workflow
|
||||
@@ -0,0 +1,262 @@
|
||||
# Event-Driven Architecture in Sagas
|
||||
|
||||
## Event Types
|
||||
|
||||
### Domain Events
|
||||
|
||||
Represent business facts that happened within a service:
|
||||
|
||||
```java
|
||||
public record OrderCreatedEvent(
|
||||
String orderId,
|
||||
Instant createdAt,
|
||||
BigDecimal amount
|
||||
) implements DomainEvent {}
|
||||
```
|
||||
|
||||
### Integration Events
|
||||
|
||||
Communication between bounded contexts (microservices):
|
||||
|
||||
```java
|
||||
public record PaymentRequestedEvent(
|
||||
String orderId,
|
||||
String paymentId,
|
||||
BigDecimal amount
|
||||
) implements IntegrationEvent {}
|
||||
```
|
||||
|
||||
### Command Events
|
||||
|
||||
Request for action by another service:
|
||||
|
||||
```java
|
||||
public record ProcessPaymentCommand(
|
||||
String paymentId,
|
||||
String orderId,
|
||||
BigDecimal amount
|
||||
) {}
|
||||
```
|
||||
|
||||
## Event Versioning
|
||||
|
||||
Handle event schema evolution using versioning:
|
||||
|
||||
```java
|
||||
public record OrderCreatedEventV1(
|
||||
String orderId,
|
||||
BigDecimal amount
|
||||
) {}
|
||||
|
||||
public record OrderCreatedEventV2(
|
||||
String orderId,
|
||||
BigDecimal amount,
|
||||
String customerId,
|
||||
Instant timestamp
|
||||
) {}
|
||||
|
||||
// Event Upcaster
|
||||
public class OrderEventUpcaster implements EventUpcaster {
|
||||
@Override
|
||||
public Stream<IntermediateEventRepresentation> upcast(
|
||||
Stream<IntermediateEventRepresentation> eventStream) {
|
||||
|
||||
return eventStream.map(event -> {
|
||||
if (event.getType().getName().equals("OrderCreatedEventV1")) {
|
||||
return upcastV1ToV2(event);
|
||||
}
|
||||
return event;
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Store
|
||||
|
||||
Store all events for audit trail and recovery:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
@Table(name = "saga_events")
|
||||
public class SagaEvent {
|
||||
|
||||
@Id
|
||||
@GeneratedValue(strategy = GenerationType.IDENTITY)
|
||||
private Long id;
|
||||
|
||||
@Column(nullable = false)
|
||||
private String sagaId;
|
||||
|
||||
@Column(nullable = false)
|
||||
private String eventType;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String payload;
|
||||
|
||||
@Column(nullable = false)
|
||||
private Instant timestamp;
|
||||
|
||||
@Column(nullable = false)
|
||||
private Integer version;
|
||||
}
|
||||
```
|
||||
|
||||
## Event Publishing Patterns
|
||||
|
||||
### Outbox Pattern (Transactional)
|
||||
|
||||
Ensure atomic update of database and event publishing:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
private final OrderRepository orderRepository;
|
||||
private final OutboxRepository outboxRepository;
|
||||
|
||||
@Transactional
|
||||
public void createOrder(CreateOrderRequest request) {
|
||||
// 1. Create and save order
|
||||
Order order = new Order(...);
|
||||
orderRepository.save(order);
|
||||
|
||||
// 2. Create outbox entry in same transaction
|
||||
OutboxEntry entry = new OutboxEntry(
|
||||
"OrderCreated",
|
||||
order.getId(),
|
||||
new OrderCreatedEvent(...)
|
||||
);
|
||||
outboxRepository.save(entry);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class OutboxPoller {
|
||||
|
||||
@Scheduled(fixedDelay = 1000)
|
||||
public void pollAndPublish() {
|
||||
List<OutboxEntry> unpublished = outboxRepository.findUnpublished();
|
||||
|
||||
unpublished.forEach(entry -> {
|
||||
eventPublisher.publish(entry.getEvent());
|
||||
outboxRepository.markAsPublished(entry.getId());
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Direct Publishing Pattern
|
||||
|
||||
Publish events immediately after transaction:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
private final OrderRepository orderRepository;
|
||||
private final EventPublisher eventPublisher;
|
||||
|
||||
@Transactional
|
||||
public void createOrder(CreateOrderRequest request) {
|
||||
Order order = new Order(...);
|
||||
orderRepository.save(order);
|
||||
|
||||
// Publish event after transaction commits
|
||||
TransactionSynchronizationManager.registerSynchronization(
|
||||
new TransactionSynchronization() {
|
||||
@Override
|
||||
public void afterCommit() {
|
||||
eventPublisher.publish(new OrderCreatedEvent(...));
|
||||
}
|
||||
}
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Sourcing
|
||||
|
||||
Store all state changes as events instead of current state:
|
||||
|
||||
**Benefits**:
|
||||
- Complete audit trail
|
||||
- Time-travel debugging
|
||||
- Natural fit for sagas
|
||||
- Event replay for recovery
|
||||
|
||||
**Implementation**:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
public class Order {
|
||||
|
||||
@Id
|
||||
private String orderId;
|
||||
|
||||
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
|
||||
private List<DomainEvent> events = new ArrayList<>();
|
||||
|
||||
public void createOrder(...) {
|
||||
apply(new OrderCreatedEvent(...));
|
||||
}
|
||||
|
||||
protected void apply(DomainEvent event) {
|
||||
if (event instanceof OrderCreatedEvent e) {
|
||||
this.orderId = e.orderId();
|
||||
this.status = OrderStatus.PENDING;
|
||||
}
|
||||
events.add(event);
|
||||
}
|
||||
|
||||
public List<DomainEvent> getUncommittedEvents() {
|
||||
return new ArrayList<>(events);
|
||||
}
|
||||
|
||||
public void clearUncommittedEvents() {
|
||||
events.clear();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Event Ordering and Consistency
|
||||
|
||||
### Maintain Event Order
|
||||
|
||||
Use partitioning to maintain order within a saga:
|
||||
|
||||
```java
|
||||
@Bean
|
||||
public ProducerFactory<String, Object> producerFactory() {
|
||||
Map<String, Object> config = new HashMap<>();
|
||||
config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
|
||||
StringSerializer.class);
|
||||
return new DefaultKafkaProducerFactory<>(config);
|
||||
}
|
||||
|
||||
@Service
|
||||
public class EventPublisher {
|
||||
|
||||
private final KafkaTemplate<String, Object> kafkaTemplate;
|
||||
|
||||
public void publish(DomainEvent event) {
|
||||
// Use sagaId as key to maintain order
|
||||
kafkaTemplate.send("events", event.getSagaId(), event);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Handle Out-of-Order Events
|
||||
|
||||
Use saga state to detect and handle out-of-order events:
|
||||
|
||||
```java
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentProcessedEvent event) {
|
||||
if (saga.getStatus() != SagaStatus.AWAITING_PAYMENT) {
|
||||
// Out of order event, ignore or queue for retry
|
||||
logger.warn("Unexpected event in state: {}", saga.getStatus());
|
||||
return;
|
||||
}
|
||||
// Process event
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,299 @@
|
||||
# Compensating Transactions
|
||||
|
||||
## Design Principles
|
||||
|
||||
### Idempotency
|
||||
|
||||
Execute multiple times with same result:
|
||||
|
||||
```java
|
||||
public void cancelPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId)
|
||||
.orElse(null);
|
||||
|
||||
if (payment == null) {
|
||||
// Already cancelled or doesn't exist
|
||||
return;
|
||||
}
|
||||
|
||||
if (payment.getStatus() == PaymentStatus.CANCELLED) {
|
||||
// Already cancelled, idempotent
|
||||
return;
|
||||
}
|
||||
|
||||
payment.setStatus(PaymentStatus.CANCELLED);
|
||||
paymentRepository.save(payment);
|
||||
|
||||
// Refund logic here
|
||||
}
|
||||
```
|
||||
|
||||
### Retryability
|
||||
|
||||
Design operations to handle retries without side effects:
|
||||
|
||||
```java
|
||||
@Retryable(
|
||||
value = {TransientException.class},
|
||||
maxAttempts = 3,
|
||||
backoff = @Backoff(delay = 1000, multiplier = 2)
|
||||
)
|
||||
public void releaseInventory(String itemId, int quantity) {
|
||||
// Use set operations for idempotency
|
||||
InventoryItem item = inventoryRepository.findById(itemId)
|
||||
.orElseThrow();
|
||||
|
||||
item.increaseAvailableQuantity(quantity);
|
||||
inventoryRepository.save(item);
|
||||
}
|
||||
```
|
||||
|
||||
## Compensation Strategies
|
||||
|
||||
### Backward Recovery
|
||||
|
||||
Undo completed steps in reverse order:
|
||||
|
||||
```java
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentFailedEvent event) {
|
||||
logger.error("Payment failed, initiating compensation");
|
||||
|
||||
// Step 1: Cancel shipment preparation
|
||||
commandGateway.send(new CancelShipmentCommand(event.getOrderId()));
|
||||
|
||||
// Step 2: Release inventory
|
||||
commandGateway.send(new ReleaseInventoryCommand(event.getOrderId()));
|
||||
|
||||
// Step 3: Cancel order
|
||||
commandGateway.send(new CancelOrderCommand(event.getOrderId()));
|
||||
|
||||
end();
|
||||
}
|
||||
```
|
||||
|
||||
### Forward Recovery
|
||||
|
||||
Retry failed operation with exponential backoff:
|
||||
|
||||
```java
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentTransientFailureEvent event) {
|
||||
if (event.getRetryCount() < MAX_RETRIES) {
|
||||
// Retry payment with backoff
|
||||
ProcessPaymentCommand retryCommand = new ProcessPaymentCommand(
|
||||
event.getPaymentId(),
|
||||
event.getOrderId(),
|
||||
event.getAmount()
|
||||
);
|
||||
commandGateway.send(retryCommand);
|
||||
} else {
|
||||
// After max retries, compensate
|
||||
handlePaymentFailure(event);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Semantic Lock Pattern
|
||||
|
||||
Prevent concurrent modifications during saga execution:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
public class Order {
|
||||
@Id
|
||||
private String orderId;
|
||||
|
||||
@Enumerated(EnumType.STRING)
|
||||
private OrderStatus status;
|
||||
|
||||
@Version
|
||||
private Long version;
|
||||
|
||||
private Instant lockedUntil;
|
||||
|
||||
public boolean tryLock(Duration lockDuration) {
|
||||
if (isLocked()) {
|
||||
return false;
|
||||
}
|
||||
this.lockedUntil = Instant.now().plus(lockDuration);
|
||||
return true;
|
||||
}
|
||||
|
||||
public boolean isLocked() {
|
||||
return lockedUntil != null &&
|
||||
Instant.now().isBefore(lockedUntil);
|
||||
}
|
||||
|
||||
public void unlock() {
|
||||
this.lockedUntil = null;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Compensation in Axon Framework
|
||||
|
||||
```java
|
||||
@Saga
|
||||
public class OrderSaga {
|
||||
|
||||
private String orderId;
|
||||
private String paymentId;
|
||||
private String inventoryId;
|
||||
private boolean compensating = false;
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(InventoryReservationFailedEvent event) {
|
||||
logger.error("Inventory reservation failed");
|
||||
compensating = true;
|
||||
|
||||
// Compensate: refund payment
|
||||
RefundPaymentCommand refundCommand = new RefundPaymentCommand(
|
||||
paymentId,
|
||||
event.getOrderId(),
|
||||
event.getReservedAmount(),
|
||||
"Inventory unavailable"
|
||||
);
|
||||
|
||||
commandGateway.send(refundCommand);
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentRefundedEvent event) {
|
||||
if (!compensating) return;
|
||||
|
||||
logger.info("Payment refunded, cancelling order");
|
||||
|
||||
// Compensate: cancel order
|
||||
CancelOrderCommand command = new CancelOrderCommand(
|
||||
event.getOrderId(),
|
||||
"Inventory unavailable - payment refunded"
|
||||
);
|
||||
|
||||
commandGateway.send(command);
|
||||
}
|
||||
|
||||
@EndSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCancelledEvent event) {
|
||||
logger.info("Saga completed with compensation");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Handling Compensation Failures
|
||||
|
||||
Handle cases where compensation itself fails:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class CompensationService {
|
||||
|
||||
private final DeadLetterQueueService dlqService;
|
||||
|
||||
public void handleCompensationFailure(String sagaId, String step, Exception cause) {
|
||||
logger.error("Compensation failed for saga {} at step {}", sagaId, step, cause);
|
||||
|
||||
// Send to dead letter queue for manual intervention
|
||||
dlqService.send(new FailedCompensation(
|
||||
sagaId,
|
||||
step,
|
||||
cause.getMessage(),
|
||||
Instant.now()
|
||||
));
|
||||
|
||||
// Create alert for operations team
|
||||
alertingService.alert(
|
||||
"Compensation Failure",
|
||||
"Saga " + sagaId + " failed compensation at " + step
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Compensation
|
||||
|
||||
Verify that compensation produces expected results:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldCompensateWhenPaymentFails() {
|
||||
String orderId = "order-123";
|
||||
String paymentId = "payment-456";
|
||||
|
||||
// Arrange: execute payment
|
||||
Payment payment = new Payment(paymentId, orderId, BigDecimal.TEN);
|
||||
paymentRepository.save(payment);
|
||||
orderRepository.save(new Order(orderId, OrderStatus.PENDING));
|
||||
|
||||
// Act: compensate
|
||||
paymentService.cancelPayment(paymentId);
|
||||
|
||||
// Assert: verify idempotency
|
||||
paymentService.cancelPayment(paymentId);
|
||||
|
||||
Payment result = paymentRepository.findById(paymentId).orElseThrow();
|
||||
assertThat(result.getStatus()).isEqualTo(PaymentStatus.CANCELLED);
|
||||
}
|
||||
```
|
||||
|
||||
## Common Compensation Patterns
|
||||
|
||||
### Inventory Release
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class InventoryService {
|
||||
|
||||
public void releaseInventory(String orderId) {
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
|
||||
order.getItems().forEach(item -> {
|
||||
InventoryItem inventoryItem = inventoryRepository
|
||||
.findById(item.getProductId())
|
||||
.orElseThrow();
|
||||
|
||||
inventoryItem.increaseAvailableQuantity(item.getQuantity());
|
||||
inventoryRepository.save(inventoryItem);
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Payment Refund
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class PaymentService {
|
||||
|
||||
public void refundPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId)
|
||||
.orElseThrow();
|
||||
|
||||
if (payment.getStatus() == PaymentStatus.PROCESSED) {
|
||||
payment.setStatus(PaymentStatus.REFUNDED);
|
||||
paymentGateway.refund(payment.getTransactionId());
|
||||
paymentRepository.save(payment);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Order Cancellation
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
public void cancelOrder(String orderId, String reason) {
|
||||
Order order = orderRepository.findById(orderId)
|
||||
.orElseThrow();
|
||||
|
||||
order.setStatus(OrderStatus.CANCELLED);
|
||||
order.setCancellationReason(reason);
|
||||
order.setCancelledAt(Instant.now());
|
||||
|
||||
orderRepository.save(order);
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,294 @@
|
||||
# State Management in Sagas
|
||||
|
||||
## Saga State Entity
|
||||
|
||||
Persist saga state for recovery and monitoring:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
@Table(name = "saga_state")
|
||||
public class SagaState {
|
||||
|
||||
@Id
|
||||
private String sagaId;
|
||||
|
||||
@Enumerated(EnumType.STRING)
|
||||
private SagaStatus status;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String currentStep;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String compensationSteps;
|
||||
|
||||
private Instant startedAt;
|
||||
private Instant completedAt;
|
||||
|
||||
@Version
|
||||
private Long version;
|
||||
}
|
||||
|
||||
public enum SagaStatus {
|
||||
STARTED,
|
||||
PROCESSING,
|
||||
COMPENSATING,
|
||||
COMPLETED,
|
||||
FAILED,
|
||||
CANCELLED
|
||||
}
|
||||
```
|
||||
|
||||
## Saga State Machine with Spring Statemachine
|
||||
|
||||
Define saga state transitions explicitly:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
@EnableStateMachine
|
||||
public class SagaStateMachineConfig
|
||||
extends StateMachineConfigurerAdapter<SagaStatus, SagaEvent> {
|
||||
|
||||
@Override
|
||||
public void configure(
|
||||
StateMachineStateConfigurer<SagaStatus, SagaEvent> states)
|
||||
throws Exception {
|
||||
|
||||
states
|
||||
.withStates()
|
||||
.initial(SagaStatus.STARTED)
|
||||
.states(EnumSet.allOf(SagaStatus.class))
|
||||
.end(SagaStatus.COMPLETED)
|
||||
.end(SagaStatus.FAILED);
|
||||
}
|
||||
|
||||
@Override
|
||||
public void configure(
|
||||
StateMachineTransitionConfigurer<SagaStatus, SagaEvent> transitions)
|
||||
throws Exception {
|
||||
|
||||
transitions
|
||||
.withExternal()
|
||||
.source(SagaStatus.STARTED)
|
||||
.target(SagaStatus.PROCESSING)
|
||||
.event(SagaEvent.ORDER_CREATED)
|
||||
.and()
|
||||
.withExternal()
|
||||
.source(SagaStatus.PROCESSING)
|
||||
.target(SagaStatus.COMPLETED)
|
||||
.event(SagaEvent.ALL_STEPS_COMPLETED)
|
||||
.and()
|
||||
.withExternal()
|
||||
.source(SagaStatus.PROCESSING)
|
||||
.target(SagaStatus.COMPENSATING)
|
||||
.event(SagaEvent.STEP_FAILED)
|
||||
.and()
|
||||
.withExternal()
|
||||
.source(SagaStatus.COMPENSATING)
|
||||
.target(SagaStatus.FAILED)
|
||||
.event(SagaEvent.COMPENSATION_COMPLETED);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## State Transitions
|
||||
|
||||
### Successful Saga Flow
|
||||
|
||||
```
|
||||
STARTED → PROCESSING → COMPLETED
|
||||
```
|
||||
|
||||
### Failed Saga with Compensation
|
||||
|
||||
```
|
||||
STARTED → PROCESSING → COMPENSATING → FAILED
|
||||
```
|
||||
|
||||
### Saga with Retry
|
||||
|
||||
```
|
||||
STARTED → PROCESSING → PROCESSING (retry) → COMPLETED
|
||||
```
|
||||
|
||||
## Persisting Saga Context
|
||||
|
||||
Store context data for saga execution:
|
||||
|
||||
```java
|
||||
@Entity
|
||||
@Table(name = "saga_context")
|
||||
public class SagaContext {
|
||||
|
||||
@Id
|
||||
private String sagaId;
|
||||
|
||||
@Column(columnDefinition = "TEXT")
|
||||
private String contextData; // JSON-serialized
|
||||
|
||||
private Instant createdAt;
|
||||
private Instant updatedAt;
|
||||
|
||||
public <T> T getContextData(Class<T> type) {
|
||||
return JsonUtils.fromJson(contextData, type);
|
||||
}
|
||||
|
||||
public void setContextData(Object data) {
|
||||
this.contextData = JsonUtils.toJson(data);
|
||||
}
|
||||
}
|
||||
|
||||
@Service
|
||||
public class SagaContextService {
|
||||
|
||||
private final SagaContextRepository repository;
|
||||
|
||||
public void saveContext(String sagaId, Object context) {
|
||||
SagaContext sagaContext = new SagaContext(sagaId);
|
||||
sagaContext.setContextData(context);
|
||||
repository.save(sagaContext);
|
||||
}
|
||||
|
||||
public <T> T loadContext(String sagaId, Class<T> type) {
|
||||
return repository.findById(sagaId)
|
||||
.map(ctx -> ctx.getContextData(type))
|
||||
.orElseThrow(() -> new SagaContextNotFoundException(sagaId));
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Handling Saga Timeouts
|
||||
|
||||
Detect and handle sagas that exceed expected duration:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class SagaTimeoutHandler {
|
||||
|
||||
private final SagaStateRepository repository;
|
||||
private static final Duration MAX_SAGA_DURATION = Duration.ofMinutes(30);
|
||||
|
||||
@Scheduled(fixedDelay = 60000) // Check every minute
|
||||
public void detectTimeouts() {
|
||||
Instant timeout = Instant.now().minus(MAX_SAGA_DURATION);
|
||||
|
||||
List<SagaState> timedOutSagas = repository
|
||||
.findByStatusAndStartedAtBefore(SagaStatus.PROCESSING, timeout);
|
||||
|
||||
timedOutSagas.forEach(saga -> {
|
||||
logger.warn("Saga {} timed out", saga.getSagaId());
|
||||
compensateSaga(saga);
|
||||
});
|
||||
}
|
||||
|
||||
private void compensateSaga(SagaState saga) {
|
||||
saga.setStatus(SagaStatus.COMPENSATING);
|
||||
repository.save(saga);
|
||||
// Trigger compensation logic
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Saga Recovery
|
||||
|
||||
Recover sagas from failures:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class SagaRecoveryService {
|
||||
|
||||
private final SagaStateRepository stateRepository;
|
||||
private final CommandGateway commandGateway;
|
||||
|
||||
@Scheduled(fixedDelay = 30000) // Check every 30 seconds
|
||||
public void recoverFailedSagas() {
|
||||
List<SagaState> failedSagas = stateRepository
|
||||
.findByStatus(SagaStatus.FAILED);
|
||||
|
||||
failedSagas.forEach(saga -> {
|
||||
if (canBeRetried(saga)) {
|
||||
logger.info("Retrying saga {}", saga.getSagaId());
|
||||
retrySaga(saga);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
private boolean canBeRetried(SagaState saga) {
|
||||
return saga.getRetryCount() < 3;
|
||||
}
|
||||
|
||||
private void retrySaga(SagaState saga) {
|
||||
saga.setStatus(SagaStatus.STARTED);
|
||||
saga.setRetryCount(saga.getRetryCount() + 1);
|
||||
stateRepository.save(saga);
|
||||
// Send retry command
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Saga State Query
|
||||
|
||||
Query sagas for monitoring:
|
||||
|
||||
```java
|
||||
@Repository
|
||||
public interface SagaStateRepository extends JpaRepository<SagaState, String> {
|
||||
|
||||
List<SagaState> findByStatus(SagaStatus status);
|
||||
|
||||
List<SagaState> findByStatusAndStartedAtBefore(
|
||||
SagaStatus status, Instant before);
|
||||
|
||||
Page<SagaState> findByStatus(SagaStatus status, Pageable pageable);
|
||||
|
||||
long countByStatus(SagaStatus status);
|
||||
|
||||
long countByStatusAndStartedAtBefore(SagaStatus status, Instant before);
|
||||
}
|
||||
|
||||
@RestController
|
||||
@RequestMapping("/api/sagas")
|
||||
public class SagaMonitoringController {
|
||||
|
||||
private final SagaStateRepository repository;
|
||||
|
||||
@GetMapping("/status/{status}")
|
||||
public List<SagaState> getSagasByStatus(
|
||||
@PathVariable SagaStatus status) {
|
||||
return repository.findByStatus(status);
|
||||
}
|
||||
|
||||
@GetMapping("/stuck")
|
||||
public List<SagaState> getStuckSagas() {
|
||||
Instant oneHourAgo = Instant.now().minus(Duration.ofHours(1));
|
||||
return repository.findByStatusAndStartedAtBefore(
|
||||
SagaStatus.PROCESSING, oneHourAgo);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Database Schema for State Management
|
||||
|
||||
```sql
|
||||
CREATE TABLE saga_state (
|
||||
saga_id VARCHAR(255) PRIMARY KEY,
|
||||
status VARCHAR(50) NOT NULL,
|
||||
current_step TEXT,
|
||||
compensation_steps TEXT,
|
||||
started_at TIMESTAMP NOT NULL,
|
||||
completed_at TIMESTAMP,
|
||||
version BIGINT,
|
||||
INDEX idx_status (status),
|
||||
INDEX idx_started_at (started_at)
|
||||
);
|
||||
|
||||
CREATE TABLE saga_context (
|
||||
saga_id VARCHAR(255) PRIMARY KEY,
|
||||
context_data LONGTEXT,
|
||||
created_at TIMESTAMP NOT NULL,
|
||||
updated_at TIMESTAMP,
|
||||
FOREIGN KEY (saga_id) REFERENCES saga_state(saga_id)
|
||||
);
|
||||
|
||||
CREATE INDEX idx_saga_state_status_started
|
||||
ON saga_state(status, started_at);
|
||||
```
|
||||
@@ -0,0 +1,323 @@
|
||||
# Error Handling and Retry Strategies
|
||||
|
||||
## Retry Configuration
|
||||
|
||||
Use Spring Retry for automatic retry logic:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
@EnableRetry
|
||||
public class RetryConfig {
|
||||
|
||||
@Bean
|
||||
public RetryTemplate retryTemplate() {
|
||||
RetryTemplate retryTemplate = new RetryTemplate();
|
||||
|
||||
FixedBackOffPolicy backOffPolicy = new FixedBackOffPolicy();
|
||||
backOffPolicy.setBackOffPeriod(2000L); // 2 second delay
|
||||
|
||||
ExponentialBackOffPolicy exponentialBackOff = new ExponentialBackOffPolicy();
|
||||
exponentialBackOff.setInitialInterval(1000L);
|
||||
exponentialBackOff.setMultiplier(2.0);
|
||||
exponentialBackOff.setMaxInterval(10000L);
|
||||
|
||||
SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
|
||||
retryPolicy.setMaxAttempts(3);
|
||||
|
||||
retryTemplate.setBackOffPolicy(exponentialBackOff);
|
||||
retryTemplate.setRetryPolicy(retryPolicy);
|
||||
|
||||
return retryTemplate;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Retry with @Retryable
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class OrderService {
|
||||
|
||||
@Retryable(
|
||||
value = {TransientException.class},
|
||||
maxAttempts = 3,
|
||||
backoff = @Backoff(delay = 1000, multiplier = 2)
|
||||
)
|
||||
public void processOrder(String orderId) {
|
||||
// Order processing logic
|
||||
}
|
||||
|
||||
@Recover
|
||||
public void recover(TransientException ex, String orderId) {
|
||||
logger.error("Order processing failed after retries: {}", orderId, ex);
|
||||
// Fallback logic
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Circuit Breaker with Resilience4j
|
||||
|
||||
Prevent cascading failures:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class CircuitBreakerConfig {
|
||||
|
||||
@Bean
|
||||
public CircuitBreakerRegistry circuitBreakerRegistry() {
|
||||
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
|
||||
.failureRateThreshold(50) // Open after 50% failures
|
||||
.waitDurationInOpenState(Duration.ofMillis(1000))
|
||||
.slidingWindowSize(2) // Check last 2 calls
|
||||
.build();
|
||||
|
||||
return CircuitBreakerRegistry.of(config);
|
||||
}
|
||||
}
|
||||
|
||||
@Service
|
||||
public class PaymentService {
|
||||
|
||||
private final CircuitBreaker circuitBreaker;
|
||||
|
||||
public PaymentService(CircuitBreakerRegistry registry) {
|
||||
this.circuitBreaker = registry.circuitBreaker("payment");
|
||||
}
|
||||
|
||||
public PaymentResult processPayment(PaymentRequest request) {
|
||||
return circuitBreaker.executeSupplier(
|
||||
() -> callPaymentGateway(request)
|
||||
);
|
||||
}
|
||||
|
||||
private PaymentResult callPaymentGateway(PaymentRequest request) {
|
||||
// Call external payment gateway
|
||||
return new PaymentResult(...);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Dead Letter Queue
|
||||
|
||||
Handle failed messages:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class DeadLetterQueueConfig {
|
||||
|
||||
@Bean
|
||||
public NewTopic deadLetterTopic() {
|
||||
return new NewTopic("saga-dlq", 1, (short) 1);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class SagaErrorHandler implements ConsumerAwareErrorHandler {
|
||||
|
||||
private final KafkaTemplate<String, Object> kafkaTemplate;
|
||||
|
||||
@Override
|
||||
public void handle(Exception thrownException,
|
||||
List<ConsumerRecord<?, ?>> records,
|
||||
Consumer<?, ?> consumer,
|
||||
MessageListenerContainer container) {
|
||||
|
||||
records.forEach(record -> {
|
||||
logger.error("Processing failed for message: {}", record.key());
|
||||
kafkaTemplate.send("saga-dlq", record.key(), record.value());
|
||||
});
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Timeout Handling
|
||||
|
||||
Define and enforce timeout policies:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class TimeoutHandler {
|
||||
|
||||
private final SagaStateRepository sagaStateRepository;
|
||||
private static final Duration STEP_TIMEOUT = Duration.ofSeconds(30);
|
||||
|
||||
@Scheduled(fixedDelay = 5000)
|
||||
public void checkForTimeouts() {
|
||||
Instant timeoutThreshold = Instant.now().minus(STEP_TIMEOUT);
|
||||
|
||||
List<SagaState> timedOutSagas = sagaStateRepository
|
||||
.findByStatusAndUpdatedAtBefore(SagaStatus.PROCESSING, timeoutThreshold);
|
||||
|
||||
timedOutSagas.forEach(saga -> {
|
||||
logger.warn("Saga {} timed out at step {}",
|
||||
saga.getSagaId(), saga.getCurrentStep());
|
||||
compensateSaga(saga);
|
||||
});
|
||||
}
|
||||
|
||||
private void compensateSaga(SagaState saga) {
|
||||
saga.setStatus(SagaStatus.COMPENSATING);
|
||||
sagaStateRepository.save(saga);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Exponential Backoff
|
||||
|
||||
Prevent overwhelming downstream services:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class BackoffService {
|
||||
|
||||
public Duration calculateBackoff(int attemptNumber) {
|
||||
long baseDelay = 1000; // 1 second
|
||||
long delay = baseDelay * (long) Math.pow(2, attemptNumber - 1);
|
||||
long maxDelay = 30000; // 30 seconds
|
||||
|
||||
return Duration.ofMillis(Math.min(delay, maxDelay));
|
||||
}
|
||||
|
||||
@Retryable(
|
||||
value = {ServiceUnavailableException.class},
|
||||
maxAttempts = 5,
|
||||
backoff = @Backoff(
|
||||
delay = 1000,
|
||||
multiplier = 2.0,
|
||||
maxDelay = 30000
|
||||
)
|
||||
)
|
||||
public void callExternalService() {
|
||||
// External service call
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Idempotent Retry
|
||||
|
||||
Ensure retries don't cause duplicate processing:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class IdempotentPaymentService {
|
||||
|
||||
private final PaymentRepository paymentRepository;
|
||||
private final Map<String, PaymentResult> processedPayments = new ConcurrentHashMap<>();
|
||||
|
||||
public PaymentResult processPayment(String paymentId, BigDecimal amount) {
|
||||
// Check if already processed
|
||||
if (processedPayments.containsKey(paymentId)) {
|
||||
return processedPayments.get(paymentId);
|
||||
}
|
||||
|
||||
// Check database
|
||||
Optional<Payment> existing = paymentRepository.findById(paymentId);
|
||||
if (existing.isPresent()) {
|
||||
return new PaymentResult(existing.get());
|
||||
}
|
||||
|
||||
// Process payment
|
||||
PaymentResult result = callPaymentGateway(paymentId, amount);
|
||||
|
||||
// Cache and persist
|
||||
processedPayments.put(paymentId, result);
|
||||
paymentRepository.save(new Payment(paymentId, amount, result.getStatus()));
|
||||
|
||||
return result;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Global Exception Handler
|
||||
|
||||
Centralize error handling:
|
||||
|
||||
```java
|
||||
@RestControllerAdvice
|
||||
public class GlobalExceptionHandler {
|
||||
|
||||
@ExceptionHandler(SagaExecutionException.class)
|
||||
public ResponseEntity<ErrorResponse> handleSagaError(
|
||||
SagaExecutionException ex) {
|
||||
|
||||
return ResponseEntity
|
||||
.status(HttpStatus.UNPROCESSABLE_ENTITY)
|
||||
.body(new ErrorResponse(
|
||||
"SAGA_EXECUTION_FAILED",
|
||||
ex.getMessage(),
|
||||
ex.getSagaId()
|
||||
));
|
||||
}
|
||||
|
||||
@ExceptionHandler(ServiceUnavailableException.class)
|
||||
public ResponseEntity<ErrorResponse> handleServiceUnavailable(
|
||||
ServiceUnavailableException ex) {
|
||||
|
||||
return ResponseEntity
|
||||
.status(HttpStatus.SERVICE_UNAVAILABLE)
|
||||
.body(new ErrorResponse(
|
||||
"SERVICE_UNAVAILABLE",
|
||||
"Required service is temporarily unavailable"
|
||||
));
|
||||
}
|
||||
|
||||
@ExceptionHandler(TimeoutException.class)
|
||||
public ResponseEntity<ErrorResponse> handleTimeout(
|
||||
TimeoutException ex) {
|
||||
|
||||
return ResponseEntity
|
||||
.status(HttpStatus.REQUEST_TIMEOUT)
|
||||
.body(new ErrorResponse(
|
||||
"REQUEST_TIMEOUT",
|
||||
"Request timed out after " + ex.getDuration()
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
public record ErrorResponse(
|
||||
String code,
|
||||
String message,
|
||||
String details
|
||||
) {
|
||||
public ErrorResponse(String code, String message) {
|
||||
this(code, message, null);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Monitoring Error Rates
|
||||
|
||||
Track failure metrics:
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class SagaErrorMetrics {
|
||||
|
||||
private final MeterRegistry meterRegistry;
|
||||
|
||||
public SagaErrorMetrics(MeterRegistry meterRegistry) {
|
||||
this.meterRegistry = meterRegistry;
|
||||
}
|
||||
|
||||
public void recordSagaFailure(String sagaType) {
|
||||
Counter.builder("saga.failure")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
|
||||
public void recordRetry(String sagaType) {
|
||||
Counter.builder("saga.retry")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
|
||||
public void recordTimeout(String sagaType) {
|
||||
Counter.builder("saga.timeout")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,320 @@
|
||||
# Testing Strategies for Sagas
|
||||
|
||||
## Unit Testing Saga Logic
|
||||
|
||||
Test saga behavior with Axon test fixtures:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldDispatchPaymentCommandWhenOrderCreated() {
|
||||
// Arrange
|
||||
String orderId = UUID.randomUUID().toString();
|
||||
String paymentId = UUID.randomUUID().toString();
|
||||
|
||||
SagaTestFixture<OrderSaga> fixture = new SagaTestFixture<>(OrderSaga.class);
|
||||
|
||||
// Act & Assert
|
||||
fixture
|
||||
.givenNoPriorActivity()
|
||||
.whenPublishingA(new OrderCreatedEvent(orderId, BigDecimal.TEN, "item-1"))
|
||||
.expectDispatchedCommands(new ProcessPaymentCommand(paymentId, orderId, BigDecimal.TEN));
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldCompensateWhenPaymentFails() {
|
||||
String orderId = UUID.randomUUID().toString();
|
||||
String paymentId = UUID.randomUUID().toString();
|
||||
|
||||
SagaTestFixture<OrderSaga> fixture = new SagaTestFixture<>(OrderSaga.class);
|
||||
|
||||
fixture
|
||||
.givenNoPriorActivity()
|
||||
.whenPublishingA(new OrderCreatedEvent(orderId, BigDecimal.TEN, "item-1"))
|
||||
.whenPublishingA(new PaymentFailedEvent(paymentId, orderId, "item-1", "Insufficient funds"))
|
||||
.expectDispatchedCommands(new CancelOrderCommand(orderId))
|
||||
.expectScheduledEventOfType(OrderSaga.class, null);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Event Publishing
|
||||
|
||||
Verify events are published correctly:
|
||||
|
||||
```java
|
||||
@SpringBootTest
|
||||
@WebMvcTest
|
||||
class OrderServiceTest {
|
||||
|
||||
@MockBean
|
||||
private EventPublisher eventPublisher;
|
||||
|
||||
@InjectMocks
|
||||
private OrderService orderService;
|
||||
|
||||
@Test
|
||||
void shouldPublishOrderCreatedEvent() {
|
||||
// Arrange
|
||||
CreateOrderRequest request = new CreateOrderRequest("cust-1", BigDecimal.TEN);
|
||||
|
||||
// Act
|
||||
String orderId = orderService.createOrder(request);
|
||||
|
||||
// Assert
|
||||
verify(eventPublisher).publish(
|
||||
argThat(event -> event instanceof OrderCreatedEvent &&
|
||||
((OrderCreatedEvent) event).orderId().equals(orderId))
|
||||
);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Integration Testing with Testcontainers
|
||||
|
||||
Test complete saga flow with real services:
|
||||
|
||||
```java
|
||||
@SpringBootTest
|
||||
@Testcontainers
|
||||
class SagaIntegrationTest {
|
||||
|
||||
@Container
|
||||
static KafkaContainer kafka = new KafkaContainer(
|
||||
DockerImageName.parse("confluentinc/cp-kafka:7.4.0")
|
||||
);
|
||||
|
||||
@Container
|
||||
static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>(
|
||||
"postgres:15-alpine"
|
||||
);
|
||||
|
||||
@DynamicPropertySource
|
||||
static void overrideProperties(DynamicPropertyRegistry registry) {
|
||||
registry.add("spring.kafka.bootstrap-servers", kafka::getBootstrapServers);
|
||||
registry.add("spring.datasource.url", postgres::getJdbcUrl);
|
||||
registry.add("spring.datasource.username", postgres::getUsername);
|
||||
registry.add("spring.datasource.password", postgres::getPassword);
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldCompleteOrderSagaSuccessfully(@Autowired OrderService orderService,
|
||||
@Autowired OrderRepository orderRepository,
|
||||
@Autowired EventPublisher eventPublisher) {
|
||||
// Arrange
|
||||
CreateOrderRequest request = new CreateOrderRequest("cust-1", BigDecimal.TEN);
|
||||
|
||||
// Act
|
||||
String orderId = orderService.createOrder(request);
|
||||
|
||||
// Wait for async processing
|
||||
Thread.sleep(2000);
|
||||
|
||||
// Assert
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
assertThat(order.getStatus()).isEqualTo(OrderStatus.COMPLETED);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Idempotency
|
||||
|
||||
Verify operations produce same results on retry:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void compensationShouldBeIdempotent() {
|
||||
// Arrange
|
||||
String paymentId = "payment-123";
|
||||
Payment payment = new Payment(paymentId, "order-1", BigDecimal.TEN);
|
||||
paymentRepository.save(payment);
|
||||
|
||||
// Act - First compensation
|
||||
paymentService.cancelPayment(paymentId);
|
||||
Payment firstResult = paymentRepository.findById(paymentId).orElseThrow();
|
||||
|
||||
// Act - Second compensation (should be idempotent)
|
||||
paymentService.cancelPayment(paymentId);
|
||||
Payment secondResult = paymentRepository.findById(paymentId).orElseThrow();
|
||||
|
||||
// Assert
|
||||
assertThat(firstResult).isEqualTo(secondResult);
|
||||
assertThat(secondResult.getStatus()).isEqualTo(PaymentStatus.CANCELLED);
|
||||
assertThat(secondResult.getVersion()).isEqualTo(firstResult.getVersion());
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Concurrent Sagas
|
||||
|
||||
Verify saga isolation under concurrent execution:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldHandleConcurrentSagaExecutions() throws InterruptedException {
|
||||
// Arrange
|
||||
int numThreads = 10;
|
||||
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
|
||||
CountDownLatch latch = new CountDownLatch(numThreads);
|
||||
|
||||
// Act
|
||||
for (int i = 0; i < numThreads; i++) {
|
||||
final int index = i;
|
||||
executor.submit(() -> {
|
||||
try {
|
||||
CreateOrderRequest request = new CreateOrderRequest(
|
||||
"cust-" + index,
|
||||
BigDecimal.TEN.multiply(BigDecimal.valueOf(index))
|
||||
);
|
||||
orderService.createOrder(request);
|
||||
} finally {
|
||||
latch.countDown();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
latch.await(10, TimeUnit.SECONDS);
|
||||
|
||||
// Assert
|
||||
long createdOrders = orderRepository.count();
|
||||
assertThat(createdOrders).isEqualTo(numThreads);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing Failure Scenarios
|
||||
|
||||
Test each failure path and compensation:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldCompensateWhenInventoryUnavailable() {
|
||||
// Arrange
|
||||
String orderId = UUID.randomUUID().toString();
|
||||
inventoryService.setAvailability("item-1", 0); // No inventory
|
||||
|
||||
// Act
|
||||
String result = orderService.createOrder(
|
||||
new CreateOrderRequest("cust-1", BigDecimal.TEN)
|
||||
);
|
||||
|
||||
// Wait for saga completion
|
||||
Thread.sleep(2000);
|
||||
|
||||
// Assert
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
assertThat(order.getStatus()).isEqualTo(OrderStatus.CANCELLED);
|
||||
|
||||
// Verify payment was refunded
|
||||
Payment payment = paymentRepository.findByOrderId(orderId).orElseThrow();
|
||||
assertThat(payment.getStatus()).isEqualTo(PaymentStatus.REFUNDED);
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldHandlePaymentGatewayFailure() {
|
||||
// Arrange
|
||||
paymentGateway.setFailureRate(1.0); // 100% failure
|
||||
|
||||
// Act
|
||||
String orderId = orderService.createOrder(
|
||||
new CreateOrderRequest("cust-1", BigDecimal.TEN)
|
||||
);
|
||||
|
||||
// Wait for saga completion
|
||||
Thread.sleep(2000);
|
||||
|
||||
// Assert
|
||||
Order order = orderRepository.findById(orderId).orElseThrow();
|
||||
assertThat(order.getStatus()).isEqualTo(OrderStatus.CANCELLED);
|
||||
}
|
||||
```
|
||||
|
||||
## Testing State Machine
|
||||
|
||||
Verify state transitions:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldTransitionStatesProperly() {
|
||||
// Arrange
|
||||
String sagaId = UUID.randomUUID().toString();
|
||||
SagaState sagaState = new SagaState(sagaId, SagaStatus.STARTED);
|
||||
sagaStateRepository.save(sagaState);
|
||||
|
||||
// Act & Assert
|
||||
assertThat(sagaState.getStatus()).isEqualTo(SagaStatus.STARTED);
|
||||
|
||||
sagaState.setStatus(SagaStatus.PROCESSING);
|
||||
sagaStateRepository.save(sagaState);
|
||||
assertThat(sagaStateRepository.findById(sagaId).get().getStatus())
|
||||
.isEqualTo(SagaStatus.PROCESSING);
|
||||
|
||||
sagaState.setStatus(SagaStatus.COMPLETED);
|
||||
sagaStateRepository.save(sagaState);
|
||||
assertThat(sagaStateRepository.findById(sagaId).get().getStatus())
|
||||
.isEqualTo(SagaStatus.COMPLETED);
|
||||
}
|
||||
```
|
||||
|
||||
## Test Data Builders
|
||||
|
||||
Use builders for cleaner test code:
|
||||
|
||||
```java
|
||||
public class OrderRequestBuilder {
|
||||
|
||||
private String customerId = "cust-default";
|
||||
private BigDecimal totalAmount = BigDecimal.TEN;
|
||||
private List<OrderItem> items = new ArrayList<>();
|
||||
|
||||
public OrderRequestBuilder withCustomerId(String customerId) {
|
||||
this.customerId = customerId;
|
||||
return this;
|
||||
}
|
||||
|
||||
public OrderRequestBuilder withAmount(BigDecimal amount) {
|
||||
this.totalAmount = amount;
|
||||
return this;
|
||||
}
|
||||
|
||||
public OrderRequestBuilder withItem(String productId, int quantity) {
|
||||
items.add(new OrderItem(productId, "Product", quantity, BigDecimal.TEN));
|
||||
return this;
|
||||
}
|
||||
|
||||
public CreateOrderRequest build() {
|
||||
return new CreateOrderRequest(customerId, totalAmount, items);
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void shouldCreateOrderWithCustomization() {
|
||||
CreateOrderRequest request = new OrderRequestBuilder()
|
||||
.withCustomerId("customer-123")
|
||||
.withAmount(BigDecimal.valueOf(50))
|
||||
.withItem("product-1", 2)
|
||||
.withItem("product-2", 1)
|
||||
.build();
|
||||
|
||||
String orderId = orderService.createOrder(request);
|
||||
assertThat(orderId).isNotNull();
|
||||
}
|
||||
```
|
||||
|
||||
## Performance Testing
|
||||
|
||||
Measure saga execution time:
|
||||
|
||||
```java
|
||||
@Test
|
||||
void shouldCompleteOrderSagaWithinTimeLimit() {
|
||||
// Arrange
|
||||
CreateOrderRequest request = new CreateOrderRequest("cust-1", BigDecimal.TEN);
|
||||
long maxDurationMs = 5000; // 5 seconds
|
||||
|
||||
// Act
|
||||
Instant start = Instant.now();
|
||||
String orderId = orderService.createOrder(request);
|
||||
Instant end = Instant.now();
|
||||
|
||||
// Assert
|
||||
long duration = Duration.between(start, end).toMillis();
|
||||
assertThat(duration).isLessThan(maxDurationMs);
|
||||
}
|
||||
```
|
||||
@@ -0,0 +1,471 @@
|
||||
# Common Pitfalls and Solutions
|
||||
|
||||
## Pitfall 1: Lost Messages
|
||||
|
||||
### Problem
|
||||
Messages get lost due to broker failures, network issues, or consumer crashes before acknowledgment.
|
||||
|
||||
### Solution
|
||||
Use persistent messages with acknowledgments:
|
||||
|
||||
```java
|
||||
@Bean
|
||||
public ProducerFactory<String, Object> producerFactory() {
|
||||
Map<String, Object> config = new HashMap<>();
|
||||
config.put(ProducerConfig.ACKS_CONFIG, "all"); // All replicas must acknowledge
|
||||
config.put(ProducerConfig.RETRIES_CONFIG, 3); // Retry failed sends
|
||||
config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // Prevent duplicates
|
||||
return new DefaultKafkaProducerFactory<>(config);
|
||||
}
|
||||
|
||||
@Bean
|
||||
public ConsumerFactory<String, Object> consumerFactory() {
|
||||
Map<String, Object> config = new HashMap<>();
|
||||
config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false); // Manual commit
|
||||
return new DefaultKafkaConsumerFactory<>(config);
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Configure producer to wait for all replicas (`acks=all`)
|
||||
- ✓ Enable idempotence to prevent duplicate messages
|
||||
- ✓ Use manual commit for consumers
|
||||
- ✓ Monitor message lag and broker health
|
||||
- ✓ Use transactional outbox pattern
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 2: Duplicate Processing
|
||||
|
||||
### Problem
|
||||
Same message processed multiple times due to failed acknowledgments or retries, causing side effects.
|
||||
|
||||
### Solution
|
||||
Implement idempotency with deduplication:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class DeduplicationService {
|
||||
|
||||
private final DeduplicationRepository repository;
|
||||
|
||||
public boolean isDuplicate(String messageId) {
|
||||
return repository.existsById(messageId);
|
||||
}
|
||||
|
||||
public void recordProcessed(String messageId) {
|
||||
DeduplicatedMessage entry = new DeduplicatedMessage(
|
||||
messageId,
|
||||
Instant.now()
|
||||
);
|
||||
repository.save(entry);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class PaymentEventListener {
|
||||
|
||||
private final DeduplicationService deduplicationService;
|
||||
private final PaymentService paymentService;
|
||||
|
||||
@Bean
|
||||
public Consumer<PaymentEvent> handlePaymentEvent() {
|
||||
return event -> {
|
||||
String messageId = event.getMessageId();
|
||||
|
||||
if (deduplicationService.isDuplicate(messageId)) {
|
||||
logger.info("Duplicate message ignored: {}", messageId);
|
||||
return;
|
||||
}
|
||||
|
||||
paymentService.processPayment(event);
|
||||
deduplicationService.recordProcessed(messageId);
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Add unique message ID to all events
|
||||
- ✓ Implement deduplication cache/database
|
||||
- ✓ Make all operations idempotent
|
||||
- ✓ Use version control for entity updates
|
||||
- ✓ Test with message replay
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 3: Saga State Inconsistency
|
||||
|
||||
### Problem
|
||||
Saga state in database doesn't match actual service states, leading to orphaned or stuck sagas.
|
||||
|
||||
### Solution
|
||||
Use event sourcing or state reconciliation:
|
||||
|
||||
```java
|
||||
@Service
|
||||
public class SagaStateReconciler {
|
||||
|
||||
private final SagaStateRepository stateRepository;
|
||||
private final OrderRepository orderRepository;
|
||||
private final PaymentRepository paymentRepository;
|
||||
|
||||
@Scheduled(fixedDelay = 60000) // Run every minute
|
||||
public void reconcileSagaStates() {
|
||||
List<SagaState> processingSagas = stateRepository
|
||||
.findByStatus(SagaStatus.PROCESSING);
|
||||
|
||||
processingSagas.forEach(saga -> {
|
||||
if (isActuallyCompleted(saga)) {
|
||||
logger.info("Reconciling saga {} - marking as completed", saga.getSagaId());
|
||||
saga.setStatus(SagaStatus.COMPLETED);
|
||||
saga.setCompletedAt(Instant.now());
|
||||
stateRepository.save(saga);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
private boolean isActuallyCompleted(SagaState saga) {
|
||||
String orderId = saga.getSagaId();
|
||||
|
||||
Order order = orderRepository.findById(orderId).orElse(null);
|
||||
if (order == null || order.getStatus() != OrderStatus.COMPLETED) {
|
||||
return false;
|
||||
}
|
||||
|
||||
Payment payment = paymentRepository.findByOrderId(orderId).orElse(null);
|
||||
if (payment == null || payment.getStatus() != PaymentStatus.PROCESSED) {
|
||||
return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Use event sourcing for complete audit trail
|
||||
- ✓ Implement state reconciliation job
|
||||
- ✓ Add health checks for saga coordinator
|
||||
- ✓ Monitor saga state transitions
|
||||
- ✓ Persist compensation steps
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 4: Orchestrator Single Point of Failure
|
||||
|
||||
### Problem
|
||||
Orchestration-based saga fails when orchestrator is down, blocking all sagas.
|
||||
|
||||
### Solution
|
||||
Implement clustering and failover:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class SagaOrchestratorClusterConfig {
|
||||
|
||||
@Bean
|
||||
public SagaStateRepository sagaStateRepository() {
|
||||
// Use shared database for cluster-wide state
|
||||
return new DatabaseSagaStateRepository();
|
||||
}
|
||||
|
||||
@Bean
|
||||
@Primary
|
||||
public CommandGateway clusterAwareCommandGateway(
|
||||
CommandBus commandBus) {
|
||||
|
||||
return new ClusterAwareCommandGateway(commandBus);
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class OrchestratorHealthCheck extends AbstractHealthIndicator {
|
||||
|
||||
private final SagaStateRepository repository;
|
||||
|
||||
@Override
|
||||
protected void doHealthCheck(Health.Builder builder) {
|
||||
long stuckSagas = repository.countStuckSagas(Duration.ofMinutes(30));
|
||||
|
||||
if (stuckSagas > 100) {
|
||||
builder.down()
|
||||
.withDetail("stuckSagas", stuckSagas)
|
||||
.withDetail("severity", "critical");
|
||||
} else if (stuckSagas > 10) {
|
||||
builder.degraded()
|
||||
.withDetail("stuckSagas", stuckSagas)
|
||||
.withDetail("severity", "warning");
|
||||
} else {
|
||||
builder.up()
|
||||
.withDetail("stuckSagas", stuckSagas);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Deploy orchestrator in cluster with shared state
|
||||
- ✓ Use distributed coordination (ZooKeeper, Consul)
|
||||
- ✓ Implement heartbeat monitoring
|
||||
- ✓ Set up automatic failover
|
||||
- ✓ Use circuit breakers for service calls
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 5: Non-Idempotent Compensations
|
||||
|
||||
### Problem
|
||||
Compensation logic fails on retry because it's not idempotent, leaving system in inconsistent state.
|
||||
|
||||
### Solution
|
||||
Design all compensations to be idempotent:
|
||||
|
||||
```java
|
||||
// Bad - Not idempotent
|
||||
@Service
|
||||
public class BadPaymentService {
|
||||
public void refundPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId).orElseThrow();
|
||||
payment.setStatus(PaymentStatus.REFUNDED);
|
||||
paymentRepository.save(payment);
|
||||
|
||||
// If this fails partway, retry causes problems
|
||||
externalPaymentGateway.refund(payment.getTransactionId());
|
||||
}
|
||||
}
|
||||
|
||||
// Good - Idempotent
|
||||
@Service
|
||||
public class GoodPaymentService {
|
||||
public void refundPayment(String paymentId) {
|
||||
Payment payment = paymentRepository.findById(paymentId)
|
||||
.orElse(null);
|
||||
|
||||
if (payment == null) {
|
||||
// Already deleted or doesn't exist
|
||||
logger.info("Payment {} not found, skipping refund", paymentId);
|
||||
return;
|
||||
}
|
||||
|
||||
if (payment.getStatus() == PaymentStatus.REFUNDED) {
|
||||
// Already refunded
|
||||
logger.info("Payment {} already refunded", paymentId);
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
externalPaymentGateway.refund(payment.getTransactionId());
|
||||
payment.setStatus(PaymentStatus.REFUNDED);
|
||||
paymentRepository.save(payment);
|
||||
} catch (Exception e) {
|
||||
logger.error("Refund failed, will retry", e);
|
||||
throw e;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Check current state before making changes
|
||||
- ✓ Use status flags to track compensation completion
|
||||
- ✓ Make database updates idempotent
|
||||
- ✓ Test compensation with replays
|
||||
- ✓ Document compensation logic
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 6: Missing Timeouts
|
||||
|
||||
### Problem
|
||||
Sagas hang indefinitely waiting for events that never arrive due to service failures.
|
||||
|
||||
### Solution
|
||||
Implement timeout mechanisms:
|
||||
|
||||
```java
|
||||
@Configuration
|
||||
public class SagaTimeoutConfig {
|
||||
|
||||
@Bean
|
||||
public SagaLifecycle sagaLifecycle(SagaStateRepository repository) {
|
||||
return new SagaLifecycle() {
|
||||
@Override
|
||||
public void onSagaFinished(Saga saga) {
|
||||
// Update saga state
|
||||
}
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
@Saga
|
||||
public class OrderSaga {
|
||||
|
||||
@Autowired
|
||||
private transient CommandGateway commandGateway;
|
||||
|
||||
private String orderId;
|
||||
private String paymentId;
|
||||
private DeadlineManager deadlineManager;
|
||||
|
||||
@StartSaga
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(OrderCreatedEvent event) {
|
||||
this.orderId = event.orderId();
|
||||
|
||||
// Schedule timeout for payment processing
|
||||
deadlineManager.scheduleDeadline(
|
||||
Duration.ofSeconds(30),
|
||||
"PaymentTimeout",
|
||||
orderId
|
||||
);
|
||||
|
||||
commandGateway.send(new ProcessPaymentCommand(...));
|
||||
}
|
||||
|
||||
@DeadlineHandler(deadlineName = "PaymentTimeout")
|
||||
public void handlePaymentTimeout() {
|
||||
logger.warn("Payment processing timed out for order {}", orderId);
|
||||
|
||||
// Compensate
|
||||
commandGateway.send(new CancelOrderCommand(orderId));
|
||||
end();
|
||||
}
|
||||
|
||||
@SagaEventHandler(associationProperty = "orderId")
|
||||
public void handle(PaymentProcessedEvent event) {
|
||||
// Cancel timeout
|
||||
deadlineManager.cancelDeadline("PaymentTimeout", orderId);
|
||||
// Continue saga...
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Set timeout for each saga step
|
||||
- ✓ Use deadline manager to track timeouts
|
||||
- ✓ Cancel timeouts when step completes
|
||||
- ✓ Log timeout events
|
||||
- ✓ Alert operations on repeated timeouts
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 7: Tight Coupling Between Services
|
||||
|
||||
### Problem
|
||||
Saga logic couples services tightly, making independent deployment impossible.
|
||||
|
||||
### Solution
|
||||
Use event-driven communication:
|
||||
|
||||
```java
|
||||
// Bad - Tight coupling
|
||||
@Service
|
||||
public class TightlyAgedOrderService {
|
||||
public void createOrder(OrderRequest request) {
|
||||
Order order = orderRepository.save(new Order(...));
|
||||
|
||||
// Direct coupling to payment service
|
||||
paymentService.processPayment(order.getId(), request.getAmount());
|
||||
}
|
||||
}
|
||||
|
||||
// Good - Event-driven
|
||||
@Service
|
||||
public class LooselyAgedOrderService {
|
||||
public void createOrder(OrderRequest request) {
|
||||
Order order = orderRepository.save(new Order(...));
|
||||
|
||||
// Publish event - services listen independently
|
||||
eventPublisher.publish(new OrderCreatedEvent(
|
||||
order.getId(),
|
||||
request.getAmount()
|
||||
));
|
||||
}
|
||||
}
|
||||
|
||||
@Component
|
||||
public class PaymentServiceListener {
|
||||
|
||||
@Bean
|
||||
public Consumer<OrderCreatedEvent> handleOrderCreated() {
|
||||
return event -> {
|
||||
// Payment service can be deployed independently
|
||||
paymentService.processPayment(
|
||||
event.orderId(),
|
||||
event.amount()
|
||||
);
|
||||
};
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Use events for inter-service communication
|
||||
- ✓ Avoid direct service-to-service calls
|
||||
- ✓ Define clear contracts for events
|
||||
- ✓ Version events for backward compatibility
|
||||
- ✓ Deploy services independently
|
||||
|
||||
---
|
||||
|
||||
## Pitfall 8: Inadequate Monitoring
|
||||
|
||||
### Problem
|
||||
Sagas fail silently or get stuck without visibility, making troubleshooting impossible.
|
||||
|
||||
### Solution
|
||||
Implement comprehensive monitoring:
|
||||
|
||||
```java
|
||||
@Component
|
||||
public class SagaMonitoring {
|
||||
|
||||
private final MeterRegistry meterRegistry;
|
||||
|
||||
@Bean
|
||||
public MeterBinder sagaMetrics(SagaStateRepository repository) {
|
||||
return (registry) -> {
|
||||
Gauge.builder("saga.active", repository::countByStatus)
|
||||
.description("Number of active sagas")
|
||||
.register(registry);
|
||||
|
||||
Gauge.builder("saga.stuck", () ->
|
||||
repository.countStuckSagas(Duration.ofMinutes(30)))
|
||||
.description("Number of stuck sagas")
|
||||
.register(registry);
|
||||
};
|
||||
}
|
||||
|
||||
public void recordSagaStart(String sagaType) {
|
||||
Counter.builder("saga.started")
|
||||
.tag("type", sagaType)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
|
||||
public void recordSagaCompletion(String sagaType, long durationMs) {
|
||||
Timer.builder("saga.duration")
|
||||
.tag("type", sagaType)
|
||||
.publishPercentiles(0.5, 0.95, 0.99)
|
||||
.register(meterRegistry)
|
||||
.record(Duration.ofMillis(durationMs));
|
||||
}
|
||||
|
||||
public void recordSagaFailure(String sagaType, String reason) {
|
||||
Counter.builder("saga.failed")
|
||||
.tag("type", sagaType)
|
||||
.tag("reason", reason)
|
||||
.register(meterRegistry)
|
||||
.increment();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Prevention Checklist
|
||||
- ✓ Track saga state transitions
|
||||
- ✓ Monitor step execution times
|
||||
- ✓ Alert on stuck sagas
|
||||
- ✓ Log all failures with details
|
||||
- ✓ Use distributed tracing (Sleuth, Zipkin)
|
||||
- ✓ Create dashboards for visibility
|
||||
1537
skills/spring-boot/spring-boot-saga-pattern/references/examples.md
Normal file
1537
skills/spring-boot/spring-boot-saga-pattern/references/examples.md
Normal file
File diff suppressed because it is too large
Load Diff
1264
skills/spring-boot/spring-boot-saga-pattern/references/reference.md
Normal file
1264
skills/spring-boot/spring-boot-saga-pattern/references/reference.md
Normal file
File diff suppressed because it is too large
Load Diff
Reference in New Issue
Block a user