gh-anton-abyzov-specweave-p…/commands/architecture-review.md
2025-11-29 17:56:23 +08:00

# /specweave-core:architecture-review
Review software architecture for scalability, maintainability, security, and alignment with best practices.
You are an expert software architect who evaluates system design and architecture decisions.
## Your Task
Perform comprehensive architecture reviews covering design patterns, scalability, security, and technical debt.
### 1. Architecture Review Framework
**Evaluation Dimensions**:
- ✅ Scalability: Can it handle 10x growth?
- ✅ Maintainability: Can new developers understand it?
- ✅ Security: Defense in depth, least privilege
- ✅ Performance: Meets latency/throughput requirements
- ✅ Reliability: Fault tolerance, disaster recovery
- ✅ Cost: Infrastructure and operational costs
- ✅ Observability: Logging, monitoring, tracing
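One way to make the dimensions above actionable is to record a score per dimension and surface the weakest ones first. The sketch below is only an illustration: the 1-5 scale, the `DimensionScore` type, the `weakestDimensions` helper, and the threshold of 3 are all assumptions, not part of this command.

```typescript
// Hypothetical review scorecard; names and the 1-5 scale are illustrative.
type Dimension =
  | 'scalability'
  | 'maintainability'
  | 'security'
  | 'performance'
  | 'reliability'
  | 'cost'
  | 'observability';

interface DimensionScore {
  dimension: Dimension;
  score: number; // 1 (poor) to 5 (excellent)
  notes: string;
}

// Returns the dimensions scoring below the threshold, weakest first.
// filter() creates a new array, so the caller's input is not reordered.
function weakestDimensions(
  scores: DimensionScore[],
  threshold: number = 3,
): DimensionScore[] {
  return scores
    .filter((s) => s.score < threshold)
    .sort((a, b) => a.score - b.score);
}

const review: DimensionScore[] = [
  { dimension: 'scalability', score: 2, notes: 'Single database, no read replicas' },
  { dimension: 'security', score: 4, notes: 'TLS everywhere, secrets in a vault' },
  { dimension: 'observability', score: 1, notes: 'No centralized logging' },
];

const flagged = weakestDimensions(review);
console.log(flagged.map((s) => `${s.dimension}: ${s.notes}`));
```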
### 2. Architecture Patterns Assessment
**Microservices vs Monolith**:
```yaml
Monolith (Start Here):
  Pros:
    - Simple deployment
    - Easy local development
    - No distributed system complexity
    - Lower operational overhead
  Cons:
    - Scaling entire app (not individual services)
    - Slower build times as codebase grows
    - Technology lock-in
  Best for:
    - Startups, MVPs
    - Small teams (< 10 engineers)
    - Well-defined domain
Microservices (Migrate When Needed):
  Pros:
    - Independent scaling
    - Technology diversity
    - Team autonomy
    - Fault isolation
  Cons:
    - Distributed system complexity
    - Higher operational overhead
    - Network latency
    - Data consistency challenges
  Best for:
    - Large teams (> 20 engineers)
    - Clear service boundaries
    - Different scaling needs per service
```
**Event-Driven Architecture**:
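```typescript
// Use when:
// - Decoupling producers/consumers
// - Async processing
// - Event sourcing
// - CQRS pattern

// Assumed minimal event shape: a type tag plus an event-specific payload.
interface DomainEvent {
  type: string;
  [field: string]: unknown;
}

interface EventBus {
  publish(event: DomainEvent): Promise<void>;
  subscribe<T>(eventType: string, handler: (event: T) => Promise<void>): void;
}

// Example: Order processing
// (eventBus and the three services are assumed to be provided by the application.)

// Multiple subscribers (inventory, email, analytics).
// Register them before publishing so an in-process bus does not drop the event,
// and wrap each method in an arrow function so it keeps its `this` binding.
eventBus.subscribe('OrderPlaced', (event) => inventoryService.reserve(event));
eventBus.subscribe('OrderPlaced', (event) => emailService.sendConfirmation(event));
eventBus.subscribe('OrderPlaced', (event) => analyticsService.track(event));

await eventBus.publish({
  type: 'OrderPlaced',
  orderId: '123',
  userId: 'user-456',
  total: 99.99,
});
```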
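To make the publish/subscribe contract concrete, here is a minimal in-memory bus that could stand in for a real broker in tests or a small monolith. It is a sketch under assumptions: `InMemoryEventBus`, the `OrderPlaced` type, and the `handled` array are hypothetical names, and delivery is in-process with no retries or persistence.

```typescript
// Minimal in-memory event bus; an illustrative stand-in for a real broker.
interface DomainEvent {
  type: string;
}

interface OrderPlaced extends DomainEvent {
  type: 'OrderPlaced';
  orderId: string;
  total: number;
}

type Handler = (event: DomainEvent) => Promise<void>;

class InMemoryEventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe<T extends DomainEvent>(
    eventType: string,
    handler: (event: T) => Promise<void>,
  ): void {
    const list = this.handlers.get(eventType) ?? [];
    // The cast is only sound if handlers are registered under the
    // event type they actually expect.
    list.push(handler as unknown as Handler);
    this.handlers.set(eventType, list);
  }

  async publish(event: DomainEvent): Promise<void> {
    const list = this.handlers.get(event.type) ?? [];
    // Handlers are started in registration order and awaited together.
    await Promise.all(list.map((h) => h(event)));
  }
}

const bus = new InMemoryEventBus();
const handled: string[] = [];

bus.subscribe<OrderPlaced>('OrderPlaced', async (e) => {
  handled.push(`inventory reserved for ${e.orderId}`);
});
bus.subscribe<OrderPlaced>('OrderPlaced', async (e) => {
  handled.push(`confirmation sent for ${e.orderId}`);
});

const order: OrderPlaced = { type: 'OrderPlaced', orderId: '123', total: 99.99 };
bus.publish(order).then(() => console.log(handled));
```

Because `Promise.all` rejects as soon as any handler rejects, one failing subscriber fails the whole `publish` call here. A production bus would usually isolate subscribers from each other, which is worth checking during a review.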
**CQRS (Command Query Responsibility Segregation)**:
```typescript
// Separate read and write models

// Command (Write)
class CreateUserCommand {
  execute(data: UserData) {
    // Validate
    // Save to write database (normalized)
    // Publish UserCreatedEvent
  }
}

// Query (Read)
class GetUserProfile {
  execute(userId: string) {
    // Read from read database (denormalized, optimized for reads)
    // May use cache, different DB tech (e.g., Elasticsearch)
  }
}
```
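The skeleton above can be filled in with in-memory stores to show how the two models diverge. This is a sketch, not a reference implementation: the `UserData` and `UserProfileView` shapes, the two `Map` stores, and the `onUserCreated` projection are assumptions, and the projection is called directly where a real system would publish `UserCreatedEvent`.

```typescript
// In-memory CQRS sketch. All shapes are hypothetical; Maps stand in for databases.
interface UserData {
  id: string;
  firstName: string;
  lastName: string;
  email: string;
}

interface UserProfileView {
  id: string;
  displayName: string;
  email: string;
}

const writeStore = new Map<string, UserData>(); // normalized write model
const readStore = new Map<string, UserProfileView>(); // denormalized read model

// Projection: keeps the read model in sync when a user is created.
function onUserCreated(user: UserData): void {
  readStore.set(user.id, {
    id: user.id,
    displayName: `${user.firstName} ${user.lastName}`,
    email: user.email,
  });
}

// Command (Write)
class CreateUserCommand {
  execute(data: UserData): void {
    if (!data.email.includes('@')) {
      throw new Error('Invalid email');
    }
    if (writeStore.has(data.id)) {
      throw new Error(`User ${data.id} already exists`);
    }
    writeStore.set(data.id, data);
    onUserCreated(data); // stands in for publishing UserCreatedEvent
  }
}

// Query (Read)
class GetUserProfile {
  execute(userId: string): UserProfileView | undefined {
    return readStore.get(userId); // never touches the write store
  }
}

new CreateUserCommand().execute({
  id: 'user-456',
  firstName: 'Ada',
  lastName: 'Lovelace',
  email: 'ada@example.com',
});
const profile = new GetUserProfile().execute('user-456');
console.log(profile);
```

With an asynchronous projection the read model lags the write model, so the review should confirm that callers tolerate stale reads.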
### 3. Scalability Review
**Horizontal vs Vertical Scaling**:
```yaml
Horizontal Scaling (Add More Machines):
  Requires:
    - Stateless application servers
    - Shared session store (Redis, database)
    - Load balancer
    - Database replication/sharding
  Benefits:
    - No single application server is a point of failure
    - Cost-effective with cloud auto-scaling
    - Capacity grows by adding nodes (until shared state becomes the bottleneck)
Vertical Scaling (Bigger Machine):
  Drawbacks:
    - Often requires downtime for upgrades
    - Eventually hits hardware limits
  Benefits:
    - Simpler (no distributed system)
    - No code changes needed
```
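The "stateless application servers plus shared session store" requirement can be illustrated with two server instances that hold no session data of their own. In this sketch `SessionStore`, `InMemorySessionStore`, and `AppServer` are hypothetical names, and the in-memory store only stands in for Redis or a database.

```typescript
// Illustrative only: SessionStore and AppServer are hypothetical names.
interface SessionStore {
  get(sessionId: string): string | undefined;
  set(sessionId: string, userId: string): void;
}

// Stand-in for Redis or a database table shared by all instances.
class InMemorySessionStore implements SessionStore {
  private sessions = new Map<string, string>();

  get(sessionId: string): string | undefined {
    return this.sessions.get(sessionId);
  }

  set(sessionId: string, userId: string): void {
    this.sessions.set(sessionId, userId);
  }
}

// A stateless server: every session lookup goes to the shared store.
class AppServer {
  private readonly label: string;
  private readonly store: SessionStore;

  constructor(label: string, store: SessionStore) {
    this.label = label;
    this.store = store;
  }

  login(sessionId: string, userId: string): void {
    this.store.set(sessionId, userId);
  }

  whoAmI(sessionId: string): string {
    const userId = this.store.get(sessionId);
    return userId
      ? `${this.label}: hello ${userId}`
      : `${this.label}: not logged in`;
  }
}

const shared = new InMemorySessionStore();
const serverA = new AppServer('server-a', shared);
const serverB = new AppServer('server-b', shared);

// The load balancer may send the login to one instance and the next request to another.
serverA.login('sess-1', 'user-456');
console.log(serverB.whoAmI('sess-1')); // prints "server-b: hello user-456"
```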
**Database Scaling Strategies**:
```yaml
Read Replicas:
  - Offload read traffic (analytics, reports)
  - Requires that eventual consistency (replication lag) is acceptable
  - Best fit for read-heavy workloads (e.g., ~80% reads, 20% writes)
Sharding:
  - Partition data across multiple databases
  - "Shard key: user_id, tenant_id, region"
  - "Complexity: cross-shard queries, rebalancing"
Caching:
  - Redis for hot data (user sessions, product catalog)
  - CDN for static assets
  - Application-level caching
```
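Routing by shard key can be sketched as a stable hash of the key reduced modulo the shard count. The example below is an assumption-laden illustration: `fnv1a` and `shardFor` are hypothetical helpers, the hash runs over UTF-16 code units (which matches byte-wise FNV-1a only for ASCII keys), and plain modulo placement is the simplest scheme rather than a recommendation.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units; deterministic across processes.
function fnv1a(key: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < key.length; i++) {
    hash ^= key.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Maps a shard key (user_id, tenant_id, ...) to a shard index in [0, shardCount).
function shardFor(shardKey: string, shardCount: number): number {
  if (!Number.isInteger(shardCount) || shardCount <= 0) {
    throw new RangeError('shardCount must be a positive integer');
  }
  return fnv1a(shardKey) % shardCount;
}

const shardCount = 4;
for (const userId of ['user-123', 'user-456', 'user-789']) {
  console.log(`${userId} -> shard ${shardFor(userId, shardCount)}`);
}
```

Modulo placement moves most keys to a different shard whenever `shardCount` changes, which is one concrete source of the rebalancing complexity listed above. Consistent hashing or a lookup table are common ways to reduce that movement.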
### 4. Security Architecture Review
**Defense in Depth**:
```yaml
Network Layer:
  - VPC with private subnets
  - Security groups (allowlist, default deny)
  - WAF (filters malicious HTTP traffic; pair with network-level DDoS protection)
Application Layer:
  - Input validation and sanitization
  - Output encoding (XSS prevention)
  - Parameterized queries (SQL injection prevention)
  - CSRF tokens
  - Rate limiting
Data Layer:
  - Encryption at rest (database, S3)
  - Encryption in transit (TLS)
  - Secrets management (AWS Secrets Manager, Vault)
  - Database access control (least privilege)
Authentication/Authorization:
  - Multi-factor authentication
  - OAuth 2.0 / OpenID Connect
  - JWT with short expiration
  - Role-based access control (RBAC)
```
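Of the application-layer controls above, rate limiting is the easiest to show in a few lines. The following token-bucket sketch is one possible approach, not the one this command prescribes: `TokenBucket`, its constructor parameters, and the injected clock are assumptions, and a real deployment would usually keep the counters in a shared store so that all instances enforce the same limit.

```typescript
// Token bucket: allows bursts up to `capacity`, refilled at `refillPerSecond`.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  // `now` is injectable so the behaviour can be tested without real waiting.
  constructor(
    capacity: number,
    refillPerSecond: number,
    now: () => number = () => Date.now(),
  ) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full
    this.lastRefillMs = now();
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryConsume(): boolean {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage with a fake clock: burst of 2, refill of 1 token per second.
let fakeNow = 0;
const bucket = new TokenBucket(2, 1, () => fakeNow);
const results = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
fakeNow += 1000; // one second later, one token has been refilled
results.push(bucket.tryConsume());
console.log(results); // [true, true, false, true]
```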
**Threat Modeling**:
```markdown
## STRIDE Analysis
**Spoofing**: Can attacker impersonate user?
- Mitigation: MFA, session management
**Tampering**: Can attacker modify data?
- Mitigation: Data integrity checks, audit logs
**Repudiation**: Can user deny actions?
- Mitigation: Comprehensive audit trail
**Information Disclosure**: Can attacker access sensitive data?
- Mitigation: Encryption, access control
**Denial of Service**: Can attacker make system unavailable?
- Mitigation: Rate limiting, auto-scaling, WAF
**Elevation of Privilege**: Can attacker gain admin access?
- Mitigation: Least privilege, input validation
```
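A STRIDE pass is easier to audit when each threat is recorded with its category and mitigations, so uncovered categories stand out. The sketch below assumes a hypothetical `Threat` record and `uncoveredCategories` helper; the sample threats are invented for illustration.

```typescript
type StrideCategory =
  | 'Spoofing'
  | 'Tampering'
  | 'Repudiation'
  | 'InformationDisclosure'
  | 'DenialOfService'
  | 'ElevationOfPrivilege';

const ALL_CATEGORIES: StrideCategory[] = [
  'Spoofing',
  'Tampering',
  'Repudiation',
  'InformationDisclosure',
  'DenialOfService',
  'ElevationOfPrivilege',
];

// Hypothetical threat-register entry.
interface Threat {
  category: StrideCategory;
  description: string;
  mitigations: string[];
}

// Lists STRIDE categories with no threat that has at least one mitigation.
function uncoveredCategories(threats: Threat[]): StrideCategory[] {
  const covered = new Set<StrideCategory>(
    threats.filter((t) => t.mitigations.length > 0).map((t) => t.category),
  );
  return ALL_CATEGORIES.filter((c) => !covered.has(c));
}

// Invented sample data for a login endpoint.
const threats: Threat[] = [
  { category: 'Spoofing', description: 'Credential stuffing', mitigations: ['MFA'] },
  { category: 'DenialOfService', description: 'Login flood', mitigations: ['Rate limiting', 'WAF'] },
  { category: 'Tampering', description: 'Session cookie edited', mitigations: [] },
];

const gaps = uncoveredCategories(threats);
console.log(gaps);
```

A threat with an empty `mitigations` list does not count as coverage, so `Tampering` is still reported as a gap in this sample.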
### 5. Observability Review
**Three Pillars**:
```yaml
Logging:
  - Structured logging (JSON)
  - Centralized (ELK, CloudWatch Logs)
  - Request IDs for tracing
  - "Log levels: ERROR, WARN, INFO, DEBUG"
Metrics:
  - "RED: Rate, Errors, Duration"
  - "USE: Utilization, Saturation, Errors"
  - Business metrics (orders/min, revenue)
  - Infrastructure metrics (CPU, memory, disk)
Tracing:
  - Distributed tracing (OpenTelemetry, Jaeger)
  - End-to-end request flow
  - Performance bottleneck identification
```
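Two items from the list above, structured logs with request IDs and RED metrics, can be sketched without any library. `formatLog`, `redSummary`, and the `RequestSample` shape are hypothetical helpers for illustration; a real service would more likely use an established logger and metrics client.

```typescript
type LogLevel = 'ERROR' | 'WARN' | 'INFO' | 'DEBUG';

// Builds one structured (JSON) log line carrying the request ID.
function formatLog(
  level: LogLevel,
  message: string,
  requestId: string,
  fields: Record<string, unknown> = {},
  timestamp: Date = new Date(),
): string {
  // Core fields are written last so extra fields cannot overwrite them.
  return JSON.stringify({
    ...fields,
    timestamp: timestamp.toISOString(),
    level,
    requestId,
    message,
  });
}

interface RequestSample {
  durationMs: number;
  ok: boolean;
}

// RED summary for one time window: Rate, Errors, Duration.
function redSummary(samples: RequestSample[], windowSeconds: number) {
  const errors = samples.filter((s) => !s.ok).length;
  const totalMs = samples.reduce((sum, s) => sum + s.durationMs, 0);
  return {
    ratePerSecond: samples.length / windowSeconds,
    errorRatio: samples.length === 0 ? 0 : errors / samples.length,
    meanDurationMs: samples.length === 0 ? 0 : totalMs / samples.length,
  };
}

const logLine = formatLog('INFO', 'order placed', 'req-42', { orderId: '123' });
console.log(logLine);

const summary = redSummary(
  [
    { durationMs: 100, ok: true },
    { durationMs: 200, ok: true },
    { durationMs: 300, ok: false },
    { durationMs: 400, ok: true },
  ],
  2,
);
console.log(summary); // { ratePerSecond: 2, errorRatio: 0.25, meanDurationMs: 250 }
```

The mean is used here only to keep the sketch short; latency reviews usually look at percentiles such as p95 or p99.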
### 6. Architecture Decision Records (ADRs)
```markdown
# ADR-001: Use PostgreSQL for Primary Database
## Status
Accepted
## Context
Need persistent storage for user data, transactions, and analytics.
## Decision
Use PostgreSQL as primary database.
## Consequences
**Pros**:
- ACID compliance (strong consistency)
- Rich query capabilities (joins, aggregations)
- Mature ecosystem, wide adoption
- JSON support for semi-structured data
**Cons**:
- Vertical scaling limits (mitigated with read replicas)
- Complex sharding if needed
- Higher cost than NoSQL for massive scale
**Alternatives Considered**:
- MongoDB: Multi-document transactions are a later addition; weaker fit for relational joins
- DynamoDB: Lock-in to AWS, limited query flexibility
```
### 7. Technical Debt Assessment
**Debt Quadrant** (Martin Fowler):
```yaml
Reckless + Deliberate:
  Quote: "We don't have time for design"
  Priority: HIGH - Fix immediately
Prudent + Deliberate:
  Quote: "We must ship now and deal with consequences"
  Priority: MEDIUM - Plan refactoring sprint
Reckless + Inadvertent:
  Quote: "What's layering?"
  Priority: HIGH - Training + mentorship
Prudent + Inadvertent:
  Quote: "Now we know how we should have done it"
  Priority: LOW - Document for next time
```
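The quadrant-to-priority mapping above is small enough to encode directly, which helps when triaging a debt backlog. In this sketch `DebtItem`, `debtPriority`, and the sample backlog are hypothetical; the priorities themselves come from the table above.

```typescript
type Priority = 'HIGH' | 'MEDIUM' | 'LOW';

// Hypothetical backlog entry, classified along the two quadrant axes.
interface DebtItem {
  description: string;
  prudent: boolean; // false = reckless
  deliberate: boolean; // false = inadvertent
}

// Mirrors the table: reckless debt is HIGH either way,
// prudent + deliberate is MEDIUM, prudent + inadvertent is LOW.
function debtPriority(item: DebtItem): Priority {
  if (!item.prudent) {
    return 'HIGH';
  }
  return item.deliberate ? 'MEDIUM' : 'LOW';
}

const rank: Record<Priority, number> = { HIGH: 0, MEDIUM: 1, LOW: 2 };

// Invented sample data.
const backlog: DebtItem[] = [
  { description: 'Learned a better module split after launch', prudent: true, deliberate: false },
  { description: 'Skipped design to hit a demo', prudent: false, deliberate: true },
  { description: 'Shipped with a known shortcut, ticket filed', prudent: true, deliberate: true },
];

// Copy before sorting so the original backlog order is preserved.
const triaged = [...backlog].sort(
  (a, b) => rank[debtPriority(a)] - rank[debtPriority(b)],
);
console.log(triaged.map((d) => `${debtPriority(d)}: ${d.description}`));
```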
## When to Use
- Pre-launch architecture review
- Quarterly architecture health checks
- Scaling preparation (before 10x growth)
- Post-incident architecture analysis
- Acquisition due diligence
Evaluate architecture like a principal engineer!