# How to Create a Component Specification
Component specifications document individual system components or services, including their responsibilities, interfaces, configuration, and deployment characteristics.
## Quick Start

```bash
# 1. Create a new component spec
scripts/generate-spec.sh component cmp-001-descriptive-slug

# 2. Open and fill in the file
#    (The file will be created at: docs/specs/component/cmp-001-descriptive-slug.md)

# 3. Fill in the sections, then validate:
scripts/validate-spec.sh docs/specs/component/cmp-001-descriptive-slug.md

# 4. Fix issues and check completeness:
scripts/check-completeness.sh docs/specs/component/cmp-001-descriptive-slug.md
```
## When to Write a Component Specification
Use a Component Spec when you need to:
- Document a microservice or major system component
- Specify component responsibilities and interfaces
- Define configuration requirements
- Document deployment procedures
- Enable teams to understand component behavior
- Plan for monitoring and observability
## Research Phase

### 1. Research Related Specifications
Find what informed this component:
```bash
# Find design documents that reference this component
grep -r "design\|architecture" docs/specs/ --include="*.md"

# Find API contracts this component implements
grep -r "api\|endpoint" docs/specs/ --include="*.md"

# Find data models this component uses
grep -r "data\|model" docs/specs/ --include="*.md"
```
### 2. Review Similar Components
- How are other components in your system designed?
- What patterns and conventions exist?
- How are they deployed and monitored?
- What's the standard for documentation?
### 3. Understand Dependencies
- What services or systems does this component depend on?
- What services depend on this component?
- What data flows through this component?
- What are the integration points?
## Structure & Content Guide

### Title & Metadata
- Title: "Export Service", "User Authentication Service", etc.
- Type: Microservice, Library, Worker, API Gateway, etc.
- Version: Current version number
### Component Description

```markdown
# Export Service

The Export Service is a microservice responsible for handling bulk user data
exports. It manages the export job lifecycle: queuing, processing, storage,
and delivery.

**Type**: Microservice
**Language**: Node.js + TypeScript
**Deployment**: Kubernetes (3+ replicas)
**Status**: Stable (production)
```
### Purpose & Responsibilities Section

```markdown
## Purpose

Provide reliable, scalable handling of user data exports in multiple formats
while maintaining system stability and data security.

## Primary Responsibilities

1. **Job Queueing**: Accept export requests and queue them for processing
   - Validate request parameters
   - Create export job records
   - Enqueue jobs for processing
   - Return job ID to client
2. **Job Processing**: Execute export jobs asynchronously
   - Query user data from database
   - Transform data to requested format (CSV, JSON)
   - Compress files for storage
   - Handle processing errors and retries
3. **File Storage**: Manage exported file storage and lifecycle
   - Store completed exports to S3
   - Generate secure download URLs
   - Implement TTL-based cleanup
   - Maintain export audit logs
4. **Status Tracking**: Provide job status and progress information
   - Track job state (queued, processing, completed, failed)
   - Record completion time and file metadata
   - Handle cancellation requests
5. **Error Handling**: Manage failures gracefully
   - Retry failed jobs with exponential backoff
   - Notify users of failures
   - Log errors for debugging
   - Preserve system stability during failures
```
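Where a responsibility names a specific mechanism, a short sketch can remove ambiguity. For example, the exponential backoff in responsibility 5 might look like this in TypeScript (a minimal sketch; `processJob`, `MAX_RETRIES`, and `BASE_DELAY_MS` are illustrative, not part of the actual service):

```typescript
// Minimal sketch of retry with exponential backoff (responsibility 5).
// `processJob`, MAX_RETRIES, and BASE_DELAY_MS are illustrative placeholders.
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 1000;

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function processWithRetry(
  jobId: string,
  processJob: (id: string) => Promise<void>,
): Promise<void> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      await processJob(jobId);
      return; // success
    } catch (err) {
      if (attempt === MAX_RETRIES) throw err; // retries exhausted: surface the failure
      await sleep(BASE_DELAY_MS * 2 ** attempt); // back off: 1s, 2s, 4s, ...
    }
  }
}
```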
### Interfaces & APIs Section

````markdown
## Interfaces

### REST API Endpoints

The service exposes these HTTP endpoints:

#### POST /exports

**Purpose**: Create a new export job
**Authentication**: Required (Bearer token)

**Request Body**:

```json
{
  "data_types": ["users", "transactions"],
  "format": "csv",
  "date_range": { "start": "2024-01-01", "end": "2024-01-31" }
}
```

**Response (201 Created)**:

```json
{
  "id": "exp_123456",
  "status": "queued",
  "created_at": "2024-01-15T10:00:00Z"
}
```

#### GET /exports/{id}

**Purpose**: Get export job status

**Response (200 OK)**:

```json
{
  "id": "exp_123456",
  "status": "completed",
  "download_url": "https://...",
  "file_size_bytes": 2048576
}
```

### Event Publishing

The service publishes these events to the message queue:

**export.started**

```json
{
  "event": "export.started",
  "export_id": "exp_123456",
  "user_id": "usr_789012",
  "timestamp": "2024-01-15T10:00:00Z"
}
```

**export.completed**

```json
{
  "event": "export.completed",
  "export_id": "exp_123456",
  "file_size_bytes": 2048576,
  "format": "csv",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

**export.failed**

```json
{
  "event": "export.failed",
  "export_id": "exp_123456",
  "error": "database_connection_timeout",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

### Dependencies (Consumed APIs)

- User Service API: `GET /users/{id}`, `GET /users` (for data export)
- Auth Service: JWT validation
- Notification Service: Send export completion notifications
````
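It can also help to mirror the documented contracts as types in the component's own language, so the spec and the code drift less. A sketch in TypeScript; the field names come from the JSON examples above, while the type names are illustrative:

```typescript
// Types mirroring the request/response shapes documented above.
type ExportFormat = 'csv' | 'json';
type ExportStatus = 'queued' | 'processing' | 'completed' | 'failed';

interface CreateExportRequest {
  data_types: string[]; // e.g., ["users", "transactions"]
  format: ExportFormat;
  date_range: { start: string; end: string }; // ISO dates
}

interface ExportJob {
  id: string; // e.g., "exp_123456"
  status: ExportStatus;
  created_at: string; // ISO-8601 timestamp
  download_url?: string; // present once status is "completed"
  file_size_bytes?: number; // present once status is "completed"
}
```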
### Configuration Section
````markdown
## Configuration
### Environment Variables
| Variable | Type | Required | Description |
|----------|------|----------|-------------|
| NODE_ENV | string | Yes | Environment (dev, staging, production) |
| PORT | number | No | HTTP server port (default: 3000) |
| DATABASE_URL | string | Yes | PostgreSQL connection string |
| REDIS_URL | string | Yes | Redis connection for job queue |
| S3_BUCKET | string | Yes | S3 bucket for export files |
| S3_REGION | string | Yes | AWS region (e.g., us-east-1) |
| AWS_ACCESS_KEY_ID | string | Yes | AWS credentials |
| AWS_SECRET_ACCESS_KEY | string | Yes | AWS credentials |
| EXPORT_TTL_DAYS | number | No | Export file retention days (default: 7) |
| MAX_EXPORT_SIZE_MB | number | No | Maximum export file size (default: 500) |
| CONCURRENT_WORKERS | number | No | Number of concurrent job processors (default: 5) |
### Configuration File (config.json)
```json
{
  "server": {
    "port": 3000,
    "timeout_ms": 30000
  },
  "jobs": {
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "timeout_ms": 300000
  },
  "export": {
    "max_file_size_mb": 500,
    "ttl_days": 7,
    "formats": ["csv", "json"]
  },
  "storage": {
    "type": "s3",
    "cleanup_interval_hours": 24
  }
}
```

### Runtime Requirements

- Memory: 512MB minimum, 2GB recommended
- CPU: 1 core minimum, 2 cores recommended
- Disk: 10GB for temporary files
- Network: Must reach PostgreSQL, Redis, S3, and the Auth Service
````
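To make the required/optional split testable, the spec can be paired with a small startup loader that fails fast on missing required variables and applies the documented defaults. A sketch (the `loadConfig` helper and `ServiceConfig` type are illustrative, not an existing module):

```typescript
// Illustrative startup loader for the variables documented above.
interface ServiceConfig {
  port: number;
  databaseUrl: string;
  exportTtlDays: number;
  maxExportSizeMb: number;
  concurrentWorkers: number;
}

// Fail fast at startup if a required variable is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

function loadConfig(): ServiceConfig {
  return {
    port: Number(process.env.PORT ?? 3000),
    databaseUrl: requireEnv('DATABASE_URL'),
    exportTtlDays: Number(process.env.EXPORT_TTL_DAYS ?? 7),
    maxExportSizeMb: Number(process.env.MAX_EXPORT_SIZE_MB ?? 500),
    concurrentWorkers: Number(process.env.CONCURRENT_WORKERS ?? 5),
  };
}
```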
### Data Dependencies Section
```markdown
## Data Dependencies
### Input Data
The service requires access to:
- **User data**: From User Service or User DB
  - Fields: id, email, name, created_at, etc.
  - Constraints: User must be authenticated
  - Volume: Scales with the user dataset
- **Transaction data**: From Transaction DB
  - Fields: id, user_id, amount, date, etc.
  - Volume: Can be large (100k+ records per user)
### Output Data
The service produces:
- **Export files**: CSV or JSON format
  - Stored in S3
  - Size: Up to 500MB per file
  - Retention: 7 days
- **Export metadata**: Stored in PostgreSQL
  - Export record with status, size, and completion time
  - Audit trail of all exports
```
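If it helps reviewers, the metadata record can be sketched as a type alongside the prose. A sketch with illustrative field names, assuming the attributes listed above:

```typescript
// Illustrative shape of the export metadata record stored in PostgreSQL.
interface ExportRecord {
  id: string; // e.g., "exp_123456"
  user_id: string;
  status: 'queued' | 'processing' | 'completed' | 'failed';
  format: 'csv' | 'json';
  file_size_bytes: number | null; // set on completion
  created_at: Date;
  completed_at: Date | null;
  expires_at: Date; // created_at + EXPORT_TTL_DAYS
}
```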
### Deployment Section

````markdown
## Deployment
### Container Image
- **Base Image**: node:18-alpine
- **Build**: Dockerfile in repository root
- **Registry**: ECR (AWS Elastic Container Registry)
- **Tag**: Semver (e.g., v1.2.3, latest)
### Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: export-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: export-service
  template:
    metadata:
      labels:
        app: export-service
    spec:
      containers:
        - name: export-service
          image: 123456789.dkr.ecr.us-east-1.amazonaws.com/export-service:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: export-service-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
```

### Deployment Steps

1. **Build**: `docker build -t export-service:v1.2.3 .`
2. **Push**: `docker push <registry>/export-service:v1.2.3`
3. **Update**: `kubectl set image deployment/export-service export-service=<registry>/export-service:v1.2.3`
4. **Verify**: `kubectl rollout status deployment/export-service`

### Rollback Procedure

```bash
# If deployment fails, roll back to the previous version
kubectl rollout undo deployment/export-service

# Verify successful rollback
kubectl rollout status deployment/export-service
```

### Pre-Deployment Checklist

- All tests passing locally
- Database migrations run successfully
- Configuration environment variables set in staging
- Health check endpoints responding
- Metrics and logging verified
````
### Monitoring & Observability Section
````markdown
## Monitoring
### Health Checks
**Liveness Probe**: GET /health
- Returns 200 if service is running
- Used by Kubernetes to restart unhealthy pods
**Readiness Probe**: GET /ready
- Returns 200 if service is ready to receive traffic
- Checks database connectivity, Redis availability
- Used by Kubernetes for traffic routing
### Metrics
Export these Prometheus metrics:
| Metric | Type | Description |
|--------|------|-------------|
| exports_created_total | Counter | Total exports created |
| exports_completed_total | Counter | Total exports completed successfully |
| exports_failed_total | Counter | Total exports failed |
| export_duration_seconds | Histogram | Time to complete export (p50, p95, p99) |
| export_file_size_bytes | Histogram | Size of exported files |
| export_job_queue_depth | Gauge | Number of jobs awaiting processing |
| export_active_jobs | Gauge | Number of jobs currently processing |
### Alerts
Configure these alerts:
**Export Job Backlog Growing**
- Alert if `export_job_queue_depth > 100` for 5+ minutes
- Action: Scale up worker replicas
**Export Failures Increasing**
- Alert if `exports_failed_total` increases by > 10% in 1 hour
- Action: Investigate failure logs
**Service Unhealthy**
- Alert if liveness probe fails
- Action: Restart pod, check logs
### Logging
Log format (JSON):
```json
{
  "timestamp": "2024-01-15T10:05:00Z",
  "level": "info",
  "service": "export-service",
  "export_id": "exp_123456",
  "event": "export_completed",
  "duration_ms": 5000,
  "file_size_bytes": 2048576,
  "message": "Export completed successfully"
}
```

### Log Levels

- **debug**: Detailed debugging information
- **info**: Important operational events
- **warn**: Warning conditions (retries, slow operations)
- **error**: Error conditions (failures, exceptions)
````
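The two probes are simple enough to sketch directly. A minimal Express version, assuming stand-in `db` and `redis` clients with a `ping()` method (the real clients and readiness checks will differ):

```typescript
import express from 'express';

// Stand-ins for the real PostgreSQL and Redis clients; illustrative only.
const db = { ping: async (): Promise<void> => {} };
const redis = { ping: async (): Promise<void> => {} };

const app = express();

// Liveness: the process is up and able to serve HTTP.
app.get('/health', (_req, res) => {
  res.status(200).json({ status: 'ok' });
});

// Readiness: dependencies are reachable before traffic is routed here.
app.get('/ready', async (_req, res) => {
  try {
    await Promise.all([db.ping(), redis.ping()]);
    res.status(200).json({ status: 'ready' });
  } catch {
    res.status(503).json({ status: 'not_ready' });
  }
});

app.listen(3000);
```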
### Dependencies & Integration Section
```markdown
## Dependencies
### Service Dependencies
| Service | Purpose | Criticality | Failure Impact |
|---------|---------|-------------|----------------|
| PostgreSQL | Export job storage | Critical | Service down |
| Redis | Job queue | Critical | Exports won't process |
| S3 | Export file storage | Critical | Can't store exports |
| Auth Service | JWT validation | Critical | Can't validate requests |
| User Service | User data source | Critical | Can't export user data |
| Notification Service | Email notifications | Optional | Users won't get notifications |
### External Dependencies
- **AWS S3**: For file storage and retrieval
- **PostgreSQL**: For export metadata
- **Redis**: For job queue
- **Kubernetes**: For orchestration
### Fallback Strategies
- Redis unavailable: Use in-memory queue (single instance only)
- User Service unavailable: Fail export with "upstream_error"
- S3 unavailable: Retry with exponential backoff, max 3 times
```
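The Redis fallback is easier to reason about when the queue sits behind an interface. A sketch of that shape (the `JobQueue` interface and `InMemoryQueue` class are hypothetical; as noted above, the in-memory variant is only safe with a single instance):

```typescript
// Hypothetical queue abstraction that makes the Redis -> in-memory fallback possible.
interface JobQueue {
  enqueue(jobId: string): Promise<void>;
  dequeue(): Promise<string | undefined>;
}

// Single-instance fallback: jobs are lost on restart and invisible to other replicas.
class InMemoryQueue implements JobQueue {
  private jobs: string[] = [];

  async enqueue(jobId: string): Promise<void> {
    this.jobs.push(jobId);
  }

  async dequeue(): Promise<string | undefined> {
    return this.jobs.shift();
  }
}
```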
### Performance & SLA Section

```markdown
## Performance Characteristics
### Throughput
- Process up to 1000 exports per day
- Handle 100 concurrent job workers
- Queue depth auto-scales based on load
### Latency
- Create export job: < 100ms (p95)
- Process 100MB export: 3-5 minutes average
- Query export status: < 50ms (p95)
### Resource Usage
- Memory: 800MB average, peaks at 1.5GB
- CPU: 25% average, peaks at 60%
- Disk (temp): 50GB for concurrent exports
### Service Level Objectives (SLOs)
| Objective | Target |
|-----------|--------|
| Availability | 99.5% uptime |
| Error Rate | < 0.1% |
| p95 Latency (status query) | < 100ms |
| Export Completion | < 10 minutes for 100MB |
### Scalability
- Horizontal: Add more pods for higher throughput
- Vertical: Increase pod memory/CPU for larger exports
- Maximum tested: 10k exports/day on a 5-pod cluster
```
## Writing Tips

### Be Specific About Responsibilities
- What does this component do?
- What does it NOT do?
- Where do responsibilities start/stop?
### Document All Interfaces
- REST APIs? Document endpoints and schemas
- Message queues? Show event formats
- Database? Show schema and queries
- Dependencies? Show what's called and how
### Include Deployment Details
- How is it deployed (containers, VMs, serverless)?
- Configuration required?
- Health checks?
- Monitoring setup?
### Link to Related Specs

- Reference design docs: [DES-001]
- Reference API contracts: [API-001]
- Reference data models: [DATA-001]
- Reference deployment procedures: [DEPLOY-001]
### Document Failure Modes
- What happens if dependencies fail?
- How does the component recover?
- What alerts fire when things go wrong?
## Validation & Fixing Issues

### Run the Validator

```bash
scripts/validate-spec.sh docs/specs/component/cmp-001-your-spec.md
```
### Common Issues & Fixes
Issue: "Missing Interfaces section"
- Fix: Document all APIs, event formats, and data contracts
Issue: "Configuration incomplete"
- Fix: Add environment variables, configuration files, and runtime requirements
Issue: "No Monitoring section"
- Fix: Add health checks, metrics, alerts, and logging strategy
Issue: "Deployment steps unclear"
- Fix: Add step-by-step deployment and rollback procedures
## Decision-Making Framework
When writing a component spec, consider:
1. **Boundaries**: What is this component's responsibility?
   - What does it own?
   - What does it depend on?
   - Where are the boundaries drawn?
2. **Interfaces**: How will others interact with this?
   - REST, gRPC, events, direct calls?
   - What contracts must be maintained?
   - How do we evolve interfaces?
3. **Configuration**: What's configurable vs. hardcoded?
   - Environment-specific settings?
   - Runtime tuning parameters?
   - Feature flags?
4. **Operations**: How will we run this in production?
   - Deployment model?
   - Monitoring and alerting?
   - Failure recovery?
5. **Scale**: How much can this component handle?
   - Throughput limits?
   - Scaling strategy?
   - Resource requirements?
## Next Steps

1. **Create the spec**: `scripts/generate-spec.sh component cmp-XXX-slug`
2. **Research**: Find design docs and existing components
3. **Define** responsibilities and boundaries
4. **Document** interfaces for all interactions
5. **Plan** deployment and monitoring
6. **Validate**: `scripts/validate-spec.sh docs/specs/component/cmp-XXX-slug.md`
7. **Share** with architecture/ops teams before implementation