# How to Create a Component Specification

Component specifications document individual system components or services, including their responsibilities, interfaces, configuration, and deployment characteristics.

## Quick Start

```bash
# 1. Create a new component spec
scripts/generate-spec.sh component cmp-001-descriptive-slug

# 2. Open the generated spec file
# (The file will be created at: docs/specs/component/cmp-001-descriptive-slug.md)

# 3. Fill in the sections, then validate:
scripts/validate-spec.sh docs/specs/component/cmp-001-descriptive-slug.md

# 4. Fix issues and check completeness:
scripts/check-completeness.sh docs/specs/component/cmp-001-descriptive-slug.md
```
## When to Write a Component Specification

Use a Component Spec when you need to:
- Document a microservice or major system component
- Specify component responsibilities and interfaces
- Define configuration requirements
- Document deployment procedures
- Enable teams to understand component behavior
- Plan for monitoring and observability
## Research Phase

### 1. Research Related Specifications
Find what informed this component:

```bash
# Find design documents that reference this component
grep -r "design\|architecture" docs/specs/ --include="*.md"

# Find API contracts this component implements
grep -r "api\|endpoint" docs/specs/ --include="*.md"

# Find data models this component uses
grep -r "data\|model" docs/specs/ --include="*.md"
```

### 2. Review Similar Components
- How are other components in your system designed?
- What patterns and conventions exist?
- How are they deployed and monitored?
- What's the standard for documentation?

### 3. Understand Dependencies
- What services or systems does this component depend on?
- What services depend on this component?
- What data flows through this component?
- What are the integration points?
## Structure & Content Guide
|
||||
|
||||
### Title & Metadata
|
||||
- **Title**: "Export Service", "User Authentication Service", etc.
|
||||
- **Type**: Microservice, Library, Worker, API Gateway, etc.
|
||||
- **Version**: Current version number
|
||||
|
||||
### Component Description
|
||||
|
||||
```markdown
|
||||
# Export Service
|
||||
|
||||
The Export Service is a microservice responsible for handling bulk user data exports.
|
||||
Manages export job lifecycle: queuing, processing, storage, and delivery.
|
||||
|
||||
**Type**: Microservice
|
||||
**Language**: Node.js + TypeScript
|
||||
**Deployment**: Kubernetes (3+ replicas)
|
||||
**Status**: Stable (production)
|
||||
```
|
||||
|
||||
### Purpose & Responsibilities Section

```markdown
## Purpose

Provide reliable, scalable handling of user data exports in multiple formats
while maintaining system stability and data security.

## Primary Responsibilities

1. **Job Queueing**: Accept export requests and queue them for processing
   - Validate request parameters
   - Create export job records
   - Enqueue jobs for processing
   - Return job ID to client

2. **Job Processing**: Execute export jobs asynchronously
   - Query user data from database
   - Transform data to requested format (CSV, JSON)
   - Compress files for storage
   - Handle processing errors and retries

3. **File Storage**: Manage exported file storage and lifecycle
   - Store completed exports to S3
   - Generate secure download URLs
   - Implement TTL-based cleanup
   - Maintain export audit logs

4. **Status Tracking**: Provide job status and progress information
   - Track job state (queued, processing, completed, failed)
   - Record completion time and file metadata
   - Handle cancellation requests

5. **Error Handling**: Manage failures gracefully
   - Retry failed jobs with exponential backoff
   - Notify users of failures
   - Log errors for debugging
   - Preserve system stability during failures
```
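Responsibility 5 calls for retrying failed jobs with exponential backoff. A minimal TypeScript sketch of that retry loop, assuming a hypothetical `processExportJob` helper and the retry values from the example configuration later in this guide:

```typescript
// Minimal retry-with-exponential-backoff sketch; illustrative only, not the
// actual Export Service code. `processExportJob` is a hypothetical helper.
async function processWithRetry(
  jobId: string,
  processExportJob: (id: string) => Promise<void>,
  maxRetries = 3,      // mirrors jobs.max_retries in the example config.json
  baseDelayMs = 1000,  // mirrors jobs.retry_delay_ms
): Promise<void> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      await processExportJob(jobId);
      return; // success
    } catch (err) {
      if (attempt === maxRetries) throw err; // out of retries: mark the job failed
      const delayMs = baseDelayMs * 2 ** attempt; // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```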
### Interfaces & APIs Section

````markdown
## Interfaces

### REST API Endpoints

The service exposes these HTTP endpoints:

#### POST /exports
**Purpose**: Create a new export job
**Authentication**: Required (Bearer token)
**Request Body**:
```json
{
  "data_types": ["users", "transactions"],
  "format": "csv",
  "date_range": { "start": "2024-01-01", "end": "2024-01-31" }
}
```
**Response** (201 Created):
```json
{
  "id": "exp_123456",
  "status": "queued",
  "created_at": "2024-01-15T10:00:00Z"
}
```

#### GET /exports/{id}
**Purpose**: Get export job status
**Response** (200 OK):
```json
{
  "id": "exp_123456",
  "status": "completed",
  "download_url": "https://...",
  "file_size_bytes": 2048576
}
```

### Event Publishing

The service publishes events to the message queue:

**export.started**
```json
{
  "event": "export.started",
  "export_id": "exp_123456",
  "user_id": "usr_789012",
  "timestamp": "2024-01-15T10:00:00Z"
}
```

**export.completed**
```json
{
  "event": "export.completed",
  "export_id": "exp_123456",
  "file_size_bytes": 2048576,
  "format": "csv",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

**export.failed**
```json
{
  "event": "export.failed",
  "export_id": "exp_123456",
  "error": "database_connection_timeout",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

### Dependencies (Consumed APIs)

- **User Service API**: GET /users/{id}, GET /users (for data export)
- **Auth Service**: JWT validation
- **Notification Service**: Send export completion notifications
````
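A quick way to sanity-check the endpoint documentation above is to walk through a caller's flow. Here is a hedged TypeScript sketch of creating an export and polling its status; the base URL, bearer token, and polling interval are placeholders, not values defined by this guide:

```typescript
// Illustrative client for the documented endpoints; host and token are placeholders.
const BASE_URL = "https://api.example.com";
const TOKEN = "<bearer-token>";

async function runExport(): Promise<void> {
  // POST /exports: create a new export job
  const createRes = await fetch(`${BASE_URL}/exports`, {
    method: "POST",
    headers: { Authorization: `Bearer ${TOKEN}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      data_types: ["users", "transactions"],
      format: "csv",
      date_range: { start: "2024-01-01", end: "2024-01-31" },
    }),
  });
  const job = (await createRes.json()) as { id: string; status: string };

  // GET /exports/{id}: poll until the job completes or fails
  for (;;) {
    const statusRes = await fetch(`${BASE_URL}/exports/${job.id}`, {
      headers: { Authorization: `Bearer ${TOKEN}` },
    });
    const status = (await statusRes.json()) as { status: string; download_url?: string };
    if (status.status === "completed" || status.status === "failed") {
      console.log(status);
      return;
    }
    await new Promise((resolve) => setTimeout(resolve, 5000)); // wait 5s between polls
  }
}
```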
### Configuration Section

````markdown
## Configuration

### Environment Variables

| Variable | Type | Required | Description |
|----------|------|----------|-------------|
| NODE_ENV | string | Yes | Environment (dev, staging, production) |
| PORT | number | Yes | HTTP server port (default: 3000) |
| DATABASE_URL | string | Yes | PostgreSQL connection string |
| REDIS_URL | string | Yes | Redis connection for job queue |
| S3_BUCKET | string | Yes | S3 bucket for export files |
| S3_REGION | string | Yes | AWS region (e.g., us-east-1) |
| AWS_ACCESS_KEY_ID | string | Yes | AWS access key ID |
| AWS_SECRET_ACCESS_KEY | string | Yes | AWS secret access key |
| EXPORT_TTL_DAYS | number | No | Export file retention days (default: 7) |
| MAX_EXPORT_SIZE_MB | number | No | Maximum export file size (default: 500) |
| CONCURRENT_WORKERS | number | No | Number of concurrent job processors (default: 5) |

### Configuration File (config.json)

```json
{
  "server": {
    "port": 3000,
    "timeout_ms": 30000
  },
  "jobs": {
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "timeout_ms": 300000
  },
  "export": {
    "max_file_size_mb": 500,
    "ttl_days": 7,
    "formats": ["csv", "json"]
  },
  "storage": {
    "type": "s3",
    "cleanup_interval_hours": 24
  }
}
```

### Runtime Requirements

- **Memory**: 512MB minimum, 2GB recommended
- **CPU**: 1 core minimum, 2 cores recommended
- **Disk**: 10GB for temporary files
- **Network**: Must reach PostgreSQL, Redis, S3, Auth Service
````
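One way to keep the variable table unambiguous is to show how the component might read it at startup, with required variables failing fast and optional ones falling back to their documented defaults. A hedged sketch, assuming a hypothetical `loadConfig` helper rather than the service's real code:

```typescript
// Illustrative loader for the environment variables documented above.
interface ExportServiceConfig {
  nodeEnv: string;
  port: number;
  databaseUrl: string;
  redisUrl: string;
  s3Bucket: string;
  s3Region: string;
  exportTtlDays: number;
  maxExportSizeMb: number;
  concurrentWorkers: number;
}

function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

function loadConfig(): ExportServiceConfig {
  return {
    nodeEnv: requireEnv("NODE_ENV"),
    port: Number(requireEnv("PORT")),
    databaseUrl: requireEnv("DATABASE_URL"),
    redisUrl: requireEnv("REDIS_URL"),
    s3Bucket: requireEnv("S3_BUCKET"),
    s3Region: requireEnv("S3_REGION"),
    exportTtlDays: Number(process.env.EXPORT_TTL_DAYS ?? 7),        // default: 7
    maxExportSizeMb: Number(process.env.MAX_EXPORT_SIZE_MB ?? 500), // default: 500
    concurrentWorkers: Number(process.env.CONCURRENT_WORKERS ?? 5), // default: 5
  };
}
```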
### Data Dependencies Section

```markdown
## Data Dependencies

### Input Data

The service requires access to:
- **User data**: From User Service or User DB
  - Fields: id, email, name, created_at, etc.
  - Constraints: User must be authenticated
  - Volume: Scales with the size of the user dataset

- **Transaction data**: From Transaction DB
  - Fields: id, user_id, amount, date, etc.
  - Volume: Can be large (100k+ per user)

### Output Data

The service produces:
- **Export files**: CSV or JSON format
  - Stored in S3
  - Size: Up to 500MB per file
  - Retention: 7 days

- **Export metadata**: Stored in PostgreSQL
  - Export record with status, size, completion time
  - Audit trail of all exports
```
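If it helps readers, the export metadata described above can be pinned down as a record type. The field names below are assumptions inferred from the fields this guide mentions (status, size, completion time), not a confirmed schema:

```typescript
// Hypothetical shape of an export metadata record; illustrative, not a confirmed schema.
type ExportStatus = "queued" | "processing" | "completed" | "failed";

interface ExportRecord {
  id: string;             // e.g. "exp_123456"
  userId: string;         // e.g. "usr_789012"
  status: ExportStatus;
  format: "csv" | "json";
  fileSizeBytes?: number; // populated once the export completes
  downloadUrl?: string;   // secure S3 download URL, valid until TTL cleanup
  createdAt: string;      // ISO 8601 timestamp
  completedAt?: string;
  error?: string;         // e.g. "database_connection_timeout"
}
```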
### Deployment Section

````markdown
## Deployment

### Container Image

- **Base Image**: node:18-alpine
- **Build**: Dockerfile in repository root
- **Registry**: ECR (AWS Elastic Container Registry)
- **Tag**: Semver (e.g., v1.2.3), plus a `latest` tag

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: export-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: export-service
  template:
    metadata:
      labels:
        app: export-service
    spec:
      containers:
        - name: export-service
          image: 123456789.dkr.ecr.us-east-1.amazonaws.com/export-service:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: export-service-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
```

### Deployment Steps

1. **Build**: `docker build -t export-service:v1.2.3 .`
2. **Push**: `docker push <registry>/export-service:v1.2.3`
3. **Update**: `kubectl set image deployment/export-service export-service=<registry>/export-service:v1.2.3`
4. **Verify**: `kubectl rollout status deployment/export-service`

### Rollback Procedure

```bash
# If the deployment fails, roll back to the previous version
kubectl rollout undo deployment/export-service

# Verify successful rollback
kubectl rollout status deployment/export-service
```

### Pre-Deployment Checklist

- [ ] All tests passing locally
- [ ] Database migrations run successfully
- [ ] Configuration environment variables set in staging
- [ ] Health check endpoints responding
- [ ] Metrics and logging verified
````
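The liveness and readiness probes in the manifest above assume the service exposes `/health` and `/ready` endpoints. A minimal Express sketch of those handlers; the dependency checks are placeholders, since this guide does not prescribe an implementation:

```typescript
// Minimal sketch of the probe endpoints referenced by the Kubernetes manifest.
// The connectivity checks are placeholders, not the real service implementation.
import express from "express";

const app = express();

// Liveness: the process is up and able to serve HTTP.
app.get("/health", (_req, res) => {
  res.status(200).json({ status: "ok" });
});

// Readiness: dependencies are reachable before the pod receives traffic.
app.get("/ready", async (_req, res) => {
  const checks = {
    database: await canReachDatabase(), // placeholder check
    redis: await canReachRedis(),       // placeholder check
  };
  const ready = Object.values(checks).every(Boolean);
  res.status(ready ? 200 : 503).json({ ready, checks });
});

// Placeholder connectivity checks; a real service would ping its clients here.
async function canReachDatabase(): Promise<boolean> { return true; }
async function canReachRedis(): Promise<boolean> { return true; }

app.listen(3000);
```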
### Monitoring & Observability Section

````markdown
## Monitoring

### Health Checks

**Liveness Probe**: GET /health
- Returns 200 if service is running
- Used by Kubernetes to restart unhealthy pods

**Readiness Probe**: GET /ready
- Returns 200 if service is ready to receive traffic
- Checks database connectivity, Redis availability
- Used by Kubernetes for traffic routing

### Metrics

Expose these Prometheus metrics:

| Metric | Type | Description |
|--------|------|-------------|
| exports_created_total | Counter | Total exports created |
| exports_completed_total | Counter | Total exports completed successfully |
| exports_failed_total | Counter | Total exports failed |
| export_duration_seconds | Histogram | Time to complete export (p50, p95, p99) |
| export_file_size_bytes | Histogram | Size of exported files |
| export_job_queue_depth | Gauge | Number of jobs awaiting processing |
| export_active_jobs | Gauge | Number of jobs currently processing |

### Alerts

Configure these alerts:

**Export Job Backlog Growing**
- Alert if `export_job_queue_depth > 100` for 5+ minutes
- Action: Scale up worker replicas

**Export Failures Increasing**
- Alert if `exports_failed_total` increases by > 10% in 1 hour
- Action: Investigate failure logs

**Service Unhealthy**
- Alert if liveness probe fails
- Action: Restart pod, check logs

### Logging

Log format (JSON):
```json
{
  "timestamp": "2024-01-15T10:05:00Z",
  "level": "info",
  "service": "export-service",
  "export_id": "exp_123456",
  "event": "export_completed",
  "duration_ms": 5000,
  "file_size_bytes": 2048576,
  "message": "Export completed successfully"
}
```

**Log Levels**
- `debug`: Detailed debugging information
- `info`: Important operational events
- `warn`: Warning conditions (retries, slow operations)
- `error`: Error conditions (failures, exceptions)
````
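For a Node.js + TypeScript component like the example service, the metrics in the table above could be registered with the prom-client library. The metric names come from the table; everything else is an assumption about wiring, not the service's confirmed instrumentation:

```typescript
// Sketch of registering a few of the documented metrics with prom-client.
import client from "prom-client";

export const exportsCreated = new client.Counter({
  name: "exports_created_total",
  help: "Total exports created",
});

export const exportDuration = new client.Histogram({
  name: "export_duration_seconds",
  help: "Time to complete export",
  buckets: [30, 60, 120, 300, 600], // seconds; illustrative bucket boundaries
});

export const queueDepth = new client.Gauge({
  name: "export_job_queue_depth",
  help: "Number of jobs awaiting processing",
});

// Typical usage inside the job pipeline:
//   exportsCreated.inc();
//   const end = exportDuration.startTimer(); /* ...process job... */ end();
//   queueDepth.set(pendingJobCount);

// Text exposition for Prometheus to scrape, e.g. from a GET /metrics handler.
export async function metricsText(): Promise<string> {
  return client.register.metrics();
}
```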
### Dependencies & Integration Section

```markdown
## Dependencies

### Service Dependencies

| Service | Purpose | Criticality | Failure Impact |
|---------|---------|-------------|----------------|
| PostgreSQL | Export job storage | Critical | Service down |
| Redis | Job queue | Critical | Exports won't process |
| S3 | Export file storage | Critical | Can't store exports |
| Auth Service | JWT validation | Critical | Can't validate requests |
| User Service | User data source | Critical | Can't export user data |
| Notification Service | Email notifications | Optional | Users won't receive notifications |

### External Dependencies

- **AWS S3**: For file storage and retrieval
- **PostgreSQL**: For export metadata
- **Redis**: For job queue
- **Kubernetes**: For orchestration

### Fallback Strategies

- Redis unavailable: Use in-memory queue (single instance only)
- User Service unavailable: Fail export with "upstream_error"
- S3 unavailable: Retry with exponential backoff, max 3 times
```
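The Redis fallback above (an in-memory queue, single instance only) is easiest to reason about as a swap of queue implementations behind one interface. A hedged TypeScript sketch; the interface and factory names are illustrative assumptions, not the service's real code:

```typescript
// Illustrative queue selection for the documented Redis fallback strategy.
interface JobQueue {
  enqueue(jobId: string): Promise<void>;
  dequeue(): Promise<string | undefined>;
}

// Single-instance fallback: jobs are not shared across replicas and are lost on restart.
class InMemoryQueue implements JobQueue {
  private jobs: string[] = [];
  async enqueue(jobId: string): Promise<void> { this.jobs.push(jobId); }
  async dequeue(): Promise<string | undefined> { return this.jobs.shift(); }
}

async function selectQueue(
  redisAvailable: () => Promise<boolean>, // placeholder connectivity check
  makeRedisQueue: () => JobQueue,         // placeholder factory for the Redis-backed queue
): Promise<JobQueue> {
  if (await redisAvailable()) return makeRedisQueue();
  console.warn("Redis unavailable; falling back to in-memory queue (single instance only)");
  return new InMemoryQueue();
}
```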
### Performance & SLA Section

```markdown
## Performance Characteristics

### Throughput
- Process up to 1000 exports per day
- Handle 100 concurrent job workers
- Worker replicas scale out as job queue depth grows

### Latency
- Create export job: < 100ms (p95)
- Process 100MB export: 3-5 minutes average
- Query export status: < 50ms (p95)

### Resource Usage
- Memory: 800MB average, peaks at 1.5GB
- CPU: 25% average, peaks at 60%
- Disk (temp): 50GB for concurrent exports

### Service Level Objectives (SLOs)

| Objective | Target |
|-----------|--------|
| Availability | 99.5% uptime |
| Error Rate | < 0.1% |
| p95 Latency (status query) | < 100ms |
| Export Completion | < 10 minutes for 100MB |

### Scalability

- Horizontal: Add more pods for higher throughput
- Vertical: Increase pod memory/CPU for larger exports
- Maximum tested: 10k exports/day on a 5-pod cluster
```
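As a quick sanity check on the availability target, 99.5% uptime corresponds to roughly 3.6 hours of allowed downtime per 30-day window (0.5% of 720 hours). A tiny TypeScript helper, included only to make that arithmetic explicit:

```typescript
// Convert an availability SLO into a downtime budget; illustrative helper only.
function downtimeBudgetHours(availability: number, windowDays = 30): number {
  return (1 - availability) * windowDays * 24;
}

console.log(downtimeBudgetHours(0.995)); // approximately 3.6 hours per 30 days
```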
## Writing Tips

### Be Specific About Responsibilities
- What does this component do?
- What does it NOT do?
- Where do responsibilities start/stop?

### Document All Interfaces
- REST APIs? Document endpoints and schemas
- Message queues? Show event formats
- Database? Show schema and queries
- Dependencies? Show what's called and how

### Include Deployment Details
- How is it deployed (containers, VMs, serverless)?
- Configuration required?
- Health checks?
- Monitoring setup?

### Link to Related Specs
- Reference design docs: `[DES-001]`
- Reference API contracts: `[API-001]`
- Reference data models: `[DATA-001]`
- Reference deployment procedures: `[DEPLOY-001]`

### Document Failure Modes
- What happens if dependencies fail?
- How does the component recover?
- What alerts fire when things go wrong?
## Validation & Fixing Issues

### Run the Validator
```bash
scripts/validate-spec.sh docs/specs/component/cmp-001-your-spec.md
```

### Common Issues & Fixes

**Issue**: "Missing Interfaces section"
- **Fix**: Document all APIs, event formats, and data contracts

**Issue**: "Configuration incomplete"
- **Fix**: Add environment variables, configuration files, and runtime requirements

**Issue**: "No Monitoring section"
- **Fix**: Add health checks, metrics, alerts, and logging strategy

**Issue**: "Deployment steps unclear"
- **Fix**: Add step-by-step deployment and rollback procedures
## Decision-Making Framework

When writing a component spec, consider:

1. **Boundaries**: What is this component's responsibility?
   - What does it own?
   - What does it depend on?
   - Where are boundaries clear?

2. **Interfaces**: How will others interact with this?
   - REST, gRPC, events, direct calls?
   - What contracts must be maintained?
   - How do we evolve interfaces?

3. **Configuration**: What's configurable vs. hardcoded?
   - Environment-specific settings?
   - Runtime tuning parameters?
   - Feature flags?

4. **Operations**: How will we run this in production?
   - Deployment model?
   - Monitoring and alerting?
   - Failure recovery?

5. **Scale**: How much can this component handle?
   - Throughput limits?
   - Scaling strategy?
   - Resource requirements?
## Next Steps

1. **Create the spec**: `scripts/generate-spec.sh component cmp-XXX-slug`
2. **Research**: Find design docs and existing components
3. **Define responsibilities** and boundaries
4. **Document interfaces** for all interactions
5. **Plan deployment** and monitoring
6. **Validate**: `scripts/validate-spec.sh docs/specs/component/cmp-XXX-slug.md`
7. **Share with architecture/ops** before implementation