# How to Create a Component Specification

Component specifications document individual system components or services, including their responsibilities, interfaces, configuration, and deployment characteristics.

## Quick Start

```bash
# 1. Create a new component spec
scripts/generate-spec.sh component cmp-001-descriptive-slug

# 2. Open and fill in the file
# (The file will be created at: docs/specs/component/cmp-001-descriptive-slug.md)

# 3. Fill in the sections, then validate:
scripts/validate-spec.sh docs/specs/component/cmp-001-descriptive-slug.md

# 4. Fix issues and check completeness:
scripts/check-completeness.sh docs/specs/component/cmp-001-descriptive-slug.md
```

## When to Write a Component Specification

Use a Component Spec when you need to:

- Document a microservice or major system component
- Specify component responsibilities and interfaces
- Define configuration requirements
- Document deployment procedures
- Enable teams to understand component behavior
- Plan for monitoring and observability

## Research Phase

### 1. Research Related Specifications

Find what informed this component:

```bash
# Find design documents that reference this component
grep -r "design\|architecture" docs/specs/ --include="*.md"

# Find API contracts this component implements
grep -r "api\|endpoint" docs/specs/ --include="*.md"

# Find data models this component uses
grep -r "data\|model" docs/specs/ --include="*.md"
```

### 2. Review Similar Components

- How are other components in your system designed?
- What patterns and conventions exist?
- How are they deployed and monitored?
- What's the standard for documentation?

### 3. Understand Dependencies

- What services or systems does this component depend on?
- What services depend on this component?
- What data flows through this component?
- What are the integration points?

## Structure & Content Guide

### Title & Metadata

- **Title**: "Export Service", "User Authentication Service", etc.
- **Type**: Microservice, Library, Worker, API Gateway, etc.
- **Version**: Current version number

### Component Description

```markdown
# Export Service

The Export Service is a microservice responsible for handling bulk user data
exports. It manages the export job lifecycle: queuing, processing, storage,
and delivery.

**Type**: Microservice
**Language**: Node.js + TypeScript
**Deployment**: Kubernetes (3+ replicas)
**Status**: Stable (production)
```

### Purpose & Responsibilities Section

```markdown
## Purpose

Provide reliable, scalable handling of user data exports in multiple formats
while maintaining system stability and data security.

## Primary Responsibilities

1. **Job Queueing**: Accept export requests and queue them for processing
   - Validate request parameters
   - Create export job records
   - Enqueue jobs for processing
   - Return job ID to client
2. **Job Processing**: Execute export jobs asynchronously
   - Query user data from database
   - Transform data to requested format (CSV, JSON)
   - Compress files for storage
   - Handle processing errors and retries
3. **File Storage**: Manage exported file storage and lifecycle
   - Store completed exports to S3
   - Generate secure download URLs
   - Implement TTL-based cleanup
   - Maintain export audit logs
4. **Status Tracking**: Provide job status and progress information
   - Track job state (queued, processing, completed, failed)
   - Record completion time and file metadata
   - Handle cancellation requests
5. **Error Handling**: Manage failures gracefully
   - Retry failed jobs with exponential backoff
   - Notify users of failures
   - Log errors for debugging
   - Preserve system stability during failures
```
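The retry behavior in responsibility 5 deserves precision in the spec: state the backoff base, the multiplier, and the retry cap. As a hedged illustration (not part of the template; `runJob` is a hypothetical callback, and the defaults mirror the `max_retries` and `retry_delay_ms` values shown later in the sample config.json), an exponential backoff loop in TypeScript might look like:

```typescript
// Exponential backoff for a failing export job. `runJob` stands in for the
// real processing step; maxRetries/baseDelayMs mirror the sample config.
async function processWithRetry(
  runJob: () => Promise<void>,
  maxRetries = 3,
  baseDelayMs = 1000,
): Promise<void> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      await runJob();
      return; // success: stop retrying
    } catch (err) {
      if (attempt === maxRetries) throw err; // retries exhausted: surface the failure
      const delayMs = baseDelayMs * 2 ** attempt; // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Pinning these numbers down in the spec lets operators compute worst-case retry latency (here 1s + 2s + 4s = 7s of added delay before a job is marked failed).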
### Interfaces & APIs Section

````markdown
## Interfaces

### REST API Endpoints

The service exposes these HTTP endpoints:

#### POST /exports

**Purpose**: Create a new export job
**Authentication**: Required (Bearer token)

**Request Body**:

```json
{
  "data_types": ["users", "transactions"],
  "format": "csv",
  "date_range": {
    "start": "2024-01-01",
    "end": "2024-01-31"
  }
}
```

**Response** (201 Created):

```json
{
  "id": "exp_123456",
  "status": "queued",
  "created_at": "2024-01-15T10:00:00Z"
}
```

#### GET /exports/{id}

**Purpose**: Get export job status

**Response** (200 OK):

```json
{
  "id": "exp_123456",
  "status": "completed",
  "download_url": "https://...",
  "file_size_bytes": 2048576
}
```

### Event Publishing

The service publishes these events to the message queue:

**export.started**

```json
{
  "event": "export.started",
  "export_id": "exp_123456",
  "user_id": "usr_789012",
  "timestamp": "2024-01-15T10:00:00Z"
}
```

**export.completed**

```json
{
  "event": "export.completed",
  "export_id": "exp_123456",
  "file_size_bytes": 2048576,
  "format": "csv",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

**export.failed**

```json
{
  "event": "export.failed",
  "export_id": "exp_123456",
  "error": "database_connection_timeout",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

### Dependencies (Consumed APIs)

- **User Service API**: GET /users/{id}, GET /users (for data export)
- **Auth Service**: JWT validation
- **Notification Service**: Send export completion notifications
````

### Configuration Section

````markdown
## Configuration

### Environment Variables

| Variable | Type | Required | Description |
|----------|------|----------|-------------|
| NODE_ENV | string | Yes | Environment (dev, staging, production) |
| PORT | number | Yes | HTTP server port (default: 3000) |
| DATABASE_URL | string | Yes | PostgreSQL connection string |
| REDIS_URL | string | Yes | Redis connection for job queue |
| S3_BUCKET | string | Yes | S3 bucket for export files |
| S3_REGION | string | Yes | AWS region (e.g., us-east-1) |
| AWS_ACCESS_KEY_ID | string | Yes | AWS credentials |
| AWS_SECRET_ACCESS_KEY | string | Yes | AWS credentials |
| EXPORT_TTL_DAYS | number | No | Export file retention in days (default: 7) |
| MAX_EXPORT_SIZE_MB | number | No | Maximum export file size (default: 500) |
| CONCURRENT_WORKERS | number | No | Number of concurrent job processors (default: 5) |

### Configuration File (config.json)

```json
{
  "server": {
    "port": 3000,
    "timeout_ms": 30000
  },
  "jobs": {
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "timeout_ms": 300000
  },
  "export": {
    "max_file_size_mb": 500,
    "ttl_days": 7,
    "formats": ["csv", "json"]
  },
  "storage": {
    "type": "s3",
    "cleanup_interval_hours": 24
  }
}
```

### Runtime Requirements

- **Memory**: 512MB minimum, 2GB recommended
- **CPU**: 1 core minimum, 2 cores recommended
- **Disk**: 10GB for temporary files
- **Network**: Must reach PostgreSQL, Redis, S3, and the Auth Service
````
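A spec this explicit about required variables pays off if the service validates them at startup. A minimal sketch, assuming a plain Node.js entry point (the `loadConfig` helper is illustrative, not an API of the service):

```typescript
// Startup validation for the required variables in the table above.
const REQUIRED_VARS = [
  "NODE_ENV", "PORT", "DATABASE_URL", "REDIS_URL",
  "S3_BUCKET", "S3_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY",
] as const;

function loadConfig(): Record<string, string> {
  const missing = REQUIRED_VARS.filter((name) => !process.env[name]);
  if (missing.length > 0) {
    // Fail fast: refusing to boot beats failing mid-export.
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  const config = Object.fromEntries(
    REQUIRED_VARS.map((name) => [name, process.env[name] as string]),
  );
  // Optional variables fall back to the documented defaults.
  config.EXPORT_TTL_DAYS = process.env.EXPORT_TTL_DAYS ?? "7";
  config.MAX_EXPORT_SIZE_MB = process.env.MAX_EXPORT_SIZE_MB ?? "500";
  config.CONCURRENT_WORKERS = process.env.CONCURRENT_WORKERS ?? "5";
  return config;
}
```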
### Data Dependencies Section

```markdown
## Data Dependencies

### Input Data

The service requires access to:

- **User data**: From the User Service or User DB
  - Fields: id, email, name, created_at, etc.
  - Constraints: User must be authenticated
  - Volume: Scales with the user dataset
- **Transaction data**: From the Transaction DB
  - Fields: id, user_id, amount, date, etc.
  - Volume: Can be large (100k+ per user)

### Output Data

The service produces:

- **Export files**: CSV or JSON format
  - Stored in S3
  - Size: Up to 500MB per file
  - Retention: 7 days
- **Export metadata**: Stored in PostgreSQL
  - Export record with status, size, completion time
  - Audit trail of all exports
```

### Deployment Section

````markdown
## Deployment

### Container Image

- **Base Image**: node:18-alpine
- **Build**: Dockerfile in repository root
- **Registry**: ECR (AWS Elastic Container Registry)
- **Tag**: Semver (e.g., v1.2.3, latest)

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: export-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: export-service
  template:
    metadata:
      labels:
        app: export-service
    spec:
      containers:
        - name: export-service
          image: 123456789.dkr.ecr.us-east-1.amazonaws.com/export-service:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: export-service-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
```

### Deployment Steps

1. **Build**: `docker build -t export-service:v1.2.3 .`
2. **Push**: `docker push <registry>/export-service:v1.2.3`
3. **Update**: `kubectl set image deployment/export-service export-service=<registry>/export-service:v1.2.3`
4. **Verify**: `kubectl rollout status deployment/export-service`

### Rollback Procedure

```bash
# If deployment fails, roll back to the previous version
kubectl rollout undo deployment/export-service

# Verify the rollback succeeded
kubectl rollout status deployment/export-service
```

### Pre-Deployment Checklist

- [ ] All tests passing locally
- [ ] Database migrations run successfully
- [ ] Configuration environment variables set in staging
- [ ] Health check endpoints responding
- [ ] Metrics and logging verified
````
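The liveness and readiness probes above only work if the service implements the matching endpoints. A minimal sketch using Express (the `checkDatabase`/`checkRedis` stubs are hypothetical placeholders for real client pings):

```typescript
import express from "express";

// Hypothetical dependency checks; replace the bodies with real client pings
// (e.g., `SELECT 1` against PostgreSQL, PING against Redis).
async function checkDatabase(): Promise<boolean> { return true; }
async function checkRedis(): Promise<boolean> { return true; }

const app = express();

// Liveness: the process is up; keep this cheap and dependency-free.
app.get("/health", (_req, res) => {
  res.status(200).json({ status: "ok" });
});

// Readiness: the service can do useful work (DB and queue reachable).
app.get("/ready", async (_req, res) => {
  const [db, redis] = await Promise.all([checkDatabase(), checkRedis()]);
  if (db && redis) {
    res.status(200).json({ status: "ready" });
  } else {
    // 503 keeps the pod out of Service rotation without restarting it.
    res.status(503).json({ status: "not_ready", db, redis });
  }
});

app.listen(Number(process.env.PORT ?? 3000));
```

Keeping /health cheap and putting dependency checks only in /ready matches how Kubernetes uses the two probes: a liveness failure restarts the pod, while a readiness failure merely removes it from rotation.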
"exp_123456", "event": "export_completed", "duration_ms": 5000, "file_size_bytes": 2048576, "message": "Export completed successfully" } ``` **Log Levels** - `debug`: Detailed debugging information - `info`: Important operational events - `warn`: Warning conditions (retries, slow operations) - `error`: Error conditions (failures, exceptions) ``` ### Dependencies & Integration Section ```markdown ## Dependencies ### Service Dependencies | Service | Purpose | Criticality | Failure Impact | |---------|---------|-------------|----------------| | PostgreSQL | Export job storage | Critical | Service down | | Redis | Job queue | Critical | Exports won't process | | S3 | Export file storage | Critical | Can't store exports | | Auth Service | JWT validation | Critical | Can't validate requests | | User Service | User data source | Critical | Can't export user data | | Notification Service | Email notifications | Optional | Users won't get notification | ### External Dependencies - **AWS S3**: For file storage and retrieval - **PostgreSQL**: For export metadata - **Redis**: For job queue - **Kubernetes**: For orchestration ### Fallback Strategies - Redis unavailable: Use in-memory queue (single instance only) - User Service unavailable: Fail export with "upstream_error" - S3 unavailable: Retry with exponential backoff, max 3 times ``` ### Performance & SLA Section ```markdown ## Performance Characteristics ### Throughput - Process up to 1000 exports per day - Handle 100 concurrent job workers - Queue depth auto-scales based on load ### Latency - Create export job: < 100ms (p95) - Process 100MB export: 3-5 minutes average - Query export status: < 50ms (p95) ### Resource Usage - Memory: 800MB average, peaks at 1.5GB - CPU: 25% average, peaks at 60% - Disk (temp): 50GB for concurrent exports ### Service Level Objectives (SLOs) | Objective | Target | |-----------|--------| | Availability | 99.5% uptime | | Error Rate | < 0.1% | | p95 Latency (status query) | < 100ms | | Export Completion | < 10 minutes for 100MB | ### Scalability - Horizontal: Add more pods for higher throughput - Vertical: Increase pod memory/CPU for larger exports - Maximum tested: 10k exports/day on 5 pod cluster ``` ## Writing Tips ### Be Specific About Responsibilities - What does this component do? - What does it NOT do? - Where do responsibilities start/stop? ### Document All Interfaces - REST APIs? Document endpoints and schemas - Message queues? Show event formats - Database? Show schema and queries - Dependencies? Show what's called and how ### Include Deployment Details - How is it deployed (containers, VMs, serverless)? - Configuration required? - Health checks? - Monitoring setup? ### Link to Related Specs - Reference design docs: `[DES-001]` - Reference API contracts: `[API-001]` - Reference data models: `[DATA-001]` - Reference deployment procedures: `[DEPLOY-001]` ### Document Failure Modes - What happens if dependencies fail? - How does the component recover? - What alerts fire when things go wrong? 
### Dependencies & Integration Section

```markdown
## Dependencies

### Service Dependencies

| Service | Purpose | Criticality | Failure Impact |
|---------|---------|-------------|----------------|
| PostgreSQL | Export job storage | Critical | Service down |
| Redis | Job queue | Critical | Exports won't process |
| S3 | Export file storage | Critical | Can't store exports |
| Auth Service | JWT validation | Critical | Can't validate requests |
| User Service | User data source | Critical | Can't export user data |
| Notification Service | Email notifications | Optional | Users won't get notifications |

### External Dependencies

- **AWS S3**: For file storage and retrieval
- **PostgreSQL**: For export metadata
- **Redis**: For job queue
- **Kubernetes**: For orchestration

### Fallback Strategies

- Redis unavailable: Use an in-memory queue (single instance only)
- User Service unavailable: Fail the export with "upstream_error"
- S3 unavailable: Retry with exponential backoff, max 3 times
```

### Performance & SLA Section

```markdown
## Performance Characteristics

### Throughput

- Process up to 1000 exports per day
- Handle 100 concurrent job workers
- Queue depth auto-scales based on load

### Latency

- Create export job: < 100ms (p95)
- Process 100MB export: 3-5 minutes average
- Query export status: < 50ms (p95)

### Resource Usage

- Memory: 800MB average, peaks at 1.5GB
- CPU: 25% average, peaks at 60%
- Disk (temp): 50GB for concurrent exports

### Service Level Objectives (SLOs)

| Objective | Target |
|-----------|--------|
| Availability | 99.5% uptime |
| Error Rate | < 0.1% |
| p95 Latency (status query) | < 100ms |
| Export Completion | < 10 minutes for 100MB |

### Scalability

- Horizontal: Add pods for higher throughput
- Vertical: Increase pod memory/CPU for larger exports
- Maximum tested: 10k exports/day on a 5-pod cluster
```

## Writing Tips

### Be Specific About Responsibilities

- What does this component do?
- What does it NOT do?
- Where do responsibilities start and stop?

### Document All Interfaces

- REST APIs? Document endpoints and schemas
- Message queues? Show event formats
- Database? Show schema and queries
- Dependencies? Show what's called and how

### Include Deployment Details

- How is it deployed (containers, VMs, serverless)?
- What configuration is required?
- Health checks?
- Monitoring setup?

### Link to Related Specs

- Reference design docs: `[DES-001]`
- Reference API contracts: `[API-001]`
- Reference data models: `[DATA-001]`
- Reference deployment procedures: `[DEPLOY-001]`

### Document Failure Modes

- What happens if dependencies fail?
- How does the component recover?
- What alerts fire when things go wrong?

## Validation & Fixing Issues

### Run the Validator

```bash
scripts/validate-spec.sh docs/specs/component/cmp-001-your-spec.md
```

### Common Issues & Fixes

**Issue**: "Missing Interfaces section"
- **Fix**: Document all APIs, event formats, and data contracts

**Issue**: "Configuration incomplete"
- **Fix**: Add environment variables, configuration files, and runtime requirements

**Issue**: "No Monitoring section"
- **Fix**: Add health checks, metrics, alerts, and logging strategy

**Issue**: "Deployment steps unclear"
- **Fix**: Add step-by-step deployment and rollback procedures

## Decision-Making Framework

When writing a component spec, consider:

1. **Boundaries**: What is this component's responsibility?
   - What does it own?
   - What does it depend on?
   - Are the boundaries clear?
2. **Interfaces**: How will others interact with this?
   - REST, gRPC, events, direct calls?
   - What contracts must be maintained?
   - How do we evolve interfaces?
3. **Configuration**: What's configurable vs. hardcoded?
   - Environment-specific settings?
   - Runtime tuning parameters?
   - Feature flags?
4. **Operations**: How will we run this in production?
   - Deployment model?
   - Monitoring and alerting?
   - Failure recovery?
5. **Scale**: How much can this component handle?
   - Throughput limits?
   - Scaling strategy?
   - Resource requirements?

## Next Steps

1. **Create the spec**: `scripts/generate-spec.sh component cmp-XXX-slug`
2. **Research**: Find design docs and existing components
3. **Define responsibilities** and boundaries
4. **Document interfaces** for all interactions
5. **Plan deployment** and monitoring
6. **Validate**: `scripts/validate-spec.sh docs/specs/component/cmp-XXX-slug.md`
7. **Share with architecture/ops** before implementation