# How to Create a Component Specification
Component specifications document individual system components or services, including their responsibilities, interfaces, configuration, and deployment characteristics.
## Quick Start

```bash
# 1. Create a new component spec
scripts/generate-spec.sh component cmp-001-descriptive-slug

# 2. Open and fill in the file
#    (The file will be created at: docs/specs/component/cmp-001-descriptive-slug.md)

# 3. Fill in the sections, then validate:
scripts/validate-spec.sh docs/specs/component/cmp-001-descriptive-slug.md

# 4. Fix issues and check completeness:
scripts/check-completeness.sh docs/specs/component/cmp-001-descriptive-slug.md
```
## When to Write a Component Specification
Use a Component Spec when you need to:
- Document a microservice or major system component
- Specify component responsibilities and interfaces
- Define configuration requirements
- Document deployment procedures
- Enable teams to understand component behavior
- Plan for monitoring and observability
## Research Phase

### 1. Research Related Specifications
Find what informed this component:
```bash
# Find design documents that reference this component
grep -r "design\|architecture" docs/specs/ --include="*.md"

# Find API contracts this component implements
grep -r "api\|endpoint" docs/specs/ --include="*.md"

# Find data models this component uses
grep -r "data\|model" docs/specs/ --include="*.md"
```
### 2. Review Similar Components
- How are other components in your system designed?
- What patterns and conventions exist?
- How are they deployed and monitored?
- What's the standard for documentation?
### 3. Understand Dependencies
- What services or systems does this component depend on?
- What services depend on this component?
- What data flows through this component?
- What are the integration points?
## Structure & Content Guide

### Title & Metadata
- Title: "Export Service", "User Authentication Service", etc.
- Type: Microservice, Library, Worker, API Gateway, etc.
- Version: Current version number
### Component Description

```markdown
# Export Service

The Export Service is a microservice responsible for handling bulk user data
exports. It manages the export job lifecycle: queuing, processing, storage,
and delivery.

**Type**: Microservice
**Language**: Node.js + TypeScript
**Deployment**: Kubernetes (3+ replicas)
**Status**: Stable (production)
```
### Purpose & Responsibilities Section

```markdown
## Purpose

Provide reliable, scalable handling of user data exports in multiple formats
while maintaining system stability and data security.

## Primary Responsibilities

1. **Job Queueing**: Accept export requests and queue them for processing
   - Validate request parameters
   - Create export job records
   - Enqueue jobs for processing
   - Return job ID to client
2. **Job Processing**: Execute export jobs asynchronously
   - Query user data from database
   - Transform data to requested format (CSV, JSON)
   - Compress files for storage
   - Handle processing errors and retries
3. **File Storage**: Manage exported file storage and lifecycle
   - Store completed exports to S3
   - Generate secure download URLs
   - Implement TTL-based cleanup
   - Maintain export audit logs
4. **Status Tracking**: Provide job status and progress information
   - Track job state (queued, processing, completed, failed)
   - Record completion time and file metadata
   - Handle cancellation requests
5. **Error Handling**: Manage failures gracefully
   - Retry failed jobs with exponential backoff
   - Notify users of failures
   - Log errors for debugging
   - Preserve system stability during failures
```
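Where a responsibility names a specific mechanism, a short sketch can remove ambiguity. For example, the exponential backoff in responsibility 5 might look like this in TypeScript (a minimal sketch; `processJob`, `MAX_RETRIES`, and `BASE_DELAY_MS` are illustrative, not part of the actual service):

```typescript
// Minimal sketch of retry with exponential backoff (responsibility 5).
// `processJob`, MAX_RETRIES, and BASE_DELAY_MS are illustrative placeholders.
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 1000;

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function processWithRetry(
  jobId: string,
  processJob: (id: string) => Promise<void>,
): Promise<void> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
      await processJob(jobId);
      return; // success
    } catch (err) {
      if (attempt === MAX_RETRIES) throw err; // retries exhausted: surface the failure
      await sleep(BASE_DELAY_MS * 2 ** attempt); // back off: 1s, 2s, 4s, ...
    }
  }
}
```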
### Interfaces & APIs Section

````markdown
## Interfaces

### REST API Endpoints

The service exposes these HTTP endpoints:

#### POST /exports

**Purpose**: Create a new export job
**Authentication**: Required (Bearer token)

**Request Body**:

```json
{
  "data_types": ["users", "transactions"],
  "format": "csv",
  "date_range": { "start": "2024-01-01", "end": "2024-01-31" }
}
```

**Response (201 Created)**:

```json
{
  "id": "exp_123456",
  "status": "queued",
  "created_at": "2024-01-15T10:00:00Z"
}
```

#### GET /exports/{id}

**Purpose**: Get export job status

**Response (200 OK)**:

```json
{
  "id": "exp_123456",
  "status": "completed",
  "download_url": "https://...",
  "file_size_bytes": 2048576
}
```

### Event Publishing

The service publishes these events to the message queue:

**export.started**

```json
{
  "event": "export.started",
  "export_id": "exp_123456",
  "user_id": "usr_789012",
  "timestamp": "2024-01-15T10:00:00Z"
}
```

**export.completed**

```json
{
  "event": "export.completed",
  "export_id": "exp_123456",
  "file_size_bytes": 2048576,
  "format": "csv",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

**export.failed**

```json
{
  "event": "export.failed",
  "export_id": "exp_123456",
  "error": "database_connection_timeout",
  "timestamp": "2024-01-15T10:05:00Z"
}
```

### Dependencies (Consumed APIs)

- User Service API: `GET /users/{id}`, `GET /users` (for data export)
- Auth Service: JWT validation
- Notification Service: Send export completion notifications
````
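It can also help to mirror the documented contracts as types in the component's own language, so the spec and the code drift less. A sketch in TypeScript; the field names come from the JSON examples above, while the type names are illustrative:

```typescript
// Types mirroring the request/response shapes documented above.
type ExportFormat = 'csv' | 'json';
type ExportStatus = 'queued' | 'processing' | 'completed' | 'failed';

interface CreateExportRequest {
  data_types: string[]; // e.g., ["users", "transactions"]
  format: ExportFormat;
  date_range: { start: string; end: string }; // ISO dates
}

interface ExportJob {
  id: string; // e.g., "exp_123456"
  status: ExportStatus;
  created_at: string; // ISO-8601 timestamp
  download_url?: string; // present once status is "completed"
  file_size_bytes?: number; // present once status is "completed"
}
```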
### Configuration Section
````markdown
## Configuration
### Environment Variables
| Variable | Type | Required | Description |
|----------|------|----------|-------------|
| NODE_ENV | string | Yes | Environment (dev, staging, production) |
| PORT | number | No | HTTP server port (default: 3000) |
| DATABASE_URL | string | Yes | PostgreSQL connection string |
| REDIS_URL | string | Yes | Redis connection for job queue |
| S3_BUCKET | string | Yes | S3 bucket for export files |
| S3_REGION | string | Yes | AWS region (e.g., us-east-1) |
| AWS_ACCESS_KEY_ID | string | Yes | AWS credentials |
| AWS_SECRET_ACCESS_KEY | string | Yes | AWS credentials |
| EXPORT_TTL_DAYS | number | No | Export file retention days (default: 7) |
| MAX_EXPORT_SIZE_MB | number | No | Maximum export file size (default: 500) |
| CONCURRENT_WORKERS | number | No | Number of concurrent job processors (default: 5) |
### Configuration File (config.json)
```json
{
  "server": {
    "port": 3000,
    "timeout_ms": 30000
  },
  "jobs": {
    "max_retries": 3,
    "retry_delay_ms": 1000,
    "timeout_ms": 300000
  },
  "export": {
    "max_file_size_mb": 500,
    "ttl_days": 7,
    "formats": ["csv", "json"]
  },
  "storage": {
    "type": "s3",
    "cleanup_interval_hours": 24
  }
}
```

### Runtime Requirements

- Memory: 512MB minimum, 2GB recommended
- CPU: 1 core minimum, 2 cores recommended
- Disk: 10GB for temporary files
- Network: Must reach PostgreSQL, Redis, S3, and the Auth Service
````
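To make the required/optional split testable, the spec can be paired with a small startup loader that fails fast on missing required variables and applies the documented defaults. A sketch (the `loadConfig` helper and `ServiceConfig` type are illustrative, not an existing module):

```typescript
// Illustrative startup loader for the variables documented above.
interface ServiceConfig {
  port: number;
  databaseUrl: string;
  exportTtlDays: number;
  maxExportSizeMb: number;
  concurrentWorkers: number;
}

// Fail fast at startup if a required variable is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

function loadConfig(): ServiceConfig {
  return {
    port: Number(process.env.PORT ?? 3000),
    databaseUrl: requireEnv('DATABASE_URL'),
    exportTtlDays: Number(process.env.EXPORT_TTL_DAYS ?? 7),
    maxExportSizeMb: Number(process.env.MAX_EXPORT_SIZE_MB ?? 500),
    concurrentWorkers: Number(process.env.CONCURRENT_WORKERS ?? 5),
  };
}
```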
### Data Dependencies Section
```markdown
## Data Dependencies
### Input Data
The service requires access to:
- **User data**: From User Service or User DB
  - Fields: id, email, name, created_at, etc.
  - Constraints: User must be authenticated
  - Volume: Scales with the user dataset
- **Transaction data**: From Transaction DB
  - Fields: id, user_id, amount, date, etc.
  - Volume: Can be large (100k+ records per user)
### Output Data
The service produces:
- **Export files**: CSV or JSON format
  - Stored in S3
  - Size: Up to 500MB per file
  - Retention: 7 days
- **Export metadata**: Stored in PostgreSQL
  - Export record with status, size, and completion time
  - Audit trail of all exports
```
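If it helps reviewers, the metadata record can be sketched as a type alongside the prose. A sketch with illustrative field names, assuming the attributes listed above:

```typescript
// Illustrative shape of the export metadata record stored in PostgreSQL.
interface ExportRecord {
  id: string; // e.g., "exp_123456"
  user_id: string;
  status: 'queued' | 'processing' | 'completed' | 'failed';
  format: 'csv' | 'json';
  file_size_bytes: number | null; // set on completion
  created_at: Date;
  completed_at: Date | null;
  expires_at: Date; // created_at + EXPORT_TTL_DAYS
}
```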
### Deployment Section

````markdown
## Deployment
### Container Image
- **Base Image**: node:18-alpine
- **Build**: Dockerfile in repository root
- **Registry**: ECR (AWS Elastic Container Registry)
- **Tag**: Semver (e.g., v1.2.3, latest)
### Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: export-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: export-service
  template:
    metadata:
      labels:
        app: export-service
    spec:
      containers:
        - name: export-service
          image: 123456789.dkr.ecr.us-east-1.amazonaws.com/export-service:latest
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: export-service-secrets
                  key: database-url
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
```

### Deployment Steps

1. **Build**: `docker build -t export-service:v1.2.3 .`
2. **Push**: `docker push <registry>/export-service:v1.2.3`
3. **Update**: `kubectl set image deployment/export-service export-service=<registry>/export-service:v1.2.3`
4. **Verify**: `kubectl rollout status deployment/export-service`

### Rollback Procedure

```bash
# If deployment fails, roll back to the previous version
kubectl rollout undo deployment/export-service

# Verify successful rollback
kubectl rollout status deployment/export-service
```

### Pre-Deployment Checklist

- All tests passing locally
- Database migrations run successfully
- Configuration environment variables set in staging
- Health check endpoints responding
- Metrics and logging verified
````
### Monitoring & Observability Section
````markdown
## Monitoring
### Health Checks
**Liveness Probe**: GET /health
- Returns 200 if service is running
- Used by Kubernetes to restart unhealthy pods
**Readiness Probe**: GET /ready
- Returns 200 if service is ready to receive traffic
- Checks database connectivity, Redis availability
- Used by Kubernetes for traffic routing
### Metrics
Export these Prometheus metrics:
| Metric | Type | Description |
|--------|------|-------------|
| exports_created_total | Counter | Total exports created |
| exports_completed_total | Counter | Total exports completed successfully |
| exports_failed_total | Counter | Total exports failed |
| export_duration_seconds | Histogram | Time to complete export (p50, p95, p99) |
| export_file_size_bytes | Histogram | Size of exported files |
| export_job_queue_depth | Gauge | Number of jobs awaiting processing |
| export_active_jobs | Gauge | Number of jobs currently processing |
### Alerts
Configure these alerts:
**Export Job Backlog Growing**
- Alert if `export_job_queue_depth > 100` for 5+ minutes
- Action: Scale up worker replicas
**Export Failures Increasing**
- Alert if `exports_failed_total` increases by > 10% in 1 hour
- Action: Investigate failure logs
**Service Unhealthy**
- Alert if liveness probe fails
- Action: Restart pod, check logs
### Logging
Log format (JSON):
```json
{
  "timestamp": "2024-01-15T10:05:00Z",
  "level": "info",
  "service": "export-service",
  "export_id": "exp_123456",
  "event": "export_completed",
  "duration_ms": 5000,
  "file_size_bytes": 2048576,
  "message": "Export completed successfully"
}
```

### Log Levels

- **debug**: Detailed debugging information
- **info**: Important operational events
- **warn**: Warning conditions (retries, slow operations)
- **error**: Error conditions (failures, exceptions)
````
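The two probes are simple enough to sketch directly. A minimal Express version, assuming stand-in `db` and `redis` clients with a `ping()` method (the real clients and readiness checks will differ):

```typescript
import express from 'express';

// Stand-ins for the real PostgreSQL and Redis clients; illustrative only.
const db = { ping: async (): Promise<void> => {} };
const redis = { ping: async (): Promise<void> => {} };

const app = express();

// Liveness: the process is up and able to serve HTTP.
app.get('/health', (_req, res) => {
  res.status(200).json({ status: 'ok' });
});

// Readiness: dependencies are reachable before traffic is routed here.
app.get('/ready', async (_req, res) => {
  try {
    await Promise.all([db.ping(), redis.ping()]);
    res.status(200).json({ status: 'ready' });
  } catch {
    res.status(503).json({ status: 'not_ready' });
  }
});

app.listen(3000);
```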
### Dependencies & Integration Section
```markdown
## Dependencies
### Service Dependencies
| Service | Purpose | Criticality | Failure Impact |
|---------|---------|-------------|----------------|
| PostgreSQL | Export job storage | Critical | Service down |
| Redis | Job queue | Critical | Exports won't process |
| S3 | Export file storage | Critical | Can't store exports |
| Auth Service | JWT validation | Critical | Can't validate requests |
| User Service | User data source | Critical | Can't export user data |
| Notification Service | Email notifications | Optional | Users won't get notifications |
### External Dependencies
- **AWS S3**: For file storage and retrieval
- **PostgreSQL**: For export metadata
- **Redis**: For job queue
- **Kubernetes**: For orchestration
### Fallback Strategies
- Redis unavailable: Use in-memory queue (single instance only)
- User Service unavailable: Fail export with "upstream_error"
- S3 unavailable: Retry with exponential backoff, max 3 times
```
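The Redis fallback is easier to reason about when the queue sits behind an interface. A sketch of that shape (the `JobQueue` interface and `InMemoryQueue` class are hypothetical; as noted above, the in-memory variant is only safe with a single instance):

```typescript
// Hypothetical queue abstraction that makes the Redis -> in-memory fallback possible.
interface JobQueue {
  enqueue(jobId: string): Promise<void>;
  dequeue(): Promise<string | undefined>;
}

// Single-instance fallback: jobs are lost on restart and invisible to other replicas.
class InMemoryQueue implements JobQueue {
  private jobs: string[] = [];

  async enqueue(jobId: string): Promise<void> {
    this.jobs.push(jobId);
  }

  async dequeue(): Promise<string | undefined> {
    return this.jobs.shift();
  }
}
```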
### Performance & SLA Section

```markdown
## Performance Characteristics
### Throughput
- Process up to 1000 exports per day
- Handle 100 concurrent job workers
- Queue depth auto-scales based on load
### Latency
- Create export job: < 100ms (p95)
- Process 100MB export: 3-5 minutes average
- Query export status: < 50ms (p95)
### Resource Usage
- Memory: 800MB average, peaks at 1.5GB
- CPU: 25% average, peaks at 60%
- Disk (temp): 50GB for concurrent exports
### Service Level Objectives (SLOs)
| Objective | Target |
|-----------|--------|
| Availability | 99.5% uptime |
| Error Rate | < 0.1% |
| p95 Latency (status query) | < 100ms |
| Export Completion | < 10 minutes for 100MB |
### Scalability
- Horizontal: Add more pods for higher throughput
- Vertical: Increase pod memory/CPU for larger exports
- Maximum tested: 10k exports/day on a 5-pod cluster
```
## Writing Tips

### Be Specific About Responsibilities
- What does this component do?
- What does it NOT do?
- Where do responsibilities start/stop?
### Document All Interfaces
- REST APIs? Document endpoints and schemas
- Message queues? Show event formats
- Database? Show schema and queries
- Dependencies? Show what's called and how
### Include Deployment Details
- How is it deployed (containers, VMs, serverless)?
- Configuration required?
- Health checks?
- Monitoring setup?
### Link to Related Specs

- Reference design docs: [DES-001]
- Reference API contracts: [API-001]
- Reference data models: [DATA-001]
- Reference deployment procedures: [DEPLOY-001]
### Document Failure Modes
- What happens if dependencies fail?
- How does the component recover?
- What alerts fire when things go wrong?
## Validation & Fixing Issues

### Run the Validator

```bash
scripts/validate-spec.sh docs/specs/component/cmp-001-your-spec.md
```
### Common Issues & Fixes
Issue: "Missing Interfaces section"
- Fix: Document all APIs, event formats, and data contracts
Issue: "Configuration incomplete"
- Fix: Add environment variables, configuration files, and runtime requirements
Issue: "No Monitoring section"
- Fix: Add health checks, metrics, alerts, and logging strategy
Issue: "Deployment steps unclear"
- Fix: Add step-by-step deployment and rollback procedures
## Decision-Making Framework
When writing a component spec, consider:
1. **Boundaries**: What is this component's responsibility?
   - What does it own?
   - What does it depend on?
   - Where are the boundaries drawn?
2. **Interfaces**: How will others interact with this?
   - REST, gRPC, events, direct calls?
   - What contracts must be maintained?
   - How do we evolve interfaces?
3. **Configuration**: What's configurable vs. hardcoded?
   - Environment-specific settings?
   - Runtime tuning parameters?
   - Feature flags?
4. **Operations**: How will we run this in production?
   - Deployment model?
   - Monitoring and alerting?
   - Failure recovery?
5. **Scale**: How much can this component handle?
   - Throughput limits?
   - Scaling strategy?
   - Resource requirements?
## Next Steps

1. **Create the spec**: `scripts/generate-spec.sh component cmp-XXX-slug`
2. **Research**: Find design docs and existing components
3. **Define** responsibilities and boundaries
4. **Document** interfaces for all interactions
5. **Plan** deployment and monitoring
6. **Validate**: `scripts/validate-spec.sh docs/specs/component/cmp-XXX-slug.md`
7. **Share** with architecture/ops teams before implementation