# How to Create a Configuration Schema Specification
Configuration schema specifications document all configurable parameters for a system, including their types, valid values, defaults, and impact.
## Quick Start
```bash
# 1. Create a new configuration schema
scripts/generate-spec.sh configuration-schema config-001-descriptive-slug
# 2. Open and fill in the file
# (The file will be created at: docs/specs/configuration-schema/config-001-descriptive-slug.md)
# 3. Fill in configuration fields and validation rules, then validate:
scripts/validate-spec.sh docs/specs/configuration-schema/config-001-descriptive-slug.md
# 4. Fix issues and check completeness:
scripts/check-completeness.sh docs/specs/configuration-schema/config-001-descriptive-slug.md
```
## When to Write a Configuration Schema
Use a Configuration Schema when you need to:
- Document all configurable system parameters
- Specify environment variables and their meanings
- Define configuration file formats
- Document validation rules and constraints
- Enable operations teams to configure systems safely
- Provide examples for different environments
## Research Phase
### 1. Research Related Specifications
Find what you're configuring:
```bash
# Find component specs
grep -r "component" docs/specs/ --include="*.md"
# Find deployment procedures
grep -r "deploy" docs/specs/ --include="*.md"
# Find existing configuration specs
grep -r "config" docs/specs/ --include="*.md"
```
### 2. Understand Configuration Needs
- What aspects of the system need to be configurable?
- What differs between environments (dev, staging, prod)?
- What can change at runtime vs. what requires a restart?
- What's sensitive (secrets, credentials)?
### 3. Review Existing Configurations
- How are other services configured?
- What configuration format is used?
- What environment variables exist?
- What patterns should be followed?
## Structure & Content Guide
### Title & Metadata
- **Title**: "Export Service Configuration", "API Gateway Config", etc.
- **Component**: What component is being configured
- **Version**: Configuration format version
- **Status**: Current, Deprecated, etc.
### Overview Section
```markdown
# Export Service Configuration Schema
## Summary
Defines all configurable parameters for the Export Service microservice.
Configuration can be set via environment variables or JSON config file.
**Configuration Methods**:
- Environment variables (recommended for Docker/Kubernetes)
- config.json file (for monolithic deployments)
- Command-line arguments (for local development)
**Scope**: All settings that affect Export Service behavior
**Format**: JSON Schema compliant
```
### Configuration Methods Section
````markdown
## Configuration Methods
### Method 1: Environment Variables (Recommended for Production)
Used in containerized deployments (Docker, Kubernetes).
Set before starting the service.
**Syntax**: `EXPORT_SERVICE_KEY=value`
**Example**:
```bash
export EXPORT_SERVICE_PORT=3000
export EXPORT_SERVICE_LOG_LEVEL=info
export EXPORT_SERVICE_DATABASE_URL=postgresql://user:pass@host/db
```
### Method 2: Configuration File (config.json)
Used in monolithic or local deployments.
JSON format with hierarchical structure.
**Location**: `./config.json` in working directory
**Example**:
```json
{
  "server": {
    "port": 3000,
    "timeout_ms": 30000
  },
  "database": {
    "url": "postgresql://user:pass@host/db",
    "pool_size": 10
  }
}
```
### Method 3: Command-Line Arguments
Used in local development. Takes precedence over environment variables and file config.
**Syntax**: `--key value` or `--key=value`
**Example**:
```bash
node index.js --port 3000 --log-level debug
```
### Precedence (Priority Order)
1. Command-line arguments (highest priority)
2. Environment variables
3. config.json file
4. Default values (lowest priority)
````
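A precedence list like the one above maps onto a simple merge in which later, higher-priority sources override earlier ones. A minimal sketch in Node.js (the key names and defaults here are illustrative, not the service's actual loader):

```javascript
// Merge config sources in ascending priority:
// defaults < config.json < environment < CLI.
const defaults = { port: 3000, logLevel: "info", timeoutMs: 30000 };

function resolveConfig(fileConfig = {}, envConfig = {}, cliConfig = {}) {
  // Object spread applies left to right, so cliConfig wins conflicts.
  return { ...defaults, ...fileConfig, ...envConfig, ...cliConfig };
}

const config = resolveConfig(
  { port: 8080, timeoutMs: 60000 }, // from config.json
  { port: 9090 },                   // from environment variables
  { logLevel: "debug" }             // from command-line arguments
);
console.log(config); // { port: 9090, logLevel: 'debug', timeoutMs: 60000 }
```

Whatever the actual implementation, spelling out the merge order like this saves operators from guessing why a value "didn't take".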
### Configuration Fields Section
Document each configuration field:
```markdown
## Configuration Fields
### Server Section
#### PORT
- **Type**: integer
- **Default**: 3000
- **Range**: 1024-65535
- **Environment Variable**: `EXPORT_SERVICE_PORT`
- **Config File Key**: `server.port`
- **Description**: HTTP server listening port
- **Examples**:
- Development: 3000 (local machine, different services use different ports)
- Production: 3000 (behind load balancer, port not exposed)
- **Impact**: Service fails to start if the port is already in use
- **Can Change at Runtime**: No (requires restart)
#### TIMEOUT_MS
- **Type**: integer (milliseconds)
- **Default**: 30000 (30 seconds)
- **Range**: 5000-120000
- **Environment Variable**: `EXPORT_SERVICE_TIMEOUT_MS`
- **Config File Key**: `server.timeout_ms`
- **Description**: HTTP request timeout
- **Considerations**:
- Must be longer than longest export duration
- If too short: Long exports time out and fail
- If too long: Failed connections hang longer
- **Examples**:
- Development: 30000 (quick feedback on errors)
- Production: 120000 (accounts for large exports)
#### ENABLE_COMPRESSION
- **Type**: boolean
- **Default**: true
- **Environment Variable**: `EXPORT_SERVICE_ENABLE_COMPRESSION`
- **Config File Key**: `server.enable_compression`
- **Description**: Enable HTTP response compression (gzip)
- **Considerations**:
- Reduces bandwidth but increases CPU usage
- Should be true unless CPU constrained
- **Typical Value**: true
### Database Section
#### DATABASE_URL
- **Type**: string (connection string)
- **Default**: None (required)
- **Environment Variable**: `EXPORT_SERVICE_DATABASE_URL`
- **Config File Key**: `database.url`
- **Format**: `postgresql://user:password@host:port/database`
- **Description**: PostgreSQL connection string
- **Examples**:
- Development: `postgresql://localhost/export_service`
- Staging: `postgresql://stage-db.example.com/export_stage`
- Production: `postgresql://prod-db.example.com/export_prod` (managed RDS)
- **Sensitive**: Yes (contains credentials - use secrets management)
- **Required**: Yes
- **Validation**:
- Must be valid PostgreSQL connection string
- Service fails to start if URL invalid or unreachable
#### DATABASE_POOL_SIZE
- **Type**: integer
- **Default**: 10
- **Range**: 1-100
- **Environment Variable**: `EXPORT_SERVICE_DATABASE_POOL_SIZE`
- **Config File Key**: `database.pool_size`
- **Description**: Number of database connections to maintain
- **Considerations**:
- More connections allow more concurrent queries
- Each connection uses memory and database slot
- Database has max_connections limit (typically 100-500)
- **Tuning**:
- 1 service instance: 5-10 connections
- 5 service instances: 2-4 connections each (10-20 total)
- Kubernetes auto-scaling: 2-3 per pod
#### DATABASE_QUERY_TIMEOUT_MS
- **Type**: integer (milliseconds)
- **Default**: 10000 (10 seconds)
- **Range**: 1000-60000
- **Environment Variable**: `EXPORT_SERVICE_DATABASE_QUERY_TIMEOUT_MS`
- **Config File Key**: `database.query_timeout_ms`
- **Description**: Timeout for individual database queries
- **Considerations**:
- Export queries can take several seconds for large datasets
- If too short: Queries fail prematurely
- If too long: Failed queries block connection pool
- **Typical Values**:
- Simple queries: 5000ms
- Large exports: 30000ms
### Redis (Job Queue) Section
#### REDIS_URL
- **Type**: string (connection string)
- **Default**: None (required)
- **Environment Variable**: `EXPORT_SERVICE_REDIS_URL`
- **Config File Key**: `redis.url`
- **Format**: `redis://user:password@host:port/db`
- **Description**: Redis connection string for job queue
- **Examples**:
- Development: `redis://localhost:6379/0`
- Staging: `redis://redis-stage.example.com:6379/0`
- Production: `redis://redis-prod.example.com:6379/0` (managed ElastiCache)
- **Sensitive**: Yes (may contain credentials)
- **Required**: Yes
#### REDIS_MAX_RETRIES
- **Type**: integer
- **Default**: 3
- **Range**: 1-10
- **Environment Variable**: `EXPORT_SERVICE_REDIS_MAX_RETRIES`
- **Config File Key**: `redis.max_retries`
- **Description**: Maximum retry attempts for Redis operations
- **Considerations**:
- More retries provide resilience but increase latency on failure
- Should be 3-5 for production
- **Typical Values**: 3
#### CONCURRENT_WORKERS
- **Type**: integer
- **Default**: 3
- **Range**: 1-20
- **Environment Variable**: `EXPORT_SERVICE_CONCURRENT_WORKERS`
- **Config File Key**: `redis.concurrent_workers`
- **Description**: Number of concurrent export workers
- **Considerations**:
- Each worker processes one export job at a time
- More workers process jobs faster but use more resources
- Limited by CPU and memory available
- Kubernetes scales pods, not this setting
- **Tuning**:
- Development: 1-2 (for debugging)
- Production with 2 CPU: 2-3 workers
- Production with 4+ CPU: 4-8 workers
### Export Section
#### MAX_EXPORT_SIZE_MB
- **Type**: integer
- **Default**: 500
- **Range**: 10-5000
- **Environment Variable**: `EXPORT_SERVICE_MAX_EXPORT_SIZE_MB`
- **Config File Key**: `export.max_export_size_mb`
- **Description**: Maximum size for an export file (in MB)
- **Considerations**:
- Files larger than this are rejected
- Limited by disk space and memory
- Should match S3 bucket policies
- **Typical Values**:
- Small deployments: 100MB
- Standard: 500MB
- Enterprise: 1000-5000MB
#### EXPORT_TTL_DAYS
- **Type**: integer (days)
- **Default**: 7
- **Range**: 1-365
- **Environment Variable**: `EXPORT_SERVICE_EXPORT_TTL_DAYS`
- **Config File Key**: `export.ttl_days`
- **Description**: How long to retain export files after completion
- **Considerations**:
- Files deleted after TTL expires
- Affects storage costs (shorter TTL = lower cost)
- Users must download before expiration
- **Typical Values**:
- Short retention: 3 days (reduce storage cost)
- Standard: 7 days (reasonable download window)
- Long retention: 30 days (enterprise customers)
#### EXPORT_FORMATS
- **Type**: array of strings
- **Default**: ["csv", "json"]
- **Valid Values**: "csv", "json", "parquet"
- **Environment Variable**: `EXPORT_SERVICE_EXPORT_FORMATS` (comma-separated)
- **Config File Key**: `export.formats`
- **Description**: Supported export file formats
- **Examples**:
- `["csv", "json"]` (most common)
- `["csv", "json", "parquet"]` (full support)
- **Configuration**:
- Environment: `EXPORT_SERVICE_EXPORT_FORMATS=csv,json`
- File: `"formats": ["csv", "json"]`
#### COMPRESSION_ENABLED
- **Type**: boolean
- **Default**: true
- **Environment Variable**: `EXPORT_SERVICE_COMPRESSION_ENABLED`
- **Config File Key**: `export.compression_enabled`
- **Description**: Enable gzip compression for export files
- **Considerations**:
- Reduces file size by 60-80% typically
- Increases CPU usage during export
- Should be enabled unless CPU is bottleneck
- **Typical Value**: true
### Storage Section
#### S3_BUCKET
- **Type**: string
- **Default**: None (required)
- **Environment Variable**: `EXPORT_SERVICE_S3_BUCKET`
- **Config File Key**: `storage.s3_bucket`
- **Description**: AWS S3 bucket for storing export files
- **Format**: `bucket-name` (no s3:// prefix)
- **Examples**:
- Development: `export-service-dev`
- Staging: `export-service-stage`
- Production: `export-service-prod`
- **Required**: Yes
- **IAM Requirements**: Service role must have s3:PutObject, s3:GetObject
#### S3_REGION
- **Type**: string
- **Default**: `us-east-1`
- **Valid Values**: Any AWS region (us-east-1, eu-west-1, etc.)
- **Environment Variable**: `EXPORT_SERVICE_S3_REGION`
- **Config File Key**: `storage.s3_region`
- **Description**: AWS region for S3 bucket
- **Examples**:
- us-east-1 (US East - Virginia)
- eu-west-1 (EU - Ireland)
### Logging Section
#### LOG_LEVEL
- **Type**: string (enum)
- **Default**: "info"
- **Valid Values**: "debug", "info", "warn", "error"
- **Environment Variable**: `EXPORT_SERVICE_LOG_LEVEL`
- **Config File Key**: `logging.level`
- **Description**: Logging verbosity level
- **Examples**:
- Development: "debug" (verbose, detailed logs)
- Staging: "info" (normal level)
- Production: "info" or "warn" (minimal logs, better performance)
- **Considerations**:
- debug: Very verbose, affects performance
- info: Standard operational logs
- warn: Only warnings and errors
- error: Only errors
#### LOG_FORMAT
- **Type**: string (enum)
- **Default**: "json"
- **Valid Values**: "json", "text"
- **Environment Variable**: `EXPORT_SERVICE_LOG_FORMAT`
- **Config File Key**: `logging.format`
- **Description**: Log output format
- **Examples**:
- json: Machine-parseable JSON logs (recommended for production)
- text: Human-readable text (good for development)
### Feature Flags Section
#### FEATURE_PARQUET_EXPORT
- **Type**: boolean
- **Default**: false
- **Environment Variable**: `EXPORT_SERVICE_FEATURE_PARQUET_EXPORT`
- **Config File Key**: `features.parquet_export`
- **Description**: Enable experimental Parquet export format
- **Considerations**:
- Set to false for stable deployments
- Set to true in staging for testing
- Disabled by default in production
- **Typical Values**:
- Development: true (test new feature)
- Staging: true (validate before production)
- Production: false (disabled until stable)
```
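Field tables like these map naturally onto a small loader that reads each environment variable, falls back to the documented default, and coerces the declared type. A hedged sketch in Node.js (the helper names are invented for illustration; the comma-separated array and boolean conventions follow the spec text above):

```javascript
// Minimal typed readers for the env-var conventions documented above.
function readInt(name, fallback) {
  const raw = process.env[name];
  return raw === undefined ? fallback : parseInt(raw, 10);
}

function readBool(name, fallback) {
  const raw = process.env[name];
  if (raw === undefined) return fallback;
  return raw === "true" || raw === "1"; // accepts "true"/"false"/1/0
}

function readList(name, fallback) {
  const raw = process.env[name];
  // Arrays are comma-separated in the environment (e.g. "csv,json").
  return raw === undefined ? fallback : raw.split(",").map((s) => s.trim());
}

// Simulate an environment for the demo.
process.env.EXPORT_SERVICE_PORT = "8080";
process.env.EXPORT_SERVICE_EXPORT_FORMATS = "csv, json";

const config = {
  port: readInt("EXPORT_SERVICE_PORT", 3000),
  compression: readBool("EXPORT_SERVICE_COMPRESSION_ENABLED", true),
  formats: readList("EXPORT_SERVICE_EXPORT_FORMATS", ["csv", "json"]),
};
console.log(config); // { port: 8080, compression: true, formats: [ 'csv', 'json' ] }
```

Keeping the loader this mechanical means the spec's field table stays the single source of truth for names, defaults, and types.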
### Validation Rules Section
```markdown
## Validation & Constraints
### Required Fields
These fields must be provided (no default value):
- `DATABASE_URL` - PostgreSQL connection string required
- `REDIS_URL` - Redis connection required
- `S3_BUCKET` - S3 bucket must be specified
### Type Validation
- Integers: Must be valid numeric values
- Booleans: Accept true, false, "true", "false", 1, 0
- Strings: Must not be empty (unless explicitly optional)
- Arrays: Must be comma-separated in environment, JSON array in file
### Range Validation
- PORT: 1024-65535 (avoid system ports)
- POOL_SIZE: 1-100 (reasonable connection pool)
- TIMEOUT_MS: 5000-120000 (between 5 seconds and 2 minutes)
- MAX_EXPORT_SIZE_MB: 10-5000 (reasonable file sizes)
### Format Validation
- DATABASE_URL: Must be valid PostgreSQL connection string
- S3_BUCKET: Must follow S3 naming rules (lowercase, hyphens only)
- S3_REGION: Must be valid AWS region code
### Interdependency Rules
- If COMPRESSION_ENABLED=true: MAX_EXPORT_SIZE_MB can be set higher, since files are compressed before storage
- If MAX_EXPORT_SIZE_MB > 100: DATABASE_QUERY_TIMEOUT_MS should be > 10000
- If CONCURRENT_WORKERS > 5: Memory requirements increase significantly
### Error Cases
What happens if validation fails:
- Service fails to start with validation error
- Specific field and reason for validation failure logged
- Error message includes valid range/values
```
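Rules like these translate directly into fail-fast startup validation. A minimal sketch (field names and ranges follow the tables above; the error wording is illustrative):

```javascript
// Validate a resolved config object against the documented rules.
// Returns a list of human-readable errors; empty means valid.
function validateConfig(config) {
  const errors = [];

  // Required fields (no default values).
  for (const key of ["databaseUrl", "redisUrl", "s3Bucket"]) {
    if (!config[key]) errors.push(`${key} is required`);
  }

  // Range validation; the message includes the valid range.
  const inRange = (name, value, min, max) => {
    if (!Number.isInteger(value) || value < min || value > max) {
      errors.push(`${name} must be an integer in ${min}-${max}, got ${value}`);
    }
  };
  inRange("port", config.port, 1024, 65535);
  inRange("poolSize", config.poolSize, 1, 100);

  return errors;
}

const errors = validateConfig({
  port: 80, // below the allowed 1024-65535 range
  poolSize: 10,
  redisUrl: "redis://localhost:6379/0",
});
console.log(errors); // 3 errors: databaseUrl, s3Bucket, port out of range
```

Collecting all errors before exiting (rather than stopping at the first) gives operators one complete, actionable failure report.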
### Environment-Specific Configurations Section
````markdown
## Environment-Specific Configurations
### Development Environment
```json
{
  "server": {
    "port": 3000,
    "timeout_ms": 30000
  },
  "database": {
    "url": "postgresql://localhost/export_service",
    "pool_size": 5
  },
  "redis": {
    "url": "redis://localhost:6379/0",
    "concurrent_workers": 1
  },
  "export": {
    "max_export_size_mb": 100,
    "ttl_days": 7,
    "formats": ["csv", "json"]
  },
  "logging": {
    "level": "debug",
    "format": "text"
  },
  "features": {
    "parquet_export": true
  }
}
```
**Notes**:
- Runs locally with minimal resources
- Verbose logging for debugging
- Limited concurrent workers (1)
- Smaller max export size for testing
- Parquet export enabled to exercise the experimental feature
### Staging Environment
```bash
EXPORT_SERVICE_PORT=3000
EXPORT_SERVICE_DATABASE_URL=postgresql://stage-db.example.com/export_stage
EXPORT_SERVICE_REDIS_URL=redis://redis-stage.example.com:6379/0
EXPORT_SERVICE_S3_BUCKET=export-service-stage
EXPORT_SERVICE_S3_REGION=us-east-1
EXPORT_SERVICE_LOG_LEVEL=info
EXPORT_SERVICE_LOG_FORMAT=json
EXPORT_SERVICE_CONCURRENT_WORKERS=3
EXPORT_SERVICE_MAX_EXPORT_SIZE_MB=500
EXPORT_SERVICE_FEATURE_PARQUET_EXPORT=true
```
**Notes**:
- Tests new features before production
- Similar resources to production
- Parquet export enabled for testing
### Production Environment
```bash
EXPORT_SERVICE_PORT=3000
EXPORT_SERVICE_DATABASE_URL=<from AWS Secrets Manager>
EXPORT_SERVICE_REDIS_URL=<from AWS Secrets Manager>
EXPORT_SERVICE_S3_BUCKET=export-service-prod
EXPORT_SERVICE_S3_REGION=us-east-1
EXPORT_SERVICE_LOG_LEVEL=info
EXPORT_SERVICE_LOG_FORMAT=json
EXPORT_SERVICE_CONCURRENT_WORKERS=4
EXPORT_SERVICE_DATABASE_POOL_SIZE=3
EXPORT_SERVICE_MAX_EXPORT_SIZE_MB=500
EXPORT_SERVICE_EXPORT_TTL_DAYS=7
EXPORT_SERVICE_FEATURE_PARQUET_EXPORT=false
```
**Notes**:
- Credentials from secrets manager
- Optimized for performance and reliability
- Experimental features disabled
- Standard deployment settings
````
### Configuration Examples Section
````markdown
## Complete Configuration Examples
### Minimal Configuration (Development)
```bash
# Minimal settings needed to run locally
export EXPORT_SERVICE_DATABASE_URL=postgresql://localhost/export_service
export EXPORT_SERVICE_REDIS_URL=redis://localhost:6379/0
export EXPORT_SERVICE_S3_BUCKET=export-service-local
export EXPORT_SERVICE_S3_REGION=us-east-1
```
### High-Throughput Configuration (Production)
```bash
# Optimized for maximum throughput
export EXPORT_SERVICE_CONCURRENT_WORKERS=8
export EXPORT_SERVICE_DATABASE_POOL_SIZE=5
export EXPORT_SERVICE_MAX_EXPORT_SIZE_MB=1000
export EXPORT_SERVICE_COMPRESSION_ENABLED=true
export EXPORT_SERVICE_EXPORT_TTL_DAYS=30
```
### Low-Resource Configuration (Cost-Optimized)
```bash
# Minimizes resource usage and cost
export EXPORT_SERVICE_CONCURRENT_WORKERS=1
export EXPORT_SERVICE_DATABASE_POOL_SIZE=2
export EXPORT_SERVICE_MAX_EXPORT_SIZE_MB=100
export EXPORT_SERVICE_EXPORT_TTL_DAYS=1
export EXPORT_SERVICE_LOG_LEVEL=warn
```
````
### Secrets Management Section
````markdown
## Handling Sensitive Configuration
### Sensitive Fields
These fields contain credentials or sensitive information:
- DATABASE_URL (contains password)
- REDIS_URL (may contain password)
- AWS credentials (if not using IAM roles)
### Security Best Practices
1. **Never commit secrets to git**
   - Use .gitignore to exclude config files with secrets
   - Use environment variables instead
2. **Use Secrets Management**
   - AWS Secrets Manager (recommended for production)
   - HashiCorp Vault (for multi-team deployments)
   - Kubernetes Secrets (for K8s deployments)
3. **Rotate Credentials**
   - Rotate database passwords regularly
   - Rotate AWS API keys
   - Update service after rotation
4. **Limit Access**
   - Only operations team can see production credentials
   - Audit logs track who accessed what credentials
   - Use IAM roles instead of static credentials when possible
### Example: Using AWS Secrets Manager
```bash
# In Kubernetes deployment, inject from AWS Secrets Manager
DATABASE_URL=$(aws secretsmanager get-secret-value \
  --secret-id export-service/db-url \
  --query SecretString --output text)
export EXPORT_SERVICE_DATABASE_URL="$DATABASE_URL"
```
````
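When a sensitive value must be echoed in logs at all, mask the credential portion first. A minimal sketch, assuming the `user:password@host` URL shape used in this spec (the regex is illustrative, not a general URL parser):

```javascript
// Redact the password in a connection string before logging it,
// so fields marked Sensitive never appear in plain text.
function redactUrl(url) {
  // Replace the password in "//user:password@host" with "***".
  return url.replace(/\/\/([^:@\/]+):[^@\/]+@/, "//$1:***@");
}

const dbUrl = "postgresql://user:s3cret@prod-db.example.com/export_prod";
console.log(redactUrl(dbUrl)); // postgresql://user:***@prod-db.example.com/export_prod
```

Pairing redaction like this with the **Sensitive: Yes** markers in the field tables keeps credentials out of audit and debug logs.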
## Writing Tips
### Be Clear About Scope
- What can users configure?
- What's fixed/non-configurable and why?
- What requires restart vs. hot reload?
### Provide Realistic Examples
- Show real values, not placeholders
- Include examples for different environments
- Show both correct and incorrect formats
### Document Trade-offs
- Why choose certain defaults?
- What's the impact of changing values?
- What happens if value is too high/low?
### Include Validation
- What values are valid?
- What happens if invalid values provided?
- How do users know if config is wrong?
### Think About Operations
- What configuration might ops teams want to change?
- What parameters help troubleshoot issues?
- What can be tuned for performance?
## Validation & Fixing Issues
### Run the Validator
```bash
scripts/validate-spec.sh docs/specs/configuration-schema/config-001-your-spec.md
```
### Common Issues & Fixes
**Issue**: "Configuration fields lack descriptions"
- **Fix**: Add purpose, examples, and impact for each field
**Issue**: "No validation rules documented"
- **Fix**: Document valid ranges, formats, required fields
**Issue**: "No environment-specific examples"
- **Fix**: Add configurations for dev, staging, and production
**Issue**: "Sensitive fields not highlighted"
- **Fix**: Clearly mark sensitive fields and document secrets management
## Decision-Making Framework
When designing configuration schema:
1. **Scope**: What should be configurable?
   - Environment-specific settings?
   - Performance tuning parameters?
   - Feature flags?
   - Operational settings?
2. **Defaults**: What are good default values?
   - Production-safe defaults?
   - Development-friendly for new users?
   - Documented reasoning?
3. **Flexibility**: How much should users configure?
   - Too much: Confusing, hard to troubleshoot
   - Too little: Can't adapt to needs
   - Right amount: Common use cases covered
4. **Safety**: How do we prevent misconfiguration?
   - Validation rules?
   - Error messages?
   - Documentation of constraints?
5. **Evolution**: How will configuration change?
   - Backward compatibility?
   - Migration path for old configs?
   - Deprecation timeline?
## Next Steps
1. **Create the spec**: `scripts/generate-spec.sh configuration-schema config-XXX-slug`
2. **List fields**: What can be configured?
3. **Document each field** with type, default, range, impact
4. **Provide examples** for different environments
5. **Document validation** rules and constraints
6. **Validate**: `scripts/validate-spec.sh docs/specs/configuration-schema/config-XXX-slug.md`
7. **Share with operations team** for feedback