# Infrastructure Optimization Operation

You are executing the **infrastructure** operation to optimize infrastructure scaling, CDN configuration, resource allocation, deployment, and cost efficiency.

## Parameters

**Received**: `$ARGUMENTS` (after removing 'infrastructure' operation name)

Expected format: `target:"scaling|cdn|resources|deployment|costs|all" [environment:"prod|staging|dev"] [provider:"aws|azure|gcp|vercel"] [budget_constraint:"true|false"]`

**Parameter definitions**:

- `target` (required): What to optimize - `scaling`, `cdn`, `resources`, `deployment`, `costs`, or `all`
- `environment` (optional): Target environment (default: production)
- `provider` (optional): Cloud provider (auto-detected if not specified)
- `budget_constraint` (optional): Prioritize cost reduction (default: false)

## Workflow

### 1. Detect Infrastructure Provider

```bash
# Check for cloud provider configuration
ls -la .aws/ .azure/ .gcp/ vercel.json netlify.toml 2>/dev/null

# Check for container orchestration
kubectl config current-context 2>/dev/null
docker-compose version 2>/dev/null

# Check for IaC tools
ls -la terraform/ *.tf serverless.yml cloudformation/ 2>/dev/null
```

### 2. Analyze Current Infrastructure

**Resource Utilization (Kubernetes)**:

```bash
# Node resource usage
kubectl top nodes

# Pod resource usage
kubectl top pods --all-namespaces

# Check resource requests vs limits
kubectl get pods -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].resources}{"\n"}{end}'
```

**Resource Utilization (AWS EC2)**:

```bash
# CloudWatch metrics
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --start-time 2025-10-07T00:00:00Z \
  --end-time 2025-10-14T00:00:00Z \
  --period 3600 \
  --statistics Average
```

### 3. Scaling Optimization

#### 3.1. Horizontal Pod Autoscaling (Kubernetes)

```yaml
# BEFORE (fixed 3 replicas)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3  # Fixed count, wastes resources at low traffic
  template:
    spec:
      containers:
        - name: api
          image: api:v1.0.0
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"

# AFTER (horizontal pod autoscaler)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 2   # Minimum for high availability
  maxReplicas: 10  # Scale up under load
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # Target 70% CPU
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # Wait 5 min before scaling down
    scaleUp:
      stabilizationWindowSeconds: 0    # Scale up immediately
      policies:
        - type: Percent
          value: 100         # Double pods at a time
          periodSeconds: 15

# Result:
# - Off-peak: 2 pods (save 33% resources)
# - Peak: Up to 10 pods (handle 5x traffic)
# - Cost savings: ~40% while maintaining performance
```
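To validate the autoscaler after applying it, a minimal check sequence (a sketch assuming the manifest above is saved as `api-server-hpa.yaml`; metrics-server must be running in the cluster for HPA CPU/memory metrics to resolve):

```bash
# Apply the HPA and confirm metrics-server is reporting usage
kubectl apply -f api-server-hpa.yaml
kubectl top pods   # errors if metrics-server is not installed

# Watch current vs target utilization and the replica count as load changes
kubectl get hpa api-server-hpa --watch

# Inspect recent scaling decisions and events
kubectl describe hpa api-server-hpa
```

If `kubectl get hpa` shows `<unknown>` for the targets, fix metrics collection before relying on the autoscaler.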
#### 3.2. Vertical Pod Autoscaling

```yaml
# Automatically adjust resource requests/limits
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Auto"  # Automatically apply recommendations
  resourcePolicy:
    containerPolicies:
      - containerName: api
        minAllowed:
          memory: "256Mi"
          cpu: "100m"
        maxAllowed:
          memory: "2Gi"
          cpu: "2000m"
        controlledResources: ["cpu", "memory"]
```

#### 3.3. AWS Auto Scaling Groups

```json
{
  "AutoScalingGroupName": "api-server-asg",
  "MinSize": 2,
  "MaxSize": 10,
  "DesiredCapacity": 2,
  "DefaultCooldown": 300,
  "HealthCheckType": "ELB",
  "HealthCheckGracePeriod": 180,
  "TargetGroupARNs": ["arn:aws:elasticloadbalancing:..."],
  "TargetTrackingScalingPolicies": [
    {
      "PolicyName": "target-tracking-cpu",
      "TargetValue": 70.0,
      "PredefinedMetricSpecification": {
        "PredefinedMetricType": "ASGAverageCPUUtilization"
      }
    }
  ]
}
```

### 4. CDN Optimization

#### 4.1. CloudFront Configuration (AWS)

```json
{
  "DistributionConfig": {
    "CallerReference": "api-cdn-2025",
    "Comment": "Optimized CDN for static assets",
    "Enabled": true,
    "PriceClass": "PriceClass_100",
    "Origins": [
      {
        "Id": "S3-static-assets",
        "DomainName": "static-assets.s3.amazonaws.com",
        "S3OriginConfig": {
          "OriginAccessIdentity": "origin-access-identity/cloudfront/..."
        }
      }
    ],
    "DefaultCacheBehavior": {
      "TargetOriginId": "S3-static-assets",
      "ViewerProtocolPolicy": "redirect-to-https",
      "Compress": true,
      "MinTTL": 0,
      "DefaultTTL": 86400,
      "MaxTTL": 31536000,
      "ForwardedValues": {
        "QueryString": false,
        "Cookies": { "Forward": "none" }
      }
    },
    "CacheBehaviors": [
      {
        "PathPattern": "*.js",
        "TargetOriginId": "S3-static-assets",
        "Compress": true,
        "MinTTL": 31536000,
        "CachePolicyId": "immutable-assets"
      },
      {
        "PathPattern": "*.css",
        "TargetOriginId": "S3-static-assets",
        "Compress": true,
        "MinTTL": 31536000
      }
    ]
  }
}
```

**Cache Headers**:

```javascript
// Express server - set appropriate cache headers
const express = require('express');
const app = express();

app.use('/static', express.static('public', {
  maxAge: '1y',     // Immutable assets with hash in filename
  immutable: true
}));

app.use('/api', (req, res, next) => {
  res.set('Cache-Control', 'no-cache');  // API responses
  next();
});

// HTML pages - short cache with revalidation
app.get('/', (req, res) => {
  res.set('Cache-Control', 'public, max-age=300, must-revalidate');
  res.sendFile('index.html', { root: __dirname });
});
```

#### 4.2. Image Optimization with CDN

```nginx
# Nginx configuration for image optimization
location ~* \.(jpg|jpeg|png|gif|webp)$ {
    expires 1y;
    add_header Cache-Control "public, immutable";

    # Enable compression
    gzip on;
    gzip_comp_level 6;

    # Serve WebP if browser supports it
    set $webp_suffix "";
    if ($http_accept ~* "webp") {
        set $webp_suffix ".webp";
    }
    try_files $uri$webp_suffix $uri =404;
}
```
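Once a distribution is in front of the origin, cache behavior can be spot-checked from the command line. A quick sketch using `curl` (the hostname and asset path are placeholders; `X-Cache` is the hit/miss header CloudFront adds to responses):

```bash
# Inspect cache and compression headers returned through the CDN
curl -sI https://cdn.example.com/static/app.3f2a1c.js \
  -H 'Accept-Encoding: gzip, br' \
  | grep -iE 'cache-control|content-encoding|age|x-cache'

# Repeat the request: the second response should report "Hit from cloudfront"
curl -sI https://cdn.example.com/static/app.3f2a1c.js | grep -i x-cache
```

A persistent `Miss from cloudfront` on static assets usually points to missing `Cache-Control` headers at the origin or an overly broad forwarded query string/cookie configuration.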
### 5. Resource Right-Sizing

#### 5.1. Analyze Resource Usage Patterns

```bash
# Kubernetes - Resource usage over time
kubectl top pods --containers --namespace production | awk '{
  if (NR>1) {
    split($3, cpu, "m");
    split($4, mem, "Mi");
    print $1, $2, cpu[1], mem[1]
  }
}' > resource-usage.txt

# Analyze patterns
# If CPU consistently <30% → reduce CPU request
# If memory consistently <50% → reduce memory request
```

**Optimization Example**:

```yaml
# BEFORE (over-provisioned)
resources:
  requests:
    memory: "2Gi"    # Usage: 600Mi (30%)
    cpu: "1000m"     # Usage: 200m (20%)
  limits:
    memory: "4Gi"
    cpu: "2000m"

# AFTER (right-sized)
resources:
  requests:
    memory: "768Mi"  # 600Mi + 28% headroom
    cpu: "300m"      # 200m + 50% headroom
  limits:
    memory: "1.5Gi"  # 2x request
    cpu: "600m"      # 2x request

# Savings: 70% CPU request, 62% memory request
# Cost impact: ~60% reduction per pod
```

#### 5.2. Reserved Instances / Savings Plans

**AWS Reserved Instances**:

```bash
# Analyze instance usage patterns
aws ce get-reservation-utilization \
  --time-period Start=2024-10-01,End=2025-10-01 \
  --granularity MONTHLY

# Recommendation: Convert frequently-used instances to Reserved Instances
# Example savings (us-east-1, approximate):
# - On-Demand t3.large: $0.0832/hour ≈ $61/month
# - Reserved t3.large (1 year): $0.0520/hour ≈ $38/month
# - Savings: 37% (~$23/month per instance)
```

### 6. Deployment Optimization

#### 6.1. Container Image Optimization

```dockerfile
# BEFORE (large image: 1.2GB)
FROM node:18
WORKDIR /app
COPY . .
RUN npm install
CMD ["npm", "start"]

# AFTER (optimized image: 180MB)
# Multi-stage build
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Build with dev dependencies, then drop them before copying to the runtime image
RUN npm run build && npm prune --omit=dev

FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./

# Create non-root user
RUN addgroup -g 1001 -S nodejs && adduser -S nodejs -u 1001
USER nodejs

EXPOSE 3000
CMD ["node", "dist/main.js"]

# Image size: 1.2GB → 180MB (85% smaller)
# Security: Non-root user, minimal attack surface
```

#### 6.2. Blue-Green Deployment

```yaml
# Kubernetes Blue-Green deployment
# Green (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
      version: green
  template:
    metadata:
      labels:
        app: api
        version: green
    spec:
      containers:
        - name: api
          image: api:v2.0.0
---
# Service - switch traffic by changing selector
apiVersion: v1
kind: Service
metadata:
  name: api-service
spec:
  selector:
    app: api
    version: green  # Change from 'blue' to 'green' to switch traffic
  ports:
    - port: 80
      targetPort: 3000

# Zero-downtime deployment
# Instant rollback by changing selector back to 'blue'
```

### 7. Cost Optimization

#### 7.1. Spot Instances for Non-Critical Workloads

```yaml
# Kubernetes - Use spot instances for batch jobs
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing
spec:
  template:
    spec:
      nodeSelector:
        eks.amazonaws.com/capacityType: SPOT  # EKS spot node label; adjust to your cluster's spot label
      tolerations:
        - key: "spot"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      containers:
        - name: processor
          image: data-processor:v1.0.0
      restartPolicy: Never

# Savings: 70-90% cost reduction for spot vs on-demand
# Trade-off: May be interrupted (acceptable for batch jobs)
```
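Before moving a workload to spot capacity, it can help to check how volatile spot pricing has been for the candidate instance type. A hedged sketch using the AWS CLI (instance type and region are examples; the `date` call is GNU syntax):

```bash
# Recent spot price history for a candidate instance type
aws ec2 describe-spot-price-history \
  --instance-types m5.large \
  --product-descriptions "Linux/UNIX" \
  --start-time "$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --region us-east-1 \
  --query 'SpotPriceHistory[].[Timestamp,AvailabilityZone,SpotPrice]' \
  --output table
# On macOS/BSD, replace the date call with: date -u -v-7d +%Y-%m-%dT%H:%M:%SZ
```

Stable prices well below the on-demand rate support the 70-90% savings estimate above; large swings suggest diversifying across instance types or availability zones.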
#### 7.2. Storage Optimization

```bash
# S3 Lifecycle Policy
aws s3api put-bucket-lifecycle-configuration \
  --bucket static-assets \
  --lifecycle-configuration '{
    "Rules": [
      {
        "Id": "archive-old-logs",
        "Status": "Enabled",
        "Filter": { "Prefix": "logs/" },
        "Transitions": [
          { "Days": 30, "StorageClass": "STANDARD_IA" },
          { "Days": 90, "StorageClass": "GLACIER" }
        ],
        "Expiration": { "Days": 365 }
      }
    ]
  }'

# Cost impact:
# - Standard: $0.023/GB/month
# - Standard-IA: $0.0125/GB/month (46% cheaper)
# - Glacier: $0.004/GB/month (83% cheaper)
```

#### 7.3. Database Instance Right-Sizing

```sql
-- Analyze actual database usage
SELECT
  datname,
  pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

-- Check connection usage
SELECT
  count(*) AS connections,
  max_conn,
  max_conn - count(*) AS available
FROM pg_stat_activity,
  (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') mc
GROUP BY max_conn;

-- Recommendation: If consistently using <30% of connections and <50% of storage,
-- consider downsizing from db.r5.xlarge to db.r5.large
-- Savings: ~50% cost reduction
```

### 8. Monitoring and Alerting

**CloudWatch Alarms (AWS)**:

```json
{
  "AlarmName": "high-cpu-utilization",
  "ComparisonOperator": "GreaterThanThreshold",
  "EvaluationPeriods": 2,
  "MetricName": "CPUUtilization",
  "Namespace": "AWS/EC2",
  "Period": 300,
  "Statistic": "Average",
  "Threshold": 80.0,
  "ActionsEnabled": true,
  "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-team"]
}
```

**Prometheus Alerts (Kubernetes)**:

```yaml
groups:
  - name: infrastructure
    rules:
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"

      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
```
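The CloudWatch alarm defined above can also be created directly from the CLI. A minimal sketch (the SNS topic ARN, account ID, and instance ID are the placeholders already used in this document):

```bash
# Create the high-CPU alarm from the JSON definition above
aws cloudwatch put-metric-alarm \
  --alarm-name high-cpu-utilization \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-team

# Confirm the alarm exists and check its current state
aws cloudwatch describe-alarms --alarm-names high-cpu-utilization
```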
## Output Format

```markdown
# Infrastructure Optimization Report: [Environment]

**Optimization Date**: [Date]
**Provider**: [AWS/Azure/GCP/Hybrid]
**Environment**: [production/staging]
**Target**: [scaling/cdn/resources/costs/all]

## Executive Summary

[Summary of infrastructure state and optimizations]

## Baseline Metrics

### Resource Utilization

- **CPU**: 68% average across nodes
- **Memory**: 72% average
- **Network**: 45% utilization
- **Storage**: 60% utilization

### Cost Breakdown (Monthly)

- **Compute**: $4,500 (EC2 instances)
- **Database**: $1,200 (RDS)
- **Storage**: $800 (S3, EBS)
- **Network**: $600 (Data transfer, CloudFront)
- **Total**: $7,100/month

### Scaling Configuration

- **Auto Scaling**: Fixed 5 instances (no scaling)
- **Pod Count**: Fixed 15 pods
- **Resource Allocation**: Static (no HPA/VPA)

## Optimizations Implemented

### 1. Horizontal Pod Autoscaling

**Before**: Fixed 15 pods
**After**: 8-25 pods based on load

**Impact**:
- Off-peak: 8 pods (47% reduction)
- Peak: 25 pods (67% more capacity)
- Cost savings: $1,350/month (30%)

### 2. Resource Right-Sizing

**Optimized 12 deployments**:
- Average CPU reduction: 55%
- Average memory reduction: 48%
- Cost impact: $945/month savings

### 3. CDN Configuration

**Implemented**:
- CloudFront for static assets
- Cache-Control headers optimized
- Compression enabled

**Impact**:
- Origin requests: 85% reduction
- TTFB: 750ms → 120ms (84% faster)
- Bandwidth costs: $240/month savings

### 4. Reserved Instances

**Converted**:
- 3 x t3.large on-demand → Reserved
- Commitment: 1 year, no upfront

**Savings**: ~$69/month (37% per instance)

### 5. Storage Lifecycle Policies

**Implemented**:
- Logs: Standard → Standard-IA (30d) → Glacier (90d)
- Backups: Glacier after 30 days
- Old assets: Glacier after 180 days

**Savings**: $285/month

## Results Summary

### Cost Optimization

| Category | Before | After | Savings |
|----------|--------|-------|---------|
| Compute | $4,500 | $2,518 | $1,982 (44%) |
| Database | $1,200 | $720 | $480 (40%) |
| Storage | $800 | $515 | $285 (36%) |
| Network | $600 | $360 | $240 (40%) |
| **Total** | **$7,100** | **$4,113** | **$2,987 (42%)** |

**Annual Savings**: $35,844

### Performance Improvements

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Average Response Time | 285ms | 125ms | 56% faster |
| TTFB (with CDN) | 750ms | 120ms | 84% faster |
| Resource Utilization | 68% | 75% | Better efficiency |
| Auto-scaling Response | N/A | 30s | Handles traffic spikes |

### Scalability Improvements

- **Traffic Capacity**: 2x increase (25 pods vs 15 fixed)
- **Scaling Response Time**: 30 seconds to scale up
- **Cost Efficiency**: Pay for what you use

## Trade-offs and Considerations

**Auto-scaling**:
- **Benefit**: 42% cost reduction, 2x capacity
- **Trade-off**: 30s delay for cold starts
- **Mitigation**: Min 8 pods for baseline capacity

**Reserved Instances**:
- **Benefit**: 37% savings per instance
- **Trade-off**: 1-year commitment
- **Risk**: Low (steady baseline load confirmed)

**CDN Caching**:
- **Benefit**: 84% faster TTFB, 85% fewer origin requests
- **Trade-off**: Cache invalidation complexity
- **Mitigation**: Short TTL for dynamic content

## Monitoring Recommendations

1. **Cost Tracking**:
   - Daily cost reports
   - Budget alerts at 80%, 100%
   - Tag-based cost allocation

2. **Performance Monitoring**:
   - CloudWatch dashboards
   - Prometheus + Grafana
   - APM for application metrics

3. **Auto-scaling Health**:
   - HPA metrics (scale events)
   - Resource utilization trends
   - Alert on frequent scaling

## Next Steps

1. Evaluate spot instances for batch workloads (potential 70% savings)
2. Implement multi-region deployment for better global performance
3. Consider serverless for low-traffic endpoints
4. Review database read replicas for read-heavy workloads
```