--- name: cost-optimization description: Optimize cloud costs through resource rightsizing, tagging strategies, reserved instances, and spending analysis. Use when reducing cloud expenses, analyzing infrastructure costs, or implementing cost governance policies. --- # Cloud Cost Optimization Strategies and patterns for optimizing cloud costs across AWS, Azure, and GCP. ## Purpose Implement systematic cost optimization strategies to reduce cloud spending while maintaining performance and reliability. ## When to Use - Reduce cloud spending - Right-size resources - Implement cost governance - Optimize multi-cloud costs - Meet budget constraints ## Cost Optimization Framework ### 1. Visibility - Implement cost allocation tags - Use cloud cost management tools - Set up budget alerts - Create cost dashboards ### 2. Right-Sizing - Analyze resource utilization - Downsize over-provisioned resources - Use auto-scaling - Remove idle resources ### 3. Pricing Models - Use reserved capacity - Leverage spot/preemptible instances - Implement savings plans - Use committed use discounts ### 4. Architecture Optimization - Use managed services - Implement caching - Optimize data transfer - Use lifecycle policies ## AWS Cost Optimization ### Reserved Instances ``` Savings: 30-72% vs On-Demand Term: 1 or 3 years Payment: All/Partial/No upfront Flexibility: Standard or Convertible ``` ### Savings Plans ``` Compute Savings Plans: 66% savings EC2 Instance Savings Plans: 72% savings Applies to: EC2, Fargate, Lambda Flexible across: Instance families, regions, OS ``` ### Spot Instances ``` Savings: Up to 90% vs On-Demand Best for: Batch jobs, CI/CD, stateless workloads Risk: 2-minute interruption notice Strategy: Mix with On-Demand for resilience ``` ### S3 Cost Optimization ```hcl resource "aws_s3_bucket_lifecycle_configuration" "example" { bucket = aws_s3_bucket.example.id rule { id = "transition-to-ia" status = "Enabled" transition { days = 30 storage_class = "STANDARD_IA" } transition { days = 90 storage_class = "GLACIER" } expiration { days = 365 } } } ``` ## Azure Cost Optimization ### Reserved VM Instances - 1 or 3 year terms - Up to 72% savings - Flexible sizing - Exchangeable ### Azure Hybrid Benefit - Use existing Windows Server licenses - Up to 80% savings with RI - Available for Windows and SQL Server ### Azure Advisor Recommendations - Right-size VMs - Delete unused resources - Use reserved capacity - Optimize storage ## GCP Cost Optimization ### Committed Use Discounts - 1 or 3 year commitment - Up to 57% savings - Applies to vCPUs and memory - Resource-based or spend-based ### Sustained Use Discounts - Automatic discounts - Up to 30% for running instances - No commitment required - Applies to Compute Engine, GKE ### Preemptible VMs - Up to 80% savings - 24-hour maximum runtime - Best for batch workloads ## Tagging Strategy ### AWS Tagging ```hcl locals { common_tags = { Environment = "production" Project = "my-project" CostCenter = "engineering" Owner = "team@example.com" ManagedBy = "terraform" } } resource "aws_instance" "example" { ami = "ami-12345678" instance_type = "t3.medium" tags = merge( local.common_tags, { Name = "web-server" } ) } ``` **Reference:** See `references/tagging-standards.md` ## Cost Monitoring ### Budget Alerts ```hcl # AWS Budget resource "aws_budgets_budget" "monthly" { name = "monthly-budget" budget_type = "COST" limit_amount = "1000" limit_unit = "USD" time_period_start = "2024-01-01_00:00" time_unit = "MONTHLY" notification { comparison_operator = "GREATER_THAN" threshold = 80 threshold_type = "PERCENTAGE" notification_type = "ACTUAL" subscriber_email_addresses = ["team@example.com"] } } ``` ### Cost Anomaly Detection - AWS Cost Anomaly Detection - Azure Cost Management alerts - GCP Budget alerts ## Architecture Patterns ### Pattern 1: Serverless First - Use Lambda/Functions for event-driven - Pay only for execution time - Auto-scaling included - No idle costs ### Pattern 2: Right-Sized Databases ``` Development: t3.small RDS Staging: t3.large RDS Production: r6g.2xlarge RDS with read replicas ``` ### Pattern 3: Multi-Tier Storage ``` Hot data: S3 Standard Warm data: S3 Standard-IA (30 days) Cold data: S3 Glacier (90 days) Archive: S3 Deep Archive (365 days) ``` ### Pattern 4: Auto-Scaling ```hcl resource "aws_autoscaling_policy" "scale_up" { name = "scale-up" scaling_adjustment = 2 adjustment_type = "ChangeInCapacity" cooldown = 300 autoscaling_group_name = aws_autoscaling_group.main.name } resource "aws_cloudwatch_metric_alarm" "cpu_high" { alarm_name = "cpu-high" comparison_operator = "GreaterThanThreshold" evaluation_periods = "2" metric_name = "CPUUtilization" namespace = "AWS/EC2" period = "60" statistic = "Average" threshold = "80" alarm_actions = [aws_autoscaling_policy.scale_up.arn] } ``` ## Cost Optimization Checklist - [ ] Implement cost allocation tags - [ ] Delete unused resources (EBS, EIPs, snapshots) - [ ] Right-size instances based on utilization - [ ] Use reserved capacity for steady workloads - [ ] Implement auto-scaling - [ ] Optimize storage classes - [ ] Use lifecycle policies - [ ] Enable cost anomaly detection - [ ] Set budget alerts - [ ] Review costs weekly - [ ] Use spot/preemptible instances - [ ] Optimize data transfer costs - [ ] Implement caching layers - [ ] Use managed services - [ ] Monitor and optimize continuously ## Tools - **AWS:** Cost Explorer, Cost Anomaly Detection, Compute Optimizer - **Azure:** Cost Management, Advisor - **GCP:** Cost Management, Recommender - **Multi-cloud:** CloudHealth, Cloudability, Kubecost ## Reference Files - `references/tagging-standards.md` - Tagging conventions - `assets/cost-analysis-template.xlsx` - Cost analysis spreadsheet ## Related Skills - `terraform-module-library` - For resource provisioning - `multi-cloud-architecture` - For cloud selection