Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 08:18:56 +08:00
commit 5ff654ec1a
11 changed files with 382 additions and 0 deletions

@@ -0,0 +1,52 @@
---
name: configuring-auto-scaling-policies
description: |
  This skill configures auto-scaling policies for applications and infrastructure. It generates production-ready configurations based on user requirements, implementing best practices for scalability and security. Use this skill when the user requests help with auto-scaling setup, high availability, or dynamic resource allocation, especially when they mention terms like "auto-scaling," "HPA," "scaling policies," or "dynamic scaling." This skill provides complete configuration code for various platforms.
allowed-tools: Read, Write, Edit, Grep, Glob, Bash
version: 1.0.0
---
## Overview
This skill enables Claude to create and configure auto-scaling policies tailored to specific application and infrastructure needs. It streamlines the setup of dynamic resource allocation, helping maintain performance and resilience under varying load.
## How It Works
1. **Requirement Gathering**: Claude analyzes the user's request to understand the specific auto-scaling requirements, including target metrics (CPU, memory, etc.), scaling thresholds, and desired platform.
2. **Configuration Generation**: Based on the gathered requirements, Claude generates a production-ready auto-scaling configuration, incorporating best practices for security and scalability. This includes HPA configurations, scaling policies, and necessary infrastructure setup code.
3. **Code Presentation**: Claude presents the generated configuration code to the user, ready for deployment.
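The requirement-gathering-to-generation flow above can be sketched in code. The following is an illustrative Python helper, not the skill's actual implementation; the function name, defaults, and validation rules are assumptions:

```python
# Illustrative sketch: turn gathered requirements into a minimal
# autoscaling/v2 HPA manifest (as a Python dict, ready to serialize).
# Names and defaults here are hypothetical, not part of the skill.

def generate_hpa_config(name, target_cpu_percent, min_replicas=2, max_replicas=10):
    """Build a minimal Kubernetes HPA manifest for a Deployment."""
    if not 1 <= target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be between 1 and 100")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas cannot exceed max_replicas")
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{name}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": target_cpu_percent,
                    },
                },
            }],
        },
    }

config = generate_hpa_config("web-app", 70)
```

Serializing `config` to YAML yields a manifest like the bundled `example_hpa.yaml`.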
## When to Use This Skill
This skill activates when you need to:
- Configure auto-scaling for a Kubernetes deployment.
- Set up dynamic scaling policies based on CPU or memory utilization.
- Implement high availability and fault tolerance through auto-scaling.
## Examples
### Example 1: Scaling a Web Application
User request: "I need to configure auto-scaling for my web application in Kubernetes based on CPU utilization. Scale up when CPU usage exceeds 70%."
The skill will:
1. Analyze the request and identify the need for a Kubernetes HPA configuration.
2. Generate an HPA configuration file that scales the web application based on CPU utilization, with a target threshold of 70%.
### Example 2: Scaling Infrastructure Based on Load
User request: "Configure auto-scaling for my infrastructure to handle peak loads during business hours. Scale up based on the number of incoming requests."
The skill will:
1. Analyze the request and determine the need for infrastructure-level auto-scaling policies.
2. Generate configuration code for scaling the infrastructure based on the number of incoming requests, considering peak load times.
## Best Practices
- **Monitoring**: Ensure proper monitoring is in place to track the performance metrics used for auto-scaling decisions.
- **Threshold Setting**: Carefully choose scaling thresholds to avoid excessive scaling or under-provisioning.
- **Testing**: Thoroughly test the auto-scaling configuration to ensure it behaves as expected under various load conditions.
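The threshold-setting advice can be made concrete with a small sanity check. A minimal sketch, assuming a 20-point minimum gap between scale-up and scale-down thresholds as a rule of thumb (the exact gap is an assumption, not a standard):

```python
# Sanity-check a scale-up / scale-down threshold pair: scale-down must
# sit well below scale-up, or the system may "flap" (scale up, then
# immediately scale back down). The 20-point gap is an assumed heuristic.

def check_thresholds(scale_up, scale_down, min_gap=20.0):
    """Return a list of warnings about a threshold pair (empty list = OK)."""
    warnings = []
    if not (0 < scale_down < scale_up <= 100):
        warnings.append("thresholds must satisfy 0 < scale_down < scale_up <= 100")
    elif scale_up - scale_down < min_gap:
        warnings.append(
            f"gap of {scale_up - scale_down:.0f} points may cause flapping "
            f"(recommended >= {min_gap:.0f})"
        )
    return warnings

assert check_thresholds(80, 40) == []  # healthy gap, no warnings
assert check_thresholds(75, 70) != []  # 5-point gap: risks flapping
```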
## Integration
This skill can be used in conjunction with other DevOps plugins to automate the entire deployment pipeline, from code generation to infrastructure provisioning.

@@ -0,0 +1,7 @@
# Assets
Bundled resources for the auto-scaling-configurator skill.
- [ ] config_template.yaml: YAML template for auto-scaling configuration files, providing a starting point for users.
- [ ] example_hpa.yaml: Example Kubernetes Horizontal Pod Autoscaler (HPA) configuration file.
- [ ] example_aws_scaling_policy.json: Example AWS auto-scaling policy configuration file.

@@ -0,0 +1,79 @@
# Auto-Scaling Configuration Template

# --- Global Settings ---
global:
  # Enable or disable auto-scaling globally
  enabled: true
  # Default cooldown period (in seconds) after a scaling event
  cooldown: 300  # seconds (5 minutes)

# --- Application Configuration ---
application:
  name: REPLACE_ME  # Application name (e.g., web-app, api-server)
  type: web         # Application type (e.g., web, api, worker)
  # Resource limits (CPU, memory)
  resources:
    cpu:
      min: 0.5   # Minimum CPU units
      max: 4     # Maximum CPU units
    memory:
      min: 512   # Minimum memory in MB
      max: 4096  # Maximum memory in MB

# --- Scaling Policies ---
scaling_policies:
  # --- CPU Utilization Scaling ---
  cpu_utilization:
    enabled: true
    target_utilization: 70    # Target CPU utilization percentage
    scale_up_threshold: 80    # CPU usage percentage to trigger scale-up
    scale_down_threshold: 40  # CPU usage percentage to trigger scale-down
    scale_up_increment: 1     # Number of instances to add during scale-up
    scale_down_decrement: 1   # Number of instances to remove during scale-down
  # --- Memory Utilization Scaling ---
  memory_utilization:
    enabled: true
    target_utilization: 75    # Target memory utilization percentage
    scale_up_threshold: 85    # Memory usage percentage to trigger scale-up
    scale_down_threshold: 50  # Memory usage percentage to trigger scale-down
    scale_up_increment: 1     # Number of instances to add during scale-up
    scale_down_decrement: 1   # Number of instances to remove during scale-down
  # --- Request Latency Scaling ---
  request_latency:
    enabled: false            # Enable only if latency metrics are available
    target_latency: 200       # Target request latency in milliseconds
    scale_up_threshold: 500   # Latency in milliseconds to trigger scale-up
    scale_down_threshold: 100 # Latency in milliseconds to trigger scale-down
    scale_up_increment: 1     # Number of instances to add during scale-up
    scale_down_decrement: 1   # Number of instances to remove during scale-down
  # --- Custom Metric Scaling ---
  custom_metric:
    enabled: false                        # Enable only if a custom metric is available
    metric_name: YOUR_VALUE_HERE          # Name of the custom metric
    target_value: YOUR_VALUE_HERE         # Target value for the custom metric
    scale_up_threshold: YOUR_VALUE_HERE   # Threshold to trigger scale-up
    scale_down_threshold: YOUR_VALUE_HERE # Threshold to trigger scale-down
    scale_up_increment: 1                 # Number of instances to add during scale-up
    scale_down_decrement: 1               # Number of instances to remove during scale-down

# --- Infrastructure Configuration ---
infrastructure:
  platform: aws            # Cloud platform (e.g., aws, azure, gcp, on-prem)
  region: us-east-1        # Cloud region
  instance_type: t3.medium # Instance type for new instances
  min_instances: 1         # Minimum number of instances
  max_instances: 5         # Maximum number of instances

# --- Monitoring Configuration ---
monitoring:
  # Integration with monitoring tools (e.g., CloudWatch, Prometheus).
  # Configure details for your monitoring system here. Example:
  type: cloudwatch              # Monitoring system type
  namespace: MyApp              # Monitoring namespace
  metric_prefix: MyAppInstance  # Metric prefix
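The template's `cooldown` setting can be illustrated with a minimal sketch of how a cooldown window suppresses back-to-back scaling actions (illustrative Python, not part of the template or any platform SDK):

```python
# Minimal sketch of cooldown gating: after any scaling action, further
# actions are suppressed until the cooldown elapses. Timestamps are
# plain seconds for illustration.

class CooldownGate:
    def __init__(self, cooldown_seconds):
        self.cooldown = cooldown_seconds
        self.last_action_at = None  # no scaling action taken yet

    def try_scale(self, now):
        """Return True if a scaling action is allowed at time `now`."""
        if self.last_action_at is not None and now - self.last_action_at < self.cooldown:
            return False  # still inside the cooldown window
        self.last_action_at = now
        return True

gate = CooldownGate(cooldown_seconds=300)
assert gate.try_scale(0) is True      # first action is always allowed
assert gate.try_scale(120) is False   # suppressed: within the 300 s cooldown
assert gate.try_scale(301) is True    # cooldown elapsed, action allowed
```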

@@ -0,0 +1,70 @@
{
  "_comment": "Example AWS Auto Scaling policy. Note: a single policy is either target-tracking or step scaling; the StepScalingPolicyConfiguration below applies only when PolicyType is StepScaling.",
  "PolicyName": "WebAppCPUPolicy",
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingConfiguration": {
    "_comment": "Track average ASG CPU utilization against a 70% target; cooldown periods are in seconds",
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "ScaleOutCooldown": 300,
    "ScaleInCooldown": 600
  },
  "_comment_resource": "Resource ID - Auto Scaling group name",
  "ResourceId": "autoScalingGroupName/my-web-app-asg",
  "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
  "ServiceNamespace": "autoscaling",
  "StepScalingPolicyConfiguration": {
    "_comment": "AdjustmentType may be ChangeInCapacity, PercentChangeInCapacity, or ExactCapacity; step interval bounds are offsets relative to the alarm threshold",
    "AdjustmentType": "ChangeInCapacity",
    "MinAdjustmentMagnitude": 1,
    "Cooldown": 300,
    "StepAdjustments": [
      {
        "MetricIntervalLowerBound": 0.0,
        "MetricIntervalUpperBound": 10.0,
        "ScalingAdjustment": -1
      },
      {
        "MetricIntervalLowerBound": 70.0,
        "ScalingAdjustment": 1
      }
    ]
  },
  "_comment_alarms": "Optional CloudWatch metric alarms that trigger scaling",
  "Alarms": [
    {
      "AlarmName": "CPUHigh",
      "MetricName": "CPUUtilization",
      "Namespace": "AWS/EC2",
      "Statistic": "Average",
      "Period": 60,
      "EvaluationPeriods": 5,
      "Threshold": 80.0,
      "ComparisonOperator": "GreaterThanThreshold",
      "AlarmActions": [
        "arn:aws:sns:us-east-1:123456789012:HighCPUAlarm"
      ]
    },
    {
      "AlarmName": "CPULow",
      "MetricName": "CPUUtilization",
      "Namespace": "AWS/EC2",
      "Statistic": "Average",
      "Period": 60,
      "EvaluationPeriods": 5,
      "Threshold": 20.0,
      "ComparisonOperator": "LessThanThreshold",
      "AlarmActions": [
        "arn:aws:sns:us-east-1:123456789012:LowCPUAlarm"
      ]
    }
  ]
}
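The `StepAdjustments` in the example above select a scaling adjustment based on how far the metric sits above the alarm threshold, with interval bounds expressed as offsets from that threshold. A simplified Python sketch of that lookup (not the AWS implementation):

```python
# Simplified model of AWS step-scaling adjustment selection: the step
# whose interval contains (metric - threshold) supplies the adjustment.
# Missing bounds are treated as open-ended.

def pick_adjustment(metric, threshold, steps):
    """Return the ScalingAdjustment of the matching step, or None."""
    delta = metric - threshold
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= delta < upper:
            return step["ScalingAdjustment"]
    return None  # no step covers this delta

steps = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 10.0,
     "ScalingAdjustment": -1},
    {"MetricIntervalLowerBound": 70.0, "ScalingAdjustment": 1},
]
assert pick_adjustment(85.0, 80.0, steps) == -1   # 5 above threshold: first step
assert pick_adjustment(155.0, 80.0, steps) == 1   # 75 above threshold: second step
assert pick_adjustment(100.0, 80.0, steps) is None  # 20 above: no step covers it
```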

@@ -0,0 +1,43 @@
# example_hpa.yaml
# This is an example Kubernetes Horizontal Pod Autoscaler (HPA) configuration file.
# It defines how the number of pods in a Deployment or ReplicationController
# should be automatically scaled based on observed CPU and memory utilization.
apiVersion: autoscaling/v2  # Use autoscaling/v2 for more features (e.g., resource metrics)
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa   # The name of the HPA resource
  namespace: default  # The namespace where the HPA should be deployed (REPLACE_ME if needed)
spec:
  scaleTargetRef:              # Defines the target resource to scale
    apiVersion: apps/v1        # API version of the target resource
    kind: Deployment           # The type of resource to scale (e.g., Deployment, ReplicationController)
    name: example-deployment   # The name of the Deployment to scale (REPLACE_ME)
  minReplicas: 2   # The minimum number of replicas to maintain
  maxReplicas: 10  # The maximum number of replicas to scale to
  metrics:                 # Defines the metrics used to trigger scaling
  - type: Resource         # Scale based on resource utilization
    resource:
      name: cpu            # The resource to monitor (CPU in this case)
      target:
        type: Utilization
        averageUtilization: 70  # Target CPU utilization percentage (e.g., 70%)
  - type: Resource         # Scale based on memory utilization
    resource:
      name: memory         # The resource to monitor (memory in this case)
      target:
        type: Utilization
        averageUtilization: 80  # Target memory utilization percentage (e.g., 80%)
  behavior:  # Optional: define scaling behavior
    scaleUp:
      stabilizationWindowSeconds: 300  # Window over which recommendations are stabilized before scaling up
      policies:          # Scaling policies (percentage or fixed number of pods)
      - type: Percent    # Scale up by a percentage
        value: 20        # Percentage increase
        periodSeconds: 60  # Per 60-second period
    scaleDown:
      stabilizationWindowSeconds: 300  # Window over which recommendations are stabilized before scaling down
      policies:          # Scaling policies (percentage or fixed number of pods)
      - type: Percent    # Scale down by a percentage
        value: 10        # Percentage decrease
        periodSeconds: 60  # Per 60-second period
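For reference, the Kubernetes HPA controller computes the desired replica count as `ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the min/max bounds. A small Python sketch of that formula, using this example's 2-10 replica range as defaults:

```python
import math

# The HPA scaling formula as documented for Kubernetes:
# desired = ceil(currentReplicas * currentMetricValue / desiredMetricValue),
# then clamped to [minReplicas, maxReplicas].

def desired_replicas(current, current_util, target_util,
                     min_replicas=2, max_replicas=10):
    desired = math.ceil(current * current_util / target_util)
    return max(min_replicas, min(max_replicas, desired))

assert desired_replicas(4, 105, 70) == 6   # ceil(4 * 1.5) = 6
assert desired_replicas(4, 35, 70) == 2    # ceil(2) = 2, at the floor
assert desired_replicas(8, 140, 70) == 10  # ceil(16), clamped to maxReplicas
```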

@@ -0,0 +1,8 @@
# References
Bundled resources for the auto-scaling-configurator skill.
- [ ] aws_auto_scaling_best_practices.md: Detailed documentation on AWS Auto Scaling best practices, including security considerations and performance optimization.
- [ ] azure_auto_scaling_guide.md: Comprehensive guide to configuring auto-scaling in Azure, covering different scaling scenarios and configuration options.
- [ ] gcp_auto_scaling_reference.md: Reference documentation for Google Cloud Platform's auto-scaling features, including API details and configuration examples.
- [ ] configuration_schema.json: JSON schema defining the structure and validation rules for the auto-scaling configuration files.

@@ -0,0 +1,7 @@
# Scripts
Bundled resources for the auto-scaling-configurator skill.
- [ ] generate_config.py: Generates auto-scaling configuration files based on user input and best practices.
- [ ] validate_config.py: Validates the generated configuration files against a predefined schema to ensure correctness and security.
- [ ] deploy_config.sh: Deploys the generated configuration to the target infrastructure (e.g., AWS, Azure, GCP).