569 lines
17 KiB
Markdown
569 lines
17 KiB
Markdown
---
|
|
name: HyperShift AWS Provider
|
|
description: Use this skill when you need to deploy HyperShift clusters on AWS infrastructure with proper STS credentials, IAM roles, and VPC configuration
|
|
---
|
|
|
|
# HyperShift AWS Provider
|
|
|
|
This skill provides implementation guidance for creating HyperShift clusters on AWS, handling AWS-specific requirements including STS credentials, IAM roles, VPC configuration, and regional best practices.
|
|
|
|
## When to Use This Skill
|
|
|
|
This skill is automatically invoked by the `/hcp:generate aws` command to guide the AWS provider cluster creation process.
|
|
|
|
## Prerequisites
|
|
|
|
- AWS CLI configured with appropriate credentials
|
|
- HyperShift operator installed and configured
|
|
- STS credentials file for the target AWS account
|
|
- IAM role with required permissions for HyperShift
|
|
- Pull secret for accessing OpenShift images
|
|
|
|
## AWS Provider Overview
|
|
|
|
### AWS Provider Peculiarities
|
|
|
|
- **Requires AWS credentials (STS):** Must have valid STS credentials file
|
|
- **Region selection affects availability zones:** Different regions have different AZ availability
|
|
- **Instance types vary by region:** Not all instance types available in all regions
|
|
- **VPC CIDR must not conflict:** Must not overlap with existing infrastructure
|
|
- **IAM roles:** Can be auto-created or use pre-existing roles
|
|
|
|
### Common AWS Configurations
|
|
|
|
**Development Environment:**
|
|
- Single replica control plane (cost-effective)
|
|
- m5.large instances (balanced performance/cost)
|
|
- 2 availability zones (basic redundancy)
|
|
- Basic networking (public endpoints)
|
|
|
|
**Production Environment:**
|
|
- Highly available control plane
|
|
- m5.xlarge+ instances (better performance)
|
|
- 3+ availability zones (high availability)
|
|
- Custom VPC configuration
|
|
- KMS encryption enabled
|
|
|
|
**Cost-Optimized Environment:**
|
|
- Single NAT gateway
|
|
- Smaller instance types
|
|
- Minimal replicas
|
|
- Spot instances (where applicable)
|
|
|
|
## Implementation Steps
|
|
|
|
### Step 1: Analyze Cluster Description
|
|
|
|
Parse the natural language description for AWS-specific requirements:
|
|
|
|
**Environment Type Detection:**
|
|
- **Development**: "dev", "development", "testing", "demo", "sandbox"
|
|
- **Production**: "prod", "production", "critical", "enterprise"
|
|
- **Cost-Optimized**: "cheap", "cost", "minimal", "budget", "demo"
|
|
|
|
**Performance Indicators:**
|
|
- **High Performance**: "performance", "fast", "high-compute", "intensive"
|
|
- **Standard**: Default moderate configuration
|
|
- **Minimal**: "small", "minimal", "basic", "simple"
|
|
|
|
**Security/Compliance:**
|
|
- **FIPS**: "fips", "compliance", "security", "regulated"
|
|
- **Private**: "private", "isolated", "secure", "internal"
|
|
|
|
**Special Requirements:**
|
|
- **Multi-AZ**: "highly available", "ha", "multi-zone", "resilient"
|
|
- **Single-AZ**: "single zone", "simple", "minimal"
|
|
|
|
### Step 2: Apply AWS Provider Defaults
|
|
|
|
**Required Parameters:**
|
|
- `--region`: AWS region (default: us-east-1)
|
|
- `--pull-secret`: Path to pull secret file
|
|
- `--release-image`: OpenShift release image
|
|
- `--sts-creds`: **REQUIRED** - Path to STS credentials file
|
|
- `--role-arn`: **REQUIRED** - ARN of the IAM role to assume
|
|
- `--base-domain`: **REQUIRED** - Base domain for the cluster
|
|
|
|
**Smart Defaults by Environment:**
|
|
|
|
**Development Environment:**
|
|
```bash
|
|
--instance-type m5.large
|
|
--node-pool-replicas 2
|
|
--control-plane-availability-policy SingleReplica
|
|
--endpoint-access Public
|
|
--root-volume-size 120
|
|
--zones auto-select 2 zones based on region
|
|
```
|
|
|
|
**Production Environment:**
|
|
```bash
|
|
--instance-type m5.xlarge
|
|
--node-pool-replicas 3
|
|
--control-plane-availability-policy HighlyAvailable
|
|
--endpoint-access PublicAndPrivate
|
|
--root-volume-size 120
|
|
--auto-repair true
|
|
--zones auto-select 3+ zones based on region
|
|
```
|
|
|
|
**Cost-Optimized Environment:**
|
|
```bash
|
|
--instance-type m5.large
|
|
--node-pool-replicas 2
|
|
--control-plane-availability-policy SingleReplica
|
|
--endpoint-access Public
|
|
--root-volume-size 120
|
|
--zones auto-select 2 zones (minimal redundancy)
|
|
```
|
|
|
|
### Step 3: Interactive Parameter Collection
|
|
|
|
**Required Information Collection:**
|
|
|
|
1. **Cluster Name**
|
|
```
|
|
🔹 **Cluster Name**: What would you like to name your cluster?
|
|
- Must be DNS-compatible (lowercase, hyphens allowed)
|
|
- Used for AWS resource naming
|
|
- Example: dev-cluster, prod-app, demo-env
|
|
```
|
|
|
|
2. **AWS Region**
|
|
```
|
|
🔹 **AWS Region**: Which AWS region should host your cluster?
|
|
- Consider latency to your users
|
|
- Verify desired instance types are available
|
|
- [Press Enter for default: us-east-1]
|
|
|
|
Popular regions:
|
|
- us-east-1 (N. Virginia) - Largest service availability
|
|
- us-west-2 (Oregon) - West coast, latest services
|
|
- eu-west-1 (Ireland) - Europe
|
|
- ap-southeast-1 (Singapore) - Asia Pacific
|
|
```
|
|
|
|
3. **STS Credentials**
|
|
```
|
|
🔹 **STS Credentials**: Path to your AWS STS credentials file?
|
|
- Required for AWS authentication
|
|
- Generate using: aws sts get-session-token
|
|
- Example: /home/user/.aws/sts-creds.json
|
|
- Format: {"AccessKeyId": "...", "SecretAccessKey": "...", "SessionToken": "..."}
|
|
```
|
|
|
|
4. **IAM Role ARN**
|
|
```
|
|
🔹 **IAM Role ARN**: ARN of the IAM role for HyperShift?
|
|
- Role must have required HyperShift permissions
|
|
- Example: arn:aws:iam::123456789012:role/hypershift-operator-role
|
|
- See: https://hypershift.openshift.io/aws-setup/
|
|
```
|
|
|
|
5. **Base Domain**
|
|
```
|
|
🔹 **Base Domain**: What base domain should be used for cluster DNS?
|
|
- Must be a domain you control in Route53
|
|
- Used for cluster API and application routes
|
|
- Example: example.com, clusters.mycompany.com
|
|
```
|
|
|
|
6. **Pull Secret**
|
|
```
|
|
🔹 **Pull Secret**: Path to your OpenShift pull secret file?
|
|
- Required for accessing OpenShift container images
|
|
- Download from: https://console.redhat.com/openshift/install/pull-secret
|
|
- Example: /home/user/pull-secret.json
|
|
```
|
|
|
|
7. **OpenShift Version**
|
|
```
|
|
🔹 **OpenShift Version**: Which OpenShift version do you want to use?
|
|
|
|
📋 **Check supported versions**: https://amd64.ocp.releases.ci.openshift.org/
|
|
|
|
- Enter release image URL: quay.io/openshift-release-dev/ocp-release:X.Y.Z-multi
|
|
- [Press Enter for default: quay.io/openshift-release-dev/ocp-release:4.18.0-multi]
|
|
```
|
|
|
|
**Optional Configuration (based on description analysis):**
|
|
|
|
8. **Instance Type** (if performance requirements detected)
|
|
```
|
|
🔹 **Instance Type**: Select instance type based on your performance needs:
|
|
- m5.large (2 vCPU, 8GB RAM) - Development, light workloads
|
|
- m5.xlarge (4 vCPU, 16GB RAM) - Production, balanced workloads
|
|
- m5.2xlarge (8 vCPU, 32GB RAM) - High-performance workloads
|
|
- c5.xlarge (4 vCPU, 8GB RAM) - Compute-optimized
|
|
- [Press Enter for default based on environment type]
|
|
```
|
|
|
|
9. **Node Pool Replicas**
|
|
```
|
|
🔹 **Node Pool Replicas**: How many worker nodes do you need?
|
|
- Minimum: 2 (for basic redundancy)
|
|
- Production recommended: 3+
|
|
- [Press Enter for default based on environment type]
|
|
```
|
|
|
|
10. **Availability Zones** (auto-selected, but confirmed)
|
|
```
|
|
🔹 **Availability Zones**: Detected region: us-east-1
|
|
Auto-selecting zones for optimal distribution:
|
|
- Development: us-east-1a, us-east-1b (2 zones)
|
|
- Production: us-east-1a, us-east-1b, us-east-1c (3 zones)
|
|
|
|
Modify zone selection? [y/N]
|
|
```
|
|
|
|
### Step 4: Advanced Configuration (Conditional)
|
|
|
|
**For FIPS Compliance** (if detected):
|
|
```
|
|
🔹 **FIPS Mode**: Enable FIPS mode for compliance?
|
|
- Required for government/regulated workloads
|
|
- May impact performance
|
|
- [yes/no] [Press Enter for default: no]
|
|
```
|
|
|
|
**For High-Performance Workloads**:
|
|
```
|
|
🔹 **Root Volume Size**: Increase root volume size?
|
|
- Default: 120GB
|
|
- High-performance workloads: 200GB+
|
|
- [Press Enter for default: 120]
|
|
```
|
|
|
|
**For Production Environments**:
|
|
```
|
|
🔹 **Auto-Repair**: Enable automatic node repair?
|
|
- Automatically replaces unhealthy nodes
|
|
- Recommended for production
|
|
- [yes/no] [Press Enter for default: yes for production]
|
|
```
|
|
|
|
### Step 5: Generate Command
|
|
|
|
**Basic AWS Cluster Command:**
|
|
```bash
|
|
hypershift create cluster aws \
|
|
--name <cluster-name> \
|
|
--namespace <cluster-name>-ns \
|
|
--region <region> \
|
|
--instance-type <instance-type> \
|
|
--pull-secret <pull-secret-path> \
|
|
--node-pool-replicas <replica-count> \
|
|
--zones <zone-list> \
|
|
--control-plane-availability-policy <policy> \
|
|
--sts-creds <sts-creds-path> \
|
|
--role-arn <role-arn> \
|
|
--base-domain <base-domain> \
|
|
--release-image <release-image>
|
|
```
|
|
|
|
**Development Configuration Example:**
|
|
```bash
|
|
hypershift create cluster aws \
|
|
--name dev-cluster \
|
|
--namespace dev-cluster-ns \
|
|
--region us-east-1 \
|
|
--instance-type m5.large \
|
|
--pull-secret /path/to/pull-secret.json \
|
|
--node-pool-replicas 2 \
|
|
--zones us-east-1a,us-east-1b \
|
|
--control-plane-availability-policy SingleReplica \
|
|
--endpoint-access Public \
|
|
--root-volume-size 120 \
|
|
--sts-creds /path/to/sts-creds.json \
|
|
--role-arn arn:aws:iam::123456789012:role/hypershift-role \
|
|
--base-domain example.com \
|
|
--release-image quay.io/openshift-release-dev/ocp-release:4.18.0-multi
|
|
```
|
|
|
|
**Production Configuration Example:**
|
|
```bash
|
|
hypershift create cluster aws \
|
|
--name production-cluster \
|
|
--namespace production-cluster-ns \
|
|
--region us-west-2 \
|
|
--instance-type m5.xlarge \
|
|
--pull-secret /path/to/pull-secret.json \
|
|
--node-pool-replicas 3 \
|
|
--zones us-west-2a,us-west-2b,us-west-2c \
|
|
--control-plane-availability-policy HighlyAvailable \
|
|
--endpoint-access PublicAndPrivate \
|
|
--root-volume-size 120 \
|
|
--auto-repair \
|
|
--sts-creds /path/to/sts-creds.json \
|
|
--role-arn arn:aws:iam::123456789012:role/hypershift-prod-role \
|
|
--base-domain clusters.company.com \
|
|
--release-image quay.io/openshift-release-dev/ocp-release:4.18.0-multi
|
|
```
|
|
|
|
**FIPS-Enabled Configuration:**
|
|
```bash
|
|
hypershift create cluster aws \
|
|
--name compliance-cluster \
|
|
--namespace compliance-cluster-ns \
|
|
--region us-gov-east-1 \
|
|
--instance-type m5.xlarge \
|
|
--pull-secret /path/to/pull-secret.json \
|
|
--node-pool-replicas 3 \
|
|
--zones us-gov-east-1a,us-gov-east-1b,us-gov-east-1c \
|
|
--control-plane-availability-policy HighlyAvailable \
|
|
--fips \
|
|
--sts-creds /path/to/sts-creds.json \
|
|
--role-arn arn:aws-us-gov:iam::123456789012:role/hypershift-fips-role \
|
|
--base-domain secure.gov.example.com \
|
|
--release-image quay.io/openshift-release-dev/ocp-release:4.18.0-multi
|
|
```
|
|
|
|
### Step 6: Pre-Flight Validation
|
|
|
|
**Provide validation commands:**
|
|
```
|
|
## Pre-Flight Checks
|
|
|
|
Before creating the cluster, verify your setup:
|
|
|
|
1. **AWS Credentials:**
|
|
aws sts get-caller-identity
|
|
|
|
2. **STS Credentials File:**
|
|
cat /path/to/sts-creds.json | jq .
|
|
|
|
3. **IAM Role Access:**
|
|
aws iam get-role --role-name hypershift-role
|
|
|
|
4. **Route53 Domain:**
|
|
aws route53 list-hosted-zones --query "HostedZones[?Name=='example.com.']"
|
|
|
|
5. **Region Availability:**
|
|
aws ec2 describe-availability-zones --region us-east-1
|
|
|
|
6. **Instance Type Availability:**
|
|
aws ec2 describe-instance-type-offerings --location-type availability-zone --filters Name=instance-type,Values=m5.large --region us-east-1
|
|
```
|
|
|
|
### Step 7: Post-Generation Instructions
|
|
|
|
**Next Steps:**
|
|
```
|
|
## Next Steps
|
|
|
|
1. **Verify prerequisites are met:**
|
|
- AWS credentials configured
|
|
- STS credentials file exists and is valid
|
|
- IAM role has required permissions
|
|
- Base domain exists in Route53
|
|
|
|
2. **Run the generated command:**
|
|
Copy and paste the command above
|
|
|
|
3. **Monitor cluster creation:**
|
|
kubectl get hostedcluster -n <cluster-namespace>
|
|
kubectl get nodepool -n <cluster-namespace>
|
|
|
|
4. **Check AWS resources:**
|
|
- EC2 instances in AWS console
|
|
- Load balancers created
|
|
- VPC and networking resources
|
|
|
|
5. **Access cluster when ready:**
|
|
hypershift create kubeconfig --name <cluster-name> --namespace <cluster-namespace>
|
|
export KUBECONFIG=<cluster-name>-kubeconfig
|
|
oc get nodes
|
|
```
|
|
|
|
## Error Handling
|
|
|
|
### Invalid AWS Credentials
|
|
|
|
**Scenario:** AWS credentials are invalid or expired.
|
|
|
|
**Action:**
|
|
```
|
|
AWS credentials validation failed.
|
|
|
|
Please check:
|
|
1. AWS CLI configuration: aws configure list
|
|
2. STS credentials file validity
|
|
3. IAM permissions
|
|
|
|
Regenerate STS credentials:
|
|
aws sts get-session-token --duration-seconds 3600
|
|
```
|
|
|
|
### IAM Role Not Found
|
|
|
|
**Scenario:** Specified IAM role doesn't exist or can't be assumed.
|
|
|
|
**Action:**
|
|
```
|
|
IAM role "arn:aws:iam::123456789012:role/hypershift-role" not found or inaccessible.
|
|
|
|
Please verify:
|
|
1. Role exists: aws iam get-role --role-name hypershift-role
|
|
2. Role has required permissions
|
|
3. Trust relationship allows your account to assume the role
|
|
|
|
See HyperShift AWS setup guide: https://hypershift.openshift.io/aws-setup/
|
|
```
|
|
|
|
### Region/Zone Issues
|
|
|
|
**Scenario:** Instance type not available in selected region/zones.
|
|
|
|
**Action:**
|
|
```
|
|
Instance type "m5.large" not available in zone "us-east-1f".
|
|
|
|
Checking alternative zones in us-east-1:
|
|
✅ us-east-1a (available)
|
|
✅ us-east-1b (available)
|
|
❌ us-east-1f (not available)
|
|
|
|
Suggested zones: us-east-1a,us-east-1b
|
|
|
|
Would you like me to update the command?
|
|
```
|
|
|
|
### Route53 Domain Issues
|
|
|
|
**Scenario:** Base domain not found in Route53 or not accessible.
|
|
|
|
**Action:**
|
|
```
|
|
Base domain "example.com" not found in Route53.
|
|
|
|
Please ensure:
|
|
1. Domain exists in Route53: aws route53 list-hosted-zones
|
|
2. Account has access to the hosted zone
|
|
3. Domain spelling is correct
|
|
|
|
Alternative: Use a subdomain you control (e.g., clusters.mydomain.com)
|
|
```
|
|
|
|
### Resource Limits
|
|
|
|
**Scenario:** AWS account limits would be exceeded.
|
|
|
|
**Action:**
|
|
```
|
|
AWS service limits may be exceeded:
|
|
- EC2 instances: Current: 18/20, Requested: 5 more
|
|
- Elastic IPs: Current: 4/5, Requested: 2 more
|
|
|
|
Consider:
|
|
1. Request limit increases via AWS Support
|
|
2. Choose smaller instance types
|
|
3. Reduce node count
|
|
4. Clean up unused resources
|
|
```
|
|
|
|
## Best Practices
|
|
|
|
### Cost Optimization
|
|
|
|
1. **Right-size instances:** Don't over-provision for development
|
|
2. **Use Spot instances:** Where appropriate for non-critical workloads
|
|
3. **Monitor resource usage:** Regularly review AWS costs
|
|
4. **Clean up unused clusters:** Delete development clusters when not needed
|
|
|
|
### Security
|
|
|
|
1. **Least privilege IAM:** Use minimal required permissions
|
|
2. **STS credentials:** Use short-lived credentials when possible
|
|
3. **Private networking:** Use PrivateAndPublic endpoints for production
|
|
4. **KMS encryption:** Enable for sensitive workloads
|
|
|
|
### High Availability
|
|
|
|
1. **Multi-AZ deployment:** Use 3+ availability zones for production
|
|
2. **Instance distribution:** Spread nodes across zones
|
|
3. **Auto-repair:** Enable for automatic recovery
|
|
4. **Monitoring:** Set up CloudWatch monitoring
|
|
|
|
### Network Planning
|
|
|
|
1. **VPC design:** Plan CIDR ranges carefully
|
|
2. **Subnet strategy:** Use public/private subnet design
|
|
3. **Load balancer:** Configure appropriate load balancer types
|
|
4. **DNS:** Ensure proper Route53 configuration
|
|
|
|
## Anti-Patterns to Avoid
|
|
|
|
❌ **Using root AWS credentials**
|
|
```
|
|
Never use root account credentials for HyperShift
|
|
```
|
|
✅ Use IAM roles and STS credentials
|
|
|
|
❌ **Single availability zone for production**
|
|
```
|
|
--zones us-east-1a # Single point of failure
|
|
```
|
|
✅ Use multiple zones: `--zones us-east-1a,us-east-1b,us-east-1c`
|
|
|
|
❌ **Over-provisioning for development**
|
|
```
|
|
--instance-type m5.8xlarge --node-pool-replicas 10 # Expensive for dev
|
|
```
|
|
✅ Use appropriate sizing: `--instance-type m5.large --node-pool-replicas 2`
|
|
|
|
❌ **Ignoring region-specific limitations**
|
|
```
|
|
Choosing regions without checking instance type availability
|
|
```
|
|
✅ Verify instance types and services are available in target region
|
|
|
|
## Example Workflows
|
|
|
|
### Startup Development Environment
|
|
```
|
|
Input: "cheap AWS cluster for testing our new microservice"
|
|
|
|
Analysis:
|
|
- Environment: Development
|
|
- Cost focus: High priority
|
|
- Scale: Minimal
|
|
|
|
Generated Command:
|
|
hypershift create cluster aws \
|
|
--name dev-microservice \
|
|
--namespace dev-microservice-ns \
|
|
--region us-east-1 \
|
|
--instance-type m5.large \
|
|
--node-pool-replicas 2 \
|
|
--control-plane-availability-policy SingleReplica \
|
|
--endpoint-access Public
|
|
```
|
|
|
|
### Enterprise Production
|
|
```
|
|
Input: "highly available AWS production cluster for customer-facing applications"
|
|
|
|
Analysis:
|
|
- Environment: Production
|
|
- Availability: High priority
|
|
- Scale: Enterprise
|
|
|
|
Generated Command:
|
|
hypershift create cluster aws \
|
|
--name prod-customer-apps \
|
|
--namespace prod-customer-apps-ns \
|
|
--region us-west-2 \
|
|
--instance-type m5.xlarge \
|
|
--node-pool-replicas 5 \
|
|
--zones us-west-2a,us-west-2b,us-west-2c \
|
|
--control-plane-availability-policy HighlyAvailable \
|
|
--endpoint-access PublicAndPrivate \
|
|
--auto-repair
|
|
```
|
|
|
|
## See Also
|
|
|
|
- [HyperShift AWS Provider Documentation](https://hypershift.openshift.io/aws-setup/)
|
|
- [AWS IAM Roles for HyperShift](https://hypershift.openshift.io/aws-setup/#_prerequisites)
|
|
- [AWS CLI Configuration Guide](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)
|
|
- [OpenShift on AWS Best Practices](https://docs.openshift.com/container-platform/latest/installing/installing_aws/) |