Kubernetes Cluster Setup
Set up a production-ready Kubernetes cluster with essential components.
Task
You are a Kubernetes infrastructure expert. Guide users through setting up a production cluster.
Steps:
- Ask for Platform:
  - Managed (EKS, GKE, AKS)
  - Self-hosted (kubeadm, k3s, kind)
  - Local dev (minikube, kind, k3d)
- Generate Cluster Configuration:
EKS (AWS):
# eksctl config
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production-cluster
  region: us-east-1
  version: "1.28"
managedNodeGroups:
  - name: general-purpose
    instanceType: t3.medium
    minSize: 3
    maxSize: 10
    desiredCapacity: 3
    volumeSize: 50
    ssh:
      allow: true
    labels:
      workload-type: general
    tags:
      nodegroup-role: general-purpose
    iam:
      withAddonPolicies:
        autoScaler: true
        certManager: true
        externalDNS: true
        ebs: true
        efs: true
addons:
  - name: vpc-cni
  - name: coredns
  - name: kube-proxy
  - name: aws-ebs-csi-driver
GKE (Google Cloud):
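# Replace PROJECT_ID with your Google Cloud project ID (the workload pool is always PROJECT_ID.svc.id.goog).
# With --region, --num-nodes, --min-nodes, and --max-nodes apply per zone, not per cluster.
# The maintenance window is 4 hours long; the daily recurrence is an assumption, adjust as needed.
gcloud container clusters create production-cluster \
  --region us-central1 \
  --num-nodes 3 \
  --machine-type n1-standard-2 \
  --disk-size 50 \
  --enable-autoscaling \
  --min-nodes 3 \
  --max-nodes 10 \
  --enable-autorepair \
  --enable-autoupgrade \
  --maintenance-window-start "2024-01-01T00:00:00Z" \
  --maintenance-window-end "2024-01-01T04:00:00Z" \
  --maintenance-window-recurrence "FREQ=DAILY" \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing,GcePersistentDiskCsiDriver \
  --workload-pool=PROJECT_ID.svc.id.goog \
  --enable-shielded-nodes \
  --enable-ip-alias \
  --network default \
  --subnetwork default \
  --cluster-version latest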
AKS (Azure):
# PodSecurityPolicy was removed in Kubernetes 1.25, so --enable-pod-security-policy is not used here.
# Use Pod Security Standards (see Security Setup) instead.
az aks create \
  --resource-group production-rg \
  --name production-cluster \
  --location eastus \
  --kubernetes-version 1.28.0 \
  --node-count 3 \
  --node-vm-size Standard_D2s_v3 \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 10 \
  --network-plugin azure \
  --enable-managed-identity \
  --enable-addons monitoring,azure-policy \
  --generate-ssh-keys
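After the cluster is created, kubectl needs credentials for it. The following is a minimal sketch that assumes the cluster names, regions, and resource group used in the examples above. The `kubeconfig_cmd` helper is hypothetical and only prints the command to run:

```shell
# Hypothetical helper: print the command that fetches kubeconfig credentials
# for the clusters created above. Cluster name, region, and resource group are
# the values assumed in this guide; adjust them to your environment.
kubeconfig_cmd() {
  case "$1" in
    eks) echo "aws eks update-kubeconfig --region us-east-1 --name production-cluster" ;;
    gke) echo "gcloud container clusters get-credentials production-cluster --region us-central1" ;;
    aks) echo "az aks get-credentials --resource-group production-rg --name production-cluster" ;;
    *)   echo "unknown platform: $1" >&2; return 1 ;;
  esac
}
```

Run the printed command for your platform, then confirm access with `kubectl get nodes`.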
- Install Essential Add-ons:
Ingress Controller (NGINX):
# Helm install
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.replicaCount=3 \
  --set controller.service.type=LoadBalancer \
  --set controller.metrics.enabled=true
Cert-Manager (TLS certificates):
helm repo add jetstack https://charts.jetstack.io
helm upgrade --install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true
# ClusterIssuer for Let's Encrypt
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
Prometheus + Grafana (Monitoring):
# Export GRAFANA_ADMIN_PASSWORD with a strong value first; do not hardcode or commit it.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set prometheus.prometheusSpec.retention=30d \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi \
  --set grafana.adminPassword="${GRAFANA_ADMIN_PASSWORD}"
External DNS (auto DNS records):
# provider can be aws, google, or azure
helm repo add external-dns https://kubernetes-sigs.github.io/external-dns/
helm upgrade --install external-dns external-dns/external-dns \
  --namespace kube-system \
  --set provider=aws \
  --set txtOwnerId=production-cluster \
  --set policy=sync
ArgoCD (GitOps):
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Access UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Get admin password
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d
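Before moving on to security setup, it helps to confirm that the add-ons actually came up. The sketch below assumes the namespaces used in the install commands above. `wait_for_addons` is a hypothetical helper built on `kubectl wait`:

```shell
# Hypothetical helper: block until every Deployment in the add-on namespaces
# reports the Available condition, or return non-zero once the timeout expires.
# The namespace list mirrors the helm/kubectl install commands in this guide.
wait_for_addons() {
  addon_timeout="${1:-300s}"
  for addon_ns in ingress-nginx cert-manager monitoring argocd; do
    kubectl wait --for=condition=Available deployment --all \
      --namespace "${addon_ns}" --timeout="${addon_timeout}" || return 1
  done
}
```

Call it as `wait_for_addons` (default 300s per namespace) or `wait_for_addons 600s` on slower clusters. External DNS is installed into kube-system above, so it is not covered by this loop.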
- Security Setup:
Network Policies:
# Default deny all
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Allow DNS (UDP and TCP); kubernetes.io/metadata.name is set automatically on every namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
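A default-deny policy also blocks traffic from the ingress controller to application pods, so most namespaces need at least one explicit allow rule. The following is a minimal sketch, assuming the controller runs in the `ingress-nginx` namespace as installed above. `render_allow_ingress_nginx`, the policy name, and the default port 8080 are hypothetical:

```shell
# Hypothetical helper: print a NetworkPolicy that lets pods in the given
# namespace receive TCP traffic from the ingress-nginx namespace on one port.
# The default namespace and port are illustrative assumptions.
render_allow_ingress_nginx() {
  policy_ns="${1:-production}"
  policy_port="${2:-8080}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress-nginx
  namespace: ${policy_ns}
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: ${policy_port}
EOF
}
```

For example, `render_allow_ingress_nginx production 8080 | kubectl apply -f -` applies the policy. Narrow `podSelector` to specific labels if only some workloads should be reachable.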
Pod Security Standards:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
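Enforcing `restricted` on a namespace that already runs workloads can cause pods to be rejected the next time they are recreated. A server-side dry run shows the violations first without changing anything. This is a sketch only: `pss_dry_run` is a hypothetical wrapper around `kubectl label --dry-run=server`:

```shell
# Hypothetical helper: server-side dry run of a Pod Security enforce label
# across all namespaces. Nothing is persisted; the API server returns warnings
# for existing pods that would violate the chosen level.
pss_dry_run() {
  pss_level="${1:-restricted}"
  kubectl label --dry-run=server --overwrite namespace --all \
    "pod-security.kubernetes.io/enforce=${pss_level}"
}
```

Running `pss_dry_run baseline` before `pss_dry_run restricted` gives a sense of how far existing workloads are from each level.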
- Storage Classes:
# Fast SSD storage
# The parameters below (gp3, iops, throughput) are specific to the AWS EBS CSI driver;
# other provisioners take different parameters.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: ebs.csi.aws.com # or pd.csi.storage.gke.io, disk.csi.azure.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
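Workloads consume the storage class through a PersistentVolumeClaim. The sketch below assumes the `fast` class defined above. `render_pvc`, the claim name, and the default size are hypothetical:

```shell
# Hypothetical helper: print a PersistentVolumeClaim that requests a volume
# from the "fast" StorageClass. The claim name is required; the size defaults
# to 20Gi, an illustrative value.
render_pvc() {
  pvc_name="$1"
  pvc_size="${2:-20Gi}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc_name}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: ${pvc_size}
EOF
}
```

For example, `render_pvc app-data 50Gi | kubectl apply -n production -f -`. Because the class uses `WaitForFirstConsumer`, the claim stays Pending until a pod that mounts it is scheduled.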
Best Practices Included:
- Multi-AZ/region deployment
- Auto-scaling (cluster and pods)
- Monitoring and logging
- TLS certificate automation
- GitOps with ArgoCD
- Network policies
- Resource quotas
- RBAC configuration
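Resource quotas and RBAC are listed above without a manifest, so here is one way they might look. This is a sketch under stated assumptions: `render_namespace_guardrails`, the quota values, the `compute-quota` and `developers-view` names, and the `developers` group are all hypothetical. Only the built-in `view` ClusterRole is standard:

```shell
# Hypothetical helper: print a ResourceQuota and a read-only RoleBinding for
# one namespace. Quota numbers and the "developers" group are placeholders;
# "view" is the built-in read-only ClusterRole.
render_namespace_guardrails() {
  guard_ns="${1:-production}"
  cat <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: ${guard_ns}
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developers-view
  namespace: ${guard_ns}
subjects:
  - kind: Group
    name: developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
EOF
}
```

Apply with `render_namespace_guardrails production | kubectl apply -f -`. Once a quota covers CPU and memory, pods in that namespace must declare requests and limits (or inherit them from a LimitRange), otherwise they are rejected.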
Example Usage:
User: "Set up production EKS cluster with monitoring"
Result: Complete EKS config + all essential add-ons