Initial commit

Zhongwei Li
2025-11-29 18:34:45 +08:00
commit 99e2727c28
21 changed files with 5694 additions and 0 deletions


@@ -0,0 +1,285 @@
---
name: gitops-workflow
description: Implement GitOps workflows with ArgoCD and Flux for automated, declarative Kubernetes deployments with continuous reconciliation. Use when implementing GitOps practices, automating Kubernetes deployments, or setting up declarative infrastructure management.
---
# GitOps Workflow
Complete guide to implementing GitOps workflows with ArgoCD and Flux for automated Kubernetes deployments.
## Purpose
Implement declarative, Git-based continuous delivery for Kubernetes using ArgoCD or Flux CD, following OpenGitOps principles.
## When to Use This Skill
- Set up GitOps for Kubernetes clusters
- Automate application deployments from Git
- Implement progressive delivery strategies
- Manage multi-cluster deployments
- Configure automated sync policies
- Set up secret management in GitOps
## OpenGitOps Principles
1. **Declarative** - Entire system described declaratively
2. **Versioned and Immutable** - Desired state stored in Git
3. **Pulled Automatically** - Software agents pull desired state
4. **Continuously Reconciled** - Agents reconcile actual vs desired state
## ArgoCD Setup
### 1. Installation
```bash
# Create namespace
kubectl create namespace argocd
# Install ArgoCD
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Get admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
```
**Reference:** See `references/argocd-setup.md` for detailed setup
### 2. Repository Structure
```
gitops-repo/
├── apps/
│   ├── production/
│   │   ├── app1/
│   │   │   ├── kustomization.yaml
│   │   │   └── deployment.yaml
│   │   └── app2/
│   └── staging/
├── infrastructure/
│   ├── ingress-nginx/
│   ├── cert-manager/
│   └── monitoring/
└── argocd/
    ├── applications/
    └── projects/
```
### 3. Create Application
```yaml
# argocd/applications/my-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-repo
    targetRevision: main
    path: apps/production/my-app
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```
### 4. App of Apps Pattern
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: applications
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-repo
    targetRevision: main
    path: argocd/applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated: {}
```
## Flux CD Setup
### 1. Installation
```bash
# Install Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash
# Bootstrap Flux
flux bootstrap github \
--owner=org \
--repository=gitops-repo \
--branch=main \
--path=clusters/production \
--personal
```
### 2. Create GitRepository
```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/org/my-app
  ref:
    branch: main
```
### 3. Create Kustomization
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
  namespace: flux-system
spec:
  interval: 5m
  path: ./deploy
  prune: true
  sourceRef:
    kind: GitRepository
    name: my-app
```
## Sync Policies
### Auto-Sync Configuration
**ArgoCD:**
```yaml
syncPolicy:
  automated:
    prune: true        # Delete resources not in Git
    selfHeal: true     # Reconcile manual changes
    allowEmpty: false
  retry:
    limit: 5
    backoff:
      duration: 5s
      factor: 2
      maxDuration: 3m
```
**Flux:**
```yaml
spec:
  interval: 1m
  prune: true
  wait: true
  timeout: 5m
```
**Reference:** See `references/sync-policies.md`
## Progressive Delivery
### Canary Deployment with Argo Rollouts
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-app
spec:
  replicas: 5
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause: {duration: 1m}
        - setWeight: 50
        - pause: {duration: 2m}
        - setWeight: 100
```
### Blue-Green Deployment
```yaml
strategy:
  blueGreen:
    activeService: my-app
    previewService: my-app-preview
    autoPromotionEnabled: false
```
## Secret Management
### External Secrets Operator
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: db-credentials
  data:
    - secretKey: password
      remoteRef:
        key: prod/db/password
```
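The `secretStoreRef` above assumes a `SecretStore` named `aws-secrets-manager` exists in the same namespace. A minimal sketch of such a store, assuming AWS Secrets Manager with IRSA-based authentication (region and service account name are placeholders):
```yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1                  # assumed region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa    # hypothetical IRSA-annotated service account
```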
### Sealed Secrets
```bash
# Encrypt secret
kubeseal --format yaml < secret.yaml > sealed-secret.yaml
# Commit sealed-secret.yaml to Git
```
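The resulting `sealed-secret.yaml` is safe to commit because only the in-cluster controller can decrypt it. Roughly what the sealed output looks like (the ciphertext below is a placeholder, not real kubeseal output):
```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  encryptedData:
    password: AgBx...   # placeholder ciphertext produced by kubeseal
  template:
    metadata:
      name: db-credentials
      namespace: production
```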
## Best Practices
1. **Use separate repos or branches** for different environments
2. **Implement RBAC** for Git repositories
3. **Enable notifications** for sync failures
4. **Use health checks** for custom resources
5. **Implement approval gates** for production
6. **Keep secrets out of Git** (use External Secrets)
7. **Use App of Apps pattern** for organization
8. **Tag releases** for easy rollback
9. **Monitor sync status** with alerts
10. **Test changes** in staging first
## Troubleshooting
**Sync failures:**
```bash
argocd app get my-app
argocd app sync my-app --prune
```
**Out of sync status:**
```bash
argocd app diff my-app
argocd app sync my-app --force
```
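**Rollbacks** (best practice #8 above): ArgoCD keeps a per-application deployment history; automated sync usually needs to be disabled before rolling back. The revision ID below is only an example:
```bash
argocd app history my-app        # list previously synced revisions
argocd app rollback my-app 5     # roll back to history ID 5
```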
## Related Skills
- `k8s-manifest-generator` - For creating manifests
- `helm-chart-scaffolding` - For packaging applications


@@ -0,0 +1,134 @@
# ArgoCD Setup and Configuration
## Installation Methods
### 1. Standard Installation
```bash
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
```
### 2. High Availability Installation
```bash
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/ha/install.yaml
```
### 3. Helm Installation
```bash
helm repo add argo https://argoproj.github.io/argo-helm
helm install argocd argo/argo-cd -n argocd --create-namespace
```
## Initial Configuration
### Access ArgoCD UI
```bash
# Port forward
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Get initial admin password
argocd admin initial-password -n argocd
```
### Configure Ingress
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server-ingress
  namespace: argocd
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: nginx
  rules:
    - host: argocd.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 443
  tls:
    - hosts:
        - argocd.example.com
      secretName: argocd-secret
```
## CLI Configuration
### Login
```bash
argocd login argocd.example.com --username admin
```
### Add Repository
```bash
argocd repo add https://github.com/org/repo --username user --password token
```
### Create Application
```bash
argocd app create my-app \
--repo https://github.com/org/repo \
--path apps/my-app \
--dest-server https://kubernetes.default.svc \
--dest-namespace production
```
## SSO Configuration
### GitHub OAuth
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  url: https://argocd.example.com
  dex.config: |
    connectors:
      - type: github
        id: github
        name: GitHub
        config:
          clientID: $GITHUB_CLIENT_ID
          clientSecret: $GITHUB_CLIENT_SECRET
          orgs:
            - name: my-org
```
## RBAC Configuration
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    p, role:developers, applications, *, */dev, allow
    p, role:operators, applications, *, */*, allow
    g, my-org:devs, role:developers
    g, my-org:ops, role:operators
```
## Best Practices
1. Enable SSO for production
2. Implement RBAC policies
3. Use separate projects for teams
4. Enable audit logging
5. Configure notifications
6. Use ApplicationSets for multi-cluster (see the sketch after this list)
7. Implement resource hooks
8. Configure health checks
9. Use sync windows for maintenance
10. Monitor with Prometheus metrics
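For item 6, a minimal ApplicationSet sketch using the cluster generator to stamp out one Application per cluster registered with ArgoCD (repository URL and path are placeholders):
```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-app
  namespace: argocd
spec:
  generators:
    - clusters: {}                 # one Application per registered cluster
  template:
    metadata:
      name: 'my-app-{{name}}'      # cluster name injected by the generator
    spec:
      project: default
      source:
        repoURL: https://github.com/org/gitops-repo
        targetRevision: main
        path: apps/my-app
      destination:
        server: '{{server}}'       # cluster API endpoint injected by the generator
        namespace: production
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```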


@@ -0,0 +1,131 @@
# GitOps Sync Policies
## ArgoCD Sync Policies
### Automated Sync
```yaml
syncPolicy:
  automated:
    prune: true        # Delete resources removed from Git
    selfHeal: true     # Reconcile manual changes
    allowEmpty: false  # Prevent empty sync
```
### Manual Sync
```yaml
syncPolicy:
  syncOptions:
    - PrunePropagationPolicy=foreground
    - CreateNamespace=true
```
### Sync Windows
```yaml
# Configured in the AppProject spec
syncWindows:
  - kind: allow
    schedule: "0 8 * * *"
    duration: 1h
    applications:
      - my-app
  - kind: deny
    schedule: "0 22 * * *"
    duration: 8h
    applications:
      - '*'
```
### Retry Policy
```yaml
syncPolicy:
  retry:
    limit: 5
    backoff:
      duration: 5s
      factor: 2
      maxDuration: 3m
```
## Flux Sync Policies
### Kustomization Sync
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-app
spec:
  interval: 5m
  prune: true
  wait: true
  timeout: 5m
  retryInterval: 1m
  force: false
```
### Source Sync Interval
```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-app
spec:
  interval: 1m
  timeout: 60s
```
## Health Assessment
### Custom Health Checks
```yaml
# ArgoCD
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  resource.customizations.health.MyCustomResource: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.conditions ~= nil then
        for i, condition in ipairs(obj.status.conditions) do
          if condition.type == "Ready" and condition.status == "False" then
            hs.status = "Degraded"
            hs.message = condition.message
            return hs
          end
          if condition.type == "Ready" and condition.status == "True" then
            hs.status = "Healthy"
            hs.message = condition.message
            return hs
          end
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for status"
    return hs
```
## Sync Options
### Common Sync Options
- `PrunePropagationPolicy=foreground` - Wait for pruned resources to be deleted
- `CreateNamespace=true` - Auto-create namespace
- `Validate=false` - Skip kubectl validation
- `PruneLast=true` - Prune resources after sync
- `RespectIgnoreDifferences=true` - Honor ignore differences
- `ApplyOutOfSyncOnly=true` - Only apply out-of-sync resources
## Best Practices
1. Use automated sync for non-production
2. Require manual approval for production
3. Configure sync windows for maintenance
4. Implement health checks for custom resources
5. Use selective sync for large applications
6. Configure appropriate retry policies
7. Monitor sync failures with alerts
8. Use prune with caution in production
9. Test sync policies in staging
10. Document sync behavior for teams


@@ -0,0 +1,544 @@
---
name: helm-chart-scaffolding
description: Design, organize, and manage Helm charts for templating and packaging Kubernetes applications with reusable configurations. Use when creating Helm charts, packaging Kubernetes applications, or implementing templated deployments.
---
# Helm Chart Scaffolding
Comprehensive guidance for creating, organizing, and managing Helm charts for packaging and deploying Kubernetes applications.
## Purpose
This skill provides step-by-step instructions for building production-ready Helm charts, including chart structure, templating patterns, values management, and validation strategies.
## When to Use This Skill
Use this skill when you need to:
- Create new Helm charts from scratch
- Package Kubernetes applications for distribution
- Manage multi-environment deployments with Helm
- Implement templating for reusable Kubernetes manifests
- Set up Helm chart repositories
- Follow Helm best practices and conventions
## Helm Overview
**Helm** is the package manager for Kubernetes that:
- Templates Kubernetes manifests for reusability
- Manages application releases and rollbacks
- Handles dependencies between charts
- Provides version control for deployments
- Simplifies configuration management across environments
## Step-by-Step Workflow
### 1. Initialize Chart Structure
**Create new chart:**
```bash
helm create my-app
```
**Standard chart structure:**
```
my-app/
├── Chart.yaml          # Chart metadata
├── values.yaml         # Default configuration values
├── charts/             # Chart dependencies
├── templates/          # Kubernetes manifest templates
│   ├── NOTES.txt       # Post-install notes
│   ├── _helpers.tpl    # Template helpers
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── serviceaccount.yaml
│   ├── hpa.yaml
│   └── tests/
│       └── test-connection.yaml
└── .helmignore         # Files to ignore
```
### 2. Configure Chart.yaml
**Chart metadata defines the package:**
```yaml
apiVersion: v2
name: my-app
description: A Helm chart for My Application
type: application
version: 1.0.0        # Chart version
appVersion: "2.1.0"   # Application version

# Keywords for chart discovery
keywords:
  - web
  - api
  - backend

# Maintainer information
maintainers:
  - name: DevOps Team
    email: devops@example.com
    url: https://github.com/example/my-app

# Source code repository
sources:
  - https://github.com/example/my-app

# Homepage
home: https://example.com

# Chart icon
icon: https://example.com/icon.png

# Dependencies
dependencies:
  - name: postgresql
    version: "12.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled
  - name: redis
    version: "17.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: redis.enabled
```
**Reference:** See `assets/Chart.yaml.template` for complete example
### 3. Design values.yaml Structure
**Organize values hierarchically:**
```yaml
# Image configuration
image:
  repository: myapp
  tag: "1.0.0"
  pullPolicy: IfNotPresent

# Number of replicas
replicaCount: 3

# Service configuration
service:
  type: ClusterIP
  port: 80
  targetPort: 8080

# Ingress configuration
ingress:
  enabled: false
  className: nginx
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix

# Resources
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "500m"

# Autoscaling
autoscaling:
  enabled: false
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

# Environment variables
env:
  - name: LOG_LEVEL
    value: "info"

# ConfigMap data
configMap:
  data:
    APP_MODE: production

# Dependencies
postgresql:
  enabled: true
  auth:
    database: myapp
    username: myapp

redis:
  enabled: false
```
**Reference:** See `assets/values.yaml.template` for complete structure
### 4. Create Template Files
**Use Go templating with Helm functions:**
**templates/deployment.yaml:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "my-app.fullname" . }}
  labels:
    {{- include "my-app.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "my-app.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "my-app.selectorLabels" . | nindent 8 }}
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.targetPort }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          env:
            {{- toYaml .Values.env | nindent 12 }}
```
### 5. Create Template Helpers
**templates/_helpers.tpl:**
```yaml
{{/*
Expand the name of the chart.
*/}}
{{- define "my-app.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
*/}}
{{- define "my-app.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "my-app.labels" -}}
helm.sh/chart: {{ include "my-app.chart" . }}
{{ include "my-app.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "my-app.selectorLabels" -}}
app.kubernetes.io/name: {{ include "my-app.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
```
### 6. Manage Dependencies
**Add dependencies in Chart.yaml:**
```yaml
dependencies:
  - name: postgresql
    version: "12.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled
```
**Update dependencies:**
```bash
helm dependency update
helm dependency build
```
**Override dependency values:**
```yaml
# values.yaml
postgresql:
  enabled: true
  auth:
    database: myapp
    username: myapp
    password: changeme
  primary:
    persistence:
      enabled: true
      size: 10Gi
```
### 7. Test and Validate
**Validation commands:**
```bash
# Lint the chart
helm lint my-app/
# Dry-run installation
helm install my-app ./my-app --dry-run --debug
# Template rendering
helm template my-app ./my-app
# Template with values
helm template my-app ./my-app -f values-prod.yaml
# Show computed values
helm show values ./my-app
```
**Validation script:**
```bash
#!/bin/bash
set -e
echo "Linting chart..."
helm lint .
echo "Testing template rendering..."
helm template test-release . --dry-run
echo "Checking for required values..."
helm template test-release . --validate
echo "All validations passed!"
```
**Reference:** See `scripts/validate-chart.sh`
### 8. Package and Distribute
**Package the chart:**
```bash
helm package my-app/
# Creates: my-app-1.0.0.tgz
```
**Create chart repository:**
```bash
# Create index
helm repo index .
# Upload to repository
# AWS S3 example
aws s3 sync . s3://my-helm-charts/ --exclude "*" --include "*.tgz" --include "index.yaml"
```
**Use the chart:**
```bash
helm repo add my-repo https://charts.example.com
helm repo update
helm install my-app my-repo/my-app
```
### 9. Multi-Environment Configuration
**Environment-specific values files:**
```
my-app/
├── values.yaml           # Defaults
├── values-dev.yaml       # Development
├── values-staging.yaml   # Staging
└── values-prod.yaml      # Production
```
**values-prod.yaml:**
```yaml
replicaCount: 5
image:
  tag: "2.1.0"
resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "1Gi"
    cpu: "1000m"
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
ingress:
  enabled: true
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix
postgresql:
  enabled: true
  primary:
    persistence:
      size: 100Gi
```
**Install with environment:**
```bash
helm install my-app ./my-app -f values-prod.yaml --namespace production
```
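Subsequent releases to the same environment typically go through `helm upgrade`; adding `--install` makes the command idempotent when the release does not exist yet:
```bash
helm upgrade --install my-app ./my-app -f values-prod.yaml --namespace production
```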
### 10. Implement Hooks and Tests
**Pre-install hook:**
```yaml
# templates/pre-install-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "my-app.fullname" . }}-db-setup
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      containers:
        - name: db-setup
          image: postgres:15
          command: ["psql", "-c", "CREATE DATABASE myapp"]
      restartPolicy: Never
```
**Test connection:**
```yaml
# templates/tests/test-connection.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "my-app.fullname" . }}-test-connection"
  annotations:
    "helm.sh/hook": test
spec:
  containers:
    - name: wget
      image: busybox
      command: ['wget']
      args: ['{{ include "my-app.fullname" . }}:{{ .Values.service.port }}']
  restartPolicy: Never
```
**Run tests:**
```bash
helm test my-app
```
## Common Patterns
### Pattern 1: Conditional Resources
```yaml
{{- if .Values.ingress.enabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ include "my-app.fullname" . }}
spec:
  # ...
{{- end }}
```
### Pattern 2: Iterating Over Lists
```yaml
env:
  {{- range .Values.env }}
  - name: {{ .name }}
    value: {{ .value | quote }}
  {{- end }}
```
### Pattern 3: Including Files
```yaml
data:
  config.yaml: |
    {{- .Files.Get "config/application.yaml" | nindent 4 }}
```
### Pattern 4: Global Values
```yaml
global:
  imageRegistry: docker.io
  imagePullSecrets:
    - name: regcred
# Use in templates:
# image: {{ .Values.global.imageRegistry }}/{{ .Values.image.repository }}
```
## Best Practices
1. **Use semantic versioning** for chart and app versions
2. **Document all values** in values.yaml with comments
3. **Use template helpers** for repeated logic
4. **Validate charts** before packaging
5. **Pin dependency versions** explicitly
6. **Use conditions** for optional resources
7. **Follow naming conventions** (lowercase, hyphens)
8. **Include NOTES.txt** with usage instructions
9. **Add labels** consistently using helpers
10. **Test installations** in all environments
## Troubleshooting
**Template rendering errors:**
```bash
helm template my-app ./my-app --debug
```
**Dependency issues:**
```bash
helm dependency update
helm dependency list
```
**Installation failures:**
```bash
helm install my-app ./my-app --dry-run --debug
kubectl get events --sort-by='.lastTimestamp'
```
## Reference Files
- `assets/Chart.yaml.template` - Chart metadata template
- `assets/values.yaml.template` - Values structure template
- `scripts/validate-chart.sh` - Validation script
- `references/chart-structure.md` - Detailed chart organization
## Related Skills
- `k8s-manifest-generator` - For creating base Kubernetes manifests
- `gitops-workflow` - For automated Helm chart deployments


@@ -0,0 +1,42 @@
apiVersion: v2
name: <chart-name>
description: <Chart description>
type: application
version: 0.1.0
appVersion: "1.0.0"
keywords:
  - <keyword1>
  - <keyword2>
home: https://github.com/<org>/<repo>
sources:
  - https://github.com/<org>/<repo>
maintainers:
  - name: <Maintainer Name>
    email: <maintainer@example.com>
    url: https://github.com/<username>
icon: https://example.com/icon.png
kubeVersion: ">=1.24.0"
dependencies:
  - name: postgresql
    version: "12.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled
    tags:
      - database
  - name: redis
    version: "17.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: redis.enabled
    tags:
      - cache
annotations:
  category: Application
  licenses: Apache-2.0


@@ -0,0 +1,185 @@
# Global values shared with subcharts
global:
  imageRegistry: docker.io
  imagePullSecrets: []
  storageClass: ""

# Image configuration
image:
  registry: docker.io
  repository: myapp/web
  tag: ""  # Defaults to .Chart.AppVersion
  pullPolicy: IfNotPresent

# Override chart name
nameOverride: ""
fullnameOverride: ""

# Number of replicas
replicaCount: 3
revisionHistoryLimit: 10

# ServiceAccount
serviceAccount:
  create: true
  annotations: {}
  name: ""

# Pod annotations
podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"

# Pod security context
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault

# Container security context
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL

# Service configuration
service:
  type: ClusterIP
  port: 80
  targetPort: http
  annotations: {}
  sessionAffinity: None

# Ingress configuration
ingress:
  enabled: false
  className: nginx
  annotations: {}
  hosts:
    - host: app.example.com
      paths:
        - path: /
          pathType: Prefix
  tls: []

# Resources
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 250m
    memory: 256Mi

# Liveness probe
livenessProbe:
  httpGet:
    path: /health/live
    port: http
  initialDelaySeconds: 30
  periodSeconds: 10

# Readiness probe
readinessProbe:
  httpGet:
    path: /health/ready
    port: http
  initialDelaySeconds: 5
  periodSeconds: 5

# Autoscaling
autoscaling:
  enabled: false
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80

# Pod Disruption Budget
podDisruptionBudget:
  enabled: true
  minAvailable: 1

# Node selection
nodeSelector: {}
tolerations: []
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - '{{ include "my-app.name" . }}'
          topologyKey: kubernetes.io/hostname

# Environment variables
env: []
# - name: LOG_LEVEL
#   value: "info"

# ConfigMap data
configMap:
  enabled: true
  data: {}
  # APP_MODE: production
  # DATABASE_HOST: postgres.example.com

# Secrets (use external secret management in production)
secrets:
  enabled: false
  data: {}

# Persistent Volume
persistence:
  enabled: false
  storageClass: ""
  accessMode: ReadWriteOnce
  size: 10Gi
  annotations: {}

# PostgreSQL dependency
postgresql:
  enabled: false
  auth:
    database: myapp
    username: myapp
    password: changeme
  primary:
    persistence:
      enabled: true
      size: 10Gi

# Redis dependency
redis:
  enabled: false
  auth:
    enabled: false
  master:
    persistence:
      enabled: false

# ServiceMonitor for Prometheus Operator
serviceMonitor:
  enabled: false
  interval: 30s
  scrapeTimeout: 10s
  labels: {}

# Network Policy
networkPolicy:
  enabled: false
  policyTypes:
    - Ingress
    - Egress
  ingress: []
  egress: []


@@ -0,0 +1,500 @@
# Helm Chart Structure Reference
Complete guide to Helm chart organization, file conventions, and best practices.
## Standard Chart Directory Structure
```
my-app/
├── Chart.yaml              # Chart metadata (required)
├── Chart.lock              # Dependency lock file (generated)
├── values.yaml             # Default configuration values (required)
├── values.schema.json      # JSON schema for values validation
├── .helmignore             # Patterns to ignore when packaging
├── README.md               # Chart documentation
├── LICENSE                 # Chart license
├── charts/                 # Chart dependencies (bundled)
│   └── postgresql-12.0.0.tgz
├── crds/                   # Custom Resource Definitions
│   └── my-crd.yaml
├── templates/              # Kubernetes manifest templates (required)
│   ├── NOTES.txt           # Post-install instructions
│   ├── _helpers.tpl        # Template helper functions
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   ├── configmap.yaml
│   ├── secret.yaml
│   ├── serviceaccount.yaml
│   ├── hpa.yaml
│   ├── pdb.yaml
│   ├── networkpolicy.yaml
│   └── tests/
│       └── test-connection.yaml
└── files/                  # Additional files to include
    └── config/
        └── app.conf
```
## Chart.yaml Specification
### API Version v2 (Helm 3+)
```yaml
apiVersion: v2                      # Required: API version
name: my-application                # Required: Chart name
version: 1.2.3                      # Required: Chart version (SemVer)
appVersion: "2.5.0"                 # Application version
description: A Helm chart for my application  # Required
type: application                   # Chart type: application or library
keywords:                           # Search keywords
  - web
  - api
  - backend
home: https://example.com           # Project home page
sources:                            # Source code URLs
  - https://github.com/example/my-app
maintainers:                        # Maintainer list
  - name: John Doe
    email: john@example.com
    url: https://github.com/johndoe
icon: https://example.com/icon.png  # Chart icon URL
kubeVersion: ">=1.24.0"             # Compatible Kubernetes versions
deprecated: false                   # Mark chart as deprecated
annotations:                        # Arbitrary annotations
  example.com/release-notes: https://example.com/releases/v1.2.3
dependencies:                       # Chart dependencies
  - name: postgresql
    version: "12.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled
    tags:
      - database
    import-values:
      - child: database
        parent: database
    alias: db
```
## Chart Types
### Application Chart
```yaml
type: application
```
- Standard Kubernetes applications
- Can be installed and managed
- Contains templates for K8s resources
### Library Chart
```yaml
type: library
```
- Shared template helpers
- Cannot be installed directly
- Used as dependency by other charts
- No templates/ directory
## Values Files Organization
### values.yaml (defaults)
```yaml
# Global values (shared with subcharts)
global:
  imageRegistry: docker.io
  imagePullSecrets: []

# Image configuration
image:
  registry: docker.io
  repository: myapp/web
  tag: ""  # Defaults to .Chart.AppVersion
  pullPolicy: IfNotPresent

# Deployment settings
replicaCount: 1
revisionHistoryLimit: 10

# Pod configuration
podAnnotations: {}
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

# Container security
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL

# Service
service:
  type: ClusterIP
  port: 80
  targetPort: http
  annotations: {}

# Resources
resources:
  limits:
    cpu: 100m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi

# Autoscaling
autoscaling:
  enabled: false
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80

# Node selection
nodeSelector: {}
tolerations: []
affinity: {}

# Monitoring
serviceMonitor:
  enabled: false
  interval: 30s
```
### values.schema.json (validation)
```json
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "replicaCount": {
      "type": "integer",
      "minimum": 1
    },
    "image": {
      "type": "object",
      "required": ["repository"],
      "properties": {
        "repository": {
          "type": "string"
        },
        "tag": {
          "type": "string"
        },
        "pullPolicy": {
          "type": "string",
          "enum": ["Always", "IfNotPresent", "Never"]
        }
      }
    }
  },
  "required": ["image"]
}
```
## Template Files
### Template Naming Conventions
- **Lowercase with hyphens**: `deployment.yaml`, `service-account.yaml`
- **Partial templates**: Prefix with underscore `_helpers.tpl`
- **Tests**: Place in `templates/tests/`
- **CRDs**: Place in `crds/` (not templated)
### Common Templates
#### _helpers.tpl
```yaml
{{/*
Standard naming helpers
*/}}
{{- define "my-app.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- define "my-app.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}
{{- define "my-app.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{/*
Common labels
*/}}
{{- define "my-app.labels" -}}
helm.sh/chart: {{ include "my-app.chart" . }}
{{ include "my-app.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}
{{- define "my-app.selectorLabels" -}}
app.kubernetes.io/name: {{ include "my-app.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}
{{/*
Image name helper
*/}}
{{- define "my-app.image" -}}
{{- $registry := .Values.global.imageRegistry | default .Values.image.registry -}}
{{- $repository := .Values.image.repository -}}
{{- $tag := .Values.image.tag | default .Chart.AppVersion -}}
{{- printf "%s/%s:%s" $registry $repository $tag -}}
{{- end -}}
```
#### NOTES.txt
```
Thank you for installing {{ .Chart.Name }}.
Your release is named {{ .Release.Name }}.
To learn more about the release, try:
$ helm status {{ .Release.Name }}
$ helm get all {{ .Release.Name }}
{{- if .Values.ingress.enabled }}
Application URL:
{{- range .Values.ingress.hosts }}
http{{ if $.Values.ingress.tls }}s{{ end }}://{{ .host }}{{ .path }}
{{- end }}
{{- else }}
Get the application URL by running:
export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "my-app.name" . }}" -o jsonpath="{.items[0].metadata.name}")
kubectl port-forward $POD_NAME 8080:80
echo "Visit http://127.0.0.1:8080"
{{- end }}
```
## Dependencies Management
### Declaring Dependencies
```yaml
# Chart.yaml
dependencies:
  - name: postgresql
    version: "12.0.0"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled   # Enable/disable via values
    tags:                           # Group dependencies
      - database
    import-values:                  # Import values from subchart
      - child: database
        parent: database
    alias: db                       # Reference as .Values.db
```
### Managing Dependencies
```bash
# Update dependencies
helm dependency update
# List dependencies
helm dependency list
# Build dependencies
helm dependency build
```
### Chart.lock
Generated automatically by `helm dependency update`:
```yaml
dependencies:
  - name: postgresql
    repository: https://charts.bitnami.com/bitnami
    version: 12.0.0
digest: sha256:abcd1234...
generated: "2024-01-01T00:00:00Z"
```
## .helmignore
Exclude files from chart package:
```
# Development files
.git/
.gitignore
*.md
docs/
# Build artifacts
*.swp
*.bak
*.tmp
*.orig
# CI/CD
.travis.yml
.gitlab-ci.yml
Jenkinsfile
# Testing
test/
*.test
# IDE
.vscode/
.idea/
*.iml
```
## Custom Resource Definitions (CRDs)
Place CRDs in `crds/` directory:
```
crds/
├── my-app-crd.yaml
└── another-crd.yaml
```
**Important CRD notes:**
- CRDs are installed before any templates
- CRDs are NOT templated (no `{{ }}` syntax)
- CRDs are NOT upgraded or deleted with chart
- Use `helm install --skip-crds` to skip installation
## Chart Versioning
### Semantic Versioning
- **Chart Version**: Increment when chart changes
- MAJOR: Breaking changes
- MINOR: New features, backward compatible
- PATCH: Bug fixes
- **App Version**: Application version being deployed
- Can be any string
- Not required to follow SemVer
```yaml
version: 2.3.1 # Chart version
appVersion: "1.5.0" # Application version
```
## Chart Testing
### Test Files
```yaml
# templates/tests/test-connection.yaml
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "my-app.fullname" . }}-test-connection"
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  containers:
    - name: wget
      image: busybox
      command: ['wget']
      args: ['{{ include "my-app.fullname" . }}:{{ .Values.service.port }}']
  restartPolicy: Never
```
### Running Tests
```bash
helm test my-release
helm test my-release --logs
```
## Hooks
Helm hooks allow intervention at specific points:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "my-app.fullname" . }}-migration
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
```
### Hook Types
- `pre-install`: After templates are rendered, before any resources are created
- `post-install`: After all resources loaded
- `pre-delete`: Before any resources deleted
- `post-delete`: After all resources deleted
- `pre-upgrade`: Before upgrade
- `post-upgrade`: After upgrade
- `pre-rollback`: Before rollback
- `post-rollback`: After rollback
- `test`: Run with `helm test`
### Hook Weight
Controls hook execution order; weights are quoted integers (negative or positive) and lower weights run first.
### Hook Deletion Policies
- `before-hook-creation`: Delete previous hook before new one
- `hook-succeeded`: Delete after successful execution
- `hook-failed`: Delete if hook fails
## Best Practices
1. **Use helpers** for repeated template logic
2. **Quote strings** in templates: `{{ .Values.name | quote }}`
3. **Validate values** with values.schema.json
4. **Document all values** in values.yaml
5. **Use semantic versioning** for chart versions
6. **Pin dependency versions** exactly
7. **Include NOTES.txt** with usage instructions
8. **Add tests** for critical functionality
9. **Use hooks** for database migrations
10. **Keep charts focused** - one application per chart
## Chart Repository Structure
```
helm-charts/
├── index.yaml
├── my-app-1.0.0.tgz
├── my-app-1.1.0.tgz
├── my-app-1.2.0.tgz
└── another-chart-2.0.0.tgz
```
### Creating Repository Index
```bash
helm repo index . --url https://charts.example.com
```
## Related Resources
- [Helm Documentation](https://helm.sh/docs/)
- [Chart Template Guide](https://helm.sh/docs/chart_template_guide/)
- [Best Practices](https://helm.sh/docs/chart_best_practices/)


@@ -0,0 +1,244 @@
#!/bin/bash
set -e
CHART_DIR="${1:-.}"
RELEASE_NAME="test-release"
echo "═══════════════════════════════════════════════════════"
echo " Helm Chart Validation"
echo "═══════════════════════════════════════════════════════"
echo ""
# Colors
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color
success() {
  echo -e "${GREEN}✓${NC} $1"
}
warning() {
  echo -e "${YELLOW}⚠${NC} $1"
}
error() {
  echo -e "${RED}✗${NC} $1"
}
# Check if Helm is installed
if ! command -v helm &> /dev/null; then
error "Helm is not installed"
exit 1
fi
echo "📦 Chart directory: $CHART_DIR"
echo ""
# 1. Check chart structure
echo "1⃣ Checking chart structure..."
if [ ! -f "$CHART_DIR/Chart.yaml" ]; then
error "Chart.yaml not found"
exit 1
fi
success "Chart.yaml exists"
if [ ! -f "$CHART_DIR/values.yaml" ]; then
error "values.yaml not found"
exit 1
fi
success "values.yaml exists"
if [ ! -d "$CHART_DIR/templates" ]; then
error "templates/ directory not found"
exit 1
fi
success "templates/ directory exists"
echo ""
# 2. Lint the chart
echo "2⃣ Linting chart..."
if helm lint "$CHART_DIR"; then
success "Chart passed lint"
else
error "Chart failed lint"
exit 1
fi
echo ""
# 3. Check Chart.yaml
echo "3⃣ Validating Chart.yaml..."
CHART_NAME=$(grep "^name:" "$CHART_DIR/Chart.yaml" | awk '{print $2}')
CHART_VERSION=$(grep "^version:" "$CHART_DIR/Chart.yaml" | awk '{print $2}')
APP_VERSION=$(grep "^appVersion:" "$CHART_DIR/Chart.yaml" | awk '{print $2}' | tr -d '"')
if [ -z "$CHART_NAME" ]; then
error "Chart name not found"
exit 1
fi
success "Chart name: $CHART_NAME"
if [ -z "$CHART_VERSION" ]; then
error "Chart version not found"
exit 1
fi
success "Chart version: $CHART_VERSION"
if [ -z "$APP_VERSION" ]; then
warning "App version not specified"
else
success "App version: $APP_VERSION"
fi
echo ""
# 4. Test template rendering
echo "4⃣ Testing template rendering..."
if helm template "$RELEASE_NAME" "$CHART_DIR" > /dev/null 2>&1; then
success "Templates rendered successfully"
else
error "Template rendering failed"
helm template "$RELEASE_NAME" "$CHART_DIR"
exit 1
fi
echo ""
# 5. Dry-run installation
echo "5⃣ Testing dry-run installation..."
if helm install "$RELEASE_NAME" "$CHART_DIR" --dry-run --debug > /dev/null 2>&1; then
success "Dry-run installation successful"
else
error "Dry-run installation failed"
exit 1
fi
echo ""
# 6. Check for required Kubernetes resources
echo "6⃣ Checking generated resources..."
MANIFESTS=$(helm template "$RELEASE_NAME" "$CHART_DIR")
if echo "$MANIFESTS" | grep -q "kind: Deployment"; then
success "Deployment found"
else
warning "No Deployment found"
fi
if echo "$MANIFESTS" | grep -q "kind: Service"; then
success "Service found"
else
warning "No Service found"
fi
if echo "$MANIFESTS" | grep -q "kind: ServiceAccount"; then
success "ServiceAccount found"
else
warning "No ServiceAccount found"
fi
echo ""
# 7. Check for security best practices
echo "7⃣ Checking security best practices..."
if echo "$MANIFESTS" | grep -q "runAsNonRoot: true"; then
success "Running as non-root user"
else
warning "Not explicitly running as non-root"
fi
if echo "$MANIFESTS" | grep -q "readOnlyRootFilesystem: true"; then
success "Using read-only root filesystem"
else
warning "Not using read-only root filesystem"
fi
if echo "$MANIFESTS" | grep -q "allowPrivilegeEscalation: false"; then
success "Privilege escalation disabled"
else
warning "Privilege escalation not explicitly disabled"
fi
echo ""
# 8. Check for resource limits
echo "8⃣ Checking resource configuration..."
if echo "$MANIFESTS" | grep -q "resources:"; then
if echo "$MANIFESTS" | grep -q "limits:"; then
success "Resource limits defined"
else
warning "No resource limits defined"
fi
if echo "$MANIFESTS" | grep -q "requests:"; then
success "Resource requests defined"
else
warning "No resource requests defined"
fi
else
warning "No resources defined"
fi
echo ""
# 9. Check for health probes
echo "9⃣ Checking health probes..."
if echo "$MANIFESTS" | grep -q "livenessProbe:"; then
success "Liveness probe configured"
else
warning "No liveness probe found"
fi
if echo "$MANIFESTS" | grep -q "readinessProbe:"; then
success "Readiness probe configured"
else
warning "No readiness probe found"
fi
echo ""
# 10. Check dependencies
if [ -f "$CHART_DIR/Chart.yaml" ] && grep -q "^dependencies:" "$CHART_DIR/Chart.yaml"; then
echo "🔟 Checking dependencies..."
if helm dependency list "$CHART_DIR" > /dev/null 2>&1; then
success "Dependencies valid"
if [ -f "$CHART_DIR/Chart.lock" ]; then
success "Chart.lock file present"
else
warning "Chart.lock file missing (run 'helm dependency update')"
fi
else
error "Dependencies check failed"
fi
echo ""
fi
# 11. Check for values schema
if [ -f "$CHART_DIR/values.schema.json" ]; then
echo "1⃣1⃣ Validating values schema..."
success "values.schema.json present"
# Validate schema if jq is available
if command -v jq &> /dev/null; then
if jq empty "$CHART_DIR/values.schema.json" 2>/dev/null; then
success "values.schema.json is valid JSON"
else
error "values.schema.json contains invalid JSON"
exit 1
fi
fi
echo ""
fi
# Summary
echo "═══════════════════════════════════════════════════════"
echo " Validation Complete!"
echo "═══════════════════════════════════════════════════════"
echo ""
echo "Chart: $CHART_NAME"
echo "Version: $CHART_VERSION"
if [ -n "$APP_VERSION" ]; then
echo "App Version: $APP_VERSION"
fi
echo ""
success "All validations passed!"
echo ""
echo "Next steps:"
echo " • helm package $CHART_DIR"
echo " • helm install my-release $CHART_DIR"
echo " • helm test my-release"
echo ""


@@ -0,0 +1,511 @@
---
name: k8s-manifest-generator
description: Create production-ready Kubernetes manifests for Deployments, Services, ConfigMaps, and Secrets following best practices and security standards. Use when generating Kubernetes YAML manifests, creating K8s resources, or implementing production-grade Kubernetes configurations.
---
# Kubernetes Manifest Generator
Step-by-step guidance for creating production-ready Kubernetes manifests including Deployments, Services, ConfigMaps, Secrets, and PersistentVolumeClaims.
## Purpose
This skill provides comprehensive guidance for generating well-structured, secure, and production-ready Kubernetes manifests following cloud-native best practices and Kubernetes conventions.
## When to Use This Skill
Use this skill when you need to:
- Create new Kubernetes Deployment manifests
- Define Service resources for network connectivity
- Generate ConfigMap and Secret resources for configuration management
- Create PersistentVolumeClaim manifests for stateful workloads
- Follow Kubernetes best practices and naming conventions
- Implement resource limits, health checks, and security contexts
- Design manifests for multi-environment deployments
## Step-by-Step Workflow
### 1. Gather Requirements
**Understand the workload:**
- Application type (stateless/stateful)
- Container image and version
- Environment variables and configuration needs
- Storage requirements
- Network exposure requirements (internal/external)
- Resource requirements (CPU, memory)
- Scaling requirements
- Health check endpoints
**Questions to ask:**
- What is the application name and purpose?
- What container image and tag will be used?
- Does the application need persistent storage?
- What ports does the application expose?
- Are there any secrets or configuration files needed?
- What are the CPU and memory requirements?
- Does the application need to be exposed externally?
### 2. Create Deployment Manifest
**Follow this structure:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: <app-name>
  namespace: <namespace>
  labels:
    app: <app-name>
    version: <version>
spec:
  replicas: 3
  selector:
    matchLabels:
      app: <app-name>
  template:
    metadata:
      labels:
        app: <app-name>
        version: <version>
    spec:
      containers:
        - name: <container-name>
          image: <image>:<tag>
          ports:
            - containerPort: <port>
              name: http
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: http
            initialDelaySeconds: 5
            periodSeconds: 5
          env:
            - name: ENV_VAR
              value: "value"
          envFrom:
            - configMapRef:
                name: <app-name>-config
            - secretRef:
                name: <app-name>-secret
```
**Best practices to apply:**
- Always set resource requests and limits
- Implement both liveness and readiness probes
- Use specific image tags (never `:latest`)
- Apply security context for non-root users
- Use labels for organization and selection
- Set appropriate replica count based on availability needs
**Reference:** See `references/deployment-spec.md` for detailed deployment options
### 3. Create Service Manifest
**Choose the appropriate Service type:**
**ClusterIP (internal only):**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: <app-name>
  namespace: <namespace>
  labels:
    app: <app-name>
spec:
  type: ClusterIP
  selector:
    app: <app-name>
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
```
**LoadBalancer (external access):**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: <app-name>
  namespace: <namespace>
  labels:
    app: <app-name>
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: <app-name>
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
```
**Reference:** See `references/service-spec.md` for service types and networking
### 4. Create ConfigMap
**For application configuration:**
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-config
  namespace: <namespace>
data:
  APP_MODE: production
  LOG_LEVEL: info
  DATABASE_HOST: db.example.com
  # For config files
  app.properties: |
    server.port=8080
    server.host=0.0.0.0
    logging.level=INFO
```
**Best practices:**
- Use ConfigMaps for non-sensitive data only
- Organize related configuration together
- Use meaningful names for keys
- Consider using one ConfigMap per component
- Version ConfigMaps when making changes
**Reference:** See `assets/configmap-template.yaml` for examples
### 5. Create Secret
**For sensitive data:**
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: <app-name>-secret
  namespace: <namespace>
type: Opaque
stringData:
  DATABASE_PASSWORD: "changeme"
  API_KEY: "secret-api-key"
  # For certificate files
  tls.crt: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  tls.key: |
    -----BEGIN PRIVATE KEY-----
    ...
    -----END PRIVATE KEY-----
```
**Security considerations:**
- Never commit secrets to Git in plain text
- Use Sealed Secrets, External Secrets Operator, or Vault
- Rotate secrets regularly
- Use RBAC to limit secret access
- Consider using Secret type: `kubernetes.io/tls` for TLS secrets
### 6. Create PersistentVolumeClaim (if needed)
**For stateful applications:**
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <app-name>-data
  namespace: <namespace>
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 10Gi
```
**Mount in Deployment:**
```yaml
spec:
  template:
    spec:
      containers:
        - name: app
          volumeMounts:
            - name: data
              mountPath: /var/lib/app
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: <app-name>-data
```
**Storage considerations:**
- Choose appropriate StorageClass for performance needs
- Use ReadWriteOnce for single-pod access
- Use ReadWriteMany for multi-pod shared storage
- Consider backup strategies
- Set appropriate retention policies
### 7. Apply Security Best Practices
**Add security context to Deployment:**
```yaml
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
```
**Security checklist:**
- [ ] Run as non-root user
- [ ] Drop all capabilities
- [ ] Use read-only root filesystem
- [ ] Disable privilege escalation
- [ ] Set seccomp profile
- [ ] Use Pod Security Standards
### 8. Add Labels and Annotations
**Standard labels (recommended):**
```yaml
metadata:
  labels:
    app.kubernetes.io/name: <app-name>
    app.kubernetes.io/instance: <instance-name>
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/component: backend
    app.kubernetes.io/part-of: <system-name>
    app.kubernetes.io/managed-by: kubectl
```
**Useful annotations:**
```yaml
metadata:
  annotations:
    description: "Application description"
    contact: "team@example.com"
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"
    prometheus.io/path: "/metrics"
```
### 9. Organize Multi-Resource Manifests
**File organization options:**
**Option 1: Single file with `---` separator**
```yaml
# app-name.yaml
---
apiVersion: v1
kind: ConfigMap
...
---
apiVersion: v1
kind: Secret
...
---
apiVersion: apps/v1
kind: Deployment
...
---
apiVersion: v1
kind: Service
...
```
**Option 2: Separate files**
```
manifests/
├── configmap.yaml
├── secret.yaml
├── deployment.yaml
├── service.yaml
└── pvc.yaml
```
**Option 3: Kustomize structure**
```
base/
├── kustomization.yaml
├── deployment.yaml
├── service.yaml
└── configmap.yaml
overlays/
├── dev/
│   └── kustomization.yaml
└── prod/
    └── kustomization.yaml
```
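A minimal sketch of the kustomization files for Option 3, shown together for brevity (the overlay patch file name is a placeholder):
```yaml
# base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - configmap.yaml
---
# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - path: replica-count.yaml   # hypothetical patch raising replicas for prod
```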
### 10. Validate and Test
**Validation steps:**
```bash
# Dry-run validation
kubectl apply -f manifest.yaml --dry-run=client
# Server-side validation
kubectl apply -f manifest.yaml --dry-run=server
# Validate with kubeval
kubeval manifest.yaml
# Validate with kube-score
kube-score score manifest.yaml
# Check with kube-linter
kube-linter lint manifest.yaml
```
**Testing checklist:**
- [ ] Manifest passes dry-run validation
- [ ] All required fields are present
- [ ] Resource limits are reasonable
- [ ] Health checks are configured
- [ ] Security context is set
- [ ] Labels follow conventions
- [ ] Namespace exists or is created
## Common Patterns
### Pattern 1: Simple Stateless Web Application
**Use case:** Standard web API or microservice
**Components needed:**
- Deployment (3 replicas for HA)
- ClusterIP Service
- ConfigMap for configuration
- Secret for API keys
- HorizontalPodAutoscaler (optional, sketched below)
**Reference:** See `assets/deployment-template.yaml`
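Since the HorizontalPodAutoscaler is the one component in this list not shown elsewhere in this skill, here is a minimal `autoscaling/v2` sketch targeting the Deployment above:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <app-name>
  namespace: <namespace>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <app-name>
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```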
### Pattern 2: Stateful Database Application
**Use case:** Database or persistent storage application
**Components needed:**
- StatefulSet (not Deployment)
- Headless Service
- PersistentVolumeClaim template
- ConfigMap for DB configuration
- Secret for credentials
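A condensed sketch of how these components fit together for a single database workload (the image and port assume PostgreSQL; probes, resources, and security contexts are omitted for brevity):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: <app-name>-headless
spec:
  clusterIP: None              # headless: gives each pod a stable DNS name
  selector:
    app: <app-name>
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: <app-name>
spec:
  serviceName: <app-name>-headless
  replicas: 3
  selector:
    matchLabels:
      app: <app-name>
  template:
    metadata:
      labels:
        app: <app-name>
    spec:
      containers:
        - name: db
          image: postgres:15
          envFrom:
            - secretRef:
                name: <app-name>-secret
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:        # one PVC per pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```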
### Pattern 3: Background Job or Cron
**Use case:** Scheduled tasks or batch processing
**Components needed:**
- CronJob or Job
- ConfigMap for job parameters
- Secret for credentials
- ServiceAccount with RBAC
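A minimal CronJob sketch for this pattern (schedule, image, and ServiceAccount name are placeholders):
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: <app-name>-job
  namespace: <namespace>
spec:
  schedule: "0 2 * * *"          # daily at 02:00
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: <app-name>
          containers:
            - name: job
              image: <registry>/<image>:<tag>
              envFrom:
                - configMapRef:
                    name: <app-name>-config
                - secretRef:
                    name: <app-name>-secret
          restartPolicy: OnFailure
```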
### Pattern 4: Multi-Container Pod
**Use case:** Application with sidecar containers
**Components needed:**
- Deployment with multiple containers
- Shared volumes between containers
- Init containers for setup
- Service (if needed)
## Templates
The following templates are available in the `assets/` directory:
- `deployment-template.yaml` - Standard deployment with best practices
- `service-template.yaml` - Service configurations (ClusterIP, LoadBalancer, NodePort)
- `configmap-template.yaml` - ConfigMap examples with different data types
- `secret-template.yaml` - Secret examples (to be generated, not committed)
- `pvc-template.yaml` - PersistentVolumeClaim templates
## Reference Documentation
- `references/deployment-spec.md` - Detailed Deployment specification
- `references/service-spec.md` - Service types and networking details
## Best Practices Summary
1. **Always set resource requests and limits** - Prevents resource starvation
2. **Implement health checks** - Ensures Kubernetes can manage your application
3. **Use specific image tags** - Avoid unpredictable deployments
4. **Apply security contexts** - Run as non-root, drop capabilities
5. **Use ConfigMaps and Secrets** - Separate config from code
6. **Label everything** - Enables filtering and organization
7. **Follow naming conventions** - Use standard Kubernetes labels
8. **Validate before applying** - Use dry-run and validation tools
9. **Version your manifests** - Keep in Git with version control
10. **Document with annotations** - Add context for other developers
## Troubleshooting
**Pods not starting:**
- Check image pull errors: `kubectl describe pod <pod-name>`
- Verify resource availability: `kubectl get nodes`
- Check events: `kubectl get events --sort-by='.lastTimestamp'`
**Service not accessible:**
- Verify selector matches pod labels: `kubectl get endpoints <service-name>`
- Check service type and port configuration
- Test from within cluster: `kubectl run debug --rm -it --image=busybox -- sh`
**ConfigMap/Secret not loading:**
- Verify names match in Deployment
- Check namespace
- Ensure resources exist: `kubectl get configmap,secret`
## Next Steps
After creating manifests:
1. Store in Git repository
2. Set up CI/CD pipeline for deployment
3. Consider using Helm or Kustomize for templating
4. Implement GitOps with ArgoCD or Flux
5. Add monitoring and observability
## Related Skills
- `helm-chart-scaffolding` - For templating and packaging
- `gitops-workflow` - For automated deployments
- `k8s-security-policies` - For advanced security configurations


@@ -0,0 +1,296 @@
# Kubernetes ConfigMap Templates
---
# Template 1: Simple Key-Value Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-config
  namespace: <namespace>
  labels:
    app.kubernetes.io/name: <app-name>
    app.kubernetes.io/instance: <instance-name>
data:
  # Simple key-value pairs
  APP_ENV: "production"
  LOG_LEVEL: "info"
  DATABASE_HOST: "db.example.com"
  DATABASE_PORT: "5432"
  CACHE_TTL: "3600"
  MAX_CONNECTIONS: "100"
---
# Template 2: Configuration File
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-config-file
  namespace: <namespace>
  labels:
    app.kubernetes.io/name: <app-name>
data:
  # Application configuration file
  application.yaml: |
    server:
      port: 8080
      host: 0.0.0.0
    logging:
      level: INFO
      format: json
    database:
      host: db.example.com
      port: 5432
      pool_size: 20
      timeout: 30
    cache:
      enabled: true
      ttl: 3600
      max_entries: 10000
    features:
      new_ui: true
      beta_features: false
---
# Template 3: Multiple Configuration Files
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-multi-config
  namespace: <namespace>
  labels:
    app.kubernetes.io/name: <app-name>
data:
  # Nginx configuration
  nginx.conf: |
    user nginx;
    worker_processes auto;
    error_log /var/log/nginx/error.log warn;
    pid /var/run/nginx.pid;
    events {
      worker_connections 1024;
    }
    http {
      include /etc/nginx/mime.types;
      default_type application/octet-stream;
      log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
      access_log /var/log/nginx/access.log main;
      sendfile on;
      keepalive_timeout 65;
      include /etc/nginx/conf.d/*.conf;
    }
  # Default site configuration
  default.conf: |
    server {
      listen 80;
      server_name _;
      location / {
        proxy_pass http://backend:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
      }
      location /health {
        access_log off;
        return 200 "healthy\n";
      }
    }
---
# Template 4: JSON Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-json-config
  namespace: <namespace>
  labels:
    app.kubernetes.io/name: <app-name>
data:
  config.json: |
    {
      "server": {
        "port": 8080,
        "host": "0.0.0.0",
        "timeout": 30
      },
      "database": {
        "host": "postgres.example.com",
        "port": 5432,
        "database": "myapp",
        "pool": {
          "min": 2,
          "max": 20
        }
      },
      "redis": {
        "host": "redis.example.com",
        "port": 6379,
        "db": 0
      },
      "features": {
        "auth": true,
        "metrics": true,
        "tracing": true
      }
    }
---
# Template 5: Environment-Specific Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-prod-config
  namespace: production
  labels:
    app.kubernetes.io/name: <app-name>
    environment: production
data:
  APP_ENV: "production"
  LOG_LEVEL: "warn"
  DEBUG: "false"
  RATE_LIMIT: "1000"
  CACHE_TTL: "3600"
  DATABASE_POOL_SIZE: "50"
  FEATURE_FLAG_NEW_UI: "true"
  FEATURE_FLAG_BETA: "false"
---
# Template 6: Script Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: <app-name>-scripts
  namespace: <namespace>
  labels:
    app.kubernetes.io/name: <app-name>
data:
  # Initialization script
  init.sh: |
    #!/bin/bash
    set -e
    echo "Running initialization..."
    # Wait for database
    until nc -z $DATABASE_HOST $DATABASE_PORT; do
      echo "Waiting for database..."
      sleep 2
    done
    echo "Database is ready!"
    # Run migrations
    if [ "$RUN_MIGRATIONS" = "true" ]; then
      echo "Running database migrations..."
      ./migrate up
    fi
    echo "Initialization complete!"
  # Health check script
  healthcheck.sh: |
    #!/bin/bash
    # Check application health endpoint
    response=$(curl -sf http://localhost:8080/health)
    if [ $? -eq 0 ]; then
      echo "Health check passed"
      exit 0
    else
      echo "Health check failed"
      exit 1
    fi
---
# Template 7: Prometheus Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
  labels:
    app.kubernetes.io/name: prometheus
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
      external_labels:
        cluster: 'production'
        region: 'us-west-2'
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
                - alertmanager:9093
    rule_files:
      - /etc/prometheus/rules/*.yml
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
---
# Usage Examples:
#
# 1. Mount as environment variables:
#    envFrom:
#      - configMapRef:
#          name: <app-name>-config
#
# 2. Mount as files:
#    volumeMounts:
#      - name: config
#        mountPath: /etc/app
#    volumes:
#      - name: config
#        configMap:
#          name: <app-name>-config-file
#
# 3. Mount specific keys as files:
#    volumes:
#      - name: nginx-config
#        configMap:
#          name: <app-name>-multi-config
#          items:
#            - key: nginx.conf
#              path: nginx.conf
#
# 4. Use individual environment variables:
#    env:
#      - name: LOG_LEVEL
#        valueFrom:
#          configMapKeyRef:
#            name: <app-name>-config
#            key: LOG_LEVEL

View File

@@ -0,0 +1,203 @@
# Production-Ready Kubernetes Deployment Template
# Replace all <placeholders> with actual values
apiVersion: apps/v1
kind: Deployment
metadata:
name: <app-name>
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
app.kubernetes.io/instance: <instance-name>
app.kubernetes.io/version: "<version>"
app.kubernetes.io/component: <component> # backend, frontend, database, cache
app.kubernetes.io/part-of: <system-name>
app.kubernetes.io/managed-by: kubectl
annotations:
description: "<application description>"
contact: "<team-email>"
spec:
replicas: 3 # Minimum 3 for production HA
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/name: <app-name>
app.kubernetes.io/instance: <instance-name>
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0 # Zero-downtime deployment
minReadySeconds: 10
progressDeadlineSeconds: 600
template:
metadata:
labels:
app.kubernetes.io/name: <app-name>
app.kubernetes.io/instance: <instance-name>
app.kubernetes.io/version: "<version>"
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
prometheus.io/path: "/metrics"
spec:
serviceAccountName: <app-name>
# Pod-level security context
securityContext:
runAsNonRoot: true
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
# Init containers (optional)
initContainers:
- name: init-wait
image: busybox:1.36
command: ['sh', '-c', 'echo "Initializing..."']
securityContext:
allowPrivilegeEscalation: false
runAsNonRoot: true
runAsUser: 1000
containers:
- name: <container-name>
image: <registry>/<image>:<tag> # Never use :latest
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8080
protocol: TCP
- name: metrics
containerPort: 9090
protocol: TCP
# Environment variables
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
# Load from ConfigMap and Secret
envFrom:
- configMapRef:
name: <app-name>-config
- secretRef:
name: <app-name>-secret
# Resource limits
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
# Startup probe (for slow-starting apps)
startupProbe:
httpGet:
path: /health/startup
port: http
initialDelaySeconds: 0
periodSeconds: 10
timeoutSeconds: 3
failureThreshold: 30 # 5 minutes to start
# Liveness probe
livenessProbe:
httpGet:
path: /health/live
port: http
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
# Readiness probe
readinessProbe:
httpGet:
path: /health/ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
# Volume mounts
volumeMounts:
- name: tmp
mountPath: /tmp
- name: cache
mountPath: /app/cache
# - name: data
# mountPath: /var/lib/app
# Container security context
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop:
- ALL
# Lifecycle hooks
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 15"] # Graceful shutdown
# Volumes
volumes:
- name: tmp
emptyDir: {}
- name: cache
emptyDir:
sizeLimit: 1Gi
# - name: data
# persistentVolumeClaim:
# claimName: <app-name>-data
# Scheduling
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app.kubernetes.io/name: <app-name>
topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: ScheduleAnyway
labelSelector:
matchLabels:
app.kubernetes.io/name: <app-name>
terminationGracePeriodSeconds: 30
# Image pull secrets (if using private registry)
# imagePullSecrets:
# - name: regcred

View File

@@ -0,0 +1,171 @@
# Kubernetes Service Templates
---
# Template 1: ClusterIP Service (Internal Only)
apiVersion: v1
kind: Service
metadata:
name: <app-name>
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
app.kubernetes.io/instance: <instance-name>
annotations:
description: "Internal service for <app-name>"
spec:
type: ClusterIP
selector:
app.kubernetes.io/name: <app-name>
app.kubernetes.io/instance: <instance-name>
ports:
- name: http
port: 80
targetPort: http # Named port from container
protocol: TCP
sessionAffinity: None
---
# Template 2: LoadBalancer Service (External Access)
apiVersion: v1
kind: Service
metadata:
name: <app-name>-lb
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
annotations:
# AWS NLB annotations
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
# SSL certificate (optional)
# service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
spec:
type: LoadBalancer
externalTrafficPolicy: Local # Preserves client IP
selector:
app.kubernetes.io/name: <app-name>
ports:
- name: http
port: 80
targetPort: http
protocol: TCP
- name: https
port: 443
targetPort: https
protocol: TCP
# Restrict access to specific IPs (optional)
# loadBalancerSourceRanges:
# - 203.0.113.0/24
---
# Template 3: NodePort Service (Direct Node Access)
apiVersion: v1
kind: Service
metadata:
name: <app-name>-np
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
spec:
type: NodePort
selector:
app.kubernetes.io/name: <app-name>
ports:
- name: http
port: 80
targetPort: 8080
nodePort: 30080 # Optional, 30000-32767 range
protocol: TCP
---
# Template 4: Headless Service (StatefulSet)
apiVersion: v1
kind: Service
metadata:
name: <app-name>-headless
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
spec:
clusterIP: None # Headless
selector:
app.kubernetes.io/name: <app-name>
ports:
- name: client
port: 9042
targetPort: 9042
publishNotReadyAddresses: true # Include not-ready pods in DNS
---
# Template 5: Multi-Port Service with Metrics
apiVersion: v1
kind: Service
metadata:
name: <app-name>-multi
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
prometheus.io/path: "/metrics"
spec:
type: ClusterIP
selector:
app.kubernetes.io/name: <app-name>
ports:
- name: http
port: 80
targetPort: 8080
protocol: TCP
- name: https
port: 443
targetPort: 8443
protocol: TCP
- name: grpc
port: 9090
targetPort: 9090
protocol: TCP
- name: metrics
port: 9091
targetPort: 9091
protocol: TCP
---
# Template 6: Service with Session Affinity
apiVersion: v1
kind: Service
metadata:
name: <app-name>-sticky
namespace: <namespace>
labels:
app.kubernetes.io/name: <app-name>
spec:
type: ClusterIP
selector:
app.kubernetes.io/name: <app-name>
ports:
- name: http
port: 80
targetPort: 8080
protocol: TCP
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800 # 3 hours
---
# Template 7: ExternalName Service (External Service Mapping)
apiVersion: v1
kind: Service
metadata:
name: external-db
namespace: <namespace>
spec:
type: ExternalName
externalName: db.example.com
ports:
- port: 5432
targetPort: 5432
protocol: TCP

View File

@@ -0,0 +1,753 @@
# Kubernetes Deployment Specification Reference
Comprehensive reference for Kubernetes Deployment resources, covering all key fields, best practices, and common patterns.
## Overview
A Deployment provides declarative updates for Pods and ReplicaSets. It manages the desired state of your application, handling rollouts, rollbacks, and scaling operations.
## Complete Deployment Specification
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
namespace: production
labels:
app.kubernetes.io/name: my-app
app.kubernetes.io/version: "1.0.0"
app.kubernetes.io/component: backend
app.kubernetes.io/part-of: my-system
annotations:
description: "Main application deployment"
contact: "backend-team@example.com"
spec:
# Replica management
replicas: 3
revisionHistoryLimit: 10
# Pod selection
selector:
matchLabels:
app: my-app
version: v1
# Update strategy
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
# Minimum time for pod to be ready
minReadySeconds: 10
# Deployment will fail if it doesn't progress in this time
progressDeadlineSeconds: 600
# Pod template
template:
metadata:
labels:
app: my-app
version: v1
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
spec:
# Service account for RBAC
serviceAccountName: my-app
# Security context for the pod
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
# Init containers run before main containers
initContainers:
- name: init-db
image: busybox:1.36
command: ['sh', '-c', 'until nc -z db-service 5432; do sleep 1; done']
securityContext:
allowPrivilegeEscalation: false
runAsNonRoot: true
runAsUser: 1000
# Main containers
containers:
- name: app
image: myapp:1.0.0
imagePullPolicy: IfNotPresent
# Container ports
ports:
- name: http
containerPort: 8080
protocol: TCP
- name: metrics
containerPort: 9090
protocol: TCP
# Environment variables
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
# ConfigMap and Secret references
envFrom:
- configMapRef:
name: app-config
- secretRef:
name: app-secrets
# Resource requests and limits
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "500m"
# Liveness probe
livenessProbe:
httpGet:
path: /health/live
port: http
httpHeaders:
- name: Custom-Header
value: Awesome
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
# Readiness probe
readinessProbe:
httpGet:
path: /health/ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
successThreshold: 1
failureThreshold: 3
# Startup probe (for slow-starting containers)
startupProbe:
httpGet:
path: /health/startup
port: http
initialDelaySeconds: 0
periodSeconds: 10
timeoutSeconds: 3
successThreshold: 1
failureThreshold: 30
# Volume mounts
volumeMounts:
- name: data
mountPath: /var/lib/app
- name: config
mountPath: /etc/app
readOnly: true
- name: tmp
mountPath: /tmp
# Security context for container
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop:
- ALL
# Lifecycle hooks
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", "echo Container started > /tmp/started"]
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 15"]
# Volumes
volumes:
- name: data
persistentVolumeClaim:
claimName: app-data
- name: config
configMap:
name: app-config
- name: tmp
emptyDir: {}
# DNS configuration
dnsPolicy: ClusterFirst
dnsConfig:
options:
- name: ndots
value: "2"
# Scheduling
nodeSelector:
disktype: ssd
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app
operator: In
values:
- my-app
topologyKey: kubernetes.io/hostname
tolerations:
- key: "app"
operator: "Equal"
value: "my-app"
effect: "NoSchedule"
# Termination
terminationGracePeriodSeconds: 30
# Image pull secrets
imagePullSecrets:
- name: regcred
```
## Field Reference
### Metadata Fields
#### Required Fields
- `apiVersion`: `apps/v1` (current stable version)
- `kind`: `Deployment`
- `metadata.name`: Unique name within namespace
#### Recommended Metadata
- `metadata.namespace`: Target namespace (defaults to `default`)
- `metadata.labels`: Key-value pairs for organization
- `metadata.annotations`: Non-identifying metadata
### Spec Fields
#### Replica Management
**`replicas`** (integer, default: 1)
- Number of desired pod instances
- Best practice: Use 3+ for production high availability
- Can be scaled manually or via HorizontalPodAutoscaler
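A minimal HorizontalPodAutoscaler sketch (the target Deployment name, namespace, and 70% CPU threshold are illustrative):
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```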
**`revisionHistoryLimit`** (integer, default: 10)
- Number of old ReplicaSets to retain for rollback
- Set to 0 to disable rollback capability
- Reduces storage overhead for long-running deployments
#### Update Strategy
**`strategy.type`** (string)
- `RollingUpdate` (default): Gradual pod replacement
- `Recreate`: Delete all pods before creating new ones
**`strategy.rollingUpdate.maxSurge`** (int or percent, default: 25%)
- Maximum pods above desired replicas during update
- Example: With 3 replicas and maxSurge=1, up to 4 pods during update
**`strategy.rollingUpdate.maxUnavailable`** (int or percent, default: 25%)
- Maximum pods below desired replicas during update
- Set to 0 for zero-downtime deployments
- Cannot be 0 if maxSurge is 0
**Best practices:**
```yaml
# Zero-downtime deployment
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
# Fast deployment (can have brief downtime)
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 2
maxUnavailable: 1
# Complete replacement
strategy:
type: Recreate
```
#### Pod Template
**`template.metadata.labels`**
- Must include labels matching `spec.selector.matchLabels`
- Add version labels for blue/green deployments
- Include standard Kubernetes labels
**`template.spec.containers`** (required)
- Array of container specifications
- At least one container required
- Each container needs unique name
#### Container Configuration
**Image Management:**
```yaml
containers:
- name: app
image: registry.example.com/myapp:1.0.0
imagePullPolicy: IfNotPresent # or Always, Never
```
Image pull policies:
- `IfNotPresent`: Pull only if not cached locally (default for images with a specific, non-latest tag)
- `Always`: Always pull (default when the tag is `:latest` or omitted)
- `Never`: Never pull, fail if not cached
**Port Declarations:**
```yaml
ports:
- name: http # Named for referencing in Service
containerPort: 8080
protocol: TCP # TCP (default), UDP, or SCTP
hostPort: 8080 # Optional: Bind to host port (rarely used)
```
#### Resource Management
**Requests vs Limits:**
```yaml
resources:
requests:
memory: "256Mi" # Guaranteed resources
cpu: "250m" # 0.25 CPU cores
limits:
memory: "512Mi" # Maximum allowed
cpu: "500m" # 0.5 CPU cores
```
**QoS Classes (determined automatically):**
1. **Guaranteed**: requests = limits for all containers
- Highest priority
- Last to be evicted
2. **Burstable**: requests < limits or only requests set
- Medium priority
- Evicted before Guaranteed
3. **BestEffort**: No requests or limits set
- Lowest priority
- First to be evicted
**Best practices:**
- Always set requests in production
- Set limits to prevent resource monopolization
- Memory limits should be 1.5-2x requests
- CPU limits can be higher for bursty workloads
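Following these guidelines, a typical Burstable configuration might look like the sketch below (values are illustrative); setting requests equal to limits instead yields the Guaranteed class:
```yaml
resources:
  requests:
    memory: "256Mi"   # Guaranteed at scheduling time
    cpu: "250m"
  limits:
    memory: "512Mi"   # ~2x the request, per the guideline above
    cpu: "1000m"      # Higher CPU limit for bursty workloads
```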
#### Health Checks
**Probe Types:**
1. **startupProbe** - For slow-starting applications
```yaml
startupProbe:
httpGet:
path: /health/startup
port: 8080
initialDelaySeconds: 0
periodSeconds: 10
failureThreshold: 30 # 5 minutes to start (10s * 30)
```
2. **livenessProbe** - Restarts unhealthy containers
```yaml
livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3 # Restart after 3 failures
```
3. **readinessProbe** - Controls traffic routing
```yaml
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 3 # Remove from service after 3 failures
```
**Probe Mechanisms:**
```yaml
# HTTP GET
httpGet:
path: /health
port: 8080
httpHeaders:
- name: Authorization
value: Bearer token
# TCP Socket
tcpSocket:
port: 3306
# Command execution
exec:
command:
- cat
- /tmp/healthy
# gRPC (Kubernetes 1.24+)
grpc:
port: 9090
service: my.service.health.v1.Health
```
**Probe Timing Parameters:**
- `initialDelaySeconds`: Wait before first probe
- `periodSeconds`: How often to probe
- `timeoutSeconds`: Probe timeout
- `successThreshold`: Successes needed to mark healthy (must be 1 for liveness and startup probes)
- `failureThreshold`: Failures before taking action
#### Security Context
**Pod-level security context:**
```yaml
spec:
securityContext:
runAsNonRoot: true
runAsUser: 1000
runAsGroup: 1000
fsGroup: 1000
fsGroupChangePolicy: OnRootMismatch
seccompProfile:
type: RuntimeDefault
```
**Container-level security context:**
```yaml
containers:
- name: app
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop:
- ALL
add:
- NET_BIND_SERVICE # Only if needed
```
**Security best practices:**
- Always run as non-root (`runAsNonRoot: true`)
- Drop all capabilities and add only needed ones
- Use read-only root filesystem when possible
- Enable seccomp profile
- Disable privilege escalation
#### Volumes
**Volume Types:**
```yaml
volumes:
# PersistentVolumeClaim
- name: data
persistentVolumeClaim:
claimName: app-data
# ConfigMap
- name: config
configMap:
name: app-config
items:
- key: app.properties
path: application.properties
# Secret
- name: secrets
secret:
secretName: app-secrets
defaultMode: 0400
# EmptyDir (ephemeral)
- name: cache
emptyDir:
sizeLimit: 1Gi
# HostPath (avoid in production)
- name: host-data
hostPath:
path: /data
type: DirectoryOrCreate
```
#### Scheduling
**Node Selection:**
```yaml
# Simple node selector
nodeSelector:
disktype: ssd
zone: us-west-1a
# Node affinity (more expressive)
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/arch
operator: In
values:
- amd64
- arm64
```
**Pod Affinity/Anti-Affinity:**
```yaml
# Spread pods across nodes
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: my-app
topologyKey: kubernetes.io/hostname
# Co-locate with database
affinity:
podAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
app: database
topologyKey: kubernetes.io/hostname
```
**Tolerations:**
```yaml
tolerations:
- key: "node.kubernetes.io/unreachable"
operator: "Exists"
effect: "NoExecute"
tolerationSeconds: 30
- key: "dedicated"
operator: "Equal"
value: "database"
effect: "NoSchedule"
```
## Common Patterns
### High Availability Deployment
```yaml
spec:
replicas: 3
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
template:
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app: my-app
topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: my-app
```
### Sidecar Container Pattern
```yaml
spec:
template:
spec:
containers:
- name: app
image: myapp:1.0.0
volumeMounts:
- name: shared-logs
mountPath: /var/log
- name: log-forwarder
image: fluent-bit:2.0
volumeMounts:
- name: shared-logs
mountPath: /var/log
readOnly: true
volumes:
- name: shared-logs
emptyDir: {}
```
### Init Container for Dependencies
```yaml
spec:
template:
spec:
initContainers:
- name: wait-for-db
image: busybox:1.36
command:
- sh
- -c
- |
until nc -z database-service 5432; do
echo "Waiting for database..."
sleep 2
done
- name: run-migrations
image: myapp:1.0.0
command: ["./migrate", "up"]
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
containers:
- name: app
image: myapp:1.0.0
```
## Best Practices
### Production Checklist
- [ ] Set resource requests and limits
- [ ] Implement all three probe types (startup, liveness, readiness)
- [ ] Use specific image tags (not :latest)
- [ ] Configure security context (non-root, read-only filesystem)
- [ ] Set replica count >= 3 for HA
- [ ] Configure pod anti-affinity for spread
- [ ] Set appropriate update strategy (maxUnavailable: 0 for zero-downtime)
- [ ] Use ConfigMaps and Secrets for configuration
- [ ] Add standard labels and annotations
- [ ] Configure graceful shutdown (preStop hook, terminationGracePeriodSeconds)
- [ ] Set revisionHistoryLimit for rollback capability
- [ ] Use ServiceAccount with minimal RBAC permissions
### Performance Tuning
**Fast startup:**
```yaml
spec:
minReadySeconds: 5
strategy:
rollingUpdate:
maxSurge: 2
maxUnavailable: 1
```
**Zero-downtime updates:**
```yaml
spec:
minReadySeconds: 10
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
```
**Graceful shutdown:**
```yaml
spec:
template:
spec:
terminationGracePeriodSeconds: 60
containers:
- name: app
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 15 && kill -SIGTERM 1"]
```
## Troubleshooting
### Common Issues
**Pods not starting:**
```bash
kubectl describe deployment <name>
kubectl get pods -l app=<app-name>
kubectl describe pod <pod-name>
kubectl logs <pod-name>
```
**ImagePullBackOff:**
- Check image name and tag
- Verify imagePullSecrets
- Check registry credentials
**CrashLoopBackOff:**
- Check container logs
- Verify liveness probe is not too aggressive
- Check resource limits
- Verify application dependencies
**Deployment stuck in progress:**
- Check progressDeadlineSeconds
- Verify readiness probes
- Check resource availability
## Related Resources
- [Kubernetes Deployment API Reference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#deployment-v1-apps)
- [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/)
- [Resource Management](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)

View File

@@ -0,0 +1,724 @@
# Kubernetes Service Specification Reference
Comprehensive reference for Kubernetes Service resources, covering service types, networking, load balancing, and service discovery patterns.
## Overview
A Service provides stable network endpoints for accessing Pods. Services enable loose coupling between microservices by providing service discovery and load balancing.
## Service Types
### 1. ClusterIP (Default)
Exposes the service on an internal cluster IP. Only reachable from within the cluster.
```yaml
apiVersion: v1
kind: Service
metadata:
name: backend-service
namespace: production
spec:
type: ClusterIP
selector:
app: backend
ports:
- name: http
port: 80
targetPort: 8080
protocol: TCP
sessionAffinity: None
```
**Use cases:**
- Internal microservice communication
- Database services
- Internal APIs
- Message queues
### 2. NodePort
Exposes the service on each Node's IP at a static port (30000-32767 range).
```yaml
apiVersion: v1
kind: Service
metadata:
name: frontend-service
spec:
type: NodePort
selector:
app: frontend
ports:
- name: http
port: 80
targetPort: 8080
nodePort: 30080 # Optional, auto-assigned if omitted
protocol: TCP
```
**Use cases:**
- Development/testing external access
- Small deployments without load balancer
- Direct node access requirements
**Limitations:**
- Limited port range (30000-32767)
- Must handle node failures
- No built-in load balancing across nodes
### 3. LoadBalancer
Exposes the service using a cloud provider's load balancer.
```yaml
apiVersion: v1
kind: Service
metadata:
name: public-api
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
spec:
type: LoadBalancer
selector:
app: api
ports:
- name: https
port: 443
targetPort: 8443
protocol: TCP
loadBalancerSourceRanges:
- 203.0.113.0/24
```
**Cloud-specific annotations:**
**AWS:**
```yaml
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb" # or "external"
service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
```
**Azure:**
```yaml
annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
service.beta.kubernetes.io/azure-pip-name: "my-public-ip"
```
**GCP:**
```yaml
annotations:
cloud.google.com/load-balancer-type: "Internal"
cloud.google.com/backend-config: '{"default": "my-backend-config"}'
```
### 4. ExternalName
Maps service to external DNS name (CNAME record).
```yaml
apiVersion: v1
kind: Service
metadata:
name: external-db
spec:
type: ExternalName
externalName: db.external.example.com
ports:
- port: 5432
```
**Use cases:**
- Accessing external services
- Service migration scenarios
- Multi-cluster service references
## Complete Service Specification
```yaml
apiVersion: v1
kind: Service
metadata:
name: my-service
namespace: production
labels:
app: my-app
tier: backend
annotations:
description: "Main application service"
prometheus.io/scrape: "true"
spec:
# Service type
type: ClusterIP
# Pod selector
selector:
app: my-app
version: v1
# Ports configuration
ports:
- name: http
port: 80 # Service port
targetPort: 8080 # Container port (or named port)
protocol: TCP # TCP, UDP, or SCTP
# Session affinity
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800
# IP configuration
clusterIP: 10.0.0.10 # Optional: specific IP
clusterIPs:
- 10.0.0.10
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
# External traffic policy
externalTrafficPolicy: Local
# Internal traffic policy
internalTrafficPolicy: Local
# Health check
healthCheckNodePort: 30000
# Load balancer config (for type: LoadBalancer)
loadBalancerIP: 203.0.113.100
loadBalancerSourceRanges:
- 203.0.113.0/24
# External IPs
externalIPs:
- 80.11.12.10
# Publishing strategy
publishNotReadyAddresses: false
```
## Port Configuration
### Named Ports
Use named ports in Pods for flexibility:
**Deployment:**
```yaml
spec:
template:
spec:
containers:
- name: app
ports:
- name: http
containerPort: 8080
- name: metrics
containerPort: 9090
```
**Service:**
```yaml
spec:
ports:
- name: http
port: 80
targetPort: http # References named port
- name: metrics
port: 9090
targetPort: metrics
```
### Multiple Ports
```yaml
spec:
ports:
- name: http
port: 80
targetPort: 8080
protocol: TCP
- name: https
port: 443
targetPort: 8443
protocol: TCP
- name: grpc
port: 9090
targetPort: 9090
protocol: TCP
```
## Session Affinity
### None (Default)
Distributes requests randomly across pods.
```yaml
spec:
sessionAffinity: None
```
### ClientIP
Routes requests from same client IP to same pod.
```yaml
spec:
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800 # 3 hours
```
**Use cases:**
- Stateful applications
- Session-based applications
- WebSocket connections
## Traffic Policies
### External Traffic Policy
**Cluster (Default):**
```yaml
spec:
externalTrafficPolicy: Cluster
```
- Load balances across all nodes
- May add extra network hop
- Source IP is masked
**Local:**
```yaml
spec:
externalTrafficPolicy: Local
```
- Traffic goes only to pods on receiving node
- Preserves client source IP
- Better performance (no extra hop)
- May cause imbalanced load
### Internal Traffic Policy
```yaml
spec:
internalTrafficPolicy: Local # or Cluster
```
Controls traffic routing for cluster-internal clients.
## Headless Services
Service without cluster IP for direct pod access.
```yaml
apiVersion: v1
kind: Service
metadata:
name: database
spec:
clusterIP: None # Headless
selector:
app: database
ports:
- port: 5432
targetPort: 5432
```
**Use cases:**
- StatefulSet pod discovery
- Direct pod-to-pod communication
- Custom load balancing
- Database clusters
**DNS returns:**
- Individual pod IPs instead of service IP
- Format: `<pod-name>.<service-name>.<namespace>.svc.cluster.local`
## Service Discovery
### DNS
**ClusterIP Service:**
```
<service-name>.<namespace>.svc.cluster.local
```
Example:
```bash
curl http://backend-service.production.svc.cluster.local
```
**Within same namespace:**
```bash
curl http://backend-service
```
**Headless Service (returns pod IPs):**
```
<pod-name>.<service-name>.<namespace>.svc.cluster.local
```
### Environment Variables
Kubernetes injects service info into pods:
```bash
# Service host and port
BACKEND_SERVICE_SERVICE_HOST=10.0.0.100
BACKEND_SERVICE_SERVICE_PORT=80
# For named ports
BACKEND_SERVICE_SERVICE_PORT_HTTP=80
```
**Note:** Pods must be created after the service for env vars to be injected.
## Load Balancing
### Algorithms
kube-proxy's default iptables mode selects backend pods at random. For advanced load balancing:
**Service Mesh (Istio example):**
```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: my-destination-rule
spec:
host: my-service
trafficPolicy:
loadBalancer:
simple: LEAST_REQUEST # or ROUND_ROBIN, RANDOM, PASSTHROUGH
connectionPool:
tcp:
maxConnections: 100
```
### Connection Limits
Connection pool limits are configured at the mesh layer (see `connectionPool` above); pair them with a PodDisruptionBudget so enough backends stay available during voluntary disruptions:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: my-app-pdb
spec:
minAvailable: 2
selector:
matchLabels:
app: my-app
```
## Service Mesh Integration
### Istio Virtual Service
```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: my-service
spec:
hosts:
- my-service
http:
- match:
- headers:
version:
exact: v2
route:
- destination:
host: my-service
subset: v2
- route:
- destination:
host: my-service
subset: v1
weight: 90
- destination:
host: my-service
subset: v2
weight: 10
```
## Common Patterns
### Pattern 1: Internal Microservice
```yaml
apiVersion: v1
kind: Service
metadata:
name: user-service
namespace: backend
labels:
app: user-service
tier: backend
spec:
type: ClusterIP
selector:
app: user-service
ports:
- name: http
port: 8080
targetPort: http
protocol: TCP
- name: grpc
port: 9090
targetPort: grpc
protocol: TCP
```
### Pattern 2: Public API with Load Balancer
```yaml
apiVersion: v1
kind: Service
metadata:
name: api-gateway
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:..."
spec:
type: LoadBalancer
externalTrafficPolicy: Local
selector:
app: api-gateway
ports:
- name: https
port: 443
targetPort: 8443
protocol: TCP
loadBalancerSourceRanges:
- 0.0.0.0/0
```
### Pattern 3: StatefulSet with Headless Service
```yaml
apiVersion: v1
kind: Service
metadata:
name: cassandra
spec:
clusterIP: None
selector:
app: cassandra
ports:
- port: 9042
targetPort: 9042
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: cassandra
spec:
serviceName: cassandra
replicas: 3
selector:
matchLabels:
app: cassandra
template:
metadata:
labels:
app: cassandra
spec:
containers:
- name: cassandra
image: cassandra:4.0
```
### Pattern 4: External Service Mapping
```yaml
apiVersion: v1
kind: Service
metadata:
name: external-database
spec:
type: ExternalName
externalName: prod-db.cxyz.us-west-2.rds.amazonaws.com
---
# Or with Endpoints for IP-based external service
apiVersion: v1
kind: Service
metadata:
name: external-api
spec:
ports:
- port: 443
targetPort: 443
protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
name: external-api
subsets:
- addresses:
- ip: 203.0.113.100
ports:
- port: 443
```
### Pattern 5: Multi-Port Service with Metrics
```yaml
apiVersion: v1
kind: Service
metadata:
name: web-app
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9090"
prometheus.io/path: "/metrics"
spec:
type: ClusterIP
selector:
app: web-app
ports:
- name: http
port: 80
targetPort: 8080
- name: metrics
port: 9090
targetPort: 9090
```
## Network Policies
Control traffic to services:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-frontend-to-backend
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080
```
## Best Practices
### Service Configuration
1. **Use named ports** for flexibility
2. **Set appropriate service type** based on exposure needs
3. **Use labels and selectors consistently** across Deployments and Services
4. **Configure session affinity** for stateful apps
5. **Set external traffic policy to Local** for IP preservation
6. **Use headless services** for StatefulSets
7. **Implement network policies** for security
8. **Add monitoring annotations** for observability
### Production Checklist
- [ ] Service type appropriate for use case
- [ ] Selector matches pod labels
- [ ] Named ports used for clarity
- [ ] Session affinity configured if needed
- [ ] Traffic policy set appropriately
- [ ] Load balancer annotations configured (if applicable)
- [ ] Source IP ranges restricted (for public services)
- [ ] Health check configuration validated
- [ ] Monitoring annotations added
- [ ] Network policies defined
### Performance Tuning
**For high traffic:**
```yaml
spec:
externalTrafficPolicy: Local
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 3600
```
**For WebSocket/long connections:**
```yaml
spec:
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 86400 # 24 hours
```
## Troubleshooting
### Service not accessible
```bash
# Check service exists
kubectl get service <service-name>
# Check endpoints (should show pod IPs)
kubectl get endpoints <service-name>
# Describe service
kubectl describe service <service-name>
# Check if pods match selector
kubectl get pods -l app=<app-name>
```
**Common issues:**
- Selector doesn't match pod labels
- No pods running (endpoints empty)
- Ports misconfigured
- Network policy blocking traffic
### DNS resolution failing
```bash
# Test DNS from pod
kubectl run debug --rm -it --image=busybox -- nslookup <service-name>
# Check CoreDNS
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl logs -n kube-system -l k8s-app=kube-dns
```
### Load balancer issues
```bash
# Check load balancer status
kubectl describe service <service-name>
# Check events
kubectl get events --sort-by='.lastTimestamp'
# Verify cloud provider configuration
kubectl describe node
```
## Related Resources
- [Kubernetes Service API Reference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#service-v1-core)
- [Service Networking](https://kubernetes.io/docs/concepts/services-networking/service/)
- [DNS for Services and Pods](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/)

View File

@@ -0,0 +1,334 @@
---
name: k8s-security-policies
description: Implement Kubernetes security policies including NetworkPolicy, PodSecurityPolicy, and RBAC for production-grade security. Use when securing Kubernetes clusters, implementing network isolation, or enforcing pod security standards.
---
# Kubernetes Security Policies
Comprehensive guide for implementing NetworkPolicy, PodSecurityPolicy, RBAC, and Pod Security Standards in Kubernetes.
## Purpose
Implement defense-in-depth security for Kubernetes clusters using network policies, pod security standards, and RBAC.
## When to Use This Skill
- Implement network segmentation
- Configure pod security standards
- Set up RBAC for least-privilege access
- Create security policies for compliance
- Implement admission control
- Secure multi-tenant clusters
## Pod Security Standards
### 1. Privileged (Unrestricted)
```yaml
apiVersion: v1
kind: Namespace
metadata:
name: privileged-ns
labels:
pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/audit: privileged
pod-security.kubernetes.io/warn: privileged
```
### 2. Baseline (Minimally restrictive)
```yaml
apiVersion: v1
kind: Namespace
metadata:
name: baseline-ns
labels:
pod-security.kubernetes.io/enforce: baseline
pod-security.kubernetes.io/audit: baseline
pod-security.kubernetes.io/warn: baseline
```
### 3. Restricted (Most restrictive)
```yaml
apiVersion: v1
kind: Namespace
metadata:
name: restricted-ns
labels:
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/audit: restricted
pod-security.kubernetes.io/warn: restricted
```
## Network Policies
### Default Deny All
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
```
### Allow Frontend to Backend
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-frontend-to-backend
namespace: production
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080
```
### Allow DNS
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-dns
namespace: production
spec:
podSelector: {}
policyTypes:
- Egress
egress:
- to:
- namespaceSelector:
matchLabels:
name: kube-system
ports:
- protocol: UDP
port: 53
```
**Reference:** See `assets/network-policy-template.yaml`
## RBAC Configuration
### Role (Namespace-scoped)
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: pod-reader
namespace: production
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "watch", "list"]
```
### ClusterRole (Cluster-wide)
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: secret-reader
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get", "watch", "list"]
```
### RoleBinding
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: read-pods
namespace: production
subjects:
- kind: User
name: jane
apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
name: default
namespace: production
roleRef:
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
```
**Reference:** See `references/rbac-patterns.md`
## Pod Security Context
### Restricted Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
name: secure-pod
spec:
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault
containers:
- name: app
image: myapp:1.0
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
```
## Policy Enforcement with OPA Gatekeeper
### ConstraintTemplate
```yaml
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
name: k8srequiredlabels
spec:
crd:
spec:
names:
kind: K8sRequiredLabels
validation:
openAPIV3Schema:
type: object
properties:
labels:
type: array
items:
type: string
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8srequiredlabels
violation[{"msg": msg, "details": {"missing_labels": missing}}] {
provided := {label | input.review.object.metadata.labels[label]}
required := {label | label := input.parameters.labels[_]}
missing := required - provided
count(missing) > 0
msg := sprintf("missing required labels: %v", [missing])
}
```
### Constraint
```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
name: require-app-label
spec:
match:
kinds:
- apiGroups: ["apps"]
kinds: ["Deployment"]
parameters:
labels: ["app", "environment"]
```
## Service Mesh Security (Istio)
### PeerAuthentication (mTLS)
```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
name: default
namespace: production
spec:
mtls:
mode: STRICT
```
### AuthorizationPolicy
```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: allow-frontend
namespace: production
spec:
selector:
matchLabels:
app: backend
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/production/sa/frontend"]
```
## Best Practices
1. **Implement Pod Security Standards** at namespace level
2. **Use Network Policies** for network segmentation
3. **Apply least-privilege RBAC** for all service accounts
4. **Enable admission control** (OPA Gatekeeper/Kyverno)
5. **Run containers as non-root**
6. **Use read-only root filesystem**
7. **Drop all capabilities** unless needed
8. **Implement resource quotas** and limit ranges
9. **Enable audit logging** for security events
10. **Regular security scanning** of images
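For item 8, a namespace-level ResourceQuota and LimitRange might look like this sketch (quota values are illustrative):
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
  - type: Container
    default:            # Applied when a container sets no limits
      cpu: 500m
      memory: 512Mi
    defaultRequest:     # Applied when a container sets no requests
      cpu: 250m
      memory: 256Mi
```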
## Compliance Frameworks
### CIS Kubernetes Benchmark
- Use RBAC authorization
- Enable audit logging
- Use Pod Security Standards
- Configure network policies
- Implement secrets encryption at rest
- Enable node authentication
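For secrets encryption at rest, the API server is pointed at an EncryptionConfiguration file via `--encryption-provider-config`; a minimal sketch (the AES key is a placeholder and must be a base64-encoded 32-byte value):
```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>
      - identity: {}   # Fallback so existing unencrypted data can still be read
```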
### NIST Cybersecurity Framework
- Implement defense in depth
- Use network segmentation
- Configure security monitoring
- Implement access controls
- Enable logging and monitoring
## Troubleshooting
**NetworkPolicy not working:**
```bash
# Check if CNI supports NetworkPolicy
kubectl get nodes -o wide
kubectl describe networkpolicy <name>
```
**RBAC permission denied:**
```bash
# Check effective permissions
kubectl auth can-i list pods --as system:serviceaccount:default:my-sa
kubectl auth can-i '*' '*' --as system:serviceaccount:default:my-sa
```
## Reference Files
- `assets/network-policy-template.yaml` - Network policy examples
- `assets/pod-security-template.yaml` - Pod security policies
- `references/rbac-patterns.md` - RBAC configuration patterns
## Related Skills
- `k8s-manifest-generator` - For creating secure manifests
- `gitops-workflow` - For automated policy deployment

View File

@@ -0,0 +1,177 @@
# Network Policy Templates
---
# Template 1: Default Deny All (Start Here)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny-all
namespace: <namespace>
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
---
# Template 2: Allow DNS (Essential)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-dns
namespace: <namespace>
spec:
podSelector: {}
policyTypes:
- Egress
egress:
- to:
- namespaceSelector:
matchLabels:
name: kube-system
ports:
- protocol: UDP
port: 53
---
# Template 3: Frontend to Backend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-frontend-to-backend
namespace: <namespace>
spec:
podSelector:
matchLabels:
app: backend
tier: backend
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
tier: frontend
ports:
- protocol: TCP
port: 8080
- protocol: TCP
port: 9090
---
# Template 4: Allow Ingress Controller
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-ingress-controller
namespace: <namespace>
spec:
podSelector:
matchLabels:
app: web
policyTypes:
- Ingress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: ingress-nginx
ports:
- protocol: TCP
port: 80
- protocol: TCP
port: 443
---
# Template 5: Allow Monitoring (Prometheus)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-prometheus-scraping
namespace: <namespace>
spec:
podSelector:
matchLabels:
      prometheus.io/scrape: "true"  # Must be set as a pod label; the Prometheus convention typically uses annotations
policyTypes:
- Ingress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: monitoring
ports:
- protocol: TCP
port: 9090
---
# Template 6: Allow External HTTPS
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-external-https
namespace: <namespace>
spec:
podSelector:
matchLabels:
app: api-client
policyTypes:
- Egress
egress:
- to:
- ipBlock:
cidr: 0.0.0.0/0
except:
- 169.254.169.254/32 # Block metadata service
ports:
- protocol: TCP
port: 443
---
# Template 7: Database Access
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-app-to-database
namespace: <namespace>
spec:
podSelector:
matchLabels:
app: postgres
tier: database
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
tier: backend
ports:
- protocol: TCP
port: 5432
---
# Template 8: Cross-Namespace Communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-prod-namespace
namespace: <namespace>
spec:
podSelector:
matchLabels:
app: api
policyTypes:
- Ingress
ingress:
- from:
- namespaceSelector:
matchLabels:
environment: production
podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080

View File

@@ -0,0 +1,187 @@
# RBAC Patterns and Best Practices
## Common RBAC Patterns
### Pattern 1: Read-Only Access
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: read-only
rules:
- apiGroups: ["", "apps", "batch"]
resources: ["*"]
verbs: ["get", "list", "watch"]
```
### Pattern 2: Namespace Admin
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: namespace-admin
namespace: production
rules:
- apiGroups: ["", "apps", "batch", "extensions"]
resources: ["*"]
verbs: ["*"]
```
### Pattern 3: Deployment Manager
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: deployment-manager
namespace: production
rules:
- apiGroups: ["apps"]
resources: ["deployments"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch"]
```
### Pattern 4: Secret Reader (ServiceAccount)
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: secret-reader
namespace: production
rules:
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get"]
resourceNames: ["app-secrets"] # Specific secret only
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: app-secret-reader
namespace: production
subjects:
- kind: ServiceAccount
name: my-app
namespace: production
roleRef:
kind: Role
name: secret-reader
apiGroup: rbac.authorization.k8s.io
```
### Pattern 5: CI/CD Pipeline Access
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cicd-deployer
rules:
- apiGroups: ["apps"]
resources: ["deployments", "replicasets"]
verbs: ["get", "list", "create", "update", "patch"]
- apiGroups: [""]
resources: ["services", "configmaps"]
verbs: ["get", "list", "create", "update", "patch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list"]
```
## ServiceAccount Best Practices
### Create Dedicated ServiceAccounts
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-app
namespace: production
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
spec:
template:
spec:
serviceAccountName: my-app
automountServiceAccountToken: false # Disable if not needed
```
### Least-Privilege ServiceAccount
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: my-app-role
namespace: production
rules:
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["get"]
resourceNames: ["my-app-config"]
```
## Security Best Practices
1. **Use Roles over ClusterRoles** when possible
2. **Specify resourceNames** for fine-grained access
3. **Avoid wildcard permissions** (`*`) in production
4. **Create dedicated ServiceAccounts** for each app
5. **Disable token auto-mounting** if not needed
6. **Regular RBAC audits** to remove unused permissions
7. **Use groups** for user management
8. **Implement namespace isolation**
9. **Monitor RBAC usage** with audit logs
10. **Document role purposes** in metadata
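For item 7, binding a role to a group keeps individual users out of the RBAC objects themselves; a sketch (the group name `dev-team` is illustrative and must match what the cluster's authenticator reports):
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-deployers
  namespace: production
subjects:
- kind: Group
  name: dev-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: deployment-manager
  apiGroup: rbac.authorization.k8s.io
```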
## Troubleshooting RBAC
### Check User Permissions
```bash
kubectl auth can-i list pods --as john@example.com
kubectl auth can-i '*' '*' --as system:serviceaccount:default:my-app
```
### View Effective Permissions
```bash
kubectl describe clusterrole cluster-admin
kubectl describe rolebinding -n production
```
### Debug Access Issues
```bash
kubectl get rolebindings,clusterrolebindings --all-namespaces -o wide | grep my-user
```
## Common RBAC Verbs
- `get` - Read a specific resource
- `list` - List all resources of a type
- `watch` - Watch for resource changes
- `create` - Create new resources
- `update` - Update existing resources
- `patch` - Partially update resources
- `delete` - Delete resources
- `deletecollection` - Delete multiple resources
- `*` - All verbs (avoid in production)
## Resource Scope
### Cluster-Scoped Resources
- Nodes
- PersistentVolumes
- ClusterRoles
- ClusterRoleBindings
- Namespaces
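Granting access to cluster-scoped resources requires a ClusterRole paired with a ClusterRoleBinding; a minimal read-only sketch (the `monitoring-agent` ServiceAccount is illustrative):
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-reader
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-nodes
subjects:
- kind: ServiceAccount
  name: monitoring-agent
  namespace: monitoring
roleRef:
  kind: ClusterRole
  name: node-reader
  apiGroup: rbac.authorization.k8s.io
```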
### Namespace-Scoped Resources
- Pods
- Services
- Deployments
- ConfigMaps
- Secrets
- Roles
- RoleBindings