Zhongwei Li 74928623b2 Initial commit

2025-11-30 09:05:12 +08:00

Helm Chart Best Practices

CNCF and Helm community standards for production-ready charts.

Chart Metadata Standards

Chart.yaml Requirements

apiVersion: v2 (Helm 3)
Semantic versioning for version (SemVer 2); appVersion records the version of the packaged application
Meaningful description
Keywords for discoverability
Maintainer information
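
As a rough illustration of the requirements above, a minimal Chart.yaml might look like the following. The chart name, keywords, and maintainer details are placeholders, not values taken from any real chart.

```yaml
# Chart.yaml (illustrative sketch; name, keywords, and maintainer are placeholders)
apiVersion: v2                 # Helm 3 chart format
name: example-app              # lowercase, hyphen-separated
description: A Helm chart that deploys the example web application
type: application
version: 0.1.0                 # chart version, must be SemVer 2
appVersion: "1.2.3"            # application version, quoted so YAML keeps it a string
keywords:
  - web
  - example
maintainers:
  - name: Jane Doe
    email: jane@example.com
```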

Naming Conventions

Chart names: lowercase, hyphens (no underscores)
Resource names: {{ include "chart.fullname" . }} (prefer include over template so the output can be piped to other functions)
Avoid hardcoding names
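
A short sketch of how a template can derive resource names from a helper instead of hardcoding them. The helper name "chart.fullname" is an assumption; use whatever name your _helpers.tpl defines.

```yaml
# templates/service.yaml (fragment; assumes a "chart.fullname" helper exists)
apiVersion: v1
kind: Service
metadata:
  # Derived from the release and chart names, so two releases never collide
  name: {{ include "chart.fullname" . }}
---
# templates/configmap.yaml (fragment)
apiVersion: v1
kind: ConfigMap
metadata:
  # Related resources reuse the same base name with a suffix
  name: {{ include "chart.fullname" . }}-config
```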

Kubernetes Label Standards

Standard labels (app.kubernetes.io/* prefix, plus helm.sh/chart):

labels:
  app.kubernetes.io/name: {{ include "chart.name" . }}
  app.kubernetes.io/instance: {{ .Release.Name }}
  app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
  app.kubernetes.io/managed-by: {{ .Release.Service }}
  helm.sh/chart: {{ include "chart.chart" . }}

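Selector labels (a Deployment's selector is immutable after creation, so include only labels that never change between upgrades, not the version or chart labels):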

selector:
  matchLabels:
    app.kubernetes.io/name: {{ include "chart.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
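
One common way to keep the two label sets consistent is to define them once as helpers, in the style of the `helm create` scaffold. This is a sketch: the helper names "chart.labels" and "chart.selectorLabels" are assumptions, and "chart.name" and "chart.chart" are assumed to be defined elsewhere in _helpers.tpl.

```yaml
{{/* templates/_helpers.tpl (sketch; helper names are assumptions) */}}
{{/* Stable subset used in selectors: must not change across upgrades */}}
{{- define "chart.selectorLabels" -}}
app.kubernetes.io/name: {{ include "chart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/* Full label set for metadata.labels: selector labels plus version info */}}
{{- define "chart.labels" -}}
helm.sh/chart: {{ include "chart.chart" . }}
{{ include "chart.selectorLabels" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
```

A Deployment could then consume them like this:

```yaml
# templates/deployment.yaml (fragment)
metadata:
  labels:
    {{- include "chart.labels" . | nindent 4 }}
spec:
  selector:
    matchLabels:
      {{- include "chart.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "chart.selectorLabels" . | nindent 8 }}
```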

Security Best Practices

Pod Security Context

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault  # Production

Container Security Context

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  capabilities:
    drop:
    - ALL

Security Guidelines

Never run as root (UID 0)
Drop all Linux capabilities by default
Use read-only root filesystem when possible
Apply seccomp profiles in production
Avoid privileged containers
Don't expose host ports or namespaces
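
A minimal sketch of how a Deployment template might wire both contexts from values, assuming the values keys podSecurityContext and securityContext used in this document. Note that with readOnlyRootFilesystem enabled, an application that writes temporary files typically needs a writable volume (for example an emptyDir) mounted at that path.

```yaml
# templates/deployment.yaml (fragment; a sketch, not a complete manifest)
spec:
  template:
    spec:
      {{- with .Values.podSecurityContext }}
      securityContext:            # pod-level: runAsUser, fsGroup, seccompProfile
        {{- toYaml . | nindent 8 }}
      {{- end }}
      containers:
        - name: {{ .Chart.Name }}
          {{- with .Values.securityContext }}
          securityContext:        # container-level: capabilities, read-only root FS
            {{- toYaml . | nindent 12 }}
          {{- end }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
```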

Resource Management

Always Define Resources

resources:
  limits:
    cpu: 500m
    memory: 256Mi
  requests:
    cpu: 50m
    memory: 64Mi

Resource Sizing Guidelines

Small apps: 50m CPU / 64Mi memory (requests)
Medium apps: 100m CPU / 128Mi memory (requests)
Large apps: 250m+ CPU / 256Mi+ memory (requests)
Limits should be 2-10x requests
Monitor and adjust based on actual usage
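
For example, a values override for the "medium" tier above might look like this. The requests come from the guideline; the limits (4x requests) are an assumption within the suggested 2-10x range and should be tuned from observed usage.

```yaml
# values override for a medium-sized app (limits at 4x requests are an assumption)
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 400m
    memory: 512Mi
```

The container spec can then render the block unchanged:

```yaml
# templates/deployment.yaml (fragment, inside the container definition)
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```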

Health Checks

Liveness Probe

Detects when a container needs to be restarted:

livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

Readiness Probe

Detects when a container can accept traffic:

readinessProbe:
  httpGet:
    path: /ready
    port: http
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3

Probe Best Practices

Always define both liveness and readiness
Use appropriate initialDelaySeconds for slow-starting apps
Health endpoints should be lightweight
Don't use the same endpoint for liveness and readiness if startup is slow; otherwise the liveness probe can restart a container that is still initializing
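
For slow-starting applications, Kubernetes also offers a startupProbe, which holds off the liveness and readiness probes until it succeeds. A sketch, reusing the /health path and http port name from the examples above; the thresholds are assumptions to adapt to your startup time.

```yaml
startupProbe:
  httpGet:
    path: /health
    port: http
  periodSeconds: 10
  failureThreshold: 30   # up to 30 * 10s = 300s before the container is restarted
```

With this in place, the liveness probe's initialDelaySeconds can usually stay small, since it only starts running once the startup probe has passed.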

Values.yaml Organization

Structure

# 1. Replica configuration
replicaCount: 1

# 2. Image configuration
image:
  repository: example/app
  pullPolicy: IfNotPresent
  tag: ""  # Defaults to .Chart.AppVersion (appVersion in Chart.yaml)

# 3. Service account
serviceAccount:
  create: true
  name: ""

# 4. Security contexts
podSecurityContext: {}
securityContext: {}

# 5. Service configuration
service:
  type: ClusterIP
  port: 80

# 6. Resources
resources: {}

# 7. Autoscaling
autoscaling:
  enabled: false

# 8. Additional features (Ingress, ConfigMaps, etc.)

Documentation

Comment every major section
Provide examples for complex values
Document accepted value types
Explain default behavior
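
A sketch of what these documentation guidelines could look like in practice. Each comment starts with the property name, states the accepted type, and explains the default. The commented-out minReplicas and maxReplicas keys are hypothetical examples, and the note about replicaCount being ignored assumes the Deployment template is written that way.

```yaml
# replicaCount is the number of pod replicas (integer).
# Assumed to be ignored by the template when autoscaling.enabled is true.
replicaCount: 1

image:
  # image.repository is the container image repository (string).
  repository: example/app
  # image.pullPolicy is one of Always, IfNotPresent, or Never.
  pullPolicy: IfNotPresent
  # image.tag overrides the image tag (string).
  # When empty, the chart's appVersion is used.
  tag: ""

autoscaling:
  # autoscaling.enabled toggles the HorizontalPodAutoscaler (boolean).
  enabled: false
  # Example when enabled (hypothetical keys):
  # minReplicas: 2
  # maxReplicas: 5
```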

Template Best Practices

Use Helper Functions

{{/* _helpers.tpl: fully qualified name, truncated to the 63-character DNS label limit */}}
{{- define "chart.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}

Conditional Resources

{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
...
{{- end }}

Checksum Annotations

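Force a rolling restart when the ConfigMap changes by adding a checksum to the pod template's annotations (spec.template.metadata.annotations):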

annotations:
  checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}

NOTES.txt Guidelines

Provide clear post-installation instructions:

1. How to access the application
2. How to retrieve default credentials (if any), without printing secret values
3. Next steps for configuration
4. Links to documentation
5. Troubleshooting commands

Multi-Environment Patterns

Base + Override Pattern

values.yaml: Base defaults
values-dev.yaml: Development overrides
values-prod.yaml: Production overrides

Environment-Specific Settings

Dev: Debug enabled, minimal resources, verbose logging
Staging: Production-like, moderate resources
Prod: HA, autoscaling, security hardening, monitoring
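
As an illustration of the override pattern, a production file might contain only the values that differ from the base. The specific numbers below are assumptions, not recommendations from the original guide.

```yaml
# values-prod.yaml (illustrative overrides; numbers are assumptions)
replicaCount: 3

autoscaling:
  enabled: true

resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 512Mi

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault
```

The chart's own values.yaml is always loaded first, so a command such as `helm upgrade --install myrelease . -f values-prod.yaml` layers the production overrides on top of the base defaults.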

Common Pitfalls to Avoid

❌ Don't:

Hardcode values in templates
Forget resource limits
Run containers as root
Skip health checks
Use latest image tag
Expose secrets in values.yaml
Create resources without labels
Ignore security contexts

✅ Do:

Use values and template helpers instead of hardcoded strings
Define resource requests and limits
Use non-root users
Configure probes
Pin specific versions
Reference external secrets
Apply standard labels
Enable security contexts
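
One way to follow the "reference external secrets" advice is to let users name a Secret that was created outside the chart, rather than putting the secret value in values.yaml. In this sketch the values key existingSecret, the environment variable DB_PASSWORD, and the Secret key password are all hypothetical.

```yaml
# values.yaml
# existingSecret is the name of a Secret created outside the chart (hypothetical key).
existingSecret: ""
---
# templates/deployment.yaml (fragment, inside the pod spec)
containers:
  - name: {{ .Chart.Name }}
    env:
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            # Rendering fails with a clear message if the value is missing
            name: {{ required "existingSecret must be set" .Values.existingSecret }}
            key: password
```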

Testing Checklist

Before deploying:

helm lint passes
helm template renders correctly
All required labels present
Security contexts configured
Resource limits defined
Health checks configured
NOTES.txt provides clear instructions
README documents all values
Dry run succeeds
Test deployment in dev environment

Validation Commands

# Lint chart
helm lint .

# Template rendering
helm template myrelease .

# Dry run
helm install myrelease . --dry-run --debug

# Install to test namespace
kubectl create ns test
helm install myrelease . -n test

# Verify
kubectl get all -n test
helm test myrelease -n test

# Cleanup
helm uninstall myrelease -n test
kubectl delete ns test
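
`helm test` only does something if the chart ships a test hook. A minimal sketch, modelled on the `helm create` scaffold, follows; it assumes a "chart.fullname" helper and a Service of the same name listening on .Values.service.port.

```yaml
# templates/tests/test-connection.yaml (sketch; assumes a "chart.fullname" helper)
apiVersion: v1
kind: Pod
metadata:
  name: "{{ include "chart.fullname" . }}-test-connection"
  annotations:
    "helm.sh/hook": test        # run only when `helm test` is invoked
spec:
  containers:
    - name: wget
      image: busybox:1.36       # pinned tag rather than latest
      command: ['wget']
      args: ['{{ include "chart.fullname" . }}:{{ .Values.service.port }}']
  restartPolicy: Never
```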
