Helm Chart Best Practices

CNCF and Helm community standards for production-ready charts.

Chart Metadata Standards

Chart.yaml Requirements

  • apiVersion: v2 (Helm 3)
  • Semantic versioning (version, appVersion)
  • Meaningful description
  • Keywords for discoverability
  • Maintainer information
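As a rough illustration of the requirements above, a Chart.yaml might look like the following sketch. The chart name, versions, keywords, and maintainer details are placeholders, not values from any real chart.

```yaml
# Chart.yaml (illustrative; every name and version here is a placeholder)
apiVersion: v2            # Helm 3 chart format
name: example-app         # lowercase with hyphens
description: A Helm chart that deploys the example-app web service
type: application
version: 1.2.3            # chart version (SemVer); bump on every chart change
appVersion: "2.0.1"       # version of the packaged app; quoted so YAML keeps it a string
keywords:
  - web
  - example
maintainers:
  - name: Jane Doe
    email: jane.doe@example.com
```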

Naming Conventions

  • Chart names: lowercase, hyphens (no underscores)
  • Resource names: {{ include "chart.fullname" . }} (prefer include over template so the result can be piped to other functions)
  • Avoid hardcoding names

Kubernetes Label Standards

Required labels (app.kubernetes.io/* namespace):

labels:
  app.kubernetes.io/name: {{ include "chart.name" . }}
  app.kubernetes.io/instance: {{ .Release.Name }}
  app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
  app.kubernetes.io/managed-by: {{ .Release.Service }}
  helm.sh/chart: {{ include "chart.chart" . }}

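Selector labels (a Deployment's selector cannot be changed after creation, so include only labels whose values stay the same across upgrades):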

selector:
  matchLabels:
    app.kubernetes.io/name: {{ include "chart.name" . }}
    app.kubernetes.io/instance: {{ .Release.Name }}
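One way to keep the two label sets consistent is to define them once in _helpers.tpl. The sketch below assumes the "chart.name" and "chart.chart" helpers used above already exist. The helper names "chart.selectorLabels" and "chart.labels" are illustrative choices, not required names.

```yaml
{{/* Selector labels: only values that stay constant across upgrades. */}}
{{- define "chart.selectorLabels" -}}
app.kubernetes.io/name: {{ include "chart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}

{{/* Common labels: selector labels plus metadata that may change per release. */}}
{{- define "chart.labels" -}}
helm.sh/chart: {{ include "chart.chart" . }}
{{ include "chart.selectorLabels" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
```

A template can then emit them with, for example, {{- include "chart.labels" . | nindent 4 }} under metadata.labels. Keeping the version label out of the selector helper means an appVersion bump does not alter the selector.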

Security Best Practices

Pod Security Context

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault  # Production

Container Security Context

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  capabilities:
    drop:
    - ALL

Security Guidelines

  • Never run as root (UID 0)
  • Drop all Linux capabilities by default
  • Use read-only root filesystem when possible
  • Apply seccomp profiles in production
  • Avoid privileged containers
  • Don't expose host ports or namespaces
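The two contexts above only take effect once the Deployment template renders them. A minimal sketch, assuming the values keys are named podSecurityContext and securityContext as in this document and that the container is named after the chart:

```yaml
# templates/deployment.yaml (excerpt; indentation widths match this nesting only)
spec:
  template:
    spec:
      # Pod-level settings: runAsUser, fsGroup, seccompProfile
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          # Container-level settings: capabilities, read-only root filesystem
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
```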

Resource Management

Always Define Resources

resources:
  limits:
    cpu: 500m
    memory: 256Mi
  requests:
    cpu: 50m
    memory: 64Mi

Resource Sizing Guidelines

  • Small apps: 50m CPU / 64Mi memory (requests)
  • Medium apps: 100m CPU / 128Mi memory (requests)
  • Large apps: 250m+ CPU / 256Mi+ memory (requests)
  • Limits should be 2-10x requests
  • Monitor and adjust based on actual usage
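Applying these guidelines to a medium app could give a starting point like the one below. The limit multipliers are an arbitrary pick from within the 2-10x range and should be tuned against observed usage.

```yaml
# values.yaml (excerpt): illustrative starting point for a "medium" app
resources:
  requests:
    cpu: 100m       # medium-app request from the guideline
    memory: 128Mi
  limits:
    cpu: 500m       # 5x the CPU request
    memory: 512Mi   # 4x the memory request
```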

Health Checks

Liveness Probe

Detects when a container needs to be restarted:

livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

Readiness Probe

Detects when a container can accept traffic:

readinessProbe:
  httpGet:
    path: /ready
    port: http
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3

Probe Best Practices

  • Always define both liveness and readiness probes
  • Set initialDelaySeconds long enough for slow-starting apps
  • Keep health endpoints lightweight
  • Don't use the same endpoint for liveness and readiness if startup is slow
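For a slow-starting app, the points above might translate into something like this sketch. The two-minute startup time is an assumption made for illustration. The /health and /ready paths are the ones used in the earlier examples.

```yaml
livenessProbe:
  httpGet:
    path: /health              # liveness endpoint
    port: http
  initialDelaySeconds: 120     # assumes the app needs about two minutes to start
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /ready               # separate endpoint, so failing readiness does not trigger a restart
    port: http
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 3
```

With separate endpoints, the pod is simply left out of Service traffic while /ready fails, and the liveness probe does not begin checking until the delay has elapsed.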

Values.yaml Organization

Structure

# 1. Replica configuration
replicaCount: 1

# 2. Image configuration
image:
  repository: example/app
  pullPolicy: IfNotPresent
  tag: ""  # Defaults to Chart.appVersion

# 3. Service account
serviceAccount:
  create: true
  name: ""

# 4. Security contexts
podSecurityContext: {}
securityContext: {}

# 5. Service configuration
service:
  type: ClusterIP
  port: 80

# 6. Resources
resources: {}

# 7. Autoscaling
autoscaling:
  enabled: false

# 8. Additional features (Ingress, ConfigMaps, etc.)

Documentation

  • Comment every major section
  • Provide examples for complex values
  • Document accepted value types
  • Explain default behavior
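A commented values.yaml excerpt in this spirit could look as follows. The wording of the comments is only a suggestion, and the note about autoscaling assumes the Deployment template omits replicas when autoscaling is enabled.

```yaml
# Number of pod replicas (integer).
# Assumed to be ignored by the template when autoscaling.enabled is true.
replicaCount: 1

image:
  # Container image repository (string), without a tag.
  repository: example/app
  # Image pull policy (string). One of: Always, IfNotPresent, Never.
  pullPolicy: IfNotPresent
  # Image tag (string). When empty, the chart's appVersion is used.
  # Example: tag: "1.4.2"
  tag: ""
```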

Template Best Practices

Use Helper Functions

# _helpers.tpl
{{- define "app.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}
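The helper is then called wherever a resource needs a name. A short sketch, assuming a Service template:

```yaml
# templates/service.yaml (excerpt)
apiVersion: v1
kind: Service
metadata:
  # Renders "<release>-<chart>", truncated to 63 characters by the helper
  name: {{ include "app.fullname" . }}
```

The 63-character truncation matters because many Kubernetes resource names must be valid DNS labels, which are limited to 63 characters.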

Conditional Resources

{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
...
{{- end }}
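For the condition above to render cleanly, values.yaml should define the ingress key even when the feature is off. Otherwise evaluating .Values.ingress.enabled fails with a nil-pointer error. A minimal default might be:

```yaml
# values.yaml (excerpt)
ingress:
  enabled: false   # set to true to render the Ingress template
```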

Checksum Annotations

Force pod restart on config changes:

annotations:
  checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
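The annotation triggers a rollout only when it sits on the pod template, not on the Deployment's own metadata. A sketch of the placement, assuming the ConfigMap template is named configmap.yaml as above:

```yaml
# templates/deployment.yaml (excerpt)
spec:
  template:
    metadata:
      annotations:
        # Changes whenever the rendered ConfigMap changes, which alters the pod template
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```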

NOTES.txt Guidelines

Provide clear post-installation instructions:

1. How to access the application
2. Default credentials (if any)
3. Next steps for configuration
4. Links to documentation
5. Troubleshooting commands

Multi-Environment Patterns

Base + Override Pattern

  • values.yaml: Base defaults
  • values-dev.yaml: Development overrides
  • values-prod.yaml: Production overrides
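An override file only needs the keys that differ from the base. The following values-prod.yaml is a sketch with illustrative numbers. It reuses the key names from the values.yaml structure shown earlier.

```yaml
# values-prod.yaml: only the keys that differ from values.yaml
replicaCount: 3

resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: "1"        # 4x the CPU request
    memory: 1Gi     # 4x the memory request

autoscaling:
  enabled: true
```

It would be applied with helm install myrelease . -f values-prod.yaml. Helm loads values.yaml automatically and merges the file passed with -f on top of it.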

Environment-Specific Settings

  • Dev: Debug enabled, minimal resources, verbose logging
  • Staging: Production-like, moderate resources
  • Prod: HA, autoscaling, security hardening, monitoring

Common Pitfalls to Avoid

Don't:

  • Hardcode values in templates
  • Forget resource limits
  • Run containers as root
  • Skip health checks
  • Use latest image tag
  • Expose secrets in values.yaml
  • Create resources without labels
  • Ignore security contexts

Do:

  • Use template functions
  • Define all resources
  • Use non-root users
  • Configure probes
  • Pin specific versions
  • Reference external secrets
  • Apply standard labels
  • Enable security contexts
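As an example of referencing an external secret instead of placing the value in values.yaml, a container can read from a Secret created outside the chart. In this sketch the values key existingSecret, the environment variable DB_PASSWORD, and the Secret key password are all hypothetical names.

```yaml
# templates/deployment.yaml (excerpt); existingSecret is a hypothetical values key
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        # Fails rendering with this message if the user did not supply a Secret name
        name: {{ required "existingSecret must name a pre-created Secret" .Values.existingSecret }}
        key: password
```

Only the Secret's name passes through the chart, so the sensitive value never appears in values files or rendered release manifests.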

Testing Checklist

Before deploying:

  • helm lint passes
  • helm template renders correctly
  • All required labels present
  • Security contexts configured
  • Resource limits defined
  • Health checks configured
  • NOTES.txt provides clear instructions
  • README documents all values
  • Dry run succeeds
  • Test deployment in dev environment

Validation Commands

# Lint chart
helm lint .

# Template rendering
helm template myrelease .

# Dry run
helm install myrelease . --dry-run --debug

# Install to test namespace
kubectl create ns test
helm install myrelease . -n test

# Verify
kubectl get all -n test
helm test myrelease -n test

# Cleanup
helm uninstall myrelease -n test
kubectl delete ns test
