6.0 KiB
6.0 KiB
Helm Chart Best Practices
CNCF and Helm community standards for production-ready charts.
Chart Metadata Standards
Chart.yaml Requirements
apiVersion: v2(Helm 3)- Semantic versioning (version, appVersion)
- Meaningful description
- Keywords for discoverability
- Maintainer information
Naming Conventions
- Chart names: lowercase, hyphens (no underscores)
- Resource names:
{{ template "name.fullname" . }} - Avoid hardcoding names
Kubernetes Label Standards
Required labels (app.kubernetes.io/ namespace):*
labels:
app.kubernetes.io/name: {{ include "chart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
helm.sh/chart: {{ include "chart.chart" . }}
Selector labels (must be immutable):
selector:
matchLabels:
app.kubernetes.io/name: {{ include "chart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
Security Best Practices
Pod Security Context
podSecurityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
seccompProfile:
type: RuntimeDefault # Production
Container Security Context
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
capabilities:
drop:
- ALL
Security Guidelines
- Never run as root (UID 0)
- Drop all Linux capabilities by default
- Use read-only root filesystem when possible
- Apply seccomp profiles in production
- Avoid privileged containers
- Don't expose host ports or namespaces
Resource Management
Always Define Resources
resources:
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 50m
memory: 64Mi
Resource Sizing Guidelines
- Small apps: 50m CPU / 64Mi memory (requests)
- Medium apps: 100m CPU / 128Mi memory (requests)
- Large apps: 250m+ CPU / 256Mi+ memory (requests)
- Limits should be 2-10x requests
- Monitor and adjust based on actual usage
Health Checks
Liveness Probe
Detects when container needs restart:
livenessProbe:
httpGet:
path: /health
port: http
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
Readiness Probe
Detects when container can accept traffic:
readinessProbe:
httpGet:
path: /ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
Probe Best Practices
- Always define both liveness and readiness
- Use appropriate initialDelaySeconds for slow-starting apps
- Health endpoints should be lightweight
- Don't use same endpoint for liveness and readiness if startup is slow
Values.yaml Organization
Structure
# 1. Replica configuration
replicaCount: 1
# 2. Image configuration
image:
repository: example/app
pullPolicy: IfNotPresent
tag: "" # Defaults to Chart.appVersion
# 3. Service account
serviceAccount:
create: true
name: ""
# 4. Security contexts
podSecurityContext: {}
securityContext: {}
# 5. Service configuration
service:
type: ClusterIP
port: 80
# 6. Resources
resources: {}
# 7. Autoscaling
autoscaling:
enabled: false
# 8. Additional features (Ingress, ConfigMaps, etc.)
Documentation
- Comment every major section
- Provide examples for complex values
- Document accepted value types
- Explain default behavior
Template Best Practices
Use Helper Functions
# _helpers.tpl
{{- define "app.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}
Conditional Resources
{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
...
{{- end }}
Checksum Annotations
Force pod restart on config changes:
annotations:
checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
NOTES.txt Guidelines
Provide clear post-installation instructions:
1. How to access the application
2. Default credentials (if any)
3. Next steps for configuration
4. Links to documentation
5. Troubleshooting commands
Multi-Environment Patterns
Base + Override Pattern
values.yaml: Base defaultsvalues-dev.yaml: Development overridesvalues-prod.yaml: Production overrides
Environment-Specific Settings
- Dev: Debug enabled, minimal resources, verbose logging
- Staging: Production-like, moderate resources
- Prod: HA, autoscaling, security hardening, monitoring
Common Pitfalls to Avoid
❌ Don't:
- Hardcode values in templates
- Forget resource limits
- Run containers as root
- Skip health checks
- Use
latestimage tag - Expose secrets in values.yaml
- Create resources without labels
- Ignore security contexts
✅ Do:
- Use template functions
- Define all resources
- Use non-root users
- Configure probes
- Pin specific versions
- Reference external secrets
- Apply standard labels
- Enable security contexts
Testing Checklist
Before deploying:
helm lintpasseshelm templaterenders correctly- All required labels present
- Security contexts configured
- Resource limits defined
- Health checks configured
- NOTES.txt provides clear instructions
- README documents all values
- Dry run succeeds
- Test deployment in dev environment
Validation Commands
# Lint chart
helm lint .
# Template rendering
helm template myrelease .
# Dry run
helm install myrelease . --dry-run --debug
# Install to test namespace
kubectl create ns test
helm install myrelease . -n test
# Verify
kubectl get all -n test
helm test myrelease -n test
# Cleanup
helm uninstall myrelease -n test
kubectl delete ns test