GitOps Best Practices (2024-2025)
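CNCF GitOps Principles (OpenGitOps v1.0.0)
- Declarative: The system's desired state is expressed declaratively
- Versioned and Immutable: Desired state is stored with a complete, immutable version history (typically Git)
- Pulled Automatically: Software agents automatically pull the desired state from the source
- Continuously Reconciled: Software agents continuously observe the actual state and apply the desired state
- Auditable (a consequence of versioning, not a separate OpenGitOps principle): All changes are tracked in Git history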
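To make the reconciliation idea concrete, here is a minimal sketch of the comparison a GitOps agent performs on each loop. It is not how ArgoCD or Flux are implemented; the function name `plan_reconcile` and the flat dictionary model of cluster state are assumptions made for illustration.

```python
def plan_reconcile(desired: dict, actual: dict) -> dict:
    """Compare desired state (from Git) with actual state (from the cluster).

    Both arguments map a resource key to its spec. The result lists the
    keys that would need to be created, updated, or deleted to remove drift.
    """
    to_create = sorted(k for k in desired if k not in actual)
    to_update = sorted(k for k in desired if k in actual and desired[k] != actual[k])
    to_delete = sorted(k for k in actual if k not in desired)
    return {"create": to_create, "update": to_update, "delete": to_delete}


# Hypothetical resources, used only to exercise the function.
desired = {"deploy/web": {"image": "myapp:v1.2.3"}, "svc/web": {"port": 80}}
actual = {"deploy/web": {"image": "myapp:v1.2.2"}, "cm/old": {"k": "v"}}
plan = plan_reconcile(desired, actual)
```

An empty plan means the cluster matches Git; anything else is drift that the agent would correct on its next pass.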
Repository Organization
✅ DO:
- Separate infrastructure from applications
- Use clear directory structure (apps/, infrastructure/, clusters/)
- Implement environment promotion (dev → staging → prod)
- Use Kustomize overlays for environment differences
❌ DON'T:
- Commit plain-text secrets to Git (use SOPS, Sealed Secrets, or External Secrets Operator)
- Use :latest image tags (pin to specific versions)
- Make manual cluster changes (everything through Git)
- Skip testing in lower environments
Security Best Practices
- Secrets: Never plain text, use encryption or external stores
- RBAC: Least privilege for GitOps controllers
- Image Security: Pin to digests, scan for vulnerabilities
- Network Policies: Restrict controller traffic
- Audit: Enable audit logging
ArgoCD 3.x Specific
Fine-Grained RBAC (stricter in 3.0: update/delete on an application no longer extend to its resources):
p, role:dev, applications, *, dev/*, allow
p, role:dev, applications, update/apps/Deployment/*/*, dev/*, allow
Resource Exclusions (default in 3.0):
- Reduces API load
- Excludes high-churn resources (Endpoints, Leases)
Annotation Tracking (default):
- More reliable than labels
- Auto-migrates on sync
Flux 2.7 Specific
OCI Artifacts (GA in 2.6):
- Prefer OCI over Git for generated configs
- Use digest pinning for immutability
- Sign artifacts with cosign/notation
Image Automation (GA in 2.7):
- Automated image updates
- Writes updated image tags back to the Git repository
Source-Watcher (new in 2.7):
- Improves reconciliation efficiency
- Enable with: --components-extra=source-watcher (a flag of flux bootstrap and flux install)
CI/CD Integration
Git Workflow:
1. Developer commits to feature branch
2. CI runs tests, builds image
3. CI updates Git manifest with new image tag
4. Developer creates PR to main
5. GitOps controller syncs after merge
Don't: Deploy directly from CI to the cluster (breaks GitOps)
Do: Update Git from CI and let the GitOps controller deploy
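Step 3 of the workflow (CI updates the manifest in Git) can be sketched as a small text rewrite. This is one possible approach, assuming a plain YAML manifest with `image: name:tag` lines; the helper name `set_image_tag` and the sample manifest are hypothetical, and many teams use `kustomize edit set image` or a YAML-aware tool instead.

```python
import re


def set_image_tag(manifest: str, image: str, new_tag: str) -> str:
    """Replace the tag of `image` on every matching `image:` line.

    Digest-pinned references (name@sha256:...) are left untouched because
    the pattern requires a ':' directly after the image name.
    """
    pattern = re.compile(
        r"^([ \t]*(?:-[ \t]+)?image:[ \t]*" + re.escape(image) + r"):[^\s@]+[ \t]*$",
        re.MULTILINE,
    )
    return pattern.sub(lambda m: f"{m.group(1)}:{new_tag}", manifest)


# Hypothetical manifest fragment used to exercise the function.
manifest = """\
containers:
  - name: web
    image: myapp:v1.2.3
  - name: sidecar
    image: proxy:v0.9.0
"""
updated = set_image_tag(manifest, "myapp", "v1.2.4")
```

CI would then commit `updated` to the repository; the cluster is changed only when the GitOps controller picks up that commit.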
Monitoring & Observability
Track:
- Sync success rate
- Reconciliation time
- Drift detection frequency
- Failed syncs/reconciliations
Tools:
- Prometheus metrics (both ArgoCD and Flux)
- Grafana dashboards
- Alert on sync failures
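As a rough illustration of the first and last tracked items, the sketch below computes a sync success rate and lists failing apps from a batch of sync records. In practice these numbers would come from the controllers' Prometheus metrics; the `SyncResult` type and its fields are assumptions made for this example, not part of either tool.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    app: str
    succeeded: bool
    duration_seconds: float


def sync_success_rate(results: list) -> float:
    """Fraction of syncs that succeeded; 1.0 when there is nothing to count."""
    if not results:
        return 1.0
    return sum(r.succeeded for r in results) / len(results)


def failed_apps(results: list) -> list:
    """Sorted, de-duplicated names of apps with at least one failed sync."""
    return sorted({r.app for r in results if not r.succeeded})


# Hypothetical sample data.
results = [
    SyncResult("web", True, 12.0),
    SyncResult("api", True, 8.5),
    SyncResult("jobs", False, 30.0),
    SyncResult("web", True, 11.0),
]
rate = sync_success_rate(results)
failing = failed_apps(results)
```

An alert rule would typically fire when the rate drops below a chosen threshold or when the failing list is non-empty for longer than one reconciliation interval.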
Image Management
✅ Good:
image: myapp:v1.2.3
image: myapp@sha256:abc123...
❌ Bad:
image: myapp:latest
image: myapp:dev
Strategy: Semantic versioning + digest pinning
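The good/bad distinction above can be checked mechanically, for example in a CI lint step. The following is a minimal sketch, assuming that "pinned" means either a full sha256 digest or a semantic-version tag; the function name `is_pinned` and both regular expressions are illustrative choices, not a standard.

```python
import re

# Assumed policy: vMAJOR.MINOR.PATCH with an optional pre-release/build suffix.
SEMVER_TAG = re.compile(r"^v?\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?$")
# A complete sha256 digest is 64 lowercase hex characters.
DIGEST = re.compile(r"^sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """Return True if the reference uses a full digest or a semver tag."""
    if "@" in image_ref:
        digest = image_ref.rsplit("@", 1)[1]
        return bool(DIGEST.match(digest))
    # Only look for a tag after the last '/', so a registry port such as
    # registry:5000/myapp is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means the implicit :latest
    tag = last_segment.rsplit(":", 1)[1]
    return bool(SEMVER_TAG.match(tag))
```

Note that the truncated digest shown above (`sha256:abc123...`) is a placeholder; this check only accepts a complete 64-character digest.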
Environment Promotion
Recommended Flow:
Dev (auto-sync) → Staging (auto-sync) → Production (manual approval)
Implementation:
- Separate directories or repos per environment
- PR-based promotion
- Automated tests before promotion
- Manual approval for production
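The promotion rules above can be expressed as a small gate, for instance inside a PR check. This is a sketch under stated assumptions: the environment names follow the recommended flow, while `next_environment`, `can_promote`, and the `approved_by` parameter are hypothetical names introduced here.

```python
PROMOTION_ORDER = ["dev", "staging", "prod"]
MANUAL_APPROVAL = {"prod"}  # environments that require a human sign-off


def next_environment(current: str):
    """Return the environment after `current`, or None at the end of the chain."""
    idx = PROMOTION_ORDER.index(current)
    if idx + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[idx + 1]
    return None


def can_promote(current: str, tests_passed: bool, approved_by=None) -> bool:
    """Allow promotion only after tests pass, plus approval where required."""
    target = next_environment(current)
    if target is None or not tests_passed:
        return False
    if target in MANUAL_APPROVAL:
        return approved_by is not None
    return True
```

For example, `can_promote("dev", tests_passed=True)` allows the move to staging, while `can_promote("staging", tests_passed=True)` stays blocked until an approver is supplied.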
Disaster Recovery
- Git is Source of Truth: Cluster can be rebuilt from Git
- Backup: Git repo + cluster state
- Test Recovery: Practice cluster rebuild
- Document Bootstrap: How to restore from scratch
Performance Optimization
ArgoCD:
- Use ApplicationSets for multi-cluster
- Enable resource exclusions (3.x default)
- Server-side diff for large apps
Flux:
- Use OCI artifacts for large repos
- Enable source-watcher (2.7)
- Tune reconciliation intervals
Common Anti-Patterns to Avoid
- Manual kubectl apply: Bypasses GitOps, creates drift
- Multiple sources of truth: Git should be the only source
- Secrets in Git: Always encrypt
- Direct cluster modifications: All changes through Git
- No testing: Always test in dev/staging first
- Missing RBAC: Controllers need minimal permissions
2025 Trends
✅ Adopt:
- OCI artifacts (Flux)
- Workload identity (no static credentials)
- SOPS + age (over PGP)
- External Secrets Operator (dynamic secrets)
- Multi-cluster with ApplicationSets/Flux
⚠️ Avoid:
- Label-based tracking (use annotations - ArgoCD 3.x default)
- PGP encryption (use age)
- Long-lived service account tokens (use workload identity)