GitOps Troubleshooting Guide (2024-2025)
Common ArgoCD Issues
1. Application OutOfSync
Symptoms: Application shows OutOfSync status
Causes: Git changes not applied, manual cluster changes
Fix:
argocd app sync my-app
argocd app diff my-app # See differences
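The two commands above can be combined so that a sync only runs when there is real drift. The following is a minimal sketch, assuming `argocd app diff` follows its documented exit codes (0 for no diff, 1 for a diff, 2 for an error). The function name `sync_if_drifted` is hypothetical, not part of the CLI.

```shell
#!/bin/sh
# sync_if_drifted is a hypothetical helper name, not part of the argocd CLI.
# Assumes `argocd app diff` exits 0 (no diff), 1 (diff found), or 2 (error).
sync_if_drifted() {
  app="$1"
  rc=0
  argocd app diff "$app" || rc=$?
  case "$rc" in
    0) echo "in sync: $app" ;;
    1) echo "drift detected, syncing: $app"
       argocd app sync "$app" ;;
    *) echo "diff failed for $app (exit $rc)" >&2
       return "$rc" ;;
  esac
}
```

Usage: `sync_if_drifted my-app`. Review the printed diff before relying on this in automation, since a manual cluster change may be the version you want to keep.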
2. Annotation Tracking Migration (ArgoCD 3.x)
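Symptoms: Resources not tracked after upgrade to 3.x
Cause: Default tracking method changed from labels to annotations
Fix: Resources auto-migrate on the next sync. To trigger one immediately:
argocd app sync my-app --force
# Note: --force performs a force apply, which may delete and recreate resources; try a plain `argocd app sync my-app` first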
3. Sync Fails with "Resource is Invalid"
Cause: YAML validation error, CRD mismatch
Fix:
argocd app get my-app --show-operation
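kubectl apply --dry-run=client -f manifest.yaml # Test locally
kubectl apply --dry-run=server -f manifest.yaml # Run API-server validation and admission without persisting anything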
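When an application is made of many manifests, the same dry run can be looped over a directory to find the file that fails. This is a sketch only: `validate_dir` is a hypothetical helper, and it assumes the manifests are plain `.yaml` files in a single directory.

```shell
#!/bin/sh
# validate_dir is a hypothetical helper. It runs a client-side dry run on each
# .yaml file in a directory and prints the files that kubectl rejects.
validate_dir() {
  dir="$1"
  failed=0
  for f in "$dir"/*.yaml; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    if ! kubectl apply --dry-run=client -f "$f" >/dev/null 2>&1; then
      echo "invalid: $f"
      failed=1
    fi
  done
  return "$failed"
}
```

Usage: `validate_dir ./manifests`. Rerun `kubectl apply --dry-run=client -f <file>` on any reported file to see the actual error message.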
4. Image Pull Errors
Cause: Registry credentials, network issues
Fix:
kubectl get events -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
# Check image pull secret
kubectl get secret -n <namespace>
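Listing every secret in the namespace does not show which one the pod actually uses. The sketch below reads the pod's `spec.imagePullSecrets` and checks that each referenced secret exists. `check_pull_secrets` is a hypothetical helper name, and it does not verify that the credentials inside the secret are valid.

```shell
#!/bin/sh
# check_pull_secrets is a hypothetical helper. It reads the pod's
# spec.imagePullSecrets and reports any referenced secret that is missing.
check_pull_secrets() {
  pod="$1"; ns="$2"
  names=$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.imagePullSecrets[*].name}')
  if [ -z "$names" ]; then
    echo "no imagePullSecrets set on pod $pod"
    return 0
  fi
  missing=0
  for s in $names; do
    if kubectl get secret "$s" -n "$ns" >/dev/null 2>&1; then
      echo "found: $s"
    else
      echo "missing: $s"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_pull_secrets <pod-name> <namespace>`.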
Common Flux Issues
1. GitRepository Not Ready
Symptoms: source not ready, no artifact
Causes: Auth failure, branch doesn't exist
Fix:
flux get sources git
flux reconcile source git <name> -n flux-system
kubectl describe gitrepository <name> -n flux-system
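To rule out the "branch doesn't exist" cause, the remote can be queried directly from a workstation. This sketch uses `git ls-remote --exit-code`, which returns a non-zero status when no matching ref is found. `branch_exists` is a hypothetical helper, and it uses your local Git credentials rather than the ones Flux is configured with.

```shell
#!/bin/sh
# branch_exists is a hypothetical helper. `git ls-remote --exit-code` exits
# non-zero when no matching ref is found on the remote.
branch_exists() {
  url="$1"; branch="$2"
  if git ls-remote --exit-code --heads "$url" "refs/heads/$branch" >/dev/null 2>&1; then
    echo "branch '$branch' exists"
  else
    echo "branch '$branch' not found, or the remote is unreachable"
    return 1
  fi
}
```

Usage: `branch_exists <repo-url> main`.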
2. Kustomization Build Failed
Cause: Invalid kustomization.yaml, missing resources
Fix:
flux get kustomizations
kubectl describe kustomization <name> -n flux-system
# Test locally
kustomize build <path>
3. HelmRelease Install Failed
Cause: Values error, chart incompatibility
Fix:
flux get helmreleases
kubectl logs -n flux-system -l app=helm-controller
# Test locally
helm template <chart> -f values.yaml
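Once the chart renders locally, the release can be retried without waiting for the next interval. The sketch below chains the local render with `flux reconcile helmrelease`. `render_then_reconcile` is a hypothetical helper, and the `flux-system` namespace is an assumption carried over from the other commands in this guide.

```shell
#!/bin/sh
# render_then_reconcile is a hypothetical helper. It renders the chart locally
# and only asks Flux to retry the release when rendering succeeds.
render_then_reconcile() {
  release="$1"; chart="$2"; values="$3"
  if ! helm template "$chart" -f "$values" >/dev/null; then
    echo "chart does not render with $values; fix the values before retrying" >&2
    return 1
  fi
  # -n flux-system is an assumption; use the namespace of your HelmRelease.
  flux reconcile helmrelease "$release" -n flux-system --with-source
}
```

Usage: `render_then_reconcile <name> <chart> values.yaml`.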
4. OCI Repository Issues (Flux 2.6+)
Cause: Registry auth, OCI artifact not found
Fix:
flux get sources oci
kubectl describe ocirepository <name> -n flux-system
# Verify artifact exists
crane digest ghcr.io/org/app:v1.0.0
SOPS Decryption Failures
Symptom: Secret not decrypted
Fix:
# Check age secret exists
kubectl get secret sops-age -n flux-system
# Test decryption locally
export SOPS_AGE_KEY_FILE=key.txt
sops -d secret.enc.yaml
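It can help to separate two cases: the file was committed unencrypted, or it is encrypted but the local key cannot decrypt it. The sketch below does that with a simple text check followed by `sops -d`. Both function names are hypothetical, and the `^sops:` check only looks for the metadata key SOPS adds to encrypted YAML.

```shell
#!/bin/sh
# is_sops_encrypted is a hypothetical helper. It only looks for the top-level
# `sops:` metadata key; it does not prove the file can be decrypted.
is_sops_encrypted() {
  grep -q '^sops:' "$1"
}

# check_secret_file is a hypothetical helper. It returns 1 when the file has
# no SOPS metadata and 2 when sops cannot decrypt it with the current key.
check_secret_file() {
  f="$1"
  if ! is_sops_encrypted "$f"; then
    echo "not encrypted: $f"
    return 1
  fi
  if sops -d "$f" >/dev/null 2>&1; then
    echo "decrypts OK: $f"
  else
    echo "encrypted but cannot decrypt with current key: $f"
    return 2
  fi
}
```

Usage: `SOPS_AGE_KEY_FILE=key.txt check_secret_file secret.enc.yaml`.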
Performance Issues
ArgoCD Slow Syncs
Cause: Too many resources, inefficient queries
Fix (ArgoCD 3.x):
- Use default resource exclusions
- Enable server-side diff
- Increase controller replicas
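Server-side diff can be tried on a single Application before changing it globally, by setting the `argocd.argoproj.io/compare-options` annotation. The following is a sketch: `enable_server_side_diff` is a hypothetical helper, and the `argocd` namespace is an assumption about where your Applications live. If the Application is itself managed from Git, set the annotation there instead, or the change will be reverted.

```shell
#!/bin/sh
# enable_server_side_diff is a hypothetical helper. It sets the
# argocd.argoproj.io/compare-options annotation on one Application.
# The `argocd` namespace is an assumption; adjust it for your install.
enable_server_side_diff() {
  app="$1"
  kubectl -n argocd annotate applications.argoproj.io "$app" \
    argocd.argoproj.io/compare-options=ServerSideDiff=true --overwrite
}
```

Usage: `enable_server_side_diff my-app`, then compare the refresh time with `argocd app get my-app --refresh`.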
Flux Slow Reconciliation
Cause: Large monorepos, many sources
Fix (Flux 2.7+):
- Enable source-watcher
- Increase the reconciliation interval (spec.interval)
- Use OCI artifacts instead of Git
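To test the effect of a longer interval before committing it, `spec.interval` can be patched in place. This is a sketch only: `set_kustomization_interval` is a hypothetical helper, and if the Kustomization is itself reconciled from Git, Flux will restore the committed value, so the permanent change belongs in the repository.

```shell
#!/bin/sh
# set_kustomization_interval is a hypothetical helper. It patches
# spec.interval on a Flux Kustomization in the flux-system namespace.
set_kustomization_interval() {
  name="$1"; interval="$2"
  kubectl -n flux-system patch kustomization "$name" \
    --type merge -p "{\"spec\":{\"interval\":\"$interval\"}}"
}
```

Usage: `set_kustomization_interval <name> 30m`.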
Debugging Commands
ArgoCD:
argocd app get <app> --refresh
argocd app logs <app>
kubectl logs -n argocd -l app.kubernetes.io/name=argocd-application-controller
Flux:
flux logs --all-namespaces
flux check
flux get all
kubectl -n flux-system get events --sort-by='.lastTimestamp'
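The Flux commands above can be captured into files in one pass, which is convenient when attaching output to an issue. The following is a sketch, with `collect_flux_diagnostics` as a hypothetical helper name. Each command is allowed to fail so that a broken controller does not stop the rest of the collection.

```shell
#!/bin/sh
# collect_flux_diagnostics is a hypothetical helper. It saves the output of
# the Flux debugging commands into one directory, one file per command.
collect_flux_diagnostics() {
  out="$1"
  mkdir -p "$out"
  # `|| true` keeps collecting even when a single command fails.
  flux check > "$out/check.txt" 2>&1 || true
  flux get all > "$out/get-all.txt" 2>&1 || true
  flux logs --all-namespaces > "$out/logs.txt" 2>&1 || true
  kubectl -n flux-system get events --sort-by='.lastTimestamp' \
    > "$out/events.txt" 2>&1 || true
  echo "diagnostics written to $out"
}
```

Usage: `collect_flux_diagnostics ./flux-diag`. Review the files for secrets or internal hostnames before sharing them.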
Quick Wins
- Use --dry-run before applying
- Check controller logs first
- Verify RBAC permissions
- Test manifests locally (kubectl apply --dry-run, kustomize build)
- Check Git connectivity (credentials, network)
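For the RBAC check, `kubectl auth can-i` with impersonation shows whether a controller's service account may perform an action. This is a minimal sketch: `controller_can` is a hypothetical helper, and the service account names in the usage line assume default installs (`argocd-application-controller` in `argocd`, `kustomize-controller` in `flux-system`). Impersonation itself requires sufficient privileges for the caller.

```shell
#!/bin/sh
# controller_can is a hypothetical helper wrapping `kubectl auth can-i` with
# service account impersonation. $1 is "<namespace>:<serviceaccount>".
controller_can() {
  sa="$1"; verb="$2"; resource="$3"; ns="$4"
  kubectl auth can-i "$verb" "$resource" -n "$ns" \
    --as="system:serviceaccount:$sa"
}
```

Usage: `controller_can argocd:argocd-application-controller create deployments <namespace>` or `controller_can flux-system:kustomize-controller create deployments <namespace>`. The command prints `yes` or `no` and exits non-zero when the action is denied.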