A blog post was committed and pushed, CI built and pushed the image, but the deployed site showed old content. This post documents the debugging process and the fixes to prevent stale builds.
The Problem
After pushing a new blog post:
- GitLab CI pipeline succeeded
- Kaniko pushed the image to Harbor
- ArgoCD deployed the new image
- The blog showed old content - new post missing
Root Causes
Two caching layers caused the issue:
1. Kaniko Layer Cache
Kaniko caches intermediate build layers to speed up subsequent builds. The cache is stored in a separate repository:
/kaniko/executor \
--cache=true \
--cache-repo="${IMAGE}/cache"
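If the cache key matches, Kaniko reuses the cached layer instead of re-running the instruction. When that key reflects the Dockerfile instructions but not the content being built, a content-only change such as a new post can be served from a stale layer unless the cache is explicitly invalidated.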
2. Kubernetes Node Image Cache
Kubernetes nodes cache pulled images locally. With imagePullPolicy: IfNotPresent, if an image tag already exists in the node’s cache, the kubelet does not contact the registry at all, so it keeps using the cached image even when the registry holds a newer image under the same tag.
image: harbor.minoko.life/project/app:latest # Cached on node
imagePullPolicy: IfNotPresent # Won't re-pull
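To make the pull decision concrete, here is a small sketch of the rule described above. It is a toy model, not Kubernetes source code, and the function name contacts_registry is made up for illustration.

```shell
#!/bin/sh
# Toy model of the kubelet's pull decision; not Kubernetes source code.
# $1 = imagePullPolicy, $2 = "cached" if the tag already exists on the node.
contacts_registry() {
    policy=$1
    cached=$2
    case "$policy" in
        Always)
            echo yes ;;   # the tag is always resolved against the registry
        IfNotPresent)
            if [ "$cached" = cached ]; then echo no; else echo yes; fi ;;
        Never)
            echo no ;;    # only images already on the node are used
        *)
            echo "unknown policy: $policy" >&2
            return 1 ;;
    esac
}

contacts_registry IfNotPresent cached   # prints "no": the situation in this post
contacts_registry Always cached         # prints "yes"
```

With a mutable tag such as latest, the "no" case is exactly where a newer image in the registry goes unnoticed.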
The Debugging Process
Verify the image was built correctly
Check the CI job logs for the page count:
Pages │ 173 # Expected 176 with new post
Check what’s in the running container
kubectl exec -n minoko-life-blog deploy/minoko-life-blog -- \
ls /usr/share/nginx/html/posts/ | grep nginx-ingress
# No output - post missing
Compare image digests
Check what the pod is actually running:
kubectl get pod -n minoko-life-blog \
-o jsonpath='{.items[0].status.containerStatuses[0].imageID}'
# harbor.minoko.life/project/app@sha256:b45b611...
Check what’s in the registry:
curl -s -u "user:pass" -I \
-H "Accept: application/vnd.oci.image.manifest.v1+json" \
"https://harbor.minoko.life/v2/project/app/manifests/latest" \
| grep -i docker-content-digest
# sha256:310ebf8... # Different!
The digests differ - the registry has a newer image, but the node is using a cached older one.
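The comparison above can be scripted so it does not rely on eyeballing two hashes. The following is a minimal sketch. The helper names extract_digest and compare_digests are hypothetical, and the sample inputs are the truncated digests shown in this post, so they are only stand-ins for real values.

```shell
#!/bin/sh
# Keep only the "sha256:<hex>" part so an imageID and a response header compare equal.
extract_digest() {
    printf '%s\n' "$1" | tr -d '\r' | sed -n 's/.*\(sha256:[a-f0-9]*\).*/\1/p'
}

# Print "in sync" when both inputs carry the same digest, otherwise "stale".
compare_digests() {
    running=$(extract_digest "$1")
    registry=$(extract_digest "$2")
    if [ -n "$running" ] && [ "$running" = "$registry" ]; then
        echo "in sync"
    else
        echo "stale"
    fi
}

# Sample input only: the truncated digests from the commands above.
compare_digests "harbor.minoko.life/project/app@sha256:b45b611" \
    "docker-content-digest: sha256:310ebf8"
```

In practice the two arguments would be the output of the kubectl and curl commands above, captured with command substitution.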
The Fix
1. Proper Kaniko Cache Busting
The Kaniko cache uses Dockerfile instructions as cache keys. Build args declared with ARG before a layer become part of that layer’s cache key. Add the commit SHA as a build arg:
# Dockerfile
FROM docker.io/hugomods/hugo:exts-0.154.0 AS builder
# Cache-busting: declaring ARG before RUN makes it part of the cache key
ARG GIT_COMMIT
WORKDIR /src
COPY . .
RUN hugo --minify
Then pass the commit SHA in CI:
# .gitlab-ci.yml
/kaniko/executor \
--context=$CI_PROJECT_DIR \
--dockerfile=$CI_PROJECT_DIR/Dockerfile \
--build-arg=GIT_COMMIT=$CI_COMMIT_SHA \
--destination ${IMAGE}:${CI_COMMIT_SHORT_SHA} \
--cache=true \
--cache-repo="${IMAGE}/cache"
Each commit gets a unique GIT_COMMIT value, invalidating the cache for the Hugo build layer while still caching the base image layers.
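The effect of the build arg can be illustrated with a toy cache key. This sketch is an assumption made for illustration and is not how Kaniko derives its real keys. It only shows that once the commit SHA is part of the hashed input, two commits can no longer share a key. The function name and the two commit values are made up.

```shell
#!/bin/sh
# Toy cache key: a hash over the instruction text plus the build-arg value.
# Illustration only; this is not Kaniko's real key derivation.
toy_cache_key() {
    instruction=$1
    git_commit=$2
    printf '%s|GIT_COMMIT=%s' "$instruction" "$git_commit" | sha256sum | cut -d' ' -f1
}

# Hypothetical commit SHAs for two consecutive pushes.
key_a=$(toy_cache_key "RUN hugo --minify" "1111111")
key_b=$(toy_cache_key "RUN hugo --minify" "2222222")

if [ "$key_a" = "$key_b" ]; then
    echo "cache hit: layer reused"
else
    echo "cache miss: layer rebuilt"
fi
```

Without the GIT_COMMIT component, both calls would hash the same instruction text and produce the same key, which is the stale-layer case.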
2. Stop Using Mutable Tags
Never use latest or other mutable tags in production. Use immutable tags based on commit SHA:
# Before: pushes both SHA and latest
DESTINATIONS="--destination ${IMAGE}:${CI_COMMIT_SHORT_SHA}"
if [ "$CI_COMMIT_REF_NAME" = "main" ]; then
DESTINATIONS="${DESTINATIONS} --destination ${IMAGE}:latest"
fi
# After: only push SHA
--destination ${IMAGE}:${CI_COMMIT_SHORT_SHA}
3. Use imagePullPolicy: Always
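For deployments whose images are updated frequently, set the pull policy to Always. The kubelet then resolves the tag against the registry every time a container starts and only reuses the node's copy if its digest still matches: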
# values.yaml
image:
repository: harbor.minoko.life/project/app
pullPolicy: Always
tag: "" # No default - must be set explicitly
4. Require Explicit Tags
Add validation to Helm templates to prevent deployment without a tag:
# templates/deployment.yaml
{{- if not .Values.image.tag }}
{{- fail "image.tag is required - must be set by ArgoCD Image Updater" }}
{{- end }}
5. Configure ArgoCD Image Updater
Update the ImageUpdater to discover tags by commit SHA pattern:
# image-updater.yaml
images:
- alias: blog
imageName: harbor.minoko.life/project/app # No tag
commonUpdateSettings:
allowTags: "regexp:^[a-f0-9]{7,8}$" # Match git short SHA
updateStrategy: newest-build
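Before relying on the pattern, it can help to check which tags it would accept. The sketch below applies the same expression with grep -E. Image Updater evaluates it as a Go regular expression, but for a pattern this simple the two should behave the same. The function name is_allowed_tag and the sample tags are made up for illustration.

```shell
#!/bin/sh
# Same pattern as allowTags above, without the "regexp:" prefix.
TAG_PATTERN='^[a-f0-9]{7,8}$'

# Succeeds only for tags that look like a git short SHA.
is_allowed_tag() {
    printf '%s\n' "$1" | grep -Eq "$TAG_PATTERN"
}

for tag in latest 310ebf8a v1.2.3 b45b611; do
    if is_allowed_tag "$tag"; then
        echo "$tag: allowed"
    else
        echo "$tag: ignored"
    fi
done
```

This prints "ignored" for latest and v1.2.3 and "allowed" for the two hex tags, so a stray mutable tag in the repository would not be picked up.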
Clearing Stale Cache
If you need to clear existing cache:
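Delete Kaniko cache repository
# Harbor's API docs ask for a slash in the repository name (app/cache) to be URL-encoded twice: / -> %2F -> %252F
curl -u "admin:password" -X DELETE \
"https://harbor.minoko.life/api/v2.0/projects/project/repositories/app%252Fcache"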
Force Kubernetes to pull new image
Option 1: Use digest instead of tag:
kubectl set image deployment/app app=harbor.minoko.life/project/app@sha256:310ebf8...
Option 2: With imagePullPolicy: Always set, delete the pod so that its replacement pulls the image again (adjust the label selector to match your pods):
kubectl delete pod -l app=myapp
Summary
| Setting | Problematic | Recommended |
|---|---|---|
| Kaniko cache | No cache busting | ARG GIT_COMMIT + --build-arg |
| Image tag | latest | ${CI_COMMIT_SHORT_SHA} |
| Pull policy | IfNotPresent | Always |
| Default tag | tag: "latest" | tag: "" (none) |
| Tag validation | None | Helm fail if empty |
The combination of proper cache busting, immutable tags, and imagePullPolicy: Always ensures that every deployment uses exactly the image that was built, with no ambiguity from caching at any layer.