etcd is the heart of a Kubernetes cluster - it stores all cluster state including deployments, secrets, configmaps, and PVC definitions. Losing etcd means losing your entire cluster configuration. Yet many homelab setups neglect etcd backups until it’s too late.
This post walks through setting up automated etcd backups using a Kubernetes CronJob that uploads snapshots to MinIO.
The Challenge
On a kubeadm-style cluster, etcd runs as a static pod on the control plane node, which makes backing it up trickier than a regular application:
- It requires TLS certificates to connect
- The etcdctl tool needs to run with access to those certs
- The backup job must run on the control plane node
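Missing or unreadable certificates are an easy way for a backup to fail quietly, so a pre-flight check can help. The sketch below assumes the kubeadm default layout used later in this post (ca.crt, server.crt, and server.key under /etc/kubernetes/pki/etcd). The function name check_etcd_certs is made up for illustration and is not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: confirm the etcd certificate files exist and are
# readable before attempting a snapshot.
# Assumes kubeadm's default file names; adjust for other distributions.
check_etcd_certs() {
  cert_dir="$1"
  missing=0
  for f in ca.crt server.crt server.key; do
    if [ ! -r "$cert_dir/$f" ]; then
      echo "missing or unreadable: $cert_dir/$f" >&2
      missing=1
    fi
  done
  # Exit status 0 only when all three files were found.
  return "$missing"
}
```

A backup script could call it as check_etcd_certs /etc/kubernetes/pki/etcd || exit 1 before running etcdctl.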
The Solution
A CronJob that:
- Runs on the control plane using nodeSelector and tolerations
- Uses hostNetwork: true to connect to etcd on localhost
- Mounts the etcd certificates from the host
- Downloads etcdctl and mc (MinIO client) via an init container
- Creates a snapshot, compresses it, and uploads to MinIO
Implementation
The CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 1 * * *" # Daily at 1 AM
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
              effect: NoSchedule
          hostNetwork: true
          initContainers:
            - name: download-tools
              image: alpine:latest
              command:
                - /bin/sh
                - -c
                - |
                  wget -q https://dl.min.io/client/mc/release/linux-amd64/mc -O /tools/mc
                  chmod +x /tools/mc
                  ETCD_VER=v3.5.12
                  wget -q https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -O /tmp/etcd.tar.gz
                  tar xzf /tmp/etcd.tar.gz -C /tmp
                  cp /tmp/etcd-${ETCD_VER}-linux-amd64/etcdctl /tools/
              volumeMounts:
                - name: tools
                  mountPath: /tools
          containers:
            - name: backup
              image: alpine:latest
              command:
                - /bin/sh
                - -c
                - |
                  set -e
                  /tools/mc alias set minio http://<MINIO_IP>:9000 "$MINIO_ACCESS_KEY" "$MINIO_SECRET_KEY"
                  /tools/mc mb --ignore-existing minio/etcd-backups
                  BACKUP_FILE="etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"
                  ETCDCTL_API=3 /tools/etcdctl snapshot save /tmp/$BACKUP_FILE \
                    --endpoints=https://127.0.0.1:2379 \
                    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                    --cert=/etc/kubernetes/pki/etcd/server.crt \
                    --key=/etc/kubernetes/pki/etcd/server.key
                  gzip /tmp/$BACKUP_FILE
                  /tools/mc cp /tmp/${BACKUP_FILE}.gz minio/etcd-backups/
                  /tools/mc rm --older-than 7d --recursive --force minio/etcd-backups/ || true
                  echo "Backup completed: ${BACKUP_FILE}.gz"
              env:
                - name: MINIO_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: minio-backup-credentials
                      key: access-key
                - name: MINIO_SECRET_KEY
                  valueFrom:
                    secretKeyRef:
                      name: minio-backup-credentials
                      key: secret-key
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
                - name: tools
                  mountPath: /tools
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
                type: Directory
            - name: tools
              emptyDir: {}
Key Points
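hostNetwork: true - Required because the job connects to etcd at 127.0.0.1:2379, and that loopback address is only reachable from the host’s network namespace. Without host networking, the pod can’t reach it.
nodeSelector and tolerations - Ensure the job runs on the control plane node, where etcd lives and the certificates are accessible.
Init container pattern - Downloads tools at runtime rather than baking them into an image. This keeps things simple: the MinIO client is always the latest release, while etcdctl is pinned through ETCD_VER and is best kept in line with the cluster’s etcd version.
7-day retention - The mc rm --older-than 7d command cleans up old backups automatically. The trailing || true keeps a failed cleanup from failing the whole backup job.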
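One caveat with the init container pattern is that the downloaded binaries run without any integrity check. A minimal sketch of a checksum guard follows. Here verify_sha256 is a hypothetical helper, and the expected digest would have to come from a source you trust; neither is part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's SHA-256 digest with an expected value.
# Returns 0 on a match, 1 on a mismatch or if the file cannot be read.
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <file>"; keep only the digest.
  actual=$(sha256sum "$file" 2>/dev/null | cut -d ' ' -f 1)
  if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
    return 0
  fi
  echo "checksum mismatch for $file" >&2
  return 1
}
```

In the init container this could look like verify_sha256 /tmp/etcd.tar.gz "$ETCD_SHA256" || exit 1, where ETCD_SHA256 is an assumed variable holding the digest for the pinned release.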
Creating the Credentials Secret
kubectl create secret generic minio-backup-credentials -n kube-system \
  --from-literal=access-key=YOUR_ACCESS_KEY \
  --from-literal=secret-key=YOUR_SECRET_KEY
Testing
Trigger a manual backup:
kubectl create job --from=cronjob/etcd-backup etcd-backup-test -n kube-system
kubectl logs -f job/etcd-backup-test -n kube-system
Verify in MinIO:
mc ls minio/etcd-backups/
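Listing the bucket only shows that a file exists. As a rough sketch, a downloaded snapshot can also be checked for a non-empty, intact gzip stream. The name check_snapshot_archive is illustrative, not something the CronJob provides.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a compressed snapshot.
# Passes only if the file is non-empty and gzip can read it end to end.
check_snapshot_archive() {
  archive="$1"
  if [ ! -s "$archive" ]; then
    echo "empty or missing: $archive" >&2
    return 1
  fi
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "corrupt gzip stream: $archive" >&2
    return 1
  fi
  return 0
}
```

This only proves the archive is readable. Running etcdctl snapshot status against the decompressed file is a stronger check if etcdctl is available on the machine.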
Restoring from Backup
If disaster strikes:
# Download backup
mc cp minio/etcd-backups/etcd-snapshot-YYYYMMDD-HHMMSS.db.gz /tmp/
gunzip /tmp/etcd-snapshot-*.db.gz
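# On the control plane node:
systemctl stop kubelet
# Note: stopping kubelet does not stop an etcd container that is already running.
# Stop it through the container runtime (for example with crictl) before moving the data directory.
# Keep the current data as a fallback
mv /var/lib/etcd /var/lib/etcd.bak
# Restore the snapshot into a fresh data directory
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-snapshot-*.db \
  --data-dir=/var/lib/etcd
# Restart
systemctl start kubelet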
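The wildcard in the restore command matches every snapshot sitting in /tmp, so it helps to pick one file explicitly. Because the names embed a YYYYMMDD-HHMMSS timestamp, sorting them alphabetically also sorts them by time. A small sketch, where latest_snapshot is a hypothetical helper that reads file names from standard input:

```shell
#!/bin/sh
# Hypothetical helper: print the newest snapshot name from a list on stdin.
# Relies on the etcd-snapshot-YYYYMMDD-HHMMSS naming used by the CronJob;
# names that do not match the pattern are ignored.
latest_snapshot() {
  grep '^etcd-snapshot-[0-9]\{8\}-[0-9]\{6\}\.db' | LC_ALL=C sort | tail -n 1
}
```

For example, ls /tmp | latest_snapshot prints the most recent snapshot file in /tmp, which can then be passed to the restore command by name.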
Backup Schedule
I run all my backups in sequence overnight:
| Time | Service |
|---|---|
| 1:00 AM | etcd |
| 2:00 AM | PostgreSQL |
| 3:00 AM | ImmuDB |
| 4:00 AM | MinIO → Scaleway (off-site) |
The off-site sync at 4 AM catches all the night’s backups and replicates them to Scaleway S3, providing geographic redundancy.
Conclusion
etcd backups are essential insurance for any Kubernetes cluster. With this CronJob approach, you get:
- Automated daily backups
- Compression to save storage
- Automatic retention management
- Integration with existing MinIO/S3 infrastructure
- Off-site replication (when combined with MinIO mirroring)
The whole setup takes about 15 minutes and could save hours of rebuilding your cluster from scratch.