If you’ve ever tried to drain a Kubernetes node in a homelab cluster and found yourself staring at a terminal that just… hangs, you’ve probably run into PodDisruptionBudget (PDB) conflicts. Here’s why it happens and how to fix it.
The Problem
I was upgrading my Kubernetes cluster from 1.34 to 1.35, which requires draining each node before upgrading. Simple enough, right?
kubectl drain k8s-worker01 --ignore-daemonsets --delete-emptydir-data
And then… nothing. The command just sat there. No error, no progress, just waiting.
Understanding PodDisruptionBudgets
PodDisruptionBudgets are Kubernetes’ way of ensuring high availability during voluntary disruptions (like node drains, upgrades, or scaling down). A PDB might say “always keep at least 1 replica running” or “never take down more than 25% of pods at once.”
Here’s a typical PDB:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app
This PDB says “there must always be at least 1 pod of my-app running.”
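The percentage-style rule mentioned earlier is written with maxUnavailable instead. A minimal sketch, reusing the same hypothetical app label:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  # Allow at most 25% of the selected pods to be disrupted at once
  maxUnavailable: "25%"
  selector:
    matchLabels:
      app: my-app

A PDB can set either minAvailable or maxUnavailable, but not both.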
The Homelab Catch-22
Here’s where homelabs differ from production clusters:
| Environment | Typical Setup | PDB Impact |
|---|---|---|
| Production | 3+ replicas across multiple nodes | Drain one node, others still serve traffic |
| Homelab | 1 replica (resource constraints) | Can’t drain without violating PDB |
Many Helm charts ship with PDBs by default. PostgreSQL, Redis, Kafka, Grafana - they all assume you’re running multiple replicas. In a homelab with limited resources, you’re often running single replicas of everything.
When you try to drain a node with a single-replica StatefulSet that has a PDB requiring minAvailable: 1, Kubernetes enters a deadlock:
- Drain command says: “I need to evict this pod”
- PDB says: “You can’t, minimum 1 must be available”
- Pod can’t move until it’s evicted
- Drain waits forever
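You can see this standoff in the PDB's status fields before you even start the drain. A quick check, assuming a PDB named postgres-pdb in the database namespace:

# disruptionsAllowed drops to 0 when evicting even one pod would violate the budget
kubectl get pdb postgres-pdb -n database \
  -o jsonpath='{.status.currentHealthy}/{.status.desiredHealthy} healthy, {.status.disruptionsAllowed} disruptions allowed{"\n"}'

With a single replica and minAvailable: 1, this prints 1/1 healthy, 0 disruptions allowed, which is exactly why the eviction can never succeed.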
The Solution: --disable-eviction
The --disable-eviction flag tells kubectl drain to delete pods directly instead of using the Eviction API:
kubectl drain k8s-worker01 \
--ignore-daemonsets \
--delete-emptydir-data \
--disable-eviction
What’s the Difference?
| Method | Respects PDBs | Use Case |
|---|---|---|
| Normal eviction | Yes | Production with proper replica counts |
| --disable-eviction | No | Homelabs, single-replica workloads |
With --disable-eviction, Kubernetes deletes the pod directly. The StatefulSet controller then recreates it, and the scheduler places it on another available node. Yes, there's brief downtime, but for a homelab that's usually acceptable.
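Under the hood, PDBs are only enforced by the Eviction API; a plain delete ignores them entirely. So the flag is roughly equivalent to deleting the blocked pod yourself (pod name and namespace here are illustrative):

# Direct deletion bypasses the PDB; the StatefulSet controller recreates the pod
kubectl delete pod postgres-0 -n database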
When to Use It
Use --disable-eviction when:
- You’re running single-replica StatefulSets
- You have PDBs that require minimum availability you can’t satisfy
- You’re in a homelab/dev environment where brief downtime is acceptable
- You need to perform maintenance and the drain is blocked
Don’t use it when:
- You’re in production and can’t afford any downtime
- You have enough replicas to satisfy PDBs
- The workload is critical and must remain available
Common Culprits
These Helm charts commonly include PDBs that will block drains in single-replica setups:
- PostgreSQL (Bitnami, Zalando) - Often defaults to minAvailable: 1
- Redis - Sentinel and cluster modes have PDBs
- Kafka - Strimzi and Confluent operators add PDBs
- Grafana - Has a PDB in production-ready configs
- Prometheus - Operator-managed instances often have PDBs
Finding Problematic PDBs
Before draining, check what PDBs exist:
kubectl get pdb --all-namespaces
Look for any where ALLOWED DISRUPTIONS is 0:
NAMESPACE    NAME           MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
database     postgres-pdb   1               N/A               0                     30d
monitoring   grafana-pdb    1               N/A               0                     30d
If allowed disruptions is 0 and you’re about to drain the node where that pod lives, you’ll get stuck.
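If you have jq installed, you can filter down to just the PDBs that will block a drain right now (a small sketch, not tied to any particular chart):

# List namespace/name of every PDB that currently allows zero disruptions
kubectl get pdb -A -o json \
  | jq -r '.items[] | select(.status.disruptionsAllowed == 0) | "\(.metadata.namespace)/\(.metadata.name)"'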
Alternative Approaches
Option 1: Disable PDBs in Helm Values
Many charts let you disable PDBs:
# PostgreSQL Bitnami chart
pdb:
  create: false
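The exact key differs between charts and chart versions, so check helm show values for the one you're running. Applying the override is a standard values-file upgrade; the release and chart names here are illustrative:

# Re-render the release with the PDB disabled
helm upgrade postgres bitnami/postgresql -f values.yaml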
Option 2: Increase Replica Count Temporarily
kubectl scale statefulset postgres --replicas=2
# Wait for second replica to be ready
kubectl drain k8s-worker01 --ignore-daemonsets
# Scale back down
kubectl scale statefulset postgres --replicas=1
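To avoid guessing at "wait for second replica to be ready", you can block on the rollout explicitly (a sketch using the same statefulset name):

# Returns once all replicas of the StatefulSet report Ready, or the timeout hits
kubectl rollout status statefulset/postgres --timeout=300s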
Option 3: Delete the PDB Temporarily
kubectl get pdb postgres-pdb -o yaml > /tmp/postgres-pdb.yaml
kubectl delete pdb postgres-pdb
kubectl drain k8s-worker01 --ignore-daemonsets
kubectl apply -f /tmp/postgres-pdb.yaml
Option 4: Just Use --disable-eviction
For homelabs, this is usually the simplest. Accept the brief downtime and move on.
My Upgrade Workflow
Here’s what I do when upgrading my homelab cluster:
# 1. Check what's running where
kubectl get pods -A -o wide | grep worker01
# 2. Check PDBs that might block
kubectl get pdb -A
# 3. Drain with disable-eviction
kubectl drain k8s-worker01 \
--ignore-daemonsets \
--delete-emptydir-data \
--disable-eviction
# 4. Perform upgrade
sudo dnf upgrade -y kubeadm kubelet kubectl
sudo kubeadm upgrade node   # update the local kubelet configuration for the new version
sudo systemctl restart kubelet
# 5. Uncordon
kubectl uncordon k8s-worker01
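A quick check afterwards confirms the node is Ready again and reporting the upgraded kubelet version:

kubectl get nodes -o wide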
The Ansible Way
If you’re using Ansible for cluster upgrades (and you should be), here’s how to incorporate it:
- name: Drain node
  command: >
    kubectl drain {{ inventory_hostname }}
    --ignore-daemonsets
    --delete-emptydir-data
    --disable-eviction
    --timeout=300s
  delegate_to: "{{ groups['control_plane'][0] }}"
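The --timeout keeps a stuck drain from hanging the whole playbook. A matching task to uncordon the node once the upgrade tasks have run, assuming the same inventory layout and control_plane group:

- name: Uncordon node
  command: kubectl uncordon {{ inventory_hostname }}
  delegate_to: "{{ groups['control_plane'][0] }}"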
Conclusion
PodDisruptionBudgets are a great feature for production clusters with proper redundancy. But in homelabs where resources are limited and single-replica workloads are common, they can turn a simple node drain into an infinite wait.
The --disable-eviction flag is your friend. It bypasses PDB checks and lets you get on with maintenance. Yes, you’ll have brief downtime for affected workloads, but that’s usually a fair trade-off for actually being able to maintain your cluster.
Remember:
- Check PDBs before draining: kubectl get pdb -A
- Use --disable-eviction when blocked by single-replica PDBs
- Accept brief downtime in exchange for operational simplicity
- Consider disabling PDBs in Helm values for homelab deployments
Your homelab doesn’t need to pretend it’s a Fortune 500 production cluster. Embrace the constraints and use the tools that make sense for your environment.