If you’ve ever tried to drain a Kubernetes node in a homelab cluster and found yourself staring at a terminal that just… hangs, you’ve probably run into PodDisruptionBudget (PDB) conflicts. Here’s why it happens and how to fix it.

The Problem

I was upgrading my Kubernetes cluster from 1.34 to 1.35, which requires draining each node before upgrading. Simple enough, right?

kubectl drain k8s-worker01 --ignore-daemonsets --delete-emptydir-data

And then… nothing. The command just sat there. No error, no progress, just waiting.
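
If you'd rather have the drain fail loudly than sit there, you can give it a deadline; once the timeout expires, kubectl gives up with an error that points at the pods it couldn't evict:

kubectl drain k8s-worker01 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --timeout=60s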

Understanding PodDisruptionBudgets

PodDisruptionBudgets are Kubernetes’ way of ensuring high availability during voluntary disruptions (like node drains, upgrades, or scaling down). A PDB might say “always keep at least 1 replica running” or “never take down more than 25% of pods at once.”

Here’s a typical PDB:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-app

This PDB says “there must always be at least 1 pod of my-app running.”

The Homelab Catch-22

Here’s where homelabs differ from production clusters:

Environment   Typical Setup                       PDB Impact
Production    3+ replicas across multiple nodes   Drain one node, others still serve traffic
Homelab       1 replica (resource constraints)    Can’t drain without violating PDB

Many Helm charts ship with PDBs by default. PostgreSQL, Redis, Kafka, Grafana - they all assume you’re running multiple replicas. In a homelab with limited resources, you’re often running single replicas of everything.
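
A quick way to see how thin you're actually running is to list replica counts across the cluster; the READY column shows ready/desired for each workload:

kubectl get deployments,statefulsets --all-namespaces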

When you try to drain a node with a single-replica StatefulSet that has a PDB requiring minAvailable: 1, Kubernetes enters a deadlock:

  1. Drain command says: “I need to evict this pod”
  2. PDB says: “You can’t, minimum 1 must be available”
  3. Pod can’t move until it’s evicted
  4. Drain waits forever
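
You can confirm it's a PDB doing the blocking by checking its status; disruptionsAllowed will be 0 (using the postgres PDB that appears later in this post as the example):

kubectl get pdb postgres-pdb -n database \
  -o jsonpath='{.status.disruptionsAllowed}{"\n"}'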

The Solution: --disable-eviction

The --disable-eviction flag tells kubectl drain to delete pods directly instead of using the Eviction API:

kubectl drain k8s-worker01 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --disable-eviction

What’s the Difference?

Method               Respects PDBs   Use Case
Normal eviction      Yes             Production with proper replica counts
--disable-eviction   No              Homelabs, single-replica workloads

With --disable-eviction, kubectl deletes the pod directly rather than going through the Eviction API, so the PDB is never consulted. The StatefulSet controller then recreates the pod on another node. Yes, there’s brief downtime, but for a homelab, that’s usually acceptable.
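
You can watch the replacement pod land on another node while the drain runs, filtering by whatever labels your workload carries (app=my-app matches the example PDB above):

kubectl get pods -l app=my-app -o wide -w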

When to Use It

Use --disable-eviction when:

  • You’re running single-replica StatefulSets
  • You have PDBs that require minimum availability you can’t satisfy
  • You’re in a homelab/dev environment where brief downtime is acceptable
  • You need to perform maintenance and the drain is blocked

Don’t use it when:

  • You’re in production and can’t afford any downtime
  • You have enough replicas to satisfy PDBs
  • The workload is critical and must remain available

Common Culprits

These Helm charts commonly include PDBs that will block drains in single-replica setups:

  • PostgreSQL (Bitnami, Zalando) - Often defaults to minAvailable: 1
  • Redis - Sentinel and cluster modes have PDBs
  • Kafka - Strimzi and Confluent operators add PDBs
  • Grafana - Has a PDB in production-ready configs
  • Prometheus - Operator-managed instances often have PDBs
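
Many charts, Bitnami's included, label the resources they render, so a label filter will often surface chart-shipped PDBs; note this relies on a chart convention rather than anything Helm guarantees:

kubectl get pdb -A -l app.kubernetes.io/managed-by=Helm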

Finding Problematic PDBs

Before draining, check what PDBs exist:

kubectl get pdb --all-namespaces

Look for any where ALLOWED DISRUPTIONS is 0:

NAMESPACE    NAME              MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
database     postgres-pdb      1               N/A               0                     30d
monitoring   grafana-pdb       1               N/A               0                     30d

If ALLOWED DISRUPTIONS is 0 and the node you’re about to drain hosts a pod covered by that PDB, the drain will get stuck.
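
To check whether the node you're about to drain actually hosts the pods behind a zero-disruption PDB, list the pods on that node and compare their labels against the PDB's selector:

kubectl get pods -A -o wide --field-selector spec.nodeName=k8s-worker01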

Alternative Approaches

Option 1: Disable PDBs in Helm Values

Many charts let you disable PDBs:

# PostgreSQL Bitnami chart
pdb:
  create: false
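
You can apply that with a helm upgrade against the existing release (my-postgres is a placeholder release name, and the exact value path varies between chart versions, so check the chart's values.yaml):

helm upgrade my-postgres bitnami/postgresql \
  --reuse-values \
  --set pdb.create=false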

Option 2: Increase Replica Count Temporarily

kubectl scale statefulset postgres --replicas=2
# Wait for second replica to be ready
kubectl drain k8s-worker01 --ignore-daemonsets
# Scale back down
kubectl scale statefulset postgres --replicas=1
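
The "wait for the second replica" step doesn't have to be eyeballed; rollout status blocks until the StatefulSet reports all replicas ready:

kubectl rollout status statefulset/postgres --timeout=300s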

Option 3: Delete the PDB Temporarily

kubectl get pdb postgres-pdb -o yaml > /tmp/postgres-pdb.yaml
kubectl delete pdb postgres-pdb
kubectl drain k8s-worker01 --ignore-daemonsets
kubectl apply -f /tmp/postgres-pdb.yaml

Option 4: Just Use --disable-eviction

For homelabs, this is usually the simplest. Accept the brief downtime and move on.

My Upgrade Workflow

Here’s what I do when upgrading my homelab cluster:

# 1. Check what's running where
kubectl get pods -A -o wide | grep worker01

# 2. Check PDBs that might block
kubectl get pdb -A

# 3. Drain with disable-eviction
kubectl drain k8s-worker01 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --disable-eviction

# 4. Perform upgrade
sudo dnf upgrade -y kubeadm kubelet kubectl
sudo systemctl restart kubelet

# 5. Uncordon
kubectl uncordon k8s-worker01
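
Once the node is uncordoned, it's worth confirming it's Ready again and reporting the new kubelet version:

kubectl get nodes k8s-worker01 -o wide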

The Ansible Way

If you’re using Ansible for cluster upgrades (and you should be), here’s how to incorporate it:

- name: Drain node
  command: >
    kubectl drain {{ inventory_hostname }}
    --ignore-daemonsets
    --delete-emptydir-data
    --disable-eviction
    --timeout=300s
  delegate_to: "{{ groups['control_plane'][0] }}"

Conclusion

PodDisruptionBudgets are a great feature for production clusters with proper redundancy. But in homelabs where resources are limited and single-replica workloads are common, they can turn a simple node drain into an infinite wait.

The --disable-eviction flag is your friend. It bypasses PDB checks and lets you get on with maintenance. Yes, you’ll have brief downtime for affected workloads, but that’s usually a fair trade-off for actually being able to maintain your cluster.

Remember:

  • Check PDBs before draining: kubectl get pdb -A
  • Use --disable-eviction when blocked by single-replica PDBs
  • Accept brief downtime in exchange for operational simplicity
  • Consider disabling PDBs in Helm values for homelab deployments

Your homelab doesn’t need to pretend it’s a Fortune 500 production cluster. Embrace the constraints and use the tools that make sense for your environment.