From 56332586bbc67b9651413ccbe2e60381d31f125c Mon Sep 17 00:00:00 2001
From: Mikael Frykholm
Date: Tue, 4 Feb 2025 10:41:27 +0100
Subject: [PATCH] Better rolling upgrade instructions.

---
 README.md | 37 +++++++++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 2a64abd..9d80a24 100644
--- a/README.md
+++ b/README.md
@@ -80,10 +80,39 @@ internal-sto4-test-k8sc-1.rut.sunet.se   Ready   16d   v1.28.7
 ## Day 2 operations:
 ### Rolling upgrade:
-On controllers:
-    kubectl drain internal-sto4-test-k8sc-0.rut.sunet.se --ignore-daemonset
+
+Drain one controller at a time (uncordon it again once it is upgraded, see "Uncordon after upgrade" below) with:
+    kubectl drain internal-sto4-test-k8sc-0.rut.sunet.se --ignore-daemonsets
+After the first node is drained, restart the calico controller with:
+`kubectl rollout restart deployment calico-kube-controllers -n kube-system`
+Then restart the calico-node pod running on that host by deleting it (see "Finding a calico-node pod" below); it should be recreated automatically by the controller.
+`kubectl delete pod calico-node-???? -n kube-system`
 
-On workers:
+Continue with the workers (including PG nodes):
     kubectl drain internal-sto4-test-k8sw-0.rut.sunet.se --force --ignore-daemonsets --delete-emptydir-data --disable-eviction
+`kubectl delete pod calico-node-???? -n kube-system`
+
+## Calico problems
+Calico can get into a bad state. Look for messages like `Candidate IP leak handle` and `too old resource version` in the calico-kube-controllers pod. If these are found, calico can be restarted with:
-After upgrade: monitor that calico has working access to the cluster and look for problems like `Candidate IP leak handle och too old resource version` in calico-kube-controllers pod. If theese are found calico cane be restarted with:
     kubectl rollout restart deployment calico-kube-controllers -n kube-system
     kubectl rollout restart daemonset calico-node -n kube-system
+This will disrupt the whole cluster for a few seconds.
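+
+To verify that calico recovered (a sketch; it assumes `kubectl` is pointed at this cluster and that ten minutes of logs give enough context), wait for the rollouts and re-read the controller logs:
+
+    # Wait for the restarted deployment and daemonset to become ready again.
+    kubectl rollout status deployment calico-kube-controllers -n kube-system
+    kubectl rollout status daemonset calico-node -n kube-system
+    # Re-check recent controller logs for the error signatures above.
+    kubectl logs deployment/calico-kube-controllers -n kube-system --since=10m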
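+
+### Finding a calico-node pod
+The `????` suffix in the pod names above is generated per pod, so it has to be looked up. A sketch, using a node name from this README as the example:
+
+    # Show the calico-node pod scheduled on the node that was just drained.
+    kubectl get pods -n kube-system -o wide \
+      --field-selector spec.nodeName=internal-sto4-test-k8sc-0.rut.sunet.se | grep calico-node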
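+
+### Uncordon after upgrade
+`kubectl drain` leaves a node cordoned (unschedulable). Presumably each node should be uncordoned once its upgrade is finished, before the next one is drained:
+
+    kubectl uncordon internal-sto4-test-k8sc-0.rut.sunet.se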