Better rolling upgrade instructions.

Mikael Frykholm 2025-02-04 10:41:27 +01:00
parent 52e5056e21
commit 56332586bb
Signed by: mifr
GPG key ID: 1467F9D69135C236


@@ -80,13 +80,22 @@ internal-sto4-test-k8sc-1.rut.sunet.se Ready <none> 16d v1.28.7
## Day 2 operations:
### Rolling upgrade:
On controllers:
Drain one controller at a time with:
`kubectl drain internal-sto4-test-k8sc-0.rut.sunet.se --ignore-daemonsets`
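Before moving on, it can be worth confirming the drain took effect (a quick sketch; the node name is the one from the example above):
```
# The drained controller should show SchedulingDisabled and have only daemonset pods left on it.
kubectl get nodes
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=internal-sto4-test-k8sc-0.rut.sunet.se
```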
After the first node is drained, restart the calico controller with:
`kubectl rollout restart deployment calico-kube-controllers -n kube-system`
After that, restart the calico-node pod running on that host by deleting it. It will be recreated automatically by the DaemonSet controller.
`kubectl delete pod calico-node-???? -n kube-system`
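The exact pod name differs per node; one way to find the calico-node pod running on the drained host (a sketch, assuming the standard `k8s-app=calico-node` label from the upstream Calico manifests):
```
# List the calico-node pod scheduled on the drained controller;
# its name is what goes into the delete command above.
kubectl get pods -n kube-system -o wide -l k8s-app=calico-node \
  --field-selector spec.nodeName=internal-sto4-test-k8sc-0.rut.sunet.se
```
Once the host itself has been upgraded, bring it back into service with `kubectl uncordon internal-sto4-test-k8sc-0.rut.sunet.se` before draining the next controller.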
On workers:
Continue with the workers (including the PG nodes):
`kubectl drain internal-sto4-test-k8sw-0.rut.sunet.se --force --ignore-daemonsets --delete-emptydir-data --disable-eviction`
`kubectl delete pod calico-node-???? -n kube-system`
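With several workers the same drain/delete/uncordon cycle can be scripted. A minimal sketch, assuming the worker naming scheme from the example above and the standard `k8s-app=calico-node` label (adjust the node list to your cluster):
```
#!/usr/bin/env bash
set -euo pipefail
# Roll through the workers one at a time.
for node in internal-sto4-test-k8sw-0.rut.sunet.se internal-sto4-test-k8sw-1.rut.sunet.se; do
  kubectl drain "$node" --force --ignore-daemonsets --delete-emptydir-data --disable-eviction
  # Restart calico-node on the drained host by deleting it; the DaemonSet recreates it.
  kubectl delete pod -n kube-system -l k8s-app=calico-node --field-selector spec.nodeName="$node"
  read -rp "Upgrade and reboot $node, then press enter to uncordon... "
  kubectl uncordon "$node"
done
```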
## Calico problems
Calico can get into a bad state. After an upgrade, monitor that calico still has working access to the cluster and look for messages like `Candidate IP leak handle` or `too old resource version` in the calico-kube-controllers pod. If these are found, calico can be restarted with:
`kubectl rollout restart deployment calico-kube-controllers -n kube-system`
`kubectl rollout restart daemonset calico-node -n kube-system`
This will disrupt the whole cluster for a few seconds.
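To verify that the restart completed and the errors are gone, something like the following can be used (a sketch; the `k8s-app=calico-node` label comes from the upstream Calico manifests):
```
# Wait for both rollouts to finish, then check pod health and the recent controller logs.
kubectl -n kube-system rollout status deployment calico-kube-controllers
kubectl -n kube-system rollout status daemonset calico-node
kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
kubectl -n kube-system logs deployment/calico-kube-controllers --since=10m
```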