This commit is contained in:
Mikael Frykholm 2025-02-04 10:51:12 +01:00
parent 56332586bb
commit 778eceba2b
Signed by: mifr
GPG key ID: 1467F9D69135C236

@@ -82,20 +82,23 @@ internal-sto4-test-k8sc-1.rut.sunet.se Ready <none> 16d v1.28.7
### Rolling upgrade:
Drain one controller at a time with:
`kubectl drain internal-sto4-test-k8sc-0.rut.sunet.se --ignore-daemonsets`
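Before moving on it is worth confirming that the node really is drained; a quick check with plain kubectl (nothing here is specific to this cluster, the node name is the one drained above):
```
# The drained controller should report SchedulingDisabled,
# and only daemonset pods should remain on it.
kubectl get nodes
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=internal-sto4-test-k8sc-0.rut.sunet.se
```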
After the first node is drained, restart the calico controller with:
`kubectl rollout restart deployment calico-kube-controllers -n kube-system`
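To know when the restarted deployment is fully back, the rollout can be waited on (standard kubectl subcommand; the deployment name is the one used above):
```
# Blocks until the new calico-kube-controllers pod is up and ready.
kubectl rollout status deployment calico-kube-controllers -n kube-system
```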
After that, restart the calico-node pod running on that host by deleting it. It should be recreated automatically by the controller.
`kubectl delete pod calico-node-???? -n kube-system`
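The `????` part is the pod's generated name suffix; one way to find the right pod for the host that was just drained (plain kubectl, the grep is only an illustration):
```
# Show each calico-node pod together with the node it runs on,
# then delete the one scheduled on the drained controller.
kubectl get pods -n kube-system -o wide | grep calico-node
```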
Continue with the workers (including the PG nodes):
`kubectl drain internal-sto4-test-k8sw-0.rut.sunet.se --force --ignore-daemonsets --delete-emptydir-data --disable-eviction`
`kubectl delete pod calico-node-???? -n kube-system`
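`kubectl drain` leaves a node cordoned. Assuming each node should go back into service once its upgrade is finished (a step this runbook does not spell out), it can be uncordoned afterwards:
```
# Allow pods to be scheduled on the upgraded node again.
kubectl uncordon internal-sto4-test-k8sw-0.rut.sunet.se
```
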
### Calico problems
Calico can get into a bad state. Look for problems like `Candidate IP leak handle` and `too old resource version` in the calico-kube-controllers pod logs.
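One way to scan for those messages (kubectl can read logs through a deployment reference; the grep pattern is only an illustration):
```
# Grep the controller logs for the symptoms mentioned above.
kubectl logs deployment/calico-kube-controllers -n kube-system | grep -Ei 'leak|too old resource version'
```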
If these are found, calico can be restarted with:
`kubectl rollout restart deployment calico-kube-controllers -n kube-system`
`kubectl rollout restart daemonset calico-node -n kube-system`
This will disrupt the whole cluster for a few seconds.
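To see when things have settled after the restart, the daemonset rollout can be watched and the node agents checked; a small sketch (the `k8s-app=calico-node` label is the one the upstream Calico manifests normally set, so treat it as an assumption for this cluster):
```
# Wait for the calico-node daemonset rollout to complete.
kubectl rollout status daemonset calico-node -n kube-system

# Confirm the node agents are Running again (label is assumed).
kubectl get pods -n kube-system -l k8s-app=calico-node
```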