If you see the following alert in your cluster SupportBundle, you have been hit by a known ekco bug:
Kubeadm cluster status ConfigMap is in an inconsistent state.
This error may cause cluster upgrades to fail, and resolving it requires manual intervention.
Why am I being hit by this?
Kubeadm uses a ConfigMap called kubeadm-config in the kube-system namespace to keep track of the configured control plane API endpoints. If you are seeing this error, it means that at some point in the past you used the ekco-purge-node command line utility to remove one of your cluster nodes.
Due to a bug, ekco-purge-node was rendering an invalid ClusterStatus YAML and writing it to the kubeadm-config ConfigMap. kubeadm can no longer read this invalid YAML, so cluster upgrades may fail.
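A quick way to check whether a cluster is affected is to look for the keys written by the buggy utility: a typemeta section and all-lowercase property names. The sketch below assumes kubectl access to the cluster. The function name cluster_status_is_broken is hypothetical, and it only pattern-matches key names rather than parsing the YAML.

```shell
# Hypothetical helper: reads ClusterStatus YAML on stdin and succeeds (exit 0)
# if it contains any of the keys written by the buggy ekco-purge-node.
cluster_status_is_broken() {
  grep -Eq '^ *(typemeta|apiversion|apiendpoints|advertiseaddress|bindport):'
}

# Example (needs cluster access, so it is left commented out):
# kubectl get cm kubeadm-config -n kube-system -o jsonpath='{.data.ClusterStatus}' \
#   | cluster_status_is_broken && echo "ClusterStatus needs fixing"
```

The match is case-sensitive, so a healthy ClusterStatus with apiVersion and apiEndpoints keys is not flagged.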
How to solve it
We need to adjust the ConfigMap manually. This can be done by editing it with the following command:
kubectl edit cm kubeadm-config -n kube-system
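Before changing anything, it may be worth keeping a copy of the ConfigMap as it currently stands. This is only a suggestion, not a required step; the helper name backup_kubeadm_config and the default file name are assumptions made for illustration.

```shell
# Hypothetical helper: saves the current kubeadm-config ConfigMap to a file
# so the original content can be consulted later. The default file name is
# an assumption; pass a different path as the first argument if preferred.
backup_kubeadm_config() {
  kubectl get cm kubeadm-config -n kube-system -o yaml > "${1:-kubeadm-config.backup.yaml}"
}

# Example (needs cluster access, so it is left commented out):
# backup_kubeadm_config ./kubeadm-config.backup.yaml
```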
For the sake of simplicity, we focus only on the ClusterStatus property of the ConfigMap. After running the command above, you will most likely see something very similar to this (the YAML snippets below are redacted for easier reading):
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeadm-config
  namespace: kube-system
data:
  ClusterConfiguration: |
    <redacted>
  ClusterStatus: |
    typemeta:
      apiversion: kubeadm.k8s.io/v1beta2
      kind: ClusterStatus
    apiendpoints:
      ip-172-16-10-30:
        advertiseaddress: 172.16.10.30
        bindport: 6443
      ip-172-16-10-88:
        advertiseaddress: 172.16.10.88
        bindport: 6443
The ClusterStatus property must be adjusted to something like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeadm-config
  namespace: kube-system
data:
  ClusterConfiguration: |
    <redacted>
  ClusterStatus: |
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
    apiEndpoints:
      ip-172-16-10-30:
        advertiseAddress: 172.16.10.30
        bindPort: 6443
      ip-172-16-10-88:
        advertiseAddress: 172.16.10.88
        bindPort: 6443
Please note that we are only editing the ClusterStatus portion of the ConfigMap; everything else should be kept as is. To summarise the changes:
- The typemeta section needs to be removed and its properties must be moved up by one level.
- The apiversion property is renamed to apiVersion.
- The apiendpoints property is renamed to apiEndpoints.
- The advertiseaddress property is renamed to advertiseAddress.
- The bindport property is renamed to bindPort.
Once these changes are made, save the ConfigMap. The alert should no longer appear in your SupportBundle.
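For clusters with many endpoints, the same changes can be sketched as a small text filter instead of being typed by hand. This is an illustration only, not an official fix tool: the function name fix_cluster_status is hypothetical, and it assumes the ClusterStatus content arrives on stdin indented with spaces, in the shape shown in the broken example.

```shell
# Hypothetical helper: reads a broken ClusterStatus on stdin and prints the
# corrected version. It drops the typemeta wrapper, moves its children up one
# level, and restores the camelCase key names.
fix_cluster_status() {
  awk '
    # Drop the typemeta line and remember its indentation.
    /^ *typemeta: *$/ {
      match($0, /^ */); parent = RLENGTH; in_meta = 1; next
    }
    {
      match($0, /^ */); indent = RLENGTH
      if (in_meta && indent > parent) {
        # Child of typemeta: re-indent it to the level typemeta had.
        $0 = substr($0, 1, parent) substr($0, indent + 1)
      } else {
        in_meta = 0
      }
      # Rename only when the lowercase name is the key on this line.
      if ($1 == "apiversion:")      sub(/apiversion:/, "apiVersion:")
      if ($1 == "apiendpoints:")    sub(/apiendpoints:/, "apiEndpoints:")
      if ($1 == "advertiseaddress:") sub(/advertiseaddress:/, "advertiseAddress:")
      if ($1 == "bindport:")        sub(/bindport:/, "bindPort:")
      print
    }
  '
}

# Example (needs cluster access, so it is left commented out):
# kubectl get cm kubeadm-config -n kube-system -o jsonpath='{.data.ClusterStatus}' \
#   | fix_cluster_status
```

The filter only prints the corrected text. It still has to be pasted into the ClusterStatus block through kubectl edit, indented beneath the "ClusterStatus: |" line, and reviewed before saving.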