If you see the following alert in your cluster SupportBundle, you have been hit by a known ekco bug:
Kubeadm cluster status ConfigMap is in an inconsistent state.
This error may cause cluster upgrades to fail, and fixing it requires manual intervention.
Why am I being hit by this?
Kubeadm uses a ConfigMap called kubeadm-config in the kube-system namespace to keep track of the configured control plane API endpoints. If you are seeing this error, it means that at some point in the past you used the ekco-purge-node command line utility to remove one of your cluster nodes.
Due to a bug, ekco-purge-node rendered an invalid ClusterStatus YAML and wrote it to the kubeadm-config ConfigMap. kubeadm can no longer read this invalid YAML, so cluster upgrades may fail.
How to solve it
The ConfigMap needs to be adjusted manually. You can do this by editing it with the following command:
kubectl edit cm kubeadm-config -n kube-system
For the sake of simplicity, we focus only on the ClusterStatus property of the ConfigMap. After running the command above, you will most likely see something very similar to this (the YAML documents below are redacted for easier understanding):
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeadm-config
  namespace: kube-system
data:
  ClusterConfiguration: |
    <redacted>
  ClusterStatus: |
    typemeta:
      apiversion: kubeadm.k8s.io/v1beta2
      kind: ClusterStatus
    apiendpoints:
      ip-172-16-10-30:
        advertiseaddress: 172.16.10.30
        bindport: 6443
      ip-172-16-10-88:
        advertiseaddress: 172.16.10.88
        bindport: 6443
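To confirm that a ClusterStatus value is affected before editing anything, the check can be sketched in a few lines of Python. This is only an illustrative helper: the function name find_broken_keys and the BROKEN_KEYS tuple are hypothetical, and the key names are taken from the example above. It assumes you have already copied the ClusterStatus text into a string.

```python
import re

# Keys that appear in the broken ClusterStatus according to this article:
# the extra "typemeta" wrapper plus the all-lower-case property names.
BROKEN_KEYS = ("typemeta", "apiversion", "apiendpoints", "advertiseaddress", "bindport")


def find_broken_keys(cluster_status: str) -> list:
    """Return the broken keys that occur in a ClusterStatus YAML string."""
    found = []
    for key in BROKEN_KEYS:
        # Match the key at the start of a line (after indentation only).
        # The match is case-sensitive, so "apiVersion:" is not reported.
        if re.search(rf"^\s*{key}:", cluster_status, flags=re.MULTILINE):
            found.append(key)
    return found
```

An empty list means none of the known broken keys were found. Any other result lists the keys that still need attention.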
The ClusterStatus property must be adjusted to something like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubeadm-config
  namespace: kube-system
data:
  ClusterConfiguration: |
    <redacted>
  ClusterStatus: |
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
    apiEndpoints:
      ip-172-16-10-30:
        advertiseAddress: 172.16.10.30
        bindPort: 6443
      ip-172-16-10-88:
        advertiseAddress: 172.16.10.88
        bindPort: 6443
Please note that we are only editing the ClusterStatus portion of the ConfigMap; everything else should be kept as is. To summarise the changes:
- The typemeta section needs to be removed and its properties must be moved up by one level.
- The apiversion property is renamed to apiVersion.
- The apiendpoints property is renamed to apiEndpoints.
- The advertiseaddress property is renamed to advertiseAddress.
- The bindport property is renamed to bindPort.
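The changes summarised above can also be expressed as a small text transformation, which may help when double-checking a manual edit. The following is a minimal Python sketch, not part of ekco or kubeadm: the name fix_cluster_status and the RENAMES table are hypothetical, and it assumes the broken text has typemeta with only simple key: value lines nested under it, as in the example in this article.

```python
import re

# Lower-case keys written by the bug, mapped to the names kubeadm expects
# (taken from the list of changes in this article).
RENAMES = {
    "apiversion": "apiVersion",
    "apiendpoints": "apiEndpoints",
    "advertiseaddress": "advertiseAddress",
    "bindport": "bindPort",
}


def fix_cluster_status(text: str) -> str:
    """Return a corrected copy of a broken ClusterStatus YAML string."""
    out = []
    typemeta_indent = None  # indentation of "typemeta:" while inside that section
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)

        # The typemeta section ends when indentation returns to its level.
        if typemeta_indent is not None and stripped and indent <= typemeta_indent:
            typemeta_indent = None

        if typemeta_indent is None and stripped.rstrip() == "typemeta:":
            typemeta_indent = indent
            continue  # drop the "typemeta:" line itself

        if typemeta_indent is not None and stripped:
            # Move the nested property up to where "typemeta:" used to be.
            line = " " * typemeta_indent + stripped

        # Rename a key only when it is one of the known lower-case names.
        match = re.match(r"^(\s*)([A-Za-z]+)(:.*)$", line)
        if match and match.group(2) in RENAMES:
            line = match.group(1) + RENAMES[match.group(2)] + match.group(3)
        out.append(line)
    return "\n".join(out) + "\n"
```

Running the function on text that is already correct returns it unchanged, so it is safe to apply more than once. Treat the result as a suggestion to compare against your manual edit, not as a replacement for reviewing the ConfigMap yourself.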
After that, save the ConfigMap. The alert should no longer appear in your SupportBundle.