Problem Description
After automatic certificate renewal on kURL clusters running Kubernetes versions prior to 1.29, kubectl commands that rely on the kubelet proxy may fail with a Forbidden error while other commands continue to work normally.
Symptoms
Commands that fail:
kubectl exec -it pod-name -- /bin/bash
kubectl logs pod-name
kubectl port-forward pod-name 8080:80
kubectl attach pod-name
Error message:
Error from server (Forbidden): Forbidden (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
Commands that still work:
kubectl get nodes
kubectl get pods
kubectl create/delete/apply operations
Most other cluster management commands
Root Cause
This issue occurs when the embedded cluster Kubernetes operator (ECKO) performs certificate renewal and updates the API server’s kubelet client certificate to use a newer security model without creating the corresponding authorization.
Specifically:
- ECKO updates /etc/kubernetes/pki/apiserver-kubelet-client.crt to use the kubeadm:cluster-admins group
- The required kubeadm:cluster-admins ClusterRoleBinding is not created
- The API server's requests to kubelets are no longer authorized for proxy operations, which is why the error reports Forbidden
Diagnosis
To confirm this is the issue affecting your cluster:
Step 1: Check the API Server Certificate
sudo openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -subject
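If the subject contains the kubeadm:cluster-admins group, as in the following output (exact formatting varies with the OpenSSL version), continue to Step 2: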
Subject: O = kubeadm:cluster-admins, CN = kube-apiserver-kubelet-client
Step 2: Check for Missing RBAC
kubectl get clusterrolebinding kubeadm:cluster-admins
If you get:
Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "kubeadm:cluster-admins" not found
Then you have confirmed the issue.
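Because the subject line is formatted slightly differently across OpenSSL versions, it can help to extract just the organization (O) values, which Kubernetes treats as the certificate's groups. The following is a minimal sketch, not part of any kURL tooling; the function name cert_organizations is hypothetical, and it assumes the comma-separated subject format shown in Step 1.

```shell
# Hypothetical helper: read an "openssl x509 -noout -subject" line on stdin
# and print each organization (O) value, one per line.
# Assumes the comma-separated format, e.g.
#   subject=O = kubeadm:cluster-admins, CN = kube-apiserver-kubelet-client
cert_organizations() {
  sed -e 's/^[Ss]ubject[=:]//' |   # drop the leading "subject=" or "Subject:" label
    tr ',' '\n' |                  # one attribute per line
    sed -n 's/^[[:space:]]*O[[:space:]]*=[[:space:]]*//p'   # keep only O values
}
```

To use it, pipe the Step 1 command into the function: sudo openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -subject | cert_organizations. If the result is kubeadm:cluster-admins, the certificate uses the newer group.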
Solution
Create the missing ClusterRoleBinding to restore proxy command functionality:
kubectl create clusterrolebinding kubeadm:cluster-admins \
--clusterrole=cluster-admin \
--group=kubeadm:cluster-admins
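If the command above might be run more than once (for example from automation), kubectl create fails with an AlreadyExists error on the second run. One common way around this is to render the manifest client-side and pipe it to kubectl apply. The sketch below assumes that pattern is acceptable in your environment; the function name ensure_cluster_admins_binding is hypothetical.

```shell
# Hypothetical helper: create the ClusterRoleBinding if it is missing,
# and leave it in place if it already exists.
ensure_cluster_admins_binding() {
  # --dry-run=client -o yaml only prints the manifest; nothing is created here.
  kubectl create clusterrolebinding kubeadm:cluster-admins \
    --clusterrole=cluster-admin \
    --group=kubeadm:cluster-admins \
    --dry-run=client -o yaml |
    kubectl apply -f -   # apply creates or updates the binding
}
```

Calling ensure_cluster_admins_binding has the same end result as the create command above. If the binding was originally made with kubectl create, apply may print a warning about a missing last-applied-configuration annotation; the binding is still kept.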
Verify the Fix
- Confirm the ClusterRoleBinding was created:
kubectl get clusterrolebinding kubeadm:cluster-admins
- Test proxy commands:
kubectl logs -n kube-system kube-apiserver-$(hostname)
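As an additional check, the authorization decision from the error message (verb=get, resource=nodes, subresource=proxy) can be queried directly by impersonating the API server's kubelet client identity. This is a sketch under two assumptions: your kubeconfig user is allowed to impersonate users and groups (cluster administrators normally are), and the certificate carries the kubeadm:cluster-admins group as shown in the diagnosis. The function name can_proxy_to_kubelet is hypothetical.

```shell
# Hypothetical helper: ask the API server whether the kubelet client identity
# may "get" the nodes/proxy subresource. Prints "yes" or "no".
can_proxy_to_kubelet() {
  kubectl auth can-i get nodes --subresource=proxy \
    --as=kube-apiserver-kubelet-client \
    --as-group=kubeadm:cluster-admins
}
```

After the fix, can_proxy_to_kubelet should answer yes; an answer of no suggests the binding is still missing or names a different group.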
Prevention
This issue will be resolved in future versions of ECKO. Until then, if you perform manual certificate operations or notice this issue after certificate renewal, apply the fix above.
One-Line Detection and Fix Script
For quick detection and resolution:
# Check and fix the issue in one command
if sudo openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -subject 2>/dev/null | grep -q "kubeadm:cluster-admins" && ! kubectl get clusterrolebinding kubeadm:cluster-admins >/dev/null 2>&1; then echo "Issue detected - creating ClusterRoleBinding..." && kubectl create clusterrolebinding kubeadm:cluster-admins --clusterrole=cluster-admin --group=kubeadm:cluster-admins && echo "Fixed!"; else echo "No issue detected or already resolved"; fi
Why This Happens
Kubernetes 1.29 introduced improved certificate security by moving away from the hardcoded system:masters group to a more manageable kubeadm:cluster-admins group. Because the new group gets its permissions from an RBAC binding rather than a built-in bypass, access can be controlled more precisely and revoked without rotating the entire certificate authority.
However, when ECKO applies this newer certificate format to pre-1.29 clusters, it updates the certificate format but doesn’t create the corresponding RBAC that makes the new format work.
Additional Information
- This issue only affects clusters running Kubernetes versions prior to 1.29
- The fix is safe to apply: it only creates the kubeadm:cluster-admins binding to the cluster-admin role that the newer certificate format expects
- Direct API operations continue working because they don’t require kubelet proxy functionality
- This is a temporary workaround; future ECKO versions will handle this automatically
Related Resources
If you continue to experience issues after applying this fix, please reach out to support with the output of the diagnostic commands above.