Troubleshooting kubectl exec and logs Commands After Certificate Renewal

Problem Description

After automatic certificate renewal on kURL clusters running Kubernetes versions prior to 1.29, you may experience issues where certain kubectl commands fail while others continue to work normally.

Symptoms

Commands that fail:

  • kubectl exec -it pod-name -- /bin/bash
  • kubectl logs pod-name
  • kubectl port-forward pod-name 8080:80
  • kubectl attach pod-name

Error message:

Error from server (Forbidden): Forbidden (user=kube-apiserver-kubelet-client, verb=get, resource=nodes, subresource=proxy)
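To see the underlying HTTP exchange, you can raise kubectl's verbosity (a diagnostic sketch; pod-name is a placeholder for any pod in your cluster):

```shell
# -v=6 prints each API request and its response status; the failing
# log request should come back as 403 Forbidden with the message above.
kubectl logs pod-name -v=6
```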

Commands that still work:

  • kubectl get nodes
  • kubectl get pods
  • kubectl create/delete/apply operations
  • Most other cluster management commands

Root Cause

This issue occurs when the embedded cluster Kubernetes operator (ECKO) performs certificate renewal and updates the API server’s kubelet client certificate to use a newer security model without creating the corresponding authorization.

Specifically:

  1. ECKO updates /etc/kubernetes/pki/apiserver-kubelet-client.crt to use the kubeadm:cluster-admins group
  2. The required kubeadm:cluster-admins ClusterRoleBinding is not created
  3. Kubelets now reject the API server's proxy requests as unauthorized, because no binding grants the new group access to the nodes/proxy subresource
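The missing authorization can be demonstrated without touching the certificate by asking the API server to evaluate the same RBAC check that kubelets delegate to it (a sketch; impersonation with --as/--as-group only simulates the authorization decision, it does not use the renewed certificate):

```shell
# Evaluate whether the certificate's identity may reach the kubelet's
# proxy subresource. While the binding is missing this answers "no";
# once the ClusterRoleBinding exists it answers "yes".
kubectl auth can-i get nodes/proxy \
  --as=kube-apiserver-kubelet-client \
  --as-group=kubeadm:cluster-admins
```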

Diagnosis

To confirm this is the issue affecting your cluster:

Step 1: Check the API Server Certificate

sudo openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -subject

If the output shows:

Subject: O = kubeadm:cluster-admins, CN = kube-apiserver-kubelet-client

then the certificate has been renewed with the new group; continue to Step 2. If the subject still shows O = system:masters, this issue does not apply to your cluster.

Step 2: Check for Missing RBAC

kubectl get clusterrolebinding kubeadm:cluster-admins

If you get:

Error from server (NotFound): clusterrolebindings.rbac.authorization.k8s.io "kubeadm:cluster-admins" not found

Then you have confirmed the issue.

Solution

Create the missing ClusterRoleBinding to restore proxy command functionality:

kubectl create clusterrolebinding kubeadm:cluster-admins \
  --clusterrole=cluster-admin \
  --group=kubeadm:cluster-admins
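If you manage cluster RBAC declaratively, the same binding can be applied as a manifest (a sketch that mirrors the imperative command above field for field):

```shell
# Apply the binding as a manifest instead of using `kubectl create`.
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeadm:cluster-admins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: kubeadm:cluster-admins
EOF
```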

Verify the Fix

  1. Confirm the ClusterRoleBinding was created:

    kubectl get clusterrolebinding kubeadm:cluster-admins
    
  2. Test a proxy command (run this on the control-plane node, since $(hostname) must match the node's name):

    kubectl logs -n kube-system kube-apiserver-$(hostname)
    

Prevention

This issue will be resolved in future versions of ECKO. Until then, if you perform manual certificate operations or notice this issue after certificate renewal, apply the fix above.
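To anticipate the next automatic renewal, you can review certificate expiry dates on the control-plane node (a sketch; on very old kubeadm versions this subcommand lives under `kubeadm alpha certs` instead):

```shell
# List expiry dates for all kubeadm-managed certificates, including
# apiserver-kubelet-client, so you know when the next renewal is due.
sudo kubeadm certs check-expiration
```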

Detection and Fix Script

For quick detection and resolution, the same checks can be run as a single script:

#!/bin/bash
# Detect the renewed certificate group and the missing ClusterRoleBinding,
# then create the binding if (and only if) both conditions hold.
CERT=/etc/kubernetes/pki/apiserver-kubelet-client.crt
if sudo openssl x509 -in "$CERT" -noout -subject 2>/dev/null | grep -q "kubeadm:cluster-admins" \
    && ! kubectl get clusterrolebinding kubeadm:cluster-admins >/dev/null 2>&1; then
  echo "Issue detected - creating ClusterRoleBinding..."
  kubectl create clusterrolebinding kubeadm:cluster-admins \
    --clusterrole=cluster-admin \
    --group=kubeadm:cluster-admins
  echo "Fixed!"
else
  echo "No issue detected or already resolved"
fi

Why This Happens

Kubernetes 1.29 introduced improved certificate security by moving away from the hardcoded system:masters group to a more manageable kubeadm:cluster-admins group. This allows for better access control and the ability to revoke certificates without rotating the entire certificate authority.

However, when ECKO applies this newer certificate format to pre-1.29 clusters, it updates the certificate format but doesn’t create the corresponding RBAC that makes the new format work.

Additional Information

  • This issue only affects clusters running Kubernetes versions prior to 1.29
  • The fix is safe to apply: it creates the same binding that kubeadm itself creates on clusters initialized at 1.29 or later
  • Direct API operations continue working because they don’t require kubelet proxy functionality
  • This is a temporary workaround; future ECKO versions will handle this automatically

If you continue to experience issues after applying this fix, please reach out to support with the output of the diagnostic commands above.