Hostname changes after reboot for Google Compute Engine Instances

On GCP, instances running Linux distributions that use NetworkManager can run into problems with a running Kubernetes cluster (such as kURL) after a reboot: pods are unable to start and the node reports as NotReady. On further investigation, you may see the following in the kubelet logs:

kubelet_node_status.go:92] "Unable to register node with API server" err="nodes \"some-node\" is forbidden: node \"some-node.some.domain.internal\" is not allowed to modify node \"some-node\"" node="some-node"

This happens on certain Linux distributions on GCP where a script under /etc/dhcp/dhclient.d/ runs whenever the DHCP client renews its lease. The script runs the command `nmcli general hostname "${new_host_name%%.*}"` to set the hostname to its short form. Before the script runs, the hostname is generally the full FQDN (some-node.some.domain.internal); afterwards it is simply some-node, which causes the node to enter a NotReady state.
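The shell parameter expansion quoted above is easy to try in isolation. The following is a minimal sketch, not the GCP script itself: the value assigned to new_host_name is the placeholder FQDN from this post, whereas in the real hook the DHCP client supplies it, and short_name is a variable introduced here only for illustration.

```shell
#!/bin/sh
# Illustrative value only; in the real hook the DHCP client sets new_host_name.
new_host_name="some-node.some.domain.internal"

# "%%.*" strips the longest suffix matching ".*",
# i.e. everything from the first dot onwards.
short_name="${new_host_name%%.*}"

echo "before: ${new_host_name}"
echo "after:  ${short_name}"
```

A name that contains no dot is left unchanged by this expansion.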

To work around this, you can do one of the following:

  1. Remove the script before running the kURL install with `rm -rf /etc/dhcp/dhclient.d/` (this deletes the whole directory). Note that this approach may affect functionality such as `gcloud compute instances create vmname --hostname=xxxx`, so proceed with caution.

  2. Force the script to run before the kURL install by manually triggering a DHCP lease renewal with `dhclient -r && dhclient`.

  3. Set a permanent hostname manually. See "Google Compute Engine: how to set hostname permanently?" on Stack Overflow.
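Whichever option you pick, it can help to check which form the hostname currently has before running the kURL install. The following is a rough sketch only: is_short_hostname is a hypothetical helper written for this post, and it only tests whether the name contains a dot. It does not inspect the DHCP script or the cluster.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the name contains no dot,
# i.e. it is already in the short form the DHCP script would set.
is_short_hostname() {
  case "$1" in
    *.*) return 1 ;;
    *)   return 0 ;;
  esac
}

# Read the current hostname; fall back to uname -n if hostname is unavailable.
current="$(hostname 2>/dev/null || uname -n)"

if is_short_hostname "$current"; then
  echo "hostname '${current}' is already in short form"
else
  echo "hostname '${current}' looks like an FQDN;"
  echo "a DHCP lease renewal may shorten it to '${current%%.*}'"
fi
```

If the second message is printed, the node may register under the FQDN and then run into the error above once the hostname is shortened.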

Relevant Links:


A bug has been opened upstream with the GCP team: "`google_set_hostname` changes hostname after reboot", Issue #69 in GoogleCloudPlatform/guest-configs on GitHub.