Troubleshooting Cluster Networking

One of the most common problems enterprises encounter when installing Kubernetes is configuring their systems to be compatible with pod networking. The following steps can help identify the source of networking problems.

These steps can also help diagnose errors reported by the kURL installer's cluster networking check, which runs after the CNI plugin (Weave or Antrea) is installed. If the check detects a problem, the installer prints the error "There appears to be a problem with cluster networking" and may print log lines that include the message "Temporary failure in name resolution":

Post "http://kurlnet.default.svc.cluster.local:8080": dial tcp: lookup kurlnet.default.svc.cluster.local: Temporary failure in name resolution

This usually indicates a general pod networking problem rather than a DNS-specific one. If the steps below do not identify the issue, review the CoreDNS logs with the command kubectl -n kube-system logs deployment/coredns.
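When reviewing the CoreDNS logs, it can help to filter out routine lines. The following is a minimal sketch, assuming CoreDNS prefixes log lines with its usual [ERROR] and [WARNING] level tags; the function name coredns_problems is illustrative and not part of kURL.

```shell
# Reads CoreDNS log lines on stdin and prints only those tagged
# [ERROR] or [WARNING]. Prints nothing if no such lines are present.
coredns_problems() {
  grep -E '\[(ERROR|WARNING)\]' || true
}

# Example (requires cluster access):
#   kubectl -n kube-system logs deployment/coredns | coredns_problems
```

An empty result suggests CoreDNS itself is not reporting failures, which is consistent with the problem lying in pod networking rather than DNS.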

Firewalls

If no firewall is expected to be running, use systemctl status firewalld or systemctl status ufw to confirm that neither is active. If a firewall is expected to be running, temporarily disable it to determine whether it is responsible for the networking failure. (Note that after stopping your firewall, it may be necessary to restart Docker and Weave to re-create missing iptables rules: systemctl restart docker and kubectl -n kube-system rollout restart daemonset/weave-net.)
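The firewall check above can be scripted. This is a minimal sketch, assuming a systemd host; the function name report_active_firewalls is illustrative and not part of kURL.

```shell
# Prints a line for each of firewalld and ufw that systemd reports as active.
# Returns 0 if at least one firewall is active, 1 if neither is.
report_active_firewalls() {
  rc=1
  for svc in firewalld ufw; do
    # is-active --quiet exits 0 only when the unit is currently active
    if systemctl is-active --quiet "$svc"; then
      echo "$svc is active"
      rc=0
    fi
  done
  return $rc
}

# Example:
#   report_active_firewalls || echo "no firewall service is active"
```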

IPTables

List the iptables rules with the -v flag to see, in the first column, a count of the packets handled by each rule. Pay close attention to any rules with target DROP that have a packet count above 0, as well as to the default policy of each chain.

In the example output below, the INPUT chain has a default policy of ACCEPT, but the last rule is dropping all packets originating from an IP in the default range assigned to pods on kURL clusters.

Also note that the FORWARD chain has a default policy of DROP, but because the packet count for the default policy is 0, it is not the cause of any networking problems encountered on the server.

$ iptables -L -v

Chain INPUT (policy ACCEPT 3193 packets, 533K bytes)
 pkts bytes target     prot opt in     out     source               destination
 511K 1583M KUBE-FIREWALL  all  --  any    any     anywhere             anywhere
 653K 3852M sshguard   all  --  any    any     anywhere             anywhere
    0     0 DROP       tcp  --  any    any     anywhere             localhost            tcp dpt:6784 ADDRTYPE match src-type !LOCAL ! ctstate RELATED,ESTABLISHED /* Block non-local access to Weave Net control port */
 6351  450K WEAVE-NPC-EGRESS  all  --  weave  any     anywhere             anywhere
 480K 1573M WEAVE-IPSEC-IN  all  --  any    any     anywhere             anywhere
  202 18610 DROP       all  --  any    any     10.32.0.0/22         anywhere

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
   99 11996 WEAVE-NPC-EGRESS  all  --  weave  any     anywhere             anywhere             /* NOTE: this must go before '-j KUBE-FORWARD' */
    2   320 WEAVE-NPC  all  --  any    weave   anywhere             anywhere             /* NOTE: this must go before '-j KUBE-FORWARD' */
    0     0 NFLOG      all  --  any    weave   anywhere             anywhere             state NEW nflog-group 86
    0     0 DROP       all  --  any    weave   anywhere             anywhere
    2   170 ACCEPT     all  --  weave  !weave  anywhere             anywhere
    0     0 ACCEPT     all  --  any    weave   anywhere             anywhere             ctstate RELATED,ESTABLISHED
    0     0 KUBE-FORWARD  all  --  any    any     anywhere             anywhere             /* kubernetes forwarding rules */
    0     0 DOCKER-USER  all  --  any    any     anywhere             anywhere
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  any    any     anywhere             anywhere
    0     0 ACCEPT     all  --  any    docker0  anywhere             anywhere             ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  any    docker0  anywhere             anywhere
    0     0 ACCEPT     all  --  docker0 !docker0  anywhere             anywhere
    0     0 ACCEPT     all  --  docker0 docker0  anywhere             anywhere
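
On hosts with many rules, the two things to look for (DROP rules with a nonzero packet count, and DROP default policies that have handled packets) can be picked out with awk. This is a minimal sketch that assumes the column layout shown in the output above; the function name find_active_drops is illustrative.

```shell
# Reads `iptables -L -v` output on stdin and reports:
#   - chains whose default policy is DROP with a nonzero packet count
#   - rules with target DROP and a nonzero packet count
find_active_drops() {
  awk '
    /^Chain / {
      chain = $2
      # Built-in chains look like: Chain INPUT (policy ACCEPT 3193 packets, 533K bytes)
      if ($3 == "(policy" && $4 == "DROP" && $5 != "0")
        print chain ": default policy DROP has handled " $5 " packets"
      next
    }
    # Rule lines: pkts bytes target prot opt in out source destination
    $3 == "DROP" && $1 != "0" {
      print chain ": DROP rule matched " $1 " packets (source " $8 ", destination " $9 ")"
    }
  '
}

# Example (run as root):
#   iptables -L -v | find_active_drops
```

Run against the example output above, this sketch would flag only the final INPUT rule, since every other DROP rule and the FORWARD default policy show a packet count of 0.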

Sysctl Settings

When a server can make outbound requests to the Internet from the host but pods cannot make the same requests, IP forwarding between interfaces may be disabled. Run sysctl -a | grep forwarding and ensure that the following settings are enabled:

net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.weave.forwarding = 1

Also check that bridge interfaces (like weave) are integrated with iptables:

$ sysctl -a | grep nf-call-iptables

net.bridge.bridge-nf-call-iptables = 1
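
The forwarding and bridge settings above can be checked in one pass. This is a minimal sketch that parses the "key = value" lines printed by sysctl -a; the function name report_disabled_sysctls is illustrative.

```shell
# Reads `sysctl -a` output on stdin. For each key given as an argument,
# prints a message if the key is missing or its value is not 1.
# Returns 0 if every key is set to 1, 1 otherwise.
report_disabled_sysctls() {
  input=$(cat)
  rc=0
  for key in "$@"; do
    value=$(printf '%s\n' "$input" | awk -F' = ' -v k="$key" '$1 == k { print $2 }')
    if [ "$value" != "1" ]; then
      echo "$key is ${value:-unset}, expected 1"
      rc=1
    fi
  done
  return $rc
}

# Example:
#   sysctl -a 2>/dev/null | report_disabled_sysctls \
#     net.ipv4.conf.all.forwarding \
#     net.ipv4.conf.default.forwarding \
#     net.ipv4.conf.weave.forwarding \
#     net.bridge.bridge-nf-call-iptables
```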

Routing Conflicts

Ensure that the IP addresses used by servers on the local network do not overlap with the IP ranges used for pods and services (10.32.0.0/20 and 10.96.0.0/22, respectively, by default).
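
To test whether a particular server address falls inside one of these ranges, the comparison can be done with shell arithmetic. This is a minimal sketch for IPv4 addresses written without leading zeros; the function names ip_to_int and ip_in_cidr are illustrative.

```shell
# Converts a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds (exit 0) if the address in $1 lies inside the CIDR range in $2.
ip_in_cidr() {
  addr=$1
  net=${2%/*}        # network part, e.g. 10.32.0.0
  bits=${2#*/}       # prefix length, e.g. 20
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$addr") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
}

# Example, using the default pod and service ranges named above:
#   for range in 10.32.0.0/20 10.96.0.0/22; do
#     ip_in_cidr 10.32.5.7 "$range" && echo "10.32.5.7 conflicts with $range"
#   done
```

The default pod range 10.32.0.0/20 spans 10.32.0.0 through 10.32.15.255, and the default service range 10.96.0.0/22 spans 10.96.0.0 through 10.96.3.255, so any server address in either span would conflict.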