One of the most common problems enterprises encounter when installing Kubernetes is configuring their systems to be compatible with pod networking. The following steps can help identify the source of networking problems.
These steps can also help diagnose errors detected by the kURL installer's cluster networking check, which runs after the CNI plugin (weave/antrea) is installed. If the check detects a problem, the installer prints the error "There appears to be a problem with cluster networking" and may print log lines that include the message "Temporary failure in name resolution":
Post "http://kurlnet.default.svc.cluster.local:8080": dial tcp: lookup kurlnet.default.svc.cluster.local: Temporary failure in name resolution
Usually this is due to a general pod networking problem and is not specific to DNS lookup, but if the steps below do not identify the issue, review the coredns logs with the following command:
kubectl -n kube-system logs deployment/coredns
Firewalls
If no firewall is expected to be running, use systemctl status firewalld or systemctl status ufw to confirm that neither is running. If a firewall is expected to be running, temporarily disable it to determine whether it is responsible for the networking failure. (Note that after stopping your firewall, it may be necessary to restart docker and weave to re-create missing iptables rules: run systemctl restart docker and then kubectl -n kube-system rollout restart daemonset/weave-net)
IPTables
List the iptables rules with the -v flag to get a count of packets handled by each rule in the first column. Pay close attention to any rules with target DROP that have a packet count above 0, as well as to the default policy of each chain.
In the example output below, the INPUT chain has a default policy of ACCEPT, but the last rule is dropping all packets originating from an IP in the default range assigned to pods on kURL clusters. Also note that the FORWARD chain has a default policy of DROP, but because the packet count for the default policy is 0, it is not the cause of any networking problems encountered on the server.
$ iptables -L -v
Chain INPUT (policy ACCEPT 3193 packets, 533K bytes)
pkts bytes target prot opt in out source destination
511K 1583M KUBE-FIREWALL all -- any any anywhere anywhere
653K 3852M sshguard all -- any any anywhere anywhere
0 0 DROP tcp -- any any anywhere localhost tcp dpt:6784 ADDRTYPE match src-type !LOCAL ! ctstate RELATED,ESTABLISHED /* Block non-local access to Weave Net control port */
6351 450K WEAVE-NPC-EGRESS all -- weave any anywhere anywhere
480K 1573M WEAVE-IPSEC-IN all -- any any anywhere anywhere
202 18610 DROP all -- any any 10.32.0.0/22 anywhere
Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
99 11996 WEAVE-NPC-EGRESS all -- weave any anywhere anywhere /* NOTE: this must go before '-j KUBE-FORWARD' */
2 320 WEAVE-NPC all -- any weave anywhere anywhere /* NOTE: this must go before '-j KUBE-FORWARD' */
0 0 NFLOG all -- any weave anywhere anywhere state NEW nflog-group 86
0 0 DROP all -- any weave anywhere anywhere
2 170 ACCEPT all -- weave !weave anywhere anywhere
0 0 ACCEPT all -- any weave anywhere anywhere ctstate RELATED,ESTABLISHED
0 0 KUBE-FORWARD all -- any any anywhere anywhere /* kubernetes forwarding rules */
0 0 DOCKER-USER all -- any any anywhere anywhere
0 0 DOCKER-ISOLATION-STAGE-1 all -- any any anywhere anywhere
0 0 ACCEPT all -- any docker0 anywhere anywhere ctstate RELATED,ESTABLISHED
0 0 DOCKER all -- any docker0 anywhere anywhere
0 0 ACCEPT all -- docker0 !docker0 anywhere anywhere
0 0 ACCEPT all -- docker0 docker0 anywhere anywhere
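On a long ruleset, the two things to look for (DROP rules with a non-zero packet count, and a default policy of DROP that has actually handled packets) can be filtered out mechanically. The following is only a sketch: drop_hits is a hypothetical helper name, and it assumes the column layout shown in the example output above, with the packet count in the first column and the target in the third.

```shell
# Hypothetical helper: reads `iptables -L -v` output on stdin and prints
#   - chains whose default policy is DROP with a non-zero packet count
#   - DROP rules with a non-zero packet count
drop_hits() {
  awk '
    $1 == "Chain" {
      chain = $2
      # Header looks like: Chain INPUT (policy ACCEPT 3193 packets, 533K bytes)
      if ($3 == "(policy" && $4 == "DROP" && $5 != "0")
        print chain ": default policy DROP has handled " $5 " packets"
      next
    }
    # Rule rows are laid out as: pkts bytes target ...
    $3 == "DROP" && $1 != "0" {
      print chain ": " $0
    }
  '
}
```

A possible invocation is iptables -L -v | drop_hits. Each printed line is prefixed with the chain it belongs to; against the example output above, only the final INPUT rule would be reported.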
Sysctl Settings
When the host can make outbound requests to the Internet but pods cannot make the same requests, IP forwarding between interfaces may be disabled. Run sysctl -a | grep forwarding and ensure that the following settings are enabled:
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.weave.forwarding = 1
Also check that traffic crossing bridge interfaces (like weave) is passed through iptables:
$ sysctl -a | grep nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1
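The settings listed in this section can be checked in one pass by filtering the sysctl output for any of them that is not set to 1. This is a minimal sketch: disabled_forwarding is a hypothetical helper name, and it assumes the "key = value" format printed by sysctl -a and the weave interface name used above.

```shell
# Hypothetical helper: reads `sysctl -a` style lines ("key = value") on stdin
# and prints each setting named in this section whose value is not 1.
disabled_forwarding() {
  awk -F ' *= *' '
    ($1 ~ /^net\.ipv4\.conf\.(all|default|weave)\.forwarding$/ ||
     $1 == "net.bridge.bridge-nf-call-iptables") && $2 != "1" {
      print $1 " = " $2
    }
  '
}
```

A possible invocation is sysctl -a 2>/dev/null | disabled_forwarding. Empty output means every setting that is present has the value 1. A key that is missing entirely is not reported, so an absent net.bridge.bridge-nf-call-iptables line still needs to be checked by hand.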
Routing Conflicts
Ensure that IPs used by servers on the local network do not overlap with the IP range used for pods or services (10.32.0.0/20 and 10.96.0.0/22 by default).
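Whether a given server address falls inside one of these ranges can be checked with plain shell arithmetic. The sketch below is illustrative only: ip_to_int and ip_in_cidr are hypothetical helper names, the code handles IPv4 addresses without leading zeros, and it tests a single address against a range rather than comparing two subnets.

```shell
# Hypothetical helper: converts a dotted-quad IPv4 address to an integer.
# The body runs in a subshell so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Hypothetical helper: succeeds (exit status 0) if address $1 lies inside
# the CIDR range $2, for example: ip_in_cidr 10.32.1.5 10.32.0.0/20
ip_in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Build the netmask from the prefix length, then compare network parts.
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
)
```

For example, ip_in_cidr 192.168.1.10 10.32.0.0/20 && echo "overlaps pod range" prints nothing because the address is outside the range, while an address such as 10.32.1.5 would trigger the message. Running the check for each server IP against both default ranges covers the case described above.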