Kubernetes application troubleshooting: part 1 - services

This is part one in a series of posts detailing common debugging methods for applications running in Kubernetes, along with a light look at some of the internal k8s mechanisms to add context that can help while debugging.

This post will focus on pod and service networking. I will be assuming that the app we’re trying to debug is at least running and healthy.

KOTS will be used as an example here, but these steps aren’t application specific.

To start, the simplest test you can perform is to try to connect to the app locally from a node in the cluster.

On a kURL cluster, KOTS has a NodePort type service listening on port 8800, and the KOTS application accepts both HTTP and HTTPS connections through its port so that the user can configure TLS via a setup page. After TLS is configured, it will issue a 301 response to HTTP traffic, instructing the client to retry the connection via HTTPS.
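
As a quick illustration of that redirect behaviour: once TLS has been configured, you would expect a plain HTTP request to the port to come back with a 301 pointing at the HTTPS URL. The node IP below is a placeholder and the exact headers will vary:

$ curl -sI http://<node-ip>:8800
HTTP/1.1 301 Moved Permanently
Location: https://<node-ip>:8800/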

We start by confirming how the cluster is configured to direct traffic to our app. To list all services in the cluster we can run kubectl get svc -A, which shows services from all namespaces.

The output should look something like this:

NAMESPACE        NAME                                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                         AGE
...
default          kotsadm                              ClusterIP   10.96.2.81    <none>        3000/TCP                        17h
default          kotsadm-rqlite                       ClusterIP   10.96.0.248   <none>        4001/TCP                        17h
default          kotsadm-rqlite-headless              ClusterIP   None          <none>        4001/TCP                        17h
default          kurl-proxy-kotsadm                   NodePort    10.96.2.89    <none>        8800:8800/TCP                   17h
...

I’ve shortened this list for brevity.

The third column is the service type; we can see there are two types present in this list. ClusterIP type services are internal to the cluster, while NodePort type services expose a port on each node of the cluster.

A NodePort type service lists its port mapping in the sixth column of this output. The first port in the colon-delimited pair is the service’s own port, which forwards on to the target port of the pod the service maps to, and the second is the port the service has opened on each node. In this example both are 8800.
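
If you want to see exactly how those ports are defined on the service itself, you can dump its port spec. The field names here (port, targetPort, nodePort) are the standard Kubernetes Service fields; the exact values and any extra fields in your output may differ:

$ kubectl get svc kurl-proxy-kotsadm -o jsonpath='{.spec.ports}'
[{"nodePort":8800,"port":8800,"protocol":"TCP","targetPort":8800}]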

We can try to communicate with our app over this NodePort service using curl, but first we have to check whether the service maps to a pod; without an endpoint the service will be unable to forward traffic.

$ kubectl describe service kurl-proxy-kotsadm 
Name:                     kurl-proxy-kotsadm
...
Endpoints:                10.32.0.46:8800
...
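
Another quick way to check the same thing is to query the endpoints object for the service directly; the output below is illustrative and your pod IP will differ:

$ kubectl get endpoints kurl-proxy-kotsadm
NAME                 ENDPOINTS         AGE
kurl-proxy-kotsadm   10.32.0.46:8800   17h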

The endpoint here corresponds to a pod IP, and we can confirm which pod this endpoint points to with:

$ kubectl get pods -Ao wide | grep "10.32.0.46"
default          kurl-proxy-kotsadm-7cd64b7778-q9w49            1/1     Running     1 (37m ago)   17h   10.32.0.46      kurl   <none>           <none>

If the service does not have any endpoints, then it’s likely that there is a mismatch between the service’s selector and the target pod’s labels.

You should compare the output of the service selector field with the pod’s labels:

$ kubectl get svc kotsadm -o jsonpath='{.spec.selector}'
{"app":"kotsadm"}

$ kubectl get pods kotsadm-57bd95d6b7-5r9vm -o jsonpath='{.metadata.labels}'
{"app":"kotsadm","kots.io/backup":"velero","kots.io/kotsadm":"true","pod-template-hash":"57bd95d6b7"}

Now let’s identify an address that kube-proxy, the component responsible for handling NodePort connections, can be reached on, using this quick check:

$ kubectl get pod -n kube-system -l k8s-app=kube-proxy -o jsonpath='{.items[0].status.hostIP}'
192.168.8.220

This returns the host IP of the node the kube-proxy pod is running on, which is the address we can use to reach the NodePort.
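
If you’d rather not go through the kube-proxy pod, the node addresses are also visible with kubectl get nodes -o wide; any node’s INTERNAL-IP should work for a NodePort test. The output below is illustrative and trimmed:

$ kubectl get nodes -o wide
NAME   STATUS   ROLES                  AGE   VERSION   INTERNAL-IP     ...
kurl   Ready    control-plane,master   17h   ...       192.168.8.220   ...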

Now we can run curl -vvvkL 192.168.8.220:8800, which should return an HTML response.

The flags used here are:

- vvv: be very verbose
- k: do not attempt to validate TLS certificates
- L: follow redirects
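
If the NodePort curl fails, a useful way to narrow things down is to curl the kotsadm ClusterIP service directly from a node, using the CLUSTER-IP and port from the service listing above. If that works while the NodePort does not, the problem is more likely in the NodePort/kube-proxy layer than in the app itself:

$ curl -vvvkL 10.96.2.81:3000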

If we’ve gotten to this point, we’ve confirmed that Kubernetes service networking is functioning as expected, and we can move on to the next layer: ingress and load balancing.