Testing Kubernetes Network Connectivity with the Goldpinger Troubleshoot Collector
Network connectivity issues are among the most challenging problems to diagnose in Kubernetes clusters. Pods can’t reach each other, services are unreachable, or mysterious timeouts occur seemingly at random. The goldpinger troubleshoot collector provides a powerful way to quickly assess your cluster’s network health and identify connectivity problems.
What is Goldpinger?
Goldpinger is a network monitoring tool originally developed by Bloomberg that tests connectivity between nodes in a Kubernetes cluster. It works by deploying pods across your cluster nodes and having them ping each other to create a comprehensive network connectivity matrix.
The troubleshoot framework includes a goldpinger collector that can temporarily deploy goldpinger to your cluster, run connectivity tests, and collect the results in a support bundle for analysis.
How the Goldpinger Collector Works
The goldpinger collector makes a request to the <host>/check_all endpoint. If the collector runs inside a Kubernetes cluster, it makes the HTTP request directly to the goldpinger endpoint (http://goldpinger.<namespace>.svc.cluster.local:80/check_all). Otherwise, the collector attempts to launch a pod in the cluster, configured with the podLaunchOptions parameter, and makes the request from within the running container.
If goldpinger is not installed, the collector will attempt to temporarily install it, and uninstall goldpinger once the collector has completed.
This automatic behavior means the collector works in three scenarios:
- Existing goldpinger installation - Queries the existing service directly
- No goldpinger, running in-cluster - Temporarily installs goldpinger, collects data, then cleans up
- No goldpinger, running externally - Launches a pod to make internal requests to temporary goldpinger
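The decision flow described above can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the collector's actual implementation; the function names check_all_url and collection_plan are hypothetical.

```python
from typing import List


def check_all_url(namespace: str) -> str:
    """Build the in-cluster goldpinger URL quoted in the text above."""
    return f"http://goldpinger.{namespace}.svc.cluster.local:80/check_all"


def collection_plan(goldpinger_installed: bool, in_cluster: bool) -> List[str]:
    """Return the ordered steps the collector is described as taking.

    Hypothetical model: it mirrors the prose, nothing more.
    """
    steps = []
    if not goldpinger_installed:
        steps.append("install goldpinger temporarily")
    if in_cluster:
        steps.append("request check_all directly")
    else:
        steps.append("launch a pod and request check_all from it")
    if not goldpinger_installed:
        steps.append("uninstall goldpinger")
    return steps


if __name__ == "__main__":
    print(check_all_url("kurl"))
    for step in collection_plan(goldpinger_installed=False, in_cluster=False):
        print("-", step)
```

Keeping the install/uninstall steps separate from the request step makes it easy to see that cleanup only happens when the collector performed the installation itself.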
Output Files
The collector's results are stored in the goldpinger/ directory of the support bundle.
The collector generates one of two files:
goldpinger/check_all.json
This file contains the response from the <host>/check_all endpoint, including the full connectivity matrix.
goldpinger/error.txt
If fetching the results fails, goldpinger/error.txt contains the error message. The bundle contains either goldpinger/check_all.json or goldpinger/error.txt, but never both.
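Because exactly one of the two files is present, a script that post-processes a bundle can branch on which one it finds. The sketch below assumes the bundle archive has already been extracted and that the goldpinger/ directory sits directly under the path you pass in; read_goldpinger_result is a hypothetical helper name.

```python
import json
from pathlib import Path


def read_goldpinger_result(bundle_dir):
    """Return ("ok", parsed_json) or ("error", message) for an extracted bundle.

    Assumes bundle_dir contains the goldpinger/ directory described above.
    """
    base = Path(bundle_dir) / "goldpinger"
    check_all = base / "check_all.json"
    error = base / "error.txt"
    if check_all.is_file():
        return "ok", json.loads(check_all.read_text())
    if error.is_file():
        return "error", error.read_text().strip()
    raise FileNotFoundError(f"no goldpinger output under {base}")
```

Returning a tagged tuple rather than raising on error.txt keeps the "collector ran but could not fetch results" case distinct from "the collector output is missing altogether".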
The simplest way to run the goldpinger collector is with a minimal spec:
1. Create the Troubleshoot Spec
cat > goldpinger-spec.yaml << 'EOF'
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: goldpinger
spec:
  collectors:
    - goldpinger: {}
  analyzers:
    - goldpinger: {}
EOF
2. Run the Collector
kubectl support-bundle goldpinger-spec.yaml
That’s it! The collector will:
- Automatically install goldpinger if it’s not already running
- Deploy goldpinger pods across your cluster nodes
- Test connectivity between all nodes
- Collect the results
- Clean up the temporary goldpinger installation
- Generate a support bundle with the connectivity data
Understanding the Results
After running the collector, you’ll get a support bundle containing a goldpinger/ directory with connectivity results. Here’s what to look for:
Healthy Single-Node Cluster Example
{
  "hosts": [
    {
      "hostIP": "10.0.0.191",
      "podIP": "10.244.62.25",
      "podName": "ts-goldpinger-jzc4f"
    }
  ],
  "responses": {
    "ts-goldpinger-jzc4f": {
      "HostIP": "10.0.0.191",
      "OK": true,
      "PodIP": "10.244.62.25",
      "response": {
        "podResults": {
          "ts-goldpinger-jzc4f": {
            "HostIP": "10.0.0.191",
            "OK": true,
            "PingTime": "2025-06-13T20:32:08.504Z",
            "PodIP": "10.244.62.25",
            "response-time-ms": 1,
            "status-code": 200
          }
        }
      }
    }
  }
}
Key Metrics to Check
- OK: true/false - Overall connectivity status
- response-time-ms - Network latency between nodes
- status-code: 200 - HTTP response indicating successful connectivity
- Multiple hosts - In multi-node clusters, you should see entries for each node
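To read these metrics without scrolling through nested JSON, the results can be flattened into one row per probe. The sketch below assumes the check_all.json layout shown in the single-node example (responses, then response.podResults); pair_results is a hypothetical helper name.

```python
def pair_results(check_all):
    """Yield (source_pod, target_pod, ok, latency_ms, status_code) per probe."""
    for source, entry in check_all.get("responses", {}).items():
        # "response" may be absent or null when a pod could not be queried.
        pod_results = (entry.get("response") or {}).get("podResults", {})
        for target, result in pod_results.items():
            yield (
                source,
                target,
                result.get("OK", False),
                result.get("response-time-ms"),
                result.get("status-code"),
            )


if __name__ == "__main__":
    import json
    import sys

    # Usage: python pairs.py path/to/goldpinger/check_all.json
    with open(sys.argv[1]) as fh:
        for row in pair_results(json.load(fh)):
            print(*row, sep="\t")
```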
Multi-Node Cluster Results
In a healthy multi-node cluster, you’ll see connectivity results between all node pairs:
- Node A → Node B, Node C, Node D (all OK: true)
- Node B → Node A, Node C, Node D (all OK: true)
- And so on…
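In a larger cluster it is easier to check the full mesh programmatically than by eye. The sketch below takes the pod names from the hosts list and reports every source/target pair that is either absent from the results or marked as failed. It assumes the JSON layout from the single-node example, including the self-probe that example shows; missing_or_failed_pairs is a hypothetical helper name.

```python
def missing_or_failed_pairs(check_all):
    """Return (source, target, reason) for every pair that is not OK."""
    pods = [host["podName"] for host in check_all.get("hosts", [])]
    responses = check_all.get("responses", {})
    problems = []
    for source in pods:
        entry = responses.get(source, {})
        results = (entry.get("response") or {}).get("podResults", {})
        for target in pods:
            result = results.get(target)
            if result is None:
                problems.append((source, target, "missing"))
            elif not result.get("OK", False):
                problems.append((source, target, "failed"))
    return problems
```

An empty list means every pod listed under hosts reported a successful probe to every pod, which is the healthy state described above.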
Common Network Issues Goldpinger Can Detect
1. Node Isolation
{
  "OK": false,
  "Error": "connection timeout"
}
Diagnosis: A node can’t reach other nodes, possibly due to firewall rules or network misconfiguration.
2. High Latency
{
  "OK": true,
  "response-time-ms": 2500
}
Diagnosis: Nodes can connect but with high latency (>1000ms may indicate network issues).
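The 1000 ms figure above can be applied mechanically to a collected check_all.json. The following sketch lists probes that succeeded but exceeded a threshold; the threshold default simply echoes the rule of thumb in the text, and slow_pairs is a hypothetical helper name.

```python
def slow_pairs(check_all, threshold_ms=1000):
    """Return (source, target, latency_ms) for successful but slow probes."""
    slow = []
    for source, entry in check_all.get("responses", {}).items():
        pod_results = (entry.get("response") or {}).get("podResults", {})
        for target, result in pod_results.items():
            latency = result.get("response-time-ms")
            if result.get("OK") and latency is not None and latency > threshold_ms:
                slow.append((source, target, latency))
    return slow
```

Failed probes are deliberately excluded here, since they belong to the node isolation case rather than the latency case.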
3. Partial Connectivity
Some node pairs work fine while others fail. This pattern often indicates asymmetric routing or security group issues.
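One-directional failures are a useful hint for asymmetric routing, and they are easy to miss when reading the JSON by hand. The sketch below reports pairs where the probe succeeds in one direction and fails in the other. It assumes the JSON layout from the single-node example; asymmetric_pairs is a hypothetical helper name.

```python
def asymmetric_pairs(check_all):
    """Return sorted (source, target) pairs where source->target is OK
    but the reverse probe target->source was recorded as failed."""
    ok = {}
    for source, entry in check_all.get("responses", {}).items():
        pod_results = (entry.get("response") or {}).get("podResults", {})
        for target, result in pod_results.items():
            ok[(source, target)] = bool(result.get("OK"))
    return sorted(
        (source, target)
        for (source, target), good in ok.items()
        if good and ok.get((target, source)) is False
    )
```

Pairs whose reverse probe is missing entirely are not reported, because a missing result says nothing about direction.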
Advanced Configuration
The goldpinger collector supports several configuration options for customizing its behavior:
Custom Namespace
collectors:
  - goldpinger:
      namespace: kurl # Look for goldpinger in 'kurl' namespace
Custom Goldpinger Image
collectors:
  - goldpinger:
      image: my-registry/goldpinger:custom-tag # Use custom goldpinger image
Collection Delay
collectors:
  - goldpinger:
      collectDelay: 10s # Wait 10 seconds after goldpinger starts
Pod Launch Options
When the collector needs to make its request from within the cluster, you can customize the pod it launches:
collectors:
  - goldpinger:
      namespace: kurl
      podLaunchOptions:
        namespace: monitoring # Launch pod in monitoring namespace
        image: alpine:latest # Use alpine (needs wget)
        imagePullSecret: my-secret # Use custom image pull secret
        serviceAccountName: goldpinger # Use specific service account