Originally published on LinkedIn. Lightly edited for clarity.
When a Kubernetes Service stops routing traffic, it is tempting to jump straight to network plugins or kube-proxy.
Most failures are simpler: wrong selectors, missing endpoints, or DNS issues. A clean sequence helps you isolate the layer that is actually broken.
Start with the Service object
Confirm the Service exists and matches your expectations:
kubectl get svc -n <namespace>
kubectl describe svc <service> -n <namespace>
Check the type, ports, and selectors.
If the selectors do not match any pods, the Service will never get endpoints.
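As a minimal sketch of the label-matching rule (the names `web` and the ports are hypothetical), the Service's selector must match the pod labels exactly for endpoints to be created:

```yaml
# Hypothetical example: this Service only gets endpoints from
# pods labeled app=web; a pod labeled app=Web (different case)
# or web-app would not match.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web          # must match the pods' labels exactly
  ports:
    - port: 80        # port the Service listens on
      targetPort: 8080  # port the matched pods listen on
```

A mismatched `targetPort` is a close cousin of this failure: the selector matches, endpoints exist, but traffic arrives on a port the container is not listening on.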
Validate endpoints and labels
Endpoints are the direct source of truth for where traffic should go:
kubectl get endpoints <service> -n <namespace>
On newer clusters you can also inspect the equivalent EndpointSlices:
kubectl get endpointslices -n <namespace>
If the endpoint list is empty, check the pod labels and readiness state.
A pod that is not Ready will not appear, even if it matches selectors.
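Readiness gating is a common reason for an empty endpoint list even when labels match. A sketch of a container readiness probe (the path, port, and timings are hypothetical):

```yaml
# Hypothetical readiness probe: until it passes, the pod stays
# NotReady and is excluded from the Service's endpoints, even
# though its labels match the selector.
readinessProbe:
  httpGet:
    path: /healthz   # assumed health endpoint
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
```

If the probe path or port is wrong, the pod will run forever without ever receiving Service traffic.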
Test DNS resolution
Cluster DNS problems can masquerade as Service issues.
From a test pod in the same namespace:
nslookup <service>
From a different namespace, use the fully qualified name:
nslookup <service>.<namespace>.svc.cluster.local
If DNS is failing, you have a CoreDNS issue, not a Service routing issue.
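If no suitable test pod exists, one option is a throwaway pod that runs the lookup and exits (the pod name and image tag here are assumptions, not from the original):

```yaml
# Hypothetical one-off DNS test pod; delete it after reading
# its logs. Replace <service> and <namespace> with real values.
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
spec:
  restartPolicy: Never
  containers:
    - name: dns-test
      image: busybox:1.36    # assumed image; any image with nslookup works
      command: ["nslookup", "<service>.<namespace>.svc.cluster.local"]
```

Check the result with kubectl logs dns-test, then clean up with kubectl delete pod dns-test.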
Check kube-proxy and network policies
If endpoints and DNS are correct but traffic still fails:
- Validate kube-proxy status and logs.
- Confirm there are no network policies blocking the path.
- Test direct pod IP access to isolate Service routing.
At this stage, you are troubleshooting the network layer, not Kubernetes metadata.
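When a default-deny NetworkPolicy is in place, traffic to otherwise-healthy endpoints silently drops until an explicit allow exists. A sketch of such an allow rule, assuming hypothetical labels and port:

```yaml
# Hypothetical policy: allows ingress to pods labeled app=web
# on TCP 8080, but only from pods labeled role=client. Any
# traffic outside this rule is still subject to other policies.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-to-web
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: client
      ports:
        - protocol: TCP
          port: 8080
```

Note that NetworkPolicy is enforced by the CNI plugin; on a plugin without policy support, these objects exist but do nothing, which is itself a useful diagnostic fact.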
External access considerations
For LoadBalancer and NodePort services, confirm that the external path actually reaches the node and that security groups or firewall rules allow the traffic.
The Service can be healthy inside the cluster and still be unreachable from outside.
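For reference, a NodePort Service opens the same port on every node, and that port is what external firewalls or security groups must allow (all values below are hypothetical):

```yaml
# Hypothetical NodePort Service: external clients reach it at
# <any-node-ip>:30080, so firewall rules must permit that port.
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080   # must fall in the NodePort range (default 30000-32767)
```

A quick isolation test from outside the cluster is to try the node IP and nodePort directly; if that fails while in-cluster access works, the problem is the external path, not the Service.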
2026 Perspective
The cluster stack is more complex now, but the troubleshooting order still holds: selectors, endpoints, DNS, then routing.
Most “network” issues are metadata mismatches or readiness gating. A disciplined sequence avoids hours of false leads.