
[ Thanos ][ Sidecar ] No connection to started Prometheus

Revision | Date       | Description
1.0      | 24.07.2024 | Init Changelog

Problem

When to report an incident

Only if all Prometheus instances on the K8s cluster are not working (Prometheus Server with Thanos Sidecar should run in HA mode by default).

Severity

Value | Time
SEV-3 | No Prometheus Server Pod is working on the K8s cluster.

Escalation

I Line

II Line

III Line

DevOps

SRE

Requires access / logs, etc.

Panel

  • Kubernetes API access (with kubectl, k9s, etc.):

    • Needs at least: LIST, GET, WATCH on pods, pods/log, and statefulsets in the monitoring Namespace.

Monitoring

Logs

Pods:

  • Prometheus Server

Environment

Every Kubernetes Cluster.

DB

None.

Steps to take

Check the logs of the thanos-sidecar container to identify the problem, then take the matching action.

Possible problems:

  • The Thanos configuration has changed and Thanos no longer works.

  • The Prometheus configuration has changed and Prometheus no longer works.

  • Prometheus has problems and cannot start.
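When going through the thanos-sidecar logs, it can help to match each line against the error fragments covered by the Solution sections of this page. The following is only a minimal sketch: the `KNOWN_PROBLEMS` table and the `classify` function are hypothetical names introduced for illustration, and the patterns cover only the two log messages quoted on this page.

```python
import re
from typing import Optional

# Hypothetical lookup table: log fragment -> name of the matching
# "Solution" section on this page. Only the two errors documented
# here are covered; extend it as new cases are added.
KNOWN_PROBLEMS = [
    (re.compile(r"connect: connection refused"),
     "Connection Refused"),
    (re.compile(r"no external labels configured"),
     "Thanos not identifying Prometheus"),
]


def classify(log_line: str) -> Optional[str]:
    """Return the matching Solution section name, or None if unknown."""
    for pattern, section in KNOWN_PROBLEMS:
        if pattern.search(log_line):
            return section
    return None
```

A line that returns `None` is not one of the documented cases and needs manual investigation.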

Solution: Connection Refused

level=warn ts=2020-04-18T03:07:00.512902927Z caller=intrumentation.go:54 msg="changing probe status" status=not-ready reason="request flags against http://localhost:9090/api/v1/status/config: Get \"http://localhost:9090/api/v1/status/config\": dial tcp 127.0.0.1:9090: connect: connection refused"
  • Make sure that Prometheus is running when Thanos starts. The "connection refused" error means that no server is listening on localhost:9090, which is the Prometheus address in this case.
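Whether anything is listening on the Prometheus port can be checked with a plain TCP connection attempt. This is a rough sketch, not a full health check: the function name `prometheus_is_listening` is hypothetical, and the code assumes it runs somewhere that `localhost:9090` actually reaches the Prometheus container (for example inside the Pod, or through a port-forward to it).

```python
import socket


def prometheus_is_listening(host: str = "localhost",
                            port: int = 9090,
                            timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError
        # or a timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if prometheus_is_listening():
        print("Something is listening on localhost:9090")
    else:
        print("Nothing is listening on localhost:9090")
```

A `True` result only shows that the port accepts connections; it does not prove that Prometheus itself is healthy.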

Solution: Thanos not identifying Prometheus

level=info ts=2020-04-18T03:16:32.158536285Z caller=grpc.go:137 service=gRPC/server component=sidecar msg="internal server shutdown" err="no external labels configured on Prometheus server, uniquely identifying external labels must be configured"
  • Thanos requires unique external_labels for further processing. Make sure that the external_labels in the Prometheus config file are not empty and are globally unique.
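The two conditions (not empty, globally unique) can be expressed as a small check over the external_labels of every Prometheus instance. This is an illustrative sketch only: the function `find_label_problems`, the instance names, and the `cluster` / `replica` label names are assumptions, and the label sets are expected to be collected by hand from each instance's configuration.

```python
from typing import Dict, List


def find_label_problems(instances: Dict[str, Dict[str, str]]) -> List[str]:
    """Report instances whose external_labels are empty or duplicated.

    `instances` maps an instance name to its external_labels mapping.
    """
    problems: List[str] = []
    seen: Dict[tuple, str] = {}
    for name, labels in instances.items():
        if not labels:
            problems.append(f"{name}: external_labels is empty")
            continue
        # Two instances collide when their full label sets are identical.
        key = tuple(sorted(labels.items()))
        if key in seen:
            problems.append(f"{name}: same external_labels as {seen[key]}")
        else:
            seen[key] = name
    return problems


if __name__ == "__main__":
    # Hypothetical example data, not taken from a real cluster.
    example = {
        "prometheus-0": {"cluster": "a", "replica": "0"},
        "prometheus-1": {"cluster": "a", "replica": "0"},
        "prometheus-2": {},
    }
    for problem in find_label_problems(example):
        print(problem)
```

An empty result means that every instance has a non-empty label set and no two instances share the same one.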

Last modified: 17 February 2025