
[ Thanos ][ Sidecar ] Bucket Operations Failed

Revision | Date | Description
1.0 | 24.07.2024 | Init Changelog

Problem

When to report an incident

Report an incident if the problem is still unresolved eight hours after it first occurred. Beyond that point, gaps may appear in the archived data.

Severity

Value | Time
SEV-4 | 8h

Escalation

I Line

II Line

III Line

DevOps

SRE

Required access, logs, etc.

Panel

  • Kubernetes API (with kubectl, k9s, etc.) access:

    • Requires at least the list, get, and watch verbs on pods, pods/log, and statefulsets in the monitoring namespace.

  • AWS Access (Management Console or CLI):

    • Account: AWS Common

    • Resources:

      • S3: ap-thanos-storage
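
Before starting the investigation, it can help to confirm that the access listed above is actually in place. The following is a minimal Python sketch, not an official tool: it assumes kubectl and the AWS CLI are installed and already configured for the right cluster and account, and the helper names (can_i_command, missing_permissions, s3_list_command) are made up for illustration. It only relies on `kubectl auth can-i` and `aws s3 ls`.

```python
import subprocess

NAMESPACE = "monitoring"      # namespace named in this runbook
BUCKET = "ap-thanos-storage"  # S3 bucket named in this runbook

# Every (verb, resource, subresource) combination the runbook asks for.
REQUIRED = [
    (verb, resource, sub)
    for resource, sub in (("pods", None), ("pods", "log"), ("statefulsets", None))
    for verb in ("list", "get", "watch")
]


def can_i_command(verb, resource, subresource=None, namespace=NAMESPACE):
    """Build the `kubectl auth can-i` argv for a single permission check."""
    cmd = ["kubectl", "auth", "can-i", verb, resource, "--namespace", namespace]
    if subresource:
        cmd.append(f"--subresource={subresource}")
    return cmd


def missing_permissions(run=subprocess.run):
    """Return the checks that kubectl does not answer with 'yes'."""
    missing = []
    for verb, resource, sub in REQUIRED:
        result = run(can_i_command(verb, resource, sub),
                     capture_output=True, text=True)
        if result.stdout.strip() != "yes":
            missing.append((verb, resource, sub))
    return missing


def s3_list_command(bucket=BUCKET):
    """Build an AWS CLI argv that lists the top level of the Thanos bucket."""
    return ["aws", "s3", "ls", f"s3://{bucket}"]
```

Calling missing_permissions() returns an empty list when every check passes. Running the command from s3_list_command() with subprocess.run is a quick way to confirm that the bucket is reachable with the current AWS credentials.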

Monitoring

Logs

Pods:

  • Prometheus Server (container thanos-sidecar)

Environment

Every Kubernetes Cluster.

DB

None.

Steps to take

Check the logs of the thanos-sidecar container in the Prometheus Server pod to find the cause of the failed bucket operations, then act on what the logs show.
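
The log check can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive procedure: it assumes kubectl is configured for the affected cluster, that the sidecar writes logfmt-style lines containing a `level=` key, and that eight hours (the incident threshold above) is a sensible look-back window. The helper names and the pod name used in the usage note are hypothetical.

```python
import re
import subprocess

# Assumption: the sidecar logs in logfmt, so problems carry level=warn or level=error.
PROBLEM_LEVEL = re.compile(r"\blevel=(warn|error)\b")


def sidecar_logs_command(pod, namespace="monitoring", since="8h"):
    """Build the kubectl argv that fetches recent thanos-sidecar logs from a pod."""
    return [
        "kubectl", "logs", pod,
        "--container", "thanos-sidecar",
        "--namespace", namespace,
        f"--since={since}",
    ]


def problem_lines(log_text):
    """Keep only the log lines reported at warn or error level."""
    return [line for line in log_text.splitlines() if PROBLEM_LEVEL.search(line)]


def fetch_problem_lines(pod, run=subprocess.run):
    """Fetch the sidecar logs for one pod and return the warn/error lines."""
    result = run(sidecar_logs_command(pod), capture_output=True, text=True)
    return problem_lines(result.stdout)
```

For example, fetch_problem_lines("prometheus-server-0") would return the warning and error lines for a pod with that name; replace the hypothetical name with the actual Prometheus Server pod in the monitoring namespace.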

Last modified: 17 February 2025