I'd like to export the status of each partition too. We can always write some logi

Export partition status about burrow_exporter HOT 6 CLOSED

jirwin commented on July 17, 2024

Export partition status

from burrow_exporter.

Comments (6)

kanga333 commented on July 17, 2024

@ercliou
Hello.
I also want this metric.
You seem to have made some changes after forking, but are you planning to send a patch upstream?

ercliou-zz commented on July 17, 2024

I ended up implementing this by sending every state series at every scrape. When a state is not the current one, its series is sent with the value 0. This multiplies the number of time series by five per partition (which could be a problem if you have a lot of them).
e.g.

kafka_burrow_partition_state{cluster="MY_CLUSTER",group="MY_GROUP",partition="13",topic="MY_TOPIC",state="OK"} 1
kafka_burrow_partition_state{cluster="MY_CLUSTER",group="MY_GROUP",partition="13",topic="MY_TOPIC",state="STOP"} 0
kafka_burrow_partition_state{cluster="MY_CLUSTER",group="MY_GROUP",partition="13",topic="MY_TOPIC",state="REWIND"} 0
kafka_burrow_partition_state{cluster="MY_CLUSTER",group="MY_GROUP",partition="13",topic="MY_TOPIC",state="STALL"} 0
kafka_burrow_partition_state{cluster="MY_CLUSTER",group="MY_GROUP",partition="13",topic="MY_TOPIC",state="WARN"} 0

This way each of them stays an independent time series. The reason is that I can then query lag and status together in Grafana, per partition.
Query:

kafka_burrow_partition_lag{group="MY_GROUP",topic="MY_TOPIC"}
* on (topic, partition, group) group_left(status) 
(kafka_burrow_partition_status{group="MY_GROUP",topic="MY_TOPIC"} == 1)
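
For anyone who wants to see the one-series-per-state idea in code, here is a minimal Go sketch. It is not the exporter's actual implementation: it uses only the standard library and prints exposition-style lines, whereas a real exporter would more likely go through a Prometheus client library. The function name `oneHotSamples` and the `partitionStates` slice are made up for illustration, and the state list is simply copied from the sample output above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// partitionStates is the set of states taken from the sample output in this
// thread. It is an assumption that this list is complete.
var partitionStates = []string{"OK", "STOP", "REWIND", "STALL", "WARN"}

// oneHotSamples (hypothetical helper) returns one line per known state.
// The line for the current state carries the value 1, all others carry 0,
// so every state stays an independent time series.
func oneHotSamples(labels map[string]string, current string) []string {
	// Sort label names so the output is deterministic.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}

	lines := make([]string, 0, len(partitionStates))
	for _, s := range partitionStates {
		value := 0
		if s == current {
			value = 1
		}
		// Copy the base labels and append the state label last.
		all := append(append([]string{}, pairs...), fmt.Sprintf("state=%q", s))
		lines = append(lines, fmt.Sprintf("kafka_burrow_partition_state{%s} %d",
			strings.Join(all, ","), value))
	}
	return lines
}

func main() {
	labels := map[string]string{
		"cluster":   "MY_CLUSTER",
		"group":     "MY_GROUP",
		"partition": "13",
		"topic":     "MY_TOPIC",
	}
	for _, line := range oneHotSamples(labels, "OK") {
		fmt.Println(line)
	}
}
```

Because exactly one series per partition has the value 1, filtering with `== 1` as in the query above leaves a single row per partition to join against the lag metric.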

I could send a patch if @jirwin agrees with this :)

jirwin commented on July 17, 2024

I'm +1 to this. Partition count isn't generally unbounded. Maybe it could be enabled by a command line flag, so people can use their own judgement as to whether the surge in new time series is acceptable to them. Maybe --per-partition-stats or something?
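
A rough sketch of what such an opt-in flag could look like, using Go's standard `flag` package. This is only an illustration: the `config` struct and `parseArgs` function are hypothetical, the flag name is the one floated above, and the exporter itself may use a different CLI library.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds the hypothetical option used in this sketch.
type config struct {
	perPartitionStats bool
}

// parseArgs (hypothetical helper) parses the arguments into a config.
// The flag defaults to false, so the extra per-partition series are opt-in.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("burrow-exporter", flag.ContinueOnError)
	var cfg config
	fs.BoolVar(&cfg.perPartitionStats, "per-partition-stats", false,
		"export per-partition status series (adds several time series per partition)")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs([]string{"--per-partition-stats"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("per-partition stats enabled:", cfg.perPartitionStats)
}
```

The collector would then check `cfg.perPartitionStats` before emitting the partition status series, leaving the default series count unchanged for everyone else.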

shibug commented on July 17, 2024

How about we define a numeric scheme for the value of this time series? That would avoid the fivefold time series bloat. Our system has 2525 partitions across 52 topics, so I am definitely worried about the bloat.

NOTFOUND = 1
OK = 2
WARN = 3
ERR = 4
STOP = 5
STALL = 6

kafka_burrow_partition_state{cluster="MY_CLUSTER",group="MY_GROUP",partition="13",topic="MY_TOPIC"} 2
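
The numeric scheme could be sketched in Go roughly as follows. The values are the ones proposed above. The names `stateValues` and `stateValue` are hypothetical, and how unknown states should be handled is an open assumption (here the lookup just reports that the state was not found).

```go
package main

import "fmt"

// stateValues maps each state name to the numeric value proposed in this
// thread. One series per partition carries this number as its sample value.
var stateValues = map[string]float64{
	"NOTFOUND": 1,
	"OK":       2,
	"WARN":     3,
	"ERR":      4,
	"STOP":     5,
	"STALL":    6,
}

// stateValue (hypothetical helper) returns the numeric code for a state and
// whether the state is part of the scheme at all.
func stateValue(state string) (float64, bool) {
	v, ok := stateValues[state]
	return v, ok
}

func main() {
	for _, s := range []string{"OK", "STALL", "REWIND"} {
		if v, ok := stateValue(s); ok {
			fmt.Printf("%s -> %g\n", s, v)
		} else {
			fmt.Printf("%s -> not in the scheme\n", s)
		}
	}
}
```

Note that REWIND, which appears in the earlier per-state sample, is not in this list, so a real implementation would need to decide on a value for it.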

ercliou-zz commented on July 17, 2024

Hi @shibug, I explained a bit of the reasoning behind this in the PR above (it centers mostly on Grafana).

We have 15k partitions and haven't encountered performance problems (yet).
I can't look into the command line flag right now. If someone would like to look into it, I'd appreciate it.

jirwin commented on July 17, 2024

Fixed by #19.
