kafka_exporter project: https://github.com/danielqsj/kafka_exporter
Deploying kafka_exporter with docker-compose
# Deploy multiple kafka_exporter instances with docker-compose, one exporter per Kafka cluster
# cat docker-compose.yml
version: '3.1'
services:
  kafka-exporter-opslogs:
    image: bitnami/kafka-exporter:latest
    command:
      - '--kafka.server=10.2.19.43:9092'
      - '--kafka.server=10.2.24.62:9092'
      - '--kafka.server=10.5.98.190:9092'
      - '--kafka.version=3.2.1'
    restart: always
    ports:
      - 9310:9308
  kafka-exporter-prod:
    image: bitnami/kafka-exporter:latest
    command:
      - '--kafka.server=192.168.53.99:9092'
      - '--kafka.server=192.168.53.53:9092'
      - '--kafka.server=192.168.53.96:9092'
    restart: always
    ports:
      - 9311:9308
Note: list the address of every Kafka broker in the cluster. For Kafka 3, the version must be specified with the --kafka.version flag.
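Once the containers are up, a quick sanity check is to confirm that each exporter reports the expected number of brokers. The following is a minimal sketch, not part of the original setup: it parses Prometheus text-format output and pulls out the kafka_brokers gauge. The helper name `read_gauge` and the `SAMPLE` text are hypothetical; in practice the text would come from an exporter's /metrics endpoint (for example port 9310 or 9311 on the Docker host).

```python
from typing import Optional


def read_gauge(metrics_text: str, metric: str = "kafka_brokers") -> Optional[float]:
    """Return the value of the first unlabelled sample of `metric`, or None."""
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # An unlabelled sample looks like: "<metric_name> <value>".
        if len(parts) >= 2 and parts[0] == metric:
            return float(parts[1])
    return None


# Hypothetical sample of what an exporter's /metrics page might contain.
SAMPLE = """\
# HELP kafka_brokers Number of Brokers in the Kafka Cluster.
# TYPE kafka_brokers gauge
kafka_brokers 3
"""

print(read_gauge(SAMPLE))
```

If the function returns None or a number lower than the number of --kafka.server entries, the exporter is probably not reaching every broker.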
Configuring a Prometheus job to scrape kafka-exporter
- job_name: 'kafka-exporter'
  metrics_path: /metrics
  scrape_interval: 15s
  scrape_timeout: 10s
  static_configs:
    - targets:
        - 10.0.0.26:9310
      labels:
        name: kafka-opslogs
    - targets:
        - 10.0.0.26:9311
      labels:
        name: kafka-prod
Note: every kafka-exporter target must carry a name label, because the dashboard relies on this label.
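Since a missing name label silently breaks the dashboard, it can help to check the target list before reloading Prometheus. The sketch below is an illustration under assumptions: the static_configs from the job are mirrored by hand as plain Python data, and the helper `targets_missing_name` is a hypothetical name, not something Prometheus provides.

```python
from typing import List

# The static_configs entries of the scrape job, written out as Python data.
STATIC_CONFIGS = [
    {"targets": ["10.0.0.26:9310"], "labels": {"name": "kafka-opslogs"}},
    {"targets": ["10.0.0.26:9311"], "labels": {"name": "kafka-prod"}},
]


def targets_missing_name(static_configs: List[dict]) -> List[str]:
    """Return every target whose entry has no non-empty `name` label."""
    missing: List[str] = []
    for entry in static_configs:
        if not entry.get("labels", {}).get("name"):
            missing.extend(entry.get("targets", []))
    return missing


print(targets_missing_name(STATIC_CONFIGS))
```

An empty list means every exporter target will show up in the dashboard's name selector.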
KAFKA Grafana Dashboard
Grafana dashboard ID: 21078
Grafana dashboard URL: https://grafana.com/grafana/dashboards/21078
Project repository: https://github.com/starsliao/Prometheus/tree/master/kafka
Global overview, consumers and topics, anomaly and lag analysis
Per-partition details
Prometheus alerting rules
- name: kafka
  rules:
    - alert: KAFKA_brokers_abnormal
      expr: kafka_broker_info != 1
      for: 2m
      labels:
        severity: critical
      annotations:
        description: "{{ $labels.name }} has an abnormal broker: {{ $labels.address }}"
    - alert: E-commerce production KAFKA overall message lag
      expr: sum(kafka_consumergroup_lag_sum{job="kafka-exporter"}) by (name, consumergroup, topic) > 5000
      for: 2m
      labels:
        severity: critical
      annotations:
        description: "[Environment] {{ $labels.name }}\n[Consumer group] {{ $labels.consumergroup }}\n[Topic] {{ $labels.topic }} [Lag]: {{ $value | printf \"%.2f\" }}"
    - alert: E-commerce production KAFKA partition message lag
      # hour() is in UTC; +8 shifts to UTC+8, so this only fires from 07:00 through 21:59 local time.
      expr: (sum(kafka_consumergroup_lag{job="kafka-exporter"}) by (name, consumergroup, topic, partition) > 1500) and on() (hour()+8)%24 >= 7 <= 21
      for: 3m
      labels:
        severity: critical
      annotations:
        description: "[Environment] {{ $labels.name }}\n[Consumer group] {{ $labels.consumergroup }}\n[Topic] {{ $labels.topic }} [Partition] {{ $labels.partition }} [Lag]: {{ $value | printf \"%.2f\" }}"
    - alert: E-commerce production KAFKA too many partitions
      expr: sum by (name) (kafka_topic_partitions{job="kafka-exporter", topic!~"__.*"}) > 1500
      for: 2m
      labels:
        severity: critical
      annotations:
        description: "{{ $labels.name }} current partition count: {{ $value | printf \"%.2f\" }}"
    - alert: E-commerce production KAFKA_brokers lost
      expr: kafka_brokers{job="kafka-exporter"} < 3
      for: 2m
      labels:
        severity: critical
      annotations:
        description: "{{ $labels.name }} current broker count: {{ $value | printf \"%.2f\" }}"
    - alert: E-commerce production KAFKA_TopicsReplicas
      expr: sum(kafka_topic_partition_in_sync_replica{job="kafka-exporter"}) by (name, topic) < 1
      for: 2m
      labels:
        severity: critical
      annotations:
        description: "{{ $labels.name }} Kafka topic in-sync partition: {{ $value | printf \"%.2f\" }}"
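The partition lag rule is gated by the clause (hour()+8)%24 >= 7 <= 21. PromQL's hour() returns the hour in UTC, so adding 8 and taking the result modulo 24 yields the hour in UTC+8, and the two comparisons keep only hours 7 through 21. The arithmetic can be sketched in Python as follows; the function name `in_alert_window` and its parameters are illustrative and do not appear in the original rules.

```python
def in_alert_window(utc_hour: int, offset: int = 8, start: int = 7, end: int = 21) -> bool:
    """Mirror the PromQL gate (hour()+offset)%24 >= start <= end."""
    # Shift the UTC hour into the local timezone, wrapping past midnight.
    local_hour = (utc_hour + offset) % 24
    # Both bounds are inclusive, as in the PromQL comparison chain.
    return start <= local_hour <= end


for utc_hour in (22, 23, 13, 14):
    local = (utc_hour + 8) % 24
    print(f"UTC {utc_hour:02d}h -> local {local:02d}h -> active: {in_alert_window(utc_hour)}")
```

Because the upper bound is inclusive on the hour, the rule stays active through 21:59 local time and goes quiet at 22:00.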