Deploying common Prometheus exporters
- Prometheus deployment
- Node exporter
- Process exporter
- Redis exporter
- MySQL exporter
- OracleDB exporter
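Prometheus deployment
Local deployment:
# Replace <VERSION> with the release to install (wget cannot expand wildcards in URLs)
wget https://github.com/prometheus/prometheus/releases/download/v<VERSION>/prometheus-<VERSION>.linux-amd64.tar.gz
tar xvf prometheus-<VERSION>.linux-amd64.tar.gz
cd prometheus-<VERSION>.linux-amd64
./prometheus --config.file=./prometheus.yml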
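The download step needs a concrete release version in the URL. The sketch below is one way to build that URL from a version, OS and architecture; the function name and the version used in the example call are illustrative assumptions, not part of the original instructions.

```shell
#!/bin/sh
# Build the download URL for a Prometheus release tarball.
# Arguments: version (without the leading "v"), OS (default linux),
# architecture (default amd64). The function name is hypothetical.
prometheus_release_url() {
    version=$1
    os=${2:-linux}
    arch=${3:-amd64}
    printf 'https://github.com/prometheus/prometheus/releases/download/v%s/prometheus-%s.%s-%s.tar.gz\n' \
        "$version" "$version" "$os" "$arch"
}

# Example call; 2.53.0 is only an illustrative version number.
prometheus_release_url 2.53.0
```

The result can be passed straight to the download command, for example wget "$(prometheus_release_url 2.53.0)".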
Containerized deployment (the prometheus directory on the host is bind-mounted into the container):
mkdir -vp /opt/prometheus/data
docker run \
  -p 9090:9090 \
  -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
  -v /opt/prometheus/data:/prometheus \
  prom/prometheus
Node exporter
Local deployment:
wget https://github.com/prometheus/node_exporter/releases/download/v<VERSION>/node_exporter-<VERSION>.<OS>-<ARCH>.tar.gz
tar xvfz node_exporter-*.*-amd64.tar.gz
cd node_exporter-*.*-amd64
./node_exporter
# From another terminal, check that metrics are being exposed
curl http://localhost:9100/metrics
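The /metrics endpoint returns many lines, so it helps to pull out a single metric when checking that the exporter works. The following is a minimal sketch of such a filter. The function name is hypothetical, and it assumes samples carry no trailing timestamp, which is the usual case for exporter output.

```shell
#!/bin/sh
# Print the value of every sample of one metric from Prometheus text-format
# input on stdin. Comment lines (# HELP / # TYPE) are skipped.
# Assumption: samples have no trailing timestamp, so the last field is the value.
metric_values() {
    awk -v name="$1" '
        /^#/ { next }
        {
            # Strip an optional {label="..."} set to get the bare metric name.
            metric = $1
            sub(/[{].*/, "", metric)
            if (metric == name) print $NF
        }'
}

# Demo on a small inline sample instead of a live endpoint.
printf '%s\n' \
    '# TYPE node_load1 gauge' \
    'node_load1 0.42' \
    'node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5' |
    metric_values node_load1
```

Against a running exporter the same filter would be used as curl -s http://localhost:9100/metrics | metric_values node_load1.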
When node exporter runs in a container, the host directories to be monitored must be bind-mounted into that container. Node exporter uses path.rootfs as the prefix when accessing the host filesystem.
docker run -d \
  --net="host" \
  --pid="host" \
  -v "/:/host:ro,rslave" \
  quay.io/prometheus/node-exporter:latest \
  --path.rootfs=/host --no-collector.systemd
The equivalent Docker Compose file:
---
version: '3.8'
services:
  node_exporter:
    image: quay.io/prometheus/node-exporter:latest
    container_name: node_exporter
    command:
      - '--path.rootfs=/host'
      - '--no-collector.systemd'
    network_mode: host
    pid: host
    restart: unless-stopped
    volumes:
      - '/:/host:ro,rslave'
prometheus.yml configuration:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['<NODE_EXPORTER_IP>:9100']
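Every exporter in this guide uses the same scrape configuration shape with a different job name and port. As a rough sketch, the file could be generated from those values. The function name and the target address below are illustrative assumptions.

```shell
#!/bin/sh
# Print a minimal prometheus.yml with one static scrape job.
# Arguments: job name, target as host:port, optional scrape interval (default 15s).
# The function name is hypothetical.
render_scrape_config() {
    job=$1
    target=$2
    interval=${3:-15s}
    cat <<EOF
global:
  scrape_interval: $interval
scrape_configs:
  - job_name: $job
    static_configs:
      - targets: ['$target']
EOF
}

# Example: 192.0.2.10 is a documentation-only address standing in for <NODE_EXPORTER_IP>.
render_scrape_config node 192.0.2.10:9100
```

The output can be redirected to prometheus.yml and, if promtool from the Prometheus release is available, checked with promtool check config prometheus.yml before Prometheus is restarted.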
Process exporter
The examples below monitor the mysqld process.
Local deployment:
wget https://github.com/ncabatoff/process-exporter/releases/download/v0.7.10/process-exporter-0.7.10.linux-amd64.tar.gz
tar -zxvf process-exporter-0.7.10.linux-amd64.tar.gz -C /usr/local
cd /usr/local
mv process-exporter-0.7.10.linux-amd64/ process_exporter
cd process_exporter && ./process-exporter -procnames=mysqld
Containerized deployment:
# Specify the configuration file with config.path (the current directory is mounted at /config)
docker run -d --rm -p 9256:9256 --privileged \
-v /proc:/host/proc \
-v `pwd`:/config ncabatoff/process-exporter \
--procfs /host/proc -threads=false \
-config.path /config/filename.yml
# Specify the monitored process with procnames
docker run -d --rm -p 9256:9256 --privileged \
-v /proc:/host/proc \
-v `pwd`:/config ncabatoff/process-exporter \
--procfs /host/proc -threads=false \
-procnames=mysqld
Process exporter configuration file:
process_names:
  - name: "{{.Matches}}"
    cmdline:
    - 'mysqld'
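When more than one process needs to be watched, the configuration file can list several names under one matcher. The sketch below generates such a file using the comm matcher and the {{.Comm}} name template instead of the cmdline matcher shown above. That swap, the function name and the second process name are assumptions made for illustration. Note that comm values are truncated by the kernel to 15 characters.

```shell
#!/bin/sh
# Print a process-exporter config that groups processes by their comm name.
# Arguments: one or more process names. The function name is hypothetical.
render_process_config() {
    printf 'process_names:\n'
    printf '  - name: "{{.Comm}}"\n'
    printf '    comm:\n'
    for proc in "$@"; do
        printf '    - %s\n' "$proc"
    done
}

# Example: mysqld comes from this guide, redis-server is an illustrative addition.
render_process_config mysqld redis-server
```

The output can be saved as the file passed to -config.path.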
prometheus.yml configuration:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: Process
    static_configs:
      - targets: ['<PROCESS_EXPORTER_IP>:9256']
Redis exporter
Supported versions: Redis 2.x, 3.x, 4.x, 5.x, 6.x, 7.x
Build:
git clone https://github.com/oliver006/redis_exporter.git
cd redis_exporter
go build .
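Local deployment:
./redis_exporter --version
# Start the exporter; redis://localhost:6379 is the default target address
./redis_exporter --redis.addr=redis://localhost:6379
Containerized deployment:
docker run -d --name redis_exporter -p 9121:9121 oliver006/redis_exporter
# Alternative: host network mode
docker run -d --name redis_exporter --network host oliver006/redis_exporter
curl -X GET http://localhost:9121/metrics
prometheus.yml configuration:
scrape_configs:
  - job_name: redis_exporter
    static_configs:
      - targets: ['<REDIS-EXPORTER-HOSTNAME>:9121']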
MySQL exporter
Supported versions: MySQL >= 5.6, MariaDB >= 10.3
Required privileges:
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'XXXXXXXX' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
Build:
git clone https://github.com/prometheus/mysqld_exporter.git
cd mysqld_exporter
make build
Local deployment:
./mysqld_exporter --web.listen-address=:9104 \
--no-collect.info_schema.query_response_time \
--no-collect.info_schema.innodb_cmp \
--no-collect.info_schema.innodb_cmpmem \
--collect.info_schema.processlist --collect.binlog_size
Containerized deployment:
docker network create my-mysql-network
docker pull prom/mysqld-exporter
docker run -d \
  -p 9104:9104 \
  --network my-mysql-network \
  prom/mysqld-exporter \
  --config.my-cnf=<path_to_cnf>
# Host network mode deployment
docker run -d \
  --network host \
  prom/mysqld-exporter \
  --config.my-cnf=<path_to_cnf>
prometheus.yml configuration:
scrape_configs:
  - job_name: mysqld_exporter
    static_configs:
      - targets: ['<MYSQLD-EXPORTER-HOSTNAME>:9104']
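The --config.my-cnf flag points at a MySQL client option file holding the exporter's credentials. The following is a minimal sketch of writing such a file with owner-only permissions. The function name is hypothetical, the user name follows the CREATE USER statement earlier in this section, and the password is a placeholder.

```shell
#!/bin/sh
# Write a MySQL client option file for mysqld_exporter.
# Arguments: output path, user, password, optional host (default localhost).
# The function name is hypothetical.
write_exporter_cnf() {
    path=$1
    (
        # Restrict permissions on newly created files to the owner,
        # since the file contains a password.
        umask 077
        cat > "$path" <<EOF
[client]
user = $2
password = $3
host = ${4:-localhost}
EOF
    )
}

# Demo: write to a temporary file, show it, then clean up.
cnf=$(mktemp)
write_exporter_cnf "$cnf" exporter 'XXXXXXXX'
cat "$cnf"
rm -f "$cnf"
```

For the container, the file would additionally have to be bind-mounted with docker run -v so that the path given to --config.my-cnf exists inside the container.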
OracleDB exporter
Local deployment (if no Oracle software is installed locally, Oracle Instant Client Basic must be installed):
mkdir /etc/oracledb_exporter
chown root:oracledb_exporter /etc/oracledb_exporter
chmod 775 /etc/oracledb_exporter
# Put the config files in /etc/oracledb_exporter
# Put the binary in /usr/local/bin
cat > /etc/systemd/system/oracledb_exporter.service << EOF
[Unit]
Description=Service for oracle telemetry client
After=network.target
[Service]
# simple: the exporter is a long-running foreground process
Type=simple
#!!! Set your values and uncomment
#User=oracledb_exporter
#Environment="CUSTOM_METRICS=/etc/oracledb_exporter/custom-metrics.toml"
ExecStart=/usr/local/bin/oracledb_exporter \
  --default.metrics "/etc/oracledb_exporter/default-metrics.toml" \
  --log.level error --web.listen-address 0.0.0.0:9161
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl start oracledb_exporter
Containerized deployment:
docker pull ghcr.io/iamseth/oracledb_exporter:0.5.0
docker run -it --rm -p 9161:9161 ghcr.io/iamseth/oracledb_exporter:0.5.0 \
  --default.metrics "/etc/oracledb_exporter/default-metrics.toml" \
  --custom.metrics "/etc/oracledb_exporter/custom-metrics.toml" \
  --log.level error
Before running oracledb exporter, set the DATA_SOURCE_NAME environment variable (for the container, pass it with docker run -e DATA_SOURCE_NAME=...). Quote the value when it contains ? or &, which the shell would otherwise interpret:
# export Oracle location:
export DATA_SOURCE_NAME=oracle://system:password@oracle-sid
# or using a complete url:
export DATA_SOURCE_NAME=oracle://user:password@myhost:1521/service
# 19c client for primary/standby configuration
export DATA_SOURCE_NAME=oracle://user:password@primaryhost:1521,standbyhost:1521/service
# 19c client for primary/standby configuration with options
export DATA_SOURCE_NAME='oracle://user:password@primaryhost:1521,standbyhost:1521/service?connect_timeout=5&transport_connect_timeout=3&retry_count=3'
# 19c client for ASM instance connection (requires SYSDBA)
export DATA_SOURCE_NAME='oracle://user:password@primaryhost:1521,standbyhost:1521/+ASM?as=sysdba'
# Then run the exporter
/path/to/binary/oracledb_exporter --log.level error --web.listen-address 0.0.0.0:9161
The database user that OracleDB exporter connects as must have SELECT privileges on the following data dictionary views:
dba_tablespace_usage_metrics
dba_tablespaces
v$system_wait_class
v$asm_diskgroup_stat
v$datafile
v$sysstat
v$process
v$waitclassmetric
v$session
v$resource_limit
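Granting these privileges one by one is repetitive, so the statements can be generated. The sketch below prints one GRANT per view. It assumes the grants are issued by a privileged user and rewrites each v$ name to the underlying SYS.V_$ view, because Oracle does not accept grants on the V$ public synonyms themselves. The function name and the grantee used in the example are hypothetical.

```shell
#!/bin/sh
# Print GRANT SELECT statements for the views the exporter queries.
# Argument: the database user to grant to. The function name is hypothetical.
render_grants() {
    grantee=$1
    for view in dba_tablespace_usage_metrics dba_tablespaces \
        'v$system_wait_class' 'v$asm_diskgroup_stat' 'v$datafile' 'v$sysstat' \
        'v$process' 'v$waitclassmetric' 'v$session' 'v$resource_limit'; do
        case $view in
            'v$'*)
                # Drop the leading "v$" (two characters) and target SYS.V_$<name>.
                suffix=${view#??}
                target='sys.v_$'$suffix
                ;;
            *)
                target="sys.$view"
                ;;
        esac
        printf 'GRANT SELECT ON %s TO %s;\n' "$target" "$grantee"
    done
}

# Example: "exporter" is an illustrative user name.
render_grants exporter
```

The generated statements can be reviewed and then run in a SQL client connected as a user allowed to grant on SYS objects.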
Custom metrics for oracledb exporter can be defined in a TOML file passed via custom.metrics.
[[metric]]
context = "slow_queries"
metricsdesc = { p95_time_usecs= "Gauge metric with percentile 95 of elapsed time.", p99_time_usecs= "Gauge metric with percentile 99 of elapsed time." }
request = "select percentile_disc(0.95) within group (order by elapsed_time) as p95_time_usecs, percentile_disc(0.99) within group (order by elapsed_time) as p99_time_usecs from v$sql where last_active_time >= sysdate - 5/(24*60)"
[[metric]]
context = "big_queries"
metricsdesc = { p95_rows= "Gauge metric with percentile 95 of returned rows.", p99_rows= "Gauge metric with percentile 99 of returned rows." }
request = "select percentile_disc(0.95) within group (order by rownum) as p95_rows, percentile_disc(0.99) within group (order by rownum) as p99_rows from v$sql where last_active_time >= sysdate - 5/(24*60)"
[[metric]]
context = "size_user_segments_top100"
metricsdesc = {table_bytes="Gauge metric with the size of the tables in user segments."}
labels = ["segment_name"]
request = "select * from (select segment_name,sum(bytes) as table_bytes from user_segments where segment_type='TABLE' group by segment_name) order by table_bytes DESC FETCH NEXT 100 ROWS ONLY"
[[metric]]
context = "size_user_segments_top100"
metricsdesc = {table_partition_bytes="Gauge metric with the size of the table partition in user segments."}
labels = ["segment_name"]
request = "select * from (select segment_name,sum(bytes) as table_partition_bytes from user_segments where segment_type='TABLE PARTITION' group by segment_name) order by table_partition_bytes DESC FETCH NEXT 100 ROWS ONLY"
[[metric]]
context = "size_user_segments_top100"
metricsdesc = {cluster_bytes="Gauge metric with the size of the cluster in user segments."}
labels = ["segment_name"]
request = "select * from (select segment_name,sum(bytes) as cluster_bytes from user_segments where segment_type='CLUSTER' group by segment_name) order by cluster_bytes DESC FETCH NEXT 100 ROWS ONLY"
[[metric]]
context = "size_dba_segments_top100"
metricsdesc = {table_bytes="Gauge metric with the size of the tables in dba segments."}
labels = ["segment_name"]
request = "select * from (select segment_name,sum(bytes) as table_bytes from dba_segments where segment_type='TABLE' group by segment_name) order by table_bytes DESC FETCH NEXT 100 ROWS ONLY"
[[metric]]
context = "size_dba_segments_top100"
metricsdesc = {table_partition_bytes="Gauge metric with the size of the table partition in dba segments."}
labels = ["segment_name"]
request = "select * from (select segment_name,sum(bytes) as table_partition_bytes from dba_segments where segment_type='TABLE PARTITION' group by segment_name) order by table_partition_bytes DESC FETCH NEXT 100 ROWS ONLY"
[[metric]]
context = "size_dba_segments_top100"
metricsdesc = {cluster_bytes="Gauge metric with the size of the cluster in dba segments."}
labels = ["segment_name"]
request = "select * from (select segment_name,sum(bytes) as cluster_bytes from dba_segments where segment_type='CLUSTER' group by segment_name) order by cluster_bytes DESC FETCH NEXT 100 ROWS ONLY"
[[metric]]
context = "cache_hit_ratio"
metricsdesc = {percentage="Gauge metric with the cache hit ratio."}
request = "select Round(((Sum(Decode(a.name, 'consistent gets', a.value, 0)) + Sum(Decode(a.name, 'db block gets', a.value, 0)) - Sum(Decode(a.name, 'physical reads', a.value, 0)) )/ (Sum(Decode(a.name, 'consistent gets', a.value, 0)) + Sum(Decode(a.name, 'db block gets', a.value, 0)))) *100,2) as percentage FROM v$sysstat a"
[[metric]]
context = "startup"
metricsdesc = {time_seconds="Database startup time in seconds."}
request = "SELECT (SYSDATE - STARTUP_TIME) * 24 * 60 * 60 AS time_seconds FROM V$INSTANCE"
prometheus.yml configuration:
scrape_configs:
  - job_name: oracledb_exporter
    scrape_interval: 50s
    scrape_timeout: 50s
    static_configs:
      - targets: ['<ORACLEDB_EXPORTER_IP>:9161']