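Introduction
Blackbox Exporter is the official black-box monitoring solution from the Prometheus community. It lets users probe endpoints over HTTP, HTTPS, DNS, TCP, and ICMP. You can fetch the Blackbox Exporter source and build a local executable with go get (recent Go releases no longer build binaries with go get; there, go install with an @version suffix is the usual replacement):
go get github.com/prometheus/blackbox_exporter
GitHub repository: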
https://github.com/prometheus/blackbox_exporter
Deployment
1 Binary installation
1.1 Download and extract
curl -L -o blackbox_exporter-0.24.0.linux-amd64.tar.gz https://github.com/prometheus/blackbox_exporter/releases/download/v0.24.0/blackbox_exporter-0.24.0.linux-amd64.tar.gz
tar -xf blackbox_exporter-0.24.0.linux-amd64.tar.gz -C /usr/local/
mv /usr/local/blackbox_exporter-0.24.0.linux-amd64 /usr/local/blackbox_exporter-0.24.0
1.2 Configure systemd
[Unit]
Description=The blackbox exporter
After=network-online.target remote-fs.target nss-lookup.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/blackbox_exporter-0.24.0/blackbox_exporter --config.file=/usr/local/blackbox_exporter-0.24.0/blackbox.yml
KillSignal=SIGQUIT
Restart=always
RestartPreventExitStatus=1 6 SIGABRT
TimeoutStopSec=5
KillMode=process
PrivateTmp=true
LimitNOFILE=1048576
LimitNPROC=1048576

[Install]
WantedBy=multi-user.target
1.3 Configuration file blackbox.yml
2 Container deployment
Docker image:
https://hub.docker.com/r/prom/blackbox-exporter/tags
docker pull prom/blackbox-exporter:v0.23.0
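When running Blackbox Exporter, you must supply the probe configuration: custom HTTP headers, TLS settings needed for probing, or the checks the probe itself performs. In Blackbox Exporter each probe configuration is called a module and is supplied in a YAML configuration file. Each module consists mainly of the prober type (prober), the probe timeout (timeout), and the settings specific to that prober:
# Prober type: http, tcp, dns, icmp.
prober: <prober_string>

# Timeout for the probe.
[ timeout: <duration> ]

# Prober-specific settings; at most one of these may be configured.
[ http: <http_probe> ]
[ tcp: <tcp_probe> ]
[ dns: <dns_probe> ]
[ icmp: <icmp_probe> ]
Below is a simplified probe configuration file, blackbox.yml, containing two HTTP probe modules:
modules:
  http_2xx:
    prober: http
    timeout: 10s
    http:
      method: GET
      preferred_ip_protocol: "ip4"
  http_post_2xx:
    prober: http
    http:
      method: POST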
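The module layout described above can be checked mechanically. The following is a minimal sketch, assuming the configuration has already been loaded into a plain Python dict (for example by a YAML library). `validate_module` and `PROBERS` are hypothetical names introduced for illustration; the checks follow the rules as stated in this article, not the exporter's own parser.

```python
# Prober names that appear in this article's configuration examples
# (grpc is used in the fuller blackbox.yml later on).
PROBERS = {"http", "tcp", "dns", "icmp", "grpc"}


def validate_module(name, module):
    """Return a list of problems found in one module definition (hypothetical helper)."""
    problems = []
    prober = module.get("prober")
    if prober not in PROBERS:
        problems.append(f"{name}: unknown prober {prober!r}")
    # Prober-specific sections present in the module (http:, tcp:, ...).
    sections = [key for key in module if key in PROBERS]
    if len(sections) > 1:
        problems.append(f"{name}: more than one prober section {sections}")
    return problems


# The two-module example, written as a dict.
config = {
    "modules": {
        "http_2xx": {
            "prober": "http",
            "timeout": "10s",
            "http": {"method": "GET", "preferred_ip_protocol": "ip4"},
        },
        "http_post_2xx": {"prober": "http", "http": {"method": "POST"}},
    }
}

for module_name, module_body in config["modules"].items():
    print(module_name, validate_module(module_name, module_body) or "ok")
```

Both modules in the example pass; a module with an unrecognized prober or with two prober sections would be reported.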
Start a Blackbox Exporter instance with the following command, specifying the probe configuration file to use:
blackbox_exporter --config.file=/etc/prometheus/blackbox.yml
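Once it has started, you can probe baidu.com by requesting http://127.0.0.1:9115/probe?module=http_2xx&target=baidu.com. The module parameter in the URL selects the probe to use, the target parameter names the probe target, and the probe results are returned as metrics: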
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.011633673
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.117332275
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# HELP probe_http_content_length Length of http content response
# TYPE probe_http_content_length gauge
probe_http_content_length 81
# HELP probe_http_duration_seconds Duration of http request by phase, summed over all redirects
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.055551141
probe_http_duration_seconds{phase="processing"} 0.049736019
probe_http_duration_seconds{phase="resolve"} 0.011633673
probe_http_duration_seconds{phase="tls"} 0
probe_http_duration_seconds{phase="transfer"} 3.8919e-05
# HELP probe_http_redirects The number of redirects
# TYPE probe_http_redirects gauge
probe_http_redirects 0
# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 0
# HELP probe_http_status_code Response HTTP status code
# TYPE probe_http_status_code gauge
probe_http_status_code 200
# HELP probe_http_version Returns the version of HTTP of the probe response
# TYPE probe_http_version gauge
probe_http_version 1.1
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 1
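From the returned samples you can read the site's DNS lookup time, response time, HTTP status code, and other metrics related to access quality, which helps administrators detect faults and problems proactively.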
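To make the request and response concrete, here is a small sketch that builds the probe URL and reads values out of the returned text. `probe_url` and `parse_metrics` are hypothetical helpers; the parser is deliberately simplified (it assumes one sample per line with no trailing timestamp) and runs against a few lines copied from the sample output instead of a live exporter.

```python
from urllib.parse import urlencode


def probe_url(exporter, module, target):
    """Build the /probe URL for a given module and target."""
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


def parse_metrics(text):
    """Parse exposition-format lines into {series: value}, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The series (name plus optional labels) and its value are
        # separated by the last space on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# A few lines copied from the sample output in this article.
sample = """\
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 0.117332275
probe_http_duration_seconds{phase="connect"} 0.055551141
probe_http_status_code 200
probe_success 1
"""

metrics = parse_metrics(sample)
print(probe_url("127.0.0.1:9115", "http_2xx", "baidu.com"))
print("success:", metrics["probe_success"] == 1.0)
print("status:", int(metrics["probe_http_status_code"]))
```

In practice Prometheus does this scraping and parsing itself; the sketch only shows what travels over the wire.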
Integrating with Prometheus
Next, configure a scrape job in Prometheus for the Blackbox Exporter instance. The most straightforward configuration is:
- job_name: baidu_http2xx_probe
  params:
    module:
    - http_2xx
    target:
    - baidu.com
  metrics_path: /probe
  static_configs:
  - targets:
    - 127.0.0.1:9115
- job_name: prometheus_http2xx_probe
  params:
    module:
    - http_2xx
    target:
    - prometheus.io
  metrics_path: /probe
  static_configs:
  - targets:
    - 127.0.0.1:9115
If there are N target sites and each needs M kinds of probe, Prometheus ends up with N * M scrape jobs, which is clearly unmanageable from a configuration standpoint.
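The growth is easy to see by generating the jobs. This is an illustrative sketch only: `static_jobs` and its job-naming scheme are hypothetical, and the second module name is borrowed from the configuration example earlier in this article.

```python
from itertools import product

targets = ["baidu.com", "prometheus.io"]  # N = 2
modules = ["http_2xx", "http_post_2xx"]   # M = 2


def static_jobs(targets, modules, exporter="127.0.0.1:9115"):
    """Build one scrape job per (target, module) pair, mirroring the per-target layout."""
    jobs = []
    for target, module in product(targets, modules):
        jobs.append({
            "job_name": f"{target.replace('.', '_')}_{module}_probe",
            "params": {"module": [module], "target": [target]},
            "metrics_path": "/probe",
            "static_configs": [{"targets": [exporter]}],
        })
    return jobs


jobs = static_jobs(targets, modules)
print(len(jobs))  # N * M jobs
for job in jobs:
    print(job["job_name"])
```

Every additional target adds M jobs and every additional module adds N, so the job list grows with the product of the two.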
Relabeling can simplify this configuration, as follows:
scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
        - http://prometheus.io    # Target to probe with http.
        - https://prometheus.io   # Target to probe with https.
        - http://example.com:8080 # Target to probe with http on port 8080.
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115
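Each scrape then has the same form as the manual probe request, for example http://127.0.0.1:9115/probe?module=http_2xx&target=baidu.com. The relabeling works in three steps:
- Step 1: take the address of each static_configs.targets entry and write it to the __param_target label. A label of the form __param_<name> adds the URL parameter <name> with that value to the scrape request, equivalent to a params setting;
- Step 2: read the value of __param_target and write it to the instance label;
- Step 3: overwrite the target's __address__ label with the address of the Blackbox Exporter instance.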
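The three steps can be traced on a single target. The sketch below is a simplified imitation of what Prometheus does internally, not its implementation; `apply_relabeling` and `scrape_url` are hypothetical helpers, and the starting labels assume that `params` entries are already exposed as `__param_<name>` labels before relabeling runs.

```python
from urllib.parse import urlencode


def apply_relabeling(address, module="http_2xx", exporter="127.0.0.1:9115"):
    """Mimic the three relabel steps for a single static target."""
    labels = {"__address__": address, "__param_module": module}
    # Step 1: copy the configured target into the `target` URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Step 2: expose the probed target as the `instance` label.
    labels["instance"] = labels["__param_target"]
    # Step 3: point the scrape at the exporter instead of the target.
    labels["__address__"] = exporter
    return labels


def scrape_url(labels, metrics_path="/probe"):
    """Assemble the URL that would be requested from the relabeled labels."""
    params = {
        key[len("__param_"):]: value
        for key, value in labels.items()
        if key.startswith("__param_")
    }
    return f"http://{labels['__address__']}{metrics_path}?{urlencode(params)}"


labels = apply_relabeling("http://prometheus.io")
print(labels["instance"])
print(scrape_url(labels))
```

The resulting series carry the probed site in `instance`, while the request itself goes to the exporter with the site passed (URL-encoded) in the `target` parameter.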
blackbox.yml
modules:
  http_2xx:
    prober: http
  http_post_2xx:
    prober: http
    http:
      method: POST
      preferred_ip_protocol: "ip4"
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  grpc:
    prober: grpc
    grpc:
      tls: true
      preferred_ip_protocol: "ip4"
  grpc_plain:
    prober: grpc
    grpc:
      tls: false
      service: "service1"
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
      - send: "SSH-2.0-blackbox-ssh-check"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
  icmp_ttl5:
    prober: icmp
    timeout: 5s
    icmp:
      ttl: 5
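The query_response entries of the irc_banner module read as a scripted dialogue: send a line, or wait for a line matching expect and optionally reply, with ${1} standing for the first captured group. The following is a simplified simulation under stated assumptions, not the exporter's code: `run_query_response` is a hypothetical function, a list of strings stands in for the TCP connection, the server lines are invented, and Python's `re` is used in place of Go's regexp package.

```python
import re


def run_query_response(steps, incoming_lines):
    """Walk expect/send steps over scripted server lines; return the lines we would send."""
    incoming = iter(incoming_lines)
    sent = []
    for step in steps:
        if "expect" in step:
            pattern = re.compile(step["expect"])
            # Read server lines until one matches; fail if the script runs out.
            for line in incoming:
                match = pattern.search(line)
                if match:
                    break
            else:
                raise AssertionError(f"no line matched {step['expect']!r}")
            if "send" in step:
                # Expand ${N} placeholders with the captured groups.
                reply = re.sub(
                    r"\$\{(\d+)\}",
                    lambda m: match.group(int(m.group(1))),
                    step["send"],
                )
                sent.append(reply)
        elif "send" in step:
            sent.append(step["send"])
    return sent


# The irc_banner steps from the configuration, as Python dicts.
steps = [
    {"send": "NICK prober"},
    {"send": "USER prober prober prober :prober"},
    {"expect": "PING :([^ ]+)", "send": "PONG ${1}"},
    {"expect": "^:[^ ]+ 001"},
]
# Invented server output for illustration.
server = ["PING :abc123", ":irc.example.net 001 prober :Welcome"]

sent = run_query_response(steps, server)
print(sent)
```

If any expect step never sees a matching line, the simulated dialogue fails, which corresponds to a failed probe.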
example.yml
modules:
  http_2xx_example:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
      valid_status_codes: []  # Defaults to 2xx
      method: GET
      headers:
        Host: vhost.example.com
        Accept-Language: en-US
        Origin: example.com
      no_follow_redirects: false
      fail_if_ssl: false
      fail_if_not_ssl: false
      fail_if_body_matches_regexp:
        - "Could not connect to database"
      fail_if_body_not_matches_regexp:
        - "Download the latest version here"
      fail_if_header_matches: # Verifies that no cookies are set
        - header: Set-Cookie
          allow_missing: true
          regexp: '.*'
      fail_if_header_not_matches:
        - header: Access-Control-Allow-Origin
          regexp: '(\*|example\.com)'
      tls_config:
        insecure_skip_verify: false
      preferred_ip_protocol: "ip4" # defaults to "ip6"
      ip_protocol_fallback: false  # no fallback to "ip6"
  http_with_proxy:
    prober: http
    http:
      proxy_url: "http://127.0.0.1:3128"
      skip_resolve_phase_with_proxy: true
  http_with_proxy_and_headers:
    prober: http
    http:
      proxy_url: "http://127.0.0.1:3128"
      proxy_connect_header:
        Proxy-Authorization:
          - Bearer token
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{}'
  http_basic_auth_example:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Host: "login.example.com"
      basic_auth:
        username: "username"
        password: "mysecret"
  http_custom_ca_example:
    prober: http
    http:
      method: GET
      tls_config:
        ca_file: "/certs/my_cert.crt"
  http_gzip:
    prober: http
    http:
      method: GET
      compression: gzip
  http_gzip_with_accept_encoding:
    prober: http
    http:
      method: GET
      compression: gzip
      headers:
        Accept-Encoding: gzip
  tls_connect:
    prober: tcp
    timeout: 5s
    tcp:
      tls: true
  tcp_connect_example:
    prober: tcp
    timeout: 5s
  imap_starttls:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "OK.*STARTTLS"
        - send: ". STARTTLS"
        - expect: "OK"
        - starttls: true
        - send: ". capability"
        - expect: "CAPABILITY IMAP4rev1"
  smtp_starttls:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - expect: "^220 ([^ ]+) ESMTP (.+)$"
        - send: "EHLO prober\r"
        - expect: "^250-STARTTLS"
        - send: "STARTTLS\r"
        - expect: "^220"
        - starttls: true
        - send: "EHLO prober\r"
        - expect: "^250-AUTH"
        - send: "QUIT\r"
  irc_banner_example:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:
        - send: "NICK prober"
        - send: "USER prober prober prober :prober"
        - expect: "PING :([^ ]+)"
          send: "PONG ${1}"
        - expect: "^:[^ ]+ 001"
  icmp_example:
    prober: icmp
    timeout: 5s
    icmp:
      preferred_ip_protocol: "ip4"
      source_ip_address: "127.0.0.1"
  dns_udp_example:
    prober: dns
    timeout: 5s
    dns:
      query_name: "www.prometheus.io"
      query_type: "A"
      valid_rcodes:
        - NOERROR
      validate_answer_rrs:
        fail_if_matches_regexp:
          - ".*127.0.0.1"
        fail_if_all_match_regexp:
          - ".*127.0.0.1"
        fail_if_not_matches_regexp:
          - "www.prometheus.io.\t300\tIN\tA\t127.0.0.1"
        fail_if_none_matches_regexp:
          - "127.0.0.1"
      validate_authority_rrs:
        fail_if_matches_regexp:
          - ".*127.0.0.1"
      validate_additional_rrs:
        fail_if_matches_regexp:
          - ".*127.0.0.1"
  dns_soa:
    prober: dns
    dns:
      query_name: "prometheus.io"
      query_type: "SOA"
  dns_tcp_example:
    prober: dns
    dns:
      transport_protocol: "tcp" # defaults to "udp"
      preferred_ip_protocol: "ip4" # defaults to "ip6"
      query_name: "www.prometheus.io"
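The body checks in http_2xx_example can be read as two lists of failure rules: the probe fails if the body matches any fail_if_body_matches_regexp entry, or if it fails to match any fail_if_body_not_matches_regexp entry. The sketch below illustrates that reading with a hypothetical `body_passes` function and invented response bodies; it uses Python's `re`, whose syntax differs in places from the Go regular expressions the exporter uses.

```python
import re


def body_passes(body, fail_if_matches=(), fail_if_not_matches=()):
    """Return True when the body triggers none of the configured failure rules."""
    for pattern in fail_if_matches:
        if re.search(pattern, body):
            return False  # a forbidden pattern was found
    for pattern in fail_if_not_matches:
        if not re.search(pattern, body):
            return False  # a required pattern is missing
    return True


# Patterns taken from the http_2xx_example module.
forbidden = ["Could not connect to database"]
required = ["Download the latest version here"]

# Invented response bodies for illustration.
good_body = "<p>Download the latest version here</p>"
error_body = "Error: Could not connect to database"

print(body_passes(good_body, forbidden, required))
print(body_passes(error_body, forbidden, required))
```

A page that merely omits the required text also fails, even when no forbidden text is present.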