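In a Prometheus alerting setup, Alertmanager supports several notification channels, including email, Slack, and webhooks.
This post shows how to use the webhook channel to forward alert messages to your own alert medium.
First, deploy Alertmanager; the details are not repeated here.
Then set webhook_configs to the address of the web service we are about to build.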
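As a rough sketch of that setting, the relevant part of alertmanager.yml might look like the fragment below. It assumes the web service listens on 127.0.0.1:8080 (the address used by the script later in this post) and that the receiver is named web.hook (the name that shows up in the payload and in the Alertmanager UI); adjust both to your environment.

```yaml
route:
  receiver: 'web.hook'            # send alerts to the webhook receiver by default

receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://127.0.0.1:8080/'   # assumed address of our web service
        send_resolved: true             # also notify when an alert is resolved
```

Alertmanager has to reload its configuration or be restarted before the change takes effect.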
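Before building the web service, we need to know what Alertmanager actually pushes when a Prometheus alert fires. Capturing a request with a debug script gave the payload below. It is printed as a Python 2 dict, so the Chinese text appears as \u escapes; the alertname reads "内存使用率过高" (memory usage too high):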
{u'status': u'firing', u'groupLabels': {u'alertname': u'\u5185\u5b58\u4f7f\u7528\u7387\u8fc7\u9ad8'}, u'truncatedAlerts': 0, u'groupKey': u'{}:{alertname="\u5185\u5b58\u4f7f\u7528\u7387\u8fc7\u9ad8"}', u'commonAnnotations': {}, u'alerts': [{u'status': u'firing', u'labels': {u'nodename': u'master', u'instance': u'localhost:9100', u'job': u'node', u'role': u'master', u'alertname': u'\u5185\u5b58\u4f7f\u7528\u7387\u8fc7\u9ad8', u'severity': u'critical'}, u'endsAt': u'0001-01-01T00:00:00Z', u'generatorURL': u'http://localhost.localdomain:9091/graph?g0.expr=100+-+%28node_memory_MemFree_bytes+%2B+node_memory_Cached_bytes+%2B+node_memory_Buffers_bytes%29+%2F+node_memory_MemTotal_bytes+%2A+100+%3E+10&g0.tab=1', u'fingerprint': u'6a6579e7227b6cc5', u'startsAt': u'2023-12-12T03:11:23.514Z', u'annotations': {u'description': u'localhost:9100\u5185\u5b58\u4f7f\u7528\u7387\u8d85\u8fc790%,\u5f53\u524d\u4f7f\u7528\u738716.70720108803758%.', u'summary': u'localhost:9100 \u5185\u5b58\u4f7f\u7528\u7387\u8fc7\u9ad8\uff0c\u8bf7\u5c3d\u5feb\u5904\u7406\uff01'}}], u'version': u'4', u'receiver': u'web\\.hook', u'externalURL': u'http://localhost.localdomain:9093', u'commonLabels': {u'job': u'node', u'severity': u'critical', u'alertname': u'\u5185\u5b58\u4f7f\u7528\u7387\u8fc7\u9ad8'}}
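The debug script itself is not shown in the post. One possible way to capture such a payload is a throwaway receiver built on the Python 3 standard library, as sketched below. All names here (`DebugHandler`, `start_debug_server`, `post_json`, `received`) are illustrative and not from the original, and the output is pretty-printed JSON rather than the Python 2 dict shown above.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # every payload captured so far, newest last


class DebugHandler(BaseHTTPRequestHandler):
    """Print and store every JSON body that is POSTed to the server."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length).decode("utf-8"))
        received.append(payload)
        print(json.dumps(payload, ensure_ascii=False, indent=2))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence the default per-request access log


def start_debug_server(host="127.0.0.1", port=0):
    """Start the receiver in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), DebugHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def post_json(host, port, payload):
    """POST a dict as JSON to the receiver and return the HTTP status code."""
    body = json.dumps(payload).encode("utf-8")
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", "/", body=body,
                     headers={"Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()
```

To capture real notifications, start the server on a fixed port, point the webhook URL at it temporarily, and keep the main thread alive (for example with `threading.Event().wait()`). `post_json` is only there for a quick local check without Alertmanager.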
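With the payload in hand, we can start reshaping its content. We use the Flask framework to build a simple web service, so install the flask module before writing any code. The script also calls requests to forward the message, so install that as well.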
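#!/usr/bin/python
# -*- coding: utf-8 -*-
# Copyright: (c) HUWJ Organization. https://huwen.blog.csdn.net
# Copyright: (c) <huwj@sunsharing.com.cn>
# Released under the AGPL-3.0 License.
import json
import logging
from datetime import datetime, timedelta

import requests
from flask import Flask, request, jsonify

# The original imports a project-local `log` module that is not shown;
# the standard logging module stands in for it here.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Flask(__name__)

# Placeholder, not from the original article: replace with the HTTP endpoint
# of your own alert medium.
url = "http://alert-gateway.example.com/send"


@app.route('/', methods=["POST"])
def send_message():
    """Receive an Alertmanager notification, parse it, and push it to a custom alert medium."""
    req = request.json
    response = None
    for alert in req['alerts']:
        status = ''
        if alert['status'] == 'firing':
            status = "Alert firing"
        elif alert['status'] == 'resolved':
            status = "Alert resolved"
        labels = alert['labels']
        job = labels['job']
        # The sample payload above has no 'team' label, so do not assume it exists.
        team = labels.get('team', '')
        severity = labels['severity']
        description = alert['annotations']['description']
        name = labels['alertname']
        # startsAt is in UTC; add 8 hours to display Beijing time (UTC+8).
        time_obj = datetime.strptime(alert['startsAt'][:19], '%Y-%m-%dT%H:%M:%S') + timedelta(hours=8)
        time = datetime.strftime(time_obj, '%Y-%m-%d %H:%M:%S')
        content = "========={0}=========\n" \
                  "Alert name: {1}\n" \
                  "Alert type: {2}\n" \
                  "Severity: {3}\n" \
                  "Team: {4}\n" \
                  "Time: {5}\n" \
                  "Details: {6}".format(status, name, job, severity, team, time, description)
        # Request header expected by the downstream API
        header = {"Content-Type": "application/json"}
        # Request body
        data = [{"sender": "prometheus", "content": content, "sendDate": ""}]
        sendData = json.dumps(data).encode("utf-8")
        try:
            response = requests.post(url=url, data=sendData, headers=header, verify=False)
        except requests.RequestException as exc:
            logger.error("failed to push alert %s: %s", name, exc)
            return jsonify({"error": "failed to push alert"}), 502
        logger.info("pushed alert %s, downstream status %s", name, response.status_code)
    if response is None:
        # The notification contained no alerts at all.
        return jsonify({"error": "No message provided"}), 400
    return jsonify(response.json()), response.status_code


if __name__ == '__main__':
    app.run(host='127.0.0.1', port=8080)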
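The handler shifts startsAt by a fixed eight hours to get Beijing time. A slightly more explicit variant, sketched here for Python 3 and assuming the timestamp is in UTC (it ends in Z in the sample payload), attaches the time zone before converting. `to_local_time` and `LOCAL_TZ` are illustrative names, not part of the original script.

```python
from datetime import datetime, timedelta, timezone

# Assumption: notifications should be shown in UTC+8; change the offset as needed.
LOCAL_TZ = timezone(timedelta(hours=8))


def to_local_time(starts_at, tz=LOCAL_TZ):
    """Convert an Alertmanager UTC timestamp string to a local-time string."""
    # Keep only the 'YYYY-MM-DDTHH:MM:SS' prefix; the fractional part varies in length.
    utc_time = datetime.strptime(starts_at[:19], "%Y-%m-%dT%H:%M:%S")
    utc_time = utc_time.replace(tzinfo=timezone.utc)
    return utc_time.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S")


print(to_local_time("2023-12-12T03:11:23.514Z"))  # 2023-12-12 11:11:23
```

Passing a different `tz` is enough to render the same alert for another region, which a hard-coded offset inside the handler does not allow.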
Save the code above as alertmessage.py, then start the service in the background:
nohup python ./alertmessage.py &
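To verify that pushes are triggered, we manually modify an alerting rule so that it fires, then check the firing alert in the Prometheus UI.
Next, check it on the Alertmanager side.
We can see that the web.hook receiver has triggered two alerts. The web service log shows that the messages were pushed successfully.
With that, our custom alerting service is up and running.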