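I. Monitoring Architecture
Core components and data flow
- Prometheus: collects and stores time-series data and manages alerting rules
- Node Exporter: collects host metrics (CPU, memory, disk, network, and so on)
- Database exporters: for example mysqld_exporter and postgres_exporter
- Grafana: data visualization and dashboards
- Alertmanager (optional): manages alert notifications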
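II. Host Preparation
1. System requirements
- Linux (CentOS 7+ or Ubuntu 20.04+ recommended)
- Open ports: 9090 (Prometheus), 3000 (Grafana), 9100 (Node Exporter), and 9104 (mysqld_exporter, used later in this guide)
- Keep the clocks on all nodes synchronized (NTP service)
# Install NTP on CentOS
sudo yum install ntp
sudo systemctl start ntpd
sudo systemctl enable ntpd
# Install NTP on Ubuntu
sudo apt install ntp
sudo systemctl restart ntp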
III. Installing and Configuring the Components
1. Install the Prometheus server
Download the binary package
wget https://github.com/prometheus/prometheus/releases/download/v2.39.1/prometheus-2.39.1.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
sudo mv prometheus-2.39.1.linux-amd64 /usr/local/prometheus
Create the system service
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
# Create the service file. tee is used because with "sudo cat > file" the
# redirect is performed by the unprivileged shell and fails on /etc.
sudo tee /etc/systemd/system/prometheus.service > /dev/null <<'EOF'
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
ExecStart=/usr/local/prometheus/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.listen-address=0.0.0.0:9090
Restart=always

[Install]
WantedBy=multi-user.target
EOF
# Configure Prometheus
sudo cp /usr/local/prometheus/prometheus.yml /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus
# Start the service
sudo systemctl daemon-reload
sudo systemctl start prometheus
sudo systemctl enable prometheus
2. Deploy Node Exporter (on every node)
Download and install
wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
tar xvfz node_exporter-*.tar.gz
sudo mv node_exporter-1.4.0.linux-amd64/node_exporter /usr/local/bin/
sudo useradd -rs /bin/false node_exporter
Create the system service
sudo tee /etc/systemd/system/node_exporter.service > /dev/null <<'EOF'
[Unit]
Description=Node Exporter
After=network.target

[Service]
User=node_exporter
Group=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=always

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
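Once the exporter is running, one quick way to confirm it is serving data is to pull a single metric out of its /metrics output. The helper below is a minimal sketch: metric_value is a hypothetical name, it only handles metrics without labels, and the sample text stands in for real exporter output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print the value of the first unlabelled metric whose name matches $1.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Sample input that imitates a fragment of Node Exporter output.
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42'

printf '%s\n' "$sample" | metric_value node_load1   # prints 0.42
```

Against a live exporter the same helper could be fed with `curl -s http://localhost:9100/metrics | metric_value node_load1`, assuming curl is installed and the exporter listens on its default port.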
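3. Configure the Prometheus scrape targets
Edit /etc/prometheus/prometheus.yml. The default file already contains a scrape_configs key, so add the job under it rather than declaring the key a second time:
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['node1:9100', 'node2:9100', 'node3:9100']
Restart Prometheus to apply the change:
sudo systemctl restart prometheus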
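When the node list is long, the scrape job can be generated instead of typed by hand. The following is a sketch under stated assumptions: render_node_job is a hypothetical helper, and it assumes every node runs Node Exporter on port 9100 as in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a 'node' scrape job for the hostnames given
# as arguments. Port 9100 is the Node Exporter port used in this guide.
render_node_job() {
  targets=""
  for host in "$@"; do
    # Append a comma separator only when the list is already non-empty.
    targets="${targets}${targets:+, }'${host}:9100'"
  done
  printf 'scrape_configs:\n'
  printf "  - job_name: 'node'\n"
  printf '    static_configs:\n'
  printf '      - targets: [%s]\n' "$targets"
}

render_node_job node1 node2 node3
```

After editing the real file, `/usr/local/prometheus/promtool check config /etc/prometheus/prometheus.yml` can validate it before the restart; promtool ships in the same tarball as the Prometheus binary.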
IV. Database Monitoring (MySQL as the example)
1. Install mysqld_exporter
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz
tar xvfz mysqld_exporter-*.tar.gz
sudo mv mysqld_exporter-0.14.0.linux-amd64/mysqld_exporter /usr/local/bin/
sudo useradd -rs /bin/false mysqld_exporter
2. Create the monitoring user (replace the sample password with your own)
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'SecurePass123!' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
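3. Create the credentials file
sudo mkdir /etc/mysqld_exporter
sudo tee /etc/mysqld_exporter/.my.cnf > /dev/null <<'EOF'
[client]
user=exporter
password=SecurePass123!
EOF
# The file holds a plaintext password: restrict it to the exporter's user
sudo chown mysqld_exporter /etc/mysqld_exporter/.my.cnf
sudo chmod 600 /etc/mysqld_exporter/.my.cnf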
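Because .my.cnf stores a plaintext password, it is worth checking that only its owner can read it. The sketch below shows one way to test that; is_private_file is a hypothetical helper, and it relies on GNU stat as shipped by CentOS and Ubuntu.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if group and others have no access
# to the file, i.e. its octal mode ends in "00" (600, 400, ...).
is_private_file() {
  case "$(stat -c '%a' "$1")" in
    ?00) return 0 ;;
    *)   return 1 ;;
  esac
}

# Demonstration on a throwaway file rather than the real credentials file.
demo="$(mktemp)"
chmod 644 "$demo"
is_private_file "$demo" || echo "644: readable by other users"
chmod 600 "$demo"
is_private_file "$demo" && echo "600: owner only"
rm -f "$demo"
```

On the real host the check would be `sudo bash -c 'is_private_file /etc/mysqld_exporter/.my.cnf'` after defining the function in that shell, or simply `sudo stat -c '%a %U' /etc/mysqld_exporter/.my.cnf`. Note that with mode 600 the file must be owned by the user the exporter runs as, otherwise the exporter cannot read it.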
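4. Create the system service
# .my.cnf is an INI file, not a systemd environment file, so it is passed
# to the exporter with --config.my-cnf instead of EnvironmentFile.
sudo tee /etc/systemd/system/mysqld_exporter.service > /dev/null <<'EOF'
[Unit]
Description=MySQL Exporter
After=network.target

[Service]
User=mysqld_exporter
ExecStart=/usr/local/bin/mysqld_exporter \
  --config.my-cnf=/etc/mysqld_exporter/.my.cnf \
  --web.listen-address=0.0.0.0:9104
Restart=always

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl start mysqld_exporter
sudo systemctl enable mysqld_exporter
Prometheus also needs a scrape job for this exporter: add a job (for example job_name 'mysql' with the target 'localhost:9104', replacing localhost with the database host) under scrape_configs in /etc/prometheus/prometheus.yml, then restart Prometheus.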
V. Installing and Configuring Grafana
1. Install Grafana (CentOS)
sudo tee /etc/yum.repos.d/grafana.repo <<EOF
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF
sudo yum install grafana
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
2. Configure the Grafana data source
- Open http://<server IP>:3000 and log in with the default account admin/admin
- Left menu → Configuration → Data Sources → Add data source
- Select Prometheus and enter the URL http://localhost:9090
- Click Save & Test
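As an alternative to clicking through the UI, Grafana can load data sources from provisioning files at startup. The sketch below writes such a file; write_prometheus_datasource is a hypothetical helper, the data source name "Prometheus" is an arbitrary choice, and the demo writes to a temporary directory rather than to the live Grafana configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a Grafana data source provisioning file.
#   $1 = target directory, $2 = Prometheus URL
write_prometheus_datasource() {
  dir="$1"
  url="$2"
  mkdir -p "$dir"
  cat > "$dir/prometheus.yml" <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}

# Demo: render the file into a temporary directory and show it.
tmpdir="$(mktemp -d)"
write_prometheus_datasource "$tmpdir" "http://localhost:9090"
cat "$tmpdir/prometheus.yml"
rm -rf "$tmpdir"
```

On a package-based install the file would normally go into /etc/grafana/provisioning/datasources (written with sudo), followed by `sudo systemctl restart grafana-server` so that Grafana picks it up.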
VI. Importing Dashboards
1. Host monitoring dashboards
- Node Exporter Full: ID 1860
- Linux Hosts Metrics: ID 11074
2. MySQL monitoring dashboards
- MySQL Overview: ID 7362
- Percona MySQL: ID 11323
Steps:
- Left menu → Create → Import
- Enter the dashboard ID → Load
- Select the Prometheus data source → Import
VII. Security Hardening
1. Firewall configuration
# CentOS
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload
# Ubuntu
sudo ufw allow 3000/tcp
sudo ufw allow 9090/tcp
sudo ufw reload
2. Reverse proxy for Grafana (Nginx example)
server {
    listen 80;
    server_name grafana.yourdomain.com;
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
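VIII. Alerting Example
1. Create the alert rules file
# The heredoc delimiter is quoted so that the shell does not expand $labels
sudo tee /etc/prometheus/alerts.yml > /dev/null <<'EOF'
groups:
  - name: host-alerts
    rules:
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage (instance {{ $labels.instance }})"
          description: "Memory usage has stayed above 85% for 5 minutes"
EOF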
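To make the HighMemoryUsage expression concrete, the same arithmetic can be reproduced outside Prometheus. This is only an illustrative sketch: mem_used_pct and above_threshold are hypothetical helpers, and the byte counts are made-up sample values, not measurements.

```shell
#!/usr/bin/env bash
# Hypothetical helper: used-memory percentage, computed the same way as the
# alert expression: (MemTotal - MemAvailable) / MemTotal * 100.
#   $1 = MemTotal in bytes, $2 = MemAvailable in bytes
mem_used_pct() {
  awk -v total="$1" -v avail="$2" \
    'BEGIN { printf "%.1f\n", (total - avail) / total * 100 }'
}

# Hypothetical helper: succeed if $1 is strictly greater than $2.
above_threshold() {
  awk -v pct="$1" -v limit="$2" 'BEGIN { exit !(pct > limit) }'
}

# Made-up sample: 16 GB total, 2 GB available.
pct="$(mem_used_pct 16000000000 2000000000)"
echo "used: ${pct}%"   # prints "used: 87.5%"
if above_threshold "$pct" 85; then
  echo "condition is true; the alert fires once it has held for 5m"
fi
```

With these sample values the condition holds, but Prometheus would only fire the alert after the expression has stayed true for the full `for: 5m` window.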
2. Update the Prometheus configuration
# /etc/prometheus/prometheus.yml
rule_files:
  - alerts.yml
Restart the service:
sudo systemctl restart prometheus
IX. Troubleshooting
1. Check service status
sudo systemctl status prometheus
sudo systemctl status node_exporter
sudo systemctl status mysqld_exporter
2. View logs
# Prometheus logs
journalctl -u prometheus -f
# Node Exporter logs
journalctl -u node_exporter -f
# MySQL Exporter logs
journalctl -u mysqld_exporter -f
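A service can be reported as active and still be unreachable, for example when it is bound to a different address or blocked by the firewall. The sketch below probes the ports used in this guide; port_open and check_stack_ports are hypothetical helpers, and the approach assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
#   $1 = host, $2 = port
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Probe the ports used in this guide; adjust if you changed a listen address.
#   $1 = host to probe (defaults to 127.0.0.1)
check_stack_ports() {
  host="${1:-127.0.0.1}"
  for entry in prometheus:9090 grafana:3000 node_exporter:9100 mysqld_exporter:9104; do
    name="${entry%%:*}"
    port="${entry##*:}"
    if port_open "$host" "$port"; then
      echo "$name: port $port is accepting connections"
    else
      echo "$name: port $port is not reachable"
    fi
  done
}

check_stack_ports 127.0.0.1
```

Running `check_stack_ports node1` from the Prometheus host checks the same ports across the network, which helps separate a stopped service from a firewall problem.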
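X. Summary
With this native (non-container) installation you now have a working monitoring stack:
- Resource monitoring: real-time visibility into CPU, memory, disk, and other host metrics
- Database monitoring: query performance, connection counts, and replication status
- Alerting: threshold-based alert rules in Prometheus (delivering them by email or DingTalk additionally requires Alertmanager)
- Security hardening: firewall rules and a reverse proxy in front of Grafana
Possible next steps:
- Integrate Alertmanager for multi-channel alert delivery
- Monitor middleware such as Redis and Kafka
- Deploy long-term storage (for example Thanos) to manage historical data
Further resources:
- Prometheus official documentation
- Grafana dashboard library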