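Table of contents
- Environment preparation
- Deployment and installation
- keepalived configuration
- Startup test
- Simulating an Nginx outage
- Restart
- Problem analysis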
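Environment preparation
This article tests keepalived's dual-master mode. In dual-master mode, each of the two keepalived nodes holds one virtual IP (or one group of virtual IPs). By default the two nodes act as master and backup for each other and both serve traffic at the same time. If either node goes down, its virtual IP automatically floats to the other server, which gives active-active high availability.
Two CentOS servers are used here. The IP plan, as it appears in the configuration files later in this article, is:
- Node 1: 192.168.126.11, MASTER for virtual IP 192.168.126.21 (VI_1), BACKUP for 192.168.126.22 (VI_2)
- Node 2: 192.168.126.12, MASTER for virtual IP 192.168.126.22 (VI_2), BACKUP for 192.168.126.21 (VI_1)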
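Deployment and installation
Upload the installation packages and install keepalived and nginx on both servers. For the detailed installation steps, see:
nginx installation
keepalived installation and configuration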
drwxrwxr-x. 11 1000 1000 4096 Feb 2 22:29 keepalived-2.2.8
-rw-r--r--. 1 root root 1202602 Nov 29 14:15 keepalived-2.2.8.tar.gz
lrwxrwxrwx. 1 root root 12 Feb 2 22:19 nginx -> nginx-1.24.0
drwxr-xr-x. 9 1001 1001 186 Feb 2 22:20 nginx-1.24.0
-rw-r--r--. 1 root root 1112471 Nov 22 09:28 nginx-1.24.0.tar.gz
After installation, keepalived can be started, stopped, or restarted with
systemctl start/stop/restart keepalived
[root@localhost apps]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2024-02-20 23:53:53 +08; 57min ago
     Docs: man:keepalived(8)
           man:keepalived.conf(5)
           man:genhash(1)
           https://keepalived.org
  Process: 6926 ExecStart=/usr/local/keeplived/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 6927 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─6927 /usr/local/keeplived/sbin/keepalived -D
           └─6928 /usr/local/keeplived/sbin/keepalived -D
keepalived configuration
The keepalived configuration files are located in the following directory:
/usr/local/keeplived/etc/keepalived
Edit the keepalived configuration. The directory contents and the configuration of node 1 are as follows:
-rw-r--r--. 1 root root 1039 Feb 20 23:53 keepalived.conf
-rw-r--r--. 1 root root 3550 Feb 2 22:29 keepalived.conf.sample
-rwxr-xr-x. 1 root root 32 Feb 20 22:44 nginx_check.sh
drwxr-xr-x. 2 root root 4096 Feb 2 22:29 samples
global_defs {
    router_id KEEPALIVED_NODE_11    # must be unique for each keepalived node
}
# nginx status check script
vrrp_script chk_nginx {
    script "/usr/local/keeplived/etc/keepalived/nginx_check.sh"
    interval 2
    weight -20
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111    # ideally use a separate password per virtual IP; it authenticates the VRRP peers
    }
    track_script {
        chk_nginx
    }
    unicast_src_ip 192.168.126.11    # unicast source address
    unicast_peer {
        192.168.126.12
    }
    virtual_ipaddress {
        192.168.126.21/24
    }
}
vrrp_instance VI_2 {
    state BACKUP
    interface ens33
    virtual_router_id 52
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 11111
    }
    track_script {
        chk_nginx
    }
    unicast_src_ip 192.168.126.11    # unicast source address
    unicast_peer {
        192.168.126.12
    }
    virtual_ipaddress {
        192.168.126.22/24
    }
}
The configuration of keepalived node 2 is as follows:
global_defs {
    router_id KEEPALIVED_NODE_12
}
vrrp_script chk_nginx {
    script "/usr/local/keeplived/etc/keepalived/nginx_check.sh"
    interval 2
    weight -20    # when the script returns non-zero, lower the priority by 20
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51    # must be unique for each virtual IP
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111    # ideally use a separate password per virtual IP; it authenticates the VRRP peers
    }
    track_script {
        chk_nginx
    }
    unicast_src_ip 192.168.126.12    # unicast source address
    unicast_peer {
        192.168.126.11
    }
    virtual_ipaddress {
        192.168.126.21/24
    }
}
vrrp_instance VI_2 {
    state MASTER
    interface ens33
    virtual_router_id 52    # must be unique for each virtual IP; it must not repeat the virtual_router_id of VI_1
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 11111
    }
    track_script {
        chk_nginx
    }
    unicast_src_ip 192.168.126.12    # unicast source address
    unicast_peer {
        192.168.126.11
    }
    virtual_ipaddress {
        192.168.126.22/24
    }
}
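The configuration above uses a script named nginx_check.sh. Its content is as follows:
[root@localhost keepalived]# vi nginx_check.sh
#!/bin/bash
# Count the running nginx processes
count=$(ps -C nginx --no-header | wc -l)
# If no nginx process is left, exit with 2 so that keepalived treats the check as failed
if [ "$count" -eq 0 ]
then
    exit 2
fi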
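Startup test
With the configuration above in place, start keepalived and nginx.
The result matches expectations: the .11 server holds virtual IP 192.168.126.21 and the .12 server holds virtual IP 192.168.126.22.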
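One way to confirm which node currently holds which virtual IP is to look at the addresses on the VRRP interface. The following is a minimal sketch, not the author's exact procedure: has_vip is a hypothetical helper name, and ens33 is the interface name taken from the configuration.

```shell
#!/bin/bash
# has_vip is a hypothetical helper (not part of keepalived). It reads the
# output of `ip -4 -o addr show` on stdin and succeeds if the given
# address is configured on the interface.
has_vip() {
    # -F: fixed string, -w: whole-word match, so .21 does not match .210
    grep -qwF "inet $1"
}

# Usage on either node (interface name taken from the configuration):
#   ip -4 -o addr show dev ens33 | has_vip 192.168.126.21 && echo "VIP .21 is on this node"
#   ip -4 -o addr show dev ens33 | has_vip 192.168.126.22 && echo "VIP .22 is on this node"
```

The helper reads from stdin instead of calling ip itself, so it can also be pointed at saved command output.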
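Edit the nginx index.html file on each server and add the local IP so that the two servers can be told apart.
Accessing the web service through the virtual IPs http://192.168.126.21 and http://192.168.126.22 reaches the .11 and .12 servers respectively.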
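Simulating an Nginx outage
To simulate an outage, stop the nginx service on 192.168.126.11.
After a short wait, the keepalived log shows that the priorities of both VI_1 and VI_2 have been lowered.
After that I waited a long time and saw no further log output. At first I suspected this was related to how the check mechanism works, that is, the script is not run again once it has failed. The more direct cause is the arithmetic: after the failure the priority of VI_1 drops by 20 (100 - 20 = 80), which is exactly equal to the backup node's priority of 80. A backup takes over only when its priority is strictly higher than the priority the master advertises, so no master/backup switch took place. We therefore change the configuration so that a non-zero return from the script lowers the priority by 30 (weight -30), then save the file and restart the service.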
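The effect of the two weight values can be checked with a little arithmetic. The sketch below only illustrates the reasoning; effective_priority and backup_takes_over are hypothetical helper names, not keepalived commands, and the numbers are the priorities from the configuration in this article.

```shell
#!/bin/bash
# effective_priority and backup_takes_over are hypothetical helpers used
# only to illustrate the arithmetic; they are not keepalived commands.

# Priority of an instance after a tracked script with a negative weight fails.
effective_priority() {
    local base=$1 weight=$2
    echo $(( base + weight ))
}

# A backup takes over only if its priority is strictly higher than the
# priority the current master advertises.
backup_takes_over() {
    local master_prio=$1 backup_prio=$2
    [ "$backup_prio" -gt "$master_prio" ]
}

# VI_1: master base priority 100 on .11, backup priority 80 on .12
for weight in -20 -30; do
    master=$(effective_priority 100 "$weight")
    if backup_takes_over "$master" 80; then
        echo "weight $weight: master drops to $master, backup (80) takes over"
    else
        echo "weight $weight: master drops to $master, backup (80) stays BACKUP"
    fi
done
```

With weight -20 the two priorities tie at 80 and nothing moves. With weight -30 the master falls to 70 and the backup at 80 wins.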
Start the nginx service and test again:
[root@localhost keepalived]# systemctl restart keepalived
[root@localhost keepalived]# /usr/local/nginx/sbin/nginx
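Stop the nginx service again. The keepalived log shows that the priority is lowered and VI_1 switches to the BACKUP state, and the virtual IP floats to the .12 server.
keepalived log: /var/log/messages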
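Because /var/log/messages also contains output from many other services, it can help to filter it down to the VRRP state changes. This is a rough sketch under one assumption: that keepalived reports transitions with the wording "Entering MASTER STATE" / "Entering BACKUP STATE", which may differ between versions. vrrp_transitions is a hypothetical helper name.

```shell
#!/bin/bash
# vrrp_transitions is a hypothetical helper: it filters log lines on stdin
# down to VRRP state transitions. The matched wording is an assumption
# about the keepalived log format and may need adjusting.
vrrp_transitions() {
    grep -E 'Entering (MASTER|BACKUP|FAULT) STATE'
}

# Usage (log path from the article):
#   grep -i keepalived /var/log/messages | vrrp_transitions
#   tail -f /var/log/messages | grep --line-buffered -i keepalived
```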
Accessing the web service through virtual IP 192.168.126.21 now reaches the .12 node directly.
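Restart
Restart nginx on the .11 node. The priority is restored and VI_1 on that node returns to the MASTER state.
At the same time, VI_1 on the .12 node goes back to the BACKUP state.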
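Problem analysis
During testing, both nodes ended up as MASTER for the same virtual IP at the same time. Investigation showed that the firewall was enabled, so the VRRP advertisement packets could not be sent or received. Since neither node received the other's advertisements, each one considered itself MASTER.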
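One possible fix, instead of disabling the firewall entirely, is to allow the VRRP protocol through it. The sketch below assumes firewalld is the active firewall (the CentOS 7 default), which the article does not state. run and allow_vrrp are hypothetical helper names. By default the script only prints the commands.

```shell
#!/bin/bash
# Assumption: firewalld is the active firewall (the CentOS 7 default).
# run and allow_vrrp are hypothetical helper names. With DRY_RUN=1 (the
# default) commands are only printed, without shell quoting; set
# DRY_RUN=0 and run as root on both nodes to apply them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

allow_vrrp() {
    # Permit VRRP (IP protocol 112) so advertisements can reach the peer
    run firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
    # Load the permanent rule into the running firewall
    run firewall-cmd --reload
}

allow_vrrp
```

After the rule is applied on both nodes, each node should again see its peer's advertisements, and only one of them should stay MASTER for each virtual IP.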