I. What Is High Availability?
1. The enterprise high-availability yardstick: annual uptime
Annual uptime | Annual downtime | Downtime per year | Typical solution
99.9% | 0.1% | 525.6 min | keepalived + dual master (switchover requires manual intervention)
99.99% | 0.01% | 52.56 min | MHA (semi-automated)
99.999% | 0.001% | 5.256 min | PXC, MGR, MGC (automated)
99.9999% | 0.0001% | 0.5256 min | automated, cloud-based, platform-based
II. MHA Overview
1. How MHA works
1. Monitoring
masterha_master_monitor probes the master's heartbeat every ping_interval seconds. If the heartbeat is lost, the master gets four chances in total before failover is triggered.
2. Master election
When the master goes down, who takes over?
1. If every slave has identical logs, the new master is picked by default in the order the servers appear in the configuration file.
2. If the slaves' logs differ, the slave closest to the master is picked automatically.
3. If a node is given priority (candidate_master=1), that node is preferred.
However, if the candidate's logs lag the master by more than 100 MB, it is still skipped. Combine it with check_repl_delay=0 to disable the lag check and force the candidate to be chosen.
(1) ping_interval=1
# Interval, in seconds, between ping probes sent to the master; after three unanswered attempts, failover is triggered automatically.
(2) candidate_master=1
# Marks the node as a candidate master. During a switchover this slave is promoted to master even if it is not the slave with the newest events in the cluster.
(3) check_repl_delay=0
# By default MHA will not pick a slave as the new master if it lags the master by more than 100 MB of relay logs, because recovering such a slave takes a long time. With check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is very useful together with candidate_master=1, since it guarantees the candidate becomes the new master during failover. (The fragment below shows where these parameters live in the configuration file.)
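For orientation, a sketch of where these parameters sit in the MHA configuration file built later in this walkthrough (/etc/mha/app1.cnf); treating db02's [server2] block as the preferred candidate is an assumption, not part of the actual setup below:
[server default]
ping_interval=1
[server2]
hostname=192.168.20.120
port=3306
candidate_master=1
check_repl_delay=0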
3. Data/log compensation
1) If SSH is reachable
Each slave immediately saves the missing portion of the binlog to /var/tmp/xxxx via save_binary_logs.
2) If SSH is not reachable
The slaves call apply_diff_relay_logs to compute and apply the relay-log difference between nodes.
4. Failover
Tear down the original replication relationships and build the new master-slave topology.
5. Automatically remove the failed node from the configuration file
6. MHA terminates itself (the Manager exits after one failover)
7. Application transparency: VIP
8. Supplementary data compensation: binlog server
9. Failure notification: e-mail, DingTalk
(A sketch of items 7 and 8 follows.)
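Items 7 and 8 are easiest to picture with a sketch. MHA itself does not move a VIP or stream binlogs to a backup host; the VIP is handled by the master_ip_failover / master_ip_online_change scripts, and the binlog server is simply another machine pulling the master's binlogs. A rough illustration only — the VIP 192.168.20.200, the interface ens33, and the starting binlog file are assumptions, not values from this setup:
# what a master_ip_failover script typically does on the new master
ip addr add 192.168.20.200/24 dev ens33
arping -c 3 -A -I ens33 192.168.20.200          # refresh ARP caches so clients follow the VIP
# and on the old master, when it is reachable: release the VIP
ip addr del 192.168.20.200/24 dev ens33
# binlog server: keep a live copy of the master's binlogs on a separate host
mysqlbinlog --read-from-remote-server --raw --stop-never --host=192.168.20.132 --user=mha -p binlog.000001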
2. MHA architecture
One master, two slaves (master: db01; slaves: db02, db03):
The MHA high-availability solution consists of two pieces of software:
Manager software: installed on one of the slave nodes
Node software: installed on every node
3. MHA software components
The Manager toolkit mainly includes the following tools:
masterha_manager          start MHA
masterha_check_ssh        check MHA's SSH configuration
masterha_check_repl       check MySQL replication health
masterha_master_monitor   detect whether the master is down
masterha_check_status     check the current MHA run status
masterha_master_switch    control failover (automatic or manual)
masterha_conf_host        add or remove a configured server entry
The Node toolkit mainly includes the following tools:
(These are normally invoked by the MHA Manager's scripts and need no manual operation.)
save_binary_logs          save and copy the master's binary logs
apply_diff_relay_logs     identify differing relay-log events and apply the difference to other slaves
purge_relay_logs          purge relay logs without blocking the SQL thread (see the example below)
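Of the Node tools, purge_relay_logs is the one often run by hand or from cron on each slave, because MHA expects relay logs to be kept (relay_log_purge=0) and they have to be cleaned up out of band. A hedged sketch — the credentials and working directory are placeholders:
purge_relay_logs --user=root --password=xxx --port=3306 --disable_relay_log_purge --workdir=/data/tmp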
4. MHA cluster
III. Building the MHA Environment
1. Prepare a one-master, two-slave environment (a replication sketch follows)
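Setting up the GTID-based replication itself is not covered here; for reference, a sketch of what each slave (db02/db03) would run, reusing the master address and the replication account that appear later in this walkthrough (treating replmha/'ok' as the replication credentials is an assumption):
mysql -uroot -p -e "change master to master_host='192.168.20.132', master_port=3306, master_user='replmha', master_password='ok', master_auto_position=1; start slave;"
mysql -uroot -p -e "show slave status\G" | grep -E 'Slave_IO_Running|Slave_SQL_Running'   # both should say Yes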
2. Create symlinks on all nodes
[root@localhost bin]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
[root@localhost bin]# ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql
[root@localhost bin]# ls -l /usr/bin/mysql*
lrwxrwxrwx 1 root root 26 5月 30 15:27 /usr/bin/mysql -> /usr/local/mysql/bin/mysql
lrwxrwxrwx 1 root root 32 5月 30 15:27 /usr/bin/mysqlbinlog -> /usr/local/mysql/bin/mysqlbinlog
3. Configure mutual trust between nodes (passwordless SSH between all nodes)
db01:
rm -rf /root/.ssh
ssh-keygen
cd /root/.ssh
mv id_rsa.pub authorized_keys
scp -r /root/.ssh 192.168.20.120:/root
scp -r /root/.ssh 192.168.20.231:/root
Verify from each node:
db01:
ssh 192.168.20.132 date
ssh 192.168.20.120 date
ssh 192.168.20.231 date
db02:
ssh 192.168.20.132 date
ssh 192.168.20.120 date
ssh 192.168.20.231 date
db03:
ssh 192.168.20.132 date
ssh 192.168.20.120 date
ssh 192.168.20.231 date
4. Download the MHA software
MHA project page: https://code.google.com/archive/p/mysql-master-ha/
GitHub downloads (usable for MySQL 5.7 and earlier): https://github.com/yoshinorim/mha4mysql-manager/wiki/Downloads
For MySQL 8.0: https://pan.baidu.com/s/1-yo1KjZZUvbHxrcI9Yl7-Q  (extraction code: fr50)
5. Install the MHA Node package on all nodes
[root@localhost opt]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:mha4mysql-node-0.58-0.el7.centos ################################# [100%]
[root@localhost opt]#
6. Create the users MHA needs: mha and replmha
mysql> create user mha@'%' identified with mysql_native_password by 'ok';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON *.* TO 'mha'@'%' WITH GRANT OPTION;
Query OK, 0 rows affected (0.01 sec)
mysql> select user,host,authentication_string,plugin from mysql.user;
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
| user | host | authentication_string | plugin |
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
| mha | % | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| replmha | % | $A$005$)fcB"IXZmo{qr%)71hR72t5sUOD3H27kNo8uGnWX8/mkwadbMlpTdSyw9B | caching_sha2_password |
| mysql.infoschema | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| mysql.session | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| mysql.sys | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| repl | localhost | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| root | localhost | | caching_sha2_password |
| yizuo | localhost | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
8 rows in set (0.00 sec)
The replication user (repl/replmha) must be created on every node, and the same goes for the administrative user. Before doing any of this on a slave, remember to run set sql_log_bin=0 first to disable binary logging for the session.
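Repeating this on another node is just the same statements with binary logging switched off for the session; a sketch wrapping them in a single call (the user names and the throw-away password 'ok' are the ones used above):
mysql -uroot -p -e "set sql_log_bin=0;
create user mha@'%' identified with mysql_native_password by 'ok';
grant all privileges on *.* to 'mha'@'%' with grant option;
create user replmha@'%' identified with mysql_native_password by 'ok';
grant all privileges on *.* to 'replmha'@'%' with grant option;
set sql_log_bin=1;"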
7. Install the Manager software (db03)
# Install dependencies on the manager node
yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager
# Install mha4mysql-manager on the manager node
rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
8. Prepare the configuration file (db03)
Create the configuration directory: mkdir -p /etc/mha
Create the log directory: mkdir -p /var/log/mha/app1
Edit the MHA configuration file:
vim /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager
manager_workdir=/var/log/mha/app1
master_binlog_dir=/data/3306/binlog
user=replmha
password=ok
ping_interval=2
repl_password=ok
repl_user=mha
ssh_user=root
[server1]
hostname=192.168.20.132
port=3306
[server2]
hostname=192.168.20.120
port=3306
[server3]
hostname=192.168.20.231
port=3306
Key MHA configuration parameters
manager_workdir=/var/log/masterha/app1.log: the manager's working directory
manager_log=/var/log/masterha/app1/manager.log: the manager's log file; this is the main log to check when something goes wrong
master_binlog_dir=/data/mysql: where the master stores its binlogs, so MHA can find them (the master's binary-log directory)
master_ip_failover_script=/usr/local/bin/master_ip_failover: script run during automatic failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change: script run during a manual (online) switchover
user=root: MySQL user used for monitoring
password=dayi123: password of the monitoring user, which must be granted remote login from the manager node
ping_interval=1: interval between ping probes sent to the master (default 3 seconds); failover is triggered automatically after three unanswered attempts
remote_workdir=/tmp: where the remote MySQL hosts keep binlogs during a switchover
repl_user=repl: MySQL user used for replication
repl_password=replication: password of the replication user
report_script=/usr/local/send_report: alert script run after a switchover
shutdown_script="": script to shut down the failed host after a failure (its main purpose is to prevent split brain; not used here)
ssh_user=root: SSH login user
candidate_master=1: set under a [serverN] section; marks that node as a candidate master
check_repl_delay=0: set under a [serverN] section; by default MHA will not choose a slave that lags the master by more than 100 MB of relay logs as the new master, and this option disables that check. It is very useful for hosts with candidate_master=1.
(Parameter reference adapted from: https://blog.csdn.net/dayi_123/article/details/83690608)
9. SSH mutual-trust check (db03)
[root@localhost opt]# masterha_check_ssh --conf=/etc/mha/app1.cnf
Thu May 30 16:25:13 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 30 16:25:13 2024 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu May 30 16:25:13 2024 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu May 30 16:25:13 2024 - [info] Starting SSH connection tests..
Thu May 30 16:25:14 2024 - [debug]
Thu May 30 16:25:13 2024 - [debug] Connecting via SSH from root@192.168.20.132(192.168.20.132:22) to root@192.168.20.120(192.168.20.120:22)..
Thu May 30 16:25:14 2024 - [debug] ok.
Thu May 30 16:25:14 2024 - [debug] Connecting via SSH from root@192.168.20.132(192.168.20.132:22) to root@192.168.20.231(192.168.20.231:22)..
Thu May 30 16:25:14 2024 - [debug] ok.
Thu May 30 16:25:15 2024 - [debug]
Thu May 30 16:25:14 2024 - [debug] Connecting via SSH from root@192.168.20.120(192.168.20.120:22) to root@192.168.20.132(192.168.20.132:22)..
Thu May 30 16:25:14 2024 - [debug] ok.
Thu May 30 16:25:14 2024 - [debug] Connecting via SSH from root@192.168.20.120(192.168.20.120:22) to root@192.168.20.231(192.168.20.231:22)..
Thu May 30 16:25:15 2024 - [debug] ok.
Thu May 30 16:25:16 2024 - [debug]
Thu May 30 16:25:14 2024 - [debug] Connecting via SSH from root@192.168.20.231(192.168.20.231:22) to root@192.168.20.132(192.168.20.132:22)..
Thu May 30 16:25:15 2024 - [debug] ok.
Thu May 30 16:25:15 2024 - [debug] Connecting via SSH from root@192.168.20.231(192.168.20.231:22) to root@192.168.20.120(192.168.20.120:22)..
Thu May 30 16:25:15 2024 - [debug] ok.
Thu May 30 16:25:16 2024 - [info] All SSH connection tests passed successfully.
10. Replication check (db03)
masterha_check_repl --conf=/etc/mha/app1.cnf
[root@localhost app1]# masterha_check_repl --conf=/etc/mha/app1.cnf
Thu May 30 19:09:30 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 30 19:09:30 2024 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu May 30 19:09:30 2024 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu May 30 19:09:30 2024 - [info] MHA::MasterMonitor version 0.58.
Thu May 30 19:09:31 2024 - [info] GTID failover mode = 1
Thu May 30 19:09:31 2024 - [info] Dead Servers:
Thu May 30 19:09:31 2024 - [info] Alive Servers:
Thu May 30 19:09:31 2024 - [info] 192.168.20.132(192.168.20.132:3306)
Thu May 30 19:09:31 2024 - [info] 192.168.20.120(192.168.20.120:3306)
Thu May 30 19:09:31 2024 - [info] 192.168.20.231(192.168.20.231:3306)
Thu May 30 19:09:31 2024 - [info] Alive Slaves:
Thu May 30 19:09:31 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Thu May 30 19:09:31 2024 - [info] GTID ON
Thu May 30 19:09:31 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Thu May 30 19:09:31 2024 - [info] 192.168.20.231(192.168.20.231:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Thu May 30 19:09:31 2024 - [info] GTID ON
Thu May 30 19:09:31 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Thu May 30 19:09:31 2024 - [info] Current Alive Master: 192.168.20.132(192.168.20.132:3306)
Thu May 30 19:09:31 2024 - [info] Checking slave configurations..
Thu May 30 19:09:31 2024 - [info] read_only=1 is not set on slave 192.168.20.120(192.168.20.120:3306).
Thu May 30 19:09:31 2024 - [info] read_only=1 is not set on slave 192.168.20.231(192.168.20.231:3306).
Thu May 30 19:09:31 2024 - [info] Checking replication filtering settings..
Thu May 30 19:09:31 2024 - [info] binlog_do_db= , binlog_ignore_db=
Thu May 30 19:09:31 2024 - [info] Replication filtering check ok.
Thu May 30 19:09:31 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Thu May 30 19:09:31 2024 - [info] Checking SSH publickey authentication settings on the current master..
Thu May 30 19:09:32 2024 - [info] HealthCheck: SSH to 192.168.20.132 is reachable.
Thu May 30 19:09:32 2024 - [info]
192.168.20.132(192.168.20.132:3306) (current master)
 +--192.168.20.120(192.168.20.120:3306)
 +--192.168.20.231(192.168.20.231:3306)
Thu May 30 19:09:32 2024 - [info] Checking replication health on 192.168.20.120..
Thu May 30 19:09:32 2024 - [info] ok.
Thu May 30 19:09:32 2024 - [info] Checking replication health on 192.168.20.231..
Thu May 30 19:09:32 2024 - [info] ok.
Thu May 30 19:09:32 2024 - [warning] master_ip_failover_script is not defined.
Thu May 30 19:09:32 2024 - [warning] shutdown_script is not defined.
Thu May 30 19:09:32 2024 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
1) Replication-check pitfall: Attempt to reload DBD/mysql.pm aborted
[root@localhost data]# masterha_check_repl --conf=/etc/mha/app1.cnf
Thu May 30 17:43:25 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 30 17:43:25 2024 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu May 30 17:43:25 2024 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu May 30 17:43:25 2024 - [info] MHA::MasterMonitor version 0.58.
Thu May 30 17:43:25 2024 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
 at /usr/share/perl5/vendor_perl/MHA/DBHelper.pm line 208.
 at /usr/share/perl5/vendor_perl/MHA/Server.pm line 166.
Thu May 30 17:43:25 2024 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
 at /usr/share/perl5/vendor_perl/MHA/DBHelper.pm line 208.
 at /usr/share/perl5/vendor_perl/MHA/Server.pm line 166.
Thu May 30 17:43:25 2024 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
 at /usr/share/perl5/vendor_perl/MHA/DBHelper.pm line 208.
 at /usr/share/perl5/vendor_perl/MHA/Server.pm line 166.
Thu May 30 17:43:26 2024 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations
Thu May 30 17:43:26 2024 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 329.
Thu May 30 17:43:26 2024 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu May 30 17:43:26 2024 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
Fix:
Followed a CSDN post on "install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted", whose gist is to reinstall the Perl DBD::mysql driver.
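That reinstall comes down to building DBD::mysql from CPAN; a sketch of the kind of command involved, assuming the cpan client and the MySQL client library/headers are already present:
cpan -i DBD::mysql        # builds DBD::mysql against the local MySQL client library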
Following it surfaced a new problem:
chmod 755 blib/arch/auto/DBD/mysql/mysql.so
Manifying 2 pod documents
  DVEEDEN/DBD-mysql-5.005.tar.gz
  /usr/bin/make -- OK
Running make test
"/usr/bin/perl" -MExtUtils::Command::MM -e 'cp_nonempty' -- mysql.bs blib/arch/auto/DBD/mysql/mysql.bs 644
PERL_DL_NONLAZY=1 "/usr/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/00base.t .............................. 1/6 # Driver version is 5.005
t/00base.t .............................. ok
t/01caching_sha2_prime.t ................ ok
t/05dbcreate.t .......................... 1/2 # Database 'test' accessible
t/05dbcreate.t .......................... ok
t/10connect.t ........................... 1/? # mysql_clientinfo is: 8.0.20
# mysql_clientversion is: 80020
# mysql_serverversion is: 80020
# mysql_hostinfo is: Localhost via UNIX socket
# mysql_serverinfo is: 8.0.20
# mysql_stat is: Uptime: 64536 Threads: 4 Questions: 1664 Slow queries: 200 Opens: 667 Flush tables: 4 Open tables: 408 Queries per second avg: 0.025
# mysql_protoinfo is: 10
# SQL_DBMS_VER is 8.0.20
# Default storage engine is: InnoDB
t/10connect.t ........................... ok
t/15reconnect.t ......................... ok
t/16dbi-get_info.t ...................... ok
t/17quote.t ............................. ok
t/20createdrop.t ........................ ok
t/25lockunlock.t ........................ ok
t/29warnings.t .......................... ok
t/30insertfetch.t ....................... ok
t/31insertid.t .......................... ok
t/32insert_error.t ...................... ok
t/35limit.t ............................. ok
t/35prepare.t ........................... ok
t/40bindparam.t ......................... ok
t/40bindparam2.t ........................ ok
t/40bit.t ............................... ok
t/40blobs.t ............................. ok
t/40catalog.t ........................... ok
t/40keyinfo.t ........................... ok
t/40listfields.t ........................ ok
t/40nulls.t ............................. ok
t/40nulls_prepare.t ..................... ok
t/40numrows.t ........................... ok
t/40server_prepare.t .................... ok
t/40server_prepare_crash.t .............. ok
t/40server_prepare_error.t .............. ok
t/40types.t ............................. ok
t/41bindparam.t ......................... ok
t/41blobs_prepare.t ..................... ok
t/41int_min_max.t ....................... ok
t/42bindparam.t ......................... ok
t/43count_params.t ...................... ok
t/50chopblanks.t ........................ ok
t/50commit.t ............................ ok
t/51bind_type_guessing.t ................ 1/98 DBD::mysql::st execute failed: Data truncated for column 'dd' at row 1 at t/51bind_type_guessing.t line 136.
DBD::mysql::st execute failed: Data truncated for column 'nn' at row 1 at t/51bind_type_guessing.t line 114.
DBD::mysql::st execute failed: Data truncated for column 'dd' at row 1 at t/51bind_type_guessing.t line 136.
DBD::mysql::st execute failed: Incorrect integer value: '+' for column 'nn' at row 1 at t/51bind_type_guessing.t line 114.
DBD::mysql::st execute failed: Data truncated for column 'dd' at row 1 at t/51bind_type_guessing.t line 136.
DBD::mysql::st execute failed: Incorrect integer value: '.' for column 'nn' at row 1 at t/51bind_type_guessing.t line 114.
DBD::mysql::st execute failed: Data truncated for column 'dd' at row 1 at t/51bind_type_guessing.t line 136.
DBD::mysql::st execute failed: Incorrect integer value: 'e5' for column 'nn' at row 1 at t/51bind_type_guessing.t line 114.
DBD::mysql::st execute failed: Data truncated for column 'dd' at row 1 at t/51bind_type_guessing.t line 136.
t/51bind_type_guessing.t ................ ok
t/52comment.t ........................... ok
t/53comment.t ........................... ok
t/55utf8.t .............................. ok
t/55utf8mb4.t ........................... ok
t/56connattr.t .......................... ok
t/57trackgtid.t ......................... skipped: GTID tracking not enabled
t/60leaks.t ............................. skipped: Skip $ENV{EXTENDED_TESTING} is not set
t/65segfault.t .......................... ok
t/65types.t ............................. ok
t/70takeimp.t ........................... ok
t/71impdata.t ........................... ok
t/75supported_sql.t ..................... ok
t/76multi_statement.t ................... ok
t/80procs.t ............................. ok
t/81procs.t ............................. ok
t/85init_command.t ...................... ok
t/86_bug_36972.t ........................ ok
t/87async.t ............................. ok
t/88async-multi-stmts.t ................. ok
t/89async-method-check.t ................ ok
t/91errcheck.t .......................... ok
t/92ssl_backronym_vulnerability.t ....... skipped: Server supports SSL connections, cannot test false-positive enforcement
t/92ssl_optional.t ...................... skipped: Server supports SSL connections, cannot test fallback to plain text
t/92ssl_riddle_vulnerability.t .......... skipped: Server supports SSL connections, cannot test false-positive enforcement
t/99_bug_server_prepare_blob_null.t ..... ok
t/99compression.t ....................... ok
t/gh352.t ............................... 1/2 DBD::mysql::db prepare failed: Statement not active at t/gh352.t line 27.
t/gh352.t ............................... ok
t/gh360.t ............................... ok
t/manifest.t ............................ skipped: these tests are for release testing
t/pod.t ................................. 1/3
# Failed test 'POD test for blib/lib/DBD/mysql.pm'
# at /usr/share/perl5/vendor_perl/Test/Pod.pm line 186.
# blib/lib/DBD/mysql.pm (1350): You forgot a '=back' before '=head1'
# Looks like you failed 1 test of 3.
t/pod.t ................................. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/3 subtests
t/rt110983-valid-mysqlfd.t .............. ok
t/rt118977-zerofill.t ................... ok
t/rt25389-bin-case.t .................... ok
t/rt50304-column_info_parentheses.t ..... ok
t/rt61849-bind-param-buffer-overflow.t .. ok
t/rt75353-innodb-lock-timeout.t ......... ok
t/rt83494-quotes-comments.t ............. ok
t/rt85919-fetch-lost-connection.t ....... ok
t/rt86153-reconnect-fail-memory.t ....... skipped: $ENV{EXTENDED_TESTING} is not set
t/rt88006-bit-prepare.t ................. ok
t/rt91715.t ............................. ok
t/version.t ............................. 1/? # mysql_get_client_version: 80020
t/version.t ............................. ok
Test Summary Report
-------------------
t/pod.t                               (Wstat: 256 Tests: 3 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
Files=79, Tests=2449, 41 wallclock secs ( 0.36 usr 0.12 sys + 4.32 cusr 0.98 csys = 5.78 CPU)
Result: FAIL
Failed 1/79 test programs. 1/2449 subtests failed.
make: *** [test_dynamic] Error 255
  DVEEDEN/DBD-mysql-5.005.tar.gz
  /usr/bin/make test -- NOT OK
//hint// to see the cpan-testers results for installing this module, try:
  reports DVEEDEN/DBD-mysql-5.005.tar.gz
Running make install
  make test had returned bad status, won't install without force
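Since the only failure here is the POD formatting test, one common way past "won't install without force" is to force the install from the cpan shell. That is a judgment call rather than a recommendation, and the route actually taken below was different:
# inside the interactive cpan shell
cpan> force install DBD::mysql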
To get past this, I followed another CSDN post on the same "install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted" error and checked the driver's shared-library dependencies:
[root@localhost bin]# ldd /usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql.so
        linux-vdso.so.1 =>  (0x00007fffa75b4000)
        libmysqlclient.so.18 => not found
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1a492b5000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f1a4909f000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f1a48d9d000)
        libssl.so.10 => /lib64/libssl.so.10 (0x00007f1a48b2b000)
        libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007f1a486ca000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f1a484c6000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f1a480f9000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f1a496ed000)
        libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f1a47eac000)
        libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f1a47bc4000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f1a479c0000)
        libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f1a4778d000)
        libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f1a4757f000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f1a4737b000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f1a47162000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f1a46f3b000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f1a46cd9000)
This showed libmysqlclient.so.18 => not found, so that had to be fixed next.
Look for the file on another server that already has MySQL installed:
[root@localhost opt]# find / -name libmysqlclient.so.18
find: '/run/user/1000/gvfs': Permission denied
/usr/lib64/mysql/libmysqlclient.so.18
[root@localhost opt]# sz /usr/lib64/mysql/libmysqlclient.so.18
Transfer the file to the affected server, copy it into the right directories, and rerun ldd; libmysqlclient.so.18 is no longer reported as not found:
[root@localhost opt]# cp libmysqlclient.so.18 /usr/lib/
[root@localhost opt]#
[root@localhost opt]# cp libmysqlclient.so.18 /usr/lib64/
[root@localhost opt]# ldd /usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql
ldd: /usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql: No such file or directory
[root@localhost opt]# ldd /usr/lib64/perl5/vendor_perl/auto/DBD/mysql/mysql.so
        linux-vdso.so.1 =>  (0x00007fff47db4000)
        libmysqlclient.so.18 => /lib64/libmysqlclient.so.18 (0x00007f3697cce000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f3697ab2000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f369789c000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f369759a000)
        libssl.so.10 => /lib64/libssl.so.10 (0x00007f3697328000)
        libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007f3696ec7000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f3696cc3000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f36968f6000)
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f36965ee000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f36963d8000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f369844e000)
        libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f369618b000)
        libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f3695ea3000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f3695c9f000)
        libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f3695a6c000)
        libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f369585e000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f369565a000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f3695441000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f369521a000)
        libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f3694fb8000)
Run the replication check again — good, that problem is gone, and the next one shows up:
[root@localhost opt]# masterha_check_repl --conf=/etc/mha/app1.cnf
Thu May 30 17:56:26 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu May 30 17:56:26 2024 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu May 30 17:56:26 2024 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu May 30 17:56:26 2024 - [info] MHA::MasterMonitor version 0.58.
Thu May 30 17:56:27 2024 - [info] GTID failover mode = 1
Thu May 30 17:56:27 2024 - [info] Dead Servers:
Thu May 30 17:56:27 2024 - [info] 192.168.20.231(192.168.20.231:3306)
Thu May 30 17:56:27 2024 - [info] Alive Servers:
Thu May 30 17:56:27 2024 - [info] 192.168.20.132(192.168.20.132:3306)
Thu May 30 17:56:27 2024 - [info] 192.168.20.120(192.168.20.120:3306)
Thu May 30 17:56:27 2024 - [info] Alive Slaves:
Thu May 30 17:56:27 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Thu May 30 17:56:27 2024 - [info] GTID ON
Thu May 30 17:56:27 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Thu May 30 17:56:27 2024 - [info] Current Alive Master: 192.168.20.132(192.168.20.132:3306)
Thu May 30 17:56:27 2024 - [info] Checking slave configurations..
Thu May 30 17:56:27 2024 - [info] read_only=1 is not set on slave 192.168.20.120(192.168.20.120:3306).
Thu May 30 17:56:27 2024 - [info] Checking replication filtering settings..
Thu May 30 17:56:27 2024 - [info] binlog_do_db= , binlog_ignore_db=
Thu May 30 17:56:27 2024 - [info] Replication filtering check ok.
Thu May 30 17:56:27 2024 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln398] 192.168.20.120(192.168.20.120:3306): User replmha does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
Thu May 30 17:56:27 2024 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403.
Thu May 30 17:56:27 2024 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu May 30 17:56:27 2024 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
2) Replication-check pitfall: User replmha does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
Fix: the replication user must be created on every node, and likewise the administrative user. Before doing any of this on a slave, remember to run set sql_log_bin=0 first to disable binary logging for the session.
3) Replication-check pitfall: Server 192.168.20.231(192.168.20.231:3306) is dead, but must be alive! Check server settings.
The port in the manager machine's (db03's) my.cnf had been set wrong; changing 3307 back to 3306 fixed it.
Other problems you may run into: https://www.cnblogs.com/xuliuzai/p/11980273.html
See also a CSDN post, "MySQL high-availability cluster setup with MHA (MHA download)".
4) Replication-check pitfall: Fri May 31 18:25:52 2024 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln180] Got MySQL error when connecting 192.168.20.231(192.168.20.231:3306) :1130:Host '192.168.20.231' is not allowed to connect to this MySQL server, but this is not a MySQL crash. Check MySQL server settings.
This error means the client host (IP 192.168.20.231) tried to connect to the MySQL server, but the server's privilege settings do not allow connections from that host. It usually indicates the MySQL user privileges are configured incorrectly, so the user cannot connect from that host.
On inspection, the whole datadir on this node had been restored from a backup, and the replmha user carried over in the backed-up privilege tables was unusable. Dropping it on the slave and recreating the replmha user with fresh grants resolved the problem.
Before doing anything on a slave, remember to run set sql_log_bin=0 first to disable binary logging for the session.
mysql> drop user replmha@'%';
Query OK, 0 rows affected (0.00 sec)
mysql> select user,host,authentication_string,plugin from mysql.user;
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
| user | host | authentication_string | plugin |
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
| mha | % | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| yiyi | % | $A$005$^oaN;+gtM.v?}dzN9ur30WU8M8ZKEMmqPx00qANDdp3WuzcAvu4DbDz6 | caching_sha2_password |
| mysql.infoschema | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| mysql.session | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| mysql.sys | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| repl | localhost | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| root | localhost | | caching_sha2_password |
| yizuo | localhost | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
8 rows in set (0.00 sec)
mysql> create user replmha@'%' identified with mysql_native_password by 'ok';
Query OK, 0 rows affected (0.00 sec)
mysql> select user,host,authentication_string,plugin from mysql.user;
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
| user | host | authentication_string | plugin |
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
| mha | % | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| replmha | % | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| yiyi | % | $A$005$^oaN;+gtM.v?}dzN9ur30WU8M8ZKEMmqPx00qANDdp3WuzcAvu4DbDz6 | caching_sha2_password |
| mysql.infoschema | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| mysql.session | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| mysql.sys | localhost | $A$005$THISISACOMBINATIONOFINVALIDSALTANDPASSWORDTHATMUSTNEVERBRBEUSED | caching_sha2_password |
| repl | localhost | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
| root | localhost | | caching_sha2_password |
| yizuo | localhost | *31330A9B24799CC9566A39CBD78CEF60E26C906F | mysql_native_password |
+------------------+-----------+------------------------------------------------------------------------+-----------------------+
9 rows in set (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON *.* TO 'replmha'@'%' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
mysql> show master status;
+---------------+----------+--------------+------------------+-------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------------------------------+
| binlog.000002 | 156 | | | 93909ace-1b58-11ef-81d8-000c2912a662:1-31 |
+---------------+----------+--------------+------------------+-------------------------------------------+
1 row in set (0.00 sec)
mysql> set sql_log_bin=1;
Query OK, 0 rows affected (0.00 sec)
mysql> show master status;
+---------------+----------+--------------+------------------+-------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------------------------------+
| binlog.000002 | 156 | | | 93909ace-1b58-11ef-81d8-000c2912a662:1-31 |
+---------------+----------+--------------+------------------+-------------------------------------------+
1 row in set (0.00 sec)
11. Start MHA (db03)
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
[root@localhost /]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
[1] 88762    # no error output here means the start succeeded
[root@localhost /]#
12. Check MHA status (db03)
[root@localhost /]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:88762) is running(0:PING_OK), master:192.168.20.132
13. After the servers are shut down and rebooted (a consolidated command sketch follows)
1) Check that the database started correctly on every machine
2) Check the one-master, two-slave replication status
3) Check the MHA configuration file
4) Run the SSH mutual-trust check and the replication check
5) Start MHA
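A consolidated sketch of those five checks, reusing the commands from earlier in this walkthrough (how mysqld is started depends on how MySQL was installed, so that first line is an assumption):
/etc/init.d/mysqld start                                         # or however mysqld is managed on each node
mysql -e "show slave status\G" | grep -E 'Running|Master_Host'   # on db02/db03: IO and SQL threads should both be Yes
cat /etc/mha/app1.cnf                                            # confirm the [server default] and [serverN] blocks are intact
masterha_check_ssh  --conf=/etc/mha/app1.cnf
masterha_check_repl --conf=/etc/mha/app1.cnf
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &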
IV. Simulating a Failure and Recovering with MHA
1. Check MHA status
[root@localhost app1]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:88762) is running(0:PING_OK), master:192.168.20.132
2. Master goes down
mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)
mysql> quit;
# On db03, check the MHA status:
[root@localhost app1]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 master is down and failover is running(50:FAILOVER_RUNNING). master:192.168.20.132
Check /var/log/mha/app1/manager for details.
3. Check the log
cat /var/log/mha/app1/manager
----- Failover Report -----
app1: MySQL Master failover 192.168.20.132(192.168.20.132:3306) to 192.168.20.120(192.168.20.120:3306)
Master 192.168.20.132(192.168.20.132:3306) is down!
Check MHA Manager logs at localhost.localdomain:/var/log/mha/app1/manager for details.
Started automated(non-interactive) failover.
Selected 192.168.20.120(192.168.20.120:3306) as a new master.
192.168.20.120(192.168.20.120:3306): OK: Applying all logs succeeded.
192.168.20.231(192.168.20.231:3306): ERROR: Starting slave failed.
Master failover to 192.168.20.120(192.168.20.120:3306) done, but recovery on slave partially failed.
Fri May 31 20:04:25 2024 - [info] MHA::MasterMonitor version 0.58.
Fri May 31 20:04:26 2024 - [info] GTID failover mode = 1
Fri May 31 20:04:26 2024 - [info] Dead Servers:
Fri May 31 20:04:26 2024 - [info] Alive Servers:
Fri May 31 20:04:26 2024 - [info] 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:04:26 2024 - [info] 192.168.20.120(192.168.20.120:3306)
Fri May 31 20:04:26 2024 - [info] 192.168.20.231(192.168.20.231:3306)
Fri May 31 20:04:26 2024 - [info] Alive Slaves:
Fri May 31 20:04:26 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:04:26 2024 - [info] GTID ON
Fri May 31 20:04:26 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:04:26 2024 - [info] 192.168.20.231(192.168.20.231:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:04:26 2024 - [info] GTID ON
Fri May 31 20:04:26 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:04:26 2024 - [info] Current Alive Master: 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:04:26 2024 - [info] Checking slave configurations..
Fri May 31 20:04:26 2024 - [info] read_only=1 is not set on slave 192.168.20.120(192.168.20.120:3306).
Fri May 31 20:04:26 2024 - [info] read_only=1 is not set on slave 192.168.20.231(192.168.20.231:3306).
Fri May 31 20:04:26 2024 - [info] Checking replication filtering settings..
Fri May 31 20:04:26 2024 - [info] binlog_do_db= , binlog_ignore_db=
Fri May 31 20:04:26 2024 - [info] Replication filtering check ok.
Fri May 31 20:04:26 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Fri May 31 20:04:26 2024 - [info] Checking SSH publickey authentication settings on the current master..
Fri May 31 20:04:26 2024 - [info] HealthCheck: SSH to 192.168.20.132 is reachable.
Fri May 31 20:04:26 2024 - [info]
192.168.20.132(192.168.20.132:3306) (current master)
 +--192.168.20.120(192.168.20.120:3306)
 +--192.168.20.231(192.168.20.231:3306)
Fri May 31 20:04:26 2024 - [warning] master_ip_failover_script is not defined.
Fri May 31 20:04:26 2024 - [warning] shutdown_script is not defined.
Fri May 31 20:04:26 2024 - [info] Set master ping interval 2 seconds.
Fri May 31 20:04:26 2024 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Fri May 31 20:04:26 2024 - [info] Starting ping health check on 192.168.20.132(192.168.20.132:3306)..
Fri May 31 20:04:26 2024 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Fri May 31 20:05:04 2024 - [warning] Got error on MySQL select ping: 1053 (Server shutdown in progress)
Fri May 31 20:05:04 2024 - [info] Executing SSH check script: exit 0
Fri May 31 20:05:04 2024 - [info] HealthCheck: SSH to 192.168.20.132 is reachable.
Fri May 31 20:05:06 2024 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.20.132' (111))
Fri May 31 20:05:06 2024 - [warning] Connection failed 2 time(s)..
Fri May 31 20:05:08 2024 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.20.132' (111))
Fri May 31 20:05:08 2024 - [warning] Connection failed 3 time(s)..
Fri May 31 20:05:10 2024 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.20.132' (111))
Fri May 31 20:05:10 2024 - [warning] Connection failed 4 time(s)..
Fri May 31 20:05:10 2024 - [warning] Master is not reachable from health checker!
Fri May 31 20:05:10 2024 - [warning] Master 192.168.20.132(192.168.20.132:3306) is not reachable!
Fri May 31 20:05:10 2024 - [warning] SSH is reachable.
Fri May 31 20:05:10 2024 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
Fri May 31 20:05:10 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri May 31 20:05:10 2024 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri May 31 20:05:10 2024 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri May 31 20:05:11 2024 - [info] GTID failover mode = 1
Fri May 31 20:05:11 2024 - [info] Dead Servers:
Fri May 31 20:05:11 2024 - [info] 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:11 2024 - [info] Alive Servers:
Fri May 31 20:05:11 2024 - [info] 192.168.20.120(192.168.20.120:3306)
Fri May 31 20:05:11 2024 - [info] 192.168.20.231(192.168.20.231:3306)
Fri May 31 20:05:11 2024 - [info] Alive Slaves:
Fri May 31 20:05:11 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:11 2024 - [info] GTID ON
Fri May 31 20:05:11 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:11 2024 - [info] 192.168.20.231(192.168.20.231:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:11 2024 - [info] GTID ON
Fri May 31 20:05:11 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:11 2024 - [info] Checking slave configurations..
Fri May 31 20:05:11 2024 - [info] read_only=1 is not set on slave 192.168.20.120(192.168.20.120:3306).
Fri May 31 20:05:11 2024 - [info] read_only=1 is not set on slave 192.168.20.231(192.168.20.231:3306).
Fri May 31 20:05:11 2024 - [info] Checking replication filtering settings..
Fri May 31 20:05:11 2024 - [info] Replication filtering check ok.
Fri May 31 20:05:11 2024 - [info] Master is down!
Fri May 31 20:05:11 2024 - [info] Terminating monitoring script.
Fri May 31 20:05:11 2024 - [info] Got exit code 20 (Master dead).
Fri May 31 20:05:11 2024 - [info] MHA::MasterFailover version 0.58.
Fri May 31 20:05:11 2024 - [info] Starting master failover.
Fri May 31 20:05:11 2024 - [info]
Fri May 31 20:05:11 2024 - [info] * Phase 1: Configuration Check Phase..
Fri May 31 20:05:11 2024 - [info]
Fri May 31 20:05:12 2024 - [info] GTID failover mode = 1
Fri May 31 20:05:12 2024 - [info] Dead Servers:
Fri May 31 20:05:12 2024 - [info] 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:12 2024 - [info] Checking master reachability via MySQL(double check)...
Fri May 31 20:05:12 2024 - [info] ok.
Fri May 31 20:05:12 2024 - [info] Alive Servers:
Fri May 31 20:05:12 2024 - [info] 192.168.20.120(192.168.20.120:3306)
Fri May 31 20:05:12 2024 - [info] 192.168.20.231(192.168.20.231:3306)
Fri May 31 20:05:12 2024 - [info] Alive Slaves:
Fri May 31 20:05:12 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:12 2024 - [info] GTID ON
Fri May 31 20:05:12 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:12 2024 - [info] 192.168.20.231(192.168.20.231:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:12 2024 - [info] GTID ON
Fri May 31 20:05:12 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:12 2024 - [info] Starting GTID based failover.
Fri May 31 20:05:12 2024 - [info]
Fri May 31 20:05:12 2024 - [info] ** Phase 1: Configuration Check Phase completed.
Fri May 31 20:05:12 2024 - [info]
Fri May 31 20:05:12 2024 - [info] * Phase 2: Dead Master Shutdown Phase..
Fri May 31 20:05:12 2024 - [info]
Fri May 31 20:05:12 2024 - [info] Forcing shutdown so that applications never connect to the current master..
Fri May 31 20:05:12 2024 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Fri May 31 20:05:12 2024 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Fri May 31 20:05:13 2024 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] * Phase 3: Master Recovery Phase..
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] The latest binary log file/position on all slaves is binlog.000011:196
Fri May 31 20:05:13 2024 - [info] Latest slaves (Slaves that received relay log files to the latest):
Fri May 31 20:05:13 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:13 2024 - [info] GTID ON
Fri May 31 20:05:13 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:13 2024 - [info] 192.168.20.231(192.168.20.231:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:13 2024 - [info] GTID ON
Fri May 31 20:05:13 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:13 2024 - [info] The oldest binary log file/position on all slaves is binlog.000011:196
Fri May 31 20:05:13 2024 - [info] Oldest slaves:
Fri May 31 20:05:13 2024 - [info] 192.168.20.120(192.168.20.120:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:13 2024 - [info] GTID ON
Fri May 31 20:05:13 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:13 2024 - [info] 192.168.20.231(192.168.20.231:3306) Version=8.0.20 (oldest major version between slaves) log-bin:enabled
Fri May 31 20:05:13 2024 - [info] GTID ON
Fri May 31 20:05:13 2024 - [info] Replicating from 192.168.20.132(192.168.20.132:3306)
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] * Phase 3.3: Determining New Master Phase..
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] Searching new master from slaves..
Fri May 31 20:05:13 2024 - [info] Candidate masters from the configuration file:
Fri May 31 20:05:13 2024 - [info] Non-candidate masters:
Fri May 31 20:05:13 2024 - [info] New master is 192.168.20.120(192.168.20.120:3306)
Fri May 31 20:05:13 2024 - [info] Starting master failover..
Fri May 31 20:05:13 2024 - [info]
From:
192.168.20.132(192.168.20.132:3306) (current master)
 +--192.168.20.120(192.168.20.120:3306)
 +--192.168.20.231(192.168.20.231:3306)
To:
192.168.20.120(192.168.20.120:3306) (new master)
 +--192.168.20.231(192.168.20.231:3306)
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] * Phase 3.3: New Master Recovery Phase..
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] Waiting all logs to be applied..
Fri May 31 20:05:13 2024 - [info] done.
Fri May 31 20:05:13 2024 - [info] Getting new master's binlog name and position..
Fri May 31 20:05:13 2024 - [info] binlog.000003:196
Fri May 31 20:05:13 2024 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.20.120', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='replmha', MASTER_PASSWORD='xxx';
Fri May 31 20:05:13 2024 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: binlog.000003, 196, 93909ace-1b58-11ef-81d8-000c2912a662:1-7
Fri May 31 20:05:13 2024 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Fri May 31 20:05:13 2024 - [info] ** Finished master recovery successfully.
Fri May 31 20:05:13 2024 - [info] * Phase 3: Master Recovery Phase completed.
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] * Phase 4: Slaves Recovery Phase..
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] * Phase 4.1: Starting Slaves in parallel..
Fri May 31 20:05:13 2024 - [info]
Fri May 31 20:05:13 2024 - [info] -- Slave recovery on host 192.168.20.231(192.168.20.231:3306) started, pid: 38527. Check tmp log /var/log/mha/app1/192.168.20.231_3306_20240531200511.log if it takes time..
Fri May 31 20:05:15 2024 - [info]
Fri May 31 20:05:15 2024 - [info] Log messages from 192.168.20.231 ...
Fri May 31 20:05:15 2024 - [info]
Fri May 31 20:05:13 2024 - [info] Resetting slave 192.168.20.231(192.168.20.231:3306) and starting replication from the new master 192.168.20.120(192.168.20.120:3306)..
Fri May 31 20:05:13 2024 - [info] Executed CHANGE MASTER.
Fri May 31 20:05:14 2024 - [info] Slave started.
Fri May 31 20:05:14 2024 - [info] gtid_wait(93909ace-1b58-11ef-81d8-000c2912a662:1-7) completed on 192.168.20.231(192.168.20.231:3306). Executed 0 events.
Fri May 31 20:05:15 2024 - [info] End of log messages from 192.168.20.231.
Fri May 31 20:05:15 2024 - [info] -- Slave on host 192.168.20.231(192.168.20.231:3306) started.
Fri May 31 20:05:15 2024 - [info] All new slave servers recovered successfully.
Fri May 31 20:05:15 2024 - [info]
Fri May 31 20:05:15 2024 - [info] * Phase 5: New master cleanup phase..
Fri May 31 20:05:15 2024 - [info]
Fri May 31 20:05:15 2024 - [info] Resetting slave info on the new master..
Fri May 31 20:05:15 2024 - [info] 192.168.20.120: Resetting slave info succeeded.
Fri May 31 20:05:15 2024 - [info] Master failover to 192.168.20.120(192.168.20.120:3306) completed successfully.
Fri May 31 20:05:15 2024 - [info] Deleted server1 entry from /etc/mha/app1.cnf .
Fri May 31 20:05:15 2024 - [info]
----- Failover Report -----
app1: MySQL Master failover 192.168.20.132(192.168.20.132:3306) to 192.168.20.120(192.168.20.120:3306) succeeded
Master 192.168.20.132(192.168.20.132:3306) is down!
Check MHA Manager logs at localhost.localdomain:/var/log/mha/app1/manager for details.
Started automated(non-interactive) failover.
Selected 192.168.20.120(192.168.20.120:3306) as a new master.
192.168.20.120(192.168.20.120:3306): OK: Applying all logs succeeded.
192.168.20.231(192.168.20.231:3306): OK: Slave started, replicating from 192.168.20.120(192.168.20.120:3306)
192.168.20.120(192.168.20.120:3306): Resetting slave info succeeded.
Master failover to 192.168.20.120(192.168.20.120:3306) completed successfully.
4. Repair the old master
Simply restart the MySQL service on the old master.
What about real production? Judge whether the instance is recoverable; if it is not, reinitialize and rebuild it.
5. Restore replication
At this point db02 is the master, db03 is its slave (and also hosts the MHA Manager), and db01 has dropped out of the replication topology.
db01 needs to be re-configured as a slave of db02.
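A sketch of re-attaching db01 once its mysqld is back up, based on the CHANGE MASTER statement MHA printed in the failover log above (the password is masked as xxx there, so substitute the real replication password; run this on db01):
mysql -e "CHANGE MASTER TO MASTER_HOST='192.168.20.120', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='replmha', MASTER_PASSWORD='xxx'; START SLAVE;"
mysql -e "show slave status\G" | grep -E 'Slave_IO_Running|Slave_SQL_Running'   # both should say Yes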
6. Edit the configuration file
vim /etc/mha/app1.cnf
Opening the file shows that MHA already removed db01's entry automatically; just add it back.
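For reference, the entry to put back is the original [server1] block; a sketch that appends it (make sure it does not end up duplicated):
cat >> /etc/mha/app1.cnf <<'EOF'
[server1]
hostname=192.168.20.132
port=3306
EOF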
7. Restart MHA
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
8. Check MHA status
[root@localhost /]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:100728) is running(0:PING_OK), master:192.168.20.120