[MySQL Database] | High Availability: Deploying an MHA Cluster

I. Preparation

1.1 Configure hostnames

vim /etc/hosts
# Add an entry for each host
192.168.28.128 mha1
192.168.28.131 mha2
192.168.28.132 mha3
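
The same entries can be added by script, which makes the step repeatable on all three hosts. The following is a minimal sketch only: `add_host_entry` is a hypothetical helper (not part of any tool), and it takes the target file as an argument so it can be tried on a scratch file before touching /etc/hosts.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is already mapped.
add_host_entry() {
    hosts_file=$1 ip=$2 name=$3
    # Look for NAME as a whole field (column 2 or later) on a non-comment line.
    if awk -v n="$name" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage as root on every node:
#   add_host_entry /etc/hosts 192.168.28.128 mha1
#   add_host_entry /etc/hosts 192.168.28.131 mha2
#   add_host_entry /etc/hosts 192.168.28.132 mha3
```

Running it twice leaves a single entry per name, so it is safe to re-run.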

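1.2 Disable the firewall and SELinux

# Stop the firewall
systemctl stop firewalld
systemctl disable firewalld   # do not start it at boot
# Edit the SELinux configuration
vim /etc/sysconfig/selinux
SELINUX=disabled  # set to disabled; takes effect after a reboot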
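
The SELinux edit can also be done without an editor. This is a sketch under assumptions: `set_selinux_disabled` is a hypothetical helper, and it relies on GNU sed's `-i` option as shipped with CentOS. It is pointed at /etc/selinux/config rather than the /etc/sysconfig/selinux symlink, because `sed -i` replaces a symlink with a regular file.

```shell
# Hypothetical helper: set SELINUX=disabled in the given SELinux config file.
set_selinux_disabled() {
    conf=$1
    # Only the active SELINUX= line is rewritten; comments and SELINUXTYPE= stay untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$conf"
}

# Usage as root:
#   set_selinux_disabled /etc/selinux/config
#   setenforce 0    # switch to permissive mode until the next reboot
```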

1.3 Deploy a MySQL cluster with one master and two slaves

Set up master-slave replication





VIP             IP              port  role
192.168.28.199  192.168.28.128  3306  master
                192.168.28.131  3306  candidate master
                192.168.28.132  3306  slave (MHA manager node)

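Note: every database instance must have the following parameters set

server-id=1                    # must be different on every node
log-bin=/data/mysql3306/logs/mysql-bin  # without a path, the files go to the data directory
relay-log=/data/mysql3306/logs/relay-log  # without a path, the files go to the data directory
skip-name-resolve              # recommended, not required
#read_only = ON                # enable on the slaves, leave disabled on the master
relay_log_purge = 0            # disable automatic purging of relay logs
log_slave_updates = 1          # write changes replicated from the master into the slave's own binary log; required, otherwise data may be lost after a switchover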
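
A quick way to confirm that a node's option file carries these settings is sketched below. `check_mha_params` is a hypothetical helper, and the /etc/my.cnf path in the usage line is an assumption, since the location of the option file is not stated here.

```shell
# Hypothetical helper: report which of the parameters listed above are missing from a my.cnf.
check_mha_params() {
    cnf=$1
    missing=0
    for key in server-id log-bin relay-log relay_log_purge log_slave_updates; do
        # MySQL accepts '-' and '_' interchangeably in option names, so match both spellings.
        pattern=$(printf '%s' "$key" | sed 's/[-_]/[-_]/g')
        if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*(=|\$)" "$cnf"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on each MySQL node (path is an assumption):
#   check_mha_params /etc/my.cnf
```

The check only looks for the option names. It does not verify that server-id differs between nodes, which still has to be compared by hand.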
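Create the MHA admin account

Important: do not use special characters in the MHA password, or the master switchover will fail later. Many people have been caught by this.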

create user mha@'192.168.28.%' identified by 'MHAadmin123';
create user mha@'localhost' identified by 'MHAadmin123';
grant all on *.* to mha@'192.168.28.%';
grant all on *.* to mha@'localhost';
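
Since a password with special characters breaks the switchover, it can help to validate the password before creating the account. The sketch below is deliberately conservative: `is_plain_password` is a hypothetical helper, and treating everything other than ASCII letters and digits as "special" is an assumption, not a documented MHA rule.

```shell
# Hypothetical helper: succeed only if the password consists solely of ASCII letters and digits.
is_plain_password() {
    case $1 in
        '' | *[!A-Za-z0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage:
#   is_plain_password 'MHAadmin123' && echo "password is safe to use with MHA"
```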

II. Add the VIP on the Master

ip addr add 192.168.28.199/24 dev ens33  # 192.168.28.199 is the VIP, ens33 is the network interface name
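
To confirm that the address is actually bound, the one-line output of `ip -o -4 addr show` can be scanned for it. This is a sketch: `has_vip` is a hypothetical helper that reads that output on standard input.

```shell
# Hypothetical helper: read `ip -o -4 addr show dev IFACE` output on stdin and
# succeed if the given address/prefix follows an "inet" keyword.
has_vip() {
    vip=$1
    awk -v vip="$vip" '{ for (i = 1; i < NF; i++) if ($i == "inet" && $(i + 1) == vip) found = 1 } END { exit !found }'
}

# Usage on the master:
#   ip -o -4 addr show dev ens33 | has_vip 192.168.28.199/24 && echo "VIP present"
```

The same check is useful after a failover test, run on the old and the new master, to see where the VIP ended up.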

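1.4 Configure SSH trust

Run the following on the MHA manager node. It is better to run it on every host, which makes it easier to move the manager node and to maintain the cluster, but keep host security in mind. The trust must also cover each host connecting to itself.

ssh-keygen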
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.28.128
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.28.131
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.28.132

After configuring, test that the trust actually works (this test is mandatory)

ssh root@192.168.28.128
ssh root@192.168.28.131
ssh root@192.168.28.132
ssh root@mha1
ssh root@mha2
ssh root@mha3
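
Once each host has been logged into interactively at least once, the test can be repeated in a loop. The sketch below assumes a hypothetical helper `check_ssh_trust`; the ssh command is passed as its first argument so a stub can be substituted for a dry run. With `BatchMode=yes`, ssh fails instead of prompting, so a host that still asks for a password (or whose host key has not been accepted yet) is reported as failed.

```shell
# Hypothetical helper: try a non-interactive root login to each host and report the result.
check_ssh_trust() {
    runner=$1
    shift
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables all prompts; ConnectTimeout bounds the wait per host.
        if "$runner" -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true; then
            echo "ok: ${host}"
        else
            echo "FAILED: ${host}"
            failed=1
        fi
    done
    return "$failed"
}

# Usage on each host:
#   check_ssh_trust ssh 192.168.28.128 192.168.28.131 192.168.28.132 mha1 mha2 mha3
```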

III. MHA Deployment

2.1 Install the MHA dependency packages

yum install perl-DBI perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-Params-Validate perl-DateTime -y
yum install perl-ExtUtils-Embed -y
yum install cpan -y
yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker -y

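Note: installing MySQL itself from rpm packages is not recommended, because some of the packages here may then conflict

2.2 Install the MHA manager and node packages

# Install on all nodes
rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
# Install on the manager node (the other nodes may install it too)
rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm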

If the dependency packages above were not all installed, an error like the one below appears. In that case, adjust your yum repositories or get the packages from someone who has already downloaded them

[root@mha3 local]# rpm -ivh  mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
error: Failed dependencies:
    perl(Log::Dispatch) is needed by mha4mysql-manager-0.58-0.el7.centos.noarch
    perl(Log::Dispatch::File) is needed by mha4mysql-manager-0.58-0.el7.centos.noarch
    perl(Log::Dispatch::Screen) is needed by mha4mysql-manager-0.58-0.el7.centos.noarch
    perl(Parallel::ForkManager) is needed by mha4mysql-manager-0.58-0.el7.centos.noarch
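
When the list of failed dependencies is long, it is easier to reduce it to the distinct Perl modules that are missing. A minimal sketch, assuming a hypothetical helper `missing_perl_modules` that reads rpm's error output on standard input:

```shell
# Hypothetical helper: print each missing Perl module named in rpm's
# "Failed dependencies" output once, e.g. "Log::Dispatch".
missing_perl_modules() {
    grep -o 'perl([^)]*) is needed' | sed 's/^perl(\(.*\)) is needed$/\1/' | sort -u
}

# Usage:
#   rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm 2>&1 | missing_perl_modules
```

Each module name maps to a package from section 2.1, for example Log::Dispatch to perl-Log-Dispatch and Parallel::ForkManager to perl-Parallel-ForkManager.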

2.3 Configure MHA

Create the directories for the configuration file and the logs

mkdir -p /etc/masterha
mkdir -p /var/log/masterha/app1

Create the MHA configuration file

vim /etc/masterha/app1.conf

[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/app1.log
master_ip_failover_script=/usr/bin/master_ip_failover
master_ip_online_change_script=/usr/bin/master_ip_online_change
## MySQL user name and password
user=mha
password=MHAadmin123
ssh_user=root
repl_user=repl
repl_password=repl
ping_interval=3
remote_workdir=/tmp
report_script=/usr/bin/send_report
# secondary_check_script is optional
# secondary_check_script=/usr/bin/masterha_secondary_check -s mha2 -s mha3 --user=mha --master_host=mha1 --master_ip=192.168.28.128 --master_port=3306 --password=MHAadmin123
shutdown_script=""
report_script=""

[server1]
hostname=192.168.28.128
master_binlog_dir=/data/mysql3306/logs
candidate_master=1

[server2]
hostname=192.168.28.131
master_binlog_dir=/data/mysql3306/logs
candidate_master=1
check_repl_delay=0

[server3]
hostname=192.168.28.132
master_binlog_dir=/data/mysql3306/logs
no_master=1
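
Before running the MHA checks, it can be worth listing which hosts the file actually defines, since a missing line break between a value and the next section header silently drops a server. A sketch, with `list_mha_servers` as a hypothetical helper:

```shell
# Hypothetical helper: print "section hostname" for every [serverN] block in an MHA config file.
list_mha_servers() {
    awk -F= '
        /^\[.*\]$/ { section = substr($0, 2, length($0) - 2) }
        $1 == "hostname" && section ~ /^server[0-9]+$/ { print section, $2 }
    ' "$1"
}

# Usage on the manager node:
#   list_mha_servers /etc/masterha/app1.conf
```

For the file above, the expected result is three lines, one per server section.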

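Configure the switchover scripts

Set up two important scripts, master_ip_failover and master_ip_online_change. Remember to change the VIP address and the network interface name to match your environment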

/usr/bin/master_ip_failover

vim /usr/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);
my $vip = '192.168.28.199/24';
my $if = 'ens33';
my $ssh_start_vip = "/sbin/ip addr add  $vip dev  $if";
my $ssh_stop_vip = "/sbin/ip addr del $vip dev $if";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
    return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

/usr/bin/master_ip_online_change

vim /usr/bin/master_ip_online_change
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

#my (
#    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
#    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
#);
my (
    $command,              $orig_master_is_new_slave, $orig_master_host,
    $orig_master_ip,       $orig_master_port,         $orig_master_user,
    $orig_master_password, $orig_master_ssh_user,     $new_master_host,
    $new_master_ip,        $new_master_port,          $new_master_user,
    $new_master_password,  $new_master_ssh_user,
);
my $vip = '192.168.28.199/24';
my $if = 'ens33';
my $ssh_start_vip = "/sbin/ip addr add $vip dev $if";
my $ssh_stop_vip = "/sbin/ip addr del $vip dev $if";
my $ssh_user = "root";

GetOptions(
    'command=s'          => \$command,
    #'ssh_user=s'         => \$ssh_user,
    #'orig_master_host=s' => \$orig_master_host,
    #'orig_master_ip=s'   => \$orig_master_ip,
    #'orig_master_port=i' => \$orig_master_port,
    #'new_master_host=s'  => \$new_master_host,
    #'new_master_ip=s'    => \$new_master_ip,
    #'new_master_port=i'  => \$new_master_port,
    'orig_master_is_new_slave' => \$orig_master_is_new_slave,
    'orig_master_host=s'       => \$orig_master_host,
    'orig_master_ip=s'         => \$orig_master_ip,
    'orig_master_port=i'       => \$orig_master_port,
    'orig_master_user=s'       => \$orig_master_user,
    'orig_master_password=s'   => \$orig_master_password,
    'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
    'new_master_host=s'        => \$new_master_host,
    'new_master_ip=s'          => \$new_master_ip,
    'new_master_port=i'        => \$new_master_port,
    'new_master_user=s'        => \$new_master_user,
    'new_master_password=s'    => \$new_master_password,
    'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
    return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --ssh-user=user --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

Add execute permission

chmod +x /usr/bin/master_ip_failover 
chmod +x /usr/bin/master_ip_online_change

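2.4 Checks

Check SSH trust

Check that SSH trust between all nodes works. This is similar to the earlier manual test, but here an MHA script performs the check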

[root@mha3 app1]# masterha_check_ssh --conf=/etc/masterha/app1.conf
Sun May 24 17:33:08 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun May 24 17:33:08 2020 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun May 24 17:33:08 2020 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun May 24 17:33:08 2020 - [info] Starting SSH connection tests..
Sun May 24 17:33:12 2020 - [debug] 
Sun May 24 17:33:08 2020 - [debug]  Connecting via SSH from root@192.168.28.131(192.168.28.131:22) to root@192.168.28.128(192.168.28.128:22)..
Sun May 24 17:33:10 2020 - [debug]   ok.
Sun May 24 17:33:10 2020 - [debug]  Connecting via SSH from root@192.168.28.131(192.168.28.131:22) to root@192.168.28.132(192.168.28.132:22)..
Sun May 24 17:33:12 2020 - [debug]   ok.
Sun May 24 17:33:12 2020 - [debug] 
Sun May 24 17:33:08 2020 - [debug]  Connecting via SSH from root@192.168.28.128(192.168.28.128:22) to root@192.168.28.131(192.168.28.131:22)..
Sun May 24 17:33:09 2020 - [debug]   ok.
Sun May 24 17:33:09 2020 - [debug]  Connecting via SSH from root@192.168.28.128(192.168.28.128:22) to root@192.168.28.132(192.168.28.132:22)..
Sun May 24 17:33:12 2020 - [debug]   ok.
Sun May 24 17:33:13 2020 - [debug] 
Sun May 24 17:33:09 2020 - [debug]  Connecting via SSH from root@192.168.28.132(192.168.28.132:22) to root@192.168.28.128(192.168.28.128:22)..
Sun May 24 17:33:11 2020 - [debug]   ok.
Sun May 24 17:33:11 2020 - [debug]  Connecting via SSH from root@192.168.28.132(192.168.28.132:22) to root@192.168.28.131(192.168.28.131:22)..
Sun May 24 17:33:13 2020 - [debug]   ok.
Sun May 24 17:33:13 2020 - [info] All SSH connection tests passed successfully

Check that the replication cluster is healthy

masterha_check_repl --conf=/etc/masterha/app1.conf

If you followed the steps above, the following error appears here

Sun May 24 17:34:02 2020 - [info] Connecting to root@192.168.28.131(192.168.28.131:22)..
Can't exec "mysqlbinlog": No such file or directory at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 106.
mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options
at /usr/bin/apply_diff_relay_logs line 532.

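The error message is clear: the mysqlbinlog command cannot be found. The fix is simple: create symbolic links on the node that reports the error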

ln -s /usr/local/mysql5.7/bin/mysql /usr/bin/
ln -s /usr/local/mysql5.7/bin/mysqlbinlog /usr/bin/
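
Whether the two client programs are now reachable can be verified with `command -v`. The following is a sketch with a hypothetical helper `require_cmds`. Note that MHA runs these programs through a non-interactive ssh session, whose PATH is usually shorter than that of a login shell, so the second usage line is the closer approximation of what MHA sees.

```shell
# Hypothetical helper: report every named command that cannot be found on PATH.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not found in PATH: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, locally on a MySQL node:
#   require_cmds mysql mysqlbinlog
# Usage, from the manager node through a non-interactive session:
#   ssh root@192.168.28.131 'command -v mysql mysqlbinlog'
```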

Run the check again

[root@mha3 app1]# masterha_check_repl --conf=/etc/masterha/app1.conf
Sun May 24 17:34:41 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun May 24 17:34:41 2020 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun May 24 17:34:41 2020 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun May 24 17:34:41 2020 - [info] MHA::MasterMonitor version 0.58.
Sun May 24 17:34:42 2020 - [info] GTID failover mode = 0
Sun May 24 17:34:42 2020 - [info] Dead Servers:
Sun May 24 17:34:42 2020 - [info] Alive Servers:
Sun May 24 17:34:42 2020 - [info]   192.168.28.128(192.168.28.128:3306)
Sun May 24 17:34:42 2020 - [info]   192.168.28.131(192.168.28.131:3306)
Sun May 24 17:34:42 2020 - [info]   192.168.28.132(192.168.28.132:3306)
Sun May 24 17:34:42 2020 - [info] Alive Slaves:
Sun May 24 17:34:42 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 17:34:42 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 17:34:42 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 17:34:42 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 17:34:42 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 17:34:42 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 17:34:42 2020 - [info] Current Alive Master: 192.168.28.128(192.168.28.128:3306)
Sun May 24 17:34:42 2020 - [info] Checking slave configurations..
Sun May 24 17:34:42 2020 - [info] Checking replication filtering settings..
Sun May 24 17:34:42 2020 - [info]  binlog_do_db= , binlog_ignore_db= 
Sun May 24 17:34:42 2020 - [info]  Replication filtering check ok.
Sun May 24 17:34:42 2020 - [info] GTID (with auto-pos) is not supported
Sun May 24 17:34:42 2020 - [info] Starting SSH connection tests..
Sun May 24 17:34:48 2020 - [info] All SSH connection tests passed successfully.
Sun May 24 17:34:48 2020 - [info] Checking MHA Node version..
Sun May 24 17:34:49 2020 - [info]  Version check ok.
Sun May 24 17:34:49 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sun May 24 17:34:50 2020 - [info] HealthCheck: SSH to 192.168.28.128 is reachable.
Sun May 24 17:34:51 2020 - [info] Master MHA Node version is 0.58.
Sun May 24 17:34:51 2020 - [info] Checking recovery script configurations on 192.168.28.128(192.168.28.128:3306)..
Sun May 24 17:34:51 2020 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql3306/data --output_file=/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000012
Sun May 24 17:34:51 2020 - [info]   Connecting to root@192.168.28.128(192.168.28.128:22)..
  Creating /tmp if not exists..    ok.
  Checking output directory is accessible or not..ok.
  Binlog found at /data/mysql3306/data, up to mysql-bin.000012
Sun May 24 17:34:52 2020 - [info] Binlog setting check done.
Sun May 24 17:34:52 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun May 24 17:34:52 2020 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.28.131 --slave_ip=192.168.28.131 --slave_port=3306 --workdir=/tmp --target_version=5.7.25-28-log --manager_version=0.58 --relay_log_info=/data/mysql3306/data/relay-log.info  --relay_dir=/data/mysql3306/data/  --slave_pass=xxx
Sun May 24 17:34:52 2020 - [info]   Connecting to root@192.168.28.131(192.168.28.131:22)..
  Checking slave recovery environment settings..
    Opening /data/mysql3306/data/relay-log.info ... ok.
    Relay log found at /data/mysql3306/data, up to relay-log.000003
    Temporary relay log file is /data/mysql3306/data/relay-log.000003
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun May 24 17:34:53 2020 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.28.132 --slave_ip=192.168.28.132 --slave_port=3306 --workdir=/tmp --target_version=5.7.25-28-log --manager_version=0.58 --relay_log_info=/data/mysql3306/data/relay-log.info  --relay_dir=/data/mysql3306/data/  --slave_pass=xxx
Sun May 24 17:34:53 2020 - [info]   Connecting to root@192.168.28.132(192.168.28.132:22)..
  Checking slave recovery environment settings..
    Opening /data/mysql3306/data/relay-log.info ... ok.
    Relay log found at /data/mysql3306/data, up to relay-log.000003
    Temporary relay log file is /data/mysql3306/data/relay-log.000003
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun May 24 17:34:54 2020 - [info] Slaves settings check done.
Sun May 24 17:34:54 2020 - [info] 
192.168.28.128(192.168.28.128:3306) (current master)
 +--192.168.28.131(192.168.28.131:3306)
 +--192.168.28.132(192.168.28.132:3306)

Sun May 24 17:34:54 2020 - [info] Checking replication health on 192.168.28.131..
Sun May 24 17:34:54 2020 - [info]  ok.
Sun May 24 17:34:54 2020 - [info] Checking replication health on 192.168.28.132..
Sun May 24 17:34:54 2020 - [info]  ok.
Sun May 24 17:34:54 2020 - [info] Checking master_ip_failover_script status:
Sun May 24 17:34:54 2020 - [info]   /usr/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.28.128 --orig_master_ip=192.168.28.128 --orig_master_port=3306

IN SCRIPT TEST====/sbin/ip addr del 192.168.28.199/24 dev ens33==/sbin/ip addr add  192.168.28.199/24 dev  ens33===

Checking the Status of the script.. OK
Sun May 24 17:34:54 2020 - [info]  OK.
Sun May 24 17:34:54 2020 - [warning] shutdown_script is not defined.
Sun May 24 17:34:54 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

Seeing "MySQL Replication Health is OK." means the check passed.
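
Because the output is long, the final verdict is easy to miss. A small filter can test for it. This is a sketch: `repl_health_ok` is a hypothetical helper that reads the check output on standard input and looks only for the success line quoted above.

```shell
# Hypothetical helper: succeed only if the success verdict appears in the
# masterha_check_repl output read from stdin.
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK\.'
}

# Usage on the manager node:
#   masterha_check_repl --conf=/etc/masterha/app1.conf 2>&1 | repl_health_ok \
#       && echo "replication check passed" \
#       || echo "replication check did not pass, read the full output"
```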

IV. MHA Testing

3.1 Start the MHA service

The command to start the MHA service is shown below. It can also be wrapped in a script or a service

nohup masterha_manager --conf=/etc/masterha/app1.conf < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

After the service starts, the log looks like this, similar to the output of the cluster check

Sun May 24 18:31:54 2020 - [info] MHA::MasterMonitor version 0.58.
Sun May 24 18:31:55 2020 - [info] GTID failover mode = 0
Sun May 24 18:31:55 2020 - [info] Dead Servers:
Sun May 24 18:31:55 2020 - [info] Alive Servers:
Sun May 24 18:31:55 2020 - [info]   192.168.28.128(192.168.28.128:3306)
Sun May 24 18:31:55 2020 - [info]   192.168.28.131(192.168.28.131:3306)
Sun May 24 18:31:55 2020 - [info]   192.168.28.132(192.168.28.132:3306)
Sun May 24 18:31:55 2020 - [info] Alive Slaves:
Sun May 24 18:31:55 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:31:55 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:31:55 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 18:31:55 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:31:55 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:31:55 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 18:31:55 2020 - [info] Current Alive Master: 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:31:55 2020 - [info] Checking slave configurations..
Sun May 24 18:31:55 2020 - [info] Checking replication filtering settings..
Sun May 24 18:31:55 2020 - [info]  binlog_do_db= , binlog_ignore_db=
Sun May 24 18:31:55 2020 - [info]  Replication filtering check ok.
Sun May 24 18:31:55 2020 - [info] GTID (with auto-pos) is not supported
Sun May 24 18:31:55 2020 - [info] Starting SSH connection tests..
Sun May 24 18:32:01 2020 - [info] All SSH connection tests passed successfully.
Sun May 24 18:32:01 2020 - [info] Checking MHA Node version..
Sun May 24 18:32:03 2020 - [info]  Version check ok.
Sun May 24 18:32:03 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sun May 24 18:32:03 2020 - [info] HealthCheck: SSH to 192.168.28.128 is reachable.
Sun May 24 18:32:04 2020 - [info] Master MHA Node version is 0.58.
Sun May 24 18:32:04 2020 - [info] Checking recovery script configurations on 192.168.28.128(192.168.28.128:3306)..
Sun May 24 18:32:04 2020 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql3306/data --output_file=/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000013
Sun May 24 18:32:04 2020 - [info]   Connecting to root@192.168.28.128(192.168.28.128:22)..
  Creating /tmp if not exists..    ok.
  Checking output directory is accessible or not..ok.
  Binlog found at /data/mysql3306/data, up to mysql-bin.000013
Sun May 24 18:32:05 2020 - [info] Binlog setting check done.
Sun May 24 18:32:05 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sun May 24 18:32:05 2020 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.28.131 --slave_ip=192.168.28.131 --slave_port=3306 --workdir=/tmp --target_version=5.7.25-28-log --manager_version=0.58 --relay_log_info=/data/mysql3306/data/relay-log.info  --relay_dir=/data/mysql3306/data/  --slave_pass=xxx
Sun May 24 18:32:05 2020 - [info]   Connecting to root@192.168.28.131(192.168.28.131:22)..
  Checking slave recovery environment settings..
    Opening /data/mysql3306/data/relay-log.info ... ok.
    Relay log found at /data/mysql3306/data, up to relay-log.000005
    Temporary relay log file is /data/mysql3306/data/relay-log.000005
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun May 24 18:32:06 2020 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.28.132 --slave_ip=192.168.28.132 --slave_port=3306 --workdir=/tmp --target_version=5.7.25-28-log --manager_version=0.58 --relay_log_info=/data/mysql3306/data/relay-log.info  --relay_dir=/data/mysql3306/data/  --slave_pass=xxx
Sun May 24 18:32:06 2020 - [info]   Connecting to root@192.168.28.132(192.168.28.132:22)..
  Checking slave recovery environment settings..
    Opening /data/mysql3306/data/relay-log.info ... ok.
    Relay log found at /data/mysql3306/data, up to relay-log.000005
    Temporary relay log file is /data/mysql3306/data/relay-log.000005
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges.
mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sun May 24 18:32:07 2020 - [info] Slaves settings check done.
Sun May 24 18:32:07 2020 - [info]
192.168.28.128(192.168.28.128:3306) (current master)
 +--192.168.28.131(192.168.28.131:3306)
 +--192.168.28.132(192.168.28.132:3306)

Sun May 24 18:32:07 2020 - [info] Checking master_ip_failover_script status:
Sun May 24 18:32:07 2020 - [info]   /usr/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.28.128 --orig_master_ip=192.168.28.128 --orig_master_port=3306

IN SCRIPT TEST====/sbin/ip addr del 192.168.28.199/24 dev ens33==/sbin/ip addr add  192.168.28.199/24 dev  ens33===

Checking the Status of the script.. OK
Sun May 24 18:32:08 2020 - [info]  OK.
Sun May 24 18:32:08 2020 - [warning] shutdown_script is not defined.
Sun May 24 18:32:08 2020 - [info] Set master ping interval 3 seconds.
Sun May 24 18:32:08 2020 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sun May 24 18:32:08 2020 - [info] Starting ping health check on 192.168.28.128(192.168.28.128:3306)..
Sun May 24 18:32:08 2020 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

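3.2 Test automatic failover

Simulate a crash of the master database

Run shutdown on the master

mysql> shutdown;

Watch the log:

The log shows roughly the following flow: the master (192.168.28.128:3306) is detected as unavailable -> it is probed 3 more times in a row (the count is configurable) -> the remaining live nodes in the cluster are checked -> one of the candidate masters is chosen as the new master -> the VIP is moved to the new master (if the old master's operating system is still healthy, the VIP is removed from it) -> the old master's binlog is copied -> the new master determines whether it needs to apply missing log events -> all other nodes are repointed to replicate from the new master, forming a new cluster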
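
To follow this flow in a long log, it helps to print only the phase markers. The sketch below assumes a hypothetical helper `failover_phases` that reads the manager log on standard input; the log path in the usage line is the manager_log value from the configuration file above.

```shell
# Hypothetical helper: print each "Phase N: ..." marker from a manager log once,
# in the order of first appearance.
failover_phases() {
    grep -o 'Phase [0-9][0-9.]*: .*' | awk '!seen[$0]++'
}

# Usage on the manager node:
#   failover_phases < /var/log/masterha/app1/app1.log
```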

Sun May 24 18:35:56 2020 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Sun May 24 18:35:56 2020 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql3306/data --output_file=/tmp/save_binary_logs_test --manager_version=0.58 --binlog_prefix=mysql-bin
Sun May 24 18:35:56 2020 - [info] HealthCheck: SSH to 192.168.28.128 is reachable.
Sun May 24 18:35:59 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.28.128' (111))
Sun May 24 18:35:59 2020 - [warning] Connection failed 2 time(s)..
Sun May 24 18:36:02 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.28.128' (111))
Sun May 24 18:36:02 2020 - [warning] Connection failed 3 time(s)..
Sun May 24 18:36:05 2020 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.28.128' (111))
Sun May 24 18:36:05 2020 - [warning] Connection failed 4 time(s)..
Sun May 24 18:36:05 2020 - [warning] Master is not reachable from health checker!
Sun May 24 18:36:05 2020 - [warning] Master 192.168.28.128(192.168.28.128:3306) is not reachable!
Sun May 24 18:36:05 2020 - [warning] SSH is reachable.
Sun May 24 18:36:05 2020 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.conf again, and trying to connect to all servers to check server status..
Sun May 24 18:36:05 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun May 24 18:36:05 2020 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun May 24 18:36:05 2020 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun May 24 18:36:06 2020 - [info] GTID failover mode = 0
Sun May 24 18:36:06 2020 - [info] Dead Servers:
Sun May 24 18:36:06 2020 - [info]   192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:06 2020 - [info] Alive Servers:
Sun May 24 18:36:06 2020 - [info]   192.168.28.131(192.168.28.131:3306)
Sun May 24 18:36:06 2020 - [info]   192.168.28.132(192.168.28.132:3306)
Sun May 24 18:36:06 2020 - [info] Alive Slaves:
Sun May 24 18:36:06 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:06 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:06 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 18:36:06 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:06 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:06 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 18:36:06 2020 - [info] Checking slave configurations..
Sun May 24 18:36:06 2020 - [info] Checking replication filtering settings..
Sun May 24 18:36:06 2020 - [info]  Replication filtering check ok.
Sun May 24 18:36:06 2020 - [info] Master is down!
Sun May 24 18:36:06 2020 - [info] Terminating monitoring script.
Sun May 24 18:36:06 2020 - [info] Got exit code 20 (Master dead).
Sun May 24 18:36:06 2020 - [info] MHA::MasterFailover version 0.58.
Sun May 24 18:36:06 2020 - [info] Starting master failover.
Sun May 24 18:36:06 2020 - [info]
Sun May 24 18:36:06 2020 - [info] * Phase 1: Configuration Check Phase..
Sun May 24 18:36:06 2020 - [info]
Sun May 24 18:36:07 2020 - [info] GTID failover mode = 0
Sun May 24 18:36:07 2020 - [info] Dead Servers:
Sun May 24 18:36:07 2020 - [info]   192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:07 2020 - [info] Checking master reachability via MySQL(double check)...
Sun May 24 18:36:07 2020 - [info]  ok.
Sun May 24 18:36:07 2020 - [info] Alive Servers:
Sun May 24 18:36:07 2020 - [info]   192.168.28.131(192.168.28.131:3306)
Sun May 24 18:36:07 2020 - [info]   192.168.28.132(192.168.28.132:3306)
Sun May 24 18:36:07 2020 - [info] Alive Slaves:
Sun May 24 18:36:07 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:07 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:07 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 18:36:07 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:07 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:07 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 18:36:07 2020 - [info] Starting Non-GTID based failover.
Sun May 24 18:36:07 2020 - [info]
Sun May 24 18:36:07 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Sun May 24 18:36:07 2020 - [info]
Sun May 24 18:36:07 2020 - [info] * Phase 2: Dead Master Shutdown Phase..
Sun May 24 18:36:07 2020 - [info]
Sun May 24 18:36:07 2020 - [info] Forcing shutdown so that applications never connect to the current master..
Sun May 24 18:36:07 2020 - [info] Executing master IP deactivation script:
Sun May 24 18:36:07 2020 - [info]   /usr/bin/master_ip_failover --orig_master_host=192.168.28.128 --orig_master_ip=192.168.28.128 --orig_master_port=3306 --command=stopssh --ssh_user=root

IN SCRIPT TEST====/sbin/ip addr del 192.168.28.199/24 dev ens33==/sbin/ip addr add  192.168.28.199/24 dev  ens33===

Disabling the VIP on old master: 192.168.28.128
Sun May 24 18:36:08 2020 - [info]  done.
Sun May 24 18:36:08 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sun May 24 18:36:08 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sun May 24 18:36:08 2020 - [info]
Sun May 24 18:36:08 2020 - [info] * Phase 3: Master Recovery Phase..
Sun May 24 18:36:08 2020 - [info]
Sun May 24 18:36:08 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sun May 24 18:36:08 2020 - [info]
Sun May 24 18:36:08 2020 - [info] The latest binary log file/position on all slaves is mysql-bin.000013:154
Sun May 24 18:36:08 2020 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sun May 24 18:36:08 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:08 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:08 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 18:36:08 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:08 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:08 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 18:36:08 2020 - [info] The oldest binary log file/position on all slaves is mysql-bin.000013:154
Sun May 24 18:36:08 2020 - [info] Oldest slaves:
Sun May 24 18:36:08 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:08 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:08 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 18:36:08 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:08 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:08 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 18:36:08 2020 - [info]
Sun May 24 18:36:08 2020 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Sun May 24 18:36:08 2020 - [info]
Sun May 24 18:36:09 2020 - [info] Fetching dead master's binary logs..
Sun May 24 18:36:09 2020 - [info] Executing command on the dead master 192.168.28.128(192.168.28.128:3306): save_binary_logs --command=save --start_file=mysql-bin.000013  --start_pos=154 --binlog_dir=/data/mysql3306/data --output_file=/tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58
  Creating /tmp if not exists..    ok.
  Concat binary/relay logs from mysql-bin.000013 pos 154 to mysql-bin.000013 EOF into /tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog ..
  Binlog Checksum enabled
  Dumping binlog format description event, from position 0 to 154.. ok.
  Dumping effective binlog data from /data/mysql3306/data/mysql-bin.000013 position 154 to tail(177).. ok.
  Binlog Checksum enabled
  Concat succeeded.
Sun May 24 18:36:11 2020 - [info] scp from root@192.168.28.128:/tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog to local:/var/log/masterha/app1/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog succeeded.
Sun May 24 18:36:12 2020 - [info] HealthCheck: SSH to 192.168.28.131 is reachable.
Sun May 24 18:36:13 2020 - [info] HealthCheck: SSH to 192.168.28.132 is reachable.
Sun May 24 18:36:14 2020 - [info]
Sun May 24 18:36:14 2020 - [info] * Phase 3.3: Determining New Master Phase..
Sun May 24 18:36:14 2020 - [info]
Sun May 24 18:36:14 2020 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Sun May 24 18:36:14 2020 - [info] All slaves received relay logs to the same position. No need to resync each other.
Sun May 24 18:36:14 2020 - [info] Searching new master from slaves..
Sun May 24 18:36:14 2020 - [info]  Candidate masters from the configuration file:
Sun May 24 18:36:14 2020 - [info]   192.168.28.131(192.168.28.131:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:14 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:14 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 18:36:14 2020 - [info]  Non-candidate masters:
Sun May 24 18:36:14 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 18:36:14 2020 - [info]     Replicating from 192.168.28.128(192.168.28.128:3306)
Sun May 24 18:36:14 2020 - [info]     Not candidate for the new Master (no_master is set)
Sun May 24 18:36:14 2020 - [info] New master is 192.168.28.131(192.168.28.131:3306)
Sun May 24 18:36:14 2020 - [info] Starting master failover..
Sun May 24 18:36:14 2020 - [info]
From:
192.168.28.128(192.168.28.128:3306) (current master)
 +--192.168.28.131(192.168.28.131:3306)
 +--192.168.28.132(192.168.28.132:3306)

To:
192.168.28.131(192.168.28.131:3306) (new master)
 +--192.168.28.132(192.168.28.132:3306)
Sun May 24 18:36:14 2020 - [info]
Sun May 24 18:36:14 2020 - [info] * Phase 3.4: New Master Diff Log Generation Phase..
Sun May 24 18:36:14 2020 - [info]
Sun May 24 18:36:14 2020 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun May 24 18:36:14 2020 - [info] Sending binlog..
Sun May 24 18:36:15 2020 - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog to root@192.168.28.131:/tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog succeeded.
Sun May 24 18:36:15 2020 - [info]
Sun May 24 18:36:15 2020 - [info] * Phase 3.5: Master Log Apply Phase..
Sun May 24 18:36:15 2020 - [info]
Sun May 24 18:36:15 2020 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Sun May 24 18:36:15 2020 - [info] Starting recovery on 192.168.28.131(192.168.28.131:3306)..
Sun May 24 18:36:15 2020 - [info]  Generating diffs succeeded.
Sun May 24 18:36:15 2020 - [info] Waiting until all relay logs are applied.
Sun May 24 18:36:15 2020 - [info]  done.
Sun May 24 18:36:15 2020 - [info] Getting slave status..
Sun May 24 18:36:15 2020 - [info] This slave(192.168.28.131)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000013:154). No need to recover from Exec_Master_Log_Pos.
Sun May 24 18:36:15 2020 - [info] Connecting to the target slave host 192.168.28.131, running recover script..
Sun May 24 18:36:15 2020 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='mha' --slave_host=192.168.28.131 --slave_ip=192.168.28.131  --slave_port=3306 --apply_files=/tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog --workdir=/tmp --target_version=5.7.25-28-log --timestamp=20200524183606 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx
Sun May 24 18:36:16 2020 - [info]
MySQL client version is 5.7.25. Using --binary-mode.
Applying differential binary/relay log files /tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog on 192.168.28.131:3306. This may take long time...
Applying log files succeeded.
Sun May 24 18:36:16 2020 - [info]  All relay logs were successfully applied.
Sun May 24 18:36:16 2020 - [info] Getting new master's binlog name and position..
Sun May 24 18:36:16 2020 - [info]  mysql-bin.000008:154
Sun May 24 18:36:16 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.28.131', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000008', MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sun May 24 18:36:16 2020 - [info] Executing master IP activate script:
Sun May 24 18:36:16 2020 - [info]   /usr/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.28.128 --orig_master_ip=192.168.28.128 --orig_master_port=3306 --new_master_host=192.168.28.131 --new_master_ip=192.168.28.131 --new_master_port=3306 --new_master_user='mha'   --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password

IN SCRIPT TEST====/sbin/ip addr del 192.168.28.199/24 dev ens33==/sbin/ip addr add  192.168.28.199/24 dev  ens33===

Enabling the VIP - 192.168.28.199/24 on the new master - 192.168.28.131
Sun May 24 18:36:17 2020 - [info]  OK.
Sun May 24 18:36:17 2020 - [info] Setting read_only=0 on 192.168.28.131(192.168.28.131:3306)..
Sun May 24 18:36:17 2020 - [info]  ok.
Sun May 24 18:36:17 2020 - [info] ** Finished master recovery successfully.
Sun May 24 18:36:17 2020 - [info] * Phase 3: Master Recovery Phase completed.
Sun May 24 18:36:17 2020 - [info]
Sun May 24 18:36:17 2020 - [info] * Phase 4: Slaves Recovery Phase..
Sun May 24 18:36:17 2020 - [info]
Sun May 24 18:36:17 2020 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Sun May 24 18:36:17 2020 - [info]
Sun May 24 18:36:17 2020 - [info] -- Slave diff file generation on host 192.168.28.132(192.168.28.132:3306) started, pid: 48890. Check tmp log /var/log/masterha/app1/192.168.28.132_3306_20200524183606.log if it takes time..
Sun May 24 18:36:18 2020 - [info]
Sun May 24 18:36:18 2020 - [info]
Sun May 24 18:36:18 2020 - [info] Log messages from 192.168.28.132 ...
Sun May 24 18:36:18 2020 - [info]
Sun May 24 18:36:17 2020 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Sun May 24 18:36:18 2020 - [info] End of log messages from 192.168.28.132.
Sun May 24 18:36:18 2020 - [info] -- 192.168.28.132(192.168.28.132:3306) has the latest relay log events.
Sun May 24 18:36:18 2020 - [info] Generating relay diff files from the latest slave succeeded.
Sun May 24 18:36:18 2020 - [info]
Sun May 24 18:36:18 2020 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Sun May 24 18:36:18 2020 - [info]
Sun May 24 18:36:18 2020 - [info] -- Slave recovery on host 192.168.28.132(192.168.28.132:3306) started, pid: 48892. Check tmp log /var/log/masterha/app1/192.168.28.132_3306_20200524183606.log if it takes time..
Sun May 24 18:36:21 2020 - [info]
Sun May 24 18:36:21 2020 - [info] Log messages from 192.168.28.132 ...
Sun May 24 18:36:21 2020 - [info]
Sun May 24 18:36:18 2020 - [info] Sending binlog..
Sun May 24 18:36:19 2020 - [info] scp from local:/var/log/masterha/app1/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog to root@192.168.28.132:/tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog succeeded.
Sun May 24 18:36:19 2020 - [info] Starting recovery on 192.168.28.132(192.168.28.132:3306)..
Sun May 24 18:36:19 2020 - [info]  Generating diffs succeeded.
Sun May 24 18:36:19 2020 - [info] Waiting until all relay logs are applied.
Sun May 24 18:36:19 2020 - [info]  done.
Sun May 24 18:36:19 2020 - [info] Getting slave status..
Sun May 24 18:36:19 2020 - [info] This slave(192.168.28.132)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000013:154). No need to recover from Exec_Master_Log_Pos.
Sun May 24 18:36:19 2020 - [info] Connecting to the target slave host 192.168.28.132, running recover script..
Sun May 24 18:36:19 2020 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='mha' --slave_host=192.168.28.132 --slave_ip=192.168.28.132  --slave_port=3306 --apply_files=/tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog --workdir=/tmp --target_version=5.7.25-28-log --timestamp=20200524183606 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx
Sun May 24 18:36:20 2020 - [info]
MySQL client version is 5.7.25. Using --binary-mode.
Applying differential binary/relay log files /tmp/saved_master_binlog_from_192.168.28.128_3306_20200524183606.binlog on 192.168.28.132:3306. This may take long time...
Applying log files succeeded.
Sun May 24 18:36:20 2020 - [info]  All relay logs were successfully applied.
Sun May 24 18:36:20 2020 - [info]  Resetting slave 192.168.28.132(192.168.28.132:3306) and starting replication from the new master 192.168.28.131(192.168.28.131:3306)..
Sun May 24 18:36:20 2020 - [info]  Executed CHANGE MASTER.
Sun May 24 18:36:20 2020 - [info]  Slave started.
Sun May 24 18:36:21 2020 - [info] End of log messages from 192.168.28.132.
Sun May 24 18:36:21 2020 - [info] -- Slave recovery on host 192.168.28.132(192.168.28.132:3306) succeeded.
Sun May 24 18:36:21 2020 - [info] All new slave servers recovered successfully.
Sun May 24 18:36:21 2020 - [info]
Sun May 24 18:36:21 2020 - [info] * Phase 5: New master cleanup phase..
Sun May 24 18:36:21 2020 - [info]
Sun May 24 18:36:21 2020 - [info] Resetting slave info on the new master..
Sun May 24 18:36:21 2020 - [info]  192.168.28.131: Resetting slave info succeeded.
Sun May 24 18:36:21 2020 - [info] Master failover to 192.168.28.131(192.168.28.131:3306) completed successfully.
Sun May 24 18:36:21 2020 - [info]

----- Failover Report -----

app1: MySQL Master failover 192.168.28.128(192.168.28.128:3306) to 192.168.28.131(192.168.28.131:3306) succeeded

Master 192.168.28.128(192.168.28.128:3306) is down!

Check MHA Manager logs at mha3:/var/log/masterha/app1/app1.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.28.128(192.168.28.128:3306)
The latest slave 192.168.28.131(192.168.28.131:3306) has all relay logs for recovery.
Selected 192.168.28.131(192.168.28.131:3306) as a new master.
192.168.28.131(192.168.28.131:3306): OK: Applying all logs succeeded.
192.168.28.131(192.168.28.131:3306): OK: Activated master IP address.
192.168.28.132(192.168.28.132:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.28.132(192.168.28.132:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.28.131(192.168.28.131:3306)
192.168.28.131(192.168.28.131:3306): Resetting slave info succeeded.
Master failover to 192.168.28.131(192.168.28.131:3306) completed successfully.

The VIP is now on the 192.168.28.131 host.

The original master node no longer holds the VIP.
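One way to confirm where the VIP currently lives is to inspect each host's address list. The following is a minimal sketch, not part of MHA: the helper name `has_vip` is hypothetical, and it assumes the interface name ens33 and the VIP 192.168.28.199 used earlier in this article. It reads `ip -o -4 addr show` output on stdin and succeeds only if the VIP is bound.

```shell
# Hypothetical helper: succeed if the VIP given as $1 appears in
# `ip -o -4 addr show` output read from stdin.
has_vip() {
    local vip="$1"
    # -F treats the pattern as a fixed string, so dots are literal.
    # The leading space and trailing "/" stop 192.168.28.199 from
    # matching a longer address that merely contains it.
    grep -qF -- " ${vip}/"
}
```

Possible usage from the management node, relying on the SSH trust configured earlier:

ssh root@192.168.28.131 "ip -o -4 addr show dev ens33" | has_vip 192.168.28.199 && echo "VIP is on 192.168.28.131"
ssh root@192.168.28.128 "ip -o -4 addr show dev ens33" | has_vip 192.168.28.199 || echo "VIP is gone from 192.168.28.128"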

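3.3 Manual switchover test

Recover the original master and rejoin it to the cluster so that all three nodes are online (the MHA manager process must be stopped while performing a manual switchover).

[root@mha1 masterha]# /usr/local/mysql5.7/bin/mysqld_safe  --defaults-file=/data/mysql3306/etc/my.cnf  &
[root@mha1 masterha]# mysql -uroot -p'123456' --socket=/data/mysql3306/tmp/mysql.sock
SQL>  change master to master_host='192.168.28.131',master_user='repl', master_password='repl',master_log_file='mysql-bin.000008',master_log_pos=154;  /* In production, it is recommended to restore from a backup of the master before configuring replication */
SQL>  start slave;

Now check the cluster status again: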

[root@mha3 app1]# masterha_check_repl --conf=/etc/masterha/app1.conf
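Before relying on the MHA check, it can help to confirm on the rejoined node itself that both replication threads are running. The sketch below is an illustration under assumptions: the function name `repl_ok` is hypothetical, and it only parses the text of `SHOW SLAVE STATUS\G` supplied on stdin, looking at the standard `Slave_IO_Running` and `Slave_SQL_Running` fields.

```shell
# Hypothetical helper: succeed only if both replication threads
# report "Yes" in `SHOW SLAVE STATUS\G` output read from stdin.
repl_ok() {
    local running
    # Count lines such as "Slave_IO_Running: Yes". The pattern requires the
    # colon right after "Running", so Slave_SQL_Running_State is ignored.
    # grep -c exits non-zero on zero matches, hence the "|| true".
    running=$(grep -cE '^[[:space:]]*Slave_(IO|SQL)_Running:[[:space:]]*Yes[[:space:]]*$' || true)
    [ "$running" -eq 2 ]
}
```

Possible usage on mha1, with the socket path used above (the client prompts for the password):

mysql -uroot -p --socket=/data/mysql3306/tmp/mysql.sock -e 'SHOW SLAVE STATUS\G' | repl_ok && echo "replication threads running"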

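Manual master switchover

Planned master switchovers are often necessary, and MHA's manual switch script handles them. For example, to switch the master back to 192.168.28.128:3306 (if the MHA manager is running, it must be stopped first):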

masterha_master_switch  --conf=/etc/masterha/app1.conf  --master_state=alive  --orig_master_is_new_slave  --new_master_host=192.168.28.128 --new_master_port=3306
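Since the manager has to be stopped before an online switch, the two steps can be wrapped together. The following is a sketch under assumptions rather than a recommended procedure: the function names `run` and `online_switch` and the `DRY_RUN` variable are hypothetical, while `masterha_stop` and `masterha_master_switch` are the MHA commands, invoked with the same options as above. With `DRY_RUN=1` the commands are only printed, which is a cheap way to review them first.

```shell
# Configuration file used throughout this article; override via the environment.
CONF=${CONF:-/etc/masterha/app1.conf}

# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: stop the manager, then switch the master online.
online_switch() {
    local new_host="$1" new_port="${2:-3306}"
    # Stop monitoring first; ignore a failure here in case the manager
    # is already stopped (an assumption about the common case).
    run masterha_stop --conf="$CONF" || true
    run masterha_master_switch --conf="$CONF" --master_state=alive \
        --orig_master_is_new_slave \
        --new_master_host="$new_host" --new_master_port="$new_port"
}
```

Possible usage: run `DRY_RUN=1 online_switch 192.168.28.128 3306` to print the two commands, then `online_switch 192.168.28.128 3306` to execute them. The switch itself still asks for interactive confirmation.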

The switchover proceeds as follows:


[root@mha3 app1]# masterha_master_switch  --conf=/etc/masterha/app1.conf --master_state=alive  --orig_master_is_new_slave --new_master_host=192.168.28.128 --new_master_port=3306
Sun May 24 19:10:29 2020 - [info] MHA::MasterRotate version 0.58.
Sun May 24 19:10:29 2020 - [info] Starting online master switch..
Sun May 24 19:10:29 2020 - [info] 
Sun May 24 19:10:29 2020 - [info] * Phase 1: Configuration Check Phase..
Sun May 24 19:10:29 2020 - [info] 
Sun May 24 19:10:29 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun May 24 19:10:29 2020 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sun May 24 19:10:29 2020 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sun May 24 19:10:30 2020 - [info] GTID failover mode = 0
Sun May 24 19:10:30 2020 - [info] Current Alive Master: 192.168.28.131(192.168.28.131:3306)
Sun May 24 19:10:30 2020 - [info] Alive Slaves:
Sun May 24 19:10:30 2020 - [info]   192.168.28.128(192.168.28.128:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 19:10:30 2020 - [info]     Replicating from 192.168.28.131(192.168.28.131:3306)
Sun May 24 19:10:30 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun May 24 19:10:30 2020 - [info]   192.168.28.132(192.168.28.132:3306)  Version=5.7.25-28-log (oldest major version between slaves) log-bin:enabled
Sun May 24 19:10:30 2020 - [info]     Replicating from 192.168.28.131(192.168.28.131:3306)
Sun May 24 19:10:30 2020 - [info]     Not candidate for the new Master (no_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.28.131(192.168.28.131:3306)? (YES/no): yes
Sun May 24 19:10:32 2020 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Sun May 24 19:10:32 2020 - [info]  ok.
Sun May 24 19:10:32 2020 - [info] Checking MHA is not monitoring or doing failover..
Sun May 24 19:10:32 2020 - [info] Checking replication health on 192.168.28.128..
Sun May 24 19:10:32 2020 - [info]  ok.
Sun May 24 19:10:32 2020 - [info] Checking replication health on 192.168.28.132..
Sun May 24 19:10:32 2020 - [info]  ok.
Sun May 24 19:10:32 2020 - [info] 192.168.28.128 can be new master.
Sun May 24 19:10:32 2020 - [info] 
From:
192.168.28.131(192.168.28.131:3306) (current master)
 +--192.168.28.128(192.168.28.128:3306)
 +--192.168.28.132(192.168.28.132:3306)

To:
192.168.28.128(192.168.28.128:3306) (new master)
 +--192.168.28.132(192.168.28.132:3306)
 +--192.168.28.131(192.168.28.131:3306)

Starting master switch from 192.168.28.131(192.168.28.131:3306) to 192.168.28.128(192.168.28.128:3306)? (yes/NO): yes
Sun May 24 19:10:33 2020 - [info] Checking whether 192.168.28.128(192.168.28.128:3306) is ok for the new master..
Sun May 24 19:10:33 2020 - [info]  ok.
Sun May 24 19:10:33 2020 - [info] 192.168.28.131(192.168.28.131:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Sun May 24 19:10:33 2020 - [info] 192.168.28.131(192.168.28.131:3306): Resetting slave pointing to the dummy host.
Sun May 24 19:10:33 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Sun May 24 19:10:33 2020 - [info] 
Sun May 24 19:10:33 2020 - [info] * Phase 2: Rejecting updates Phase..
Sun May 24 19:10:33 2020 - [info] 
Sun May 24 19:10:33 2020 - [info] Executing master ip online change script to disable write on the current master:
Sun May 24 19:10:33 2020 - [info]   /usr/bin/master_ip_online_change --command=stop --orig_master_host=192.168.28.131 --orig_master_ip=192.168.28.131 --orig_master_port=3306 --orig_master_user='mha' --new_master_host=192.168.28.128 --new_master_ip=192.168.28.128 --new_master_port=3306 --new_master_user='mha' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx

IN SCRIPT TEST====/sbin/ip addr del 192.168.28.199/24 dev ens33==/sbin/ip addr add 192.168.28.199/24 dev ens33===

Disabling the VIP on old master: 192.168.28.131
Sun May 24 19:10:33 2020 - [info]  ok.
Sun May 24 19:10:33 2020 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Sun May 24 19:10:33 2020 - [info] Executing FLUSH TABLES WITH READ LOCK..
Sun May 24 19:10:33 2020 - [info]  ok.
Sun May 24 19:10:33 2020 - [info] Orig master binlog:pos is mysql-bin.000008:154.
Sun May 24 19:10:33 2020 - [info]  Waiting to execute all relay logs on 192.168.28.128(192.168.28.128:3306)..
Sun May 24 19:10:33 2020 - [info]  master_pos_wait(mysql-bin.000008:154) completed on 192.168.28.128(192.168.28.128:3306). Executed 0 events.
Sun May 24 19:10:33 2020 - [info]   done.
Sun May 24 19:10:33 2020 - [info] Getting new master's binlog name and position..
Sun May 24 19:10:33 2020 - [info]  mysql-bin.000014:154
Sun May 24 19:10:33 2020 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.28.128', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000014', MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sun May 24 19:10:33 2020 - [info] Executing master ip online change script to allow write on the new master:
Sun May 24 19:10:33 2020 - [info]   /usr/bin/master_ip_online_change --command=start --orig_master_host=192.168.28.131 --orig_master_ip=192.168.28.131 --orig_master_port=3306 --orig_master_user='mha' --new_master_host=192.168.28.128 --new_master_ip=192.168.28.128 --new_master_port=3306 --new_master_user='mha' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx

IN SCRIPT TEST====/sbin/ip addr del 192.168.28.199/24 dev ens33==/sbin/ip addr add 192.168.28.199/24 dev ens33===

Enabling the VIP - 192.168.28.199/24 on the new master - 192.168.28.128
Sun May 24 19:10:34 2020 - [info]  ok.
Sun May 24 19:10:34 2020 - [info] Setting read_only=0 on 192.168.28.128(192.168.28.128:3306)..
Sun May 24 19:10:34 2020 - [info]  ok.
Sun May 24 19:10:34 2020 - [info]
Sun May 24 19:10:34 2020 - [info] * Switching slaves in parallel..
Sun May 24 19:10:34 2020 - [info]
Sun May 24 19:10:34 2020 - [info] -- Slave switch on host 192.168.28.132(192.168.28.132:3306) started, pid: 49178
Sun May 24 19:10:34 2020 - [info]
Sun May 24 19:10:35 2020 - [info] Log messages from 192.168.28.132 ...
Sun May 24 19:10:35 2020 - [info]
Sun May 24 19:10:34 2020 - [info]  Waiting to execute all relay logs on 192.168.28.132(192.168.28.132:3306)..
Sun May 24 19:10:34 2020 - [info]  master_pos_wait(mysql-bin.000008:154) completed on 192.168.28.132(192.168.28.132:3306). Executed 0 events.
Sun May 24 19:10:34 2020 - [info]   done.
Sun May 24 19:10:34 2020 - [info]  Resetting slave 192.168.28.132(192.168.28.132:3306) and starting replication from the new master 192.168.28.128(192.168.28.128:3306)..
Sun May 24 19:10:34 2020 - [info]  Executed CHANGE MASTER.
Sun May 24 19:10:34 2020 - [info]  Slave started.
Sun May 24 19:10:35 2020 - [info] End of log messages from 192.168.28.132 ...
Sun May 24 19:10:35 2020 - [info]
Sun May 24 19:10:35 2020 - [info] -- Slave switch on host 192.168.28.132(192.168.28.132:3306) succeeded.
Sun May 24 19:10:35 2020 - [info] Unlocking all tables on the orig master:
Sun May 24 19:10:35 2020 - [info] Executing UNLOCK TABLES..
Sun May 24 19:10:35 2020 - [info]  ok.
Sun May 24 19:10:35 2020 - [info] Starting orig master as a new slave..
Sun May 24 19:10:35 2020 - [info]  Resetting slave 192.168.28.131(192.168.28.131:3306) and starting replication from the new master 192.168.28.128(192.168.28.128:3306)..
Sun May 24 19:10:35 2020 - [info]  Executed CHANGE MASTER.
Sun May 24 19:10:35 2020 - [info]  Slave started.
Sun May 24 19:10:35 2020 - [info] All new slave servers switched successfully.
Sun May 24 19:10:35 2020 - [info]
Sun May 24 19:10:35 2020 - [info] * Phase 5: New master cleanup phase..
Sun May 24 19:10:35 2020 - [info]
Sun May 24 19:10:35 2020 - [info]  192.168.28.128: Resetting slave info succeeded.
Sun May 24 19:10:35 2020 - [info] Switching master to 192.168.28.128(192.168.28.128:3306) completed successfully.

Checking again at this point shows that the master has been switched back to the 192.168.28.128:3306 node.

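5. Additional Notes

Configure two cron jobs on every host: one to purge relay logs and one to keep the server clock in sync.

Purging relay logs

MHA recommends disabling automatic relay log purging (relay_log_purge = 0), so relay logs have to be cleaned up manually. A cron job can take care of this:

00 01 * * 0 /usr/bin/purge_relay_logs --user=mha --password='MHAadmin123' --host=192.168.28.131 --disable_relay_log_purge >> /var/log/masterha/app1/purge_relay_logs.log 2>&1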
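Because this job is meant to run on every host, the `--host` value needs to differ per machine. The sketch below prints the crontab line for a given host; the function name `purge_cron_line` is hypothetical, and the credentials and log path are simply the ones used in this article. The optional minute argument is an assumption added here so the jobs can be staggered, with the aim of keeping relay logs available on at least one slave at any moment.

```shell
# Hypothetical helper: print the purge_relay_logs crontab line for one host.
# $1 = host address, $2 = minute of the hour (defaults to 00).
purge_cron_line() {
    local host="$1" minute="${2:-00}"
    # Only the two %s placeholders are substituted; the rest is literal.
    printf "%s 01 * * 0 /usr/bin/purge_relay_logs --user=mha --password='MHAadmin123' --host=%s --disable_relay_log_purge >> /var/log/masterha/app1/purge_relay_logs.log 2>&1\n" "$minute" "$host"
}
```

For example, `purge_cron_line 192.168.28.131 00` and `purge_cron_line 192.168.28.132 20` produce lines that can be pasted into `crontab -e` on the respective hosts.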

Configuring clock synchronization

You can use a public NTP server or run your own (a production environment should have its own time server).

*/15  *  * * *   /usr/sbin/ntpdate  ntp1.aliyun.com; /sbin/hwclock -w

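6. Conclusion

The hardest part of setting up MHA is that dependency packages are often incompletely installed, or the helper scripts do not match the MHA version, so the deployment keeps failing. Another common problem is that the replication check and the manual master switchover both work, yet failover does not happen when the master actually crashes (a failover script problem). If you run into problems in practice, feel free to get in touch so we can learn from each other and get better at avoiding these pitfalls.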


