IT博客汇 | MHA，增加管理群集（MySQL Replication[主从]）+ 新的增强脚本

MHA，增加管理群集（MySQL Replication[主从]）+ 新的增强脚本

Adamhuan发表于 2017-02-13 06:34:50

在前面的文章里面，我在自己的环境中，搭建了一个MHA的架构：

URL：http://d-prototype.com/archives/6859

当时，这个MHA中只包含一个MySQL的主从架构。
而MHA的架构本身不止于只管理一个MySQL主从架构，所以，本文将演示，如何让MHA接管新的MySQL主从架构。

一、MHA Node：新的MySQL主从架构
总体的配置这里不重复说明，详情可以参阅下面的文章：

URL：http://d-prototype.com/archives/5353

需要注意的是，新的MySQL主从架构中，最初的主库的server_id需要为：10

如下：
主库：

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 10    |
+---------------+-------+
1 row in set (0.00 sec)

mysql>

从库：

[root@mha4 ~]# mysql -u root -p'Abcd1@34';
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.15-9-log Percona Server (GPL), Release 9, Revision 9f0fd0a

Copyright (c) 2009-2016 Percona LLC and/or its affiliates
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.158.1.149
                  Master_User: replme
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 154
               Relay_Log_File: adamhuan_relay-bin.000005
                Relay_Log_Pos: 367
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 154
              Relay_Log_Space: 1506
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 10
                  Master_UUID: c8db6e7a-b147-11e6-8700-005056847ecb
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

mysql>

否则可能会报错：

[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf 
Mon Feb 13 10:29:32 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 10:29:32 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 10:29:32 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 10:29:32 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 10:29:41 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln188] There is no alive server. We can't do failover
Mon Feb 13 10:29:41 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 326
Mon Feb 13 10:29:41 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Mon Feb 13 10:29:41 2017 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!
[root@mha1 ~]#

MySQL的用户权限的配置：
用户：root（所有节点）。
grant all privileges on *.* to root@‘10.158.1.94’ identified by ‘********’; #10.158.1.94，是我的MHA Manager

# MHA Node
grant all privileges on *.* to root@‘10.158.1.149’ identified by ‘********’;
grant all privileges on *.* to root@‘10.158.1.174’ identified by ‘********’;

用户：replme（MHA Node节点）
grant replication slave on *.* to ‘replme’@’10.158.1.173’ identified by ‘************’;
grant replication slave on *.* to ‘replme’@’10.158.1.174’ identified by ‘************’;

如果这里的权限没有设置好，那么会出现这样的问题：
1. repleme，设置不正确会导致MySQL Replication（主从）架构的同步不正常
2. root，配置不正确，会导致：

[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf 
Mon Feb 13 10:42:16 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 10:42:16 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 10:42:16 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 10:42:16 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] Got MySQL error when connecting 10.158.1.149(10.158.1.149:3306) :1045:Access denied for user 'root'@'10.158.1.94' (using password: YES), but this is not a MySQL crash. Check MySQL server settings.
 at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 297
Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln301] Got MySQL error when connecting 10.158.1.174(10.158.1.174:3306) :1045:Access denied for user 'root'@'10.158.1.94' (using password: YES), but this is not a MySQL crash. Check MySQL server settings.
 at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 297
Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations
Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 326
Mon Feb 13 10:42:16 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Mon Feb 13 10:42:16 2017 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!
[root@mha1 ~]#

SSH等价关系：
MHA Manager 与 MHA Node 互信。
ssh-copy-id -i .ssh/id_rsa.pub 10.158.1.94
这一部分，在MHA的搭建文档中已经有提到过，不重述。

软件包安装（MHA Node节点）：
YUM安装：mha4mysql-node的软件包。

[root@mha2 ~]# ls -ltr /etc/yum.repos.d/
total 40
-rw-r--r--  1 root root 1056 Nov  5  2012 epel-testing.repo
-rw-r--r--  1 root root  957 Nov  5  2012 epel.repo
-rw-r--r--. 1 root root 6259 May 19  2016 CentOS-Vault.repo
-rw-r--r--. 1 root root  630 May 19  2016 CentOS-Media.repo
-rw-r--r--. 1 root root  289 May 19  2016 CentOS-fasttrack.repo
-rw-r--r--. 1 root root  647 May 19  2016 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 1991 May 19  2016 CentOS-Base.repo
-rw-r--r--  1 root root   70 Nov 23 14:34 percona.repo
-rw-r--r--  1 root root   82 Dec 17 01:34 local.repo
[root@mha2 ~]# 
[root@mha2 ~]# yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
Loaded plugins: fastestmirror, refresh-packagekit, security
Setting up Install Process
Loading mirror speeds from cached hostfile
epel/metalink                                                                                                       | 6.2 kB     00:00     
 * base: mirrors.aliyun.com
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
LocalPercona                                                                                                        | 2.9 kB     00:00 ... 
base                                                                                                                | 3.7 kB     00:00     
epel                                                                                                                | 4.3 kB     00:00     
epel/primary_db                                                                                                     | 5.9 MB     00:15     
extras                                                                                                              | 3.4 kB     00:00     
local                                                                                                               | 2.9 kB     00:00 ... 
updates                                                                                                             | 3.4 kB     00:00     
Package perl-DBD-MySQL-4.013-3.el6.x86_64 already installed and latest version
Package perl-Config-Tiny-2.12-7.1.el6.noarch already installed and latest version
Package perl-Log-Dispatch-2.27-1.el6.noarch already installed and latest version
Package perl-Parallel-ForkManager-0.7.9-1.el6.noarch already installed and latest version
Package 4:perl-Time-HiRes-1.9721-141.el6_7.1.x86_64 already installed and latest version
Nothing to do
[root@mha2 ~]# 

[root@mha2 ~]# ls -ltr /software
total 489720
-rw-rw-r-- 1 root root   1281236 Oct 18 16:16 Percona-Server-devel-57-5.7.15-9.1.el6.x86_64.rpm
-rw-rw-r-- 1 root root   8404220 Oct 18 16:16 Percona-Server-client-57-5.7.15-9.1.el6.x86_64.rpm
-rw-rw-r-- 1 root root 145811460 Oct 18 16:16 Percona-Server-57-debuginfo-5.7.15-9.1.el6.x86_64.rpm
-rw-rw-r-- 1 root root  42409500 Oct 18 16:16 Percona-Server-server-57-5.7.15-9.1.el6.x86_64.rpm
-rw-rw-r-- 1 root root  30371004 Oct 18 16:16 Percona-Server-test-57-5.7.15-9.1.el6.x86_64.rpm
-rw-rw-r-- 1 root root    925228 Oct 18 16:16 Percona-Server-shared-57-5.7.15-9.1.el6.x86_64.rpm
-rw-rw-r-- 1 root root   1573044 Oct 18 16:16 Percona-Server-tokudb-57-5.7.15-9.1.el6.x86_64.rpm
-rw-r--r-- 1 root root 230789120 Nov  7 12:58 Percona-Server-5.7.15-9-r9f0fd0a-el6-x86_64-bundle.tar
-rw-r--r-- 1 root root     87119 Nov 21 19:44 mha4mysql-manager-0.56-0.el6.noarch.rpm
-rw-r--r-- 1 root root     36326 Nov 21 19:45 mha4mysql-node-0.56-0.el6.noarch.rpm
-rw-r--r-- 1 root root  39758750 Nov 23 13:50 nomachine_5.1.62_1_x86_64.rpm
drwxr-xr-x 2 root root      4096 Nov 23 14:27 repodata
drwxr-xr-x 2 root root      4096 Nov 25 09:28 aria2
drwxr-xr-x 3 root root      4096 Dec 16 15:29 mysql-cluster
[root@mha2 ~]# 

[root@mha2 ~]# rpm -qa | grep --color mha
[root@mha2 ~]# 
[root@mha2 ~]# rpm -ivh /software/mha4mysql-node-0.56-0.el6.noarch.rpm 
Preparing...                ########################################### [100%]
   1:mha4mysql-node         ########################################### [100%]
[root@mha2 ~]# 
[root@mha2 ~]# rpm -qa | grep --color mha
mha4mysql-node-0.56-0.el6.noarch
[root@mha2 ~]#

如果你没有在MHA Node安装上面的软件包，那么你会遇到下面的错误：

[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf 
Mon Feb 13 10:49:50 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 10:49:50 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 10:49:50 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 10:49:50 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 10:49:56 2017 - [info] GTID failover mode = 0
Mon Feb 13 10:49:56 2017 - [info] Dead Servers:
Mon Feb 13 10:49:56 2017 - [info] Alive Servers:
Mon Feb 13 10:49:56 2017 - [info]   10.158.1.149(10.158.1.149:3306)
Mon Feb 13 10:49:56 2017 - [info]   10.158.1.174(10.158.1.174:3306)
Mon Feb 13 10:49:56 2017 - [info] Alive Slaves:
Mon Feb 13 10:49:56 2017 - [info]   10.158.1.174(10.158.1.174:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 10:49:56 2017 - [info]     Replicating from 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 10:49:56 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Feb 13 10:49:56 2017 - [info] Current Alive Master: 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 10:49:56 2017 - [info] Checking slave configurations..
Mon Feb 13 10:49:56 2017 - [info] Checking replication filtering settings..
Mon Feb 13 10:49:56 2017 - [info]  binlog_do_db= , binlog_ignore_db= 
Mon Feb 13 10:49:56 2017 - [info]  Replication filtering check ok.
Mon Feb 13 10:49:56 2017 - [info] GTID (with auto-pos) is not supported
Mon Feb 13 10:49:56 2017 - [info] Starting SSH connection tests..
Mon Feb 13 10:49:57 2017 - [info] All SSH connection tests passed successfully.
Mon Feb 13 10:49:57 2017 - [info] Checking MHA Node version..
Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln122] Got error when getting node version. Error:
Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln123] 
bash: apply_diff_relay_logs: command not found
Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm, ln150] node version on 10.158.1.174 not found! Is MHA Node package installed ?
 at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 374
Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. Died at /usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm line 151.
Mon Feb 13 10:49:57 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Mon Feb 13 10:49:57 2017 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!
[root@mha1 ~]#

这一部分，YUM的具体配置，MHA的搭建文档中也有过提及，不重述。

————————————————
如果，顺利完成了上面的步骤，那么MHA的MySQL这一部分的配置就完成了。

二、MHA Manager：配置

默认配置文件：

[root@mha1 ~]# cat /etc/masterha_default.cnf 
[server default]
user=root
password=Abcd1@34

ssh_user=root

master_binlog_dir=/var/lib/mysql
remote_workdir=/work_dir/mha_node

repl_user=replme
repl_password=Or@cle123

master_ip_failover_script=/script/mha/master_ip_failover
master_ip_online_change_script=/script/mha/master_ip_online_change

ping_interval=1
[root@mha1 ~]#

对于上面的这个问题件，要注意：
对于同一个MHA，MySQL的口令最好一样。
这样，默认的配置文件跟应用配置文件就可以很好的分开。

如果没有办法做到这一点，那么，默认配置与应用配置就要写成一个文件。

在我的环境中，可以看到我是分开管理的，因此我所有的被MHA管理的MySQL的口令都是一样的。

应用配置文件：
masterha_application_1.cnf，这是之前的搭建文档配置的应用配置文件：

[root@mha1 ~]# cat /etc/masterha_application_1.cnf
[server default]
manager_workdir=/work_dir/mha_manager
manager_log=/work_dir/mha_manager/mha_manager.log

[server1]
hostname=mha2
candidate_master=1

[server2]
hostname=mha3
candidate_master=1

[server3]
hostname=mha4
#candidate_master=1
no_master=1
[root@mha1 ~]#

/etc/masterha_application_2_149_174.cnf，是本次（本文）新增的MySQL主从架构：

[root@mha1 ~]# cat /etc/masterha_application_2_149_174.cnf 
[server default]
manager_workdir=/work_dir/mha_manager
manager_log=/work_dir/mha_manager/mha_manager_2.log

[server1]
hostname=10.158.1.149
candidate_master=1

[server2]
hostname=10.158.1.174
candidate_master=1

[root@mha1 ~]#

这样，MHA的配置的新增操作就完成了。

三、MHA测试：
测试SSH等价关系：

[root@mha1 ~]# masterha_check_ssh --conf=/etc/masterha_application_2_149_174.cnf
Mon Feb 13 13:46:44 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 13:46:44 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 13:46:44 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 13:46:44 2017 - [info] Starting SSH connection tests..
Mon Feb 13 13:46:48 2017 - [debug] 
Mon Feb 13 13:46:45 2017 - [debug]  Connecting via SSH from root@10.158.1.174(10.158.1.174:22) to root@10.158.1.149(10.158.1.149:22)..
Mon Feb 13 13:46:48 2017 - [debug]   ok.
Mon Feb 13 13:46:53 2017 - [debug] 
Mon Feb 13 13:46:44 2017 - [debug]  Connecting via SSH from root@10.158.1.149(10.158.1.149:22) to root@10.158.1.174(10.158.1.174:22)..
Mon Feb 13 13:46:53 2017 - [debug]   ok.
Mon Feb 13 13:46:53 2017 - [info] All SSH connection tests passed successfully.
[root@mha1 ~]#

测试MySQL Replication（主从）：

[root@mha1 ~]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf
Mon Feb 13 13:47:18 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 13:47:18 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 13:47:18 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 13:47:18 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 13:47:21 2017 - [info] GTID failover mode = 0
Mon Feb 13 13:47:21 2017 - [info] Dead Servers:
Mon Feb 13 13:47:21 2017 - [info] Alive Servers:
Mon Feb 13 13:47:21 2017 - [info]   10.158.1.149(10.158.1.149:3306)
Mon Feb 13 13:47:21 2017 - [info]   10.158.1.174(10.158.1.174:3306)
Mon Feb 13 13:47:21 2017 - [info] Alive Slaves:
Mon Feb 13 13:47:21 2017 - [info]   10.158.1.174(10.158.1.174:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 13:47:21 2017 - [info]     Replicating from 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 13:47:21 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Feb 13 13:47:21 2017 - [info] Current Alive Master: 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 13:47:21 2017 - [info] Checking slave configurations..
Mon Feb 13 13:47:21 2017 - [info] Checking replication filtering settings..
Mon Feb 13 13:47:21 2017 - [info]  binlog_do_db= , binlog_ignore_db= 
Mon Feb 13 13:47:21 2017 - [info]  Replication filtering check ok.
Mon Feb 13 13:47:21 2017 - [info] GTID (with auto-pos) is not supported
Mon Feb 13 13:47:21 2017 - [info] Starting SSH connection tests..
Mon Feb 13 13:47:22 2017 - [info] All SSH connection tests passed successfully.
Mon Feb 13 13:47:22 2017 - [info] Checking MHA Node version..
Mon Feb 13 13:47:22 2017 - [info]  Version check ok.
Mon Feb 13 13:47:22 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon Feb 13 13:47:23 2017 - [info] HealthCheck: SSH to 10.158.1.149 is reachable.
Mon Feb 13 13:47:23 2017 - [info] Master MHA Node version is 0.56.
Mon Feb 13 13:47:23 2017 - [info] Checking recovery script configurations on 10.158.1.149(10.158.1.149:3306)..
Mon Feb 13 13:47:23 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000006 
Mon Feb 13 13:47:23 2017 - [info]   Connecting to root@10.158.1.149(10.158.1.149:22).. 
  Creating /work_dir/mha_node if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000006
Mon Feb 13 13:47:23 2017 - [info] Binlog setting check done.
Mon Feb 13 13:47:23 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon Feb 13 13:47:23 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.158.1.174 --slave_ip=10.158.1.174 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 13:47:23 2017 - [info]   Connecting to root@10.158.1.174(10.158.1.174:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to adamhuan_relay-bin.000005
    Temporary relay log file is /var/lib/mysql/adamhuan_relay-bin.000005
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 13:47:24 2017 - [info] Slaves settings check done.
Mon Feb 13 13:47:24 2017 - [info] 
10.158.1.149(10.158.1.149:3306) (current master)
 +--10.158.1.174(10.158.1.174:3306)

Mon Feb 13 13:47:24 2017 - [info] Checking replication health on 10.158.1.174..
Mon Feb 13 13:47:24 2017 - [info]  ok.
Mon Feb 13 13:47:24 2017 - [info] Checking master_ip_failover_script status:
Mon Feb 13 13:47:24 2017 - [info]   /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.158.1.149 --orig_master_ip=10.158.1.149 --orig_master_port=3306 
Mon Feb 13 13:47:24 2017 - [info]  OK.
Mon Feb 13 13:47:24 2017 - [warning] shutdown_script is not defined.
Mon Feb 13 13:47:24 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@mha1 ~]#

四、部署增强脚本
这一部分会跟之前的搭建文档不一样。
在搭建文档中，只涉及到一个MySQL主从，所以增强脚本只需要过滤一个manager即可。
而在现在的环境中，我们有两个MySQL主从，如果再用以前的方法过滤MHA Manager上的manager进程，将会列出所有的manager进程，影响所有的MHA接管的MySQL Replication。

先停掉当前的增强脚本：
杀掉【sh call_do_mha.sh】的进程，即可。

[root@mha1 ~]# ps -ef | grep --color do_mha
root      6626 24113  0 13:51 ?        00:00:00 sh /script/shell/do_mha.sh
root      6633 14322  0 13:51 pts/2    00:00:00 grep --color do_mha
root     24113     1  0 Feb07 ?        01:13:06 sh call_do_mha.sh
[root@mha1 ~]# 
[root@mha1 ~]# ps -ef | grep --color nohup
root     23093 14322  0 13:52 pts/2    00:00:00 grep --color nohup
[root@mha1 ~]# 
[root@mha1 ~]# ps -ef | grep --color manager
root      1476     1  0  2016 ?        00:00:18 /usr/sbin/modem-manager
root      3376     1  0 Feb12 ?        00:03:49 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
root     12318 12181  0  2016 ?        00:02:19 gnome-power-manager
root     32527 14322  0 13:52 pts/2    00:00:00 grep --color manager
[root@mha1 ~]# 
[root@mha1 ~]# kill -9 24113
[root@mha1 ~]# 
[root@mha1 ~]# ps -ef | grep --color do_mha
root     17354 14322  0 13:52 pts/2    00:00:00 grep --color do_mha
[root@mha1 ~]# 
[root@mha1 ~]# ps -ef | grep --color nohup
root     17376 14322  0 13:53 pts/2    00:00:00 grep --color nohup
[root@mha1 ~]# 
[root@mha1 ~]# ps -ef | grep --color manager
root      1476     1  0  2016 ?        00:00:18 /usr/sbin/modem-manager
root      3376     1  0 Feb12 ?        00:03:49 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
root     12318 12181  0  2016 ?        00:02:19 gnome-power-manager
root     17383 14322  0 13:53 pts/2    00:00:00 grep --color manager
[root@mha1 ~]#

过滤方法：

[root@mha1 shell]# ps -ef | grep masterha_manager
root      3376     1  0 Feb12 ?        00:03:50 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
root     17635 14322  0 13:56 pts/2    00:00:00 grep masterha_manager
[root@mha1 shell]# 
[root@mha1 shell]# ps -ef | grep masterha_manager | grep application_1
root      3376     1  0 Feb12 ?        00:03:50 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
[root@mha1 shell]# 
[root@mha1 shell]#

新的脚本：
文件：do_mha.sh
使用方法：do_mha.sh

[root@mha1 shell]# pwd
/script/shell
[root@mha1 shell]# 
[root@mha1 shell]# ls -ltr
total 3099176
-rw-r--r-- 1 root root       7068 Dec 19 23:46 do_mha.sh_orig
-rw-r--r-- 1 root root         76 Dec 19 23:56 call_do_mha.sh
-rw------- 1 root root 3173528130 Feb 13 13:52 nohup.out
-rw-r--r-- 1 root root       7233 Feb 13 13:56 do_mha.sh
[root@mha1 shell]# 
[root@mha1 shell]# cat do_mha.sh
#!/bin/bash

# Script Name: MySQL_MHA_Enhance_shell.sh
# About: auto register in MySQL Replication
# Script Type: Bash Shell
# OS: RHEL6 / CentOS6
# Architecture: MySQL MasterHA

# Auther: adamhuan
# Blog: d-prototype.com

# --------------------------
# variable and file path

# Part:: this script
str_mha_application=$1

# Part:: Linux
# Account info
str_linux_username="root"

# Part:: MySQL
# Account info
str_mysql_username="root"
str_mysql_password=Abcd1@34

str_repl_username="replme"
str_repl_password=Or@cle123

# Part:: MHA
# pid
#str_pid_masterha_manager=`ps -ef | grep masterha_manager | grep perl | awk '{print $2}'`

str_pid_masterha_manager=`ps -ef | grep masterha_manager | grep "$str_mha_application" | grep perl | awk '{print $2}'`

# file
file_conf_mha_global="/etc/masterha_default.cnf"

#file_conf_mha_application="/etc/masterha_application_1.cnf"
file_conf_mha_application="/etc/$str_mha_application.cnf"

# file: relation / by computed
file_log_mha_manager=`cat $file_conf_mha_application | grep --color manager_log | cut -d'=' -f2`

# variable: ip info
# 如果MHA中MySQL主库的候选服务器数量超过了两台，也许下面这个list参数，就会排上用场
list_ip_candicate=`cat $file_conf_mha_application | grep -B 2 "^candidate" | grep "hostname" | cut -d'=' -f2`

str_ip_orig_master=`cat $file_log_mha_manager | grep --color "MySQL Master failover" | cut -d'(' -f2 | cut -d':' -f1 | tail -n 1`
str_ip_new_master=`cat $file_log_mha_manager | grep --color "MySQL Master failover" | cut -d'(' -f3 | cut -d':' -f1 | tail -n 1`

str_ip_mha_manager="10.158.1.94"

# 为[change master]准备的参数
str_log_file_new_master=""
str_log_pos_new_master=""

# Part:: String SQL
str_sql_mysql_change_master=""

# --------------------------
# function

function do_sql() {
  # variable
  func_str_ip="$1"
  func_str_sql="$2"

  # action
  # 本场景中不涉及到对MySQL某个库的操作，所以没有选择[db]
  # mysql -u $user -p"$password" $db -N -e "$f_sql_str"
  mysql -u $str_mysql_username -h $func_str_ip -p"$str_mysql_password" -N -e "$func_str_sql"
}

# 获取主库状态信息
#function get_info_mysql_master_new_master() {
  # version ONE
  #str_log_file_new_master=`do_sql "$str_ip_new_master" "show master status" | awk '{print $1}'`
  #str_log_pos_new_master=`do_sql "$str_ip_new_master" "show master status" | awk '{print $2}'`

#}

# 生成orig_master作为slave加入new_master的[change master]SQL命令
function gen_sql_mysql_change_master() {
  #if [[ "$str_log_file_new_master" == "" || $str_log_pos_new_master == "" ]]
  #then
  #  get_info_mysql_master_new_master
  #fi
  #str_sql_mysql_change_master="CHANGE MASTER TO MASTER_HOST='$str_ip_new_master',MASTER_USER='$str_repl_username',MASTER_PASSWORD='$str_repl_password',MASTER_LOG_FILE='$str_log_file_new_master',MASTER_LOG_POS=$str_log_pos_new_master;"

  # version TWO
  func_temp_master_host_sed=`cat $file_log_mha_manager | grep --color "All other slaves should start" | tail -n 1 | cut -d',' -f1 | cut -d'=' -f2 | cut -d\' -f2`
  func_temp_repl_password_sed=`cat $file_log_mha_manager | grep --color "All other slaves should start" | tail -n 1 | rev | cut -d\' -f2`

  echo "======================"
  echo "@@ func variable: func_temp_repl_password_sed = $func_temp_repl_password_sed"
  echo "======================"
  str_sql_mysql_change_master=`cat $file_log_mha_manager | grep --color "All other slaves should start" | tail -n 1 | sed "s/'$func_temp_repl_password_sed'/'$str_repl_password'/g" | cut -d':' -f4`
  str_sql_mysql_change_master=`echo $str_sql_mysql_change_master | sed "s/'$func_temp_master_host_sed'/'$str_ip_new_master'/g"`
}

# 对指定主机执行Linux命令
# 前提：
# 1. IP可达
# 2. SSH等价关系
function do_linux_by_ssh() {
  # variable
  func_str_ip="$1"
  func_str_user="$2"
  func_str_command="$3"

  # action
  ssh -t $func_str_user@$func_str_ip "$func_str_command"
}

# 处理VIP的事宜
function do_part_vip() {
  do_linux_by_ssh "$str_ip_new_master" "root" "service keepalived start"
  do_linux_by_ssh "$str_ip_orig_master" "root" "service keepalived stop"
}

function do_part_orig_master_is_new_slave() {
  do_linux_by_ssh "$str_ip_orig_master" "root" "service mysql start"
  do_sql "$str_ip_orig_master" "set global read_only=1;"
  do_sql "$str_ip_orig_master" "$str_sql_mysql_change_master"
  do_sql "$str_ip_orig_master" "start slave;"
}

function do_part_mha_master_manager_start() {
  do_linux_by_ssh "$str_ip_mha_manager" "root" "nohup masterha_manager --conf=$file_conf_mha_application --ignore_last_failover &"
}

# 如果PID不存在，则执行该脚本，否则，退出
function runable_by_mha_manager_pid() {
  echo "-----------------"
  echo "Script for MySQL Master HA"
  echo "-----------------"
  echo "Begin:: "`date "+|%Y-%m-%d|%H:%M:%S|"`

  if [[ "$str_pid_masterha_manager" == "" ]]
  then
    echo "## masterha_manager is [NOT ALIVED]."
  else
    echo "## masterha_manager is [ALIVED]."
    echo "[masterha_manager] PID is:: $str_pid_masterha_manager"

    # do something.
    echo "## Exit Script"
    exit 0
  fi
}

# --------------------------
# action

# 如果PID不存在，则执行该脚本，否则，退出
echo "------------------"
echo "app: runable_by_mha_manager_pid"
runable_by_mha_manager_pid
echo ""

echo "------------------"
echo "app: gen_sql_mysql_change_master"
gen_sql_mysql_change_master
echo ""
#echo "------------------"
#echo "app: do_part_vip"
#do_part_vip
#echo ""

echo "------------------"
echo "app: do_part_orig_master_is_new_slave"
do_part_orig_master_is_new_slave
echo ""

echo "------------------"
echo "app: do_part_mha_master_manager_start"
#do_part_mha_master_manager_start
nohup masterha_manager --conf=$file_conf_mha_application --ignore_last_failover &
echo ""

# --------------------------
# Show time

# ---------
# version one
# ---------
#echo "new master is:: $str_ip_new_master"
#echo "Master log file is:: $str_log_file_new_master"
#echo "Master log POS is:: $str_log_pos_new_master"
#echo "orig master --> new master ## SQL: CHANGE MASTER ## is:: $str_sql_mysql_change_master"

# ---------
# version two
# ---------
echo "================="
echo "MySQL info:"

echo "## Account and Password"
echo "username @ $str_mysql_username"
echo "password @ $str_mysql_password"
echo "--- for REPLICATION ---"
echo "repl @ username ## $str_repl_username"
echo "repl @ password ## $str_repl_password"
echo "## Master Server info"
echo "log file @ Master ## $str_log_file_new_master"
echo "log pos  @ Master ## $str_log_pos_new_master"
echo ""

echo "================="
echo "SQL statement:"
echo "[CHANGE MASTER] -->"
echo "$str_sql_mysql_change_master"
echo ""

echo "================="
echo "MasterHA info:"

echo "## File and Path"
echo "MHA Global config file @ $file_conf_mha_global"
echo "MHA Application config file @ $file_conf_mha_application"
echo "MHA Log file:: masterha_manager @ $file_log_mha_manager"

echo "## Architecture"
echo "Candicate Server list::"
echo "$list_ip_candicate"

echo "## IP"
echo "MHA Manager Server:: $str_ip_mha_manager"
echo "Last:: new master:: $str_ip_new_master"
echo "Last:: orig master:: $str_ip_orig_master"
echo ""

# --------------------------
echo "-----------------"
echo "Finished:: "`date "+|%Y-%m-%d|%H:%M:%S|"`
# Done

[root@mha1 shell]#

文件：call_do_mha.sh

[root@mha1 shell]# cat call_do_mha.sh 

# name: call_do_mha.sh

#variables

str_mha_app=$1

while [ true ]
do
 sh /script/shell/do_mha.sh $str_mha_app
done
[root@mha1 shell]#

启动新的MHA增强脚本：

[root@mha1 ~]# ps -ef | grep masterha_manager
root      2481  2459  0 14:18 pts/0    00:00:00 grep masterha_manager
[root@mha1 ~]# 
[root@mha1 ~]# ls -ltr /etc/ | grep masterha
-rw-r--r--   1 root root    250 Dec 13 13:50 masterha_application_1.cnf
-rw-r--r--   1 root root    312 Dec 13 20:31 masterha_default.cnf
-rw-r--r--   1 root root    212 Feb 13 10:25 masterha_application_2_149_174.cnf
[root@mha1 ~]# 
[root@mha1 ~]# cd /script/shell/
[root@mha1 shell]# ls -ltr
total 3099280
-rw-r--r-- 1 root root       7068 Dec 19 23:46 do_mha.sh_orig
-rw-r--r-- 1 root root        117 Feb 13 13:59 call_do_mha.sh
-rw-r--r-- 1 root root       7293 Feb 13 14:10 do_mha.sh
-rw------- 1 root root 3173637364 Feb 13 14:12 nohup.out
[root@mha1 shell]# 

应用1：
[root@mha1 shell]# nohup sh call_do_mha.sh masterha_application_1 &
[1] 2493
nohup: ignoring input and appending output to `nohup.out'
[root@mha1 shell]# 
[root@mha1 shell]# 
[root@mha1 shell]# ps -ef | grep --color masterha_manager
root      2549     1  1 14:18 pts/0    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
root     28289  2459  0 14:19 pts/0    00:00:00 grep --color masterha_manager
[root@mha1 shell]# 

应用2：
[root@mha1 shell]# nohup sh call_do_mha.sh masterha_application_2_149_174 &
[3] 16662
nohup: ignoring input and appending output to `nohup.out'
[root@mha1 shell]# 
[root@mha1 shell]# 
[root@mha1 shell]# ps -ef | grep --color masterha_manager
root      2549     1  0 14:18 pts/0    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
root      3513     1  1 14:19 pts/0    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_2_149_174.cnf --ignore_last_failover
root     30124  2459  0 14:19 pts/0    00:00:00 grep --color masterha_manager
[root@mha1 shell]#

五，测试：新的增强脚本：

首先看看当前的两个MySQL主从的状态：
应用1：

[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_1.cnf
Mon Feb 13 14:24:32 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 14:24:32 2017 - [info] Reading application default configuration from /etc/masterha_application_1.cnf..
Mon Feb 13 14:24:32 2017 - [info] Reading server configuration from /etc/masterha_application_1.cnf..
Mon Feb 13 14:24:32 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 14:24:32 2017 - [info] GTID failover mode = 0
Mon Feb 13 14:24:32 2017 - [info] Dead Servers:
Mon Feb 13 14:24:32 2017 - [info] Alive Servers:
Mon Feb 13 14:24:32 2017 - [info]   mha2(10.158.1.95:3306)
Mon Feb 13 14:24:32 2017 - [info]   mha3(10.158.1.96:3306)
Mon Feb 13 14:24:32 2017 - [info]   mha4(10.158.1.97:3306)
Mon Feb 13 14:24:32 2017 - [info] Alive Slaves:
Mon Feb 13 14:24:32 2017 - [info]   mha2(10.158.1.95:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 14:24:32 2017 - [info]     Replicating from 10.158.1.96(10.158.1.96:3306)
Mon Feb 13 14:24:32 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Feb 13 14:24:32 2017 - [info]   mha4(10.158.1.97:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 14:24:32 2017 - [info]     Replicating from 10.158.1.96(10.158.1.96:3306)
Mon Feb 13 14:24:32 2017 - [info]     Not candidate for the new Master (no_master is set)
Mon Feb 13 14:24:32 2017 - [info] Current Alive Master: mha3(10.158.1.96:3306)
Mon Feb 13 14:24:32 2017 - [info] Checking slave configurations..
Mon Feb 13 14:24:32 2017 - [info] Checking replication filtering settings..
Mon Feb 13 14:24:32 2017 - [info]  binlog_do_db= , binlog_ignore_db= 
Mon Feb 13 14:24:32 2017 - [info]  Replication filtering check ok.
Mon Feb 13 14:24:32 2017 - [info] GTID (with auto-pos) is not supported
Mon Feb 13 14:24:32 2017 - [info] Starting SSH connection tests..
Mon Feb 13 14:24:34 2017 - [info] All SSH connection tests passed successfully.
Mon Feb 13 14:24:34 2017 - [info] Checking MHA Node version..
Mon Feb 13 14:24:34 2017 - [info]  Version check ok.
Mon Feb 13 14:24:34 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon Feb 13 14:24:34 2017 - [info] HealthCheck: SSH to mha3 is reachable.
Mon Feb 13 14:24:35 2017 - [info] Master MHA Node version is 0.56.
Mon Feb 13 14:24:35 2017 - [info] Checking recovery script configurations on mha3(10.158.1.96:3306)..
Mon Feb 13 14:24:35 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000028 
Mon Feb 13 14:24:35 2017 - [info]   Connecting to root@10.158.1.96(mha3:22).. 
  Creating /work_dir/mha_node if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000028
Mon Feb 13 14:24:35 2017 - [info] Binlog setting check done.
Mon Feb 13 14:24:35 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon Feb 13 14:24:35 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha2 --slave_ip=10.158.1.95 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 14:24:35 2017 - [info]   Connecting to root@10.158.1.95(mha2:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to relay-bin.000002
    Temporary relay log file is /var/lib/mysql/relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 14:24:35 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha4 --slave_ip=10.158.1.97 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 14:24:35 2017 - [info]   Connecting to root@10.158.1.97(mha4:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to relay-bin.000002
    Temporary relay log file is /var/lib/mysql/relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 14:24:36 2017 - [info] Slaves settings check done.
Mon Feb 13 14:24:36 2017 - [info] 
mha3(10.158.1.96:3306) (current master)
 +--mha2(10.158.1.95:3306)
 +--mha4(10.158.1.97:3306)

Mon Feb 13 14:24:36 2017 - [info] Checking replication health on mha2..
Mon Feb 13 14:24:36 2017 - [info]  ok.
Mon Feb 13 14:24:36 2017 - [info] Checking replication health on mha4..
Mon Feb 13 14:24:36 2017 - [info]  ok.
Mon Feb 13 14:24:36 2017 - [info] Checking master_ip_failover_script status:
Mon Feb 13 14:24:36 2017 - [info]   /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=mha3 --orig_master_ip=10.158.1.96 --orig_master_port=3306 
Mon Feb 13 14:24:36 2017 - [info]  OK.
Mon Feb 13 14:24:36 2017 - [warning] shutdown_script is not defined.
Mon Feb 13 14:24:36 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@mha1 shell]#

可以看到：

Mon Feb 13 14:24:36 2017 - [info] 
mha3(10.158.1.96:3306) (current master)
 +--mha2(10.158.1.95:3306)
 +--mha4(10.158.1.97:3306)

应用2：

[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf 
Mon Feb 13 14:25:18 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 14:25:18 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 14:25:18 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 14:25:18 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 14:25:18 2017 - [info] GTID failover mode = 0
Mon Feb 13 14:25:18 2017 - [info] Dead Servers:
Mon Feb 13 14:25:18 2017 - [info] Alive Servers:
Mon Feb 13 14:25:18 2017 - [info]   10.158.1.149(10.158.1.149:3306)
Mon Feb 13 14:25:18 2017 - [info]   10.158.1.174(10.158.1.174:3306)
Mon Feb 13 14:25:18 2017 - [info] Alive Slaves:
Mon Feb 13 14:25:18 2017 - [info]   10.158.1.174(10.158.1.174:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 14:25:18 2017 - [info]     Replicating from 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 14:25:18 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Feb 13 14:25:18 2017 - [info] Current Alive Master: 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 14:25:18 2017 - [info] Checking slave configurations..
Mon Feb 13 14:25:18 2017 - [info] Checking replication filtering settings..
Mon Feb 13 14:25:18 2017 - [info]  binlog_do_db= , binlog_ignore_db= 
Mon Feb 13 14:25:18 2017 - [info]  Replication filtering check ok.
Mon Feb 13 14:25:18 2017 - [info] GTID (with auto-pos) is not supported
Mon Feb 13 14:25:18 2017 - [info] Starting SSH connection tests..
Mon Feb 13 14:25:24 2017 - [info] All SSH connection tests passed successfully.
Mon Feb 13 14:25:24 2017 - [info] Checking MHA Node version..
Mon Feb 13 14:25:24 2017 - [info]  Version check ok.
Mon Feb 13 14:25:24 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon Feb 13 14:25:24 2017 - [info] HealthCheck: SSH to 10.158.1.149 is reachable.
Mon Feb 13 14:25:25 2017 - [info] Master MHA Node version is 0.56.
Mon Feb 13 14:25:25 2017 - [info] Checking recovery script configurations on 10.158.1.149(10.158.1.149:3306)..
Mon Feb 13 14:25:25 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000006 
Mon Feb 13 14:25:25 2017 - [info]   Connecting to root@10.158.1.149(10.158.1.149:22).. 
  Creating /work_dir/mha_node if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000006
Mon Feb 13 14:25:25 2017 - [info] Binlog setting check done.
Mon Feb 13 14:25:25 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon Feb 13 14:25:25 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.158.1.174 --slave_ip=10.158.1.174 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 14:25:25 2017 - [info]   Connecting to root@10.158.1.174(10.158.1.174:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to adamhuan_relay-bin.000005
    Temporary relay log file is /var/lib/mysql/adamhuan_relay-bin.000005
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 14:25:26 2017 - [info] Slaves settings check done.
Mon Feb 13 14:25:26 2017 - [info] 
10.158.1.149(10.158.1.149:3306) (current master)
 +--10.158.1.174(10.158.1.174:3306)

Mon Feb 13 14:25:26 2017 - [info] Checking replication health on 10.158.1.174..
Mon Feb 13 14:25:26 2017 - [info]  ok.
Mon Feb 13 14:25:26 2017 - [info] Checking master_ip_failover_script status:
Mon Feb 13 14:25:26 2017 - [info]   /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=10.158.1.149 --orig_master_ip=10.158.1.149 --orig_master_port=3306 
Mon Feb 13 14:25:26 2017 - [info]  OK.
Mon Feb 13 14:25:26 2017 - [warning] shutdown_script is not defined.
Mon Feb 13 14:25:26 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@mha1 shell]#

可以看到：

Mon Feb 13 14:25:26 2017 - [info] 
10.158.1.149(10.158.1.149:3306) (current master)
 +--10.158.1.174(10.158.1.174:3306)

测试：停掉当前的主库的服务
应用1：

[root@mha3 ~]# mysql -u root -p'***************'
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2165
Server version: 5.7.15-9-log Percona Server (GPL), Release 9, Revision 9f0fd0a

Copyright (c) 2009-2016 Percona LLC and/or its affiliates
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000028 |      154 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> show slave status\G
Empty set (0.00 sec)

mysql> 
mysql> exit
Bye
[root@mha3 ~]# 
[root@mha3 ~]# service mysql stop
Stopping mysqld:  [  OK  ]
[root@mha3 ~]# 
[root@mha3 ~]# service mysql status
mysqld (pid  2680) is running...
[root@mha3 ~]#

应用2：

[root@mha2 ~]# mysql -u root -p'***************';
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 24
Server version: 5.7.15-9-log Percona Server (GPL), Release 9, Revision 9f0fd0a

Copyright (c) 2009-2016 Percona LLC and/or its affiliates
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000006 |     1916 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> show slave status\G
Empty set (0.00 sec)

mysql> 
mysql> exit
Bye
[root@mha2 ~]# 
[root@mha2 ~]# service mysql stop
Stopping mysqld:  [  OK  ]
[root@mha2 ~]# 
[root@mha2 ~]# service mysql status
mysqld (pid  1038) is running...
[root@mha2 ~]#

可以看到，上面的服务被停掉之后，就自己起来了。（这是如预期的）

再次查看MySQL主从的状态：

应用1：

[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_1.cnf
Mon Feb 13 14:29:29 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 14:29:29 2017 - [info] Reading application default configuration from /etc/masterha_application_1.cnf..
Mon Feb 13 14:29:29 2017 - [info] Reading server configuration from /etc/masterha_application_1.cnf..
Mon Feb 13 14:29:29 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 14:29:36 2017 - [info] GTID failover mode = 0
Mon Feb 13 14:29:36 2017 - [info] Dead Servers:
Mon Feb 13 14:29:36 2017 - [info] Alive Servers:
Mon Feb 13 14:29:36 2017 - [info]   mha2(10.158.1.95:3306)
Mon Feb 13 14:29:36 2017 - [info]   mha3(10.158.1.96:3306)
Mon Feb 13 14:29:36 2017 - [info]   mha4(10.158.1.97:3306)
Mon Feb 13 14:29:36 2017 - [info] Alive Slaves:
Mon Feb 13 14:29:36 2017 - [info]   mha3(10.158.1.96:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 14:29:36 2017 - [info]     Replicating from 10.158.1.95(10.158.1.95:3306)
Mon Feb 13 14:29:36 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Feb 13 14:29:36 2017 - [info]   mha4(10.158.1.97:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 14:29:36 2017 - [info]     Replicating from 10.158.1.95(10.158.1.95:3306)
Mon Feb 13 14:29:36 2017 - [info]     Not candidate for the new Master (no_master is set)
Mon Feb 13 14:29:36 2017 - [info] Current Alive Master: mha2(10.158.1.95:3306)
Mon Feb 13 14:29:36 2017 - [info] Checking slave configurations..
Mon Feb 13 14:29:36 2017 - [info] Checking replication filtering settings..
Mon Feb 13 14:29:36 2017 - [info]  binlog_do_db= , binlog_ignore_db= 
Mon Feb 13 14:29:36 2017 - [info]  Replication filtering check ok.
Mon Feb 13 14:29:36 2017 - [info] GTID (with auto-pos) is not supported
Mon Feb 13 14:29:36 2017 - [info] Starting SSH connection tests..
Mon Feb 13 14:29:38 2017 - [info] All SSH connection tests passed successfully.
Mon Feb 13 14:29:38 2017 - [info] Checking MHA Node version..
Mon Feb 13 14:29:38 2017 - [info]  Version check ok.
Mon Feb 13 14:29:38 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon Feb 13 14:29:38 2017 - [info] HealthCheck: SSH to mha2 is reachable.
Mon Feb 13 14:29:38 2017 - [info] Master MHA Node version is 0.56.
Mon Feb 13 14:29:38 2017 - [info] Checking recovery script configurations on mha2(10.158.1.95:3306)..
Mon Feb 13 14:29:38 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000054 
Mon Feb 13 14:29:38 2017 - [info]   Connecting to root@10.158.1.95(mha2:22).. 
  Creating /work_dir/mha_node if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000054
Mon Feb 13 14:29:39 2017 - [info] Binlog setting check done.
Mon Feb 13 14:29:39 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon Feb 13 14:29:39 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha3 --slave_ip=10.158.1.96 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 14:29:39 2017 - [info]   Connecting to root@10.158.1.96(mha3:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to relay-bin.000002
    Temporary relay log file is /var/lib/mysql/relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 14:29:40 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=mha4 --slave_ip=10.158.1.97 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 14:29:40 2017 - [info]   Connecting to root@10.158.1.97(mha4:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to relay-bin.000002
    Temporary relay log file is /var/lib/mysql/relay-bin.000002
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 14:29:41 2017 - [info] Slaves settings check done.
Mon Feb 13 14:29:41 2017 - [info] 
mha2(10.158.1.95:3306) (current master)
 +--mha3(10.158.1.96:3306)
 +--mha4(10.158.1.97:3306)

Mon Feb 13 14:29:41 2017 - [info] Checking replication health on mha3..
Mon Feb 13 14:29:41 2017 - [info]  ok.
Mon Feb 13 14:29:41 2017 - [info] Checking replication health on mha4..
Mon Feb 13 14:29:41 2017 - [info]  ok.
Mon Feb 13 14:29:41 2017 - [info] Checking master_ip_failover_script status:
Mon Feb 13 14:29:41 2017 - [info]   /script/mha/master_ip_failover --command=status --ssh_user=root --orig_master_host=mha2 --orig_master_ip=10.158.1.95 --orig_master_port=3306 
Mon Feb 13 14:29:41 2017 - [info]  OK.
Mon Feb 13 14:29:41 2017 - [warning] shutdown_script is not defined.
Mon Feb 13 14:29:41 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@mha1 shell]#

可以看到：

Mon Feb 13 14:29:41 2017 - [info] 
mha2(10.158.1.95:3306) (current master)
 +--mha3(10.158.1.96:3306)
 +--mha4(10.158.1.97:3306)

应用2：

[root@mha1 shell]# masterha_check_repl --conf=/etc/masterha_application_2_149_174.cnf 
Mon Feb 13 14:32:00 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Mon Feb 13 14:32:00 2017 - [info] Reading application default configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 14:32:00 2017 - [info] Reading server configuration from /etc/masterha_application_2_149_174.cnf..
Mon Feb 13 14:32:00 2017 - [info] MHA::MasterMonitor version 0.56.
Mon Feb 13 14:32:00 2017 - [info] GTID failover mode = 0
Mon Feb 13 14:32:00 2017 - [info] Dead Servers:
Mon Feb 13 14:32:00 2017 - [info] Alive Servers:
Mon Feb 13 14:32:00 2017 - [info]   10.158.1.149(10.158.1.149:3306)
Mon Feb 13 14:32:00 2017 - [info]   10.158.1.174(10.158.1.174:3306)
Mon Feb 13 14:32:00 2017 - [info] Alive Slaves:
Mon Feb 13 14:32:00 2017 - [info]   10.158.1.149(10.158.1.149:3306)  Version=5.7.15-9-log (oldest major version between slaves) log-bin:enabled
Mon Feb 13 14:32:00 2017 - [info]     Replicating from 10.158.1.174(10.158.1.174:3306)
Mon Feb 13 14:32:00 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Mon Feb 13 14:32:00 2017 - [info] Current Alive Master: 10.158.1.174(10.158.1.174:3306)
Mon Feb 13 14:32:00 2017 - [info] Checking slave configurations..
Mon Feb 13 14:32:00 2017 - [info]  read_only=1 is not set on slave 10.158.1.149(10.158.1.149:3306).
Mon Feb 13 14:32:00 2017 - [info] Checking replication filtering settings..
Mon Feb 13 14:32:00 2017 - [info]  binlog_do_db= , binlog_ignore_db= 
Mon Feb 13 14:32:00 2017 - [info]  Replication filtering check ok.
Mon Feb 13 14:32:00 2017 - [info] GTID (with auto-pos) is not supported
Mon Feb 13 14:32:00 2017 - [info] Starting SSH connection tests..
Mon Feb 13 14:32:01 2017 - [info] All SSH connection tests passed successfully.
Mon Feb 13 14:32:01 2017 - [info] Checking MHA Node version..
Mon Feb 13 14:32:01 2017 - [info]  Version check ok.
Mon Feb 13 14:32:01 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon Feb 13 14:32:01 2017 - [info] HealthCheck: SSH to 10.158.1.174 is reachable.
Mon Feb 13 14:32:01 2017 - [info] Master MHA Node version is 0.56.
Mon Feb 13 14:32:01 2017 - [info] Checking recovery script configurations on 10.158.1.174(10.158.1.174:3306)..
Mon Feb 13 14:32:01 2017 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/work_dir/mha_node/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000007 
Mon Feb 13 14:32:01 2017 - [info]   Connecting to root@10.158.1.174(10.158.1.174:22).. 
  Creating /work_dir/mha_node if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000007
Mon Feb 13 14:32:02 2017 - [info] Binlog setting check done.
Mon Feb 13 14:32:02 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon Feb 13 14:32:02 2017 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.158.1.149 --slave_ip=10.158.1.149 --slave_port=3306 --workdir=/work_dir/mha_node --target_version=5.7.15-9-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Mon Feb 13 14:32:02 2017 - [info]   Connecting to root@10.158.1.149(10.158.1.149:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay.000003
    Temporary relay log file is /var/lib/mysql/mysql-relay.000003
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Mon Feb 13 14:32:03 2017 - [info] Slaves settings check done.
Mon Feb 13 14:32:03 2017 - [info] 
10.158.1.174(10.158.1.174:3306) (current master)
 +--10.158.1.149(10.158.1.149:3306)

Mon Feb 13 14:32:03 2017 - [info] Checking replication health on 10.158.1.149..
Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln485] Slave IO thread is not running on 10.158.1.149(10.158.1.149:3306)
Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526]  failed!
Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 417
Mon Feb 13 14:32:03 2017 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Mon Feb 13 14:32:03 2017 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!
[root@mha1 shell]# 
[root@mha1 shell]#

可以看到：

Mon Feb 13 14:32:03 2017 - [info] 
10.158.1.174(10.158.1.174:3306) (current master)
 +--10.158.1.149(10.158.1.149:3306)

都做了切换。

当前，MHA Manager的manager服务状态：

[root@mha1 shell]# ps -ef | grep masterha_manager
root      7851     1  0 14:27 pts/0    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_1.cnf --ignore_last_failover
root     21892     1  0 14:28 pts/0    00:00:00 perl /usr/bin/masterha_manager --conf=/etc/masterha_application_2_149_174.cnf --ignore_last_failover
root     31057  2459  0 14:34 pts/0    00:00:00 grep masterha_manager
[root@mha1 shell]# 
[root@mha1 shell]# ps -ef | grep --color do_mha
root      2493  2459  0 14:18 pts/0    00:00:06 sh call_do_mha.sh masterha_application_1
root      3576  2493  0 14:34 pts/0    00:00:00 sh /script/shell/do_mha.sh masterha_application_1
root      3579  3576  0 14:34 pts/0    00:00:00 sh /script/shell/do_mha.sh masterha_application_1
root      3595  2459  0 14:34 pts/0    00:00:00 grep --color do_mha
root     31368  2459  0 14:19 pts/0    00:00:07 sh call_do_mha.sh masterha_application_2_149_174
[root@mha1 shell]#

至此，为MHA新增MySQL Replication（主从），完成。

——————————————
Done。