天天看点

Mysql的高可用MHA实现

 Mysql的高可用MHA实现

环境:rhel6.5

Server1:172.25.66.1 master

Server2:172.25.66.2 Candicate slave

Server3:172.25.66.3 slave

Server4: 172.25.66.4 monitor

Server1是master,Server2和server3是server1的slave,其中master对外提供写服务,备选master(实际的slave,主机名server2)提供读服务,另一个slave也提供相关的读服务,一旦master宕机,将会把备选master提升为新的master,slave指向新的master。

一.基础知识

1.MHA构架组成:MHA Manager(管理节点)和MHA Node(数据节点)

Mysql的高可用MHA实现

mha支持多套主从切换,只要编写多个配置文件即可,例:app1.conf,app2.conf,...,appn.conf

工作机制

MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Node运行在每台MySQL服务器上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明。

在MHA自动故障切换过程中,MHA试图从宕机的主服务器上保存二进制日志,最大程度的保证数据的不丢失,但这并不总是可行的。例如,如果主服务器硬件故障或无法通过ssh访问,MHA没法保存二进制日志,只进行故障转移而丢失了最新的数据。使用MySQL 5.5的半同步复制,可以大大降低数据丢失的风险。MHA可以与半同步复制结合起来。如果只有一个slave已经收到了最新的二进制日志,MHA可以将最新的二进制日志应用于其他所有的slave服务器上,因此可以保证所有节点的数据一致性。

安装软件

1.在所有节点都要安装MHA node所需的perl模块(DBD:mysql)

[[email protected] ~]# yum install perl-DBD-MySQL -y

[[email protected] ~]# yum install perl-DBD-MySQL -y

[[email protected] ~]# yum install perl-DBD-MySQL -y

[[email protected] ~]# yum install perl-DBD-MySQL -y

[[email protected] ~]# yum install -y perl-devel perl-CPAN

[[email protected] mha]# ls

##在安装perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes需要解决依赖性

perl-Config-Tiny-2.12-7.1.el6.noarch.rpm

perl-Email-Date-Format-1.002-5.el6.noarch.rpm

perl-Log-Dispatch-2.27-1.el6.noarch.rpm

perl-Mail-Sender-0.8.16-3.el6.noarch.rpm

perl-Mail-Sendmail-0.79-12.el6.noarch.rpm

perl-MailTools-2.04-4.el6.noarch.rpm

perl-MIME-Lite-3.027-2.el6.noarch.rpm

perl-MIME-Lite-HTML-1.23-2.el6.noarch.rpm

perl-MIME-Types-1.28-2.el6.noarch.rpm

perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm

perl-Params-Validate-0.92-3.el6.x86_64.rpm

perl-TimeDate-1.16-13.el6.noarch.rpm

[[email protected] mha]# yum install -y perl-*

2.

[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm

[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm

[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm

3.manager上要安装MHA node和MHA Manager

[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm

[[email protected] ~]# yum install -y mha4mysql-manager-0.56-0.el6.noarch.rpm

总之

监控机上:

mha4mysql-node-0.56-0.el6.noarch.rpm

mha4mysql-node-0.56-0.el6.noarch.rpm

依赖包:

perl-Config-Tiny.noarch 0:2.12-7.1.el6

perl-Email-Date-Format.noarch 0:1.002-5.el6

perl-Log-Dispatch.noarch 0:2.27-1.el6

perl-MIME-Lite.noarch 0:3.027-2.el6

perl-MIME-Lite-HTML.noarch 0:1.23-2.el6

perl-MIME-Types.noarch 0:1.28-2.el6

perl-Mail-Sender.noarch 0:0.8.16-3.el6

perl-Mail-Sendmail.noarch 0:0.79-12.el6

perl-Parallel-ForkManager.noarch 0:0.7.9-1.el6

其他节点上:

mha4mysql-node-0.56-0.el6.noarch.rpm

三.配置ssh无密码登陆

把各个节点的authorized_keys的内容互相拷贝加入到对方的此文件中,然后就可以免密码彼此ssh连入。

Server1:

[[email protected] ~]# ssh-keygen -t rsa ##一路回车

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

51:7f:d4:78:9a:86:86:de:99:42:29:81:18:6b:4f:d0 [email protected]

The key's randomart image is:

+--[ RSA 2048]----+

| o+ . . .o |

| .oE .. . .. o|

| o . .. o...+ |

| . o ..+ o.+ |

| . S+ o + |

| o + |

| . |

| |

| |

+-----------------+

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

Server2:

[[email protected] ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

a0:3a:e2:74:e2:5d:d0:c2:11:f0:2d:7a:13:61:e5:e8 [email protected]

The key's randomart image is:

+--[ RSA 2048]----+

| ..+.. |

| o * |

| * o. |

| + =. . |

| . E.. S |

| ..+ |

|.oo. . |

|+.+.. |

| o . |

+-----------------+

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

Server3:

[[email protected] ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

1e:6e:21:26:f6:14:25:e2:d9:bb:56:33:7b:c1:3b:42 [email protected]

The key's randomart image is:

+--[ RSA 2048]----+

| . . . |

| . + o |

| o o |

| o . |

| o = E o |

| . = B * o |

| + * + |

| . . o . |

| |

+-----------------+

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

Server4:

[[email protected] ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Created directory '/root/.ssh'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

8b:47:01:ea:92:3c:92:e2:19:9a:1b:f9:e9:32:17:c3 [email protected]

The key's randomart image is:

+--[ RSA 2048]----+

| . |

| . . |

| . . |

| o o . |

|+o= . S |

|++Eo o . |

|=o o . o |

|o+.. . |

|.=+ |

+-----------------+

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]

配置完后测试: server1,2,3可以互相无密码登陆,server4可无密码登陆其他第三个主机

方法一:

Server1:

[[email protected] ~]# ssh [email protected]

Last login: Sat Aug 5 07:54:27 2017 from 172.25.66.250

[[email protected] ~]# logout

Connection to 172.25.66.3 closed.

[[email protected] ~]# ssh [email protected]

Last login: Sat Aug 5 07:54:17 2017 from 172.25.66.250

[[email protected] ~]# logout

Connection to 172.25.66.2 closed.

server2,server3也做同样的测试,此处省略部分步骤。

[[email protected] ~]# ssh [email protected]

Last login: Sat Aug 5 11:36:33 2017 from 172.25.66.1

[[email protected] ~]# logout

Connection to 172.25.66.2 closed.

[[email protected] ~]# ssh [email protected]

Last login: Sat Aug 5 11:36:26 2017 from 172.25.66.1

[[email protected] ~]# ssh [email protected]

Last login: Sat Aug 5 09:39:05 2017 from 172.25.66.250

方法二:

Server4:

ssh-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa):

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

dd:ab:40:3e:84:b2:41:79:97:29:df:65:62:1f:6c:eb [email protected]

The key's randomart image is:

+--[ RSA 2048]----+

| |

| . o . |

| o o + o * |

| . . = + B o |

| o . S o + |

| + + . . |

| . + E |

| o . |

| . |

+-----------------+

2 cd /root/.ssh/

3 ssh-copy-id server4

4 rsync -p * server2:~/.ssh/ ##yum install -y rsync

5 rsync -p * server3:~/.ssh/

6 rsync -p * server1:~/.ssh/

四.搭建主从复制环境

Server2的数据已经和server1(master)数据同步

下面配置首先使server3与server1(master)数据相同,然后进行同步;

注意:在同步数据之前务必保证数据的一致,binlog-do-db 和 replicate-ignore-db 设置必须相同,MHA 在启动时候会检测过滤规则,如果过滤规则不同,MHA 不启动监控和故障转移。

1.在Master上备份一份完整的数据

[[email protected] ~]# mysqldump -uroot [email protected] --master-data=2 --single-transaction -R --triggers -A > all.sql

##将server1上数据库的所有数据备份在文件all.sql中,其中--master-data=2代表备份时刻记录master的Binlog位置和Position,--single-transaction意思是获取一致性快照,-R意思是备份存储过程和函数,--triggres的意思是备份触发器,-A代表备份所有的库。

配置server3为从机,把数据备份复制到server3上

首先配置server3上的半同步复制

[[email protected] ~]# yum install -y mysql-community-client-5.7.17-1.el6.x86_64.rpm mysql-community-common-5.7.17-1.el6.x86_64.rpm mysql-community-libs-5.7.17-1.el6.x86_64.rpm mysql-community-libs-compat-5.7.17-1.el6.x86_64.rpm mysql-community-server-5.7.17-1.el6.x86_64.rpm

[[email protected] ~]# vim /etc/my.cnf

server-id=3 ##server-id每个节点均不相同,范围是1到2^32-1

log-bin=mysql-bin

binlog-do-db=test

binlog-ignore-db=mysql

gtid_mode=ON

enforce-gtid-consistency=true

slave-parallel-type=LOGICAL_CLOCK

slave-parallel-workers=16

master_info_repository=TABLE

relay_log_info_repository=TABLE

relay_log_recovery=ON

##所有节点上的配置文件除了server-id之外必须相同,因为当master节点down掉了,需要slave接管,因此配置要相同。

[[email protected] ~]# /etc/init.d/mysqld start

Initializing MySQL database: [ OK ]

Installing validate password plugin: [ OK ]

Starting mysqld: [ OK ]

[[email protected] ~]# mysql -p

mysql> show databases;

ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

mysql> ALTER USER [email protected] identified by '[email protected]';

Query OK, 0 rows affected (0.00 sec)

mysql> ^DBye

[[email protected] ~]# mysql -uroot [email protected] < ./all.sql

[[email protected] ~]# scp all.sql [email protected]:/root/

all.sql 100% 759KB 758.7KB/s 00:00

[[email protected] ~]# mysql [email protected]

mysql> show databases;

+--------------------+

| Database |

+--------------------+

| information_schema |

| mysql |

| performance_schema |

| sys |

| test |

+--------------------+

5 rows in set (0.00 sec)

mysql> stop slave;

Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> change master to master_host='172.25.66.1', master_user='westos',master_password='[email protected]',MASTER_AUTO_POSITION = 1;

Query OK, 0 rows affected, 2 warnings (0.57 sec)

mysql> start slave;

Query OK, 0 rows affected (1.13 sec)

mysql> show slave status\G;

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 172.25.66.1

Master_User: westos

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000007

Read_Master_Log_Pos: 493

Relay_Log_File: server3-relay-bin.000002

Relay_Log_Pos: 414

Relay_Master_Log_File: mysql-bin.000007

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

做到这里我们的server3已经和master(server1)数据同步

在检查一下server2的数据同步:

Mysql的高可用MHA实现
Mysql的高可用MHA实现

说明server2的数据与master数据相同

3.两台slave服务器设置read_only(从库对外提供读服务,之所以没有写进配置文件,是因为随时slave会提升为master)

Mysql的高可用MHA实现
Mysql的高可用MHA实现

4.创建监控用户,给监控权限(在master上执行,也就是server1)

Mysql的高可用MHA实现

配置MHA

1.创建MHA的工作目录,并且创建相关配置文件

[[email protected] ~]# mkdir -p /etc/masterha

[[email protected] ~]# vim /etc/masterha/app.cnf

[server default]

manager_workdir=/etc/masterha ##设置manager的工作目录

manager_log=/etc/masterha/mha.log ##设置manager的日志存放位置

master_binlog_dir=/var/lib/mysql

##设置master 保存binlog的位置,以便MHA可以找到master的日志

master_ip_failover_script=/etc/masterha/master_ip_failover

##设置自动failover时候的切换脚本

master_ip_online_change_script=/etc/masterha/master_ip_online_change

##设置手动切换时候的切换脚本

[email protected]

##设置mysql中root用户的密码,这个密码是前文中创建监控用户的那个密码

user=root

ping_interval=1

##设置监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回应的时候自动进行railover

remote_workdir=/tmp

[email protected]

repl_user=westos

#report_script=/usr/local/send_report

secondary_check_script=/usr/bin/masterha_secondary_check -s 172.25.66.2 -s 172.25.66.3

##一旦MHA到server1的监控之间出现问题,MHA Manager将会尝试从server2登录到server3

#shutdown_script=""

ssh_user=root

[server1]

hostname=172.25.66.1

port=3306

[server2]

hostname=172.25.66.2

port=3306

candidate_master=1

##设置为候选master,如果设置该参数以后,发生主从切换以后将会将此从库提升为主库,即使这个主库 的数据不是最新的

check_repl_delay=0

##默认情况下,当主从差异大于100M时,mha就不会选择该主机为master,但是如果添加了该命令,就会忽略复制延时,使得该主机一定是master

[server3]

hostname=172.25.66.3

port=3306

#no_master=1

设置relay log的清除方式(在每个slave节点上)

Mysql的高可用MHA实现
Mysql的高可用MHA实现

设置定期清理relay脚本(两台slave服务器):

Server2:

[[email protected] ~]# vim purge_relay_log.sh

Mysql的高可用MHA实现

[[email protected] ~]# crontab -e

Mysql的高可用MHA实现

Server3:

Mysql的高可用MHA实现

purge_relay_logs脚本删除中继日志不会阻塞SQL线程,手动执行测试:

Mysql的高可用MHA实现

六.排错

1.检查SSH配置(server01 192.168.2.131 Monitor 监控节点上操作)

[[email protected] ~]# masterha_check_ssh --conf=/etc/masterha/app.cnf

Mysql的高可用MHA实现

可以看见各个节点ssh验证都是ok的。

检查整个复制环境状况(server4 Monitor 监控节点上操作),如下:

[[email protected] ~]# masterha_check_repl --conf=/etc/masterha/app.cnf

1)

Mysql的高可用MHA实现

这里的报错信息原因是Failover两种方式:一种是虚拟IP地址,一种是全局配置文件。MHA并没有限定使用哪一种方式,而是让用户自己选择,虚拟IP地址的方式会牵扯到其它的软件,比如keepalive软件,而且还要修改脚本master_ip_failover。所以先暂时注释master_ip_failover_script= /usr/local/bin/master_ip_failover这个选项。

[[email protected] ~]# vim /etc/masterha/app.cnf

Mysql的高可用MHA实现

2)

Mysql的高可用MHA实现

这个报错是因为master主机和备用主机的/etc/my.cnf文件中的内容指定不同,修改如下:

[[email protected] ~]# vim /etc/my.cnf

Mysql的高可用MHA实现

3)

Mysql的高可用MHA实现

这个报错是没有在备用主机里对复制用户授权

Mysql的高可用MHA实现

修改后检测成功:(已经没有报错)

Mysql的高可用MHA实现

这里有一个warning处理以下,报错提示server3上的log-bin没有指定,修改如下

[[email protected] ~]# vim /etc/my.cnf

Mysql的高可用MHA实现

[[email protected] ~]# /etc/init.d/mysqld restart

Stopping mysqld: [ OK ]

Starting mysqld: [ OK ]

3.检查MHA Manager的状态

通过master_check_status脚本查看Manager的状态:

Mysql的高可用MHA实现

注意:如果正常,会显示"PING_OK",否则会显示"NOT_RUNNING",这代表MHA监控没有开启。

开启MHA Manager监控(server4 操作)如下:

[[email protected] ~]# mkdir -p /var/log/masterha/app/

[[email protected]~]# nohup masterha_manager --conf=/etc/masterha/app.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app/manager.log 2>&1 &

[1] 1344

[[email protected] ~]# masterha_check_status --conf=/etc/masterha/app.cnf

app (pid:1344) is running(0:PING_OK), master:172.25.66.1

可以看见已经在监控了,而且master的主机为172.25.66.1

Mysql的高可用MHA实现

测试

当master down掉

在server2上授权复制:

mysql> GRANT REPLICATION SLAVE ON *.* TO 'westos'@'172.25.66.%' IDENTIFIED BY '[email protected]';

Query OK, 0 rows affected, 1 warning (0.12 sec)

将master即server1服务停掉,这时备用的master即server2会切换为真正的master,在slave可以查看此时的master:

[[email protected] ~]# /etc/init.d/mysqld stop

Stopping mysqld: [ OK ]

在server3上:

mysql> change master to master_host='172.25.66.2', master_user='westos',master_password='[email protected]',MASTER_AUTO_POSITION = 1;

Query OK, 0 rows affected, 2 warnings (0.55 sec)

2.在线切换master到不同主机

在很多情况下,有必要将master转移到其他主机上(如替换raid控制器,提升master机器硬件等等)。这并不是master崩溃,但是计划维护必须去做。

在server4上

[[email protected] ~]# masterha_master_switch --conf=/etc/masterha/app.cnf --master_state=alive --new_master_host=172.25.66.3 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000

##手动切换master

Mysql的高可用MHA实现
Mysql的高可用MHA实现

3.故障转移

[[email protected] masterha]# nohup masterha_manager --conf=/etc/masterha/app.cnf

[[email protected] ~]# kill -9 2502 2169

Mysql的高可用MHA实现

配置文件指定server1为备用master,当master挂掉时,server1会变成master

Mysql的高可用MHA实现