Mysql的高可用MHA實作
環境:rhel6.5
Server1:172.25.66.1 master
Server2:172.25.66.2 Candicate slave
Server3:172.25.66.3 slave
Server4: 172.25.66.4 monitor
Server1是master,Server2和server3是server1的slave,其中master對外提供寫服務,備選master(實際的slave,主機名server2)提供讀服務,另一個slave也提供相關的讀服務,一旦master當機,将會把備選master提升為新的master,slave指向新的master。
一.基礎知識
1.MHA構架組成:MHA Manager(管理節點)和MHA Node(資料節點)
mha支援多套主從切換,隻要編寫多個配置檔案即可,例:app1.conf,app2.conf,...,appn.conf
工作機制
MHA Manager可以單獨部署在一台獨立的機器上管理多個master-slave叢集,也可以部署在一台slave節點上。MHA Node運作在每台MySQL伺服器上,MHA Manager會定時探測叢集中的master節點,當master出現故障時,它可以自動将最新資料的slave提升為新的master,然後将所有其他的slave重新指向新的master。整個故障轉移過程對應用程式完全透明。
在MHA自動故障切換過程中,MHA試圖從當機的主伺服器上儲存二進制日志,最大程度的保證資料的不丢失,但這并不總是可行的。例如,如果主伺服器硬體故障或無法通過ssh通路,MHA沒法儲存二進制日志,隻進行故障轉移而丢失了最新的資料。使用MySQL 5.5的半同步複制,可以大大降低資料丢失的風險。MHA可以與半同步複制結合起來。如果隻有一個slave已經收到了最新的二進制日志,MHA可以将最新的二進制日志應用于其他所有的slave伺服器上,是以可以保證所有節點的資料一緻性。
安裝軟體
1.在所有節點都要安裝MHA node所需的perl子產品(DBD:mysql)
[[email protected] ~]# yum install perl-DBD-MySQL -y
[[email protected] ~]# yum install perl-DBD-MySQL -y
[[email protected] ~]# yum install perl-DBD-MySQL -y
[[email protected] ~]# yum install perl-DBD-MySQL -y
[[email protected] ~]# yum install -y perl-devel perl-CPAN
[[email protected] mha]# ls
##在安裝perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes需要解決依賴性
perl-Config-Tiny-2.12-7.1.el6.noarch.rpm
perl-Email-Date-Format-1.002-5.el6.noarch.rpm
perl-Log-Dispatch-2.27-1.el6.noarch.rpm
perl-Mail-Sender-0.8.16-3.el6.noarch.rpm
perl-Mail-Sendmail-0.79-12.el6.noarch.rpm
perl-MailTools-2.04-4.el6.noarch.rpm
perl-MIME-Lite-3.027-2.el6.noarch.rpm
perl-MIME-Lite-HTML-1.23-2.el6.noarch.rpm
perl-MIME-Types-1.28-2.el6.noarch.rpm
perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm
perl-Params-Validate-0.92-3.el6.x86_64.rpm
perl-TimeDate-1.16-13.el6.noarch.rpm
[[email protected] mha]# yum install -y perl-*
2.
[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
3.manager上要安裝MHA node和MHA Manager
[[email protected] ~]# yum install -y mha4mysql-node-0.56-0.el6.noarch.rpm
[[email protected] ~]# yum install -y mha4mysql-manager-0.56-0.el6.noarch.rpm
總之
監控機上:
mha4mysql-node-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
依賴包:
perl-Config-Tiny.noarch 0:2.12-7.1.el6
perl-Email-Date-Format.noarch 0:1.002-5.el6
perl-Log-Dispatch.noarch 0:2.27-1.el6
perl-MIME-Lite.noarch 0:3.027-2.el6
perl-MIME-Lite-HTML.noarch 0:1.23-2.el6
perl-MIME-Types.noarch 0:1.28-2.el6
perl-Mail-Sender.noarch 0:0.8.16-3.el6
perl-Mail-Sendmail.noarch 0:0.79-12.el6
perl-Parallel-ForkManager.noarch 0:0.7.9-1.el6
其他節點上:
mha4mysql-node-0.56-0.el6.noarch.rpm
三.配置ssh無密碼登陸
把各個節點的authorized_keys的内容互相拷貝加入到對方的此檔案中,然後就可以免密碼彼此ssh連入。
Server1:
[[email protected] ~]# ssh-keygen -t rsa ##一路回車
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
51:7f:d4:78:9a:86:86:de:99:42:29:81:18:6b:4f:d0 [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| o+ . . .o |
| .oE .. . .. o|
| o . .. o...+ |
| . o ..+ o.+ |
| . S+ o + |
| o + |
| . |
| |
| |
+-----------------+
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
Server2:
[[email protected] ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a0:3a:e2:74:e2:5d:d0:c2:11:f0:2d:7a:13:61:e5:e8 [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| ..+.. |
| o * |
| * o. |
| + =. . |
| . E.. S |
| ..+ |
|.oo. . |
|+.+.. |
| o . |
+-----------------+
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
Server3:
[[email protected] ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
1e:6e:21:26:f6:14:25:e2:d9:bb:56:33:7b:c1:3b:42 [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| . . . |
| . + o |
| o o |
| o . |
| o = E o |
| . = B * o |
| + * + |
| . . o . |
| |
+-----------------+
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
Server4:
[[email protected] ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
8b:47:01:ea:92:3c:92:e2:19:9a:1b:f9:e9:32:17:c3 [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| . |
| . . |
| . . |
| o o . |
|+o= . S |
|++Eo o . |
|=o o . o |
|o+.. . |
|.=+ |
+-----------------+
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
[[email protected] ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
配置完後測試: server1,2,3可以互相無密碼登陸,server4可無密碼登陸其他第三個主機
方法一:
Server1:
[[email protected] ~]# ssh [email protected]
Last login: Sat Aug 5 07:54:27 2017 from 172.25.66.250
[[email protected] ~]# logout
Connection to 172.25.66.3 closed.
[[email protected] ~]# ssh [email protected]
Last login: Sat Aug 5 07:54:17 2017 from 172.25.66.250
[[email protected] ~]# logout
Connection to 172.25.66.2 closed.
server2,server3也做同樣的測試,此處省略部分步驟。
[[email protected] ~]# ssh [email protected]
Last login: Sat Aug 5 11:36:33 2017 from 172.25.66.1
[[email protected] ~]# logout
Connection to 172.25.66.2 closed.
[[email protected] ~]# ssh [email protected]
Last login: Sat Aug 5 11:36:26 2017 from 172.25.66.1
[[email protected] ~]# ssh [email protected]
Last login: Sat Aug 5 09:39:05 2017 from 172.25.66.250
方法二:
Server4:
ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
dd:ab:40:3e:84:b2:41:79:97:29:df:65:62:1f:6c:eb [email protected]
The key's randomart image is:
+--[ RSA 2048]----+
| |
| . o . |
| o o + o * |
| . . = + B o |
| o . S o + |
| + + . . |
| . + E |
| o . |
| . |
+-----------------+
2 cd /root/.ssh/
3 ssh-copy-id server4
4 rsync -p * server2:~/.ssh/ ##yum install -y rsync
5 rsync -p * server3:~/.ssh/
6 rsync -p * server1:~/.ssh/
四.搭建主從複制環境
Server2的資料已經和server1(master)資料同步
下面配置首先使server3與server1(master)資料相同,然後進行同步;
注意:在同步資料之前務必保證資料的一緻,binlog-do-db 和 replicate-ignore-db 設定必須相同,MHA 在啟動時候會檢測過濾規則,如果過濾規則不同,MHA 不啟動監控和故障轉移。
1.在Master上備份一份完整的資料
[[email protected] ~]# mysqldump -uroot [email protected] --master-data=2 --single-transaction -R --triggers -A > all.sql
##将server1上資料庫的所有資料備份在檔案all.sql中,其中--master-data=2代表備份時刻記錄master的Binlog位置和Position,--single-transaction意思是擷取一緻性快照,-R意思是備份存儲過程和函數,--triggres的意思是備份觸發器,-A代表備份所有的庫。
配置server3為從機,把資料備份複制到server3上
首先配置server3上的半同步複制
[[email protected] ~]# yum install -y mysql-community-client-5.7.17-1.el6.x86_64.rpm mysql-community-common-5.7.17-1.el6.x86_64.rpm mysql-community-libs-5.7.17-1.el6.x86_64.rpm mysql-community-libs-compat-5.7.17-1.el6.x86_64.rpm mysql-community-server-5.7.17-1.el6.x86_64.rpm
[[email protected] ~]# vim /etc/my.cnf
server-id=3 ##server-id每個節點均不相同,範圍是1到2^32-1
log-bin=mysql-bin
binlog-do-db=test
binlog-ignore-db=mysql
gtid_mode=ON
enforce-gtid-consistency=true
slave-parallel-type=LOGICAL_CLOCK
slave-parallel-workers=16
master_info_repository=TABLE
relay_log_info_repository=TABLE
relay_log_recovery=ON
##所有節點上的配置檔案除了server-id之外必須相同,因為當master節點down掉了,需要slave接管,是以配置要相同。
[[email protected] ~]# /etc/init.d/mysqld start
Initializing MySQL database: [ OK ]
Installing validate password plugin: [ OK ]
Starting mysqld: [ OK ]
[[email protected] ~]# mysql -p
mysql> show databases;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.
mysql> ALTER USER [email protected] identified by '[email protected]';
Query OK, 0 rows affected (0.00 sec)
mysql> ^DBye
[[email protected] ~]# mysql -uroot [email protected] < ./all.sql
[[email protected] ~]# scp all.sql [email protected]:/root/
all.sql 100% 759KB 758.7KB/s 00:00
[[email protected] ~]# mysql [email protected]
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| test |
+--------------------+
5 rows in set (0.00 sec)
mysql> stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> change master to master_host='172.25.66.1', master_user='westos',master_password='[email protected]',MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.57 sec)
mysql> start slave;
Query OK, 0 rows affected (1.13 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.25.66.1
Master_User: westos
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000007
Read_Master_Log_Pos: 493
Relay_Log_File: server3-relay-bin.000002
Relay_Log_Pos: 414
Relay_Master_Log_File: mysql-bin.000007
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
做到這裡我們的server3已經和master(server1)資料同步
在檢查一下server2的資料同步:
說明server2的資料與master資料相同
3.兩台slave伺服器設定read_only(從庫對外提供讀服務,之是以沒有寫進配置檔案,是因為随時slave會提升為master)
4.建立監控使用者,給監控權限(在master上執行,也就是server1)
配置MHA
1.建立MHA的工作目錄,并且建立相關配置檔案
[[email protected] ~]# mkdir -p /etc/masterha
[[email protected] ~]# vim /etc/masterha/app.cnf
[server default]
manager_workdir=/etc/masterha ##設定manager的工作目錄
manager_log=/etc/masterha/mha.log ##設定manager的日志存放位置
master_binlog_dir=/var/lib/mysql
##設定master 儲存binlog的位置,以便MHA可以找到master的日志
master_ip_failover_script=/etc/masterha/master_ip_failover
##設定自動failover時候的切換腳本
master_ip_online_change_script=/etc/masterha/master_ip_online_change
##設定手動切換時候的切換腳本
[email protected]
##設定mysql中root使用者的密碼,這個密碼是前文中建立監控使用者的那個密碼
user=root
ping_interval=1
##設定監控主庫,發送ping包的時間間隔,預設是3秒,嘗試三次沒有回應的時候自動進行railover
remote_workdir=/tmp
[email protected]
repl_user=westos
#report_script=/usr/local/send_report
secondary_check_script=/usr/bin/masterha_secondary_check -s 172.25.66.2 -s 172.25.66.3
##一旦MHA到server1的監控之間出現問題,MHA Manager将會嘗試從server2登入到server3
#shutdown_script=""
ssh_user=root
[server1]
hostname=172.25.66.1
port=3306
[server2]
hostname=172.25.66.2
port=3306
candidate_master=1
##設定為候選master,如果設定該參數以後,發生主從切換以後将會将此從庫提升為主庫,即使這個主庫 的資料不是最新的
check_repl_delay=0
##預設情況下,當主從差異大于100M時,mha就不會選擇該主機為master,但是如果添加了該指令,就會忽略複制延時,使得該主機一定是master
[server3]
hostname=172.25.66.3
port=3306
#no_master=1
設定relay log的清除方式(在每個slave節點上)
設定定期清理relay腳本(兩台slave伺服器):
Server2:
[[email protected] ~]# vim purge_relay_log.sh
[[email protected] ~]# crontab -e
Server3:
purge_relay_logs腳本删除中繼日志不會阻塞SQL線程,手動執行測試:
六.排錯
1.檢查SSH配置(server01 192.168.2.131 Monitor 監控節點上操作)
[[email protected] ~]# masterha_check_ssh --conf=/etc/masterha/app.cnf
可以看見各個節點ssh驗證都是ok的。
檢查整個複制環境狀況(server4 Monitor 監控節點上操作),如下:
[[email protected] ~]# masterha_check_repl --conf=/etc/masterha/app.cnf
1)
這裡的報錯資訊原因是Failover兩種方式:一種是虛拟IP位址,一種是全局配置檔案。MHA并沒有限定使用哪一種方式,而是讓使用者自己選擇,虛拟IP位址的方式會牽扯到其它的軟體,比如keepalive軟體,而且還要修改腳本master_ip_failover。是以先暫時注釋master_ip_failover_script= /usr/local/bin/master_ip_failover這個選項。
[[email protected] ~]# vim /etc/masterha/app.cnf
2)
這個報錯是因為master主機和備用主機的/etc/my.cnf檔案中的内容指定不同,修改如下:
[[email protected] ~]# vim /etc/my.cnf
3)
這個報錯是沒有在備用主機裡對複制使用者授權
修改後檢測成功:(已經沒有報錯)
這裡有一個warning處理以下,報錯提示server3上的log-bin沒有指定,修改如下
[[email protected] ~]# vim /etc/my.cnf
[[email protected] ~]# /etc/init.d/mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
3.檢查MHA Manager的狀态
通過master_check_status腳本檢視Manager的狀态:
注意:如果正常,會顯示"PING_OK",否則會顯示"NOT_RUNNING",這代表MHA監控沒有開啟。
開啟MHA Manager監控(server4 操作)如下:
[[email protected] ~]# mkdir -p /var/log/masterha/app/
[[email protected]~]# nohup masterha_manager --conf=/etc/masterha/app.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app/manager.log 2>&1 &
[1] 1344
[[email protected] ~]# masterha_check_status --conf=/etc/masterha/app.cnf
app (pid:1344) is running(0:PING_OK), master:172.25.66.1
可以看見已經在監控了,而且master的主機為172.25.66.1
測試
當master down掉
在server2上授權複制:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'westos'@'172.25.66.%' IDENTIFIED BY '[email protected]';
Query OK, 0 rows affected, 1 warning (0.12 sec)
将master即server1服務停掉,這時備用的master即server2會切換為真正的master,在slave可以檢視此時的master:
[[email protected] ~]# /etc/init.d/mysqld stop
Stopping mysqld: [ OK ]
在server3上:
mysql> change master to master_host='172.25.66.2', master_user='westos',master_password='[email protected]',MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.55 sec)
2.線上切換master到不同主機
在很多情況下,有必要将master轉移到其他主機上(如替換raid控制器,提升master機器硬體等等)。這并不是master崩潰,但是計劃維護必須去做。
在server4上
[[email protected] ~]# masterha_master_switch --conf=/etc/masterha/app.cnf --master_state=alive --new_master_host=172.25.66.3 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
##手動切換master
3.故障轉移
[[email protected] masterha]# nohup masterha_manager --conf=/etc/masterha/app.cnf
[[email protected] ~]# kill -9 2502 2169
配置檔案指定server1為備用master,當master挂掉時,server1會變成master