mha::masterfailover::main()->do_master_failover
init_config(): initialize the configuration
mha::servermanager::init_binlog_server: initialize the binlog server
check_settings()
is_gtid_auto_pos_enabled(): check whether GTID mode is in use
force_shutdown($dead_master):
phase 3.1: getting latest slaves phase..
phase 3.2: saving dead master's binlog phase..
phase 3.3: determining new master phase..
phase 3.3(3.4): new master diff log generation phase..
phase 3.4: master log apply phase..
recover_slaves_internal
phase 4.1: starting parallel slave diff log generation phase..
phase 4.2: starting parallel slave log apply phase..
reset_slave_on_new_master
phase 3.2: saving dead master's binlog phase.. (this step does not exist in GTID mode)
phase 3.3: new master recovery phase..
phase 4.1: starting slaves in parallel..
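
Phase 3.1 in the failover outline above ("getting latest slaves") has to decide which slave(s) received the most binlog events from the dead master. The sketch below shows one way to express that comparison, assuming the decision is based on the Master_Log_File and Read_Master_Log_Pos fields of SHOW SLAVE STATUS. The `SlaveStatus` class and both helper functions are hypothetical names used for illustration only; this is not MHA's own code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SlaveStatus:
    """Hypothetical container for a subset of SHOW SLAVE STATUS fields."""
    host: str
    master_log_file: str       # Master_Log_File, e.g. "mysql-bin.000012"
    read_master_log_pos: int   # Read_Master_Log_Pos


def io_position(slave: SlaveStatus) -> Tuple[int, int]:
    """Return (binlog sequence number, offset) so positions compare correctly."""
    sequence = int(slave.master_log_file.rsplit(".", 1)[1])
    return (sequence, slave.read_master_log_pos)


def latest_slaves(slaves: List[SlaveStatus]) -> List[SlaveStatus]:
    """Return every slave that has read the furthest from the master."""
    if not slaves:
        raise ValueError("no slaves to compare")
    best = max(io_position(s) for s in slaves)
    return [s for s in slaves if io_position(s) == best]


if __name__ == "__main__":
    candidates = [
        SlaveStatus("s1", "mysql-bin.000012", 500),
        SlaveStatus("s2", "mysql-bin.000012", 300),
        SlaveStatus("s3", "mysql-bin.000011", 9000),
    ]
    print([s.host for s in latest_slaves(candidates)])
```

The file name is compared by its numeric suffix first, so a large offset in an older binlog file (s3 here) does not beat a smaller offset in a newer file.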
do_master_online_switch
identify_orig_master
identify_new_master
reject_update
read_slave_status
switch_master
switch_slaves
release_failover_advisory_lock
The online switch in GTID mode follows the same flow as in non-GTID mode; the only difference is inside change_master_and_start_slave.
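
To make that difference concrete, here is a minimal sketch of how the statement issued on each slave could differ between the two modes, assuming GTID mode relies on MASTER_AUTO_POSITION=1 while non-GTID mode passes an explicit binlog file and offset. The function name `build_change_master_sql` and its parameters are hypothetical; the password clause and proper SQL escaping are left out for brevity.

```python
from typing import Optional


def build_change_master_sql(
    host: str,
    port: int,
    user: str,
    gtid_mode: bool,
    log_file: Optional[str] = None,
    log_pos: Optional[int] = None,
) -> str:
    """Build a CHANGE MASTER TO statement for either replication mode."""
    parts = [
        f"MASTER_HOST='{host}'",
        f"MASTER_PORT={port}",
        f"MASTER_USER='{user}'",
    ]
    if gtid_mode:
        # GTID mode: the slave negotiates its start point from its GTID set.
        parts.append("MASTER_AUTO_POSITION=1")
    else:
        # Non-GTID mode: explicit coordinates on the new master are required.
        if log_file is None or log_pos is None:
            raise ValueError("log_file and log_pos are required without GTID")
        parts.append(f"MASTER_LOG_FILE='{log_file}'")
        parts.append(f"MASTER_LOG_POS={log_pos}")
    return "CHANGE MASTER TO " + ", ".join(parts)


if __name__ == "__main__":
    print(build_change_master_sql("10.0.0.2", 3306, "repl", gtid_mode=True))
    print(build_change_master_sql("10.0.0.2", 3306, "repl", gtid_mode=False,
                                  log_file="mysql-bin.000003", log_pos=154))
```

The host, user, and binlog coordinates in the usage lines are placeholder values, not taken from a real environment.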
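Today I tested a GTID online switch and ran into a very strange problem.
Problem: I had just built a new replication group and ran an MHA online switch on it, but the switch left the environment in a mess.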
mha without a binlog server
Common configuration
Common commands
Take an m,(s1,s2) topology and suppose s1 is the latest slave. If the master goes down just as the purge-relay-log job on s1 has run, can failover still proceed smoothly?
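
One way to reason about this question is a simplified model, sketched below. It assumes the non-GTID flow, where a lagging slave is recovered from a diff generated out of the latest slave's relay logs, so the diff can only be built if the latest slave still retains every event the lagging slave is missing. Positions are (binlog sequence number, offset) pairs on the dead master; the function `can_build_diff` and its parameters are hypothetical and do not correspond to MHA functions.

```python
from typing import Tuple

# (binlog sequence number, offset) on the dead master
Position = Tuple[int, int]


def can_build_diff(
    lagging_read_pos: Position,
    latest_oldest_retained_pos: Position,
    latest_read_pos: Position,
) -> bool:
    """Check whether the latest slave still holds the events a lagging slave lacks.

    lagging_read_pos:           how far the lagging slave (s2) has read
    latest_oldest_retained_pos: oldest master position still present in the
                                latest slave's (s1) relay logs after purging
    latest_read_pos:            how far the latest slave (s1) has read
    """
    if lagging_read_pos >= latest_read_pos:
        # The "lagging" slave is not behind at all; no diff is needed.
        return True
    # The diff must start at the lagging slave's position, so that position
    # has to fall inside what the latest slave still retains.
    return latest_oldest_retained_pos <= lagging_read_pos


if __name__ == "__main__":
    # s1 purged only events older than what s2 already has: diff is possible.
    print(can_build_diff((12, 300), (12, 100), (12, 500)))
    # s1 purged events that s2 has not yet received: diff cannot be built.
    print(can_build_diff((12, 300), (12, 400), (12, 500)))
```

Under this model, the risky case is the second one: the purge job on s1 removed relay log events that s2 never received, so s2 could not be brought up to s1's position from s1's relay logs alone.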