mha::masterfailover::main()->do_master_failover
init_config(): initialize the configuration
mha::servermanager::init_binlog_server: initialize the binlog server
check_settings()
is_gtid_auto_pos_enabled(): check whether GTID mode is in use
force_shutdown($dead_master):
phase 3.1: getting latest slaves phase..
phase 3.2: saving dead master's binlog phase..
phase 3.3: determining new master phase..
phase 3.3(3.4): new master diff log generation phase..
phase 3.4: master log apply phase..
recover_slaves_internal
phase 4.1: starting parallel slave diff log generation phase..
phase 4.2: starting parallel slave log apply phase..
reset_slave_on_new_master
phase 3.2: saving dead master's binlog phase.. (this step does not exist in GTID mode)
phase 3.3: new master recovery phase..
phase 4.1: starting slaves in parallel..
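The two phase sequences in the do_master_failover outline above can be restated as data. This is only a sketch of these notes, not MHA's Perl code: the names NON_GTID_PHASES, GTID_PHASES and failover_phases are hypothetical, and each list contains only the phases the notes name for that mode.

```python
from typing import List

# Phases the notes list for a non-GTID failover, in order.
NON_GTID_PHASES = [
    "3.1 getting latest slaves",
    "3.2 saving dead master's binlog",
    "3.3 determining new master",
    "3.3(3.4) new master diff log generation",
    "3.4 master log apply",
    "4.1 starting parallel slave diff log generation",
    "4.2 starting parallel slave log apply",
]

# Phases the notes list for a GTID failover. Phase 3.2 is left out
# because the notes say it does not exist in GTID mode.
GTID_PHASES = [
    "3.3 new master recovery",
    "4.1 starting slaves in parallel",
]


def failover_phases(gtid_mode: bool) -> List[str]:
    """Return a copy of the phase list for the given mode."""
    return list(GTID_PHASES if gtid_mode else NON_GTID_PHASES)
```

Calling failover_phases(True) returns the shorter GTID list, which shows at a glance that the diff log generation and log apply steps appear only in the non-GTID list.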
do_master_online_switch
identify_orig_master
identify_new_master
reject_update
read_slave_status
switch_master
switch_slaves
release_failover_advisory_lock
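The GTID-mode online switch follows the same flow as the non-GTID one; the only difference is in change_master_and_start_slave.
Today I tested a GTID online switch and ran into a very strange problem.
Problem: I set up a new replication group and ran an MHA online switch, and it left the environment in a mess.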
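To make the change_master_and_start_slave difference concrete, here is a minimal sketch. It assumes the difference is the usual MySQL one: GTID mode repoints a slave with MASTER_AUTO_POSITION=1, while non-GTID mode needs an explicit binlog file and position. The function build_change_master_sql and its parameters are hypothetical and are not MHA's Perl implementation; the password clause is omitted and values are not escaped.

```python
def build_change_master_sql(host, port, user, gtid_auto_pos,
                            log_file=None, log_pos=None):
    """Build a CHANGE MASTER TO statement for repointing a slave.

    Illustration only: no password clause, no escaping of values.
    """
    parts = [
        f"MASTER_HOST='{host}'",
        f"MASTER_PORT={int(port)}",
        f"MASTER_USER='{user}'",
    ]
    if gtid_auto_pos:
        # GTID mode: the slave negotiates its start point from its GTID set.
        parts.append("MASTER_AUTO_POSITION=1")
    else:
        # Non-GTID mode: the caller must supply binlog coordinates.
        if log_file is None or log_pos is None:
            raise ValueError("log_file and log_pos are required without GTID")
        parts.append(f"MASTER_LOG_FILE='{log_file}'")
        parts.append(f"MASTER_LOG_POS={int(log_pos)}")
    return "CHANGE MASTER TO " + ", ".join(parts)
```

For example, build_change_master_sql("10.0.0.2", 3306, "repl", True) yields a statement with no binlog coordinates at all, which is why the rest of the online switch flow can stay identical between the two modes.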
mha without binlog server
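Common configuration
Common commands
In an m,(s1,s2) topology, suppose s1 is the latest slave. If the master dies just as s1's purge-relay-log job has also run, can failover still proceed smoothly?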