In a one-master, two-slave setup, how do you elect a slave as the new master when the master goes down?

1969-12-31 23:50:00

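As shown in the figure:

If M goes down, how do we elect one of S1 and S2 as the new master?

Solution with traditional replication

(1) Check the slave status:

S1: show slave status;

S2: show slave status;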

root@localhost [(none)]>show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Reconnecting after a failed master event read

Master_Host: 192.168.91.22

Master_User: repl

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 6227

Relay_Log_File: relay-bin.000004

Relay_Log_Pos: 414

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Connecting

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 6227

Relay_Log_Space: 875

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: NULL -- changes from 0 to NULL once the master service stops, so this value cannot be used to judge whether the slave has finished syncing

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 2003

Last_IO_Error: error reconnecting to master 'repl@192.168.91.22:3306' - retry-time: 60 retries: 12

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 330622

Master_UUID: 83373570-fe03-11e6-bb0a-000c29c1b8a9

Master_Info_File: mysql.slave_master_info

SQL_Delay: 0

SQL_Remaining_Delay: NULL

Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

Master_Retry_Count: 86400

Master_Bind:

Last_IO_Error_Timestamp: 170415 23:08:25

Last_SQL_Error_Timestamp:

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:

Executed_Gtid_Set: 83373570-fe03-11e6-bb0a-000c29c1b8a9:1-33,

b30cdc47-216a-11e7-95a8-000c29565380:1-3

Auto_Position: 1

Replicate_Rewrite_DB:

Channel_Name:

Master_TLS_Version:
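To compare S1 and S2 in a script rather than by eye, the vertical output above can be turned into a dictionary. The following is a minimal sketch, not a complete parser: the function name `parse_slave_status` and the `SAMPLE` text (a few fields copied from the output above) are illustrative assumptions, and it assumes field names consist only of letters and underscores.

```python
import re

# A field line looks like "Name: value"; names contain only letters and underscores.
FIELD = re.compile(r"^([A-Za-z_]+):\s*(.*)$")

def parse_slave_status(text):
    """Turn vertical-format SHOW SLAVE STATUS output into a dict of strings."""
    status = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and the "1. row" separator
        match = FIELD.match(line)
        if match:
            last_key = match.group(1)
            status[last_key] = match.group(2).strip()
        elif last_key is not None:
            # Continuation line, e.g. the second UUID of Executed_Gtid_Set
            status[last_key] += line
    return status

# Illustrative excerpt of the output shown above (assumption: same field layout).
SAMPLE = """
*************************** 1. row ***************************
Master_Log_File: mysql-bin.000006
Read_Master_Log_Pos: 6227
Relay_Master_Log_File: mysql-bin.000006
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Exec_Master_Log_Pos: 6227
Retrieved_Gtid_Set:
Executed_Gtid_Set: 83373570-fe03-11e6-bb0a-000c29c1b8a9:1-33,
b30cdc47-216a-11e7-95a8-000c29565380:1-3
Auto_Position: 1
"""

status = parse_slave_status(SAMPLE)
print(status["Executed_Gtid_Set"])
```

All values stay strings, so positions such as Read_Master_Log_Pos need an `int()` conversion before they are compared numerically.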

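(2) Check whether each slave has finished applying what it received:

Binlog file and position on the master that the io_thread has read up to:

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 6227

Binlog file and position on the master that the sql_thread has executed up to:

Relay_Master_Log_File: mysql-bin.000006

Exec_Master_Log_Pos: 6227

When Master_Log_File = Relay_Master_Log_File && Read_Master_Log_Pos = Exec_Master_Log_Pos, the slave has applied everything it received from the master and is as synchronized as it can get.

If Master_Log_File = Relay_Master_Log_File but Read_Master_Log_Pos > Exec_Master_Log_Pos, the relay log has not been fully replayed yet. In that situation Slave_IO_Running shows Connecting (the master is unreachable) while Slave_SQL_Running is still Yes; waiting roughly 2-5 s is usually enough for the replay to finish.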
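The check in step (2) can be written down as a small predicate. This is only a sketch: the function name `relay_log_applied` is hypothetical, and the dictionaries stand in for one slave's `show slave status` fields (the first uses the values from the output above, the second is an invented lagging case).

```python
def relay_log_applied(status):
    """True when the SQL thread has executed everything the IO thread fetched."""
    same_file = status["Master_Log_File"] == status["Relay_Master_Log_File"]
    same_pos = int(status["Read_Master_Log_Pos"]) == int(status["Exec_Master_Log_Pos"])
    return same_file and same_pos

# Values taken from the status output of the slave shown earlier.
caught_up = {
    "Master_Log_File": "mysql-bin.000006",
    "Read_Master_Log_Pos": "6227",
    "Relay_Master_Log_File": "mysql-bin.000006",
    "Exec_Master_Log_Pos": "6227",
}

# Hypothetical slave that is still replaying its relay log.
lagging = dict(caught_up, Exec_Master_Log_Pos="5900")

print(relay_log_applied(caught_up), relay_log_applied(lagging))
```

A failover script would typically poll this predicate for a few seconds on each slave before moving on to the comparison step.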

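(3) Compare how far the two slaves have got:

Once S1 and S2 have each finished applying their relay logs, whichever is further ahead becomes the master. In most cases S1 and S2 are at the same position.

When S1.Relay_Master_Log_File = S2.Relay_Master_Log_File but S1.Exec_Master_Log_Pos > S2.Exec_Master_Log_Pos, S1 is further ahead, so choose S1 as the new master. (If the file names differ, the slave with the higher binlog file number is ahead.)

Alternatively, compare:

When S1.Master_Log_File = S2.Master_Log_File but S1.Read_Master_Log_Pos > S2.Read_Master_Log_Pos, S1 is further ahead, so choose S1 as the new master.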
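The comparison in step (3) amounts to ordering the slaves by (binlog file number, position). A minimal sketch follows; the helper names `binlog_coordinates` and `most_advanced` are hypothetical, the position for S2 is invented for illustration, and it assumes binlog files follow the usual `name.NNNNNN` pattern.

```python
def binlog_coordinates(status):
    """Return (file sequence number, position) executed by the SQL thread."""
    file_name = status["Relay_Master_Log_File"]       # e.g. "mysql-bin.000006"
    sequence = int(file_name.rsplit(".", 1)[1])       # numeric suffix of the file
    return (sequence, int(status["Exec_Master_Log_Pos"]))

def most_advanced(slaves):
    """Return the name of the slave with the highest executed coordinates.

    On a tie, the first slave in the mapping is returned.
    """
    return max(slaves, key=lambda name: binlog_coordinates(slaves[name]))

slaves = {
    "S1": {"Relay_Master_Log_File": "mysql-bin.000006", "Exec_Master_Log_Pos": "6227"},
    # Hypothetical: S2 stopped slightly earlier in the same file.
    "S2": {"Relay_Master_Log_File": "mysql-bin.000006", "Exec_Master_Log_Pos": "5900"},
}

print(most_advanced(slaves))
```

Comparing tuples handles the case where the two slaves are in different binlog files, because the file number is compared before the position.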

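(4) What if the data on S1 and S2 is inconsistent?

If S1 does end up ahead and S2 has less data than S1, make S1 the new master and send all application reads and writes to S1 first. Then repair the data on S2 with the pt-table-checksum and pt-table-sync tools, and afterwards let S2 share the load again.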

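Solution with GTID replication

(1) Check whether each slave has finished applying what it received:

Retrieved_Gtid_Set: 83373570-fe03-11e6-bb0a-000c29c1b8a9:22-28

Executed_Gtid_Set: 83373570-fe03-11e6-bb0a-000c29c1b8a9:1-28,

When every transaction in Retrieved_Gtid_Set is also contained in Executed_Gtid_Set (here the last retrieved transaction, 28, equals the last executed one, 28), the slave has applied everything it received from the master.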
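The GTID check is a subset test rather than a string comparison, since 22-28 and 1-28 are not textually equal. The sketch below illustrates the idea in plain Python; the names `parse_gtid_set` and `gtid_subset` are hypothetical, and it assumes the sets are printed in the normalized form MySQL uses (adjacent intervals already merged). On a live server the built-in SQL function GTID_SUBSET() performs the same test.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-28:30,uuid2:1-3' into {uuid: [(start, end), ...]}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue  # tolerate the trailing comma seen in the status output
        uuid, *intervals = part.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            ranges.append((int(start), int(end or start)))
    return result

def gtid_subset(subset, superset):
    """True when every transaction in `subset` also appears in `superset`."""
    sup = parse_gtid_set(superset)
    for uuid, ranges in parse_gtid_set(subset).items():
        for start, end in ranges:
            # Relies on `superset` being normalized: one interval must cover the range.
            if not any(s <= start and end <= e for s, e in sup.get(uuid, [])):
                return False
    return True

retrieved = "83373570-fe03-11e6-bb0a-000c29c1b8a9:22-28"
executed = "83373570-fe03-11e6-bb0a-000c29c1b8a9:1-28,"

print(gtid_subset(retrieved, executed))
```

An empty Retrieved_Gtid_Set, as in the first status output, counts as a subset of anything, which matches the intuition that nothing is waiting to be applied.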

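(2) Elect one slave as the master:

If S1.Executed_Gtid_Set = S2.Executed_Gtid_Set, pick either one as the master;

If S1.Executed_Gtid_Set > S2.Executed_Gtid_Set (S1 has executed every transaction S2 has, plus more), elect S1 as the master. S2 can then change master to S1 directly and become a slave of S1.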
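The election rule maps naturally onto Python set comparison, where `>` means "proper superset". The following sketch expands each GTID set into individual transactions, which is fine for illustration but impractical for sets with millions of transactions. The names `expand_gtids` and `elect` are hypothetical, the S2 value is invented, and the "diverged" outcome (neither set contains the other) is a case the steps above do not cover.

```python
def expand_gtids(text):
    """Expand a GTID set string into a Python set of (uuid, transaction_id)."""
    gtids = set()
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing comma
        uuid, *intervals = part.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            for txn in range(int(start), int(end or start) + 1):
                gtids.add((uuid.lower(), txn))
    return gtids

def elect(s1_executed, s2_executed):
    """Return 'either', 'S1', 'S2', or 'diverged' based on executed GTIDs."""
    s1, s2 = expand_gtids(s1_executed), expand_gtids(s2_executed)
    if s1 == s2:
        return "either"
    if s1 > s2:          # S1 is a proper superset of S2
        return "S1"
    if s2 > s1:
        return "S2"
    return "diverged"    # each side has transactions the other lacks

s1_executed = "83373570-fe03-11e6-bb0a-000c29c1b8a9:1-28"
s2_executed = "83373570-fe03-11e6-bb0a-000c29c1b8a9:1-26"  # hypothetical lagging S2

print(elect(s1_executed, s2_executed))
```

When the result is "diverged", neither slave can simply replicate from the other, and a consistency check and repair would be needed before promoting one of them.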

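What about the failed master?

(1) Point the old master at the new master with change master to, then run a master-slave consistency check and repair the data.

(2) If the data on the old master is corrupted, it has to be rebuilt and added to the new master as a slave.

How do you temporarily stop writes on the master?

(1) Change the password. This does not affect connections that are already open, so remember to kill all existing connections.

(2) flush tables with read lock

(3) Enable the parameter super_read_only=on

(4) Block port 3306 at the firewall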
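Options (1) and (3) can be combined into a list of statements to run on the old master, provided it is still reachable. The sketch below only builds the SQL text and does not connect to a server. The function name `fence_statements`, the sample process list, and the choice to spare the `repl` user (so the slaves can keep draining the binlog) are assumptions; the row keys follow the `Id` and `User` columns of SHOW PROCESSLIST.

```python
def fence_statements(processlist, own_id,
                     keep_users=("system user", "event_scheduler", "repl")):
    """Build SQL that makes the server read-only and drops client connections."""
    statements = ["SET GLOBAL super_read_only = ON;"]
    for row in processlist:
        if row["Id"] == own_id:
            continue  # do not kill the session issuing these statements
        if row["User"] in keep_users:
            continue  # leave replication and scheduler threads alone
        statements.append("KILL {};".format(int(row["Id"])))
    return statements

# Hypothetical rows, shaped like the Id and User columns of SHOW PROCESSLIST.
processlist = [
    {"Id": 5, "User": "system user"},
    {"Id": 12, "User": "repl"},
    {"Id": 20, "User": "app"},
    {"Id": 21, "User": "root"},
]

statements = fence_statements(processlist, own_id=21)
for statement in statements:
    print(statement)
```

Setting the read-only flag before killing connections means that clients which reconnect immediately can no longer write.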

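Summary:

The switchover procedure for electing a slave as the new master when the master of a one-master, two-slave setup goes down (when everything goes smoothly, the whole process takes roughly 1-5 seconds):

(1) Change the master's password and drop all connections

(2) Check how far S1 and S2 have synchronized

(3) Elect the new master

(4) Direct write traffic to the new master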
