天天看点

MySQL8.0 Xtrabackup备份问题

一、背景

MySQL 8.0已经成为最受瞩目的版本,增加了许多新特性,如Clone Plugin、Hash Join等等。由于新的MySQL重做日志和数据字典格式,所以推出新的XtraBackup 8.0版本,本文基于MySQL产品在自动搭建备库时遇到问题做出解析。

1.1 测试环境

  • MySQL 8.0.18
  • xtrabackup 8.0.8
  • binlog_format=raw,gtid_mode=on,log-bin=1
  • sysbench 1.0.17

1.2 问题来源

某天,QFusion测试人员遇到了MySQL 8.0在备库搭建时报错,需要DBA同学跟进问题。首先通过show slave status查看报错信息,发现从库的sql thread出现异常。

root@localhost : (none) 08:56:40> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: test-58-aa-f8aa500-headless
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000085
          Read_Master_Log_Pos: 1751589
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 416
        Relay_Master_Log_File: mysql-bin.000050
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1062
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758' at master log , end_log_pos 16052967. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 15520378
              Relay_Log_Space: 19285319437
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758' at master log , end_log_pos 16052967. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 41796
                  Master_UUID: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 191216 18:56:49
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758-48217
            Executed_Gtid_Set: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:1-25757
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
            Network_Namespace:
1 row in set (0.00 sec)           

通过报错信息中的提示,发现是出现了主键冲突。

root@localhost : (none) 08:57:05> select * from performance_schema.replication_applier_status_by_worker limit 1\G
*************************** 1. row ***************************
                                           CHANNEL_NAME:
                                              WORKER_ID: 1
                                              THREAD_ID: NULL
                                          SERVICE_STATE: OFF
                                      LAST_ERROR_NUMBER: 1062
                                     LAST_ERROR_MESSAGE: Worker 1 failed executing transaction '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758' at master log , end_log_pos 16052967; Could not execute Write_rows event on table test.sbtest24; Duplicate entry '1529529' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 16052967
                                   LAST_ERROR_TIMESTAMP: 2019-12-16 18:56:49.326384
                               LAST_APPLIED_TRANSACTION:
     LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
    LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
         LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
           LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
                                   APPLYING_TRANSACTION: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758
         APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2019-12-16 17:31:13.653629
        APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2019-12-16 17:31:13.653629
             APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2019-12-16 18:56:49.325745
                 LAST_APPLIED_TRANSACTION_RETRIES_COUNT: 0
   LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
  LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
                     APPLYING_TRANSACTION_RETRIES_COUNT: 0
       APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
      APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
    APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
1 row in set (0.00 sec)           

查看当前slave最新的gtid信息

root@localhost : (none) 12:05:14> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000008
         Position: 155
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:1-25757
1 row in set (0.00 sec)           

查看xtrabackup备份时的binlog的pos和gtid信息

[root@test-58-aa-f8aa501-0 mydata]# cat xtrabackup_info
...
binlog_pos = filename 'mysql-bin.000050', position '33110447', GTID of the last change '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:1-25757'
...           

此时对比发现,slave的gtid信息和备份完成时的gtid是一致的,那为什么还是出现了主键冲突的情况呢?

此时我们根据pos位点解析master的binlog信息

[root@test-58-aa-f8aa500-0 archive]# mysqlbinlog -vv mysql-bin.000050 --start-position=33110447 | head -20
mysqlbinlog: [Warning] unknown variable 'loose_default-character-set=utf8mb4'.
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 195
#191216 17:31:12 server id 41796  end_log_pos 124 CRC32 0xc9ad2c11     Start: binlog v 4, server v 8.0.18 created 191216 17:31:12
BINLOG '
4E73XQ9EowAAeAAAAHwAAAAAAAQAOC4wLjE4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEwANAAgAAAAABAAEAAAAYAAEGggAAAAICAgCAAAACgoKKioAEjQA
CgERLK3J
'/*!*/;
# at 33110447
#191216 17:31:13 server id 41796  end_log_pos 33110527 CRC32 0x27442dd4     GTID    last_committed=15    sequence_number=34    rbr_only=yes    original_committed_timestamp=1576488674113763    immediate_commit_timestamp=1576488674113763    transaction_length=1034680
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1576488674113763 (2019-12-16 17:31:14.113763 CST)
# immediate_commit_timestamp=1576488674113763 (2019-12-16 17:31:14.113763 CST)
/*!80001 SET @@session.original_commit_timestamp=1576488674113763*//*!*/;
/*!80014 SET @@session.original_server_version=80018*//*!*/;
/*!80014 SET @@session.immediate_server_version=80018*//*!*/;
SET @@SESSION.GTID_NEXT= '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25776'/*!*/;
# at 33110527           

根据binlog的pos位点,发现下一个事务GTID为4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25776;而备份结束的GTID为4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25757。

### 主库查看4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758的执行语句
[root@test-58-aa-f8aa500-0 archive]# mysqlbinlog -vv mysql-bin.000050 | grep -B 50 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25759
...
### INSERT INTO `test`.`sbtest24`
### SET
###   @1=1532200 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2512693 /* INT meta=0 nullable=0 is_null=0 */
###   @3='89203990588-69103074818-52114821100-93439894362-23847143576-07484173808-05428369926-02975913993-41574550041-26938280355' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='45342050513-61911498937-97819152099-08587625932-60582159210' /* STRING(180) meta=65204 nullable=0 is_null=0 */
### INSERT INTO `test`.`sbtest24`
### SET
###   @1=1532201 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2516379 /* INT meta=0 nullable=0 is_null=0 */
###   @3='16016112465-66688149957-19003543589-12752783330-10140834595-17946818527-54058037200-83940103252-07788820899-59081899782' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='02966014962-81012143426-29059376957-22789496327-25833494842' /* STRING(180) meta=65204 nullable=0 is_null=0 */
### INSERT INTO `test`.`sbtest24`
### SET
###   @1=1532202 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2707896 /* INT meta=0 nullable=0 is_null=0 */
###   @3='16925828637-58090054889-95780407302-10293804842-12632614860-03811213734-46471823942-17333962963-31790381481-07019946586' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='77178758790-70128854370-00264407205-64007208401-22279618099' /* STRING(180) meta=65204 nullable=0 is_null=0 */
# at 16555027
#191216 17:31:11 server id 41796  end_log_pos 16555058 CRC32 0x6f8135d4         Xid = 36324
COMMIT/*!*/;
# at 16555058
#191216 17:31:11 server id 41796  end_log_pos 16555138 CRC32 0x7fcb97ac         GTID    last_committed=0        sequence_number=17      rbr_only=yes    original_committed_timestamp=1576488673657448   immediate_commit_timestamp=1576488673657448     transaction_length=1034680
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1576488673657448 (2019-12-16 17:31:13.657448 CST)
# immediate_commit_timestamp=1576488673657448 (2019-12-16 17:31:13.657448 CST)
/*!80001 SET @@session.original_commit_timestamp=1576488673657448*//*!*/;
/*!80014 SET @@session.original_server_version=80018*//*!*/;
/*!80014 SET @@session.immediate_server_version=80018*//*!*/;
SET @@SESSION.GTID_NEXT= '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25759'/*!*/;

### 从库查看事务4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758的数据是否存在
root@localhost : (none) 09:23:04> select * from test.sbtest24 order by id desc limit 10;
+---------+---------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id      | k       | c                                                                                                                       | pad                                                         |
+---------+---------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 1532202 | 2707896 | 16925828637-58090054889-95780407302-10293804842-12632614860-03811213734-46471823942-17333962963-31790381481-07019946586 | 77178758790-70128854370-00264407205-64007208401-22279618099 |
| 1532201 | 2516379 | 16016112465-66688149957-19003543589-12752783330-10140834595-17946818527-54058037200-83940103252-07788820899-59081899782 | 02966014962-81012143426-29059376957-22789496327-25833494842 |
| 1532200 | 2512693 | 89203990588-69103074818-52114821100-93439894362-23847143576-07484173808-05428369926-02975913993-41574550041-26938280355 | 45342050513-61911498937-97819152099-08587625932-60582159210 |
| 1532199 | 2502576 | 09802368960-91917733453-96239669598-54904648916-77296852719-98016730243-90741158106-18793912721-49864482275-01438682979 | 97383280864-09719424624-52657831677-16997497487-42888128592 |
| 1532198 | 2492769 | 50767886676-93929990978-69338677932-87282029008-76889813973-90185628217-96733941319-68335970853-94711333694-25795217601 | 19336837263-95804875039-06595487043-70523328561-09413324879 |
| 1532197 | 1955396 | 00397406795-74747838176-71394165168-41273836833-69479591787-42154528121-51771653744-67255388362-49886388719-58697178839 | 02673409369-19296574832-28003465943-79301421425-80480578998 |
| 1532196 | 2493231 | 86498404871-42108054116-40071340969-11350042446-06300939909-05201025379-06411857335-03954766664-39537185462-91490507327 | 84818317688-20764928816-31090759998-91731449425-90118225230 |
| 1532195 | 2499593 | 77660688834-45252238334-24137839414-65367466774-19337774130-53076983367-01390519320-71314042514-11226819773-51096244448 | 21997720631-75948312175-91821894792-80556960963-06584097138 |
| 1532194 | 2515138 | 66646860538-72521674543-39777861094-01574256160-18755135370-00487912454-64289722053-52102961561-41410923742-00667455284 | 84070066757-97543385536-80074118200-18008298945-61299807560 |
| 1532193 | 2517655 | 50560105798-09418525134-35753621841-55990140866-27080595470-36899245149-55668180470-33559752487-83646593848-17843254144 | 79647081405-62007589765-85313418965-19437322653-89655822041 |
+---------+---------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
10 rows in set (0.00 sec)           

此时我们根据pos位点解析master的binlog信息确认下一个事务已经执行。

二、问题排查

2.1 前期准备

MySQL 8.0的备份取消了FLUSH TABLES WITH READ LOCK的全局加锁限制,具体备份流程后续会写文章对比;使用performance_schema.log_status表来获取binlog的pos和GTID信息。

首先我们介绍下8.0新增的performance_schema.log_status表的具体内容。

root@localhost : (none):47: > select * from performance_schema.log_status\G
*************************** 1. row ***************************
    SERVER_UUID: 9eed5dd0-213c-11ea-a06f-fa9c144d7000
          LOCAL: {"gtid_executed": "d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2344524", "binary_log_file": "mysql-bin.000036", "binary_log_position": 8237993}
    REPLICATION: {"channels": [{"channel_name": "", "relay_log_file": "mysql-relay-bin.000002", "relay_log_position": 130357303}]}
STORAGE_ENGINES: {"InnoDB": {"LSN": 13536450538, "LSN_checkpoint": 13522598241}}
1 row in set (0.00 sec)           
  • SERVER_UUID:此服务器实例的服务器UUID。这是只读系统变量server_uuid的唯一值。
  • LOCAL:来自主服务器的日志位置状态信息,以单个JSON对象的形式提供,并具有以下信息:
    • binary_log_file:当前二进制日志文件的名称。
    • binary_log_position:访问log_status表时的当前二进制日志位置。
    • gtid_executed:在访问log_status表时,全局变量gtid_execute的当前值。此信息与binary_log_file和binary_log_position键一致。
  • REPLICATION:一个通道的JSON数组,每个通道具有以下信息:
    • channel_name:复制通道的名称。
    • relay_log_file:复制通道的当前中继日志文件的名称。
    • relay_log_pos:访问log_status表时的当前中继日志位置。
  • STORAGE_ENGINES:来自各个存储引擎的相关信息,以JSON对象的形式提供,每个适用的存储引擎都有一个密钥。

2.2 验证show master status的输出信息

evernotecid://232882CB-0E86-4D89-B07A-EA1AE2F94E20/appyinxiangcom/15365034/ENResource/p17613

MySQL8.0 Xtrabackup备份问题

这里可以看到,在并发场景下,show master status输出的GTID和binlog的pos位点不是对应的。查看代码后发现:binlog的file和pos位点的更新在group commit的flush阶段后,即可对其他会话可见。而GTID是在group commit的commit阶段后可对其他会话可见,所以也就造成了show master status输出的binlog信息和GTID信息非原子性。

三、复现问题

3.1 开始备份

[root@qfusion1 ~]# xtrabackup --defaults-file=/etc/mysql/my.cnf --user=root --host=127.0.0.1 --password=xxx --port=3306 --extra-lsndir=/opt/full_lsn --stream=xbstream --backup --target-dir=./ | sshpass -p'letsg0' ssh root@qfusion2 'cat -> /data/backup/full_`date +%F_%H-%M-%S`.tar'
...
191217 19:02:45 [01]        ...done
191217 19:02:45 Finished backing up non-InnoDB tables and files
191217 19:02:45 Executing FLUSH NO_WRITE_TO_BINLOG BINARY LOGS
191217 19:02:46 Selecting LSN and binary log position from p_s.log_status
191217 19:02:46 [00] Streaming /var/lib/mysql/archive/mysql-bin.000035 to <STDOUT> up to position 290601
191217 19:02:46 [00]        ...done
191217 19:02:46 [00] Streaming <STDOUT>
191217 19:02:46 [00]        ...done
191217 19:02:46 [00] Streaming <STDOUT>
191217 19:02:46 [00]        ...done
191217 19:02:46 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
xtrabackup: The latest check point (for incremental): '13470381978'
xtrabackup: Stopping log copying thread at LSN 13522426897.
191217 19:02:46 >> log scanned up to (13522598105)

191217 19:02:48 All tables unlocked
191217 19:02:48 [00] Streaming ib_buffer_pool to <STDOUT>
191217 19:02:48 [00]        ...done
191217 19:02:48 Backup created in directory '/root/'
MySQL binlog position: filename 'mysql-bin.000035', position '290601', GTID of the last change 'd3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959'
191217 19:02:48 [00] Streaming <STDOUT>
191217 19:02:48 [00]        ...done
191217 19:02:48 [00] Streaming <STDOUT>
191217 19:02:48 [00]        ...done
xtrabackup: Transaction log of lsn (12972859512) to (13526909324) was copied.
191217 19:02:50 completed OK!           

3.2 查看xtrabackup的信息

[root@qfusion2 full]# ls
backup-my.cnf   ibdata1  mysql-bin.000035  mysql.ibd           sbtest  test      undo_002                xtrabackup_checkpoints  xtrabackup_logfile
ib_buffer_pool  mysql    mysql-bin.index   performance_schema  sys     undo_001  xtrabackup_binlog_info  xtrabackup_info         xtrabackup_tablespaces

[root@qfusion2 full]# cat xtrabackup_binlog_info
mysql-bin.000035    290601    d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959           

3.3 查看对应的binlog信息

[root@qfusion2 full]# mysqlbinlog -vv mysql-bin.000035 | grep d3eff916-0e03-11ea-a98d-fa84dc05a100 | tail -10
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340970'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340971'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340972'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340973'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340974'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340975'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340976'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340977'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340978'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340979'/*!*/;

[root@qfusion2 full]# mysqlbinlog -vv mysql-bin.000035 --start-position=290601
mysqlbinlog: [Warning] unknown variable 'loose_default-character-set=utf8'.
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 195
#191217 19:02:45 server id 2433306  end_log_pos 124 CRC32 0x6d85a9e8     Start: binlog v 4, server v 8.0.18 created 191217 19:02:45
# Warning: this binlog is either in use or was not closed properly.
BINLOG '
1bX4XQ8aISUAeAAAAHwAAAABAAQAOC4wLjE4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEwANAAgAAAAABAAEAAAAYAAEGggAAAAICAgCAAAACgoKKioAEjQA
CgHoqYVt
'/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;           

3.4 开始恢复

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --prepare --use-memory=5G --target-dir=/data/backup/full/

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --move-back --target-dir=/data/backup/full/

[root@qfusion2 full]# chown -R mysql:mysql /var/lib/mysql/

[root@qfusion2 full]# mysqld_safe --defaults-file=/etc/mysql/my.cnf &

[root@qfusion2 full]# mysql -uroot -pxxx

root@localhost : (none):19: > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000036
         Position: 195
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340979
1 row in set (0.00 sec)

[root@qfusion2 full]# cat xtrabackup_binlog_info
mysql-bin.000035    290601    d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959           

备份恢复完成后,我们可以看到,show master status输出的GTID信息与xtrabackup_binlog_info文件的GTID是不一致的,对比产品中遇到主键冲突的问题中的show master status的GTID和与xtrabackup_binlog_info文件的GTID是一致的;可以确认的事情是恢复之后是有过对数据库的操作了。此时应该找下原型预研文档和产品开发人员确认下是否在MySQL恢复之后有其他额外的操作,得到反馈是执行了如下的语句操作:

reset master;
...
set global gtid_purged='xxx'           

这些步骤是MySQL 5.7下的搭建备库必要的操作(下面内容有验证),因为在5.7备份是没有binlog的备份,且GTID的持久化是在mysql.gtid_executed表,但是该表不是实时更新的,在主库的更新策略如下:

  • 主库开启binlog时,mysql.gtid_executed表会记录GTID信息,且在binlog rotation或者server重启时进行更新。
  • 主库未开启binlog时,mysql.gtid_executed表不会记录任何信息。

那么MySQL 8.0下的mysql.gtid_executed表的更新策略是什么样的呢?此时我们可以查询官方文档寻找答案:

简单翻译下:如果启用了二进制日志(log_bin为ON),则从MySQL 8.0.17起对InnoDB存储引擎,mysqld以禁用binlog记录或slave更新日志记录的相同方式更新mysql.gtid_executed表,并存储GTID用于事务提交时的每个事务。但是,在MySQL 8.0.17之前的版本以及其他存储引擎中,仅在切换binlog或重启mysqld时,mysqld将所有写入上一个binlog的事务的GTID写入mysql.gtid_executed表。

我们也可以测试验证下:

root@localhost : test:12: > select * from mysql.gtid_executed;show master status;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |              1 |      2764676 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764677 |      2764783 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764784 |      2764893 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764894 |      2765008 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765009 |      2765074 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765075 |      2765143 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765144 |      2765185 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765186 |      2765283 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765284 |      2765384 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765385 |      2765485 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765486 |      2765571 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765572 |      2765652 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765653 |      2765746 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765747 |      2765865 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765866 |      2765948 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765949 |      2766028 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766029 |      2766052 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766053 |      2766053 |
+--------------------------------------+----------------+--------------+
18 rows in set (0.00 sec)

+------------------+-----------+--------------+------------------+------------------------------------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                              |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin.000036 | 450959566 |              |                  | d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2766053 |
+------------------+-----------+--------------+------------------+------------------------------------------------+
1 row in set (0.01 sec)

root@localhost : test:12: > insert into test values(1,1,'1');
Query OK, 1 row affected (0.00 sec)

root@localhost : test:12: > insert into test values(2,2,'2');
Query OK, 1 row affected (0.01 sec)

root@localhost : test:12: > select * from mysql.gtid_executed;show master status;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |              1 |      2764676 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764677 |      2764783 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764784 |      2764893 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764894 |      2765008 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765009 |      2765074 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765075 |      2765143 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765144 |      2765185 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765186 |      2765283 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765284 |      2765384 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765385 |      2765485 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765486 |      2765571 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765572 |      2765652 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765653 |      2765746 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765747 |      2765865 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765866 |      2765948 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765949 |      2766028 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766029 |      2766052 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766053 |      2766053 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766054 |      2766054 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766055 |      2766055 |
+--------------------------------------+----------------+--------------+
20 rows in set (0.01 sec)

+------------------+-----------+--------------+------------------+------------------------------------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                              |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin.000036 | 450960258 |              |                  | d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2766055 |
+------------------+-----------+--------------+------------------+------------------------------------------------+
1 row in set (0.00 sec)           

3.5 手动删除备份的binlog文件

[root@qfusion2 full]# ls
backup-my.cnf   mysql-bin.000035    sbtest    undo_002                xtrabackup_logfile
ib_buffer_pool  mysql-bin.index     sys       xtrabackup_binlog_info  xtrabackup_tablespaces
ibdata1         mysql.ibd           test      xtrabackup_checkpoints
mysql           performance_schema  undo_001  xtrabackup_info

[root@qfusion2 full]# mv mysql-bin.* ../

[root@qfusion2 full]# ls
backup-my.cnf   mysql.ibd           test                    xtrabackup_checkpoints
ib_buffer_pool  performance_schema  undo_001                xtrabackup_info
ibdata1         sbtest              undo_002                xtrabackup_logfile
mysql           sys                 xtrabackup_binlog_info  xtrabackup_tablespaces

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --prepare --use-memory=5G --target-dir=/data/backup/full/

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --move-back --target-dir=/data/backup/full/

[root@qfusion2 full]# chown -R mysql:mysql /var/lib/mysql/

[root@qfusion2 full]# mysqld_safe --defaults-file=/etc/mysql/my.cnf &

[root@qfusion2 full]# mysql -uroot -pxxx

root@localhost : (none):46: > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000003
         Position: 155
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959
1 row in set (0.00 sec)

[root@qfusion2 full]# cat xtrabackup_binlog_info
mysql-bin.000035    290601    d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959           

在手动移除备份生成的相关binlog文件后,发现事务是回滚到xtrabackup_binlog_info记录的GTID的位置。与MySQL 5.7下的备份恢复的情况一致;具体可以查看下面的验证。

四、MySQL 5.7下的物理备份

4.1 测试环境

  • MySQL 5.7.28
  • xtrabackup 2.4.17

4.2 开始备份

[root@qfusion3 archive]# xtrabackup --defaults-file=/etc/mysql/my.cnf --user=root --host=127.0.0.1 --password=abc123 --port=3306 --extra-lsndir=/opt/full_lsn --stream=xbstream --backup --target-dir=./ | sshpass -p'letsg0' ssh root@qfusion4 'cat -> /data/backup/full_`date +%F_%H-%M-%S`.tar'           

4.3 开始恢复

[root@qfusion4 full]# cd /data/backup

[root@qfusion4 full]# xbstream -x < full_2019-12-18_14-06-10.tar -C full

[root@qfusion4 full]# cd full

[root@qfusion4 full]# ls
backup-my.cnf   ibdata1  performance_schema  sys      undo002  undo004  undo006  undo008  undo010  undo012  undo014  undo016                 xtrabackup_checkpoints  xtrabackup_logfile
ib_buffer_pool  mysql    sbtest              undo001  undo003  undo005  undo007  undo009  undo011  undo013  undo015  xtrabackup_binlog_info  xtrabackup_info

[root@qfusion4 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --prepare --use-memory=5G --target-dir=/data/backup/full/

[root@qfusion4 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --move-back --target-dir=/data/backup/full/

[root@qfusion4 full]# chown -R mysql:mysql /var/lib/mysql/

[root@qfusion4 full]# mysqld_safe --defaults-file=/etc/mysql/my.cnf &

[root@qfusion4 full]# mysql -uroot -pxxx

root@localhost : (none):13: > select * from mysql.gtid_executed;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| 189fb8ac-1fe8-11ea-9b92-faf16b07cf00 |              1 |          535 |
+--------------------------------------+----------------+--------------+
1 row in set (0.01 sec)

root@localhost : (none):13: > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 154
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 189fb8ac-1fe8-11ea-9b92-faf16b07cf00:1-535
1 row in set (0.00 sec)

[root@qfusion4 full]# cat xtrabackup_binlog_info
mysql-bin.000003    175325364    189fb8ac-1fe8-11ea-9b92-faf16b07cf00:1-736           

可以看到MySQL 5.7版本的备份恢复后的GTID是依赖于mysql.gtid_executed表,但是该表不是实时更新,此时的GTID其实是滞后的,所以也需要额外执行set global gtid_purged=‘xxx’。

五、总结

  • MySQL 8.0下的物理备份生成的GTID值是不可靠,因为备份了最后一个binlog文件,恢复时会recovery到最新的事务。
  • MySQL 8.0下的物理备份会执行FLUSH NO_WRITE_TO_BINLOG BINARY LOGS操作,在有大事务对数据进行修改时,一不小心可能就会出现数据库hang死,所以建议使用小事务。