事发背景
测试环境更换数据盘,直接采取在线将数据目录暴力拷贝到新盘,然后将原服务关闭,启用新盘。
服务是可以正常启动的,但是没多会开发就反应服务down了,错误日志输出
2017-05-17 15:06:28 0x7ffdadff7700 InnoDB: Assertion failure in thread 140727522653952 in file trx0purge.cc line 167
InnoDB: Failing assertion: purge_sys->iter.trx_no <= purge_sys->rseg->last_trx_no
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
07:06:28 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=8388608
read_buffer_size=2097152
max_used_connections=0
max_threads=1000
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4117496 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7ffda8000900
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7ffdadff6d98 thread_stack 0x40000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x35)[0xf32bd5]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x4a4)[0x77b844]
/lib64/libpthread.so.0(+0xf7e0)[0x7ffe223e17e0]
/lib64/libc.so.6(gsignal+0x35)[0x7ffe2108c5e5]
/lib64/libc.so.6(abort+0x175)[0x7ffe2108ddc5]
/usr/local/mysql/bin/mysqld[0x10c5da5]
/usr/local/mysql/bin/mysqld(_ZN20TrxUndoRsegsIterator8set_nextEv+0x322)[0x1093822]
/usr/local/mysql/bin/mysqld[0x1095ea3]
/usr/local/mysql/bin/mysqld(_Z9trx_purgemmb+0x6ad)[0x10974bd]
/usr/local/mysql/bin/mysqld[0x1076178]
/usr/local/mysql/bin/mysqld(srv_purge_coordinator_thread+0x86f)[0x107914f]
/lib64/libpthread.so.0(+0x7aa1)[0x7ffe223d9aa1]
/lib64/libc.so.6(clone+0x6d)[0x7ffe21142aad]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 0
Status: NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
查看了提示的链接,可能的影响因素包括、
1、数据文件或索引文件有坏页
2、mysqld服务在运行期间,有其它项目操作数据文件和索引文件
这条和先前的操作有关,在mysqld服务运行期间我直接将数据文件拷贝到了另一盘,这样做很可能导致副本的数据和索引是不一致的
按照官网给出的建议对日志中的stack进行了解析
vim /data1/sql/mysqld.sym
0xf32bd5
0x77b844
0x7f751eed67e0
0x7f751db815e5
0x7f751db82dc5
0x10c5da5
0x1093822
0x1095ea3
0x10974bd
0x1076178
0x107914f
0x7f751eeceaa1
0x7f751dc37aad
nm -D -n /usr/local/mysql/bin/mysqld > /data1/sql/mysqld.sym
resolve_stack_dump -s /data1/sql/mysqld.sym -n /data1/sql/mysqld.stack
0xf32bd5 my_print_stacktrace + 53
0x77b844 handle_fatal_signal + 1188
0x7f751eed67e0 _end + 486423968
0x7f751db815e5 _end + 466152357
0x7f751db82dc5 _end + 466158469
0x10c5da5 _Z23ut_dbg_assertion_failedPKcS0_m + 170
0x1093822 _ZN20TrxUndoRsegsIterator8set_nextEv + 802
0x1095ea3 _Z14trx_purge_stopv + 5363
0x10974bd _Z9trx_purgemmb + 1709
0x1076178 _Z26srv_get_active_thread_typev + 1048
0x107914f srv_purge_coordinator_thread + 2159
0x7f751eeceaa1 _end + 486391905
0x7f751dc37aad _end + 466899053
找不出导致crash的根本原因,暂时的解决方案是通过设置innodb_force_recovery来启动
相关内容参考InnoDB Recovery
1、设置innodb_force_recovery 先后设置1和2都无效,最后设置为3后启动成功
2、启动后dump所有的数据库,恢复到新建实例中