MySQL 5.6.26 Release Notes Explained

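Upstream recently released MySQL 5.6.26. Judging from the release notes, the 5.6 series is now quite mature, and the number of bugs fixed per release keeps shrinking. This article walks through the bug fixes listed in the release notes, leaving out items related to Performance Schema, the Mac and Windows platforms, the Enterprise edition, and packaging.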

Problem description:

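On Unix-like platforms, when innodb_flush_method is set to O_DIRECT, the function os_file_create_simple_no_error_handling_func does not open data files with O_DIRECT. For example, fil_node_open_file may first open the file with os_file_create_simple_no_error_handling_func to determine its size and then close it, and afterwards open the data file again with os_file_create. The former uses buffered I/O and the latter uses direct I/O. Mixing the two can cause performance problems.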

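The open(2) man page advises applications to avoid mixing O_DIRECT and normal (buffered) I/O on the same file.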

(bug #21113036, bug #76627)

Solution:

Disable the OS cache in os_file_create_simple_no_error_handling_func (by calling os_file_set_nocache).

Patch:

<a href="https://github.com/mysql/mysql-server/commit/b4daac21f52ced96c11632b83445111c0acede56">https://github.com/mysql/mysql-server/commit/b4daac21f52ced96c11632b83445111c0acede56</a>

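After a dirty page is copied from an uncompressed page to a compressed page and is about to be written to the file (buf_flush_write_block_low), page_zip_verify_checksum is called before the modification LSN is set on the compressed page. At that point the LSN on the compressed page is 0, and the computed checksum may also be 0, so page_zip_verify_checksum concludes that an empty page is being written and returns false, which trips an assertion. (bug #21086723)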

Set the LSN first, then call page_zip_verify_checksum.
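The ordering problem can be illustrated with a small toy model. This is only a sketch under stated assumptions: `toy_checksum`, `verify_checksum`, and the dictionary-based page are hypothetical stand-ins, not InnoDB's real checksum code, and the payload is chosen on purpose so that the toy checksum comes out as 0.

```python
def toy_checksum(payload: bytes) -> int:
    """Stand-in checksum (assumption); InnoDB's real algorithms differ."""
    return sum(payload) % 256


def verify_checksum(page: dict) -> bool:
    """Toy version of the check: checksum 0 plus LSN 0 is treated as an empty page."""
    if page["checksum"] == 0 and page["lsn"] == 0:
        # Looks like a never-written page: accept it only if it is really empty.
        return not any(page["payload"])
    return page["checksum"] == toy_checksum(page["payload"])


payload = bytes([128, 128])  # non-empty, but the toy checksum is 0
page = {"payload": payload, "checksum": toy_checksum(payload), "lsn": 0}

buggy_result = verify_checksum(page)  # old order: verify before the LSN is set
page["lsn"] = 12345                   # fixed order: set the LSN first ...
fixed_result = verify_checksum(page)  # ... then verify
```

In this model the first call fails because the page looks empty but is not, while the second call passes once the LSN is non-zero.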

<a href="https://github.com/mysql/mysql-server/commit/5b6041b2c7cbee8a1d917631d3a051122b8c4f8d">https://github.com/mysql/mysql-server/commit/5b6041b2c7cbee8a1d917631d3a051122b8c4f8d</a>

The instance crashes when the following sequence is executed:

create database <code>b</code>;

use b;

create table <code>#mysql50#q.q</code> select 1;

drop database <code>b</code>;

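When the table is created, the table name is found to be illegal and is reset to an empty string, so the name passed down to the engine layer is "dbname/". The engine's data dictionary, however, locates tables by strings of the form "dbname/tablename", so this violates the data dictionary's convention. A subsequent DROP DATABASE traverses the data dictionary entries prefixed with the database name and triggers the crash. PS: even after restarting the instance, DROP DATABASE cannot clean up; the user thread keeps looping inside the drop database logic. (bug #19929435)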

Reject the creation of tables with an empty name in the engine layer.
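A minimal sketch of such a guard, assuming the normalized name has the "dbname/tablename" form described above. The helper `is_valid_dict_name` is hypothetical and is not the check InnoDB actually uses.

```python
def is_valid_dict_name(norm_name: str) -> bool:
    """Return True only if both the database and the table part are non-empty."""
    db, sep, table = norm_name.partition("/")
    return bool(db) and sep == "/" and bool(table)


rejected = is_valid_dict_name("b/")    # table name was reset to an empty string
accepted = is_valid_dict_name("b/t1")  # well-formed "dbname/tablename"
```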

<a href="https://github.com/mysql/mysql-server/commit/8fd710e06024a890e08e35009da541194ca0e5a4">https://github.com/mysql/mysql-server/commit/8fd710e06024a890e08e35009da541194ca0e5a4</a>

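In innobase_get_foreign_key_info, the parent table has to be opened using the parent table name stored in the child table. The child table stores that name in the system character set (system_charset_info), whereas InnoDB stores table and database names in my_charset_filename. As a result, if the parent table name contains special characters, the parent table cannot be opened and an error is reported. (bug #21094069)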

Convert the table and database names from the system character set to the my_charset_filename format (tablename_to_filename).
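To show why the two representations differ, here is a deliberately simplified sketch of the filename encoding. It assumes only the basic rule that ASCII letters, digits, and underscores are kept, and every other character becomes `@` followed by its code point in four hex digits. The real encoding also has compact codes for many non-ASCII letters, which this hypothetical `simple_tablename_to_filename` ignores.

```python
def simple_tablename_to_filename(name: str) -> str:
    """Simplified model of the filename encoding; not MySQL's full algorithm."""
    out = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            out.append(ch)                  # safe character, kept as-is
        else:
            out.append("@%04x" % ord(ch))   # e.g. '-' becomes '@002d'
    return "".join(out)


plain = simple_tablename_to_filename("parent_1")
special = simple_tablename_to_filename("t-1")
```

A lookup with the unencoded name `t-1` would miss a dictionary entry stored under the encoded form, which is the mismatch described above.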

<a href="https://github.com/mysql/mysql-server/commit/1fae0d42c352908fed03e29db2b391a0d2969269">https://github.com/mysql/mysql-server/commit/1fae0d42c352908fed03e29db2b391a0d2969269</a>

When a background I/O thread needs to read in the bitmap page of a data file in order to do an ibuf merge (see buf_page_io_complete -> ibuf_merge_or_delete_for_page), the read is synchronous and space->n_pending_ops is incremented.

Meanwhile, a user thread is about to delete the same tablespace, so it sets space->stop_new_ops to true and waits until space->n_pending_ops drops to 0 (fil_check_pending_operations).

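The background thread then tries to read the ibuf bitmap page. But fil_io rejects every read once it sees that space->stop_new_ops is set and returns DB_TABLESPACE_DELETED directly. ibuf_merge_or_delete_for_page, however, always assumes the ibuf bitmap page was read into memory successfully and goes on to dereference the page, which is actually a null pointer. This can crash the instance.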

In fil_io, when the tablespace is being deleted (space->stop_new_ops is true), asynchronous reads are refused, but writes and synchronous reads are still allowed.
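The rule in the fix can be written down as a small decision function. This is only a sketch: the `Space` class and `io_allowed` are hypothetical names that mirror the fields mentioned above, not the real fil_io signature.

```python
from dataclasses import dataclass


@dataclass
class Space:
    stop_new_ops: bool = False  # set when the tablespace is being deleted
    n_pending_ops: int = 0


def io_allowed(space: Space, is_read: bool, is_sync: bool) -> bool:
    """Model of the fixed policy: only async reads are refused during deletion."""
    if not space.stop_new_ops:
        return True
    if is_read and not is_sync:
        return False  # asynchronous read on a tablespace that is going away
    return True       # writes and synchronous reads may still proceed


dropping = Space(stop_new_ops=True, n_pending_ops=1)
sync_read_ok = io_allowed(dropping, is_read=True, is_sync=True)
async_read_ok = io_allowed(dropping, is_read=True, is_sync=False)
```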

<a href="https://github.com/mysql/mysql-server/commit/3ba4563a757e07c3052c780b63e2626c78ca5c47">https://github.com/mysql/mysql-server/commit/3ba4563a757e07c3052c780b63e2626c78ca5c47</a>

When a table has a prefix index, exporting the table and then running IMPORT TABLESPACE may fail with a schema mismatch error (error code ER_TABLE_SCHEMA_MISMATCH). See bug#76877 for the test case.

(bug #20977779, bug #76877)

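The cause is a logic error in matching the .cfg file against the table's index definition. Take the following table as an example: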

create table t1 (c1 varchar(128), primary key (c1(16))) engine=innodb;

The index object defines four columns: (c1, prefix_len=16), (DB_TRX_ID), (DB_ROLL_PTR), (c1, prefix_len=0).

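The .cfg file and the table's index object are in fact identical. But when the .cfg side looks up a column and several columns share the same name, it always picks the first one. In the example above, when the schema of the fourth column is checked, the first column is fetched instead, which produces the error.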

Reference function: row_import::match_index_columns (bug #20977779, bug #76877)

Verify the columns one by one, in order.
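The difference between the two matching strategies can be sketched with the four index fields from the example table. The function names `match_by_name` and `match_by_position` are hypothetical, and each field is reduced to a `(name, prefix_len)` pair for illustration.

```python
# Index fields of the example table, as (column name, prefix_len) pairs.
table_fields = [("c1", 16), ("DB_TRX_ID", 0), ("DB_ROLL_PTR", 0), ("c1", 0)]
cfg_fields = list(table_fields)  # the exported .cfg holds the same definition


def match_by_name(table_fields, cfg_fields):
    """Buggy strategy: look up each field by name and take the first hit."""
    for name, prefix_len in table_fields:
        cfg = next((f for f in cfg_fields if f[0] == name), None)
        if cfg is None or cfg[1] != prefix_len:
            return False  # the 4th field (c1, 0) is compared with (c1, 16)
    return True


def match_by_position(table_fields, cfg_fields):
    """Fixed strategy: compare the fields pairwise, in order."""
    return len(table_fields) == len(cfg_fields) and all(
        t == c for t, c in zip(table_fields, cfg_fields)
    )


buggy = match_by_name(table_fields, cfg_fields)
fixed = match_by_position(table_fields, cfg_fields)
```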

<a href="https://github.com/mysql/mysql-server/commit/db23392bac27ad3e84319229ee3db9921b734abd">https://github.com/mysql/mysql-server/commit/db23392bac27ad3e84319229ee3db9921b734abd</a>

Consider the following scenario:

The purge thread reads an undo record and parses out the corresponding row (row_purge -> row_purge_parse_undo_rec).

It purges the secondary indexes first (row_purge_remove_sec_if_poss) and then the clustered index (row_purge_remove_clust_if_poss).

When purging a secondary index page, it must check whether the secondary index record can be physically purged (row_purge_remove_sec_if_poss_leaf). See the function row_purge_poss_sec.

row_purge_reposition_pcur positions on the clustered index, sets node->found_clust to true, and stores the cursor positioned on the clustered index in node->pcur.

It then checks whether the secondary index record is delete-marked (row_purge_remove_sec_if_poss_leaf -> rec_get_deleted_flag). If it is not delete-marked, a warning is reported.

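However, in the purge check described above, row_purge_poss_sec returned true even though the secondary index record was not delete-marked. The cause is a logic error in repositioning the cursor.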

Function <code>row_purge_reposition_pcur</code>:

Consider the following sequence:

When purging index1, the corresponding clustered index record is found through node->ref, node->found_clust is set to true, and the current cursor is saved in node->pcur.

Other user threads then modify the clustered index page, so when purging index2 the restore of the saved position may fail, and the function returns false.

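Next, when purging index3, node->found_clust is still true, so the previously saved position is restored again, and this fails again. When row_purge_reposition_pcur returns false, the caller assumes that the corresponding clustered index record does not exist and goes on to try to delete the secondary index record. But the secondary index record being purged this time may be a freshly inserted record that has not been delete-marked. What is actually needed is to reposition using node->ref.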

In row_purge_reposition_pcur, if restoring the cursor fails, node->found_clust must be reset to false. (bug #19138298, bug #70214, bug #21126772, bug #21065746)
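The index1, index2, index3 sequence can be replayed with a toy model. This is a sketch under simplifying assumptions: `PurgeNode`, `reposition`, and the `restore_ok` flag are hypothetical stand-ins for the purge node, row_purge_reposition_pcur, and the outcome of restoring the saved cursor, and the clustered index is reduced to a dictionary.

```python
class PurgeNode:
    def __init__(self, ref):
        self.ref = ref            # key used to search the clustered index
        self.found_clust = False  # True once a cursor position has been saved
        self.pcur = None          # saved cursor position


def reposition(node, clustered, restore_ok, reset_on_failure):
    """Toy reposition: restore the saved cursor, or search again by node.ref."""
    if node.found_clust:
        if not restore_ok and reset_on_failure:
            node.found_clust = False  # the fix: forget the stale position
        return restore_ok
    node.found_clust = node.ref in clustered
    if node.found_clust:
        node.pcur = node.ref
    return node.found_clust


def run(reset_on_failure):
    node, clustered = PurgeNode(ref=1), {1: "row"}
    r1 = reposition(node, clustered, True, reset_on_failure)   # index1: found
    r2 = reposition(node, clustered, False, reset_on_failure)  # index2: restore fails
    r3 = reposition(node, clustered, False, reset_on_failure)  # index3
    return [r1, r2, r3]


buggy = run(reset_on_failure=False)
fixed = run(reset_on_failure=True)
```

Without the reset, the third call keeps reporting that the clustered record is missing. With the reset, it searches again by `ref` and finds the record.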

<a href="https://github.com/mysql/mysql-server/commit/982a157c71667040838def7a00d951ffc55eccbc">https://github.com/mysql/mysql-server/commit/982a157c71667040838def7a00d951ffc55eccbc</a>

<a href="https://github.com/mysql/mysql-server/commit/4b8304a9a41c8382d18e084608c33e5c27bec311">https://github.com/mysql/mysql-server/commit/4b8304a9a41c8382d18e084608c33e5c27bec311</a>

<a href="https://github.com/mysql/mysql-server/commit/e59914034ab695035c3fe48f046a96bb98d53044">https://github.com/mysql/mysql-server/commit/e59914034ab695035c3fe48f046a96bb98d53044</a>

<a href="https://github.com/mysql/mysql-server/commit/92b4683d59c066f099be1d283c7d61b00caeedb2">https://github.com/mysql/mysql-server/commit/92b4683d59c066f099be1d283c7d61b00caeedb2</a>

Attempting to rebuild a full-text index on a table that already has a corrupted index triggers an assertion. (bug #20637494)

Raise an error that tells the user to drop the corrupted index first. The error code returned is ER_INNODB_INDEX_CORRUPT.

<a href="https://github.com/mysql/mysql-server/commit/4395ad1755c3ed86c4210f76001a76eb0a69b553">https://github.com/mysql/mysql-server/commit/4395ad1755c3ed86c4210f76001a76eb0a69b553</a>

<a href="https://github.com/mysql/mysql-server/commit/3bdb4573e9b25357eea2421647263216c36367cb">https://github.com/mysql/mysql-server/commit/3bdb4573e9b25357eea2421647263216c36367cb</a>

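A table with a full-text index carries a hidden FTS_DOC_ID column and a unique index FTS_DOC_ID_INDEX (FTS_DOC_ID). When the full-text index is dropped, the hidden column is not removed, yet under the current logic online DDL is not allowed whenever FTS_DOC_ID exists. (bug #20590013, bug #76012)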

Allow online DDL when the table has only FTS_DOC_ID_INDEX and FTS_DOC_ID but no full-text index defined. These hidden columns are removed only when the whole table is rebuilt.

<a href="https://github.com/mysql/mysql-server/commit/5610e5354a8be6609b2fc2a37902961be26af3cf">https://github.com/mysql/mysql-server/commit/5610e5354a8be6609b2fc2a37902961be26af3cf</a>

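The function ib_cursor_moveto does not check whether the constructed tuple has fewer columns than the index. It iterates using the number of index columns directly, which can cause a segmentation fault. (bug #21121197, bug #77083)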

Add the corresponding check.

<a href="https://github.com/mysql/mysql-server/commit/d511b503353c1588e90907f59b947e31796c1fc1">https://github.com/mysql/mysql-server/commit/d511b503353c1588e90907f59b947e31796c1fc1</a>

Problem description:

In ib_table_truncate, the transaction object is not released correctly when the truncate fails, which can cause shutdown to hang.

Solution:

Always release the transaction object.

Patch:

In ib_open_table_by_id, dict_sys->mutex is already held, but the function passes false as the second argument to dict_table_open_on_id, which tells it that the mutex is not held. This is a basic logic error. (bug #21121084, bug #77100)

Correct the argument that is passed.

<a href="https://github.com/mysql/mysql-server/commit/a2353c5d7ff6430e853de435d007ac64d91fd17d">https://github.com/mysql/mysql-server/commit/a2353c5d7ff6430e853de435d007ac64d91fd17d</a>

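The bugs above all look like very "low-level" code defects, which indirectly shows how few people in the community have used the InnoDB API since it was released. All three bugs were reported by Facebook engineers, and I am curious what they use the InnoDB API for.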

The InnoDB memcached plugin does not handle the UNSIGNED NOT NULL type correctly, so it returns wrong data.

For the UNSIGNED type, the attribute is IB_COL_UNSIGNED = 2.

For the NOT NULL type, the attribute is IB_COL_NOT_NULL = 1.

But many places in the code use comparisons such as m_col->attr == IB_COL_UNSIGNED, which leads to a large number of logic errors. (bug #20535517, bug #75864)

Change these to m_col->attr & IB_COL_UNSIGNED.
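A short illustration of why the equality test fails for a column that is both UNSIGNED and NOT NULL. The two flag values are the ones quoted above; everything else is just a sketch of the bit test, written in Python rather than the plugin's C code.

```python
IB_COL_NOT_NULL = 1
IB_COL_UNSIGNED = 2

# An UNSIGNED NOT NULL column carries both flags, so attr is 3.
attr = IB_COL_NOT_NULL | IB_COL_UNSIGNED

buggy_is_unsigned = attr == IB_COL_UNSIGNED        # 3 == 2, so the flag is missed
fixed_is_unsigned = bool(attr & IB_COL_UNSIGNED)   # bit test sees the flag
```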

<a href="https://github.com/mysql/mysql-server/commit/6ff8d5d2940b9c9079e07641b2beb12e8dd84b38">https://github.com/mysql/mysql-server/commit/6ff8d5d2940b9c9079e07641b2beb12e8dd84b38</a>

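With multi-threaded replication, STOP SLAVE has to wait for every worker thread to finish the transactions in its own work queue. If many transactions are pending, STOP SLAVE can take a long time to complete. In addition, SHOW SLAVE STATUS cannot run while STOP SLAVE is in progress, so a common symptom is a pile-up of blocked SQL statements from monitoring programs. (bug #75525, bug #20369401)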

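The solution is to find the most recently committed transaction across all worker threads and use it as an upper bound. Every worker thread runs up to that position and stops, and the remaining transactions are left unexecuted for now. In detail: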

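When STOP SLAVE is issued, the coordinator thread first sets the state of all worker threads to STOP (slave_stop_workers(rli, &mts_inited)) and updates rli->max_updated_index to the group index of the latest transaction that has been executed or is being executed (set_max_updated_index_on_stop).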

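Each worker must finish every transaction in its queue whose index is less than or equal to rli->max_updated_index. Once none remain, the worker's state is set to STOP_ACCEPTED, meaning it has completed the transactions up to max_updated_index and can exit. (set_max_updated_index_on_stop)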

The coordinator thread waits for all worker threads to exit and then performs a checkpoint (slave_stop_workers -> mts_checkpoint_routine).
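The steps above can be sketched as a small simulation. The `Worker` class and `stop_workers` function are hypothetical and single-threaded. They only model how the upper bound is chosen and which queued transactions still run, not the real coordinator and worker code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    done: List[int]          # group indexes already executed (or executing)
    queue: List[int]         # pending group indexes, in order
    status: str = "RUNNING"


def stop_workers(workers: List[Worker]) -> int:
    """Pick the upper bound, drain queues up to it, then mark workers stopped."""
    max_updated_index = max((w.done[-1] for w in workers if w.done), default=0)
    for w in workers:
        # Only transactions at or below the bound are still executed.
        while w.queue and w.queue[0] <= max_updated_index:
            w.done.append(w.queue.pop(0))
        w.status = "STOP_ACCEPTED"
    return max_updated_index


workers = [Worker([1, 4], [6, 9]), Worker([2, 7], [8, 10]), Worker([3], [5, 11])]
limit = stop_workers(workers)
```

Here the bound is 7, so transactions 5 and 6 are still applied, while 8 to 11 stay in the queues.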

This approach still does not help when a large transaction that is already executing is slow.

<a href="https://github.com/mysql/mysql-server/commit/37f2e969bd36a7455e81ea2350685707bc859866">https://github.com/mysql/mysql-server/commit/37f2e969bd36a7455e81ea2350685707bc859866</a>

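MySQL uses XA between InnoDB and the binlog for crash recovery. In earlier versions, however, if writing the binlog to disk failed, the group commit logic did not notice the error and went on to commit the transaction in the engine layer. The slave never received the corresponding binlog, so master and slave data became inconsistent. (bug #76795, bug #20938915)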

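Starting with MySQL 5.6.22, a new parameter binlog_error_action was introduced (in 5.6.20 and 5.6.21 it was called binlogging_impossible_mode). If set to ABORT_SERVER, the instance exits as soon as a binlog write error occurs, which avoids causing bigger problems. If set to IGNORE_ERROR, the failed write is ignored and binary logging is disabled at the same time; a restart is required to turn the binlog back on.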

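For strong consistency between master and slave, binlog_error_action should normally be set to ABORT_SERVER. The instance then exits on a disk error while opening a file, rotating to a new file, or writing the binlog from the I/O cache to the file.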

<a href="https://github.com/mysql/mysql-server/commit/3b6b4bf8c5d1bfada58678acebafdf6f813c2dfe">https://github.com/mysql/mysql-server/commit/3b6b4bf8c5d1bfada58678acebafdf6f813c2dfe</a>

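With relay_log_recovery enabled, a slave can re-fetch the binlog on restart starting from the position the SQL thread has executed up to. This handles the case where the relay log files are corrupted after the slave host crashes, without anyone having to run CHANGE MASTER by hand. In earlier versions this feature could not be enabled together with multi-threaded replication, and the instance reported the following error at startup: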

relay-log-recovery cannot be executed when the slave was stopped with an error or killed in mts mode

In fact, if GTID is enabled, the gaps between the worker threads do not matter; the relay log can simply be re-fetched based on the slave's GTID set. (bug #73397, bug #19316063)

Check whether GTID is enabled during recovery at restart.

<a href="https://github.com/mysql/mysql-server/commit/fce558959bd0e5af1ae6aac3d8573db00c271dfd">https://github.com/mysql/mysql-server/commit/fce558959bd0e5af1ae6aac3d8573db00c271dfd</a>

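When two slaves are mistakenly configured with the same server_uuid and point to the same master, their I/O threads are repeatedly disconnected and keep trying to reconnect. From the slave's point of view, there is not enough information to tell why the reconnects happen.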

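In this situation the master now generates an error message and sends it to the slave. When the slave receives this error, it stops trying to reconnect. (bug #72581, bug #18731252)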

<a href="https://github.com/mysql/mysql-server/commit/751a3da76dfd66b92395f90f11fce6bd890c9db5">https://github.com/mysql/mysql-server/commit/751a3da76dfd66b92395f90f11fce6bd890c9db5</a>

A crash caused by incorrect handling of ALTER TABLE REBUILD PARTITION.

The bug report is hidden and there is no test case. The corresponding release note entry:

Partitioning: In certain cases, ALTER TABLE … REBUILD PARTITION was not handled correctly when executed on a locked table. (bug #75677, bug #20437706)

Set aside for now; to be revisited later.

<a href="https://github.com/mysql/mysql-server/commit/6b0e6683416dc6f8274a460bd2512e7b037ec75f">https://github.com/mysql/mysql-server/commit/6b0e6683416dc6f8274a460bd2512e7b037ec75f</a>

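I do not know the optimizer module well, so I am just marking this one for now and will look at it in detail when time permits.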

While calculating the cost of the semijoin_dupsweedout strategy, inner_fanout is calculated wrongly when max_outer_fanout becomes 0. This causes the MySQL server to exit later. (bug #21184091)

Calculate inner_fanout only when max_outer_fanout is > 0. Otherwise there is no need to recalculate inner_fanout with respect to max_outer_fanout.

<a href="https://github.com/mysql/mysql-server/commit/bfba2338902a81927d116c30eaa1245eaea025c8">https://github.com/mysql/mysql-server/commit/bfba2338902a81927d116c30eaa1245eaea025c8</a>

GROUP BY or ORDER BY on a CHAR(0) NOT NULL column could lead to a server exit. (bug #19660891)

Assertion `param.sort_length != 0' failed in sql/filesort.cc:361

Marked for later analysis.

<a href="https://github.com/mysql/mysql-server/commit/60c6920509516a1e05b855799479a59c27803191">https://github.com/mysql/mysql-server/commit/60c6920509516a1e05b855799479a59c27803191</a>

<a href="https://github.com/mysql/mysql-server/commit/b62c5daa646434290c9b2d1c9b162487cb8edf04">https://github.com/mysql/mysql-server/commit/b62c5daa646434290c9b2d1c9b162487cb8edf04</a>

When choosing join order, the optimizer could incorrectly calculate the cost of a table scan and choose a table scan over a more efficient eq_ref join. (bug #71584, bug #18194196)

While choosing the join order, the cost of doing a table scan is calculated wrongly. As a result, the table scan is preferred over eq_ref, and a bad plan is chosen.

<a href="https://github.com/mysql/mysql-server/commit/7a36c155ea3f484799c213a5be5a3deb464251dc">https://github.com/mysql/mysql-server/commit/7a36c155ea3f484799c213a5be5a3deb464251dc</a>

A string handling problem in the MySQL strings library: the cs_values function handles string lengths incorrectly, which can lead to memory corruption. (bug #20359808)

Correct the length check.

<a href="https://github.com/mysql/mysql-server/commit/1cdd3b832ae32d3c236869954f0c7a8a851ed94a">https://github.com/mysql/mysql-server/commit/1cdd3b832ae32d3c236869954f0c7a8a851ed94a</a>

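When a session disconnects or runs something like change user, the session status is merged into the global status (add_to_status(&global_status_var, &status_var)), but the THD's status_var is not reset immediately. If another session queries the global status at that moment, the status values of these sessions are added to the global total a second time.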

In THD::change_user and THD::release_resources, reset the session status after it has been added to the global status.
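The double counting can be reproduced with a toy model. This is a sketch under stated assumptions: the dictionaries, `merge_on_disconnect`, and `show_global_status` are hypothetical, and the model assumes that a global status query sums the global counters plus the counters of every session object that still exists.

```python
def merge_on_disconnect(global_status, session_status, reset_after_merge):
    """Fold the session counters into the global ones, as on disconnect."""
    for key, value in session_status.items():
        global_status[key] = global_status.get(key, 0) + value
    if reset_after_merge:
        for key in session_status:
            session_status[key] = 0  # the fix: reset right after merging


def show_global_status(global_status, live_sessions):
    """Global counters plus the counters of all sessions still present."""
    total = dict(global_status)
    for session_status in live_sessions:
        for key, value in session_status.items():
            total[key] = total.get(key, 0) + value
    return total


def observed_questions(reset_after_merge):
    global_status = {"Questions": 100}
    session_status = {"Questions": 5}
    merge_on_disconnect(global_status, session_status, reset_after_merge)
    # The session object has not been removed yet when another session queries.
    return show_global_status(global_status, [session_status])["Questions"]


buggy_total = observed_questions(reset_after_merge=False)
fixed_total = observed_questions(reset_after_merge=True)
```

Without the reset the observer sees 110, because the session's 5 is counted twice. With the reset it sees the correct 105.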

<a href="https://github.com/mysql/mysql-server/commit/c8243dd36047debb76134344d761e48f0cedf78e">https://github.com/mysql/mysql-server/commit/c8243dd36047debb76134344d761e48f0cedf78e</a>
