
A case of the mysql client hanging after executing SQL

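I recently ran into a new case: a SQL statement was executed and nothing came back, and a packet capture showed no corresponding SQL request being sent (when the problem was actually reproduced, the packets were captured).
The mysql client also ran at different speeds with and without -A (without -A the client builds a local cache of database and table names; with -A it skips that step).
The analysis turned out not to be very hard. This article is mainly meant to share a few small tricks for analyzing this kind of problem.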
           

1. The SQL has to be run in a loop, so first configure password-less login. With the configuration shown below, running mysql -A logs in directly.

[root@Ad****s-143 ~]# cat .my.cnf
[client]
host=rm-t********e3eo.mysql.si****re.rds.aliyuncs.com
user='p***b'
password='vk7m*****x%ta'
[root@Ad****s-143 ~]# mysql -A
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 12571135
Server version: 5.6.16-log Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>           
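Creating the option file can be scripted. The following is a minimal sketch, assuming bash; the function name write_client_cnf and the host, user, and password in the usage line are hypothetical placeholders, not values from this case. It writes the same [client] section as above and restricts the file to its owner, because the file stores the password in plain text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a [client] section to the given option file.
# Arguments: <file> <host> <user> <password>
write_client_cnf() {
    local path="$1" host="$2" user="$3" password="$4"
    (
        umask 077   # a newly created file is readable by the owner only
        printf "[client]\nhost=%s\nuser='%s'\npassword='%s'\n" \
            "$host" "$user" "$password" > "$path"
    )
    chmod 600 "$path"   # also tighten a file that already existed
}
```

Usage would look like: write_client_cnf "$HOME/.my.cnf" db.example.invalid demo_user demo_password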

2. Set up the SQL loop and capture packets (construct a distinctive SQL statement and trace what the mysql command line does)

For the packet capture, this loop is enough:
# for i in {1..100};do echo $i;mysql -A -e "select guid, name, 0 from pa.industry limit $i ;";sleep 1s;done
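(Loop 100 times; print the iteration number; have mysql run the SQL with limit set to the loop variable i, so each statement is easy to pick out of the capture; wait 1 second between runs.)
For tracing with strace, use this one: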
# for i in {1..100};do echo $i;strace -F -ff -t -tt -s 4096 -o m.out mysql -A -e "select guid, name, 0 from pa.industry limit $i ;";sleep 1s;done           
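Because a hung iteration blocks the loop until someone presses Ctrl-C, it can help to put a deadline on each run. The sketch below is one possible way to do that, assuming bash and the coreutils timeout command; the function name probe and the 30-second limit in the usage line are hypothetical choices, not part of the original reproduction.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with a deadline and report a hang.
# Arguments: <seconds> <command> [args...]
probe() {
    local limit="$1" rc
    shift
    timeout "$limit" "$@"
    rc=$?
    # coreutils timeout exits with status 124 when the deadline expires
    if [ "$rc" -eq 124 ]; then
        echo "HUNG after ${limit}s: $*" >&2
    fi
    return "$rc"
}
```

The loop body then becomes, for example: probe 30 mysql -A -e "select guid, name, 0 from pa.industry limit $i ;" || break, which stops at the first iteration that hangs and leaves its number on screen.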

3. Reproduce the problem and analyze it

Output:

[root@Ad*****s-143 ~]# for i in {1..100};do echo $i;mysql -A -e "select guid, name, 0 from pa.industry limit $i ;";sleep 1s;done
1
+--------------------------------------+------+---+
| guid                                 | name | 0 |
+--------------------------------------+------+---+
| 965EADB8-C88E-83B2-325C-0DD04D5612DA | ???? | 0 |
+--------------------------------------+------+---+
2
+--------------------------------------+------+---+
| guid                                 | name | 0 |
+--------------------------------------+------+---+
| 965EADB8-C88E-83B2-325C-0DD04D5612DA | ???? | 0 |
| 0DFECEC8-9348-75F1-5B7C-2EE050FB0186 | ???? | 0 |
+--------------------------------------+------+---+
...... (intermediate output omitted)
29
+--------------------------------------+---------+---+
| guid                                 | name    | 0 |
+--------------------------------------+---------+---+
| 965EADB8-C88E-83B2-325C-0DD04D5612DA | ????    | 0 |
| 0DFECEC8-9348-75F1-5B7C-2EE050FB0186 | ????    | 0 |
| 4184E44C-E829-E3F3-5D75-1B488B3953A6 | ??      | 0 |
| 7FB961D6-EE07-2E67-447B-E1DDB2C2A2E0 | ??      | 0 |
| F7D1588D-9961-E01E-4C3E-228752504C0C | ????    | 0 |
| 07AD88F0-4546-10B1-040F-89C1773E2C52 | ??      | 0 |
| 451DB2F6-170B-7BEC-5DF9-2C3643204CA8 | ??      | 0 |
| FC67E100-F77B-165A-3202-23DF76BB1120 | ??      | 0 |
| 4C5AEA5F-AD45-E04B-59A8-22AC3CC6BDF9 | ????    | 0 |
| BB3EF523-762E-D38A-7ACC-CBAEA027A2E2 | ????    | 0 |
| 0C331B61-A650-4178-9153-2FAD8402492B | ????    | 0 |
| A85AD107-8A6D-22DD-4854-86DB0AD5A0E1 | ????    | 0 |
| 758D3D24-8725-8FD2-21DC-E58CE7F790B0 | ????    | 0 |
| 1D0DE78E-DE69-8DEA-187F-80762F918CAF | ??????  | 0 |
| 86820362-D8FB-BF3E-D1AE-BEA9F22DF131 | ????    | 0 |
| 4561EF83-C1F9-3EC3-EB5B-B829D8E1B652 | ????    | 0 |
| F552AF90-792C-39CC-E201-CD97C9681A38 | ????    | 0 |
| A8B0CEDA-5B2B-A231-4414-5EA41E37B680 | ????    | 0 |
| AA5E6908-9DF6-17CD-EEB8-4EB877A65F80 | ????    | 0 |
| 220B5BDD-019B-B13D-4518-259A9BF33A84 | ????    | 0 |
| 00F55DEA-15D1-BFC0-EF96-8FA57464A036 | ?????   | 0 |
| 9877703F-B2C1-E5BE-2A85-CF76CE944FC8 | ????    | 0 |
| BCB3CB52-37FB-2F99-C330-FBDC4C7E5949 | ????    | 0 |
| 5CF8E33B-0936-C2C4-0763-3880943D1461 | ????    | 0 |
| AF5941E7-56D5-4D56-B24D-8DA026A76B49 | ????    | 0 |
| E204689F-A318-E53E-FDF2-7FB942CA4D80 | ????    | 0 |
| BAB81C78-BC1C-33F5-81A5-CE9811F4F4E6 | ??????? | 0 |
| 6BD47A46-0CDE-5B95-1701-560BF2A8BBB7 | ????    | 0 |
| C098145A-F21B-033B-52D6-35A4BD7C83A4 | ????    | 0 |
+--------------------------------------+---------+---+
30 
^CCtrl-C -- sending "KILL QUERY 12569784" to server ...
Ctrl-C -- query aborted.
^CCtrl-C -- sending "KILL 12569784" to server ...
Ctrl-C -- query aborted.
^CCtrl-C -- exit!           

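4. The strace output shows that the SQL sent was limit 33 (the problem was reproduced several times and the saved evidence comes from different runs, so do not worry about whether the sequence numbers match)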
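With -ff and -o m.out, strace writes one file per traced process, named m.out.<pid>, so the statement that was actually sent has to be found across several files. The following is a rough sketch of how to pull it out, assuming bash and assuming the client sends the query through a send, sendto, or write system call; the function name sent_limits is hypothetical. The execve line is skipped on purpose, since it also contains the SQL text as a command-line argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the "limit N" values that appear in send/write
# syscalls across strace output files named <prefix>.<pid>.
sent_limits() {
    local prefix="$1"
    # Keep only send/sendto/write calls, so the execve() line that carries
    # the SQL text on the command line is not counted.
    grep -hE '(send|sendto|write)\(' "$prefix".* |
        grep -oE 'limit [0-9]+' |
        awk '{ print $2 }' | sort -n | uniq
}
```

Running sent_limits m.out in the directory holding the trace files prints each limit value that reached a send or write call, one per line.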


5. Check the Wireshark capture: the query sent by the client was acknowledged (ACKed) by the server, but no response was ever returned

Screenshot of the abnormal case:


Screenshot of the normal case:


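6. Logging in to MySQL and checking the processlist reveals something new. After the server ACKed the query, the client never received a response, yet the server lists the session as Sleep. That means the server had already returned the response and gone idle, and it still considers the connection open. The response was therefore lost somewhere on the link in between.

MySQL sessions: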
mysql> select * from INFORMATION_SCHEMA.PROCESSLIST where HOST like '101.*.*.143%';
+----------+--------+-----------------------+------+---------+------+-----------+---------------------------------------------------------------------------------+
| ID       | USER   | HOST                  | DB   | COMMAND | TIME | STATE     | INFO                                                                            |
+----------+--------+-----------------------+------+---------+------+-----------+---------------------------------------------------------------------------------+
| 12569201 | pa_web | 101.*.*.143:41580 | NULL | Sleep   |  734 |           | NULL                                                                            |
The row below (ID 12569784, the connection that received KILL QUERY in step 3) is the hung session:
| 12569784 | pa_web | 101.*.*.143:53000 | NULL | Sleep   |  161 |           | NULL  
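The HOST column of the processlist carries the client's address and source port, so the same connection can be looked up on the client side by its local port. The sketch below is one way to do the match, assuming bash, awk, and the column layout printed by ss -tn (local address in the fourth column); the function name local_port_rows is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ss -tn` style output on stdin and keep the rows
# whose local port equals the given port.
# Argument: <local port>
local_port_rows() {
    awk -v port="$1" '
        NR > 1 {                      # skip the header row
            n = split($4, a, ":")     # $4 is "Local Address:Port"
            if (a[n] == port) print
        }'
}
```

For the hung session above, that would be: ss -tn | local_port_rows 53000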
           

TCP connections on the ECS instance:


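7. The client is in mainland China and the RDS instance is in Singapore. The suspicion is that some hop on the cross-border international link is faulty and drops the packets. For cross-border traffic, consider using Express Connect to connect the two sides over the internal network.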