为DISTINCT、GROUP BY和ORDER BY效率问题提提速

废话不多说，直击问题处理过程...

问题描述

原SQL，多表关联，order表4w，order_trace表24w数据，按照正常逻辑，就算关联7表，查询其他表得字段很多，查询分页得到数据速度平均在1.300s，肯定存在问题。

SELECT DISTINCT
    (a.order_no),
    a.`name` order_name,
    a.eid,
    a.uid,
    a.device_type_id,
    a.num,
    a.`desc`,
    a.attach,
    a.total_amount amount,
    a.ctime,
    a.publish_time,
    a.finish_time,
    a.`status`,
    a.ent_name,
    a.ent_user_name,
    a.ent_user_phone,
    a.cust_id,
    a.cust_name,
    a.start_work_time,
    a.expire_time,

IF (
    (
        NOW() > a.expire_time
        AND a.`status` = 4
    )
    OR (
        a.cancel_time > a.expire_time
        AND a.`status` = 5
    )
    OR b.`type` = 2,
    '1',
    '2'
) 'is_expire',
 d.parent_id,
 e.`name` parent_name,
 d.`name` device_type_name,
 b.eid AS `user.eid`,
 b.uid AS `user.uid`,
 b.ctime AS `user.ctime`,
 c.`name` AS `orderAddress.name`,
 c.phone AS `orderAddress.phone`,
 c.province_id AS `orderAddress.province_id`,
 c.province_name AS `orderAddress.province_name`,
 c.city_id AS `orderAddress.city_id`,
 c.city_name AS `orderAddress.city_name`,
 c.county_id AS `orderAddress.county_id`,
 c.county_name AS `orderAddress.county_name`,
 c.detail_address AS `orderAddress.detail_address`,
 c.`desc` AS `orderAddress.desc`,
 c.lng AS `orderAddress.lng`,
 c.lat AS `orderAddress.lat`,
 b.`type` AS `userOrder.type`,
 b.`cntr_id` AS `userOrder.cntr_id`,
 b.`name` AS `userOrder.name`,
 b.`phone` AS `userOrder.phone`,
 b.dispatch_uid AS `userOrder.dispatch_uid`,
 b.dispatch_time AS `userOrder.dispatch_time`,
 b.type AS `user_order_type`,
 g.attn_name AS `order_attr.attn_name`,
 g.attn_phone AS `order_attr.attn_phone`,
 a.settle_type
FROM
    `order` a
LEFT JOIN user_order b ON a.order_no = b.order_no
AND b.`status` = 1
LEFT JOIN order_address c ON a.order_no = c.order_no
LEFT JOIN category d ON a.device_type_id = d.id
LEFT JOIN category e ON d.parent_id = e.id
LEFT JOIN order_item f ON f.order_no = a.order_no
LEFT JOIN order_attr g ON g.order_no = a.order_no
WHERE
    1 = 1
AND (
    a.`status` IN (4, 5, 6, 7, 12)
    AND a.payway IS NOT NULL
    OR a.`status` = 14
    OR a.`status` = 10
)
AND (
    a.`status` != 4
    OR a.audit_time IS NOT NULL
)
ORDER BY
    a.order_no DESC
LIMIT 12 OFFSET 0;

问题排查

EXPLAIN一下

+----+-------------+-------+--------+-------------------------------------------------------------+--------------+---------+------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                               | key          | key_len | ref                    | rows  | Extra                                        |
+----+-------------+-------+--------+-------------------------------------------------------------+--------------+---------+------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | a     | ALL    | NULL                                                        | NULL         | NULL    | NULL                   | 41708 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | b     | ref    | idx_order_no,idx_order_no_status,idx_order_no_complete_time | idx_order_no | 9       | order.a.order_no       |     1 | Using where                                  |
|  1 | SIMPLE      | c     | eq_ref | PRIMARY,idx_order_no_city_id,idx_order_no_county_id         | PRIMARY      | 8       | order.a.order_no       |     1 | NULL                                         |
|  1 | SIMPLE      | d     | eq_ref | PRIMARY                                                     | PRIMARY      | 4       | order.a.device_type_id |     1 | NULL                                         |
|  1 | SIMPLE      | e     | eq_ref | PRIMARY                                                     | PRIMARY      | 4       | order.d.parent_id      |     1 | NULL                                         |
|  1 | SIMPLE      | f     | ref    | idx_order_no                                                | idx_order_no | 9       | order.a.order_no       |     1 | Using index                                  |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                                     | PRIMARY      | 8       | order.a.order_no       |     1 | Using where                                  |
+----+-------------+-------+--------+-------------------------------------------------------------+--------------+---------+------------------------+-------+----------------------------------------------+
7 rows in set

发现问题一：

　　type=ALL表示索引没用到，rows=41708行，证明使用了全表扫描，Extra包含了Using filesort表示排序的字段上出了问题，因此索引了ORDER BY。

　　解决方式：

　　order_no生成顺序是根据时间生成的，因此，ORDER BY order_no也是可行的，就没必要给ctime创建索引。

　　结果：

　　处理完成之后发现，执行的查询结果时间几乎看不出来差距，此时得知问题并不完全出现在ORDER BY上，继续排查。。。

EXPLAIN一下

+----+-------------+-------+--------+-------------------------------------------------------------+--------------+---------+------------------------+------+------------------------------+
| id | select_type | table | type   | possible_keys                                               | key          | key_len | ref                    | rows | Extra                        |
+----+-------------+-------+--------+-------------------------------------------------------------+--------------+---------+------------------------+------+------------------------------+
|  1 | SIMPLE      | a     | index  | NULL                                                        | PRIMARY      | 8       | NULL                   |   12 | Using where; Using temporary |
|  1 | SIMPLE      | b     | ref    | idx_order_no,idx_order_no_status,idx_order_no_complete_time | idx_order_no | 9       | order.a.order_no       |    1 | Using where                  |
|  1 | SIMPLE      | c     | eq_ref | PRIMARY,idx_order_no_city_id,idx_order_no_county_id         | PRIMARY      | 8       | order.a.order_no       |    1 | NULL                         |
|  1 | SIMPLE      | d     | eq_ref | PRIMARY                                                     | PRIMARY      | 4       | order.a.device_type_id |    1 | NULL                         |
|  1 | SIMPLE      | e     | eq_ref | PRIMARY                                                     | PRIMARY      | 4       | order.d.parent_id      |    1 | NULL                         |
|  1 | SIMPLE      | f     | ref    | idx_order_no                                                | idx_order_no | 9       | order.a.order_no       |    1 | Using index                  |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                                     | PRIMARY      | 8       | order.a.order_no       |    1 | Using where                  |
+----+-------------+-------+--------+-------------------------------------------------------------+--------------+---------+------------------------+------+------------------------------+

发现问题二：

Extra包含了Using temporary表示用上了临时表，这是效率慢的一大因素。而DISTINCT是一个SQL优化能不用就不用的关键字，但是因为业务和数据结构的问题的问题，user_order和order_item表是一对多关系，因此DISTINCT是一个需要处理的点。DISTINCT和GROUP BY的原理基本一致，但有一些地方还不一致，所以仍然执行分组操作，意外收获，不使用了临时表，并且效率提升了13倍，查询时间控制在0.100s左右，完全满足查询效率问题

　　解决方式：

SELECT
    a.order_no,
    a.`name` order_name,
    a.eid,
    a.uid,
    a.device_type_id,
    a.num,
    a.`desc`,
    a.attach,
    a.total_amount amount,
    a.ctime,
    a.publish_time,
    a.finish_time,
    a.`status`,
    a.ent_name,
    a.ent_user_name,
    a.ent_user_phone,
    a.cust_id,
    a.cust_name,
    a.start_work_time,
    a.expire_time,

IF (
    (
        NOW() > a.expire_time
        AND a.`status` = 4
    )
    OR (
        a.cancel_time > a.expire_time
        AND a.`status` = 5
    )
    OR b.`type` = 2,
    '1',
    '2'
) 'is_expire',
 d.parent_id,
 e.`name` parent_name,
 d.`name` device_type_name,
 b.eid AS `user.eid`,
 b.uid AS `user.uid`,
 b.ctime AS `user.ctime`,
 c.`name` AS `orderAddress.name`,
 c.phone AS `orderAddress.phone`,
 c.province_id AS `orderAddress.province_id`,
 c.province_name AS `orderAddress.province_name`,
 c.city_id AS `orderAddress.city_id`,
 c.city_name AS `orderAddress.city_name`,
 c.county_id AS `orderAddress.county_id`,
 c.county_name AS `orderAddress.county_name`,
 c.detail_address AS `orderAddress.detail_address`,
 c.`desc` AS `orderAddress.desc`,
 c.lng AS `orderAddress.lng`,
 c.lat AS `orderAddress.lat`,
 b.`type` AS `userOrder.type`,
 b.`cntr_id` AS `userOrder.cntr_id`,
 b.`name` AS `userOrder.name`,
 b.`phone` AS `userOrder.phone`,
 b.dispatch_uid AS `userOrder.dispatch_uid`,
 b.dispatch_time AS `userOrder.dispatch_time`,
 b.type AS `user_order_type`,
 g.attn_name AS `order_attr.attn_name`,
 g.attn_phone AS `order_attr.attn_phone`,
 a.settle_type
FROM
    `order` a
LEFT JOIN user_order b ON a.order_no = b.order_no
AND b.`status` = 1
LEFT JOIN order_address c ON a.order_no = c.order_no
LEFT JOIN category d ON a.device_type_id = d.id
LEFT JOIN category e ON d.parent_id = e.id
LEFT JOIN order_item f ON f.order_no = a.order_no
LEFT JOIN order_attr g ON g.order_no = a.order_no
WHERE
    1 = 1
AND (
    a.`status` IN (4, 5, 6, 7, 12)
    AND a.payway IS NOT NULL
    OR a.`status` = 14
    OR a.`status` = 10
)
AND (
    a.`status` != 4
    OR a.audit_time IS NOT NULL
)
GROUP BY
    a.order_no
ORDER BY
    a.order_no DESC
LIMIT 12 OFFSET 0;

　　原理：

　　如果GROUP BY 的列没有索引，产生临时表。

　　MYSQL5.6中解释：The most general way to satisfy a GROUP BY clause is to scan the whole table and create a new temporary table where all rows from each group are consecutive, and then use this temporary table to discover groups and apply aggregate functions (if any). In some cases, MySQL is able to do much better than that and avoid creation of temporary tables by using index access.通过使用索引访问避免创建临时表。

　　而DISTINCT与ORDER BY 使用同一字段并且索引，仍生产临时表，所以不建议使用DISTINCT

总结

执行效率的排查，EXPLAIN是非常重要的，首先排查是否使用索引，其次排查是否使用了临时表，如果效率还不行，就是得深入研究，或者考虑NOSQL进行缓存了

为DISTINCT、GROUP BY和ORDER BY效率问题提提速

问题描述

问题排查

发现问题一：

发现问题二：

总结

继续阅读

2022秋招面试总结（cpp+java+测开）百度测开一面字节后端一面虾皮后端一面虾皮后端二面

数据库之DDL操作数据库DDL操作数据库DDL操作数据表

数据库之DQL操作数据库

mysql优化（sql优化）

数据迁移方法数据迁移原则数据迁移之双写方案数据迁移之级联同步方案

redis集群数据一致性_RedisRaft为Redis集群带来强大的数据一致性

宝塔面板mysql恢复2018.1.8更新

Centos7 MySQL 5.7 安装MySQL 5.7 安装

查找入职员工时间排名倒数第三的员工所有信息

Hibernate使用Hibernate的“3个准备，7个步骤”Hibernate API简介操作实体对象对象识别

云计算面试题——mysql/存储引擎/备份

SQL语言基础：常用的数据查询语句

Ubuntu16.04安装Apache+MySQL+PHP1. 安装Apache2. 安装MySQL3. 安装PHP4. 安装phpMyAdmin

MySQL的4种隔离级别？出现问题

neo4j之cypher使用文档

mysql使用source命令导入.sql文件