Explain 详解
前言
一条查询语句在经过MySQL
查询优化器的各种基于成本和规则的优化会后生成一个所谓的执行计划
,这个执行计划展示了接下来具体执行查询的方式,比如多表连接的顺序是什么,对于每个表采用什么访问方法来具体执行查询等等。设计MySQL
的大叔贴心的为我们提供了EXPLAIN
语句来帮助我们查看某个查询语句的具体执行计划,本章的内容就是为了帮助大家看懂EXPLAIN
语句的各个输出项都是干嘛使的,从而可以有针对性的提升我们查询语句的性能。
如果我们想看看某个查询的执行计划的话,可以在具体的查询语句前边加一个EXPLAIN
,就像这样:
mysql> EXPLAIN SELECT 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.01 sec)
列名 | 描述 |
---|---|
id | 在一个大的查询语句中每个SELECT 关键字都对应一个唯一的id ,相同SELECT的id相同 |
select_type | SELECT 关键字对应的那个查询的类型 |
table | 表名 |
partitions | 匹配的分区信息(忽略) |
type | 针对单表的访问方法(const、ref、range…) |
possible_keys | 可能用到的索引 |
key | 实际上使用的索引 |
key_len | 实际使用到的索引长度 |
ref | 当使用索引列等值查询时,与索引列进行等值匹配的对象信息 |
rows | 预估的需要读取的记录条数 |
filtered | 某个表经过搜索条件过滤后剩余记录条数的百分比 |
Extra | 一些额外的信息 |
创建一个表
CREATE TABLE single_table (id INT NOT NULL AUTO_INCREMENT,key1 VARCHAR(100),key2 INT,key3 VARCHAR(100),key_part1 VARCHAR(100),key_part2 VARCHAR(100),key_part3 VARCHAR(100),common_field VARCHAR(100),PRIMARY KEY (id),KEY idx_key1 (key1),UNIQUE KEY idx_key2 (key2),KEY idx_key3 (key3),KEY idx_key_part(key_part1, key_part2, key_part3)
) Engine=InnoDB CHARSET=utf8;
物化表的提出
对于不相关的IN
子查询,比如这样:
SELECT * FROM s1
WHERE key1 IN (SELECT common_field FROM s2 WHERE key3 = 'a');
我们最开始的感觉就是这种不相关的IN
子查询和不相关的标量子查询或者行子查询是一样一样的,都是把外层查询和子查询当作两个独立的单表查询来对待,可是很遗憾的是设计MySQL
的大叔为了优化IN
子查询倾注了太多心血(毕竟IN
子查询是我们日常生活中最常用的子查询类型),所以整个执行过程并不像我们想象的那么简单(>_<)。
其实说句老实话,对于不相关的IN
子查询来说,如果子查询的结果集中的记录条数很少,那么把子查询和外层查询分别看成两个单独的单表查询效率还是蛮高的,但是如果单独执行子查询后的结果集太多的话,就会导致这些问题:
-
结果集太多,可能内存中都放不下~
-
对于外层查询来说,如果子查询的结果集太多,那就意味着
IN
子句中的参数特别多,这就导致:-
无法有效的使用索引,只能对外层查询进行全表扫描。
-
在对外层查询执行全表扫描时,由于
IN
子句中的参数太多,这会导致检测一条记录是否符合和IN
子句中的参数匹配花费的时间太长。比如说
IN
子句中的参数只有两个:SELECT * FROM tbl_name WHERE column IN (a, b);
这样相当于需要对
tbl_name
表中的每条记录判断一下它的column
列是否符合column = a OR column = b
。在IN
子句中的参数比较少时这并不是什么问题,如果IN
子句中的参数比较多时,比如这样:SELECT * FROM tbl_name WHERE column IN (a, b, c ..., ...);
那么这样每条记录需要判断一下它的
column
列是否符合column = a OR column = b OR column = c OR ...
,这样性能耗费可就多了。
-
于是乎设计MySQL
的大叔想了一个招:不直接将不相关子查询的结果集当作外层查询的参数,而是将该结果集写入一个临时表里。写入临时表的过程是这样的:
- 该临时表的列就是子查询结果集中的列。
- 写入临时表的记录会被去重。
id
-- 一个查询可以对于多个表,但是SELECTid一样
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 = 'a';
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | s1 | NULL | ref | idx_key1 | idx_key1 | 303 | const | 8 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.03 sec)mysql> EXPLAIN SELECT * FROM s1 INNER JOIN s2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 9688 | 100.00 | NULL |
| 1 | SIMPLE | s2 | NULL | ALL | NULL | NULL | NULL | NULL | 9954 | 100.00 | Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------------+
2 rows in set, 1 warning (0.01 sec)-- 可以通过SELECTid判断是否进行连接
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 IN (SELECT key1 FROM s2) OR key3 = 'a';
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | PRIMARY | s1 | NULL | ALL | idx_key3 | NULL | NULL | NULL | 9688 | 100.00 | Using where |
| 2 | SUBQUERY | s2 | NULL | index | idx_key1 | idx_key1 | 303 | NULL | 9954 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.02 sec)
-- 通过相同id中不同表的出现顺序可以判断驱动表和被驱动表,如下s2为驱动表
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 IN (SELECT key3 FROM s2 WHERE common_field = 'a');
+----+-------------+-------+------------+------+---------------+----------+---------+-------------------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+----------+---------+-------------------+------+----------+------------------------------+
| 1 | SIMPLE | s2 | NULL | ALL | idx_key3 | NULL | NULL | NULL | 9954 | 10.00 | Using where; Start temporary |
| 1 | SIMPLE | s1 | NULL | ref | idx_key1 | idx_key1 | 303 | xiaohaizi.s2.key3 | 1 | 100.00 | End temporary |
+----+-------------+-------+------------+------+---------------+----------+---------+-------------------+------+----------+------------------------------+
2 rows in set, 1 warning (0.00 sec)
select_type
通过上边的内容我们知道,一条大的查询语句里边可以包含若干个SELECT
关键字,每个SELECT
关键字代表着一个小的查询语句,而每个SELECT
关键字的FROM
子句中都可以包含若干张表(这些表用来做连接查询),每一张表都对应着执行计划输出中的一条记录,对于在同一个SELECT
关键字中的表来说,它们的id
值是相同的。
设计MySQL
的大叔为每一个SELECT
关键字代表的小查询都定义了一个称之为select_type
的属性,意思是我们只要知道了某个小查询的select_type
属性,就知道了这个小查询在整个大查询中扮演了一个什么角色,口说无凭,我们还是先来见识见识这个select_type
都能取哪些值(为了精确起见,我们直接使用文档中的英文做简要描述,随后会进行详细解释的)
名称 | 描述 |
---|---|
SIMPLE | Simple SELECT (not using UNION or subqueries) |
PRIMARY | Outermost SELECT |
UNION | Second or later SELECT statement in a UNION |
UNION RESULT | Result of a UNION |
SUBQUERY | First SELECT in subquery |
DEPENDENT SUBQUERY | First SELECT in subquery, dependent on outer query |
DEPENDENT UNION | Second or later SELECT statement in a UNION, dependent on outer query |
DERIVED | Derived table |
MATERIALIZED | Materialized subquery |
UNCACHEABLE SUBQUERY | A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query |
UNCACHEABLE UNION | The second or later select in a UNION that belongs to an uncacheable subquery (see UNCACHEABLE SUBQUERY) |
SIMPLE
查询语句中不包含UNION或者子查询的查询都算作是SIMPLE类型
mysql> EXPLAIN SELECT * FROM s1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 9688 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)mysql> EXPLAIN SELECT * FROM s1 INNER JOIN s2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 9688 | 100.00 | NULL |
| 1 | SIMPLE | s2 | NULL | ALL | NULL | NULL | NULL | NULL | 9954 | 100.00 | Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+---------------------------------------+
2 rows in set, 1 warning (0.01 sec)
PRIMARY
对于包含UNION
、UNION ALL
或者子查询
的大查询来说,它是由几个小查询组成的,其中最左边的那个查询的select_type
值就是PRIMARY
mysql> EXPLAIN SELECT * FROM s1 UNION SELECT * FROM s2;
+----+--------------+------------+------------+------+---------------+------+---------+------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+------------+------+---------------+------+---------+------+------+----------+-----------------+
| 1 | PRIMARY | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 9688 | 100.00 | NULL |
| 2 | UNION | s2 | NULL | ALL | NULL | NULL | NULL | NULL | 9954 | 100.00 | NULL |
| NULL | UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+------------+------+---------------+------+---------+------+------+----------+-----------------+
3 rows in set, 1 warning (0.00 sec)
UNION
对于包含UNION
或者UNION ALL
的大查询来说,它是由几个小查询组成的,其中除了最左边的那个小查询以外,其余的小查询的select_type
值就是UNION
,可以对比上一个例子的效果。
UNION RESULT
MySQL
选择使用临时表来完成UNION
查询的去重工作,针对该临时表的查询的select_type
就是UNION RESULT
,例子上边有,就不赘述了。
SUBQUERY
如果包含子查询的查询语句不能够转为对应的semi-join
的形式,并且该子查询是不相关子查询,并且查询优化器决定采用将该子查询物化的方案来执行该子查询时,该子查询的第一个SELECT
关键字代表的那个查询的select_type
就是SUBQUERY
,比如下边这个查询:
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 IN (SELECT key1 FROM s2) OR key3 = 'a';
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | PRIMARY | s1 | NULL | ALL | idx_key3 | NULL | NULL | NULL | 9688 | 100.00 | Using where |
| 2 | SUBQUERY | s2 | NULL | index | idx_key1 | idx_key1 | 303 | NULL | 9954 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
可以看到,外层查询的select_type
就是PRIMARY
,子查询的select_type
就是SUBQUERY
。需要大家注意的是,由于select_type为SUBQUERY的子查询会被物化,所以只需要执行一遍。
DEPENDENT SUBQUERY
如果包含子查询的查询语句不能够转为对应的semi-join
的形式,并且该子查询是相关子查询,则该子查询的第一个SELECT
关键字代表的那个查询的select_type
就是DEPENDENT SUBQUERY
,比如下边这个查询:
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 IN (SELECT key1 FROM s2 WHERE s1.key2 = s2.key2) OR key3 = 'a';
+----+--------------------+-------+------------+------+-------------------+----------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+------+-------------------+----------+---------+-------------------+------+----------+-------------+
| 1 | PRIMARY | s1 | NULL | ALL | idx_key3 | NULL | NULL | NULL | 9688 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | s2 | NULL | ref | idx_key2,idx_key1 | idx_key2 | 5 | xiaohaizi.s1.key2 | 1 | 10.00 | Using where |
+----+--------------------+-------+------------+------+-------------------+----------+---------+-------------------+------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)
需要大家注意的是,select_type为DEPENDENT SUBQUERY的查询可能会被执行多次。
DEPENDENT UNION
在包含UNION
或者UNION ALL
的大查询中,如果各个小查询都依赖于外层查询的话,那除了最左边的那个小查询之外,其余的小查询的select_type
的值就是DEPENDENT UNION
DERIVED
对于采用物化的方式执行的包含派生表的查询,该派生表对应的子查询的select_type
就是DERIVED
,比方说下边这个查询:
-- derived_s1为派生表
mysql> EXPLAIN SELECT * FROM (SELECT key1, count(*) as c FROM s1 GROUP BY key1) AS derived_s1 where c > 1;
+----+-------------+------------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 9688 | 33.33 | Using where |
| 2 | DERIVED | s1 | NULL | index | idx_key1 | idx_key1 | 303 | NULL | 9688 | 100.00 | Using index |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
MATERIALIZED
当查询优化器在执行包含子查询的语句时,选择将子查询物化之后与外层查询进行连接查询时,该子查询对应的select_type
属性就是MATERIALIZED
,比如下边这个查询:
-- 表s2物化成<subquery2>与s1进行连接
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 IN (SELECT key1 FROM s2);
+----+--------------+-------------+------------+--------+---------------+------------+---------+-------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+---------------+------------+---------+-------------------+------+----------+-------------+
| 1 | SIMPLE | s1 | NULL | ALL | idx_key1 | NULL | NULL | NULL | 9688 | 100.00 | Using where |
| 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_key> | <auto_key> | 303 | xiaohaizi.s1.key1 | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | s2 | NULL | index | idx_key1 | idx_key1 | 303 | NULL | 9954 | 100.00 | Using index |
+----+--------------+-------------+------------+--------+---------------+------------+---------+-------------------+------+----------+-------------+
3 rows in set, 1 warning (0.01 sec)
type
可以看到type
列的值表示进行单表访问的访问方法,完整的访问方法如下:system
,const
,eq_ref
,ref
,fulltext
,ref_or_null
,index_merge
,unique_subquery
,index_subquery
,range
,index
,ALL
-- system 当表中只有一条记录并且该表使用的存储引擎的统计数据是精确的,比如MyISAM、Memory-- const 这个我们前边唠叨过,就是当我们根据主键或者唯一二级索引列与常数进行等值匹配时,对单表的访问方法就是const
SELECT * FROM s1 WHERE id = 5;-- eq_ref
-- 在连接查询时,如果被驱动表是通过主键或者唯一二级索引列等值匹配的方式进行访问的(如果该主键或者唯一二级索引是联合索引的话,所有的索引列都必须进行等值比较),则对该被驱动表的访问方法就是eq_ref(我感觉是应该叫eq_const的)
SELECT * FROM s1 INNER JOIN s2 ON s1.id = s2.id;-- ref_or_null
-- 当对普通二级索引进行等值匹配查询,该索引列的值也可以是NULL值时,那么对该表的访问方法就可能是ref_or_null
SELECT * FROM s1 WHERE key1 = 'a' OR key1 IS NULL;-- index_merge
-- 在某些场景下可以使用Intersection、Union、Sort-Union这三种索引合并的方式来执行查询
SELECT * FROM s1 WHERE key1 = 'a' OR key3 = 'a';-- unique_subquery
-- 类似于两表连接中被驱动表的eq_ref访问方法,unique_subquery是针对在一些包含IN子查询的查询语句中,如果查询优化器决定将IN子查询转换为EXISTS子查询,而且子查询可以使用到主键进行等值匹配的话,那么该子查询执行计划的type列的值就是unique_subquery
SELECT * FROM s1 WHERE key2 IN (SELECT id FROM s2 where s1.key1 = s2.key1) OR key3 = 'a';
-- 转化为
SELECT * FROM s1 WHERE EXISTS (SELECT id FROM s2 where s1.key1 = s2.key1 AND s1.key2 = s2.id) OR key3 = 'a';-- index_subquery
-- index_subquery与unique_subquery类似,只不过访问子查询中的表时使用的是普通的索引
SELECT * FROM s1 WHERE common_field IN (SELECT key3 FROM s2 where s1.key1 = s2.key1) OR key3 = 'a';-- range
-- 如果使用索引获取某些范围区间的记录,那么就可能使用到range访问方法,比如下边的
SELECT * FROM s1 WHERE key1 IN ('a', 'b', 'c');
ref
ref
列展示的就是与索引列作等值匹配的东东是个啥,比如只是一个常数或者是某个列。大家看下边这个查询
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 = 'a';
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | s1 | NULL | ref | idx_key1 | idx_key1 | 303 | const | 8 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.01 sec)mysql> EXPLAIN SELECT * FROM s1 INNER JOIN s2 ON s1.id = s2.id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+-------+
| 1 | SIMPLE | s1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 9688 | 100.00 | NULL |
| 1 | SIMPLE | s2 | NULL | eq_ref | PRIMARY | PRIMARY | 4 | s1.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+---------+---------+-----------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)mysql> EXPLAIN SELECT * FROM s1 INNER JOIN s2 ON s2.key1 = UPPER(s1.key1);
+----+-------------+-------+------------+------+---------------+----------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+----------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | s1 | NULL | ALL | NULL | NULL | NULL | NULL | 9688 | 100.00 | NULL |
| 1 | SIMPLE | s2 | NULL | ref | idx_key1 | idx_key1 | 303 | func | 1 | 100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+----------+---------+------+------+----------+-----------------------+
2 rows in set, 1 warning (0.00 sec)
rows
筛选后预计扫描的索引记录行数
-- 满足key1 > 'z'这个条件的记录只有266条
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 > 'z';
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | s1 | NULL | range | idx_key1 | idx_key1 | 303 | NULL | 266 | 100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
filtered
最终符合条件数/筛选后预计扫描的索引记录行数
-- common_field = 'a' 占 key1 > 'z' 中的10%
mysql> EXPLAIN SELECT * FROM s1 WHERE key1 > 'z' AND common_field = 'a';
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------------------------+
| 1 | SIMPLE | s1 | NULL | range | idx_key1 | idx_key1 | 303 | NULL | 266 | 10.00 | Using index condition; Using where |
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------------------------+
1 row in set, 1 warning (0.00 sec)