MySQL Deep Pagination

Contents

1. Deep Pagination
2. Test Data
3. Pagination Strategies
- 3.1 Default Pagination
- 3.2 Index Pagination
- 3.3 Subquery
- 3.4 Deferred Join

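1. Deep Pagination

Deep pagination in MySQL refers to paging through a large table when the requested page sits far from the start of the result set, so the query needs a large offset to reach the target rows.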

2. Test Data

-- Test table
drop table if exists wide_table;
create table wide_table
(
    id          bigint auto_increment primary key,
    a           varchar(255),
    b           varchar(255),
    c           varchar(255),
    d           varchar(255),
    e           varchar(255),
    f           varchar(255),
    g           varchar(255),
    h           varchar(255),
    i           varchar(255),
    create_time datetime default current_timestamp
);

-- Insert 100,000 rows
drop procedure if exists insert_data;

delimiter //
create procedure insert_data()
begin
    declare i int default 1;
    while i <= 100000
        do
            -- In the values list, i is the loop variable, not column i
            insert into wide_table (a, b, c, d, e, f, g, h, i)
            values (i, i, i, i, i, i, i, i, i);
            set i = i + 1;
            -- Report progress every 10,000 rows
            if i % 10000 = 0 then
                select i;
            end if;
        end while;
end //
delimiter ;

call insert_data();
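
As a quick sanity check after the load, a query along these lines can confirm the row count and id range. It is only a sketch, and it assumes the table was freshly created by the script above, so that ids run from 1 to 100000 without gaps.

```sql
-- On a freshly created and loaded table this should report
-- 100000 rows with ids ranging from 1 to 100000.
select count(*) as total_rows,
       min(id)  as first_id,
       max(id)  as last_id
from wide_table;
```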

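3. Pagination Strategies

3.1 Default Pagination

Default pagination uses limit #{offset}, #{pageSize} or limit #{pageSize} offset #{offset}. The two forms are equivalent, and here both result in a full table scan: MySQL reads rows from the start of the table, discards the first offset rows, and returns only the next pageSize rows. The work therefore grows with the offset, which is why deep pages are slow.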

test> select *
      from wide_table
      limit 80000, 100
[2024-01-18 14:55:42] 100 rows retrieved starting from 1 in 335 ms (execution: 298 ms, fetching: 37 ms)
test> select *
      from wide_table
      limit 100 offset 80000
[2024-01-18 14:55:43] 100 rows retrieved starting from 1 in 282 ms (execution: 233 ms, fetching: 49 ms)

explain
select *
from wide_table
limit 80000, 100;
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----+
# |id|select_type|table     |partitions|type|possible_keys|key |key_len|ref |rows |filtered|Extra|
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----+
# |1 |SIMPLE     |wide_table|null      |ALL |null         |null|null   |null|99551|100     |null |
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----+

explain
select *
from wide_table
limit 100 offset 80000;
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----+
# |id|select_type|table     |partitions|type|possible_keys|key |key_len|ref |rows |filtered|Extra|
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----+
# |1 |SIMPLE     |wide_table|null      |ALL |null         |null|null   |null|99551|100     |null |
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----+
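
One way to observe the discarded rows directly is to compare the session's handler counters before and after the query. The following is a minimal sketch, assuming the account is allowed to run flush status (it normally requires the RELOAD privilege); the exact counter values depend on the server version and storage engine.

```sql
-- Reset the session status counters.
flush status;

-- Run the deep page query.
select *
from wide_table
limit 80000, 100;

-- Inspect how many rows the storage engine was asked to read.
-- For a full table scan, Handler_read_rnd_next should be on the order of
-- offset + pageSize (about 80100 here), not just the 100 rows returned.
show session status like 'Handler_read%';
```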

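3.2 Index Pagination

Index pagination pages through the table with a range or comparison condition on an indexed column (usually the primary key). Because the condition lets MySQL seek to the start of the page through the index instead of scanning and discarding the preceding rows, it can shorten the query time considerably.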

test> select *
      from wide_table
      where id between 80000 and 80100
[2024-01-18 15:02:27] 101 rows retrieved starting from 1 in 224 ms (execution: 184 ms, fetching: 40 ms)
test> select *
      from wide_table
      where id > 80000
      limit 100
[2024-01-18 14:58:34] 100 rows retrieved starting from 1 in 218 ms (execution: 185 ms, fetching: 33 ms)

explain
select *
from wide_table
where id between 80000 and 80100;
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
# |id|select_type|table     |partitions|type |possible_keys|key    |key_len|ref |rows|filtered|Extra      |
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+----+--------+-----------+
# |1 |SIMPLE     |wide_table|null      |range|PRIMARY      |PRIMARY|8      |null|101 |100     |Using where|
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+----+--------+-----------+

explain
select *
from wide_table
where id > 80000
limit 100;
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+-----+--------+-----------+
# |id|select_type|table     |partitions|type |possible_keys|key    |key_len|ref |rows |filtered|Extra      |
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+-----+--------+-----------+
# |1 |SIMPLE     |wide_table|null      |range|PRIMARY      |PRIMARY|8      |null|39420|100     |Using where|
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+-----+--------+-----------+
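
In application code, this kind of pagination is usually driven by the last id seen on the previous page rather than by a page number. A minimal sketch follows; @last_id is an illustrative session variable, not something defined in the article, and the explicit order by is added so that the page order does not depend on the chosen plan.

```sql
-- @last_id holds the largest id returned by the previous page
-- (use 0 before the first page). The name is illustrative.
set @last_id = 80000;

select *
from wide_table
where id > @last_id
order by id
limit 100;

-- After reading the page, the caller stores the id of its last row
-- (80100 here, assuming ids without gaps) and repeats the query.
```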

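3.3 Subquery

The limitation of index pagination is that the table needs an auto-increment primary key, so that a page boundary can be derived from the id; if the primary key is a UUID, this approach does not work. The code also has to account for gaps in the id sequence (for example, from deleted rows), which can lead to short pages or repeated reads.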
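
The effect of gaps can be reproduced on the test table without changing it permanently. The sketch below assumes the table uses a transactional engine such as InnoDB and that ids initially run from 1 to 100000 without gaps; the deleted id range is arbitrary.

```sql
start transaction;

-- Create a gap of 10 ids inside the page.
delete from wide_table
where id between 80010 and 80019;

-- The fixed id range now holds 91 rows instead of 101.
select count(*) as rows_in_range
from wide_table
where id between 80000 and 80100;

-- A comparison plus limit still fills the page with 100 rows.
select count(*) as rows_in_page
from (select id
      from wide_table
      where id > 80000
      order by id
      limit 100) as page;

-- Undo the delete so the test data stays intact.
rollback;
```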

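Pagination can also be optimized with a subquery. First, default pagination is used to locate the primary key at the page boundary; this step can use the primary key index and reads only the primary key column, so it is relatively efficient. Then that primary key is used in a condition to fetch the page itself.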

test> select *
      from wide_table
      where id >= (select id from wide_table limit 80000, 1)
      limit 100
[2024-01-18 15:03:34] 100 rows retrieved starting from 1 in 236 ms (execution: 202 ms, fetching: 34 ms)

explain
select *
from wide_table
where id >= (select id from wide_table limit 80000, 1)
limit 100;
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+-----+--------+-----------+
# |id|select_type|table     |partitions|type |possible_keys|key    |key_len|ref |rows |filtered|Extra      |
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+-----+--------+-----------+
# |1 |PRIMARY    |wide_table|null      |range|PRIMARY      |PRIMARY|8      |null|39420|100     |Using where|
# |2 |SUBQUERY   |wide_table|null      |index|null         |PRIMARY|8      |null|99551|100     |Using index|
# +--+-----------+----------+----------+-----+-------------+-------+-------+----+-----+--------+-----------+
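
The queries above rely on MySQL happening to return rows in primary key order, which is not guaranteed without an order by. A hedged variant that states the order explicitly in both the subquery and the outer query might look like this:

```sql
select *
from wide_table
where id >= (select id
             from wide_table
             order by id       -- boundary is the 80001st id in key order
             limit 80000, 1)
order by id                    -- return the page in the same order
limit 100;
```

With the test data from section 2, this should return the same 100 rows as the original query, since both the subquery and the outer query already walk the primary key.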

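3.4 Deferred Join

A deferred join improves query efficiency mainly by reducing the number of lookups back to the full table rows.

Consider the execution plan shown below:

- The third row of the plan has the largest id and therefore runs first. It is an index scan on wide_table that walks only the index without looking up full rows (in effect, a covering index). It yields the primary keys of the 100 rows starting at row 80001 and stores them in the derived table wt.
- The first row of the plan performs a full scan of wt to read that set of primary keys.
- The second row of the plan takes those 100 primary keys and uses an equality join (the deferred join) to look up the full rows, producing the final page.

Throughout the query, only the final 100 rows require a lookup of the full row; for all other rows only the id is read before they are discarded, which makes the query more efficient.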

test> select *
      from wide_table
      inner join (select id from wide_table limit 80000, 100) as wt
      on wide_table.id = wt.id
[2024-01-18 15:03:36] 100 rows retrieved starting from 1 in 241 ms (execution: 207 ms, fetching: 34 ms)

explain
select *
from wide_table
inner join (select id from wide_table limit 80000, 100) as wt
on wide_table.id = wt.id;
# +--+-----------+----------+----------+------+-------------+-------+-------+-----+-----+--------+-----------+
# |id|select_type|table     |partitions|type  |possible_keys|key    |key_len|ref  |rows |filtered|Extra      |
# +--+-----------+----------+----------+------+-------------+-------+-------+-----+-----+--------+-----------+
# |1 |PRIMARY    |<derived2>|null      |ALL   |null         |null   |null   |null |80100|100     |null       |
# |1 |PRIMARY    |wide_table|null      |eq_ref|PRIMARY      |PRIMARY|8      |wt.id|1    |100     |null       |
# |2 |DERIVED    |wide_table|null      |index |null         |PRIMARY|8      |null |99551|100     |Using index|
# +--+-----------+----------+----------+------+-------------+-------+-------+-----+-----+--------+-----------+
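
The same pattern is often applied when pages are ordered by a secondary column. The sketch below is an illustration under assumptions, not part of the original test: idx_create_time is a hypothetical index name, and the example relies on InnoDB secondary indexes also storing the primary key, so the inner query can be answered from the index alone.

```sql
-- Hypothetical secondary index on the sort column.
create index idx_create_time on wide_table (create_time);

select t.*
from wide_table as t
inner join (select id
            from wide_table
            order by create_time, id   -- id breaks ties between equal timestamps
            limit 80000, 100) as wt
on t.id = wt.id
order by t.create_time, t.id;          -- keep the page in the same order

-- Remove the index again so the earlier execution plans are unaffected.
drop index idx_create_time on wide_table;
```

Selecting t.* instead of * also avoids returning the id column twice, once from wide_table and once from wt.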