PostgreSQL VACUUM in Depth (Part 2)

AUTOVACUUM

Introduction to AUTOVACUUM

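PostgreSQL provides the AUTOVACUUM mechanism.

autovacuum automatically runs not only VACUUM but also ANALYZE, which collects the statistics the planner uses to build execution plans.

In postgresql.conf, the autovacuum parameter is on by default.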

|  | autovacuum = on |

With autovacuum enabled, there is an autovacuum launcher process:

|  | $ ps -ef|grep postgres|grep autovacuum|grep -v grep |
|  | postgres 28398 28392 0 Nov13 ? 00:00:19 postgres: autovacuum launcher  |

pg_stat_activity also shows an entry whose backend_type is autovacuum launcher:

|  | psql -d alvindb -U postgres |
|  | alvindb=# \x |
|  | Expanded display is on. |
|  | alvindb=# SELECT * FROM pg\_stat\_activity WHERE backend\_type = 'autovacuum launcher'; |
|  | -[ RECORD 1 ]----+------------------------------ |
|  | datid |  |
|  | datname |  |
|  | pid | 28398 |
|  | usesysid |  |
|  | usename |  |
|  | application\_name |  |
|  | client\_addr |  |
|  | client\_hostname |  |
|  | client\_port |  |
|  | backend\_start | 2021-11-13 23:18:00.406618+08 |
|  | xact\_start |  |
|  | query\_start |  |
|  | state\_change |  |
|  | wait\_event\_type | Activity |
|  | wait\_event | AutoVacuumMain |
|  | state |  |
|  | backend\_xid |  |
|  | backend\_xmin |  |
|  | query |  |
|  | backend\_type | autovacuum launcher |

So how often does AUTOVACUUM run?

Every autovacuum_naptime, the autovacuum launcher starts an autovacuum worker, which checks whether any table needs autovacuum.

|  | psql -d alvindb -U postgres |
|  | alvindb=# SELECT * FROM pg\_stat\_activity WHERE backend\_type = 'autovacuum worker'; |
|  | -[ RECORD 1 ]----+------------------------------ |
|  | datid | 13220 |
|  | datname | postgres |
|  | pid | 32457 |
|  | usesysid | |
|  | usename | |
|  | application\_name | |
|  | client\_addr | |
|  | client\_hostname | |
|  | client\_port | |
|  | backend\_start | 2021-11-06 23:32:53.880281+08 |
|  | xact\_start | |
|  | query\_start | |
|  | state\_change | |
|  | wait\_event\_type | |
|  | wait\_event | |
|  | state | |
|  | backend\_xid | |
|  | backend\_xmin | |
|  | query | |
|  | backend\_type | autovacuum worker |

autovacuum_naptime defaults to 1min:

|  | #autovacuum\_naptime = 1min # time between autovacuum runs |

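By what criteria does autovacuum decide whether to run VACUUM and ANALYZE?

When an autovacuum worker finds that the number of dead tuples in a table exceeds the vacuum threshold, it runs VACUUM automatically.

The vacuum threshold is computed as follows: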

|  | vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuples |

When the number of rows inserted, updated, or deleted exceeds the analyze threshold, it runs ANALYZE automatically.

The analyze threshold is computed as follows:

|  | analyze threshold = analyze base threshold + analyze scale factor * number of tuples |

The corresponding parameters in postgresql.conf are:

|  | #autovacuum\_vacuum\_threshold = 50 # min number of row updates before vacuum |
|  | #autovacuum\_analyze\_threshold = 50 # min number of row updates before analyze |
|  | #autovacuum\_vacuum\_scale\_factor = 0.2 # fraction of table size before vacuum |
|  | #autovacuum\_analyze\_scale\_factor = 0.1 # fraction of table size before analyze |
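
To make the formula concrete, here is a minimal Python sketch that evaluates it with the default values listed above. The function name `autovacuum_threshold` is an illustrative assumption, not something PostgreSQL provides.

```python
def autovacuum_threshold(reltuples, base_threshold=50, scale_factor=0.2):
    """threshold = base threshold + scale factor * number of tuples.

    The defaults mirror autovacuum_vacuum_threshold = 50 and
    autovacuum_vacuum_scale_factor = 0.2 from postgresql.conf.
    """
    return base_threshold + scale_factor * reltuples


# Vacuum thresholds (defaults 50 and 0.2) for a few table sizes.
for rows in (0, 1_000, 1_000_000):
    print(rows, autovacuum_threshold(rows))

# Analyze threshold for 1,000 rows with the analyze defaults (50 and 0.1).
print(autovacuum_threshold(1_000, base_threshold=50, scale_factor=0.1))
```

With the defaults, a 1,000-row table needs more than 250 dead tuples before VACUUM runs, and more than 150 modified rows before ANALYZE runs.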

The dead tuples count is pg_stat_user_tables.n_dead_tup (estimated number of dead rows):

|  | alvindb=> SELECT * FROM pg\_stat\_user\_tables WHERE schemaname = 'alvin' AND relname = 'tb\_test\_vacuum'; |
|  | -[ RECORD 1 ]-------+--------------- |
|  | relid | 37409 |
|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | seq\_scan | 2 |
|  | seq\_tup\_read | 0 |
|  | idx\_scan | 0 |
|  | idx\_tup\_fetch | 0 |
|  | n\_tup\_ins | 0 |
|  | n\_tup\_upd | 0 |
|  | n\_tup\_del | 0 |
|  | n\_tup\_hot\_upd | 0 |
|  | n\_live\_tup | 0 |
|  | n\_dead\_tup | 0 |
|  | n\_mod\_since\_analyze | 0 |
|  | last\_vacuum | |
|  | last\_autovacuum | |
|  | last\_analyze | |
|  | last\_autoanalyze | |
|  | vacuum\_count | 0 |
|  | autovacuum\_count | 0 |
|  | analyze\_count | 0 |
|  | autoanalyze\_count | 0 |

Which column supplies the number of tuples? Is it pg_stat_user_tables.n_live_tup (estimated number of live rows), or the actual count(*) value?

It is actually pg_class.reltuples (the number of live rows in the table, an estimate used by the planner).

|  | alvindb=> SELECT u.schemaname,u.relname,c.reltuples,u.n\_live\_tup,u.n\_mod\_since\_analyze,u.n\_dead\_tup,u.last\_autoanalyze,u.last\_autovacuum |
|  | FROM |
|  |  pg\_stat\_user\_tables u, pg\_class c, pg\_namespace n |
|  | WHERE n.oid = c.relnamespace |
|  | AND c.relname = u.relname |
|  | AND n.nspname = u.schemaname |
|  | AND u.schemaname = 'alvin' |
|  | AND u.relname = 'tb\_test\_vacuum'; |
|  | -[ RECORD 1 ]-------+--------------- |
|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | reltuples | 0 |
|  | n\_live\_tup | 0 |
|  | n\_mod\_since\_analyze | 0 |
|  | n\_dead\_tup | 0 |
|  | last\_autoanalyze | |
|  | last\_autovacuum | |

So the concrete condition for AUTO VACUUM is:

|  | pg\_stat\_user\_tables.n\_dead\_tup > autovacuum\_vacuum\_threshold + autovacuum\_vacuum\_scale\_factor * pg\_class.reltuples |

Likewise, the concrete condition for AUTO ANALYZE is:

|  | pg\_stat\_user\_tables.n\_mod\_since\_analyze > autovacuum\_analyze\_threshold + autovacuum\_analyze\_scale\_factor * pg\_class.reltuples |
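
The two conditions above can be sketched as Python predicates. This is only an illustration of the comparison, assuming the default settings; the `TableStats` class and the function names are hypothetical, and the field names follow the catalog columns in the formulas.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    reltuples: float          # pg_class.reltuples
    n_dead_tup: int           # pg_stat_user_tables.n_dead_tup
    n_mod_since_analyze: int  # pg_stat_user_tables.n_mod_since_analyze


def needs_autovacuum(s, threshold=50, scale_factor=0.2):
    # Strictly greater than: reaching the threshold exactly is not enough.
    return s.n_dead_tup > threshold + scale_factor * s.reltuples


def needs_autoanalyze(s, threshold=50, scale_factor=0.1):
    return s.n_mod_since_analyze > threshold + scale_factor * s.reltuples


stats = TableStats(reltuples=1000, n_dead_tup=250, n_mod_since_analyze=151)
print(needs_autovacuum(stats))   # 250 > 250 is false
print(needs_autoanalyze(stats))  # 151 > 150 is true
```

The strict comparison is what makes the boundary tests later in this article work: the counter has to go one past the threshold.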

Triggering AUTOVACUUM Precisely

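Let's test autovacuum in practice. To make testing easier, autovacuum_naptime is temporarily set to 5s, so that once the trigger condition is met the effect shows up after about 5s rather than 1min.

The parameters are changed as follows: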

|  | autovacuum\_naptime = 5s |
|  | autovacuum\_vacuum\_threshold = 100 # min number of row updates before vacuum |
|  | autovacuum\_analyze\_threshold = 100 # min number of row updates before analyze |
|  | autovacuum\_vacuum\_scale\_factor = 0.2 # fraction of table size before vacuum |
|  | autovacuum\_analyze\_scale\_factor = 0.1 # fraction of table size before analyze |

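Next, we trigger autovacuum precisely, step by step.

To make testing easier, the following AUTOVACUUM calculation SQL works out how many rows still need to be deleted or modified.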

|  | alvindb=> WITH v AS ( |
|  | SELECT * FROM |
|  |  (SELECT setting AS autovacuum\_vacuum\_scale\_factor FROM pg\_settings WHERE name = 'autovacuum\_vacuum\_scale\_factor') vsf, |
|  |  (SELECT setting AS autovacuum\_vacuum\_threshold FROM pg\_settings WHERE name = 'autovacuum\_vacuum\_threshold') vth, |
|  |  (SELECT setting AS autovacuum\_analyze\_scale\_factor FROM pg\_settings WHERE name = 'autovacuum\_analyze\_scale\_factor') asf, |
|  |  (SELECT setting AS autovacuum\_analyze\_threshold FROM pg\_settings WHERE name = 'autovacuum\_analyze\_threshold') ath |
|  | ), |
|  | t AS ( |
|  | SELECT |
|  |  c.reltuples,u.* |
|  | FROM |
|  |  pg\_stat\_user\_tables u, pg\_class c, pg\_namespace n |
|  | WHERE n.oid = c.relnamespace |
|  | AND c.relname = u.relname |
|  | AND n.nspname = u.schemaname |
|  | AND u.schemaname = 'alvin' |
|  | AND u.relname = 'tb\_test\_vacuum' |
|  | ) |
|  | SELECT |
|  |  schemaname, |
|  |  relname, |
|  |  autovacuum\_vacuum\_scale\_factor, |
|  |  autovacuum\_vacuum\_threshold, |
|  |  autovacuum\_analyze\_scale\_factor, |
|  |  autovacuum\_analyze\_threshold, |
|  |  n\_live\_tup, |
|  |  reltuples, |
|  |  autovacuum\_analyze\_trigger, |
|  |  n\_mod\_since\_analyze, |
|  |  autovacuum\_analyze\_trigger - n\_mod\_since\_analyze AS rows\_to\_mod\_before\_auto\_analyze, |
|  |  last\_autoanalyze, |
|  |  autovacuum\_vacuum\_trigger, |
|  |  n\_dead\_tup, |
|  |  autovacuum\_vacuum\_trigger - n\_dead\_tup AS rows\_to\_delete\_before\_auto\_vacuum, |
|  |  last\_autovacuum |
|  | FROM ( |
|  | SELECT |
|  |  schemaname, |
|  |  relname, |
|  |  autovacuum\_vacuum\_scale\_factor, |
|  |  autovacuum\_vacuum\_threshold, |
|  |  autovacuum\_analyze\_scale\_factor, |
|  |  autovacuum\_analyze\_threshold, |
|  | floor(autovacuum\_analyze\_scale\_factor::numeric * reltuples) + 1 + autovacuum\_analyze\_threshold::int AS autovacuum\_analyze\_trigger, |
|  | floor(autovacuum\_vacuum\_scale\_factor::numeric * reltuples) + 1 + autovacuum\_vacuum\_threshold::int AS autovacuum\_vacuum\_trigger, |
|  |  reltuples, |
|  |  n\_live\_tup, |
|  |  n\_dead\_tup, |
|  |  n\_mod\_since\_analyze, |
|  |  last\_autoanalyze, |
|  |  last\_autovacuum |
|  | FROM |
|  |  v, |
|  |  t) a; |
|  | -[ RECORD 1 ]---------------------+--------------- |
|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 0 |
|  | reltuples | 0 |
|  | autovacuum\_analyze\_trigger | 101 |
|  | n\_mod\_since\_analyze | 0 |
|  | rows\_to\_mod\_before\_auto\_analyze | 101 |
|  | last\_autoanalyze | |
|  | autovacuum\_vacuum\_trigger | 101 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 101 |
|  | last\_autovacuum | |

According to the formula,

|  | pg\_stat\_user\_tables.n\_mod\_since\_analyze > 100 + 0.1 * 0 |

that is, once more than 100 rows have been modified (at the 101st row), AUTO ANALYZE is triggered.
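
The trigger columns in the query above use `floor(scale_factor * reltuples) + 1 + threshold`, the smallest counter value that exceeds the threshold. A minimal Python sketch of the same arithmetic follows; the function names are illustrative assumptions, and `Decimal` stands in for the `::numeric` cast.

```python
from decimal import Decimal
from math import floor


def trigger_point(reltuples, base_threshold, scale_factor):
    """Smallest counter value that exceeds the threshold.

    Mirrors floor(scale_factor * reltuples) + 1 + base_threshold from the
    query above. Decimal plays the role of the ::numeric cast.
    """
    scaled = Decimal(str(scale_factor)) * Decimal(reltuples)
    return floor(scaled) + 1 + base_threshold


def rows_remaining(counter, reltuples, base_threshold, scale_factor):
    """How many more rows must change before the trigger point is reached."""
    return trigger_point(reltuples, base_threshold, scale_factor) - counter


# Empty table, settings from this test: threshold 100, analyze scale factor 0.1.
print(trigger_point(0, 100, 0.1))        # 101
print(rows_remaining(100, 0, 100, 0.1))  # 1
```

The same function reproduces the trigger values that appear in the later outputs, for example 111 and 121 once reltuples is 101.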

First insert 100 rows:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:45:57.669183+08 |
|  | (1 row) |
|  | alvindb=> INSERT INTO tb\_test\_vacuum(test\_num) SELECT gid FROM generate\_series(1,100,1) gid; |
|  | INSERT 0 100 |

At this point, the calculation below shows that modifying 1 more row will trigger AUTO ANALYZE.

|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 100 |
|  | reltuples | 0 |
|  | autovacuum\_analyze\_trigger | 101 |
|  | n\_mod\_since\_analyze | 100 |
|  | rows\_to\_mod\_before\_auto\_analyze | 1 |
|  | last\_autoanalyze |  |
|  | autovacuum\_vacuum\_trigger | 101 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 101 |
|  | last\_autovacuum |  |

At this point, the statistics are still empty:

|  | alvindb=> SELECT * FROM pg\_stats WHERE schemaname = 'alvin' AND tablename = 'tb\_test\_vacuum'; |
|  | (0 rows) |

Now insert the last row:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:46:31.034422+08 |
|  | (1 row) |
|  | alvindb=> INSERT INTO tb\_test\_vacuum(test\_num) SELECT gid FROM generate\_series(101,101,1) gid; |
|  | INSERT 0 1 |

Running the AUTOVACUUM calculation SQL shows that AUTO ANALYZE has been triggered:

|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 101 |
|  | reltuples | 101 |
|  | autovacuum\_analyze\_trigger | 111 |
|  | n\_mod\_since\_analyze | 0 |
|  | rows\_to\_mod\_before\_auto\_analyze | 111 |
|  | last\_autoanalyze | 2021-11-06 20:46:39.88796+08 |
|  | autovacuum\_vacuum\_trigger | 121 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 121 |
|  | last\_autovacuum | |

The statistics for table tb_test_vacuum have now been updated:

|  | alvindb=> SELECT * FROM pg\_stats WHERE schemaname = 'alvin' AND tablename = 'tb\_test\_vacuum'; |

The PostgreSQL log shows:

|  | [ 2021-11-06 20:46:39.887 CST 6816 6186792f.1aa0 1 3/173948 13179359]LOG: automatic analyze of table "alvindb.alvin.tb\_test\_vacuum" system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s |

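Whether AUTOVACUUM activity is recorded in the PostgreSQL log is controlled by the parameter log_autovacuum_min_duration, which is disabled (-1) by default in the version used here.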

|  | #log\_autovacuum\_min\_duration = -1 # -1 disables, 0 logs all actions and |
|  | # their durations, > 0 logs only |
|  | # actions running at least this number |
|  | # of milliseconds. |

Set this parameter to 0 to log every AUTOVACUUM action:

|  | log\_autovacuum\_min\_duration = 0 |
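
The meaning of the three value ranges described in the configuration comment can be summarized in a few lines of Python. This is a simplified model of the documented behavior, not PostgreSQL code, and `is_logged` is a hypothetical name.

```python
def is_logged(duration_ms, log_autovacuum_min_duration):
    """Return True if an autovacuum action of this duration is logged.

    -1 disables logging, 0 logs every action, and a positive value logs
    only actions that ran for at least that many milliseconds.
    """
    if log_autovacuum_min_duration < 0:
        return False
    return duration_ms >= log_autovacuum_min_duration


print(is_logged(0, -1))       # disabled: nothing is logged
print(is_logged(0, 0))        # 0: every action is logged
print(is_logged(500, 1000))   # shorter than the minimum: not logged
```

For a test like this one, where each run finishes in well under a millisecond, only the value 0 makes the actions visible in the log.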

The AUTOVACUUM calculation SQL shows that modifying 111 more rows will trigger AUTO ANALYZE:

|  | rows\_to\_mod\_before\_auto\_analyze | 111 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 121 |

First modify 110 rows (insert 10 and update 100), then sleep for 6s:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------ |
|  | 2021-11-06 20:47:30.75553+08 |
|  | (1 row) |
|  | alvindb=> INSERT INTO tb\_test\_vacuum(test\_num) SELECT gid FROM generate\_series(102,111,1) gid; |
|  | INSERT 0 10 |
|  | alvindb=> UPDATE tb\_test\_vacuum SET test\_num = test\_num WHERE test\_num <= 100; |
|  | UPDATE 100 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:47:43.465651+08 |
|  | (1 row) |

The AUTOVACUUM calculation SQL shows that after modifying 110 rows and sleeping for 6s (autovacuum_naptime was set to 5s earlier), AUTO ANALYZE has not been triggered:

|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 111 |
|  | reltuples | 101 |
|  | autovacuum\_analyze\_trigger | 111 |
|  | n\_mod\_since\_analyze | 110 |
|  | rows\_to\_mod\_before\_auto\_analyze | 1 |
|  | last\_autoanalyze | 2021-11-06 20:46:39.88796+08 |
|  | autovacuum\_vacuum\_trigger | 121 |
|  | n\_dead\_tup | 100 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 21 |
|  | last\_autovacuum | |
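
The counter values in this output follow from how each statement type affects the statistics. The sketch below is a deliberately simplified Python model of committed statements only (it ignores rollbacks and the fact that the real values are estimates); the `Counters` class is a hypothetical illustration, not a PostgreSQL structure.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    n_live_tup: int = 0
    n_dead_tup: int = 0
    n_mod_since_analyze: int = 0

    def insert(self, rows):
        self.n_live_tup += rows
        self.n_mod_since_analyze += rows

    def update(self, rows):
        # An update leaves the old row version behind as a dead tuple.
        self.n_dead_tup += rows
        self.n_mod_since_analyze += rows

    def delete(self, rows):
        self.n_live_tup -= rows
        self.n_dead_tup += rows
        self.n_mod_since_analyze += rows


# State right after the auto analyze: 101 live rows, nothing modified since.
c = Counters(n_live_tup=101)
c.insert(10)    # rows 102..111
c.update(100)   # rows with test_num <= 100
print(c)
```

This prints `Counters(n_live_tup=111, n_dead_tup=100, n_mod_since_analyze=110)`, matching the n_live_tup, n_dead_tup, and n_mod_since_analyze values shown above.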

Modifying 1 more row should trigger AUTO ANALYZE. Delete one row:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:47:55.746411+08 |
|  | (1 row) |
|  | alvindb=> DELETE FROM tb\_test\_vacuum WHERE test\_id = 111; |
|  | DELETE 1 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:48:01.796389+08 |
|  | (1 row) |

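The last_autoanalyze value in the AUTOVACUUM calculation SQL result shows that AUTO ANALYZE was triggered precisely.

In addition, rows_to_delete_before_auto_vacuum shows that AUTO VACUUM is expected to trigger after 22 more rows are deleted.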

|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 110 |
|  | reltuples | 110 |
|  | autovacuum\_analyze\_trigger | 112 |
|  | n\_mod\_since\_analyze | 0 |
|  | rows\_to\_mod\_before\_auto\_analyze | 112 |
|  | last\_autoanalyze | 2021-11-06 20:48:04.928899+08 |
|  | autovacuum\_vacuum\_trigger | 123 |
|  | n\_dead\_tup | 101 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 22 |
|  | last\_autovacuum | |

First create 21 more dead tuples by updating 21 rows (an UPDATE is effectively a DELETE plus an INSERT):

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:48:32.313706+08 |
|  | (1 row) |
|  |  |
|  | alvindb=> UPDATE tb\_test\_vacuum SET test\_num = test\_num WHERE test\_num <= 21; |
|  | UPDATE 21 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:48:38.454997+08 |
|  | (1 row) |

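The last_autovacuum value in the AUTOVACUUM calculation SQL result shows that AUTO VACUUM has not been triggered yet.

In addition, rows_to_delete_before_auto_vacuum shows that AUTO VACUUM is expected to trigger after 1 more row is deleted.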

|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 110 |
|  | reltuples | 110 |
|  | autovacuum\_analyze\_trigger | 112 |
|  | n\_mod\_since\_analyze | 21 |
|  | rows\_to\_mod\_before\_auto\_analyze | 91 |
|  | last\_autoanalyze | 2021-11-06 20:48:04.928899+08 |
|  | autovacuum\_vacuum\_trigger | 123 |
|  | n\_dead\_tup | 122 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 1 |
|  | last\_autovacuum | |

Now delete one row:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:48:39.174009+08 |
|  | (1 row) |
|  |  |
|  | alvindb=> DELETE FROM tb\_test\_vacuum WHERE test\_id = 110; |
|  | DELETE 1 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 20:48:45.213537+08 |
|  | (1 row) |

The last_autovacuum value in the AUTOVACUUM calculation SQL result shows that AUTO VACUUM was triggered precisely!

|  | schemaname | alvin |
|  | relname | tb\_test\_vacuum |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 100 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 100 |
|  | n\_live\_tup | 109 |
|  | reltuples | 109 |
|  | autovacuum\_analyze\_trigger | 111 |
|  | n\_mod\_since\_analyze | 22 |
|  | rows\_to\_mod\_before\_auto\_analyze | 89 |
|  | last\_autoanalyze | 2021-11-06 20:48:04.928899+08 |
|  | autovacuum\_vacuum\_trigger | 122 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 122 |
|  | last\_autovacuum | 2021-11-06 20:48:49.914345+08 |

The PostgreSQL log shows:

|  | [ 2021-11-06 20:48:49.914 CST 7207 618679b1.1c27 1 3/174162 0]LOG: automatic vacuum of table "alvindb.alvin.tb\_test\_vacuum": index scans: 1 |
|  | pages: 0 removed, 1 remain, 0 skipped due to pins, 0 skipped frozen |
|  | tuples: 123 removed, 109 remain, 0 are dead but not yet removable, oldest xmin: 13179371 |
|  | buffer usage: 59 hits, 4 misses, 4 dirtied |
|  | avg read rate: 121.832 MB/s, avg write rate: 121.832 MB/s |
|  | system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s |

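This raises a question: is an autovacuum_vacuum_scale_factor of 0.2 suitable for every table? A table with 100 million rows needs more than 20 million dead tuples before AUTO VACUUM is triggered, which means the larger the table, the harder it is to trigger AUTO VACUUM. How can this be solved?

Triggering Table-Level AUTOVACUUM Precisely

A suitable autovacuum_vacuum_scale_factor can be set on individual tables as needed. For large tables, use a smaller autovacuum_vacuum_scale_factor, such as 0.1.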
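
One way to pick a table-level value is to work backwards from the number of dead tuples you are willing to accumulate. The Python sketch below shows that arithmetic under the default base threshold of 50; both function names and the 1-million target are illustrative assumptions, not recommendations.

```python
def dead_tuples_to_trigger(reltuples, base_threshold=50, scale_factor=0.2):
    """Vacuum threshold for a table with the given row estimate."""
    return base_threshold + scale_factor * reltuples


def scale_factor_for(target_dead_tuples, reltuples, base_threshold=50):
    """Scale factor at which vacuum triggers near target_dead_tuples."""
    if reltuples <= 0:
        raise ValueError("reltuples must be positive")
    return max(0.0, (target_dead_tuples - base_threshold) / reltuples)


rows = 100_000_000
print(dead_tuples_to_trigger(rows))                    # default 0.2
print(dead_tuples_to_trigger(rows, scale_factor=0.1))  # the 0.1 suggested above
print(scale_factor_for(1_000_000, rows))               # aim for about 1M dead tuples
```

For the 100-million-row table, the default leaves a threshold of 20,000,050 dead tuples, 0.1 halves that to 10,000,050, and a target of about one million dead tuples corresponds to a scale factor of roughly 0.01.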

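The following steps set table-level parameters and precisely trigger table-level AUTO ANALYZE and AUTO VACUUM.

This test uses a larger data set. Since creating tables and inserting data by hand is tedious, it uses pgbench, a tool that ships with PostgreSQL.

Use pgbench to create a test table with 100,000 rows: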

|  | $ pgbench -i alvindb |
|  | dropping old tables... |
|  | creating tables... |
|  | generating data... |
|  | 100000 of 100000 tuples (100%) done (elapsed 0.38 s, remaining 0.00 s) |
|  | vacuuming... |
|  | creating primary keys... |
|  | done. |

Set the table-level parameters:

|  | alvindb=> ALTER TABLE pgbench\_accounts SET (autovacuum\_vacuum\_scale\_factor = 0.1, autovacuum\_vacuum\_threshold = 2000); |
|  | ALTER TABLE |
|  | alvindb=> ALTER TABLE pgbench\_accounts SET (autovacuum\_analyze\_scale\_factor = 0.05, autovacuum\_analyze\_threshold = 2000); |
|  | ALTER TABLE |

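According to the earlier AUTOVACUUM calculation SQL, which reads the global settings, 11001 rows would have to be modified to trigger AUTO ANALYZE, and about 21001 dead tuples would be needed to trigger AUTO VACUUM. (The output below reflects global thresholds of 1000 rather than the 100 set earlier.)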

|  | schemaname | public |
|  | relname | pgbench\_accounts |
|  | autovacuum\_vacuum\_scale\_factor | 0.2 |
|  | autovacuum\_vacuum\_threshold | 1000 |
|  | autovacuum\_analyze\_scale\_factor | 0.1 |
|  | autovacuum\_analyze\_threshold | 1000 |
|  | n\_live\_tup | 100000 |
|  | reltuples | 100000 |
|  | autovacuum\_analyze\_trigger | 11001 |
|  | n\_mod\_since\_analyze | 0 |
|  | rows\_to\_mod\_before\_auto\_analyze | 11001 |
|  | last\_autoanalyze | |
|  | autovacuum\_vacuum\_trigger | 21001 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_auto\_vacuum | 21001 |
|  | last\_autovacuum | |

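With the table-level parameters in place, the table-level AUTOVACUUM calculation SQL below shows that modifying 7001 rows triggers AUTO ANALYZE, and about 12001 dead tuples trigger AUTO VACUUM. More importantly, table-level AUTOVACUUM parameters do not affect other tables: they apply only to the tables on which they are set, tables of different sizes can be given different values, and they can be adjusted at any time!

Table-level AUTOVACUUM calculation SQL: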

|  | alvindb=> WITH v AS ( |
|  | SELECT (SELECT split\_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum\_vacuum\_scale\_factor=' ) as autovacuum\_vacuum\_scale\_factor, |
|  |  (SELECT split\_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum\_vacuum\_threshold=' ) as autovacuum\_vacuum\_threshold, |
|  |  (SELECT split\_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum\_analyze\_scale\_factor=' ) as autovacuum\_analyze\_scale\_factor, |
|  |  (SELECT split\_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum\_analyze\_threshold=' ) as autovacuum\_analyze\_threshold |
|  | FROM pg\_class c |
|  | LEFT JOIN pg\_namespace n ON n.oid = c.relnamespace |
|  | WHERE n.nspname IN ('public') |
|  | AND c.relname = 'pgbench\_accounts' |
|  | ), |
|  | t AS ( |
|  | SELECT |
|  |  c.reltuples,u.* |
|  | FROM |
|  |  pg\_stat\_user\_tables u, pg\_class c, pg\_namespace n |
|  | WHERE n.oid = c.relnamespace |
|  | AND c.relname = u.relname |
|  | AND n.nspname = u.schemaname |
|  | AND u.schemaname = 'public' |
|  | AND u.relname = 'pgbench\_accounts' |
|  | ) |
|  | SELECT |
|  |  schemaname, |
|  |  relname, |
|  |  autovacuum\_vacuum\_scale\_factor, |
|  |  autovacuum\_vacuum\_threshold, |
|  |  autovacuum\_analyze\_scale\_factor, |
|  |  autovacuum\_analyze\_threshold, |
|  |  n\_live\_tup, |
|  |  reltuples, |
|  |  autovacuum\_analyze\_trigger, |
|  |  n\_mod\_since\_analyze, |
|  |  autovacuum\_analyze\_trigger - n\_mod\_since\_analyze AS rows\_to\_mod\_before\_analyze, |
|  |  last\_autoanalyze, |
|  |  autovacuum\_vacuum\_trigger, |
|  |  n\_dead\_tup, |
|  |  autovacuum\_vacuum\_trigger - n\_dead\_tup AS rows\_to\_delete\_before\_vacuum, |
|  |  last\_autovacuum |
|  | FROM ( |
|  | SELECT |
|  |  schemaname, |
|  |  relname, |
|  |  autovacuum\_vacuum\_scale\_factor, |
|  |  autovacuum\_vacuum\_threshold, |
|  |  autovacuum\_analyze\_scale\_factor, |
|  |  autovacuum\_analyze\_threshold, |
|  | floor(autovacuum\_analyze\_scale\_factor::numeric * reltuples) + 1 + autovacuum\_analyze\_threshold::int AS autovacuum\_analyze\_trigger, |
|  | floor(autovacuum\_vacuum\_scale\_factor::numeric * reltuples) + 1 + autovacuum\_vacuum\_threshold::int AS autovacuum\_vacuum\_trigger, |
|  |  reltuples, |
|  |  n\_live\_tup, |
|  |  n\_dead\_tup, |
|  |  n\_mod\_since\_analyze, |
|  |  last\_autoanalyze, |
|  |  last\_autovacuum |
|  | FROM |
|  |  v, |
|  |  t) a; |
|  | -[ RECORD 1 ]-------------------+----------------- |
|  | schemaname | public |
|  | relname | pgbench\_accounts |
|  | autovacuum\_vacuum\_scale\_factor | 0.1 |
|  | autovacuum\_vacuum\_threshold | 2000 |
|  | autovacuum\_analyze\_scale\_factor | 0.05 |
|  | autovacuum\_analyze\_threshold | 2000 |
|  | n\_live\_tup | 100000 |
|  | reltuples | 100000 |
|  | autovacuum\_analyze\_trigger | 7001 |
|  | n\_mod\_since\_analyze | 0 |
|  | rows\_to\_mod\_before\_analyze | 7001 |
|  | last\_autoanalyze | |
|  | autovacuum\_vacuum\_trigger | 12001 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_vacuum | 12001 |
|  | last\_autovacuum | |
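
Note that the query above reads only pg_class.reloptions, so an option that is not set on the table comes back as NULL, whereas PostgreSQL itself uses the global value in that case. A minimal Python sketch of that fallback follows; `effective_settings` is a hypothetical helper, and the global values are the ones visible in the earlier output.

```python
def effective_settings(reloptions, global_settings):
    """Overlay per-table storage parameters on top of the global values.

    reloptions is a list such as ['autovacuum_vacuum_threshold=2000'],
    the shape of pg_class.reloptions; None means no table-level settings.
    """
    merged = dict(global_settings)
    for option in reloptions or []:
        name, _, value = option.partition("=")
        if name in merged:
            merged[name] = float(value)
    return merged


global_settings = {
    "autovacuum_vacuum_scale_factor": 0.2,
    "autovacuum_vacuum_threshold": 1000,
    "autovacuum_analyze_scale_factor": 0.1,
    "autovacuum_analyze_threshold": 1000,
}
reloptions = [
    "autovacuum_vacuum_scale_factor=0.1",
    "autovacuum_vacuum_threshold=2000",
    "autovacuum_analyze_scale_factor=0.05",
    "autovacuum_analyze_threshold=2000",
]
print(effective_settings(reloptions, global_settings))
print(effective_settings(None, global_settings))  # falls back to the globals
```

The same idea could be expressed in SQL by wrapping each subquery in COALESCE with the matching pg_settings value.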

With the required row counts predicted, the next steps trigger table-level AUTO ANALYZE and AUTO VACUUM.

First delete 7000 rows:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:03.252622+08 |
|  | (1 row) |
|  | alvindb=> DELETE FROM pgbench\_accounts WHERE aid<=7000; |
|  | DELETE 7000 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:09.363536+08 |
|  | (1 row) |

The rows_to_mod_before_analyze value in the table-level AUTOVACUUM calculation SQL result shows that modifying 1 more row will trigger AUTO ANALYZE:

|  | schemaname | public |
|  | relname | pgbench\_accounts |
|  | autovacuum\_vacuum\_scale\_factor | 0.1 |
|  | autovacuum\_vacuum\_threshold | 2000 |
|  | autovacuum\_analyze\_scale\_factor | 0.05 |
|  | autovacuum\_analyze\_threshold | 2000 |
|  | n\_live\_tup | 93000 |
|  | reltuples | 100000 |
|  | autovacuum\_analyze\_trigger | 7001 |
|  | n\_mod\_since\_analyze | 7000 |
|  | rows\_to\_mod\_before\_analyze | 1 |
|  | last\_autoanalyze | |
|  | autovacuum\_vacuum\_trigger | 12001 |
|  | n\_dead\_tup | 7000 |
|  | rows\_to\_delete\_before\_vacuum | 5001 |
|  | last\_autovacuum | |

Modify 1 more row:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:30.649717+08 |
|  | (1 row) |
|  | alvindb=> UPDATE pgbench\_accounts SET bid = bid WHERE aid=7001; |
|  | UPDATE 1 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:36.705928+08 |
|  | (1 row) |

The last_autoanalyze value in the table-level AUTOVACUUM calculation SQL result shows that AUTO ANALYZE was triggered precisely!

|  | schemaname | public |
|  | relname | pgbench\_accounts |
|  | autovacuum\_vacuum\_scale\_factor | 0.1 |
|  | autovacuum\_vacuum\_threshold | 2000 |
|  | autovacuum\_analyze\_scale\_factor | 0.05 |
|  | autovacuum\_analyze\_threshold | 2000 |
|  | n\_live\_tup | 93000 |
|  | reltuples | 93000 |
|  | autovacuum\_analyze\_trigger | 6651 |
|  | n\_mod\_since\_analyze | 0 |
|  | rows\_to\_mod\_before\_analyze | 6651 |
|  | last\_autoanalyze | 2021-11-06 23:33:40.87317+08 |
|  | autovacuum\_vacuum\_trigger | 11301 |
|  | n\_dead\_tup | 7001 |
|  | rows\_to\_delete\_before\_vacuum | 4300 |
|  | last\_autovacuum | |

The PostgreSQL log also shows that AUTO ANALYZE was triggered:

|  | [ 2021-11-06 23:33:40.873 CST 32646 6186a054.7f86 1 6/1393 13179750]LOG: automatic analyze of table "alvindb.public.pgbench\_accounts" system usage: CPU: user: 0.04 s, system: 0.03 s, elapsed: 0.11 s |

In addition, rows_to_delete_before_vacuum shows that 4300 more dead tuples will trigger AUTO VACUUM.

First create 4299 dead tuples (by updating 4299 rows) to test the boundary:

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:43.867176+08 |
|  | (1 row) |
|  | alvindb=> UPDATE pgbench\_accounts SET bid = bid WHERE aid>=95702; |
|  | UPDATE 4299 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:50.016447+08 |
|  | (1 row) |

Even with autovacuum_naptime at 5s, AUTO VACUUM has not been triggered at this point:

|  | schemaname | public |
|  | relname | pgbench\_accounts |
|  | autovacuum\_vacuum\_scale\_factor | 0.1 |
|  | autovacuum\_vacuum\_threshold | 2000 |
|  | autovacuum\_analyze\_scale\_factor | 0.05 |
|  | autovacuum\_analyze\_threshold | 2000 |
|  | n\_live\_tup | 93000 |
|  | reltuples | 93000 |
|  | autovacuum\_analyze\_trigger | 6651 |
|  | n\_mod\_since\_analyze | 4299 |
|  | rows\_to\_mod\_before\_analyze | 2352 |
|  | last\_autoanalyze | 2021-11-06 23:33:40.87317+08 |
|  | autovacuum\_vacuum\_trigger | 11301 |
|  | n\_dead\_tup | 11300 |
|  | rows\_to\_delete\_before\_vacuum | 1 |
|  | last\_autovacuum | |

Create 1 more dead tuple by updating 1 row (an UPDATE is effectively a DELETE plus an INSERT):

|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:53.326483+08 |
|  | (1 row) |
|  | alvindb=> UPDATE pgbench\_accounts SET bid = bid WHERE aid=7002; |
|  | UPDATE 1 |
|  | alvindb=> SELECT pg\_sleep(6); |
|  |  pg\_sleep  |
|  | ---------- |
|  |  |
|  | (1 row) |
|  | alvindb=> SELECT clock\_timestamp(); |
|  |  clock\_timestamp  |
|  | ------------------------------- |
|  | 2021-11-06 23:33:59.439375+08 |
|  | (1 row) |

The last_autovacuum value in the result below shows that AUTO VACUUM has now been triggered precisely!

|  | schemaname | public |
|  | relname | pgbench\_accounts |
|  | autovacuum\_vacuum\_scale\_factor | 0.1 |
|  | autovacuum\_vacuum\_threshold | 2000 |
|  | autovacuum\_analyze\_scale\_factor | 0.05 |
|  | autovacuum\_analyze\_threshold | 2000 |
|  | n\_live\_tup | 93000 |
|  | reltuples | 93000 |
|  | autovacuum\_analyze\_trigger | 6651 |
|  | n\_mod\_since\_analyze | 4300 |
|  | rows\_to\_mod\_before\_analyze | 2351 |
|  | last\_autoanalyze | 2021-11-06 23:33:40.87317+08 |
|  | autovacuum\_vacuum\_trigger | 11301 |
|  | n\_dead\_tup | 0 |
|  | rows\_to\_delete\_before\_vacuum | 11301 |
|  | last\_autovacuum | 2021-11-06 23:34:00.956936+08 |

The PostgreSQL log also shows that AUTO VACUUM was triggered:

|  | [ 2021-11-06 23:34:00.956 CST 32710 6186a068.7fc6 1 6/1455 0]LOG: automatic vacuum of table "alvindb.public.pgbench\_accounts": index scans: 1 |
|  | pages: 0 removed, 421 remain, 0 skipped due to pins, 0 skipped frozen |
|  | tuples: 2 removed, 93000 remain, 0 are dead but not yet removable, oldest xmin: 13179755 |
|  | buffer usage: 967 hits, 60 misses, 7 dirtied |
|  | avg read rate: 10.067 MB/s, avg write rate: 1.174 MB/s |
|  | system usage: CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.18 s |
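
When many such entries accumulate, it can help to pull the key figures out of the log programmatically. The following Python sketch parses the entry format shown in this article; the regular expressions and the `summarize` helper are assumptions written for this format only, and the log layout may differ in other PostgreSQL versions.

```python
import re

TUPLES_RE = re.compile(r"tuples: (\d+) removed, (\d+) remain")
TABLE_RE = re.compile(r'automatic (vacuum|analyze) of table "([^"]+)"')


def summarize(log_text):
    """Extract the action, table name and tuple counts from a log entry."""
    summary = {}
    table = TABLE_RE.search(log_text)
    if table:
        summary["action"], summary["table"] = table.groups()
    tuples = TUPLES_RE.search(log_text)
    if tuples:
        summary["removed"], summary["remain"] = map(int, tuples.groups())
    return summary


entry = '''LOG: automatic vacuum of table "alvindb.public.pgbench_accounts": index scans: 1
pages: 0 removed, 421 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 2 removed, 93000 remain, 0 are dead but not yet removable, oldest xmin: 13179755'''
print(summarize(entry))
```

For an analyze entry, which has no tuples line, the result contains only the action and the table name.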
