Background
Today an order-timeout alarm suddenly came in, and it gave me a cold sweat. Yet everything looked healthy: system logs normal, database logs normal, load between 1 and 3, around 200 session connections, no sign of pressure. So I generated an ASH report. It felt much slower than usual, but fortunately the report completed.
- Alert log output while generating the ASH report
Tue Jan 14 11:49:26 2025
Active Session History (ASH) performed an emergency flush. This may mean that ASH is undersized. If emergency flushes are a recurring issue, you may consider increasing ASH size by setting the value of _ASH_SIZE to a sufficiently large value. Currently, ASH size is 50331648 bytes. Both ASH size and the total number of emergency flushes since instance startup can be monitored by running the following query:
select total_size, awr_flush_emergency_count from v$ash_info;
ASH Report Analysis
- Unusual wait events: direct path read, reliable message, enq: KO - fast object checkpoint
How "direct path read" and "enq: KO - fast object checkpoint" are related
In 11g, when a TABLE FULL SCAN or a parallel query reads an entire segment, the serial direct read feature introduced in 11g applies (_adaptive_direct_read defaults to TRUE): a table whose size reaches the _small_table_threshold is treated as a large table, and its blocks are not read into the SGA. Instead they are read via direct path read, which loads data from disk directly into each session's PGA. Because the data bypasses the SGA, before reading such a table Oracle must trigger an object-level checkpoint on all database nodes (in a RAC environment), writing that table's dirty buffers to disk to guarantee read consistency.
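As an aside, if serial direct path reads themselves need to be suppressed during an emergency, a commonly cited session-level switch is the one below. Note that this is an undocumented parameter, so test it carefully and confirm with Oracle Support before touching production:

```sql
-- _serial_direct_read is an undocumented parameter; common values are
-- AUTO (the default), ALWAYS, and NEVER. NEVER forces serial full scans
-- back through the buffer cache, avoiding the object-level checkpoint.
ALTER SESSION SET "_serial_direct_read" = NEVER;
```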
- Hidden parameters involved
SQL> col KSPPINM for a50
SQL> col KSPPSTVL for a100
SQL> select ksppinm, ksppstvl from x$ksppi pi, x$ksppcv cv where cv.indx = pi.indx and pi.ksppinm like '_adaptive_direct_read';
KSPPINM                                            KSPPSTVL
-------------------------------------------------- -----------------------------------------------
_adaptive_direct_read                              TRUE

SQL> select ksppinm, ksppstvl from x$ksppi pi, x$ksppcv cv where cv.indx = pi.indx and pi.ksppinm like '_small_table_threshold';

KSPPINM                                            KSPPSTVL
-------------------------------------------------- -----------------------------------------------
_small_table_threshold                             74400   --> unit: blocks
Reliable Message
- The MOS note <Document 1951729.1.pdf> explains reliable message as follows:
When a process sends a message using the 'KSR' intra-instance broadcast service, the message publisher waits on this wait-event until all subscribers have consumed the 'reliable message' just sent. The publisher waits on this wait-event for up to one second and then re-tests if all subscribers have consumed the message, or until posted. If the message is not fully consumed the wait recurs, repeating until either the message is consumed or until the waiter is interrupted.
This is a non-idle wait event, used to track many different kinds of channel communication inside the Oracle database. Whenever it appears, it means some process's channel communication is blocked. The impact is mainly at the process level, not instance-wide.
The event covers a variety of channels, and each channel is handled differently. Querying the gv$channel_waits view for the worst-affected channel showed that "kxfp control signal channel" was far ahead of every other channel.
SELECT CHANNEL, SUM(wait_count) sum_wait_count
  FROM GV$CHANNEL_WAITS
 GROUP BY CHANNEL
 ORDER BY SUM(wait_count) DESC;
The full-scanning SQL that triggered these events:
- Check the execution plan
SQL> select * from table(dbms_xplan.display_awr('f9h1zk5z96gqv'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID f9h1zk5z96gqv
--------------------
...SQL text hidden
Plan hash value: 312768246
---------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 337K(100)| | | |
| 1 | NESTED LOOPS OUTER | | 1 | 135 | 337K (1)| 01:07:26 | | |
| 2 | PARTITION RANGE ALL | | 1 | 83 | 337K (1)| 01:07:26 | 1 |1048575|
| 3 | PARTITION HASH ALL | | 1 | 83 | 337K (1)| 01:07:26 | 1 | 32 |
| 4 | TABLE ACCESS FULL | ORDER_******* | 1 | 83 | 337K (1)| 01:07:26 | 1 |1048575|
| 5 | TABLE ACCESS BY INDEX ROWID| ************_INFO | 1 | 52 | 1 (0)| 00:00:01 | | |
| 6 | INDEX UNIQUE SCAN | PK_********* | 1 | | 0 (0)| | | |
--------------------------------------------------------------------------------------------------
21 rows selected.
- Table size: BLOCKS = 1264937 > 74400, so a full scan of this table is bound to be a "direct path read"
SQL> select table_name,blocks from user_tables where table_name ='ORDER_*******';
TABLE_NAME BLOCKS
------------------------------ ----------
ORDER_******* 1264937
Summary
After confirming with the developers: a customer was batch-processing orders, which drove excessive concurrency of SQL "f9h1zk5z96gqv", each execution full-scanning roughly 20 GB. That produced the direct path read wait event. Since direct path read loads data from disk into each session's PGA rather than into the SGA, before reading these tables an object-level checkpoint must be triggered on all database nodes (in a RAC environment) to write the tables' dirty buffers to disk and guarantee read consistency.
The sessions then wait for the checkpoint to complete; in effect, CKPT/DBWR block the read (the read into the PGA). That is when the enq: KO - fast object checkpoint wait event appears. Once it does, even a simple query can become very slow.
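While an incident like this is in flight, the affected sessions can be located directly. A minimal diagnostic sketch (run as a DBA user):

```sql
-- Sessions currently stuck on the three wait events from the ASH report;
-- on RAC, query gv$session instead to cover all instances.
SELECT sid, event, sql_id, seconds_in_wait
  FROM v$session
 WHERE event IN ('direct path read',
                 'reliable message',
                 'enq: KO - fast object checkpoint');
```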
- Solution: add an index on the table's query predicates.
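As a sketch only: the real table and column names are masked above, so ORDER_T below stands in for the masked ORDER_******* table, and STATUS/CREATE_TIME are purely hypothetical filter columns; substitute the actual predicate columns of SQL "f9h1zk5z96gqv". The goal is an index matching the query's filter so the optimizer replaces the full scan with an index range scan:

```sql
-- ORDER_T is a placeholder for the masked partitioned table;
-- STATUS and CREATE_TIME are hypothetical predicate columns.
-- LOCAL: partition-aligned index; ONLINE: avoid blocking DML during the build.
CREATE INDEX idx_order_status_time
    ON ORDER_T (STATUS, CREATE_TIME)
    LOCAL ONLINE;
```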
----- Experience grows by writing it down
Support and corrections in the comments are welcome.