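In an earlier article, "latch free: SQL performance problems caused by cache buffer handles", I covered some background on the cache buffer handles latch. To recap:

"When a session needs to pin a buffer header, it must first obtain a buffer handle, and to obtain one it must first acquire the cache buffer handles latch. To avoid excessive contention on this latch, each session is allowed to cache a small number of buffer handles, known as the reserved set. The upper limit of the reserved set is controlled by the hidden parameter _db_handles_cached (default 5), so a session running SQL that is not very complex does not have to request the latch repeatedly.

Some very complex statements need to pin a large number of buffers at the same time. In that case, if free handles exist, the session is allowed to request handles beyond its 5 cached buffer handles. To limit the resources a single session can consume, Oracle uses an internal algorithm that tries to let every process pin a "reasonable share" of buffers. The total number of handles a single session can hold at one time therefore equals db_block_buffers/processes, and the hidden parameter _cursor_db_buffers_pinned shows this value. In addition, cache buffer handles has no child latches, only the parent latch itself. If the system runs many complex SQL statements and the buffer handles a session can cache fall far short of what is needed, this latch becomes a bottleneck for the whole system, and deadlock or starvation problems may even arise."

In response to a question from a reader on T.ASKMACLEAN.COM, here is more about cache buffer handles.

Structure of a cache buffer handle: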
------------------------------
| Buffer state object |
------------------------------
| Place to hang the buffer |
------------------------------
| Consistent Get? |
------------------------------
| Proc Owning SO |
------------------------------
| Flags(RIR) |
------------------------------
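The recap above describes two limits: a per-session reserved set and a per-session cap of db_block_buffers/processes. The following is a minimal Python sketch of that bookkeeping, not Oracle's actual algorithm. The class `SessionHandles`, its fields, and the helper `per_session_pin_limit` are hypothetical names introduced only for illustration, and the use of integer division is an assumption.

```python
from dataclasses import dataclass


def per_session_pin_limit(db_block_buffers: int, processes: int) -> int:
    """Per-session cap from the recap: db_block_buffers / processes.

    Integer division is an assumption of this sketch.
    """
    return db_block_buffers // processes


@dataclass
class SessionHandles:
    """Toy model of one session's buffer handles (hypothetical, illustrative only)."""

    cached_limit: int    # stands in for _db_handles_cached
    pin_limit: int       # stands in for db_block_buffers / processes
    in_use: int = 0      # handles currently held by the session
    latch_gets: int = 0  # times the shared latch would have been needed

    def pin(self) -> bool:
        """Take one handle; return False once the per-session cap is reached."""
        if self.in_use >= self.pin_limit:
            return False
        if self.in_use >= self.cached_limit:
            # Outside the reserved set: the cache buffer handles latch is needed.
            self.latch_gets += 1
        self.in_use += 1
        return True

    def release(self) -> None:
        """Give one handle back."""
        if self.in_use > 0:
            self.in_use -= 1


if __name__ == "__main__":
    session = SessionHandles(cached_limit=5,
                             pin_limit=per_session_pin_limit(1000, 100))
    granted = [session.pin() for _ in range(12)]
    print(granted.count(True), session.latch_gets)
```

In this model, a session that stays within its reserved set never touches the latch, while setting `cached_limit` to 0 makes every pin go through it.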
Here is a sample cache buffer handle:

SO: 70000046fdfe530, type: 24, owner: 70000041b018630, flag: INIT/-/-/0x00
(buffer) (CR) PR: 70000048e92d148 FLG: 0x500000
lock rls: 0, class bit: 0
kcbbfbp: [BH: 7000001c7f069b0, LINK: 70000046fdfe570]
where: kdswh02: kdsgrp, why: 0
BH (7000001c7f069b0) file#: 12 rdba: 0x03061612 (12/398866) class: 1 ba: 7000001c70ee000
set: 75 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0
dbwrid: 2 obj: 66209 objn: 48710 tsn: 6 afn: 12
hash: [700000485f12138,700000485f12138] lru: [70000025af67790,700000132f69ee0]
lru-flags: hot_buffer
ckptq: [NULL] fileq: [NULL] objq: [700000114f5dd10,70000028bf5d620]
use: [70000046fdfe570,70000046fdfe570] wait: [NULL]
st: SCURRENT md: SHR tch: 0
flags: affinity_lock
LRBA: [0x0.0.0] HSCN: [0xffff.ffffffff] HSUB: [65535]
where: kdswh02: kdsgrp, why: 0

# Example:
# (buffer) (CR) PR: 37290 FLG: 0
# kcbbfbp : [BH: befd8, LINK: 7836c] (WAITING)

Buffer handle (X$KCBBF): kernel cache, buffer buffer_handles
Query x$kcbbf: lists all the buffer handles

Related parameters:
_db_handles: System-wide simultaneous buffer operations (number of buffer handles)
_db_handles_cached: Buffer handles cached by each process (default 5)
_cursor_db_buffers_pinned: Additional number of buffers a cursor can pin at once
_session_kept_cursor_pins: Number of cursor pins to keep in a session

When a buffer is pinned, it is attached to a buffer state object.

Next, we look at how the cache buffer handles latch and buffer pins behave in practice.
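The sample handle above prints the block address twice, once as `rdba: 0x03061612` and once as `(12/398866)`. As a small sketch of how the two forms relate, the Python below splits an rdba into a relative file number and a block number. It assumes the usual smallfile layout of 10 bits for the file number and 22 bits for the block number; the function name `decode_rdba` is introduced here only for illustration.

```python
from typing import Tuple


def decode_rdba(rdba: int) -> Tuple[int, int]:
    """Split a relative data block address into (file#, block#).

    Assumes the smallfile tablespace layout: the upper 10 bits hold the
    relative file number and the lower 22 bits hold the block number.
    """
    file_no = rdba >> 22
    block_no = rdba & 0x3FFFFF  # mask for the low 22 bits
    return file_no, block_no


if __name__ == "__main__":
    # rdba taken from the sample cache buffer handle dump
    print(decode_rdba(0x03061612))
```

The same decoding applies to any `rdba:` value in a systemstate or block dump, which makes it easy to confirm which block a handle is pinning.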
SESSION A:

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE 10.2.0.5.0 Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

SQL> create table test_cbc_handle(t1 int);
Table created.

SQL> insert into test_cbc_handle values(1);
1 row created.

SQL> commit;
Commit complete.

SQL> select rowid from test_cbc_handle;

ROWID
------------------
AAANO6AABAAAQZSAAA

SQL> select * from test_cbc_handle where rowid='AAANO6AABAAAQZSAAA';

T1
----------
1

SQL> select addr,name from v$latch_parent where name='cache buffer handles';

ADDR NAME
---------------- --------------------------------------------------
00000000600140A8 cache buffer handles

SQL> select to_number('00000000600140A8','xxxxxxxxxxxxxxxxxxxx') from dual;

TO_NUMBER('00000000600140A8','XXXXXXXXXXXXXXXXXXXX')
----------------------------------------------------
1610694824

Note that cache buffer handles has only one parent latch and no child latches. We now make SESSION A hold this single cache buffer handles parent latch.
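The `to_number` query above exists only to turn the latch address from v$latch_parent into the decimal form that `oradebug call` expects. The same conversion can be sketched outside the database in Python; the helper names below are hypothetical and exist only for this example.

```python
def latch_addr_to_decimal(addr_hex: str) -> int:
    """Convert a hex latch address (as shown in the ADDR column) to decimal."""
    return int(addr_hex, 16)


def decimal_to_latch_addr(value: int, width: int = 16) -> str:
    """Convert back to a zero-padded, upper-case hex string of the given width."""
    return format(value, "0{}X".format(width))


if __name__ == "__main__":
    # Address of the cache buffer handles parent latch from the session above
    print(latch_addr_to_decimal("00000000600140A8"))
```

Converting in both directions is handy when matching the decimal value passed to oradebug against the hex address printed in v$latchholder or in a systemstate dump.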
Holding the latch uses the oradebug call kslgetl command; kslgetl is Oracle's internal function for getting a latch.

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug call kslgetl 1610694824 1;
Function returned 1

Open another session, SESSION B, to observe:

SQL> select * from v$latchholder;

PID SID LADDR NAME GETS
---------- ---------- ---------------- ---------------------------------------------------------------- ----------
15 141 00000000600140A8 cache buffer handles 119

The cache buffer handles latch is indeed held by session A, which blocks other processes from acquiring it.

SQL> select * from test_cbc_handle where rowid='AAANO6AABAAAQZSAAA';

T1
----------
1

At this point, however, other server processes can still read buffers normally. This is because of the hidden parameter "_db_handles_cached": by default each process caches 5 cache buffer handles.

Now set "_db_handles_cached"=0 so that a process no longer owns 5 cached cache buffer handles. A process that needs to pin a buffer must then hold the cache buffer handles latch in order to allocate a cache buffer handle.

SQL> alter system set "_db_handles_cached"=0 scope=spfile;
System altered.

Then restart the instance:

shutdown immediate;
startup;

session A:

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug call kslgetl 1610694824 1;
Function returned 1

session B:

select * from test_cbc_handle where rowid='AAANO6AABAAAQZSAAA';

Session B hangs. Why?

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug dump systemstate 266;
Statement processed.

SO: 0x11b30b7b0, type: 2, owner: (nil), flag: INIT/-/-/0x00
(process) Oracle pid=22, calls cur/top: (nil)/0x11b453c38, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=8
holding (efd=4) 600140a8 cache buffer handles level=3

SO: 0x11b305810, type: 2, owner: (nil), flag: INIT/-/-/0x00
(process) Oracle pid=10, calls cur/top: 0x11b455ac0/0x11b450a58, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 0
last post received-location: No post
last process to post me: none
last post sent: 0 0 0
last post sent-location: No post
last process posted by me: none
(latch info) wait_event=0 bits=2
Location from where call was made: kcbzgs:
waiting for 600140a8 cache buffer handles level=3

FBD93353:000019F0 10 162 10005 1 KSL WAIT BEG [latch: cache buffer handles] 1610694824/0x600140a8 125/0x7d 0/0x0
FF936584:00002761 10 144 10005 1 KSL WAIT BEG [latch: cache buffer handles] 1610694824/0x600140a8 125/0x7d 0/0x0

PID=22 is holding the cache buffer handles latch. PID=10 is waiting for the cache buffer handles latch, because "_db_handles_cached"=0 leaves the process with no cached cache buffer handles of its own.

No kcbbfbp cache buffer handle object can be found in this systemstate, because "_db_handles_cached"=0 and the cache buffer handles latch is held.

Next we release the cache buffer handles latch and look at the case where a buffer is pinned and not released.

session A: exit

session B:

SQL> select * from v$latchholder;
no rows selected

SQL> insert into test_cbc_handle values(2);
1 row created.

SQL> commit;
Commit complete.

SQL>
SQL> select t1,rowid from test_cbc_handle;

T1 ROWID
---------- ------------------
1 AAANPAAABAAAQZSAAA
2 AAANPAAABAAAQZSAAB

SQL> select spid,pid from v$process where addr = ( select paddr from v$session where sid=(select distinct sid from v$mystat));

SPID PID
------------ ----------
19251 10

Attach GDB to SPID=19251 and set a breakpoint on the kcbrls function. This prevents the process from releasing the buffer.

[oracle@vrh8 ~]$ gdb $ORACLE_HOME/bin/oracle 19251
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/bin/oracle...(no debugging symbols found)...done.
Attaching to program: /s01/oracle/product/10.2.0.5/db_1/bin/oracle, process 19251
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libskgxp10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libskgxp10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libhasgen10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libhasgen10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libskgxn2.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libskgxn2.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libocr10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libocr10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libocrb10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libocrb10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libocrutl10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libocrutl10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libjox10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libjox10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libclsra10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libclsra10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libdbcfg10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libdbcfg10.so
Reading symbols from /s01/oracle/product/10.2.0.5/db_1/lib/libnnz10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/oracle/product/10.2.0.5/db_1/lib/libnnz10.so
Reading symbols from /usr/lib64/libaio.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libaio.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x00000035c000d940 in __read_nocancel () from /lib64/libpthread.so.0
(gdb) break kcbrls
Breakpoint 1 at 0x10e5d24

session B:

select * from test_cbc_handle where rowid='AAANPAAABAAAQZSAAA';

The select hangs.

GDB:

(gdb) c
Continuing.

Breakpoint 1, 0x00000000010e5d24 in kcbrls ()
(gdb) bt
#0 0x00000000010e5d24 in kcbrls ()
#1 0x0000000002e87d25 in qertbFetchByUserRowID ()
#2 0x00000000030c62b8 in opifch2 ()
#3 0x00000000032327f0 in kpoal8 ()
#4 0x00000000013b7c10 in opiodr ()
#5 0x0000000003c3c9da in ttcpip ()
#6 0x00000000013b3144 in opitsk ()
#7 0x00000000013b60ec in opiino ()
#8 0x00000000013b7c10 in opiodr ()
#9 0x00000000013a92f8 in opidrv ()
#10 0x0000000001fa3936 in sou2o ()
#11 0x000000000072d40b in opimai_real ()
#12 0x000000000072d35c in main ()

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug dump systemstate 266;
Statement processed.

Now the kcbbfbp cache buffer handle can be seen linking the SO (state object) and the BH (buffer header) together:

----------------------------------------
SO: 0x11b452348, type: 3, owner: 0x11b305810, flag: INIT/-/-/0x00
(call) sess: cur 11b41bd18, rec 0, usr 11b41bd18; depth: 0
----------------------------------------
SO: 0x1182dc750, type: 24, owner: 0x11b452348, flag: INIT/-/-/0x00
(buffer) (CR) PR: 0x11b305810 FLG: 0x108000
class bit: (nil)
kcbbfbp: [BH: 0xf2fc69f8, LINK: 0x1182dc790]
where: kdswh05: kdsgrp, why: 0
BH (0xf2fc69f8) file#: 1 rdba: 0x00410652 (1/67154) class: 1 ba: 0xf297c000
set: 3 blksize: 8192 bsi: 0 set-flg: 2 pwbcnt: 272
dbwrid: 0 obj: 54208 objn: 54202 tsn: 0 afn: 1
hash: [f2fc47f8,1181f3038] lru: [f2fc6b88,f2fc6968]
obj-flags: object_ckpt_list
ckptq: [1182ecf38,1182ecf38] fileq: [1182ecf58,1182ecf58] objq: [108712a28,108712a28]
use: [1182dc790,1182dc790] wait: [NULL]
st: XCURRENT md: SHR tch: 12
flags: buffer_dirty gotten_in_current_mode block_written_once
redo_since_read
LRBA: [0xc7.73b.0] HSCN: [0x0.1cbe52] HSUB: [1]
Using State Objects
----------------------------------------
SO: 0x1182dc750, type: 24, owner: 0x11b452348, flag: INIT/-/-/0x00
(buffer) (CR) PR: 0x11b305810 FLG: 0x108000
class bit: (nil)
kcbbfbp: [BH: 0xf2fc69f8, LINK: 0x1182dc790]
where: kdswh05: kdsgrp, why: 0
buffer tsn: 0 rdba: 0x00410652 (1/67154)
scn: 0x0000.001cbe52 seq: 0x01 flg: 0x02 tail: 0xbe520601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data
tab 0, row 0, @0x1f9a
tl: 6 fb: --H-FL-- lb: 0x0 cc: 1
col 0: [ 2] c1 02
tab 0, row 1, @0x1f94
tl: 6 fb: --H-FL-- lb: 0x2 cc: 1
col 0: [ 2] c1 15
end_of_block_dump
(buffer) (CR) PR: 0x11b305810 FLG: 0x108000
st: XCURRENT md: SHR tch: 12

This buffer header has status XCURRENT and mode KCBMSHARE (KCBMSHR, current share).

The corresponding cache buffer handle can also be found through x$kcbbf:

SQL> select distinct KCBBPBH from x$kcbbf ;

KCBBPBH
----------------
00
00000000F2FC69F8 ==>0xf2fc69f8

SQL> select * from x$kcbbf where kcbbpbh='00000000F2FC69F8';

ADDR INDX INST_ID KCBBFSO_TYP KCBBFSO_FLG KCBBFSO_OWN
---------------- ---------- ---------- ----------- ----------- ----------------
KCBBFFLG KCBBFCR KCBBFCM KCBBFMBR KCBBPBH
---------- ---------- ---------- ---------------- ----------------
KCBBPBF X0KCBBPBH X0KCBBPBF X1KCBBPBH
---------------- ---------------- ---------------- ----------------
X1KCBBPBF KCBBFBH KCBBFWHR KCBBFWHY
---------------- ---------------- ---------- ----------
00000001182DC750 748 1 24 1 000000011B452348
1081344 1 0 00 00000000F2FC69F8
00000001182DC750 00 00000001182DC750 00
00000001182DC7F8 00 583 0

SQL> desc x$kcbbf;

Name Null? Type
----------------------------------------- -------- ----------------------------
ADDR RAW(8)
INDX NUMBER
INST_ID NUMBER
KCBBFSO_TYP NUMBER
KCBBFSO_FLG NUMBER
KCBBFSO_OWN RAW(8)
KCBBFFLG NUMBER
KCBBFCR NUMBER
KCBBFCM NUMBER
KCBBFMBR RAW(8)
KCBBPBH RAW(8)
KCBBPBF RAW(8)
X0KCBBPBH RAW(8)
X0KCBBPBF RAW(8)
X1KCBBPBH RAW(8)
X1KCBBPBF RAW(8)
KCBBFBH RAW(8)
KCBBFWHR NUMBER
KCBBFWHY NUMBER

After gdb exits and the process is able to complete kcbrls and release the buffer, the corresponding cache buffer handle is cleaned up:

SQL> select distinct KCBBPBH from x$kcbbf ;

KCBBPBH
----------------
00
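The lookup in x$kcbbf relies on rewriting the `BH:` address from the dump's kcbbfbp line (0xf2fc69f8) as a 16-digit upper-case RAW(8) value (00000000F2FC69F8). The following Python sketch extracts the addresses from a kcbbfbp line and formats them that way. The regular expression and the helper names `parse_kcbbfbp` and `as_raw8` are assumptions made for this example, based only on the two kcbbfbp line shapes shown in this article.

```python
import re
from typing import Optional, Tuple

# Matches both "kcbbfbp: [BH: 0xf2fc69f8, LINK: 0x1182dc790]"
# and "kcbbfbp : [BH: befd8, LINK: 7836c]" (the 0x prefix is optional).
KCBBFBP_RE = re.compile(
    r"kcbbfbp\s*:\s*\[BH:\s*(?:0x)?([0-9a-fA-F]+),\s*LINK:\s*(?:0x)?([0-9a-fA-F]+)\]"
)


def parse_kcbbfbp(line: str) -> Optional[Tuple[int, int]]:
    """Return (buffer header address, link address), or None if no match."""
    match = KCBBFBP_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def as_raw8(addr: int) -> str:
    """Format an address the way a RAW(8) column is displayed: 16 hex digits."""
    return format(addr, "016X")


if __name__ == "__main__":
    parsed = parse_kcbbfbp("kcbbfbp: [BH: 0xf2fc69f8, LINK: 0x1182dc790]")
    if parsed is not None:
        bh_addr, _link_addr = parsed
        # Value to use in: select * from x$kcbbf where kcbbpbh='...'
        print(as_raw8(bh_addr))
```

As a cross-check on the output above, the KCBBFFLG value 1081344 is 0x108000 in hex, the same number the dump prints as `FLG: 0x108000`, which suggests the two fields show the same flag word.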