整个表147M,执行时一个CPU耗尽, buffer gets 超过1个G, 启用并行也没有用
今天开发的同事问有个表上的数据为什么删不掉?我看了一下,也就不到100000条数据,表上有外键,等了5分钟hang在那里,时间原因,我对表上的外键禁用后,瞬间删除。
现在来还原这个问题。
sys@ANBOB>select count(*) from bjhr.doctor_exam_member;
COUNT(*)
——————–
92102
sys@ANBOB>delete bjhr.doctor_exam_member;
–hang
–等待10分钟都未执行完,检查表的外键信息
bjhr@ANBOB>SELECT /*+RULE*/D.CONSTRAINT_NAME pk_name,-- d.table_name,D.TABLE_NAME || '.' || D.COLUMN_NAME pk_column,A.CONSTRAINT_TYPE,B.CONSTRAINT_NAME fk_name,B.TABLE_NAME || '.' || B.COLUMN_NAME fk_columnFROM user_constraints aJOIN user_cons_columns bON a.constraint_name = b.constraint_name AND a.owner = b.ownerJOIN user_constraints cON A.R_CONSTRAINT_NAME = C.CONSTRAINT_NAME AND A.R_OWNER = c.ownerJOIN user_cons_columns dON c.constraint_name = d.constraint_name AND c.owner = d.ownerWHERE D.table_name = 'DOCTOR_EXAM_MEMBER' bjhr@ANBOB>/PK_NAME PK_COLUMN C FK_NAME FK_COLUMN -------------------- ---------------------------------------- - ------------------------------ ------------------------------------------------------- PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R FK_RESULT_N_REFERENCE_DOCTOR RESULT_NOTIFICATION_RECORD.DOCTOR_EXAM_MEMBER_ID PK_DOCTOR_EXAM_MEMBE DOCTOR_EXAM_MEMBER.DOCTOR_EXAM_MEMBER_ID R RESULT_RE_DOCTOR_MEMBER DOCTOR_EXAM_RESULT.DOCTOR_EXAM_MEMBER_ID
–有外键,之前已对子表进行过删除,否则会报错ORA-02266
delete RESULT_NOTIFICATION_RECORD;
delete DOCTOR_EXAM_RESULT;
commit;
–下面开始分析,创建新的session
sys@ANBOB>select xidsqn,xidusn,object_id,session_id,locked_mode from v$locked_object;XIDSQN XIDUSN OBJECT_ID SESSION_ID LOCKED_MODE -------------------- -------------------- -------------------- -------------------- --------------------2102 203 1639631 2290 32102 203 1639572 2290 3sys@ANBOB>select object_name,object_type from dba_objects where object_id in(1639631,1639572);OBJECT_NAME OBJECT_TYPE ------------------------------ ------------------- DOCTOR_EXAM_MEMBER TABLE DOCTOR_EXAM_RESULT TABLEsys@ANBOB>select event,p1,p2,p1text,p2text,seconds_in_wait,state from v$session_wait where sid=2290; EVENT P1 P2 P1TEXT P2TEXT SECONDS_IN_WAIT STATE ------------------------------ ----------- ----- -------------------- -------------------- -------------------- ------------------- latch: shared pool 1611704464 307 address number 213 WAITED SHORT TIME--trace hanganalyze and systemstatealter session set events 'immediate trace name systemstate level 266'; alter session set events 'immediate trace name hanganalyze level 3';--hanganalyze trace ===============================================================================Chains most likely to have caused the hang:[a] Chain 1 Signature: Chain 1 Signature Hash: 0x673a0128[b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'Chain 2 Signature Hash: 0xa00e2e87=============================================================================== Sessions in an involuntary wait or not in a wait:------------------------------------------------------------------------------- Chain 1: -------------------------------------------------------------------------------Oracle session identified by:{instance: 1 (ANBOB.ANBOB)os id: 27158process id: 94, oracle@dev-db (TNS V1-V3)session id: 2290session serial #: 7981}is not in a wait:{last wait: 11 min 0 sec agoblocking: 0 sessionswait history:1. event: 'latch: shared pool'time waited: 0.000114 secwait id: 183 p1: 'address'=0x6010a890p2: 'number'=0x133p3: 'tries'=0x0* time between wait #1 and #2: 1.586255 sec2. event: 'latch: shared pool'time waited: 0.000032 secwait id: 182 p1: 'address'=0x6010a890p2: 'number'=0x133p3: 'tries'=0x0* time between wait #2 and #3: 0.133830 sec3. event: 'latch: shared pool'time waited: 0.000114 secwait id: 181 p1: 'address'=0x6010a890p2: 'number'=0x133p3: 'tries'=0x0}Chain 1 Signature: Chain 1 Signature Hash: 0x673a0128
–对systemstate 没发现可疑信息
[oracle@dev-db ~]$ awk -f ass109.awk /oracle/diag/rdbms/ANBOB/ANBOB/trace/ANBOB_ora_23020.trc
— 奇怪为什么会发生在latch:shared pool上? 应该是sql解析和shared pool相关的事件,随后结束delete,做10046 观察究竟
sys@ANBOB>oradebug setmypid;
Statement processed.
sys@ANBOB>oradebug event 10046 trace name context forever,level 12
Statement processed.
sys@ANBOB>delete bjhr.doctor_exam_member;
92102 rows deleted.
sys@ANBOB>oradebug tracefile_name
/u01/app/oracle/diag/rdbms/anbob/anbob/trace/anbob_ora_7784.trc
sys@ANBOB>oradebug event 10046 trace name context off;
Statement processed.
— 格式化trace,终于发现了答案.
delete bjhr.doctor_exam_member call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 47.30 48.39 201 222 657611 92102 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 47.31 48.39 201 222 657611 92102Elapsed times include waiting on following events:Event waited on Times Max. Wait Total Waited---------------------------------------- Waited ---------- ------------Disk file operations I/O 2 0.00 0.00db file scattered read 26 0.00 0.00db file sequential read 24 0.00 0.00SQL*Net message to client 1 0.00 0.00SQL*Net message from client 1 0.00 0.00 ********************************************************************************-- check deferred objectsselect pctfree_stg, pctused_stg, size_stg,initial_stg, next_stg, minext_stg, maxext_stg, maxsiz_stg, lobret_stg,mintim_stg, pctinc_stg, initra_stg, maxtra_stg, optimal_stg, maxins_stg,frlins_stg, flags_stg, bfp_stg, enc_stg,cmpflag_stg, cmplvl_stg fromdeferred_stg$ where obj# =:1******************************************************************************** select /*+ all_rows */ count(1) from"BJHR"."RESULT_NOTIFICATION_RECORD" where "DOCTOR_EXAM_MEMBER_ID" = :1call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 92102 11.31 11.34 0 0 0 0 Fetch 92102 0.63 0.64 0 0 0 92102 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 184205 11.95 11.99 0 0 0 92102Misses in library cache during parse: 1 Misses in library cache during execute: 1 Optimizer mode: ALL_ROWS Parsing user id: SYS (recursive depth: 1) Number of plan statistics captured: 1Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- ---------------------------------------------------1 1 1 SORT AGGREGATE (cr=0 pr=0 pw=0 time=43 us)0 0 0 TABLE ACCESS FULL RESULT_NOTIFICATION_RECORD (cr=0 pr=0 pw=0 time=12 us cost=3 size=5 card=1)Elapsed times include waiting on following events:Event waited on Times Max. Wait Total Waited---------------------------------------- Waited ---------- ------------latch: shared pool 2 0.00 0.00******************************************************************************** select /*+ all_rows */ count(1) from"BJHR"."DOCTOR_EXAM_RESULT" where "DOCTOR_EXAM_MEMBER_ID" = :1call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 92102 6.97 7.11 0 0 0 0 Fetch 92102 1012.96 1016.14 0 566243096 92102 92102 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 184205 1019.93 1023.25 0 566243096 92102 92102Rows (1st) Rows (avg) Rows (max) Row Source Operation ---------- ---------- ---------- ---------------------------------------------------1 1 1 SORT AGGREGATE (cr=6148 pr=0 pw=0 time=30196 us)0 0 0 TABLE ACCESS FULL DOCTOR_EXAM_RESULT (cr=6148 pr=0 pw=0 time=30184 us cost=1647 size=5 card=1)******************************************************************************** OVERALL TOTALS FOR ALL RECURSIVE STATEMENTScall count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 39 0.00 0.01 0 0 0 0 Execute 184248 18.29 18.46 0 0 0 0 Fetch 184276 1013.60 1016.79 0 566243218 92102 184238 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 368563 1031.90 1035.27 0 566243218 92102 1842381129 elapsed seconds in trace file.
TIP:
在删除doctor_exam_member表时,检查了他的所有参照表(子表),然后对doctor_exam_member表的每次记录都要去参照表查询是否存在,此时刚好参考表的外键列上并无索引,导致每一行记录都会导致FTS(full table scan),这也是查询v$session_event时偶尔出现latch: CBC (hot block)的原因。
你可能疑问子表数据都delete了为什么还查询这么久?我做个小测试
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT;
COUNT(*)
——————–
0
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name=’DOCTOR_EXAM_RESULT’ and owner=’BJHR_DEV’;
BYTES BLOCKS
——————– ——————–
50331648 6144
sys@ORA10GR2>set autot trace stat
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
———————————————————-
0 recursive calls
0 db block gets
6040 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result enable row movement;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.DOCTOR_EXAM_RESULT shrink space;
Table altered.
sys@ORA10GR2>alter table bjhr_dev.doctor_exam_result disable row movement;
Table altered.
sys@ORA10GR2>select bytes,blocks from dba_segments where segment_name=’DOCTOR_EXAM_RESULT’ and owner=’BJHR_DEV’;
BYTES BLOCKS
——————– ——————–
196608 24
sys@ORA10GR2>select count(*) from bjhr_dev.DOCTOR_EXAM_RESULT where DOCTOR_EXAM_MEMBER_ID=1;
Statistics
———————————————————-
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
514 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
TIP:
FTS查询会遍历表segment 已格式化过所有data block.
Summary:
在建有外键约束的子表列上需要创建索引,对子表全表删除时可以采用truncate 或delete(有外键不能truncate时)后对表进行shrink space操作,或删除父表前对子表的外键约束做Disable.