IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    rac kill 大事物后回滚慢,smon等待DFS lock handle和enq: TX – contention

    惜分飞发表于 2022-11-08 09:34:31
    love 0

    联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

    标题:rac kill 大事物后回滚慢,smon等待DFS lock handle和enq: TX – contention

    作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

    Customer killed a large DELETE statement of a large table in instance 1 in high workload period in RAC environment of 2 instances.
    Transaction recovery did not yet complete for more than 1 hour by querying V$FAST_START_TRANSACTIONS.
    Update on the same table on instance 2 hung more than 1 hour.
    Transaction recovery complete quickly after shutdown instance 2.
    ASH shows high waiting on “enq: TX – row lock contention” and “transaction” without blocking session until instance 2 shutdown after DELETE session killed in instance 1

    SQL> select instance_number,program,event,BLOCKING_INST_ID b_inst,count(session_id) session_cnt,
    count(BLOCKING_SESSION) b_session_cnt,min(sample_time) start_time,max(sample_time) end_time,count(*) cnt
      2  from DBA_HIST_ACTIVE_SESS_HISTORY where sql_id='d1xg4v7rqx3kt' and instance_number=2
    group by instance_number,program,event,BLOCKING_INST_ID
    order by instance_number,start_time
      3    4    5  ;
    
    INSTANCE_NUMBER PROGRAM            EVENT                         B_INST SESSION_CNT B_SESSION_CNT START_TIME END_TIME       CNT
    --------------- ------------------ ----------------------------- ------ ----------- ------------- ---------- ---------- -------
                  2 JDBC Thin Client   enq: TX - row lock contention             728939             0 19:20:34   21:04:25    728939
                  2 JDBC Thin Client   transaction                               489837             0 20:12:44   21:04:25    489837
    

    ASH shows that all sessions of the “enq: TX – row lock contention” without blocking session on instance 2 were waiting for the transaction of the Killed DELETE on instance 1

    SQL> select instance_number,event,trunc(p2/65536) XIDUSN, trunc(mod(p2,65536)) XIDSLOT, 
    p3 XIDSQN,count(session_id) session_cnt,count(BLOCKING_SESSION) b_session_cnt,min(sample_time) start_time,max(sample_time) end_time,count(*) cnt
    from DBA_HIST_ACTIVE_SESS_HISTORY where sql_id='d1xg4v7rqx3kt' and instance_number=2 and 
    to_char(sample_time,'HH24:MI') >= '20:12' AND BLOCKING_SESSION IS NULL AND EVENT='enq: TX - row lock contention'
    group by instance_number,event,trunc(p2/65536) , trunc(mod(p2,65536)) , p3
    order by start_time
    ;
      2    3    4    5
    
    INSTANCE_NUMBER EVENT                            XIDUSN  XIDSLOT   XIDSQN SESSION_CNT B_SESSION_CNT START_TIME END_TIME      CNT
    --------------- ------------------------------ -------- -------- -------- ----------- ------------- ---------- ---------- ------
                  2 enq: TX - row lock contention       932       15   623055      717850             0 20:12:55   21:04:25   717850
    SQL>
    

    ASH shows that SMON of instance 1 was waiting for “DFS lock handle” and “enq: TX – contention” among DELETE session killed in instance 1 and instance 2 shutdown.

    SQL> select instance_number,session_id,session_serial#,program,event,BLOCKING_SESSION b_sid,
    BLOCKING_SESSION_SERIAL# b_serial,BLOCKING_INST_ID b_inst,min(sample_time) start_time,
    max(sample_time) end_time,count(*) cnt
     from DBA_HIST_ACTIVE_SESS_HISTORY where upper(program) like '%SMON%'
    group by instance_number,session_id,session_serial#,program,event,BLOCKING_SESSION ,BLOCKING_SESSION_SERIAL# ,BLOCKING_INST_ID
    order by instance_number,start_time
    ;
    
    INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# PROGRAM                       EVENT                 B_SID  B_SERIAL B_INST START_TIME END_TIME   CNT
    --------------- ---------- --------------- ----------------------------- --------------------- ----- --------- ------ ---------- ---------- ---
                  1       3761               1 oracle@<instance_name> (SMON) DFS lock handle                              19:05:24   21:05:07   128
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention                         20:14:31   21:04:47    50
    

    The P1 of “DFS lock handle” shows that SMON was requesting TA enqueue through cross instance calls which is used for serializing operations on undo segments and undo tablespaces

    SQL> col event for a20
    select instance_number,sample_time,session_id,session_serial#,program,event,p1,p2,p3
      from DBA_HIST_ACTIVE_SESS_HISTORY where upper(program) like '%SMON%' AND EVENT='DFS lock handle' ;
      SQL>   2
    INSTANCE_NUMBER SAMPLE_TIME     SESSION_ID SESSION_SERIAL# PROGRAM                       EVENT                      P1     P2     P3
    --------------- --------------- ---------- --------------- ----------------------------- ---------------- ------------ ------ ------
                  1 19:05:24              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   4722
                  1 20:12:27              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   6668
                  1 20:13:40              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   5787
                  1 20:13:50              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   5580
                  1 20:14:00              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   5826
    ... ...
                  1 21:04:57              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   1347
                  1 21:05:07              3761               1 oracle@<instance_name> (SMON) DFS lock handle    1413545989      3   4582
    
    144 rows selected.
    
    * All of P1 is 1413545989, converted to hexadecimal: ‭54410005‬, 54 ASCII code: T, 41 ASCII code: A => TA enqueue
    

    The P2,P3 of “enq: TX – row lock contention” shows that SMON process was requesting TX enqueue of the UNDO block of the transaction of the Killed DELETE on instance 1.

    SQL> select instance_number,session_id,session_serial#,program,event,trunc(p2/65536) XIDUSN, trunc(mod(p2,65536)) XIDSLOT, p3 XIDSQN,COUNT(1) CNT,min(sample_time) start_time,max(sample_time) end_time
    from DBA_HIST_ACTIVE_SESS_HISTORY WHERE session_id =3761 and SESSION_SERIAL# = 1 and instance_number=1 AND event like 'enq: TX - contention%'
    group by instance_number,session_id,session_serial#,program,event,trunc(p2/65536) , trunc(mod(p2,65536)) , p3
    ;
      2    3    4
    INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# PROGRAM                        EVENT                  XIDUSN  XIDSLOT   XIDSQN CNT START_TIME END_TIME
    --------------- ---------- --------------- ------------------------------ -------------------- -------- -------- -------- --- ---------- ----------
                  1       3761               1 oracle@<instance_name> (SMON)  enq: TX - contention      932       15   623055  50 20:14:31   21:04:47
    
    SQL>
    SQL> select instance_number,session_id,session_serial#,program,event,CURRENT_OBJ#, CURRENT_FILE#, CURRENT_BLOCK#,CURRENT_ROW#,COUNT(1) CNT,min(sample_time) start_time,max(sample_time) end_time
      2  from DBA_HIST_ACTIVE_SESS_HISTORY WHERE session_id =3761 and SESSION_SERIAL# = 1 and instance_number=1 AND event like 'enq: TX - contention%'
    group by instance_number,session_id,session_serial#,program,event,CURRENT_OBJ#, CURRENT_FILE#, CURRENT_BLOCK#,CURRENT_ROW#
    order by start_time
    ;
      3    4    5
    INSTANCE_NUMBER SESSION_ID SESSION_SERIAL# PROGRAM                       EVENT                CURRENT_OBJ# CURRENT_FILE# CURRENT_BLOCK# CURRENT_ROW#   CNT START_TIME END_TIME
    --------------- ---------- --------------- ----------------------------- -------------------- ------------ ------------- -------------- ------------ ----- ---------- ----------
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention            1            42              4            0     1 20:14:31   20:14:31
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention            1             3              3            0     1 20:16:22   20:16:22 <<<<CURRENT_FILE#:3 ==>UNDO
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention            1             3              5            0     4 20:22:17   20:25:49 <<<<CURRENT_FILE#:3 ==>UNDO
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention            0             3           1920            0   3 20:30:13   20:30:33 <<<<CURRENT_FILE#:3 ==>UNDO
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention            0             3          15728            0 23 20:37:58   20:52:39 <<<<CURRENT_FILE#:3 ==>UNDO
                  1       3761               1 oracle@<instance_name> (SMON) enq: TX - contention            1          1002              2            0    18 20:53:09   21:04:47
    
    6 rows selected.
    
    SQL>
    

    解决方案
    Please avoid to kill large transaction during high workload period.
    If has to kill large transaction, shutdown all remote instances can speed up the transaction recovery.
    参考:Transaction Recovery Slow And High Row Lock Contention After Killed Large Transaction in RAC (Doc ID 2668617.1)

    • V$LOCK视图相关知识
    • Hanganalyze分析会话阻塞—锁表
    • 通过sql语句获取awr/statspack逻辑读/物理读
    • To find the TX Enqueue contention in a RAC or OPS environment
    • library cache pin等待事件
    • cursor: pin S wait on X 等待事件
    • oracle open hang 等待cursor: pin S wait on X
    • dblink会话引起library cache lock
    • 分析一例 TX Enqueue contention案例
    • Systemstates分析会话阻塞—锁表
    • library cache lock等待事件
    • ORA-00600: internal error code, arguments: [6006], [1], [], [], [], [], [], []


沪ICP备19023445号-2号
友情链接