
    Oracle Bug: Recovering from ORA-600 [k2vcbk_2]

    Posted by 惜分飞 on 2015-10-04 14:52:13

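    Contact: mobile (13429648788), QQ (107644445)

    Link: http://www.orasos.com/oracle-bug-ora-600-k2vcbk_2%e6%95%85%e9%9a%9c%e6%81%a2%e5%a4%8d.html

    Title: Oracle Bug: Recovering from ORA-600 [k2vcbk_2]

    Author: 惜分飞 © All rights reserved. [This article may be reposted, but the source address must be cited as a link; otherwise legal action will be pursued.]

    A friend contacted me because their database would not start: opening it failed with ORA-600 [k2vcbk_2]. The database is 11.2.0.2 RAC and the operating system is AIX 6.1.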

    SQL> recover database;
    Media recovery complete.
    SQL> alter database open;
    alter database open
    *
    ERROR at line 1:
    ORA-01092: ORACLE instance terminated. Disconnection forced
    ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [],
    [], [], [], [], []
    Process ID: 7930020
    Session ID: 49 Serial number: 14761
    

    Alert log from database node 1

    Mon Sep 21 15:45:41 2015
    Thread 1 advanced to log sequence 54076 (LGWR switch)
      Current log# 13 seq# 54076 mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
      Current log# 13 seq# 54076 mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
    Mon Sep 21 15:45:44 2015
    Archived Log entry 74655 added for thread 1 sequence 54075 ID 0x5a0bc0e1 dest 1:
    Mon Sep 21 15:56:18 2015
    Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184348):
    ORA-00600: internal error code, arguments: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
    Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184348/xifenfei1_ora_18088342_i184348.trc
    Mon Sep 21 15:56:34 2015
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Error 600 trapped in 2PC on transaction 7.16.120119. Cleaning up.
    Error stack returned to user:
    ORA-00600: internal error code, arguments: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
    Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_ora_18088342.trc  (incident=184349):
    ORA-00603: ORACLE server session terminated by fatal error
    ORA-00600: internal error code, arguments: [kturPOTS_0], [], [], [], [], [], [], [], [], [], [], []
    Mon Sep 21 15:56:34 2015
    Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
    Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184349/xifenfei1_ora_18088342_i184349.trc
    Mon Sep 21 15:56:35 2015
    Sweep [inc][184349]: completed
    Sweep [inc][184348]: completed
    Sweep [inc2][184348]: completed
    opiodr aborting process unknown ospid (18088342) as a result of ORA-603
    Mon Sep 21 15:57:12 2015
    Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc  (incident=184274):
    ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
    Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei1/incident/incdir_184274/xifenfei1_smon_7536810_i184274.trc
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Mon Sep 21 15:57:16 2015
    Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
    Fatal internal error happened while SMON was doing active transaction recovery.
    Errors in file /oracle/diag/rdbms/xifenfei/xifenfei1/trace/xifenfei1_smon_7536810.trc:
    ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
    SMON (ospid: 7536810): terminating the instance due to error 474
    Mon Sep 21 15:57:18 2015
    ORA-1092 : opitsk aborting process
    

    Alert log from database node 2

    Mon Sep 21 15:21:50 2015
    Archived Log entry 74653 added for thread 2 sequence 23559 ID 0x5a0bc0e1 dest 1:
    Mon Sep 21 15:44:28 2015
    Thread 2 advanced to log sequence 23561 (LGWR switch)
      Current log# 12 seq# 23561 mem# 0: +DG01/xifenfei/onlinelog/group_12.338.779457003
      Current log# 12 seq# 23561 mem# 1: +DG01/xifenfei/onlinelog/group_12.265.779582493
    Mon Sep 21 15:44:31 2015
    Archived Log entry 74654 added for thread 2 sequence 23560 ID 0x5a0bc0e1 dest 1:
    Mon Sep 21 15:45:31 2015
    DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
      is local tran 20.3.1533822 (hex=14.03.17677e)
      insert pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
    Mon Sep 21 15:45:31 2015
    DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
      is local tran 20.3.1533822 (hex=14.03.17677e))
      delete pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
    Mon Sep 21 15:56:35 2015
    Dumping diagnostic data in directory=[cdmp_20150921155634], requested by (instance=1, osid=18088342), summary=[incident=184348].
    Mon Sep 21 15:57:10 2015
    Error 3135 trapped in 2PC on transaction 20.11.1534704. Cleaning up.
    Error stack returned to user:
    ORA-03135: connection lost contact
    opidcl aborting process unknown ospid (9175532) as a result of ORA-604
    Mon Sep 21 15:57:17 2015
    Dumping diagnostic data in directory=[cdmp_20150921155716], requested by (instance=1, osid=7536810 (SMON)), summary=[incident=184274].
    Mon Sep 21 15:57:23 2015
    Reconfiguration started (old inc 18, new inc 20)
    List of instances:
     2 (myinst: 2) 
     Global Resource Directory frozen
     * dead instance detected - domain 0 invalid = TRUE 
     Communication channels reestablished
     Master broadcasted resource hash value bitmaps
     Non-local Process blocks cleaned out
    Mon Sep 21 15:57:23 2015
     LMS 2: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
    Mon Sep 21 15:57:23 2015
     LMS 0: 2 GCS shadows cancelled, 0 closed, 0 Xw survived
    Mon Sep 21 15:57:23 2015
     LMS 1: 3 GCS shadows cancelled, 1 closed, 0 Xw survived
     Set master node info 
     Submitted all remote-enqueue requests
     Dwn-cvts replayed, VALBLKs dubious
     All grantable enqueues granted
     Post SMON to start 1st pass IR
    Mon Sep 21 15:57:23 2015
    minact-scn: Inst 2 is now the master inc#:20 mmon proc-id:6816208 status:0x7
    minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0d45.28c2bb5c gcalc-scn:0x0d45.28c3bd2e
    minact-scn: master found reconf/inst-rec before recscn scan old-inc#:20 new-inc#:20
    Mon Sep 21 15:57:23 2015
    Instance recovery: looking for dead threads
     Submitted all GCS remote-cache requests
     Post SMON to start 1st pass IR
     Fix write in gcs resources
    Reconfiguration complete
    Beginning instance recovery of 1 threads
     parallel recovery started with 31 processes
    Started redo scan
    Completed redo scan
     read 12626 KB redo, 1724 data blocks need recovery
    Started redo application at
     Thread 1: logseq 54076, block 184416
    Recovery of Online Redo Log: Thread 1 Group 13 Seq 54076 Reading mem 0
      Mem# 0: +DG01/xifenfei/onlinelog/group_13.332.779459035
      Mem# 1: +DG01/xifenfei/onlinelog/group_13.344.779582621
    Completed redo application of 9.78MB
    Completed instance recovery at
     Thread 1: logseq 54076, block 209669, scn 14590688357285
     1633 data blocks read, 1794 data blocks written, 12626 redo k-bytes read
    Thread 1 advanced to log sequence 54077 (thread recovery)
    Mon Sep 21 15:57:33 2015
    Error 3113 trapped in 2PC on transaction 21.18.1965522. Cleaning up.
    Redo thread 1 internally disabled at seq 54077 (SMON)
    Error stack returned to user:
    ORA-02050: transaction 21.18.1965522 rolled back, some remote DBs may be in-doubt
    ORA-03113: end-of-file on communication channel
    ORA-02063: preceding line from ZSK
    Mon Sep 21 15:57:34 2015
    Archived Log entry 74656 added for thread 1 sequence 54076 ID 0x5a0bc0e1 dest 1:
    Mon Sep 21 15:57:34 2015
    ARC0: Archiving disabled thread 1 sequence 54077
    Archived Log entry 74657 added for thread 1 sequence 54077 ID 0x5a0bc0e1 dest 1:
    Mon Sep 21 15:57:35 2015
    Thread 2 advanced to log sequence 23562 (LGWR switch)
      Current log# 8 seq# 23562 mem# 0: +DG01/xifenfei/onlinelog/group_8.334.779456945
      Current log# 8 seq# 23562 mem# 1: +DG01/xifenfei/onlinelog/group_8.267.779582453
    Mon Sep 21 15:57:36 2015
    Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc  (incident=200218):
    ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
    Incident details in: /oracle/diag/rdbms/xifenfei/xifenfei2/incident/incdir_200218/xifenfei2_smon_6750672_i200218.trc
    Archived Log entry 74658 added for thread 2 sequence 23561 ID 0x5a0bc0e1 dest 1:
    Mon Sep 21 15:57:38 2015
    minact-scn: master continuing after IR
    Mon Sep 21 15:57:41 2015
    Dumping diagnostic data in directory=[cdmp_20150921155741], requested by (instance=2, osid=6750672 (SMON)), summary=[incident=200218].
    Use ADRCI or Support Workbench to package the incident.
    See Note 411.1 at My Oracle Support for error and packaging details.
    Fatal internal error happened while SMON was doing instance transaction recovery.
    Errors in file /oracle/diag/rdbms/xifenfei/xifenfei2/trace/xifenfei2_smon_6750672.trc:
    ORA-00600: internal error code, arguments: [k2vcbk_2], [], [], [], [], [], [], [], [], [], [], []
    SMON (ospid: 6750672): terminating the instance due to error 474
    Mon Sep 21 15:57:41 2015
    ORA-1092 : opitsk aborting process
    Mon Sep 21 15:57:42 2015
    ORA-1092 : opitsk aborting process
    Mon Sep 21 15:57:42 2015
    License high water mark = 291
    Instance terminated by SMON, pid = 6750672
    USER (ospid: 18874814): terminating the instance
    

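    The alert logs give a rough picture of what happened. A distributed transaction on node 2 failed. In 11.2.0.2 a distributed transaction can span RAC nodes, so PMON on node 2 tried to clean up the failed transaction, but because of a bug the transaction could not be cleaned up, and this caused node 1 to crash. After node 1 crashed, node 2 performed recovery, but the same distributed-transaction bug made the SMON rollback fail, and that instance crashed as well. Because the rollback cannot complete, the database cannot start normally. A search on MOS pointed to Bug 10222544, ORA-600 [k2vpci_2] from multi-branch distributed transaction.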
    [Image: ORA-600-k2vpci_2]
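To make the log reading above easier to repeat, here is a minimal Python sketch that pulls the distributed-transaction clues out of alert-log text: the "trapped in 2PC" errors, the local transaction IDs, and the SCNs. It is not part of the original post. The function name `scan_alert_log` and the `SAMPLE` text are illustrative assumptions, and the regular expressions only cover the message formats quoted in the logs above.

```python
import re

# Patterns modeled on the alert-log lines quoted in this post (an assumption:
# other Oracle versions may word these messages differently).
TRAPPED = re.compile(r"Error (\d+) trapped in 2PC on transaction (\d+\.\d+\.\d+)")
LOCAL = re.compile(
    r"is local tran (\d+)\.(\d+)\.(\d+) "
    r"\(hex=([0-9a-f]+)\.([0-9a-f]+)\.([0-9a-f]+)\)"
)
SCN = re.compile(r"scn=(\d+) \(hex=([0-9a-f]+)\.([0-9a-f]+)\)")


def scan_alert_log(text):
    """Collect 2PC errors, local transaction IDs and SCNs from alert-log text."""
    errors = [(int(code), xid) for code, xid in TRAPPED.findall(text)]

    xids = []
    for usn, slot, seq, h_usn, h_slot, h_seq in LOCAL.findall(text):
        # The hex form is undo segment / slot / sequence in base 16.
        if (int(usn), int(slot), int(seq)) != (
            int(h_usn, 16), int(h_slot, 16), int(h_seq, 16)
        ):
            raise ValueError(f"decimal and hex transaction ids disagree: {usn}.{slot}.{seq}")
        xid = f"{usn}.{slot}.{seq}"
        if xid not in xids:  # insert/delete messages repeat the same id
            xids.append(xid)

    scns = []
    for dec, wrap, base in SCN.findall(text):
        # The hex form is wrap.base, where scn = wrap * 2**32 + base.
        if int(dec) != (int(wrap, 16) << 32) + int(base, 16):
            raise ValueError(f"decimal and hex SCN disagree: {dec}")
        if int(dec) not in scns:
            scns.append(int(dec))

    return {"errors": errors, "local_trans": xids, "scns": scns}


# Lines copied from the node 2 alert log above.
SAMPLE = """\
DISTRIB TRAN xifenfei.1ebab0a5.20.3.1533822
  is local tran 20.3.1533822 (hex=14.03.17677e)
  insert pending committed tran, scn=14590688068086 (hex=d45.28c781f6)
Error 3135 trapped in 2PC on transaction 20.11.1534704. Cleaning up.
Error 3113 trapped in 2PC on transaction 21.18.1965522. Cleaning up.
"""

if __name__ == "__main__":
    print(scan_alert_log(SAMPLE))
```

Run against the full alert logs of both nodes, a scan like this lists the transaction IDs that appear around the crash, which is a starting point for deciding which transaction to examine.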


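    For this kind of problem, because the distributed transaction cannot be cleaned up, the fix is to identify the transaction and commit it manually, or simply to mask that transaction so the database can get past it.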
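As an illustration of the "find the transaction and commit it manually" option, the Python sketch below builds the SQL statements a DBA might review for one local transaction ID. The post does not show the commands that were actually run, so this is an assumption-laden sketch: `force_statements` is a hypothetical helper, and the statements it emits (a lookup in `DBA_2PC_PENDING` followed by `COMMIT FORCE` or `ROLLBACK FORCE`) are the standard Oracle way to resolve an in-doubt distributed transaction on an instance that is open. In the case described here the database could not open, so those statements would not be usable as-is, and how the transaction was masked is not described in the post.

```python
import re

# usn.slot.sequence, the form printed in the alert log (e.g. 20.3.1533822)
XID = re.compile(r"^\d+\.\d+\.\d+$")


def force_statements(local_tran_id, action="COMMIT"):
    """Return SQL text for reviewing and forcing an in-doubt transaction.

    Hypothetical helper: it only formats text for a DBA to review and
    never connects to a database.
    """
    if not XID.match(local_tran_id):
        raise ValueError(f"not a local transaction id: {local_tran_id!r}")
    action = action.upper()
    if action not in ("COMMIT", "ROLLBACK"):
        raise ValueError("action must be COMMIT or ROLLBACK")
    return [
        # Check the recorded state first; forcing is a manual override.
        "SELECT local_tran_id, global_tran_id, state, fail_time "
        f"FROM dba_2pc_pending WHERE local_tran_id = '{local_tran_id}';",
        f"{action} FORCE '{local_tran_id}';",
    ]


if __name__ == "__main__":
    for stmt in force_statements("20.3.1533822"):
        print(stmt)
```

Validating the ID before formatting keeps malformed input out of the generated SQL. Whether to force a commit or a rollback depends on the state of the other branches of the transaction, which the sketch deliberately leaves to the person reviewing the output.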

