IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    200T 数据库非归档无备份恢复

    惜分飞发表于 2024-08-14 08:23:25
    love 0

    联系:手机/微信(+86 17813235971) QQ(107644445)QQ咨询惜分飞

    标题:200T 数据库非归档无备份恢复

    作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]

    一套近200T的,6个节点的RAC,由于存储管线链路不稳定,导致服务器经常性掉盘,引起asm 磁盘组频繁dismount/mount,数据库集群节点不停的重启,修复好链路问题之后,数据库启动报ORA-01113,ORA-01110
    ORA-01113-ORA-01110


    通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本检测,发现有10个数据文件异常,无法正常恢复
    20240814155122

    该库比较大,有近200T,因此恢复需要各位谨慎(无法做现场备份,另外客户要求2天时间必须恢复好)
    200t

    由于数据库是非归档模式,该库无法通过应用归档日志来实现对这些文件进行恢复,对于这种情况,直接使用dbms_diskgroup把数据文件头拷贝到文件系统中,类似操作

    SQL> @dbms_diskgroup_cp_block_to_asm.sql  /tmp/xff/xifenfei.dbf.header  +DATA/xifenfei.dbf 1 1 
    
    Parameter 1:
    v_FsFileName (required)
    
    
    Parameter 2:
    v_AsmFileName (required)
    
    
    Parameter 3
    v_offstart (required)
    
    
    Parameter 4
    v_numblks (required)
    
    old  16: v_FsFileName := '&v_FsFileName';
    new  16: v_FsFileName := '/tmp/xff/xifenfei.dbf.header';
    old  17: v_AsmFileName := '&v_AsmFileName';
    new  17: v_AsmFileName := '+DATA/xifenfei.dbf';
    old  18: v_offstart := '&v_offstart';
    new  18: v_offstart := '1';
    old  19:  v_numblks := '&v_numblks';
    new  19:  v_numblks := '1';
    File: +DATA/xifenfei.dbf
    Type: 2 Data File
    Size (in logical blocks): 3978880
    Logical Block Size: 16384
    
    PL/SQL procedure successfully completed.
    

    然后通过bbed修改相关scn

    BBED> set filename 'xifenfei.dbf.header'
    	FILENAME       	xifenfei.dbf.header
    
    BBED> set blocksize 16384
    	BLOCKSIZE      	16384
    
    BBED> map
     File: xifenfei.dbf.header (0)
     Block: 1                                     Dba:0x00000000
    ------------------------------------------------------------
     Data File Header
    
     struct kcvfh, 860 bytes                    @0       
    
     ub4 tailchk                                @16380   
    
    
    BBED> p kcvfh.kcvfhckp.kcvcpscn
    struct kcvcpscn, 8 bytes                    @484     
       ub4 kscnbas                              @484      0xa8061324
       ub2 kscnwrp                              @488      0x0081
    
    BBED> assign file 295 block 1 kcvfh.kcvfhckp.kcvcpscn = file 1 block 1 kcvfh.kcvfhckp.kcvcpscn;
    struct kcvcpscn, 8 bytes                    @484     
       ub4 kscnbas                              @484      0xa8133e2b
       ub2 kscnwrp                              @488      0x0081
    
    

    然后把修改的数据文件头写回到asm中

    SQL> @dbms_diskgroup_cp_block_to_asm.sql  /tmp/xff/xifenfei.dbf.header  +DATA/xifenfei.dbf 1 1 
    
    Parameter 1:
    v_FsFileName (required)
    
    
    Parameter 2:
    v_AsmFileName (required)
    
    
    Parameter 3
    v_offstart (required)
    
    
    Parameter 4
    v_numblks (required)
    
    old  16: v_FsFileName := '&v_FsFileName';
    new  16: v_FsFileName := '/tmp/xff/xifenfei.dbf.header';
    old  17: v_AsmFileName := '&v_AsmFileName';
    new  17: v_AsmFileName := '+DATA/xifenfei.dbf';
    old  18: v_offstart := '&v_offstart';
    new  18: v_offstart := '1';
    old  19:  v_numblks := '&v_numblks';
    new  19:  v_numblks := '1';
    File: +DATA/xifenfei.dbf
    Type: 2 Data File
    Size (in logical blocks): 3978880
    Logical Block Size: 16384
    
    PL/SQL procedure successfully completed.
    

    查询文件头是否修改成功

    [oracle@xff1 xff]$ sqlplus / as sysdba
    
    SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 10 16:45:02 2024
    
    Copyright (c) 1982, 2011, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
    With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
    Data Mining and Real Application Testing options
    
    SQL> set numw 16
    SQL> select CHECKPOINT_CHANGE# from v$datafile_header where file# in (1,295);
    
    CHECKPOINT_CHANGE#
    ------------------
          556870614571
          556870614571
    
    SQL> recover datafile 295;
    Media recovery complete.
    

    通过上述操作,确认bbed修改文件头成功,后续类似方法对其他9个文件进行修改,并打开数据库

    SQL> recover database;
    Media recovery complete.
    SQL> alter database open;
    
    Database altered.
    

    alert日志提示

    Sat Aug 10 16:46:11 2024
    ALTER DATABASE RECOVER  datafile 295  
    Media Recovery Start
    Serial Media Recovery started
    WARNING! Recovering data file 295 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    Media Recovery Complete (xff1)
    Completed: ALTER DATABASE RECOVER  datafile 295  
    Sat Aug 10 16:46:39 2024
    ALTER DATABASE RECOVER  database  
    Media Recovery Start
     started logmerger process
    Sat Aug 10 16:46:51 2024
    WARNING! Recovering data file 1139 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 1140 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 1601 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 1803 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 1827 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 1931 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 2185 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 2473 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    WARNING! Recovering data file 2616 from a fuzzy backup. It might be an online
    backup taken without entering the begin backup command.
    Sat Aug 10 16:46:54 2024
    Parallel Media Recovery started with 64 slaves
    Media Recovery Complete (xff1)
    Completed: ALTER DATABASE RECOVER  database  
    Sat Aug 10 17:19:58 2024
    alter database open
    This instance was first to open
    Sat Aug 10 17:19:58 2024
    SUCCESS: diskgroup DATA was mounted
    Sat Aug 10 17:19:58 2024
    NOTE: dependency between database xff and diskgroup resource ora.DATA.dg is established
    Sat Aug 10 17:20:10 2024
    Picked broadcast on commit scheme to generate SCNs
    Sat Aug 10 17:20:10 2024
    SUCCESS: diskgroup REDO was mounted
    Sat Aug 10 17:20:10 2024
    NOTE: dependency between database xff and diskgroup resource ora.REDO.dg is established
    Thread 1 opened at log sequence 124958
      Current log# 14 seq# 124958 mem# 0: +REDO/xff/log2.ora
    Successful open of redo thread 1
    MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
    Sat Aug 10 17:20:14 2024
    SMON: enabling cache recovery
    Instance recovery: looking for dead threads
    Instance recovery: lock domain invalid but no dead threads
    [33770] Successfully onlined Undo Tablespace 2.
    Undo initialization finished serial:0 start:261099864 end:261100854 diff:990 (9 seconds)
    Verifying file header compatibility for 11g tablespace encryption..
    Verifying 11g file header compatibility for tablespace encryption completed
    SMON: enabling tx recovery
    Database Characterset is ZHS16GBK
    Sat Aug 10 17:20:16 2024
    minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:33650 status:0x7
    minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000
    Starting background process GTX0
    Sat Aug 10 17:20:16 2024
    GTX0 started with pid=45, OS id=34119 
    Starting background process RCBG
    Sat Aug 10 17:20:16 2024
    RCBG started with pid=46, OS id=34121 
    replication_dependency_tracking turned off (no async multimaster replication found)
    Starting background process QMNC
    Sat Aug 10 17:20:16 2024
    QMNC started with pid=47, OS id=34134 
    Starting background process SMCO
    Completed: alter database open
    

    其他集群其他节点数据库,一切正常
    20240814162201


    检查数据字典一致性

    SQL> @hcheck.sql
    HCheck Version 07MAY18 on 10-AUG-2024 18:24:49
    ----------------------------------------------
    Catalog Version 11.2.0.3.0 (1102000300)
    db_name: XFF
    
    				   Catalog	 Fixed
    Procedure Name			   Version    Vs Release    Timestamp
    Result
    ------------------------------ ... ---------- -- ---------- --------------
    ------
    .- LobNotInObj		       ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
    .- MissingOIDOnObjCol	       ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
    .- SourceNotInObj	       ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS
    .- OversizedFiles	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- PoorDefaultStorage	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- PoorStorage		       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- TabPartCountMismatch        ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- OrphanedTabComPart	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- MissingSum$		       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- MissingDir$		       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- DuplicateDataobj	       ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS
    .- ObjSynMissing	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
    .- ObjSeqMissing	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
    .- OrphanedUndo 	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
    .- OrphanedIndex	       ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
    .- OrphanedIndexPartition      ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS
    .- OrphanedIndexSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- OrphanedTable	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- OrphanedTablePartition      ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- OrphanedTableSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- MissingPartCol	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- OrphanedSeg$ 	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- OrphanedIndPartObj#	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- DuplicateBlockUse	       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- FetUet		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- Uet0Check		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- SeglessUET		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- BadInd$		       ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS
    .- BadTab$		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- BadIcolDepCnt	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- ObjIndDobj		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- TrgAfterUpgrade	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- ObjType0		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- BadOwner		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- StmtAuditOnCommit	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- BadPublicObjects	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- BadSegFreelist	       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- BadDepends		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- CheckDual		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- ObjectNames		       ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS
    .- BadCboHiLo		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- ChkIotTs		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- NoSegmentIndex	       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- BadNextObject	       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- DroppedROTS		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- FilBlkZero		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- DbmsSchemaCopy	       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- OrphanedObjError	       ... 1102000300 >  1102000000 08/10 18:24:54 PASS
    .- ObjNotLob		       ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS
    .- MaxControlfSeq	       ... 1102000300 <=  *All Rel* 08/10 18:24:55 PASS
    .- SegNotInDeferredStg	       ... 1102000300 >  1102000000 08/10 18:25:18 PASS
    .- SystemNotRfile1	       ... 1102000300 >   902000000 08/10 18:25:18 PASS
    .- DictOwnNonDefaultSYSTEM     ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
    .- OrphanTrigger	       ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
    .- ObjNotTrigger	       ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS
    ---------------------------------------
    10-AUG-2024 18:25:18  Elapsed: 29 secs
    ---------------------------------------
    Found 0 potential problem(s) and 0 warning(s)
    
    PL/SQL procedure successfully completed.
    
    Statement processed.
    
    Complete output is in trace file:
    /u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_71148_HCHECK.trc
    

    运气不错,数据字典本身没有损坏,业务直接运行,一切正常(主要原因是在光纤链路不稳定的情况下,客户已经没有往库中写入数据)

    • ORA-00333 ORA-01595 恢复
    • ORA-00316 ORA-00312故障处理
    • 断电引起文件scn异常数据库恢复
    • 硬件故障导致ORA-600 2662错误处理
    • 记录一次ORA-600 3004 恢复过程和处理思路
    • ORA-600 ktbsdp2 处理
    • resetlogs失败故障恢复-ORA-01555
    • Oracle断电故障处理
    • Oracle Recovery Tools快速恢复ORA-19909
    • ORA-00742 ORA-00312 恢复
    • 存储故障后oracle报—ORA-01122/ORA-01207故障处理
    • 分布式存储故障导致数据库无法启动故障处理


沪ICP备19023445号-2号
友情链接