
    Recovering a disk group failure caused by deleting an ASMLib disk

    Posted by 惜分飞 on 2024-11-29 14:03:53

    Contact: phone/WeChat (+86 17813235971), QQ (107644445)


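    Author: 惜分飞. © All rights reserved. [No reproduction in any form without the author's consent; the author reserves the right to pursue legal liability.]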

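    A customer ran a drop disk operation on a disk group, then immediately ran oracleasm deletedisk at the Oracle ASMLib layer and deleted the disk partition at the operating system level. As a result, the disk group dismounted outright: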

    Tue Nov 26 16:44:04 2024
    SQL> alter diskgroup data drop disk DATA_0008 
    NOTE: GroupBlock outside rolling migration privileged region
    Tue Nov 26 08:44:05 2024
    NOTE: stopping process ARB0
    NOTE: rebalance interrupted for group 2/0x28dec0d5 (DATA)
    NOTE: requesting all-instance membership refresh for group=2
    NOTE: membership refresh pending for group 2/0x28dec0d5 (DATA)
    Tue Nov 26 08:44:14 2024
    GMON querying group 2 at 48 for pid 18, osid 27385
    SUCCESS: refreshed membership for 2/0x28dec0d5 (DATA)
    SUCCESS: alter diskgroup data drop disk DATA_0008
    NOTE: starting rebalance of group 2/0x28dec0d5 (DATA) at power 2
    Starting background process ARB0
    Tue Nov 26 08:44:14 2024
    ARB0 started with pid=38, OS id=56987 
    NOTE: assigning ARB0 to group 2/0x28dec0d5 (DATA) with 2 parallel I/Os
    Tue Nov 26 08:44:17 2024
    NOTE: Attempting voting file refresh on diskgroup DATA
    NOTE: Refresh completed on diskgroup DATA. No voting file found.
    Tue Nov 26 08:44:57 2024
    cellip.ora not found.
    Tue Nov 26 17:08:46 2024
    SQL> alter diskgroup data drop disk DATA_0008 
    ORA-15032: not all alterations performed
    ORA-15071: ASM disk "DATA_0008" is already being dropped
    ERROR: alter diskgroup data drop disk DATA_0008
    Tue Nov 26 17:10:30 2024
    SQL> alter diskgroup data drop disk DATA_0008 
    ORA-15032: not all alterations performed
    ORA-15071: ASM disk "DATA_0008" is already being dropped
    ERROR: alter diskgroup data drop disk DATA_0008
    Tue Nov 26 09:34:38 2024
    WARNING: cache read  a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8 (DATA_0008) incarn=3911069755 au=0 blk=98 count=1
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    NOTE: a corrupted block from group DATA was dumped to /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc
    WARNING:cache read (retry) a corrupt block:group=2(DATA) dsk=8 blk=98 disk=8(DATA_0008)incarn=3911069755 au=0 blk=98 count=1
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    ERROR: cache failed to read group=2(DATA) dsk=8 blk=98 from disk(s): 8(DATA_0008)
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    NOTE: cache initiating offline of disk 8 group DATA
    NOTE: process _arb0_+asm1(56987)initiating offline of disk 8.3911069755 (DATA_0008) with mask 0x7e in group 2
    NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e303b, mask = 0x6a, op = clear
    Tue Nov 26 09:34:38 2024
    GMON updating disk modes for group 2 at 49 for pid 38, osid 56987
    ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
    ERROR: too many offline disks in PST (grp 2)
    Tue Nov 26 09:34:38 2024
    NOTE: cache dismounting (not clean) group 2/0x28DEC0D5 (DATA) 
    WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
    NOTE: messaging CKPT to quiesce pins Unix process pid: 89645, image: oracle@ahptdb5 (B000)
    Tue Nov 26 09:34:38 2024
    NOTE: halting all I/Os to diskgroup 2 (DATA)
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc  (incident=413105):
    ORA-15335: ASM metadata corruption detected in disk group 'DATA'
    ORA-15130: diskgroup "DATA" is being dismounted
    ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    Tue Nov 26 09:34:39 2024
    ERROR: ORA-15130 in COD recovery for diskgroup 2/0x28dec0d5 (DATA)
    ERROR: ORA-15130 thrown in RBAL for group number 2
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_rbal_27385.trc:
    ORA-15130: diskgroup "DATA" is being dismounted
    ERROR: ORA-15335 thrown in ARB0 for group number 2
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_56987.trc:
    ORA-15335: ASM metadata corruption detected in disk group 'DATA'
    ORA-15130: diskgroup "DATA" is being dismounted
    ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    ORA-15196: invalid ASM block header [kfc.c:26368] [endian_kfbh] [2147483656] [98] [0 != 1]
    NOTE: stopping process ARB0
    Tue Nov 26 09:34:40 2024
    NOTE: LGWR doing non-clean dismount of group 2 (DATA)
    NOTE: LGWR sync ABA=716.2684 last written ABA 716.2684
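
The bracketed arguments of the ORA-15196 lines above are easier to read once decoded. The following is a minimal Python sketch, not an Oracle tool. It assumes that the third argument is the block's object number, where a set high bit (0x80000000) marks a physically addressed block and the low bits carry the disk number, and that the last argument is "value found != value expected" for the named check. Both interpretations are assumptions based on how kfed prints block headers. The function name `decode_ora_15196` is hypothetical.

```python
import re

# Bracketed arguments of an ORA-15196 line as they appear in the alert log.
ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header "
    r"\[(?P<src>[^\]]+)\] \[(?P<check>[^\]]+)\] "
    r"\[(?P<obj>\d+)\] \[(?P<blk>\d+)\] "
    r"\[(?P<found>\d+) != (?P<expected>\d+)\]"
)

# Assumption: the high bit of the object number marks a physically
# addressed block, and the remaining bits hold the disk number.
PHYSICAL_FLAG = 0x80000000


def decode_ora_15196(line):
    """Return the decoded arguments of an ORA-15196 line, or None if no match."""
    m = ORA_15196.search(line)
    if m is None:
        return None
    obj = int(m.group("obj"))
    return {
        "source": m.group("src"),          # code location that raised the error
        "check": m.group("check"),         # header field that failed validation
        "physical": bool(obj & PHYSICAL_FLAG),
        "number": obj & 0x7FFFFFFF,        # disk number under the assumption above
        "block": int(m.group("blk")),
        "found": int(m.group("found")),
        "expected": int(m.group("expected")),
    }


if __name__ == "__main__":
    sample = ("ORA-15196: invalid ASM block header [kfc.c:26368] "
              "[endian_kfbh] [2147483656] [98] [0 != 1]")
    print(decode_ora_15196(sample))
```

Under these assumptions, 2147483656 is 0x80000008, that is, disk 8, which agrees with the `dsk=8 blk=98` reported in the WARNING line of the same log. An endian field that reads 0 where 1 is expected is also what an overwritten (zeroed) block would produce.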
    

    After the disk was repartitioned and the disk header was repaired with kfed repair, mounting the disk group again failed:

    SQL> alter diskgroup data mount 
    NOTE: cache registered group DATA number=2 incarn=0x73bec220
    NOTE: cache began mount (first) of group DATA number=2 incarn=0x73bec220
    NOTE: Assigning number (2,16) to disk (/dev/oracleasm/disks/DATA208)
    NOTE: Assigning number (2,15) to disk (/dev/oracleasm/disks/DATA207)
    NOTE: Assigning number (2,14) to disk (/dev/oracleasm/disks/DATA206)
    NOTE: Assigning number (2,13) to disk (/dev/oracleasm/disks/DATA205)
    NOTE: Assigning number (2,12) to disk (/dev/oracleasm/disks/DATA204)
    NOTE: Assigning number (2,11) to disk (/dev/oracleasm/disks/DATA203)
    NOTE: Assigning number (2,10) to disk (/dev/oracleasm/disks/DATA202)
    NOTE: Assigning number (2,9) to disk (/dev/oracleasm/disks/DATA201)
    NOTE: Assigning number (2,6) to disk (/dev/oracleasm/disks/DATA07)
    NOTE: Assigning number (2,5) to disk (/dev/oracleasm/disks/DATA06)
    NOTE: Assigning number (2,4) to disk (/dev/oracleasm/disks/DATA05)
    NOTE: Assigning number (2,0) to disk (/dev/oracleasm/disks/DATA01)
    NOTE: Assigning number (2,3) to disk (/dev/oracleasm/disks/DATA04)
    NOTE: Assigning number (2,2) to disk (/dev/oracleasm/disks/DATA03)
    NOTE: Assigning number (2,1) to disk (/dev/oracleasm/disks/DATA02)
    NOTE: Assigning number (2,8) to disk (/dev/oracleasm/disks/DATA101)
    Tue Nov 26 11:48:22 2024
    NOTE: GMON heartbeating for grp 2
    GMON querying group 2 at 83 for pid 27, osid 15781
    NOTE: cache opening disk 0 of grp 2: DATA_0000 path:/dev/oracleasm/disks/DATA01
    NOTE: F1X0 found on disk 0 au 2 fcn 0.127835487
    NOTE: cache opening disk 1 of grp 2: DATA_0001 path:/dev/oracleasm/disks/DATA02
    NOTE: cache opening disk 2 of grp 2: DATA_0002 path:/dev/oracleasm/disks/DATA03
    NOTE: cache opening disk 3 of grp 2: DATA_0003 path:/dev/oracleasm/disks/DATA04
    NOTE: cache opening disk 4 of grp 2: DATA_0004 path:/dev/oracleasm/disks/DATA05
    NOTE: cache opening disk 5 of grp 2: DATA_0005 path:/dev/oracleasm/disks/DATA06
    NOTE: cache opening disk 6 of grp 2: DATA_0006 path:/dev/oracleasm/disks/DATA07
    NOTE: cache opening disk 8 of grp 2: DATA_0008 path:/dev/oracleasm/disks/DATA101
    NOTE: cache opening disk 9 of grp 2: DATA_0009 path:/dev/oracleasm/disks/DATA201
    NOTE: cache opening disk 10 of grp 2: DATA_0010 path:/dev/oracleasm/disks/DATA202
    NOTE: cache opening disk 11 of grp 2: DATA_0011 path:/dev/oracleasm/disks/DATA203
    NOTE: cache opening disk 12 of grp 2: DATA_0012 path:/dev/oracleasm/disks/DATA204
    NOTE: cache opening disk 13 of grp 2: DATA_0013 path:/dev/oracleasm/disks/DATA205
    NOTE: cache opening disk 14 of grp 2: DATA_0014 path:/dev/oracleasm/disks/DATA206
    NOTE: cache opening disk 15 of grp 2: DATA_0015 path:/dev/oracleasm/disks/DATA207
    NOTE: cache opening disk 16 of grp 2: DATA_0016 path:/dev/oracleasm/disks/DATA208
    NOTE: cache mounting (first) external redundancy group 2/0x73BEC220 (DATA)
    Tue Nov 26 11:48:22 2024
    * allocate domain 2, invalid = TRUE 
    kjbdomatt send to inst 2
    Tue Nov 26 11:48:22 2024
    NOTE: attached to recovery domain 2
    NOTE: starting recovery of thread=1 ckpt=716.1536 group=2 (DATA)
    NOTE: starting recovery of thread=2 ckpt=763.6248 group=2 (DATA)
    NOTE: recovery initiating offline of disk 8 group 2 (*)
    NOTE: cache initiating offline of disk 8 group DATA
    NOTE: process _user15781_+asm1 (15781) initiating offline of disk 8.3911069996 (DATA_0008) with mask 0x7e in group 2
    NOTE: initiating PST update: grp = 2, dsk = 8/0xe91e312c, mask = 0x6a, op = clear
    GMON updating disk modes for group 2 at 84 for pid 27, osid 15781
    ERROR: Disk 8 cannot be offlined, since diskgroup has external redundancy.
    ERROR: too many offline disks in PST (grp 2)
    WARNING: Offline for disk DATA_0008 in mode 0x7f failed.
    Tue Nov 26 11:48:23 2024
    NOTE: halting all I/Os to diskgroup 2 (DATA)
    NOTE: recovery (pass 2) of diskgroup 2 (DATA) caught error ORA-15130
    Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_ora_15781.trc:
    ORA-15130: diskgroup "DATA" is being dismounted
    ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
    ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
    ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]
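
When comparing the first failure with the mount attempt above, it can help to condense each excerpt into its ORA- codes. This is a small illustrative Python sketch. The helper name `summarize_ora_codes` is hypothetical, and it only counts codes in text that has already been copied out of the alert log or a trace file.

```python
import re
from collections import Counter

# An ORA- error code: the literal prefix followed by five digits.
ORA_CODE = re.compile(r"\bORA-\d{5}\b")


def summarize_ora_codes(log_text):
    """Return (code, count) pairs ordered by first appearance in log_text."""
    counts = Counter(ORA_CODE.findall(log_text))
    # Counter keeps insertion order, so the first-seen code comes first.
    return list(counts.items())


if __name__ == "__main__":
    excerpt = """\
ORA-15130: diskgroup "DATA" is being dismounted
ORA-15066: offlining disk "DATA_0008" in group "DATA" may result in a data loss
ORA-15131: block 97 of file 8 in diskgroup 2 could not be read
ORA-15196: invalid ASM block header [kfc.c:7600] [endian_kfbh] [2147483656] [97] [0 != 1]
"""
    for code, count in summarize_ora_codes(excerpt):
        print(code, count)
```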
    

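    The customer ran oracleasm deletedisk, which in my experience zeroes the first 1 MB of the ASM disk (the disk header region). In addition, the disk was removed while the rebalance triggered by drop disk was still in progress. Given both facts, there is little chance that simply repairing that first 1 MB and mounting the disk group would let the group stay in service. The recommended options were therefore:
    1. Recover the data from the disk group directly, then open the database
    2. Extract only the core table data the customer needs
    A similar earlier case, where a customer recreated the disk information through ASMLib: "Recovery with zero data loss after oracleasm createdisk recreated the ASM disk"
    A database recovery case after partition information was deleted: "Recovering an Oracle ASM disk after its partition was deleted"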
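
The observation that the first 1 MB was cleared is consistent with the logs: the unreadable blocks are blocks 98 and 97 of AU 0 on disk 8. The Python sketch below works through that arithmetic and adds a read-only check for a zeroed prefix. It assumes the default 4096-byte ASM metadata block size and an allocation unit of at least 1 MiB (neither value is stated in the logs). The function names and the path in the usage note are hypothetical.

```python
MIB = 1024 * 1024
ASM_BLOCK_SIZE = 4096  # assumption: default ASM metadata block size


def block_offset(au, blk, au_size=MIB, block_size=ASM_BLOCK_SIZE):
    """Byte offset of metadata block `blk` inside allocation unit `au`."""
    return au * au_size + blk * block_size


def leading_zero_bytes(data):
    """Number of consecutive zero bytes at the start of `data`."""
    return len(data) - len(data.lstrip(b"\x00"))


def first_mib_is_zeroed(path):
    """Read-only check: True if the first 1 MiB of `path` is all zeros."""
    with open(path, "rb") as f:
        data = f.read(MIB)
    return len(data) == MIB and leading_zero_bytes(data) == MIB


if __name__ == "__main__":
    # Blocks named in the two log excerpts, both in AU 0 of disk 8.
    for blk in (97, 98):
        off = block_offset(0, blk)
        print(f"AU 0 block {blk}: byte offset {off}, inside first MiB: {off < MIB}")
```

Under these assumptions both offsets (397312 and 401408 bytes) fall inside the first 1 MiB. A call such as `first_mib_is_zeroed("/path/to/disk_image")` only makes sense against an untouched copy of the disk taken before any repair attempt, since repartitioning and kfed repair rewrite part of that region.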


