
    ASM diskgroup fails to mount because partitions are not recognized

    Posted by 惜分飞 on 2016-03-15 17:21:23

    Author: 惜分飞 © All rights reserved. [Do not reproduce in any form without my consent; I reserve the right to pursue legal action.]

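    A customer reported that after a host reboot, data2, one of four diskgroups, could not be mounted (ORA-15032, ORA-15017, ORA-15063) and the database could not be opened. They asked us to analyze and resolve the problem.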

    Wed Mar 09 18:10:53 2016
    NOTE: Assigning number (1,1) to disk (/dev/oracleasm/disks/VOL011)
    Wed Mar 09 18:10:53 2016
    ERROR: no read quorum in group: required 1, found 0 disks
    NOTE: cache dismounting (clean) group 1/0xBD42B778 (DATA2) 
    NOTE: messaging CKPT to quiesce pins Unix process pid: 45093, image: oracle@BA (TNS V1-V3)
    NOTE: dbwr not being msg'd to dismount
    NOTE: lgwr not being msg'd to dismount
    NOTE: cache dismounted group 1/0xBD42B778 (DATA2) 
    NOTE: cache ending mount (fail) of group DATA2 number=1 incarn=0xbd42b778
    NOTE: cache deleting context for group DATA2 1/0xbd42b778
    GMON dismounting group 1 at 16 for pid 18, osid 45093
    NOTE: Disk DATA2_0001 in mode 0x9 marked for de-assignment
    ERROR: diskgroup DATA2 was not mounted
    ORA-15032: not all alterations performed
    ORA-15017: diskgroup "DATA2" cannot be mounted
    ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA2"
    ERROR: ALTER DISKGROUP DATA2 MOUNT  /* asm agent *//* {0:0:431} */
    

    The log makes it clear that data2 cannot be mounted because an ASM disk is missing. Further analysis shows that data2 is made up of two disks:

    Mon Sep 14 13:14:35 2015
    SQL> create diskgroup data2 external redundancy  disk '/dev/oracleasm/disks/VOL010','/dev/oracleasm/disks/VOL011' 
    NOTE: Assigning number (4,0) to disk (/dev/oracleasm/disks/VOL010)
    NOTE: Assigning number (4,1) to disk (/dev/oracleasm/disks/VOL011)
    NOTE: initializing header on grp 4 disk DATA2_0000
    NOTE: initializing header on grp 4 disk DATA2_0001
    NOTE: initiating PST update: grp = 4
    Mon Sep 14 13:14:35 2015
    GMON updating group 4 at 29 for pid 26, osid 51535
    NOTE: group DATA2: initial PST location: disk 0000 (PST copy 0)
    NOTE: PST update grp = 4 completed successfully 
    NOTE: cache registered group DATA2 number=4 incarn=0xea085f62
    NOTE: cache began mount (first) of group DATA2 number=4 incarn=0xea085f62
    NOTE: cache opening disk 0 of grp 4: DATA2_0000 path:/dev/oracleasm/disks/VOL010
    NOTE: cache opening disk 1 of grp 4: DATA2_0001 path:/dev/oracleasm/disks/VOL011
    NOTE: cache creating group 4/0xEA085F62 (DATA2)
    NOTE: cache mounting group 4/0xEA085F62 (DATA2) succeeded
    NOTE: allocating F1X0 on grp 4 disk DATA2_0000
    NOTE: diskgroup must now be re-mounted prior to first use
    NOTE: cache dismounting (clean) group 4/0xEA085F62 (DATA2) 
    NOTE: messaging CKPT to quiesce pins Unix process pid: 51535, image: oracle@BA (TNS V1-V3)
    NOTE: lgwr not being msg'd to dismount
    NOTE: cache dismounted group 4/0xEA085F62 (DATA2) 
    GMON dismounting group 4 at 30 for pid 26, osid 51535
    GMON dismounting group 4 at 31 for pid 26, osid 51535
    NOTE: Disk DATA2_0000 in mode 0x7e marked for de-assignment
    NOTE: Disk DATA2_0001 in mode 0x7e marked for de-assignment
    SUCCESS: diskgroup DATA2 was created
    

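    From this we can confirm that the data2 diskgroup consists of two disks, VOL010 and VOL011. Only VOL011 is discovered now, so data2 cannot be mounted. The system uses asmlib, so we combined the output of the oracleasm querydisk command with the device names reported by fdisk: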
    [Screenshot: oracleasm querydisk output]


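    It is almost certain that the missing VOL010 lives on the mpathb device: it is the only device and partition not in use, while every other device and partition is already used as an ASM disk by the asmlib disks that can still be queried.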
    Disk /dev/mapper/mpathb: 3846.7 GB, 3846677987328 bytes
    255 heads, 63 sectors/track, 467665 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    
                 Device Boot      Start         End      Blocks   Id  System
    /dev/mapper/mpathbp1               1      267350  2147483647+  ee  GPT
    
    Disk /dev/mapper/mpathbp1: 3846.7 GB, 3846675890176 bytes
    255 heads, 63 sectors/track, 467665 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0xb84bb99a
    
                    Device Boot      Start         End      Blocks   Id  System
    /dev/mapper/mpathbp1p1               1      200513  1610620641   83  Linux
    /dev/mapper/mpathbp1p2          200514      267349   536860170   83  Linux
    /dev/mapper/mpathbp1p3          267350      467665  1609038270   83  Linux
    

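    Something odd shows up here: mpathb was first partitioned with parted into a single mpathbp1 partition, and then fdisk was used inside it to create three sub-partitions, p1p1, p1p2 and p1p3. Next we look at the devices under /dev/mapper/: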
    [Screenshot: /dev/mapper/ device listing for mpathb]


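    The three sub-partitions p1p1, p1p2 and p1p3 that should exist on mpathb are missing. The cause is now fairly clear: mpathb was partitioned with parted and then again with fdisk, and after the operating system rebooted it could no longer recognize the sub-partitions. There are three ways to approach the fix.
    1. The partition table on disk is intact; it simply has not been propagated to the operating system, so find a way to sync it. This is OS-level work and is omitted here.
    2. Rebuild the two ASM disks of data2 by reassembling the datafiles. Because the three sub-partitions are not visible, scan the mpathbp1 partition directly. See: asm disk header 彻底损坏恢复 (recovery from a completely corrupted ASM disk header)
    3. For an ASM disk, a partition essentially just defines an offset and a size on the device. If we find the offset and determine the ASM disk size, we can dd that region to a new disk device and then mount the diskgroup directly. The rest of this post covers this third method.
    Dump the head of mpathbp1 with dd, then use bbed to find the offset. The key marker is the first occurrence of 01820101: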

    BBED> d
     File: bp1 (0)
     Block: 64               Offsets:    0 to   63           Dba:0x00000000
    ------------------------------------------------------------------------
     01820101 00000000 00000080 bc60223c 00000000 00000000 00000000 00000000
     4f52434c 4449534b 564f4c30 31300000 00000000 00000000 00000000 00000000
    
     <32 bytes per line>
    
    BBED> show all
            FILE#           0
            BLOCK#          64
            OFFSET          0
            DBA             0x00000000 (0 0,64)
            FILENAME        bp1
            BIFILE          bifile.bbd
            LISTFILE
            BLOCKSIZE       512
            MODE            Browse
            EDIT            Unrecoverable
            IBASE           Dec
            OBASE           Dec
            WIDTH           80
            COUNT           64
            LOGFILE         log.bbd
            SPOOL           No
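The search for the first 01820101 marker can also be sketched outside of bbed. The following is a minimal Python sketch, not the author's procedure: it walks a dumped image in 512-byte sectors and reports the first sector that starts with the bytes 01 82 01 01 and carries the string ORCLDISK at byte 32, which is where it appears in the dump above. The function name `find_asm_header_offset`, the `max_sectors` limit and the synthetic test image are illustrative assumptions.

```python
import io

SECTOR = 512
KFBH_PREFIX = bytes.fromhex("01820101")  # first four bytes of the block in the BBED dump
PROVISION_TAG = b"ORCLDISK"              # seen at byte 32 of the same block in the dump


def find_asm_header_offset(stream, sector_size=SECTOR, max_sectors=4096):
    """Return the byte offset of the first sector that looks like an
    ASM disk header, or None if no such sector is found."""
    for index in range(max_sectors):
        sector = stream.read(sector_size)
        if len(sector) < sector_size:
            return None  # ran out of data before finding a match
        if sector.startswith(KFBH_PREFIX) and sector[32:40] == PROVISION_TAG:
            return index * sector_size
    return None


# Synthetic image for illustration only: a header placed 63 sectors in.
image = bytearray(SECTOR * 100)
header = KFBH_PREFIX + bytes(28) + PROVISION_TAG + b"VOL010"
image[63 * SECTOR:63 * SECTOR + len(header)] = header

offset = find_asm_header_offset(io.BytesIO(bytes(image)))
print(offset)  # 32256 for this synthetic image
```

On a real system the stream would be a file holding the first few megabytes of the partition; checking both the prefix and the ORCLDISK tag reduces the chance of matching unrelated data.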
    

    This locates the ASM disk header at an offset of 32256 bytes into mpathbp1. Dump the ASM disk header with dd for analysis:
    [Screenshot: dd command extracting the ASM disk header]
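The 32256-byte offset can be cross-checked against the fdisk output shown earlier. The sketch below assumes the classic DOS-compatible layout, in which the first partition starts one track (63 sectors) into the device; that assumption is supported by the fact that it reproduces the Blocks value fdisk printed for mpathbp1p1. It also matches BBED block 64 with a 512-byte block size if BBED block numbers are taken as 1-based, which is what the article's figure implies. The function name is hypothetical.

```python
SECTOR_SIZE = 512
HEADS = 255
SECTORS_PER_TRACK = 63
SECTORS_PER_CYLINDER = HEADS * SECTORS_PER_TRACK  # 16065, as fdisk reports


def first_partition_span(end_cylinder):
    """Return (start_sector, length_in_sectors) for a first partition that
    starts at cylinder 1 and ends at `end_cylinder` (inclusive), assuming
    the DOS-compatible layout where the first track is reserved."""
    start_sector = SECTORS_PER_TRACK
    end_sector = end_cylinder * SECTORS_PER_CYLINDER  # exclusive
    return start_sector, end_sector - start_sector


# mpathbp1p1 spans cylinders 1..200513 in the fdisk output.
start_sector, length_sectors = first_partition_span(200513)
start_offset = start_sector * SECTOR_SIZE
size_kib = length_sectors * SECTOR_SIZE // 1024

print(start_offset)  # 32256, the offset found with bbed
print(size_kib)      # 1610620641, the Blocks value fdisk shows for mpathbp1p1
```

The agreement on both numbers suggests the header found with bbed is the start of the p1p1 sub-partition, rather than a stray copy of an old header.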


    Use kfed to inspect the disk header:
    [Screenshots: kfed output for the disk header (two images)]

    We can now confirm that the ASM disk size is 1572871M and its offset is 32256. Use dd to copy that region to a new disk device, then run oracleasm scandisks:
    [Screenshot: mounting the diskgroup in ASM]
    [Screenshot: asmcmd lsdg output]
    [Screenshot: opening the database]
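The extraction step, copying a region of known offset and length onto a new device, can be illustrated with a small Python sketch. This is not the dd command the author ran (it is only shown in a screenshot); it is an equivalent byte-range copy, demonstrated on small temporary files. The offset and size constants come from the article, while `copy_region`, the file names and the payload are illustrative assumptions.

```python
import os
import tempfile

HEADER_OFFSET = 32256        # bytes into mpathbp1, from the article
ASM_DISK_SIZE_MIB = 1572871  # ASM disk size, from the article
ASM_DISK_BYTES = ASM_DISK_SIZE_MIB * 1024 * 1024


def copy_region(src_path, dst_path, offset, length, chunk_size=1024 * 1024):
    """Copy `length` bytes starting at `offset` in the source to the
    beginning of an existing destination. Returns the bytes copied."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        while copied < length:
            chunk = src.read(min(chunk_size, length - copied))
            if not chunk:
                raise IOError("source ended before the requested length was copied")
            dst.write(chunk)
            copied += len(chunk)
    return copied


# Demonstration on small temporary files rather than real devices.
payload = bytes(range(256)) * 16  # 4096 bytes standing in for the ASM disk
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "source.img")
    dst_path = os.path.join(tmp, "target.img")
    with open(src_path, "wb") as f:
        f.write(bytes(HEADER_OFFSET) + payload + b"trailing data")
    open(dst_path, "wb").close()  # the destination must already exist

    copied = copy_region(src_path, dst_path, HEADER_OFFSET, len(payload))
    with open(dst_path, "rb") as f:
        result = f.read()
```

In a real recovery the length would be `ASM_DISK_BYTES` and the destination a new block device at least that large. Copying to a new device, rather than repairing the partition table in place, leaves the original data untouched if something goes wrong.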

    data2 mounted successfully and the database opened normally. The database was fully recovered.
