    SSD TRIM makes data unrecoverable after disks are formatted with fdisk

    Posted by 惜分飞 on 2023-12-21 11:45:50

    Contact: phone/WeChat (+86 17813235971), QQ (107644445)

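    Author: 惜分飞 © All rights reserved [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal liability.]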

    In an 8-node RAC, one node had a problem. When its operating system was reinstalled, no disk was selected in the installer.

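    As a result, every disk visible to that machine was partitioned and added to the centos VG. All ASM disks seen by the other 7 nodes became abnormal, and the disk groups could no longer be mounted. A pvs scan on one of the previously healthy RAC nodes showed the result below.
    [Screenshot: pvs output]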

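    Analysis of the original ASM disks (now partitioned and added to the VG) showed that the previous contents of all SSD disks had been zeroed, while the data on the mechanical disks was intact. This is what kfed shows on a mechanical disk: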

    [root@xxxrac5 oracle]#  kfed read /dev/mapper/hdisk60 aun=2 aus=4096k
    kfbh.endian:                          1 ; 0x000: 0x01
    kfbh.hard:                          130 ; 0x001: 0x82
    kfbh.type:                            3 ; 0x002: KFBTYP_ALLOCTBL
    kfbh.datfmt:                          2 ; 0x003: 0x02
    kfbh.block.blk:                    2048 ; 0x004: blk=2048
    kfbh.block.obj:              2147483682 ; 0x008: disk=34
    kfbh.check:                  2349679586 ; 0x00c: 0x8c0d43e2
    kfbh.fcn.base:                   998219 ; 0x010: 0x000f3b4b
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    kfdatb.aunum:                    916608 ; 0x000: 0x000dfc80
    ………………
    kfdate[446].free.hi:                  0 ; 0xe1c: S=0 V=0 L=0 ASZM=0x0 S=0
    kfdate[447].discriminator:            0 ; 0xe20: 0x00000000
    kfdate[447].free.lo.next:             0 ; 0xe20: 0x0000
    kfdate[447].free.lo.prev:             0 ; 0xe22: 0x0000
    kfdate[447].free.hi:                  0 ; 0xe24: S=0 V=0 L=0 ASZM=0x0 S=0
    

    Checking the SSD disks shows that every record on them has been zeroed:

    [root@xxxrac5 oracle]# kfed read /dev/mapper/ssdisk30 aun=2 aus=4096k
    kfbh.endian:                          0 ; 0x000: 0x00
    kfbh.hard:                            0 ; 0x001: 0x00
    kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt:                          0 ; 0x003: 0x00
    kfbh.block.blk:                       0 ; 0x004: blk=0
    kfbh.block.obj:                       0 ; 0x008: file=0
    kfbh.check:                           0 ; 0x00c: 0x00000000
    kfbh.fcn.base:                        0 ; 0x010: 0x00000000
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    000800000 00000000 00000000 00000000 00000000  [................]
      Repeat 255 times
    KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
    
    [root@xxxrac5 oracle]# kfed read /dev/mapper/ssdisk30 aun=11 aus=4096k
    kfbh.endian:                          0 ; 0x000: 0x00
    kfbh.hard:                            0 ; 0x001: 0x00
    kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt:                          0 ; 0x003: 0x00
    kfbh.block.blk:                       0 ; 0x004: blk=0
    kfbh.block.obj:                       0 ; 0x008: file=0
    kfbh.check:                           0 ; 0x00c: 0x00000000
    kfbh.fcn.base:                        0 ; 0x010: 0x00000000
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    002C00000 00000000 00000000 00000000 00000000  [................]
      Repeat 255 times
    KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
    
    [root@xxxrac5 oracle]# kfed read /dev/mapper/ssdisk30 aun=100
    kfbh.endian:                          0 ; 0x000: 0x00
    kfbh.hard:                            0 ; 0x001: 0x00
    kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt:                          0 ; 0x003: 0x00
    kfbh.block.blk:                       0 ; 0x004: blk=0
    kfbh.block.obj:                       0 ; 0x008: file=0
    kfbh.check:                           0 ; 0x00c: 0x00000000
    kfbh.fcn.base:                        0 ; 0x010: 0x00000000
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    006400000 00000000 00000000 00000000 00000000  [................]
      Repeat 255 times
    KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
    
    [root@xxxrac5 oracle]# kfed read /dev/mapper/ssdisk30 aun=1000
    kfbh.endian:                          0 ; 0x000: 0x00
    kfbh.hard:                            0 ; 0x001: 0x00
    kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt:                          0 ; 0x003: 0x00
    kfbh.block.blk:                       0 ; 0x004: blk=0
    kfbh.block.obj:                       0 ; 0x008: file=0
    kfbh.check:                           0 ; 0x00c: 0x00000000
    kfbh.fcn.base:                        0 ; 0x010: 0x00000000
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    03E800000 00000000 00000000 00000000 00000000  [................]
      Repeat 255 times
    KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
    
    [root@xxxrac5 oracle]# kfed read /dev/mapper/ssdisk30 aun=10000
    kfbh.endian:                          0 ; 0x000: 0x00
    kfbh.hard:                            0 ; 0x001: 0x00
    kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt:                          0 ; 0x003: 0x00
    kfbh.block.blk:                       0 ; 0x004: blk=0
    kfbh.block.obj:                       0 ; 0x008: file=0
    kfbh.check:                           0 ; 0x00c: 0x00000000
    kfbh.fcn.base:                        0 ; 0x010: 0x00000000
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    271000000 00000000 00000000 00000000 00000000  [................]
      Repeat 255 times
    KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
    
    [root@xxxrac5 oracle]# kfed read /dev/mapper/ssdisk30 aun=100000
    kfbh.endian:                          0 ; 0x000: 0x00
    kfbh.hard:                            0 ; 0x001: 0x00
    kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
    kfbh.datfmt:                          0 ; 0x003: 0x00
    kfbh.block.blk:                       0 ; 0x004: blk=0
    kfbh.block.obj:                       0 ; 0x008: file=0
    kfbh.check:                           0 ; 0x00c: 0x00000000
    kfbh.fcn.base:                        0 ; 0x010: 0x00000000
    kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
    kfbh.spare1:                          0 ; 0x018: 0x00000000
    kfbh.spare2:                          0 ; 0x01c: 0x00000000
    186A000000 00000000 00000000 00000000 00000000  [................]
      Repeat 255 times
    KFED-00322: invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
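
    The checks above can be approximated without kfed. The sketch below is a minimal Python illustration, not part of the original analysis. It computes the byte offset of an allocation unit as aun multiplied by the AU size, which reproduces the dump offsets 0x000800000 and 0x002C00000 shown above for aus=4096k and 0x006400000 for aun=100 with the AU size omitted, and it tests whether a byte range of a device or image file is entirely zero. The names au_offset and is_all_zero are hypothetical, and reading a real device such as /dev/mapper/ssdisk30 would require root privileges.

```python
MIB = 1024 * 1024


def au_offset(aun: int, au_size: int = MIB) -> int:
    """Byte offset of allocation unit `aun` for a given AU size.

    The 1 MiB default is an assumption that matches the offsets in the
    transcripts where `aus` is omitted.
    """
    return aun * au_size


def is_all_zero(path: str, offset: int, length: int, chunk: int = MIB) -> bool:
    """Return True if `length` bytes starting at `offset` are all 0x00.

    A short read (end of file or device) ends the scan early.
    """
    remaining = length
    with open(path, "rb") as f:
        f.seek(offset)
        while remaining > 0:
            data = f.read(min(chunk, remaining))
            if not data:
                break
            # bytes.count(0) counts zero bytes; any other value fails the check.
            if data.count(0) != len(data):
                return False
            remaining -= len(data)
    return True
```

    For example, is_all_zero(path, au_offset(2, 4 * MIB), 4096) would inspect the same 4 KiB block that kfed dumps for aun=2 aus=4096k.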
    

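    Dumping 100 MB from an SSD disk with dd and comparing its size before and after compression confirms this: the compressed file is only about 150 KB, so apart from the partition and LV information everything else is zero.
    [Screenshot: size of the 100 MB dump before and after compression]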
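
    The same comparison can be sketched in Python instead of dd plus a compression tool. This is an illustration under assumptions: the helper name compressed_sample_size is hypothetical, zlib stands in for whatever compressor was actually used in the screenshot, and the resulting sizes will not match the roughly 150 KB reported above exactly.

```python
import zlib
from typing import Tuple

MIB = 1024 * 1024


def compressed_sample_size(path: str, length: int = 100 * MIB,
                           chunk: int = MIB) -> Tuple[int, int]:
    """Read up to `length` bytes from `path` and return
    (bytes_read, bytes_after_zlib_compression)."""
    comp = zlib.compressobj()
    raw = 0
    packed = 0
    with open(path, "rb") as f:
        while raw < length:
            data = f.read(min(chunk, length - raw))
            if not data:
                break  # end of file or device
            raw += len(data)
            packed += len(comp.compress(data))
    # flush() emits whatever the compressor is still buffering.
    packed += len(comp.flush())
    return raw, packed
```

    A sample that is almost entirely zeros compresses to a tiny fraction of its raw size, while real ASM data would normally compress far less.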


    A Linux command shows whether a disk is an SSD (for cat /sys/block/sdfr/queue/rotational, 1 means a mechanical disk and 0 means an SSD):

    hdisk66 (360060e8007c550000030c55000000223) dm-66 HITACHI ,OPEN-V          
    size=2.0T features='0' hwhandler='0' wp=rw
    `-+- policy='service-time 0' prio=1 status=active
      |- 10:0:1:4  sdfr               130:208 active ready running
      |- 7:0:0:4   sds                65:32   active ready running
      |- 8:0:1:4   sdu                65:64   active ready running
      `- 9:0:1:4   sdx                65:112  active ready running
    ssdisk58 (26e8d69b6480c854f6c9ce90091cfe58e) dm-22 Nimble  ,Server          
    size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=50 status=active
    | `- 7:0:1:48  sdfu               131:0   active ready running
    `-+- policy='round-robin 0' prio=1 status=enabled
      `- 10:0:0:48 sdek               128:192 active ghost running
    ssdisk43 (2e0e493e08cfd75996c9ce90091cfe58e) dm-8 Nimble  ,Server          
    size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=50 status=active
    | `- 7:0:1:35  sdev               129:112 active ready running
    `-+- policy='round-robin 0' prio=1 status=enabled
      `- 10:0:0:35 sddj               71:16   active ghost running
    ssdisk60 (2696bc3246f81fb116c9ce90091cfe58e) dm-34 Nimble  ,Server          
    size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=50 status=active
    | `- 7:0:1:59  sdgq               132:96  active ready running
    `-+- policy='round-robin 0' prio=1 status=enabled
      `- 10:0:0:59 sdfg               130:32  active ghost running
    ssdisk08 (2629f870548db01a76c9ce90091cfe58e) dm-35 Nimble  ,Server          
    size=2.0T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=50 status=active
    | `- 7:0:1:6   sdcq               69:224  active ready running
    `-+- policy='round-robin 0' prio=1 status=enabled
      `- 10:0:0:6  sdz                65:144  active ghost running
    [root@odsrac5 oracle]# cat /sys/block/sdcq/queue/rotational
    0
    [root@odsrac5 oracle]# cat /sys/block/sdcu/queue/rotational
    0
    [root@odsrac5 oracle]# cat /sys/block/sdfr/queue/rotational
    1
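
    The rotational check above can be scripted across many path devices. The sketch below is a minimal Python version. The function names are hypothetical, and the extra check of queue/discard_max_bytes is an addition that the original post did not run: the kernel exposes that attribute next to rotational, and a non-zero value means the device accepts discard requests. What it reports for the devices in this incident is unknown.

```python
from pathlib import Path


def read_queue_attr(dev: str, attr: str, sysfs_root: str = "/sys/block") -> int:
    """Read an integer attribute from <sysfs_root>/<dev>/queue/<attr>."""
    return int(Path(sysfs_root, dev, "queue", attr).read_text().strip())


def is_rotational(dev: str, sysfs_root: str = "/sys/block") -> bool:
    """True for a rotational (mechanical) device, False otherwise."""
    return read_queue_attr(dev, "rotational", sysfs_root) == 1


def accepts_discard(dev: str, sysfs_root: str = "/sys/block") -> bool:
    """True if the kernel reports a non-zero discard_max_bytes."""
    return read_queue_attr(dev, "discard_max_bytes", sysfs_root) > 0
```

    Calling is_rotational("sdfr") reads the same file as the cat command in the transcript; sysfs_root is only a parameter so the functions can be exercised against a fake directory tree.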
    

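    The problem was caused by TRIM operations on the SSD disks. Some background: TRIM (SATA), Deallocate (NVMe) and UNMAP (SCSI) all refer to the same class of command, and all exist to reduce unnecessary data movement.
    Why it exists:
    In a file system, deleting a file does not actually erase the physical data; only the record table is cleared. At that point the SSD does not know the file has been deleted. Only when the same logical addresses are next overwritten does the SSD learn that the pages holding the deleted file are invalid, and only then can GC reclaim them. If GC or any other data movement happens before that, the invalid pages are still treated as valid and copied along. Repartitioning a disk or re-creating a file system can also trigger this kind of operation.
    What it does:
    TRIM is only a command: once the operating system has told the SSD controller that the data in certain pages is 'invalid', its job is done and nothing more happens. The advantage of TRIM is that it lets the SSD skip moving useless data during garbage collection, so that data is never rewritten and time is saved. On a device that returns zeros for deallocated blocks, the old contents can no longer be read back afterwards, which is consistent with the all-zero kfed output on the SSD disks.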
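
    The effect described above can be illustrated with a toy model. The Python sketch below is not how a real flash translation layer works; ToySSD and its methods are hypothetical names, and it assumes a device that returns zeros for trimmed pages. It shows only the two points made in the text: trimmed pages no longer have to be copied by GC, and their old contents are no longer readable.

```python
PAGE_SIZE = 4096


class ToySSD:
    """Toy model: a map from logical page number to data, nothing more."""

    def __init__(self) -> None:
        self.mapping = {}

    def write(self, lpn: int, data: bytes) -> None:
        self.mapping[lpn] = data

    def trim(self, lpn: int) -> None:
        # The host tells the device this logical page holds no live data.
        self.mapping.pop(lpn, None)

    def read(self, lpn: int) -> bytes:
        # Assumption: unmapped (trimmed) pages read back as zeros.
        return self.mapping.get(lpn, bytes(PAGE_SIZE))

    def pages_gc_must_copy(self) -> int:
        # GC only relocates pages the device still believes are valid.
        return len(self.mapping)


ssd = ToySSD()
for lpn in range(4):
    ssd.write(lpn, b"\xaa" * PAGE_SIZE)

# Without TRIM the device still treats all four pages as valid.
before = ssd.pages_gc_must_copy()

# After the host trims two pages, GC has less to move and the
# trimmed pages read back as zeros.
ssd.trim(0)
ssd.trim(1)
after = ssd.pages_gc_must_copy()
```

    In this model before is 4 and after is 2, and ssd.read(0) returns a page of zeros, the same pattern kfed reported on the SSD disks.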

