
    Fixing Input/output errors on XFS over LVM

    Posted by Phoenix Nemo on 2021-07-07 15:05:32

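    A service went down.

    After the machine was forcibly rebooted, I found that the LVM volume was full, yet none of the files were accessible: every file operation returned Input/output error.

    dmesg showed a flood of filesystem errors, most likely because the machine was powered off while processes were still reading and writing to the already-full disk.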

    [ 1714.217864] XFS (dm-0): page discard on page 00000000161e11d5, inode 0xd861b703d, offset 937984.
    [ 1714.219674] XFS (dm-0): page discard on page 000000001d433e5e, inode 0xd861b703d, offset 942080.
    [ 1714.221132] XFS (dm-0): page discard on page 00000000820efe8d, inode 0xd861b703d, offset 946176.
    [ 1714.222431] XFS (dm-0): page discard on page 00000000518c8216, inode 0xd861b703d, offset 950272.
    [ 1714.223744] XFS (dm-0): page discard on page 00000000753db760, inode 0xd861b703d, offset 954368.
    [ 1714.225041] XFS (dm-0): page discard on page 00000000da40787d, inode 0xd861b703d, offset 958464.
    [ 1714.226341] XFS (dm-0): page discard on page 00000000ba8adb4b, inode 0xd861b703d, offset 962560.
    [ 1714.227629] XFS (dm-0): page discard on page 00000000784c4724, inode 0xd861b703d, offset 966656.
    [ 1714.228923] XFS (dm-0): page discard on page 0000000063b2c764, inode 0xd861b703d, offset 970752.
    [ 1714.228990] XFS (dm-0): page discard on page 0000000046a36fd8, inode 0xd861b703d, offset 974848.
    [ 1714.337426] dm-0: writeback error on inode 58084519997, offset 905216, sector 34365282240
    [ 1716.586318] dm-0: writeback error on inode 58084519997, offset 905216, sector 34365309816
    [ 1728.444718] xfs_discard_page: 9674 callbacks suppressed

    ...

    [ 1763.990454] XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 955 of file fs/xfs/xfs_trans.c. Return address = 00000000ea9478e4
    [ 1763.990459] XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem
    [ 1763.992696] XFS (dm-0): Please unmount the filesystem and rectify the problem(s)
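
To pull only the decisive lines out of a noisy kernel log like the one above, a small filter helps. This is a minimal sketch: `xfs_errors` is a hypothetical helper name, not a standard tool, and its patterns are simply taken from the messages shown in this post.

```shell
# Hypothetical helper: read a kernel log on stdin and keep only the
# XFS shutdown messages and the writeback errors, dropping the
# repetitive "page discard" lines.
xfs_errors() {
    grep -E 'XFS \([^)]+\): (xfs_do_force_shutdown|Corruption|Please unmount)|writeback error on inode'
}

# Usage on a live system (reading the kernel ring buffer may need root):
#   dmesg | xfs_errors
```

On the log above this would leave the two writeback errors and the three shutdown lines.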

    The log is clear enough, so let's unmount and repair.

    ~> umount /data
    ~> xfs_repair /dev/mapper/data

    It prints:

    Phase 1 - find and verify superblock...
    - reporting progress in intervals of 15 minutes
    Phase 2 - using internal log
    - zero log...
    ERROR: The filesystem has valuable metadata changes in a log which needs to
    be replayed. Mount the filesystem to replay the log, and unmount it before
    re-running xfs_repair. If you are unable to mount the filesystem, then use
    the -L option to destroy the log and attempt a repair.
    Note that destroying the log may cause corruption -- please attempt a mount
    of the filesystem before doing this.

    Huh, why do I have to mount it again? (As the message says, mounting is what replays the log.)

    ~> mount -a
    ~> umount /data
    ~> xfs_repair /dev/mapper/data
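
The mount, unmount, repair sequence can be wrapped in a small function that stops at the first failing step. This is only a sketch under stated assumptions: `replay_and_repair` and the `RUN` variable are hypothetical names, and with `RUN=echo` the commands are printed rather than executed.

```shell
# RUN=echo prints each command instead of running it (dry run).
# Set RUN= (empty) and run as root to execute the commands for real.
RUN=echo

# Hypothetical helper: replay the XFS log by mounting once, then repair.
#   $1 = block device, $2 = mount point
replay_and_repair() {
    dev=$1
    mnt=$2
    # Mounting makes the kernel replay the journal.
    $RUN mount "$dev" "$mnt" || return 1
    # xfs_repair needs the filesystem unmounted.
    $RUN umount "$mnt" || return 1
    $RUN xfs_repair "$dev"
}

replay_and_repair /dev/mapper/data /data
```

The post used `mount -a`, which relies on an `/etc/fstab` entry; the sketch mounts the device explicitly instead.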

    Then comes a long wait… (these are HDDs, after all)

    Phase 1 - find and verify superblock...
    - reporting progress in intervals of 15 minutes
    Phase 2 - using internal log
    - zero log...
    - 19:33:58: zeroing log - 521728 of 521728 blocks done
    - scan filesystem freespace and inode maps...
    - 19:34:11: scanning filesystem freespace - 33 of 33 allocation groups done
    - found root inode chunk
    Phase 3 - for each AG...
    - scan and clear agi unlinked lists...
    - 19:34:11: scanning agi unlinked lists - 33 of 33 allocation groups done
    - process known inodes and perform inode discovery...
    - agno = 0
    - agno = 15
    - agno = 30
    ...
    - agno = 27
    - agno = 28
    - agno = 29
    - 19:43:14: process known inodes and inode discovery - 16555072 of 16555072 inodes done
    - process newly discovered inodes...
    - 19:43:14: process newly discovered inodes - 33 of 33 allocation groups done
    Phase 4 - check for duplicate blocks...
    - setting up duplicate extent list...
    - 19:43:15: setting up duplicate extent list - 33 of 33 allocation groups done
    - check for inodes claiming duplicate blocks...
    - agno = 7
    - agno = 3
    - agno = 8
    ...
    - agno = 30
    - agno = 31
    - agno = 32
    - 19:43:24: check for inodes claiming duplicate blocks - 16555072 of 16555072 inodes done
    Phase 5 - rebuild AG headers and trees...
    - 19:43:27: rebuild AG headers and trees - 33 of 33 allocation groups done
    - reset superblock...
    Phase 6 - check inode connectivity...
    - resetting contents of realtime bitmap and summary inodes
    - traversing filesystem ...
    - 19:48:58: rebuild AG headers and trees - 33 of 33 allocation groups done
    - traversal finished ...
    - moving disconnected inodes to lost+found ...
    Phase 7 - verify and correct link counts...
    - 19:50:02: verify and correct link counts - 33 of 33 allocation groups done
    done

    Once it finished, I rebooted, and the files on the LVM volume were accessible again.

    Postscript:

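    Luckily the root partition was on a separate disk this time. If the root partition and the LVM volume had shared a physical disk, I would have had to reboot into rescue mode, activate LVM manually, and then run the repair.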
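
For the rescue-mode case, the manual steps might look roughly like the following. This is a sketch, not something that was run for this post: `rescue_repair` and `RUN` are hypothetical names, the device path is the one used earlier, and with `RUN=echo` the commands are only printed.

```shell
# RUN=echo prints each command instead of running it (dry run).
# Set RUN= (empty) inside the rescue shell to execute them for real.
RUN=echo

# Hypothetical helper: activate LVM by hand, then repair the XFS volume.
#   $1 = logical volume device to repair
rescue_repair() {
    dev=$1
    # Scan all disks for volume groups.
    $RUN vgscan || return 1
    # Activate every logical volume that was found.
    $RUN vgchange -ay || return 1
    # Repair the (unmounted) filesystem.
    $RUN xfs_repair "$dev"
}

rescue_repair /dev/mapper/data
```

If `xfs_repair` again asks for the log to be replayed, the same mount and unmount step as before applies.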

    Also, XFS really is reliable (glancing at a certain btrfs developer who slacks off every day watching anime and posting spoilers)


