    Faulty NIC Causes Database Instance Startup Failure

    Posted by 惜分飞 on 2023-03-07 15:12:55
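    Author: 惜分飞 © All rights reserved. [Reproduction in any form without the author's consent is prohibited; the author reserves the right to pursue legal action.]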

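    In one cluster, the instance on one node started normally while the instance on the other node failed to start. Alert log from the failing node: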

    Tue Mar 07 19:07:29 2023
    IPC Send timeout detected. Receiver ospid 6386 [
    Tue Mar 07 19:07:29 2023
    Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms0_6386.trc:
    IPC Send timeout detected. Receiver ospid 6402 [
    Tue Mar 07 19:07:29 2023
    Errors in file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_lms4_6402.trc:
    Tue Mar 07 19:07:29 2023
    Received an instance abort message from instance 1
    Please check instance 1 alert and LMON trace files for detail.
    System state dump requested by (instance=2, osid=6384 (LMD0)), summary=[abnormal instance termination].
    System State dumped to trace file /u01/app/oracle/diag/rdbms/xff/xff2/trace/xff2_diag_6374_20230307190729.trc
    LMD0 (ospid: 6384): terminating the instance due to error 481
    Dumping diagnostic data in directory=[cdmp_20230307190729],
          requested by (instance=2, osid=6384 (LMD0)), summary=[abnormal instance termination].
    Instance terminated by LMD0, pid = 6384
    

    Alert log from the healthy node:

    Tue Mar 07 19:02:07 2023
    Reconfiguration started (old inc 20, new inc 22)
    List of instances:
     1 2 (myinst: 1)
     Global Resource Directory frozen
     Communication channels reestablished
     Master broadcasted resource hash value bitmaps
     Non-local Process blocks cleaned out
    Tue Mar 07 19:02:08 2023
     LMS 5: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
    Tue Mar 07 19:02:08 2023
     LMS 2: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
    Tue Mar 07 19:02:08 2023
     LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
    Tue Mar 07 19:02:08 2023
     LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
    Tue Mar 07 19:02:08 2023
     LMS 3: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
    Tue Mar 07 19:02:08 2023
     LMS 7: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
    Tue Mar 07 19:02:08 2023
    Tue Mar 07 19:02:08 2023
     LMS 4: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
     LMS 6: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
     Set master node info
     Submitted all remote-enqueue requests
     Dwn-cvts replayed, VALBLKs dubious
     All grantable enqueues granted
     Submitted all GCS remote-cache requests
     Fix write in gcs resources
    Tue Mar 07 19:02:27 2023
    IPC Send timeout detected. Sender: ospid 6936 [oracle@xffnode1.localdomain (PING)]
    Receiver: inst 2 binc 441249706 ospid 59731
    Tue Mar 07 19:07:29 2023
    IPC Send timeout detected. Sender: ospid 6946 [oracle@xffnode1.localdomain (LMS0)]
    Receiver: inst 2 binc 429479852 ospid 6386
    Tue Mar 07 19:07:29 2023
    IPC Send timeout detected. Sender: ospid 6962 [oracle@xffnode1.localdomain (LMS4)]
    Receiver: inst 2 binc 429479854 ospid 6402
    Tue Mar 07 19:07:29 2023
    IPC Send timeout detected. Sender: ospid 6966 [oracle@xffnode1.localdomain (LMS5)]
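
When an alert log contains many of these entries, it can help to tabulate which sender processes timed out against which receiving instance. The following is a minimal sketch, assuming the two-line "Sender" / "Receiver" layout shown in the healthy node's log above; the function name `parse_ipc_timeouts` and the dictionary keys are illustrative, not part of any Oracle tool.

```python
import re

# Sender line as it appears in the healthy node's alert log, e.g.
# "IPC Send timeout detected. Sender: ospid 6946 [oracle@xffnode1.localdomain (LMS0)]"
SENDER_RE = re.compile(
    r"IPC Send timeout detected\. Sender: ospid (\d+) \[(\S+) \((\w+)\)\]"
)
# Receiver line that follows it, e.g. "Receiver: inst 2 binc 429479852 ospid 6386"
RECEIVER_RE = re.compile(r"Receiver: inst (\d+) binc (\d+) ospid (\d+)")


def parse_ipc_timeouts(lines):
    """Return one dict per 'IPC Send timeout' sender line (illustrative helper)."""
    events = []
    for raw in lines:
        line = raw.strip()
        sender = SENDER_RE.search(line)
        if sender:
            events.append({
                "sender_pid": int(sender.group(1)),
                "process": sender.group(3),
                "receiver_inst": None,  # filled in if a Receiver line follows
                "receiver_pid": None,
            })
            continue
        receiver = RECEIVER_RE.search(line)
        if receiver and events and events[-1]["receiver_inst"] is None:
            events[-1]["receiver_inst"] = int(receiver.group(1))
            events[-1]["receiver_pid"] = int(receiver.group(3))
    return events


# Lines copied from the healthy node's alert log above.
SAMPLE = """\
Tue Mar 07 19:02:27 2023
IPC Send timeout detected. Sender: ospid 6936 [oracle@xffnode1.localdomain (PING)]
Receiver: inst 2 binc 441249706 ospid 59731
Tue Mar 07 19:07:29 2023
IPC Send timeout detected. Sender: ospid 6946 [oracle@xffnode1.localdomain (LMS0)]
Receiver: inst 2 binc 429479852 ospid 6386
"""

if __name__ == "__main__":
    for event in parse_ipc_timeouts(SAMPLE.splitlines()):
        print(event)
```

In this log every parsed receiver is instance 2, which is consistent with node 1 being unable to reach the instance that is trying to join.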
    

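    The logs above confirm that the root problem is that the two nodes cannot communicate properly, so the new node cannot join the cluster (cluster reconfiguration cannot complete) and the instance fails to start. In this kind of situation the first thing to check is usually the private network (interconnect). Analysis of the oswnetstat records showed that the "packet reassembles failed" counter was extremely high.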
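
To judge whether reassembly failures are still growing rather than just historically high, the counter can be compared across successive `netstat -s` snapshots. The sketch below assumes a text file that holds several such snapshots one after another; the sample numbers and the "zzz ***" separator lines are made up for illustration, and the helper names `reassembly_failures` and `growth` are hypothetical.

```python
import re

# Matches the Ip-section line printed by `netstat -s` on Linux,
# e.g. "    480 packet reassembles failed".
FAILED_RE = re.compile(r"^\s*(\d+) packet reassembles failed", re.MULTILINE)


def reassembly_failures(text):
    """Return the counter value from every snapshot found in the text, in order."""
    return [int(n) for n in FAILED_RE.findall(text)]


def growth(counts):
    """Return the increase between each pair of consecutive snapshots."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


# Hypothetical data: three snapshots with invented counter values.
SAMPLE = """\
zzz ***hypothetical snapshot 1
Ip:
    1000 reassemblies required
    120 packet reassembles failed
zzz ***hypothetical snapshot 2
Ip:
    1900 reassemblies required
    480 packet reassembles failed
zzz ***hypothetical snapshot 3
Ip:
    2500 reassemblies required
    900 packet reassembles failed
"""

if __name__ == "__main__":
    counts = reassembly_failures(SAMPLE)
    print("counter values:", counts)
    print("growth per interval:", growth(counts))
```

A counter that keeps rising between snapshots points to an ongoing problem on the interconnect rather than a one-off event.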


    This symptom is usually attributed to the default ipfrag_*_thresh values being too small, so the following settings were applied:

    net.ipv4.ipfrag_high_thresh = 16777216
    net.ipv4.ipfrag_low_thresh = 15728640
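
After changing these parameters it is worth confirming that the running kernel actually uses them. The following is a minimal sketch that reads the current values from `/proc/sys/net/ipv4` on Linux and compares them with the two values given above; the function name `check_ipfrag` and the `base` parameter (which exists only so the check can be pointed at another directory) are illustrative assumptions.

```python
from pathlib import Path

# Target values taken from the settings above.
TARGETS = {
    "ipfrag_high_thresh": 16777216,
    "ipfrag_low_thresh": 15728640,
}


def check_ipfrag(targets=TARGETS, base="/proc/sys/net/ipv4"):
    """Return {name: (current, wanted, ok)}; current is None if unreadable."""
    report = {}
    for name, wanted in targets.items():
        try:
            current = int((Path(base) / name).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            current = None  # file missing or not a number (e.g. non-Linux host)
        ok = current is not None and current >= wanted
        report[name] = (current, wanted, ok)
    return report


if __name__ == "__main__":
    for name, (current, wanted, ok) in check_ipfrag().items():
        status = "ok" if ok else "needs attention"
        print(f"net.ipv4.{name}: current={current} wanted={wanted} -> {status}")
```

The check treats any value at or above the target as acceptable, which is a design choice of this sketch rather than a rule from the original post.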
    

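    With this change the database instance could be started as a temporary fix, but instance reconfiguration still took far too long.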


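    The "packet reassembles failed" counter kept increasing. Examining the NICs showed that one of them was faulty: of the two 10GbE NICs used for HAIP, one was misbehaving.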
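
The post does not say which tool was used to inspect the NICs. One generic way to compare two interconnect interfaces on Linux is to read their error counters from `/sys/class/net/<iface>/statistics`. The sketch below does that under stated assumptions: the interface names `eth2` and `eth3` are hypothetical, and the helper names `nic_error_counters` and `suspicious` are illustrative.

```python
from pathlib import Path

# Standard per-interface counters exposed by the Linux kernel in sysfs.
COUNTERS = ("rx_errors", "rx_dropped", "rx_crc_errors", "tx_errors", "tx_dropped")


def nic_error_counters(iface, base="/sys/class/net"):
    """Return {counter: value} for one interface; value is None if unreadable."""
    stats_dir = Path(base) / iface / "statistics"
    result = {}
    for name in COUNTERS:
        try:
            result[name] = int((stats_dir / name).read_text().strip())
        except (OSError, ValueError):
            result[name] = None
    return result


def suspicious(ifaces, base="/sys/class/net"):
    """Return the interfaces that report at least one non-zero error counter."""
    flagged = []
    for iface in ifaces:
        counters = nic_error_counters(iface, base)
        if any(value for value in counters.values()):
            flagged.append(iface)
    return flagged


if __name__ == "__main__":
    # Hypothetical names for the two private-network interfaces.
    for iface in ("eth2", "eth3"):
        print(iface, nic_error_counters(iface))
```

Non-zero counters alone do not prove a hardware fault; what matters is whether one interface accumulates errors much faster than its peer over the same period.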

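    To keep the impact on database performance small, the faulty NIC was temporarily disabled; it will be re-enabled once the problem has been fixed at the network layer.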

