联系:手机/微信(+86 17813235971) QQ(107644445)
标题:Oracle 19c异常恢复—ORA-01209/ORA-65088
作者:惜分飞©版权所有[未经本人同意,不得以任何形式转载,否则有进一步追究法律责任的权利.]
由于raid卡bug故障,导致文件系统异常,从而使得数据库无法正常启动,客户找到我之前已经让多人分析,均未恢复成功,查看alert日志,发现他们恢复的时候尝试resetlogs库,然后报ORA-600 kcbzib_kcrsds_1错误
2024-09-15T17:07:32.553215+08:00 alter database open resetlogs 2024-09-15T17:07:32.569110+08:00 RESETLOGS is being done without consistancy checks. This may result in a corrupted database. The database should be recreated. RESETLOGS after incomplete recovery UNTIL CHANGE 274757454692 time .... (PID:8074): Clearing online redo logfile 1 /opt/oracle/oradata/XFF/onlinelog/o1_mf_1_j3k201g9_.log .... (PID:8074): Clearing online redo logfile 2 /opt/oracle/oradata/XFF/onlinelog/o1_mf_2_j3k201h3_.log .... (PID:8074): Clearing online redo logfile 3 /opt/oracle/oradata/XFF/onlinelog/o1_mf_3_j3k201hk_.log Clearing online log 1 of thread 1 sequence number 0 Clearing online log 2 of thread 1 sequence number 0 Clearing online log 3 of thread 1 sequence number 0 2024-09-15T17:07:34.939550+08:00 .... (PID:8074): Clearing online redo logfile 1 complete .... (PID:8074): Clearing online redo logfile 2 complete .... (PID:8074): Clearing online redo logfile 3 complete Online log /opt/oracle/oradata/XFF/onlinelog/o1_mf_1_j3k201g9_.log: Thread 1 Group 1 was previously cleared Online log /opt/oracle/fast_recovery_area/XFF/onlinelog/o1_mf_1_j3k201l4_.log: Thread 1 Group 1 was previously cleared Online log /opt/oracle/oradata/XFF/onlinelog/o1_mf_2_j3k201h3_.log: Thread 1 Group 2 was previously cleared Online log /opt/oracle/fast_recovery_area/XFF/onlinelog/o1_mf_2_j3k201kw_.log: Thread 1 Group 2 was previously cleared Online log /opt/oracle/oradata/XFF/onlinelog/o1_mf_3_j3k201hk_.log: Thread 1 Group 3 was previously cleared Online log /opt/oracle/fast_recovery_area/XFF/onlinelog/o1_mf_3_j3k201mt_.log: Thread 1 Group 3 was previously cleared 2024-09-15T17:07:34.966674+08:00 Setting recovery target incarnation to 2 2024-09-15T17:07:34.992357+08:00 Ping without log force is disabled: instance mounted in exclusive mode. Buffer Cache Full DB Caching mode changing from FULL CACHING DISABLED to FULL CACHING ENABLED 2024-09-15T17:07:34.994329+08:00 Crash Recovery excluding pdb 2 which was cleanly closed. 2024-09-15T17:07:34.994390+08:00 Crash Recovery excluding pdb 3 which was cleanly closed. 2024-09-15T17:07:34.994433+08:00 Crash Recovery excluding pdb 4 which was cleanly closed. 2024-09-15T17:07:34.994474+08:00 Crash Recovery excluding pdb 5 which was cleanly closed. Initializing SCN for created control file Database SCN compatibility initialized to 3 Endian type of dictionary set to little 2024-09-15T17:07:35.001752+08:00 Assigning activation ID 2966012017 (0xb0c9c071) Redo log for group 1, sequence 1 is not located on DAX storage 2024-09-15T17:07:35.015921+08:00 TT00 (PID:8113): Gap Manager starting 2024-09-15T17:07:35.034047+08:00 Thread 1 opened at log sequence 1 Current log# 1 seq# 1 mem# 0: /opt/oracle/oradata/XFF/onlinelog/o1_mf_1_j3k201g9_.log Current log# 1 seq# 1 mem# 1: /opt/oracle/fast_recovery_area/XFF/onlinelog/o1_mf_1_j3k201l4_.log Successful open of redo thread 1 2024-09-15T17:07:35.034573+08:00 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set stopping change tracking 2024-09-15T17:07:35.063726+08:00 TT03 (PID:8119): Sleep 5 seconds and then try to clear SRLs in 2 time(s) 2024-09-15T17:07:35.129748+08:00 Undo initialization recovery: Parallel FPTR failed: start:2528681 end:2528684 diff:3 ms (0.0 seconds) Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_8074.trc (incident=146455) (PDBNAME=CDB$ROOT): ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], [] Incident details in: /opt/oracle/diag/rdbms/xff/XFF/incident/incdir_146455/XFF_ora_8074_i146455.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Please look for redo dump in pinned buffers history in incident trace file, if not dumped for what so ever reason, use the following command to dump it at the earliest. ALTER SYSTEM DUMP REDO DBA MIN 4 128 DBA MAX 4 128 SCN MIN 1; ***************************************************************** An internal routine has requested a dump of selected redo. This usually happens following a specific internal error, when analysis of the redo logs will help Oracle Support with the diagnosis. It is recommended that you retain all the redo logs generated (by all the instances) during the past 12 hours, in case additional redo dumps are required to help with the diagnosis. ***************************************************************** Undo initialization recovery: err:600 start: 2528681 end: 2529341 diff: 660 ms (0.7 seconds) 2024-09-15T17:07:35.786923+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_8074.trc: ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], [] 2024-09-15T17:07:35.786967+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_8074.trc: ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], [] Error 600 happened during db open, shutting down database Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_8074.trc (incident=146456) (PDBNAME=CDB$ROOT): ORA-00603: ORACLE server session terminated by fatal error ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00600: internal error code, arguments: [kcbzib_kcrsds_1], [], [], [], [], [], [], [], [], [], [], [] Incident details in: /opt/oracle/diag/rdbms/xff/XFF/incident/incdir_146456/XFF_ora_8074_i146456.trc 2024-09-15T17:07:36.291884+08:00 opiodr aborting process unknown ospid (8074) as a result of ORA-603 2024-09-15T17:07:36.299928+08:00 ORA-603 : opitsk aborting process License high water mark = 4 USER(prelim) (ospid: 8074): terminating the instance due to ORA error 600
然后他们又重建了ctl,通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)检查,发现几个问题:
1. PDB$SEED不在该库记录中(由于该pdb中无业务数据,可以忽略)
[oracle@localhost check_db]$ sqlplus / as sysdba SQL*Plus: Release 19.0.0.0.0 - Production on Tue Sep 17 11:29:28 2024 Version 19.9.0.0.0 Copyright (c) 1982, 2020, Oracle. All rights reserved. Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.9.0.0.0 SQL> alter database open resetlogs; alter database open resetlogs * ERROR at line 1: ORA-00603: ORACLE server session terminated by fatal error ORA-01092: ORACLE instance terminated. Disconnection forced ORA-65088: database open should be retried Process ID: 101712 Session ID: 105 Serial number: 4711
对应的alert日志报错
Endian type of dictionary set to little 2024-09-17T11:29:46.691904+08:00 Assigning activation ID 2966261119 (0xb0cd8d7f) Redo log for group 1, sequence 1 is not located on DAX storage 2024-09-17T11:29:46.714594+08:00 TT00 (PID:101731): Gap Manager starting 2024-09-17T11:29:46.735407+08:00 Thread 1 opened at log sequence 1 Current log# 1 seq# 1 mem# 0: /opt/oracle/oradata/XFF/onlinelog/o1_mf_1_j3k201g9_.log Successful open of redo thread 1 2024-09-17T11:29:46.736182+08:00 MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set stopping change tracking 2024-09-17T11:29:46.774207+08:00 TT03 (PID:101737): Sleep 5 seconds and then try to clear SRLs in 2 time(s) 2024-09-17T11:29:46.793381+08:00 Undo initialization recovery: Parallel FPTR complete: start:99831350 end:99831351 diff:1 ms (0.0 seconds) Undo initialization recovery: err:0 start: 99831349 end: 99831351 diff: 2 ms (0.0 seconds) Undo initialization online undo segments: err:0 start: 99831351 end: 99831353 diff: 2 ms (0.0 seconds) Undo initialization finished serial:0 start:99831349 end:99831356 diff:7 ms (0.0 seconds) Dictionary check beginning 2024-09-17T11:29:46.817810+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_101712.trc: ORA-65106: Pluggable database #2 (PDB$SEED) is in an invalid state. Pluggable Database PDB$SEED (#2) found in data dictionary, but not in the control file. Adding it to control file. Pluggable Database PDB1 (#3) found in data dictionary, but not in the control file. Adding it to control file. Pluggable Database PDB2 (#4) found in data dictionary, but not in the control file. Adding it to control file. Pluggable Database PDB3 (#5) found in data dictionary, but not in the control file. Adding it to control file. Tablespace 'TEMP' #3 found in data dictionary, but not in the controlfile. Adding to controlfile. 2024-09-17T11:29:46.878684+08:00 Read of datafile '/opt/oracle/oradata/XFF/PDB/datafile/o1_mf_system_j3kc9hl0_.dbf'(fno 9)header failed with ORA-01209 Rereading datafile 9 header failed with ORA-01209 2024-09-17T11:29:46.921314+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_dbw0_100632.trc: ORA-01186: file 9 failed verification tests ORA-01122: database file 9 failed verification check ORA-01110: data file 9: '/opt/oracle/oradata/XFF/PDB/datafile/o1_mf_system_j3kc9hl0_.dbf' ORA-01209: data file is from before the last RESETLOGS File 9 not verified due to error ORA-01122 ………… Read of datafile '/opt/oracle/oradata/XFF/datafile/users07.dbf' (fno 39) header failed with ORA-01209 Rereading datafile 39 header failed with ORA-01209 2024-09-17T11:29:46.983955+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_dbw0_100632.trc: ORA-01186: file 39 failed verification tests ORA-01122: database file 39 failed verification check ORA-01110: data file 39: '/opt/oracle/oradata/XFF/datafile/users07.dbf' ORA-01209: data file is from before the last RESETLOGS File 39 not verified due to error ORA-01122 2024-09-17T11:29:46.987947+08:00 Dictionary check complete Verifying minimum file header compatibility for tablespace encryption for pdb 1.. Verifying file header compatibility for tablespace encryption completed for pdb 1 ********************************************************************* WARNING: The following temporary tablespaces in container(CDB$ROOT) contain no files. This condition can occur when a backup controlfile has been restored. It may be necessary to add files to these tablespaces. That can be done using the SQL statement: ALTER TABLESPACE <tablespace_name> ADD TEMPFILE Alternatively, if these temporary tablespaces are no longer needed, then they can be dropped. Empty temporary tablespace: TEMP ********************************************************************* Database Characterset is AL32UTF8 2024-09-17T11:29:47.059806+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_mz00_101739.trc: ORA-01110: data file 9: '/opt/oracle/oradata/XFF/PDB/datafile/o1_mf_system_j3kc9hl0_.dbf' ORA-01209: data file is from before the last RESETLOGS ………… ********************************************************** WARNING: Files may exists in db_recovery_file_dest that are not known to the database. Use the RMAN command CATALOG RECOVERY AREA to re-catalog any such files. If files cannot be cataloged, then manually delete them using OS command. One of the following events caused this: 1. A backup controlfile was restored. 2. A standby controlfile was restored. 3. The controlfile was re-created. 4. db_recovery_file_dest had previously been enabled and then disabled. ********************************************************** Starting background process IMCO 2024-09-17T11:29:47.340660+08:00 2024-09-17T11:29:47.382153+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_mz00_101739.trc: ORA-01110: data file 13: '/opt/oracle/oradata/XFF/PDB/datafile/o1_mf_users_j3kckos2_.dbf' ORA-01209: data file is from before the last RESETLOGS replication_dependency_tracking turned off (no async multimaster replication found) LOGSTDBY: Validating controlfile with logical metadata LOGSTDBY: Validation complete 2024-09-17T11:29:47.464233+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_mz00_101739.trc: ORA-01110: data file 14: '/opt/oracle/oradata/XFF/PDB/datafile/o1_mf_users_j3kckqfx_.dbf' ORA-01209: data file is from before the last RESETLOGS AQ Processes can not start in restrict mode Could not open PDB$SEED error=65106 2024-09-17T11:29:47.522825+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_101712.trc: ORA-65106: Pluggable database #2 (PDB$SEED) is in an invalid state. ORA-65106: Pluggable database #2 (PDB$SEED) is in an invalid state. 2024-09-17T11:29:47.525249+08:00 db_recovery_file_dest_size of 65536 MB is 0.05% used. This is a user-specified limit on the amount of space that will be used by this database for recovery-related files, and does not reflect the amount of space available in the underlying filesystem or ASM diskgroup. 2024-09-17T11:29:47.529134+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_101712.trc: ORA-65088: database open should be retried 2024-09-17T11:29:47.529202+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_101712.trc: ORA-65088: database open should be retried 2024-09-17T11:29:47.529253+08:00 Error 65088 happened during db open, shutting down database 2024-09-17T11:29:47.545440+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_mz00_101739.trc: ORA-01110: data file 15: '/opt/oracle/oradata/XFF/PDB/datafile/o1_mf_users_j3kckstd_.dbf' ORA-01209: data file is from before the last RESETLOGS Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_ora_101712.trc(incident=775863)(PDBNAME=CDB$ROOT): ORA-00603: ORACLE server session terminated by fatal error ORA-01092: ORACLE instance terminated. Disconnection forced ORA-65088: database open should be retried 2024-09-17T11:29:48.046698+08:00 Errors in file /opt/oracle/diag/rdbms/xff/XFF/trace/XFF_mz00_101739.trc: ORA-01110: data file 21: '/opt/oracle/oradata/XFF/PDB2/datafile/o1_mf_users_j45x90oq_.dbf' ORA-01209: data file is from before the last RESETLOGS 2024-09-17T11:29:48.073328+08:00 opiodr aborting process unknown ospid (101712) as a result of ORA-603 2024-09-17T11:29:48.081576+08:00 ORA-603 : opitsk aborting process License high water mark = 122 USER(prelim) (ospid: 101712): terminating the instance due to ORA error 65088 2024-09-17T11:29:49.104770+08:00 Instance terminated by USER(prelim), pid = 101712
主要错误有两个
ORA-01209: data file is from before the last RESETLOGS 和
ORA-65088: database open should be retried
通过分析这两个错误
[oracle@ora19c:/home/oracle]$ oerr ora 65088 65088, 00000, "database open should be retried" // *Cause: An inconsistency between the control file and the data dictionary // was found and fixed during the database open. The database open // needs to be executed again. // *Action: Retry the database open. // [oracle@ora19c:/home/oracle]$ oerr ora 01209 01209, 00000, "data file is from before the last RESETLOGS" // *Cause: The reset log data in the file header does not match the // control file. If the database is closed or the file is offline, // the backup is old because it was taken before the last ALTER // DATABASE OPEN RESETLOGS command. If opening a database that is // open already by another instance, or if another instance just // brought this file online, the file accessed by this instance is // probably a different version. Otherwise, a backup of the file // probably was restored while the file was in use. // *Action: Make the correct file available to the database. Then, either open // the database, or execute ALTER SYSTEM CHECK DATAFILES.
ORA-65088参见官方:ORA-65088 while opening DB with resetlogs for multi-tenant DB in 12.2 (Doc ID 2449591.1),应该不是一个技术问题(由于重建ctl+resetlogs导致)
ORA-01209: data file is from before the last RESETLOGS 这个错误,可以简单理解resetlogs的信息比数据文件的checkpoint信息新,对于这种情况,以及结合上述的部分文件resetlogs信息不一致问题,索性直接使用m_scn小工具对其进行批量