在阿里云上创建rac或者rac onenode,共享存储比较麻烦,需要配iSCSI,我们可以用dataguard+fsfo架构来类似的实现高可用方案。
本文介绍了在阿里云上oracle 12.1.0.2 dataguard+fsfo的搭建过程。
(一)、基础工作序列:
阿里云上创建Oracle Dataguard+FSFO | 负责人 | |
1 | 主机分配 | 云管理员 |
2 | 操作系统设置 | 云管理员、DBA |
3 | 数据库安装 | DBA |
4 | 数据库配置 | DBA |
5 | 数据库高可用场景测试 | DBA |
6 | 数据库备份设置 | DBA |
(二)、操作系统设置:
需求项 | 操作环境 | 需求细节内容 | 说明 | |
操作系统要求 | primary instance host/standby instance host/observer host | linux版本:SLES 11 | 阿里云上经过oracle certification的操作系统,只有sles。 阿里云上没有rhel和oel |
|
文件系统要求 | primary instance host/standby instance host/observer host | 文件系统:EXT3 | sles 11不支持ext4,ext格式化后的文件系统只能以read only的方式mount上去,所以选择ext3 | |
RPM包要求 | primary instance host/standby instance host/observer host | 安装完成后,检查如下的rpm(或更高版本)是否已经安装: # For SUSE 11 binutils-2.21.1-0.7.25 gcc-4.3-62.198 gcc-c++-4.3-62.198 glibc-2.11.3-17.31.1 glibc-devel-2.11.3-17.31.1 ksh-93u-0.6.1 libaio-0.3.109-0.1.46 libaio-devel-0.3.109-0.1.46 libcap1-1.10-6.10 libstdc++33-3.3.3-11.9 libstdc++33-32bit-3.3.3-11.9 libstdc++43-devel-4.3.4_20091019-0.22.17 libstdc++46-4.6.1_20110701-0.13.9 libgcc46-4.6.1_20110701-0.13.9 make-3.81 sysstat-8.1.5-7.32.1 xorg-x11-libs-32bit-7.4 xorg-x11-libs-7.4 xorg-x11-libX11-32bit-7.4 xorg-x11-libX11-7.4 xorg-x11-libXau-32bit-7.4 xorg-x11-libXau-7.4 xorg-x11-libxcb-32bit-7.4 xorg-x11-libxcb-7.4 xorg-x11-libXext-32bit-7.4 xorg-x11-libXext-7.4 |
|
|
X图形界面要求 | primary instance host/standby instance host/observer host | 1. 配置vnc,vnc的配置文件 #!/bin/shxrdb $HOME/.Xresources xsetroot -solid grey xterm -geometry 80×24+10+10 -ls -title “$VNCDESKTOP Desktop” & #twm & startgnome & DISPLAY=:1 gnome-session & 2. 启动gdm: 3. 启动vnc: |
要求能启动图形化界面 | |
SWAP大小要求 | primary instance host/standby instance host/observer host | 内存大小为2GB~16GB,swap大小要求设置成和内存大小一样。 内存大于16GB,swap大小设置成16GB检查方法:grep SwapTotal /proc/meminfo 创建方法: |
SWAP要求不符合,将会导致安装时ORA-4030报错,随后数据库crash。 | |
关闭透明大页 | primary instance host/standby instance host/observer host | 检查: cat /sys/kernel/mm/transparent_hugepage/enabled –应该看到[never]如果没关闭,需要在SLES的图形化界面中启动computer,启动YaST2,在BootLoader中的Optional Kernel command line Parameter中,加上transparent_hugepage=never 见右图 |
||
修改IO调度队列 | primary instance host/standby instance host | 1. 如果使用的是SSD,需要将IO调度队列改成noop,cat /sys/block/<sda>/queue/scheduler 显示为noop
如果没显示为noop,需要在SLES的图形化界面中启动computer,启动YaST2,在BootLoader中的Optional Kernel command line Parameter中,加上elevator=noop (类似上面的修改) 2. 如果是高效云盘,不是SSD,则无需修改IO调度队列。 |
||
建卷要求 | primary instance host/standby instance host/ | 1.软件卷:/u01,大小最小50GB,用于存放Oracle软件,以及产生的数据库日志和trace文件,属主为oracle:oinstall 2.数据卷:/data01,大小最小为200G,用于存放数据库的数据文件,属主为oracle:oinstall 3.归档卷:/fra,大小最小为200G,用于存放数据库的归档日志和备份片,属主为oracle:oinstall |
软件卷可以放在本地硬盘 数据卷和归档卷需要SAN存储 数据卷可根据生产数据增长量,增加/data02,/data03……或者扩展/data01卷 归档卷不能增加新卷,只能扩展归档卷 |
|
observer host | 1.软件卷:/u01,大小最小50GB,用于存放Oracle软件,以及产生的数据库日志和trace文件,属主为oracle:oinstall 不需要数据卷和归档卷 |
软件卷可以放在本地硬盘 数据卷和归档卷需要SAN存储 数据卷可根据生产数据增长量,增加/data02,/data03……或者扩展/data01卷 归档卷不能增加新卷,只能扩展归档卷 |
||
防火墙要求 | primary instance host/standby instance host/observer host | 数据库主库和dataguard灾备库,以及仲裁库,需要双向打通1521端口 | ||
os group | primary instance host/standby instance host/observer host | groupadd -g 501 oinstall groupadd -g 502 dba groupadd -g 503 oper |
||
Osuser | primary instance host/standby instance host/observer host | useradd -u 500 -g oinstall -G dba,oper oracle | 生产环境oracle操作系统用户统一为oracle, 开发环境oracle操作系统用户统一为oradev 测试环境oracle操作系统用户统一为orauat |
|
设置kernel参数 | primary instance host/standby instance host/observer host | vi /etc/sysctl.conf ## add for install oracle fs.aio-max-nr = 1048576 fs.file-max = 6815744 kernel.shmall = 2097152 kernel.shmmax = 15294967295 kernel.shmmni = 4096 kernel.sem = 250 32000 100 128 net.ipv4.ip_local_port_range = 9000 65500 net.core.rmem_default = 262144 net.core.rmem_max = 4194304 net.core.wmem_default = 262144 net.core.wmem_max = 1048576 |
注:kernel.shmmax设置成接近物理内存 | |
设置hosts | primary instance host/standby instance host/observer host | # on dataguardfsfohost003 127.0.0.1 dataguardfsfohost003 192.168.56.1 dataguardfsfohost003 192.168.56.2 dataguardfsfohost001 192.168.56.3 dataguardfsfohost002# on dataguardfsfohost001 127.0.0.1 dataguardfsfohost001 192.168.56.1 dataguardfsfohost003 192.168.56.2 dataguardfsfohost001 192.168.56.3 dataguardfsfohost002 # on dataguardfsfohost002 |
(三)数据库软件安装和实例安装
需求项 | 操作环境 | 需求细节内容 | 说明 |
数据库实例名要求 | primary instance host/standby instance host | 应用ORACLE_SID 以所属系统来命名,长度不能超过6位,如果超过6位,需要进行缩写。 | 命名以英文单词组成,不允许使用汉语拼音。 命名中不允许出现ORACLE数据库中的关键字(Reserved Word)。 如:user、remark等单词不允许使用。Oracle关键字可从v$reserved_words中查询。超过6位的ORACLE_SID使用缩写,缩写规则如下: 1.包含四个或四个以下字符的单词,可以不必缩写; |
软件目录要求 | primary instance host/standby instance host | ORACLE_BASE: /u01/<PRD>/oracle ORACLE_HOME: /u01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID> |
需要替换部分: <PRD>:生产环境统一为PRD,开发环境统一为DEV,测试环境统一为UAT <db_version_4_digit>为4位oracle的版本,如11.2.0.4,如12.1.0.2 <ORACLE_SID>为数据库实例名 |
数据文件存储要求 | primary instance host/standby instance host | 数据文件: /data01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID>/dfile日志文件: /data01/<PRD>/oracle/<db_version_4_digit>/<ORACLE_SID>/lfile 控制文件: |
需要替换部分: <PRD>:生产环境统一为PRD,开发环境统一为DEV,测试环境统一为UAT <db_version_4_digit>为4位oracle的版本,如11.2.0.4,如12.1.0.2 <ORACLE_SID>为数据库实例名 |
数据库版本 | primary instance host/standby instance host | PatchSet:12.1.0.2 PSU:当年最新一个PSU的上一个PSU Oneoff:根据补丁分析列表,打上对应的oneoff补丁注:Dataguard备库的数据库软件版本,必须严格和主库一致 |
Oneoff补丁,以Patch_Assessment_on_top_of_xxxxxxx为准 |
数据库用户环境变量要求 | primary instance host/standby instance host | 数据库用户的环境变量(bash_profile文件)中至少应该包含如下: export ORACLE_SID=<ORACLE_SID> export ORACLE_BASE=/u01/<PRD>/oracle export ORACLE_HOME=$ORACLE_BASE/<db_version_4_digit>/<ORACLE_SID> export PATH=$ORACLE_HOME/bin:$PATH:$HOME/bin PATH=$PATH:$HOME/bin |
需要替换部分: <PRD>:生产环境统 一为PRD,开发环境统一为DEV,测试环境统一为UAT <db_version_4_digit>为4位oracle的版本,如11.2.0.4,如12.1.0.2. <ORACLE_SID>为数据库实例名 |
安装数据库软件 | primary instance host/standby instance host | 运行runInstaller.sh安装数据库,见下图 | 只进行软件安装,数据库创建待安装完成后,另外运行dbca进行创建。 |
安装数据库实例 | primary instance host | 运行dbca安装实例,见下图 | 只需在主库安装实例,备库不需要安装实例 |
(四)数据库初始化参数调整。
见【关于11g和12c数据库初始化参数的一些最佳实践参考】
注:_ktb_debug_flags参数和db_ultra_safe 有冲突,在同时设置的情况下,在dataguard switchover过去,再switchover回来的时候,会报错ORA-00600: internal error code, arguments: [kcbz_zib_simulation_1], [29064], [64272],[], [], [], [], [], [], [], [], []
(五)Dataguard和FSFO创建:
需求项 | 操作环境 | 需求细节内容 | 说明 |
调整redo大小 | primary instance host | alter database drop logfile group 1; alter database drop logfile group 2; alter database drop logfile group 3;alter database add logfile group 1(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/redo01_01.log’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/redo01_02.log’) size 300m; alter database add logfile group 2(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/redo02_01.log’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/redo02_02.log’) size 300m; alter database add logfile group 3(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/redo03_01.log’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/redo03_02.log’) size 300m; |
|
拷贝密码文件到备库 | primary instance host | scp $ORACLE_HOME/dbs/orapwmyappn oracle@192.168.56.3:/u01/PRD/oracle/12.1.0.2/myappn/dbs/ | |
主备库创建各个目录 | primary instance host/standby instance host | mkdir -p /fra/PRD/oracle/12.1.0.2/myappn mkdir -p /data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/ mkdir -p /data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/ mkdir -p /u01/PRD/oracle mkdir -p /u01/PRD/oracle/admin/myappn/adump |
|
创建备库初始化文件 | standby instance host | *._high_priority_processes=’LMS*|VKTM|LGWR’ *._kks_obsolete_dump_threshold=0 *._ktb_debug_flags=8 *._kttext_warning=1 *._optimizer_aggr_groupby_elim=FALSE *._optimizer_dsdir_usage_control=0 *._optimizer_reduce_groupby_key=FALSE *._optimizer_unnest_scalar_sq=FALSE *._serial_direct_read=’NEVER’ *._sql_plan_directive_mgmt_control=0 *._undo_autotune=FALSE *._use_adaptive_log_file_sync=’FALSE’ *.archive_lag_target=1800 *.audit_file_dest=’/u01/PRD/oracle/admin/myappn/adump’ *.audit_trail=’db’ *.cell_offload_processing=FALSE *.compatible=’12.1.0.2.0′ *.control_files=’/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/control01.ctl’,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/control02.ctl’ *.db_block_size=8192 *.db_domain=” *.db_files=800 *.db_name=’myappn’ *.db_recovery_file_dest=’/fra/PRD/oracle/12.1.0.2/myappn’ *.db_recovery_file_dest_size=161061273600 *.db_securefile=’ALWAYS’ *.deferred_segment_creation=FALSE *.diagnostic_dest=’/u01/PRD/oracle’ *.dispatchers='(PROTOCOL=TCP) (SERVICE=myappnXDB)’ *.enable_ddl_logging=TRUE *.event=’28401 TRACE NAME CONTEXT FOREVER, LEVEL 1′ *.job_queue_processes=32 #*.local_listener=’LISTENER_myappn’ *.log_archive_format=’myappn_%R_%T_%S.arc’ *.open_cursors=300 *.parallel_max_servers=128 *.pga_aggregate_target=2560m *.processes=300 *.recyclebin=’OFF’ *.remote_login_passwordfile=’EXCLUSIVE’ *.session_cached_cursors=200 *.sga_target=7680m *.undo_tablespace=’UNDOTBS1′ *.utl_file_dir=’?/dbs’ ## add for dataguard *.log_archive_dest_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=myappn_dg’ *.db_unique_name=’myappn_dg’ *.log_archive_dest_state_1=’ENABLE’ *.log_archive_dest_state_2=’DEFER’ |
|
在主库创建standby redo log | primary instance host | alter database add standby logfile group 6(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo01_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo01_02.log’) size 300m; alter database add standby logfile group 7(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo02_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo02_02.log’) size 300m; alter database add standby logfile group 8(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo03_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo03_02.log’) size 300m; alter database add standby logfile group 9(‘/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/stby_redo04_01.log’ ,’/fra/PRD/oracle/12.1.0.2/myappn/myappn/stby_redo04_02.log’) size 300m; |
|
修改主库的初始化文件 | primary instance host | alter database force logging; alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE; alter system set LOG_ARCHIVE_DEST_STATE_2=defer; alter system set LOG_ARCHIVE_CONFIG=’DG_CONFIG=(myappn,myappn_dg)’; alter system set LOG_ARCHIVE_DEST_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=myappn’; alter system set LOG_ARCHIVE_DEST_2=’SERVICE=myappn_dg lgwr ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=myappn_dg’;alter system set standby_file_management=auto scope=spfile; alter system set fal_server=’myappn_dg’; alter system set fal_client=’myappn’; alter system set db_create_file_dest=’/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/’ scope=spfile; |
|
设置主库的listener.ora文件 | primary instance host | cat listener.ora
SID_LIST_LISTENER = SID_LIST_LISTENER_DB = LISTENER = ADR_BASE_LISTENER = /u01/PRD/oracle |
|
设置主库的tnsnames.ora文件 | primary instance host | cat tnsnames.ora
myappn = LISTENER_myappn = myappn_DG = |
|
设置备库的listener.ora文件 | standby instance host | oracle@dataguardfsfohost002:/u01/PRD/oracle/12.1.0.2/myappn/network/admin> cat listener.ora # listener.ora Network Configuration File: /u01/PRD/oracle/12.1.0.2/myappn/network/admin/listener.ora # Generated by Oracle configuration tools.SID_LIST_LISTENER = (SID_LIST = (SID_DESC = (GLOBAL_DBNAME = myappn) (ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn) (SID_NAME = myappn) ) (SID_DESC = (GLOBAL_DBNAME = myappn_dg_dgmgrl) (ORACLE_HOME = /u01/PRD/oracle/12.1.0.2/myappn) (SID_NAME = myappn) ) ) SID_LIST_LISTENER_DB = LISTENER = ADR_BASE_LISTENER = /u01/PRD/oracle |
|
设置备库的tnsnames.ora文件 | standby instance host | oracle@dataguardfsfohost002:/u01/PRD/oracle/12.1.0.2/myappn/network/admin> cat tnsnames.ora.kk myappn = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost001)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = myappn) ) )myappn_dg = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = dataguardfsfohost002)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = myappn) ) ) |
|
启动备库数据库到nomount状态 | standby instance host | startup nomount | 注意备库应该以pfile启动,不能用spfile启动 |
rman登录主库,准备开始复制数据库 | primary instance host | rman target sys/dji123@myappn auxiliary sys/dji123@myappn_dg | |
rman复制数据库 | primary instance host | run { allocate channel prmy1 type disk; allocate channel prmy2 type disk; allocate auxiliary channel stby type disk; duplicate target database for standby from active database nofilenamecheck spfile set db_unique_name=’myappn_dg’ set log_archive_max_processes=’5′ set fal_client=’myappn_dg’ set fal_server=’myappn’ set standby_file_management=’AUTO’ set log_archive_config=’DG_CONFIG=(myappn,myappn_dg)’ set log_archive_dest_1=’LOCATION=USE_DB_RECOVERY_FILE_DEST VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=myappn_dg’ set log_archive_dest_2=’SERVICE=myappn lgwr ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=myappn’ set db_create_file_dest=’/data01/PRD/oracle/12.1.0.2/myappn/dfile/myappn/’ ; } |
|
主库和备库都启用归档日志传输,将defer状态改成enable | primary instance host/standby instance host | alter system set log_archive_dest_state_2=enable; | |
重启standby节点,恢复日志apply进度 | primary instance host | shutdown immediate; startup; alter database recover managed standby database using current logfile parallel 8 disconnect; |
|
测试验证是否同步成功,建立测试表 | primary instance host | create table t1 as select sysdate as mydate from dual; | |
测试验证是否同步成功,查询测试表 | primary instance host | select to_char(mydate,’yyyy-mm-dd hh24:mi:ss’) as mydate from t1; | |
dataguard状态检查 | primary instance host | SQL> select SWITCHOVER_STATUS from v$database;
SWITCHOVER_STATUS |
只有状态是SESSIONS ACTIVE(有活动session连接)或者TO STANDBY(无活动session连接),才可以进行切换,如果状态是RESOLVABLE GAP,则需要手工解决gap问题先。 |
dataguard状态检查 | standby instance host | SQL> select SWITCHOVER_STATUS from v$database;
SWITCHOVER_STATUS |
|
切换到physical standby | primary instance host | SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
Database altered. SQL> |
|
切换到primary | standby instance host | SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered. SQL> select SWITCHOVER_STATUS from v$database; SWITCHOVER_STATUS SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY; Database altered. SQL> |
|
重启备节点为primary节点 | standby instance host | SQL> shutdown immediate; ORA-01109: database not openDatabase dismounted. ORACLE instance shut down. SQL> SQL> startup ORACLE instance started. |
|
重启原primary节点为standby节点 | primary instance host | SQL> exit
SQL> startup Total System Global Area 1185853440 bytes Database altered. SQL> |
|
还原主备 | primary instance host/standby instance host | 重复上面18~23步骤 | |
主库设置和FSFO相关的参数,并重启数据库 | primary instance host | alter system set dg_broker_config_file1 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker1.dat’ SCOPE=BOTH; alter system set dg_broker_config_file2 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker2.dat’ SCOPE=BOTH; alter system reset log_archive_dest_1 scope=spfile sid=’*’; alter system reset log_archive_dest_2 scope=spfile sid=’*’; alter system set dg_broker_start=true scope=both; alter database flashback on; alter system set db_flashback_retention_target = 4320 scope=both;shutdown immediate; startup |
|
备库设置与FSFO相关参数,并重启数据库 | standby instance host | alter system set dg_broker_config_file1 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker1.dat’ SCOPE=BOTH; alter system set dg_broker_config_file2 =’/u01/PRD/oracle/12.1.0.2/myappn/dbs/dgbroker2.dat’ SCOPE=BOTH; alter system reset log_archive_dest_1 scope=spfile sid=’*’; alter system reset log_archive_dest_2 scope=spfile sid=’*’; alter system set dg_broker_start=true scope=both; alter database recover managed standby database cancel; alter database flashback on; alter system set db_flashback_retention_target = 4320 scope=both; alter database recover managed standby database using current logfile parallel 8 disconnect;shutdown immediate; startup alter database recover managed standby database using current log file disconnect; |
|
登录dgmgrl控制台,开始准备配置策略 | primary instance host | dgmgrl sys/dji123@myappn | |
配置broker策略 | primary instance host | DGMGRL>create configuration ‘HASolution’ as primary database is ‘myappn’ connect identifier is ‘myappn’; DGMGRL>add database ‘myappn_dg’ as connect identifier is ‘myappn_dg’ ; DGMGRL>edit database myappn set property ‘LogXptMode’=’ASYNC’; DGMGRL>edit database myappn set property NetTimeout=60; DGMGRL>edit database myappn_dg set property ‘LogXptMode’=’ASYNC’; DGMGRL>edit database myappn_dg set property NetTimeout=60 DGMGRL>enable configuration; |
|
在broker测试切换是否成功 | primary instance host | DGMGRL>switchover to ‘myappn_dg’; | |
切换原主备关系 | primary instance host | DGMGRL>switchover to ‘myappn’; | |
启动FSFSO | primary instance host | DGMGRL>enable fast_start failover; | |
配置obsever | observer host | dgmgrl sys/dji123@OBSRV_myappn_DGMGRL
DGMGRL>edit database myappn set property ObserverConnectIdentifier=’OBSRV_myappn_DGMGRL’; |
|
启动obsever | observer host | cd /u01/PRD/oracle/12.1.0.2/myappn/network/dg_observ
nohup dgmgrl -silent sys/dji123@OBSRV_myappn_DGMGRL “start observer” & |
|
创建TAF应用连接服务 | primary instance host | exec dbms_service.create_service(‘prim_db’,’prim_db’); exec dbms_service.create_service(‘stby_db’,’stby_db’); exec dbms_service.start_service(‘prim_db’);CREATE OR REPLACE TRIGGER startDgServices after startup on database DECLARE db_role VARCHAR(30); db_open_mode VARCHAR(30); BEGIN SELECT DATABASE_ROLE, OPEN_MODE INTO db_role, db_open_mode FROM V$DATABASE; IF db_role = ‘PRIMARY’ THEN DBMS_SERVICE.START_SERVICE(‘prim_db’); DBMS_SERVICE.STOP_SERVICE(‘stby_db’); END IF; IF db_role = ‘PHYSICAL STANDBY’ AND db_open_mode LIKE ‘READ ONLY%’ THEN DBMS_SERVICE.START_SERVICE(‘stby_db’); END IF; END; / |
注:由FSFO自动主备切换的,备库不会完全shutdown,所以stby_db的服务不会停止,导致在原备库出现prim_db和stby_db两个服务的出现。因此在startup trigger中加入了stop stby_db service的命令。
但是却导致了,在正常重启的时候,由于不存在stby_db服务,因此alertlog中会报错: 可以忽略该报错。 |
(六)FSFO测试场景
描述 | 测试动作 | 预期结果 | 实际结果/备注 |
计划外主库实例挂掉 | kill -9 pid_of_smon 或者 SQL > shutdown abort |
Primary instance crash,standby instance进行切换。
切换后,原standby instance成primary instance. 在dgmgrl控制台可以看到ORA-16829: fast-start failover configuration is lagging,且原primary instance可以看到ORA-16661: the standby database needs to be reinstated |
2分钟左右完成切换 |
计划外备库实例挂掉 | kill -9 pid_of_smon 或者 SQL > shutdown abort |
dgmgrl控制台主库报错ORA-16810: multiple errors or warnings detected for the database,备库报错ORA-01034: ORACLE not available | 备库挂掉,应用不受影响 |
计划内关闭主库 | SQL > shutdown immediate | 主备库不切换,在dgmgrl控制台可以看到主库报错:Error: ORA-01034: ORACLE not available,备库提示信息:(*) Physical standby database | 计划内维护 |
计划内关闭备库 | SQL > shutdown immediate | 主备库不切换,在dgmgrl控制台可以看到主库报错:ORA-16829: fast-start failover configuration is lagging,备库提示信息:(*) Physical standby database (disabled) | 计划内维护 |
重新启动失败数据库实例 | SQL > startup | 启动原primary库之后,自动变成physical standby角色。在完成Clearing online log之后,进行mrp的介质恢复。
在dgmgrl控制台,看到原primary状态从ORA-16661: the standby database needs to be reinstated,变成ORA-16657: reinstatement of database in progress,继而变成ORA-16829: fast-start failover configuration is lagging,待lag补齐之后最终变成Physical standby database |
重启的原primary库恢复正常大约5~10分钟,重启期间应用完全不受影响。 |
主库监听挂掉 | kill -9 pid_of_listener | 应用无法连接。
主库和备库之间如果没有lag,处于standby redolog的recovery,则不受影响。 主库和备库之间如果有lag,则当备库启动时,需要使用archivelog的recovery,就会报错 |
监听挂掉,应用无法连接数据库,需要手工启动监听。
注:已经添加自动重启监听脚本,发现监听宕掉,自动拉起监听。 |
备库监听挂掉 | kill -9 pid_of_listener | 应用正常连接。
主库和备库之间如果没有lag,处于standby redolog的recovery,则不受影响。 主库和备库之间如果有lag,dgmgrl控制台主库报错:ORA-16810: multiple errors or warnings detected for the database,备库报错:ORA-12541: TNS:no listener |
应用连接正常,需要手工启动备库监听
注:已经添加自动重启监听脚本,发现监听宕掉,自动拉起监听。 |
计划内切换主备库 | DGMGRL> switchover to ‘myappn_dg’; | 在切换期间,应用连接报错ORA-16456: switchover to standby in progress or completed。
切换后,应用恢复正常。 |
大约2分钟左右完成切换 |
(七)备份和其他
需求项 | 操作环境 | 需求细节内容 | 说明 |
数据库备份配置 | primary instance host/standby instance host | 每天1天进行数据库备库,具体脚本内容见右。 | 00 01 * * * /home/oracle/dba_script/backup.sh |
listener watchlog | primary instance host/standby instance host | 监控listener进程,每分钟检查一次,如果发现宕掉,自动拉起listener。 | * * * * * /home/oracle/dba_script/watchdog_lnsr.sh |
(八)应用用户初始化:
需求项 | 操作环境 | 需求细节内容 | 说明 |
创建读写应用用户 | primary instance host | create user appuser_rw identified by “password1”; | |
创建只读应用用户 | primary instance host | create user appuser_ro identified by “password2”; | |
授权读写用户基本权限 | primary instance host | grant connect,resource,unlimited tablespace to appuser_rw; | |
授权只写用户基本权限 | primary instance host | grant create session to appuser_ro; | |
创建读写角色 | primary instance host | create role app_role_rw; | |
创建只写角色 | primary instance host | create role app_role_ro; | |
将除resource之外的权限给读写角色 | primary instance host | grant CREATE JOB, CREATE MATERIALIZED VIEW, CREATE PUBLIC DATABASE LINK, CREATE PUBLIC SYNONYM, CREATE VIEW, ALTER SESSION to app_role_rw; |
|
将select any table给只读用户 | primary instance host | grant SELECT ANY TABLE to app_role_ro; | |
将读写角色给读写用户 | primary instance host | grant app_role_rw to appuser_rw; | |
将只读角色给只读用户 | primary instance host | grant app_role_ro to appuser_ro; | |
将用户密码过期,要求用户第一次登录的时候自己修改密码。 | primary instance host | alter user appuser_rw password expire; | |
alter user appuser_ro password expire; |