IT博客汇 | Step by Step建立带ADG的sharding db

Step by Step建立带ADG的sharding db

小荷发表于 2016-11-14 08:31:01

上次的文章，我们说到sharding其实是非常需要HA架构的，我们今天来安装一个带ADG的sharding node的sharding database。
（我只是把shard node做成了ADG，且没有加RAC架构，shardcat也没有做ADG，因为……我的虚拟机实在吃不消了！）

1.大致环境介绍：
5台主机：
sdb1装shardcat，
sdb2装shard node，sh1，
sdb3装shard node，sh2，
sdb4装shard node，sh3，
sdb5装shard node，sh4，
其中sh1和sh3互为主备，sh2和sh4互为主备。

sh1和sh2在region1，成为一个primary_shardgroup的shard group
sh3和sh4在region2，成为一个standby_shardgroup的shard group。

2.开始安装

2.1 在5个主机安装12.2数据库软件，在sdb1建立shardcat数据库实例。

2.2 在sdb1的shardcat数据库：

SQL> alter system set db_create_file_dest='/u01/ora12c/app/oracle/oradata' scope=both;
SQL> alter system set open_links=16 scope=spfile;
SQL> alter system set open_links_per_instance=16 scope=spfile;
SQL> startup force
SQL> 
SQL> alter user gsmcatuser account unlock;
SQL> alter user gsmcatuser identified by oracle;                 
SQL> CREATE USER mygdsadmin IDENTIFIED BY oracle;
SQL> GRANT connect, create session, gsmadmin_role to mygdsadmin;
SQL> grant inherit privileges on user SYS to GSMADMIN_INTERNAL;  
SQL> 
SQL> execute dbms_xdb.sethttpport(8080);
SQL> commit;
SQL> @?/rdbms/admin/prvtrsch.plb 
SQL> exec DBMS_SCHEDULER.SET_AGENT_REGISTRATION_PASS('oracleagent'); 
SQL> exit

2.3 在各个shard node节点：

schagent -start
echo oracleagent|schagent -registerdatabase sdb1 8080
mkdir -p /u01/ora12c/app/oracle/oradata      
mkdir -p /u01/ora12c/app/oracle/fast_recovery_area

2.4 在shardcat节点：

GDSCTL>create shardcatalog -database sdb1:1521:shardcat -chunks 12 -user mygdsadmin/oracle -sdb shardcat -region region1,region2
GDSCTL>
GDSCTL>add gsm -gsm sharddirector1 -listener 1571 -pwd oracle -catalog sdb1:1521:shardcat -region region1
GDSCTL>add gsm -gsm sharddirector2 -listener 1571 -pwd oracle -catalog sdb1:1521:shardcat -region region2
GDSCTL>
GDSCTL>start gsm -gsm sharddirector1
GDSCTL>start gsm -gsm sharddirector2
GDSCTL>
GDSCTL>add credential -credential oracle_cred -osaccount oracle12c -ospassword oracle12c
GDSCTL>
GDSCTL>add shardgroup -shardgroup primary_shardgroup -region region1 -deploy_as primary
GDSCTL>ADD SHARDGROUP -shardgroup standby_shardgroup -region region2 -deploy_as active_standby
GDSCTL>
GDSCTL>add invitednode sdb2
GDSCTL>create shard -shardgroup primary_shardgroup -destination sdb2 -credential oracle_cred
GDSCTL>add invitednode sdb3
GDSCTL>create shard -shardgroup primary_shardgroup -destination sdb3 -credential oracle_cred
GDSCTL>add invitednode sdb4
GDSCTL>create shard -shardgroup standby_shardgroup -destination sdb4 -credential oracle_cred
GDSCTL>add invitednode sdb5
GDSCTL>create shard -shardgroup standby_shardgroup -destination sdb5 -credential oracle_cred




GDSCTL>config

Regions
------------------------
region1                       
region2                       

GSMs
------------------------
sharddirector1                
sharddirector2                

Sharded Database
------------------------
shardcat                      

Databases
------------------------
sh1                           
sh2                           
sh3                           
sh4                           

Shard Groups
------------------------
primary_shardgroup            
standby_shardgroup            

Shard spaces
------------------------
shardspaceora                 

Services
------------------------

GDSCTL pending requests
------------------------
Command                       Object                        Status                        
-------                       ------                        ------                        

Global properties
------------------------
Name: oradbcloud
Master GSM: sharddirector1
DDL sequence #: 0


GDSCTL>
GDSCTL>
GDSCTL>config shard
Name                Shard Group         Status    State       Region    Availability 
----                -----------         ------    -----       ------    ------------ 
sh1                 primary_shardgroup  U         none        region1   -            
sh2                 primary_shardgroup  U         none        region1   -            
sh3                 standby_shardgroup  U         none        region2   -            
sh4                 standby_shardgroup  U         none        region2   -            

GDSCTL>
GDSCTL>
GDSCTL>
GDSCTL>deploy
               
 deploy: examining configuration...
deploy: deploying primary shard 'sh1' ...
deploy: network listener configuration successful at destination 'sdb2'
deploy: starting DBCA at destination 'sdb2' to create primary shard 'sh1' ...
deploy: deploying primary shard 'sh2' ...
deploy: network listener configuration successful at destination 'sdb3'
deploy: starting DBCA at destination 'sdb3' to create primary shard 'sh2' ...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: waiting for 2 DBCA primary creation job(s) to complete...
deploy: DBCA primary creation job succeeded at destination 'sdb2' for shard 'sh1'
deploy: deploying standby shard 'sh3' ...
deploy: network listener configuration successful at destination 'sdb4'
deploy: starting DBCA at destination 'sdb4' to create standby shard 'sh3' ...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: waiting for 1 DBCA primary creation job(s) to complete...
deploy: DBCA primary creation job succeeded at destination 'sdb3' for shard 'sh2'
deploy: deploying standby shard 'sh4' ...
deploy: network listener configuration successful at destination 'sdb5'
deploy: starting DBCA at destination 'sdb5' to create standby shard 'sh4' ...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: waiting for 2 DBCA standby creation job(s) to complete...
deploy: DBCA standby creation job succeeded at destination 'sdb4' for shard 'sh3'
deploy: waiting for 1 DBCA standby creation job(s) to complete...
deploy: waiting for 1 DBCA standby creation job(s) to complete...
deploy: waiting for 1 DBCA standby creation job(s) to complete...
deploy: DBCA standby creation job succeeded at destination 'sdb5' for shard 'sh4'
deploy: requesting Data Guard configuration on shards via GSM
deploy: shards configured successfully
The operation completed successfully
GDSCTL>GDSCTL>
GDSCTL>
GDSCTL>
GDSCTL>config shard
Name                Shard Group         Status    State       Region    Availability 
----                -----------         ------    -----       ------    ------------ 
sh1                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
sh2                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
sh3                 standby_shardgroup  Ok        Deployed    region2   READ ONLY    
sh4                 standby_shardgroup  Ok        Deployed    region2   READ ONLY    

GDSCTL>
GDSCTL>databases
Database: "sh1" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1
Alert: Data Guard observer is not running.
   Registered instances:
     shardcat%1
Database: "sh2" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1
Alert: Data Guard observer is not running.
   Registered instances:
     shardcat%11
Database: "sh3" Registered: Y State: Ok ONS: N. Role: PH_STNDBY Instances: 1 Region: region2
   Registered instances:
     shardcat%21
Database: "sh4" Registered: Y State: Ok ONS: N. Role: PH_STNDBY Instances: 1 Region: region2
   Registered instances:
     shardcat%31

GDSCTL>

我们注意上，上面有个提示：Alert: Data Guard observer is not running. 因为默认安装好之后，observer是没有启动的，（因为没有自动配置好）。

3. 配置observer：

3.1 show configuration显示是observer没有启动

DGMGRL> connect sys/oracle
Connected to "sh1"
Connected as SYSDG.
DGMGRL> 
DGMGRL> show configuration verbose

Configuration - sh1

  Protection Mode: MaxPerformance
  Members:
  sh1 - Primary database
    Warning: ORA-16819: fast-start failover observer not started

    sh3 - (*) Physical standby database 
      Warning: ORA-16819: fast-start failover observer not started

  (*) Fast-Start Failover target

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'SUPPORT'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'sh1_CFG'

Fast-Start Failover: ENABLED

  Threshold:          30 seconds
  Target:             sh3
  Observer:           (none)
  Lag Limit:          30 seconds
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: (none)
  Observer Override:  FALSE

Configuration Status:
WARNING

DGMGRL>

3.2 尝试手工启动：

DGMGRL>  START OBSERVER 
[W000 11/12 11:40:51.83] FSFO target standby is sh3
[P001 11/12 11:40:54.22] Authentication failed.
DGM-16979: Unable to log on to the primary or standby database as SYSDBA
Failed to start the observer.

原因是不能使用sys/oracle登录。需要用sys/oracle@sh1登录。

3.3 由于是gsm创建的sys用户，你不知道sys用户密码是什么，所以需要修改：

SQL> alter user sys identified by oracle;

3.4 observer需要第三方节点做为observer site。我们选择shardcat节点作为obsever，在shardcat主机，配置tnsnames如下：

[oracle12c@sdb1 admin]$ cat tnsnames.ora
# tnsnames.ora Network Configuration File: /u01/ora12c/app/oracle/product/12.2.0/db_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.

LISTENER_SHARDCAT =
  (ADDRESS = (PROTOCOL = TCP)(HOST = sdb1)(PORT = 1521))


SHARDCAT =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb1)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = shardcat)
    )
  )

SH1_OBSRV =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb2)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh1)
    )
  )


SH2_OBSRV =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb3)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh2)
    )
  )


SH3_OBSRV =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb4)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh3)
    )
  )

SH4_OBSRV =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb5)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh4)
    )
  )



SH1_OBSRV_DGMGRL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb2)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh1_DGMGRL)
    )
  )


SH2_OBSRV_DGMGRL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb3)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh2_DGMGRL)
    )
  )


SH3_OBSRV_DGMGRL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb4)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh3_DGMGRL)
    )
  )

SH4_OBSRV_DGMGRL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = sdb5)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = sh4_DGMGRL)
    )
  )
[oracle12c@sdb1 admin]$

3.5 在shardcat主机：

dgmgrl sys/oracle@sh1_obsrv
edit database sh1 set property ObserverConnectIdentifier='sh1_obsrv';
edit database sh3 set property ObserverConnectIdentifier='sh3_obsrv';

dgmgrl sys/oracle@sh2_obsrv
edit database sh2 set property ObserverConnectIdentifier='sh2_obsrv';
edit database sh4 set property ObserverConnectIdentifier='sh4_obsrv';

3.6 启动observer

cd /home/oracle12c/startObsrv_sh1
nohup dgmgrl -silent sys/oracle@sh1_obsrv "start observer" &

cd /home/oracle12c/startObsrv_sh2
nohup dgmgrl -silent sys/oracle@sh2_obsrv "start observer" &

注意2个ovserver需要到2个目录下面启动，不然每次启动，会生成一个fsfo.dat，在同一目录下不能生成同名文件，会冲突。

3.7 检查，恢复正常：

DGMGRL> show configuration verbose

Configuration - sh1

  Protection Mode: MaxPerformance
  Members:
  sh1 - Primary database
    sh3 - (*) Physical standby database 

  (*) Fast-Start Failover target

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'SUPPORT'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'sh1_CFG'

Fast-Start Failover: ENABLED

  Threshold:          30 seconds
  Target:             sh3
  Observer:           sdb1<<<<<<<<<<<<<<<<<注意，observer在shardcat上，即sdb1上。
  Lag Limit:          30 seconds
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: (none)
  Observer Override:  FALSE

Configuration Status:
SUCCESS

DGMGRL>

同时，在gdsctl中检查：

GDSCTL>databases
Database: "sh1" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1
   Registered instances:
     shardcat%1
Database: "sh2" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1
   Registered instances:
     shardcat%11
Database: "sh3" Registered: Y State: Ok ONS: N. Role: PH_STNDBY Instances: 1 Region: region2
   Registered instances:
     shardcat%21
Database: "sh4" Registered: Y State: Ok ONS: N. Role: PH_STNDBY Instances: 1 Region: region2
   Registered instances:
     shardcat%31

GDSCTL>
GDSCTL>
GDSCTL>config shard
Name                Shard Group         Status    State       Region    Availability 
----                -----------         ------    -----       ------    ------------ 
sh1                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
sh2                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
sh3                 standby_shardgroup  Ok        Deployed    region2   READ ONLY    
sh4                 standby_shardgroup  Ok        Deployed    region2   READ ONLY    

GDSCTL>



注：
GDSCTL>databases
Database: "sh1" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1
   Registered instances: <<<<<<<<<<<<已经配置好的，所以没有告警
     shardcat%1
Database: "sh2" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1
Alert: Data Guard observer is not running. <<<<<<如果没配置好，会有一个alert告警
   Registered instances:
     shardcat%11
Database: "sh3" Registered: Y State: Ok ONS: N. Role: PH_STNDBY Instances: 1 Region: region2
   Registered instances:
     shardcat%21
Database: "sh4" Registered: Y State: Ok ONS: N. Role: PH_STNDBY Instances: 1 Region: region2
   Registered instances:
     shardcat%31

GDSCTL>

4. 添加services：

GDSCTL>  add service -service oltp_rw_srvc -role primary
The operation completed successfully
GDSCTL>start service -service oltp_rw_srvc
The operation completed successfully
GDSCTL>
GDSCTL>add service -service oltp_ro_srvc -role physical_standby
The operation completed successfully
GDSCTL>start service oltp_ro_srvc
GSM-45011: Unexpected argument
oltp_ro_srvc 
GDSCTL>start service -service oltp_ro_srvc
The operation completed successfully
GDSCTL>
GDSCTL>
GDSCTL>status service 
Service "oltp_ro_srvc.shardcat.oradbcloud" has 2 instance(s). Affinity: ANYWHERE
   Instance "shardcat%21", name: "sh3", db: "sh3", region: "region2", status: ready.
   Instance "shardcat%31", name: "sh4", db: "sh4", region: "region2", status: ready.
Service "oltp_rw_srvc.shardcat.oradbcloud" has 2 instance(s). Affinity: ANYWHERE
   Instance "shardcat%1", name: "sh1", db: "sh1", region: "region1", status: ready.
   Instance "shardcat%11", name: "sh2", db: "sh2", region: "region1", status: ready.

GDSCTL>
GDSCTL>
GDSCTL>services
Service "oltp_ro_srvc.shardcat.oradbcloud" has 2 instance(s). Affinity: ANYWHERE
   Instance "shardcat%21", name: "sh3", db: "sh3", region: "region2", status: ready.
   Instance "shardcat%31", name: "sh4", db: "sh4", region: "region2", status: ready.
Service "oltp_rw_srvc.shardcat.oradbcloud" has 2 instance(s). Affinity: ANYWHERE
   Instance "shardcat%1", name: "sh1", db: "sh1", region: "region1", status: ready.
   Instance "shardcat%11", name: "sh2", db: "sh2", region: "region1", status: ready.

GDSCTL>

5. 创建sharding的应用用户app_schema:

sqlplus "/ as sysdba
alter session enable shard ddl;
create user app_schema identified by app_schema_password;
grant all privileges to app_schema;
grant gsmadmin_role to app_schema;
grant select_catalog_role to app_schema;
grant connect, resource to app_schema;
grant dba to app_schema;
grant execute on dbms_crypto to app_schema;

后面就是照常的创建shard table和duplicate table了。不再累述。