IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    为shading database添加节点

    小荷发表于 2016-11-10 10:04:15
    love 0

    添加或者删除shard,是shard的数量发生变化,叫做re-shard。re-shard会导致chunk的挪动(chunk migration),re-shard的chunk migration是将原来的chunk的序号大的部分,移动到新node上。示意图如下:
    re-sharding

    我们来发起一次添加shard节点,做re-shard:
    1. 检查初始状态,在shardcat主机:

    [oracle12c@sdb1 ~]$ gdsctl
    GDSCTL: Version 12.2.0.0.0 - Beta on Mon Nov 07 10:52:30 CST 2016
    
    Copyright (c) 2011, 2015, Oracle.  All rights reserved.
    
    Welcome to GDSCTL, type "help" for information.
    
    Current GSM is set to SHARDDIRECTOR1
    GDSCTL>set gsm -gsm SHARDDIRECTOR1
    GDSCTL>connect mygdsadmin/oracle
    Catalog connection is established
    GDSCTL>
    GDSCTL>
    GDSCTL>config shard  
    Name                Shard Group         Status    State       Region    Availability 
    ----                -----------         ------    -----       ------    ------------ 
    sh1                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
    sh2                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
    
    GDSCTL>
    
    [oracle12c@sdb1 ~]$ cat /etc/hosts
    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1               testdb2 localhost.localdomain localhost
    192.168.56.21   sdb1    sdb1.localdomain
    192.168.56.22   sdb2    sdb2.localdomain
    192.168.56.23   sdb3    sdb3.localdomain
    192.168.56.24   sdb4    sdb4.localdomain
    192.168.56.25   sdb5    sdb5.localdomain
    [oracle12c@sdb1 ~]$

    2. 在新的shard node主机(sdb4)启动schagent

    [oracle12c@sdb4 dbs]$ schagent -start
    
    Scheduler agent started using port 22991
    [oracle12c@sdb4 dbs]$ 
    [oracle12c@sdb4 dbs]$ schagent -status
    Agent running with PID 4585
    
    Agent_version:12.2.0.1.2
    Running_time:00:00:10
    Total_jobs_run:0
    Running_jobs:0
    Platform:Linux
    ORACLE_HOME:/u01/ora12c/app/oracle/product/12.2.0/db_1
    ORACLE_BASE:/u01/ora12c/app/oracle
    Port:22991
    Host:sdb4
    
    [oracle12c@sdb4 dbs]$ 
    [oracle12c@sdb4 dbs]$ 
    [oracle12c@sdb4 dbs]$ echo oracleagent|schagent -registerdatabase sdb1 8080
    Agent Registration Password ?  
    Oracle Scheduler Agent Registration for 12.2.0.1.2 Agent
    Agent Registration Successful!
    [oracle12c@sdb4 dbs]$ 
    [oracle12c@sdb4 dbs]$ mkdir -p /u01/ora12c/app/oracle/oradata
    [oracle12c@sdb4 dbs]$ mkdir -p /u01/ora12c/app/oracle/fast_recovery_area
    [oracle12c@sdb4 dbs]$

    3.在shardcat主机add shard node

    GDSCTL>add invitednode sdb4
    GDSCTL>
    GDSCTL>create shard -shardgroup primary_shardgroup -destination sdb4 -credential oracle_cred
    DB Unique Name: sh21
    GDSCTL>

    4. deploy前检查状态:

    GDSCTL>config
    
    Regions
    ------------------------
    region1                       
    
    GSMs
    ------------------------
    sharddirector1                
    
    Sharded Database
    ------------------------
    shardcat                      
    
    Databases
    ------------------------
    sh1                           
    sh2                           
    sh21                          
    
    Shard Groups
    ------------------------
    primary_shardgroup            
    
    Shard spaces
    ------------------------
    shardspaceora                 
    
    Services
    ------------------------
    oltp_rw_srvc                  
    
    GDSCTL pending requests
    ------------------------
    Command                       Object                        Status                        
    -------                       ------                        ------                        
    
    Global properties
    ------------------------
    Name: oradbcloud
    Master GSM: sharddirector1
    DDL sequence #: 38
    
    
    GDSCTL>
    GDSCTL>
    GDSCTL>config shardspace
    SHARDSPACE                    Chunks                        
    ----------                    ------                        
    shardspaceora                 12                            
    
    GDSCTL>
    GDSCTL>
    GDSCTL>config vncr
    Name                          Group ID                      
    ----                          --------                      
    sdb2                                                        
    sdb3                                                        
    192.168.56.22                                               
    192.168.56.23                                               
    192.168.56.21                                               
    sdb4                                                        
    127.0.0.1                                                   
    
    GDSCTL>
    GDSCTL>config shardgroup
    Shard Group         Chunks Region              SHARDSPACE          
    -----------         ------ ------              ----------          
    primary_shardgroup  12     region1             shardspaceora       
    
    GDSCTL>config shard
    Name                Shard Group         Status    State       Region    Availability 
    ----                -----------         ------    -----       ------    ------------ 
    sh1                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
    sh2                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
    sh21                primary_shardgroup  U         none        region1   -            
    
    GDSCTL>
    GDSCTL>config chunks
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         6         
    sh2                           7         12        
    
    GDSCTL>
    
    chunk的情况,也可以在shardcat数据库里检查:
    SQL> select a.name Shard, count( b.chunk_number) Number_of_Chunks from
      2  gsmadmin_internal.database a, gsmadmin_internal.chunk_loc b where
      3  a.database_num=b.database_num group by a.name;
    
    SHARD                          NUMBER_OF_CHUNKS
    ------------------------------ ----------------
    sh1                                           6
    sh2                                           6
    
    SQL>
    
    另外,re-shard的状态也可以如下检查,可以看到ongoing是空,即在deploy之前,没有re-shard的动作:
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_Reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         6         
    sh2                           7         12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    
    [oracle12c@sdb1 ~]$

    5.运行deploy
    注意deploy之前,确认新主机上的listener已经被关闭,不然的话,在deploy会报错:

    --如果不关闭listener,会报错。
    GDSCTL>deploy
    GSM Errors: 
    CATALOG:ORA-02610: Remote job failed with error: 
    EXTERNAL_LOG_ID="job_76954_8",
    USERNAME="oracle12c"
    For more details:
      select destination, output from all_scheduler_job_run_details
      where job_name='SHARD_SH21_NETCA'
    CATALOG:ORA-45575: Deployment has terminated due to previous errors.
    
    --正常的安装,如下:
    GDSCTL>deploy
         deploy: examining configuration...
    deploy: deploying primary shard 'sh21' ...
    deploy: network listener configuration successful at destination 'sdb4'
    deploy: starting DBCA at destination 'sdb4' to create primary shard 'sh21' ...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: waiting for 1 DBCA primary creation job(s) to complete...
    deploy: DBCA primary creation job succeeded at destination 'sdb4' for shard 'sh21'
    deploy: requesting Data Guard configuration on shards via GSM
    deploy: shards configured; background operations in progress
    The operation completed successfully
    GDSCTL>

    在deploy的过程中,可以看re-shard已经在schedule了,注意此时的情况是sdb4正在create instance的过程中:

    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_Reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         6         
    sh2                           7         12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    6         sh1                                                         scheduled 
    5         sh1                                                         scheduled 
    12        sh2                                                         scheduled 
    11        sh2                                                         scheduled 
    
    [oracle12c@sdb1 ~]$

    注意,deploy完,只是sdb4上的数据库创建完成,chunk是还没有挪完的。

    [oracle12c@sdb1 ~]$ gdsctl config shard 
    
    Name                Shard Group         Status    State       Region    Availability 
    ----                -----------         ------    -----       ------    ------------ 
    sh1                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
    sh2                 primary_shardgroup  Ok        Deployed    region1   ONLINE       
    sh21                primary_shardgroup  Ok        Deployed    region1   ONLINE

    6. 观察re-shard的过程,注意Ongoing chunk movement部分的变化:

    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         6         
    sh2                           7         12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    6         sh1                           sh21                          scheduled 
    5         sh1                           sh21                          scheduled 
    12        sh2                           sh21                          scheduled 
    11        sh2                           sh21                          scheduled 
    
    
    (过一会观察)
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         6         
    sh2                           7         12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    6         sh1                           sh21                          copied    
    5         sh1                           sh21                          scheduled 
    12        sh2                           sh21                          scheduled 
    11        sh2                           sh21                          scheduled 
    
    (过一会观察)
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         5         
    sh2                           7         12        
    sh21                          6         6         
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    5         sh1                           sh21                          started   
    12        sh2                           sh21                          scheduled 
    11        sh2                           sh21                          scheduled 
    
    
    (过一会观察)
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         4         
    sh2                           7         12        
    sh21                          5         6         
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    12        sh2                           sh21                          started   
    11        sh2                           sh21                          scheduled 
    
    
    (过一会观察)
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         4         
    sh2                           7         11        
    sh21                          5         6         
    sh21                          12        12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    11        sh2                           sh21                          started  
    
    (过一会观察)
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         4         
    sh2                           7         10        
    sh21                          5         6         
    sh21                          11        12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------

    注:如果deploy完之后一直是schedule状态不动,需要config shared看看是否有DDL error。

    另外,需要额外提起的是,chunk migration不仅仅是在shard数量发生改变的时候,也发生在数据或者负载倾斜较大的时候(是有DBA发起)。在chunk migration的时候,chunk大部分时间是online的,但是期间会有几秒钟的时间chunk中的data处于read-only状态。

    我们来手工的move一下chunk:

    GDSCTL>MOVE CHUNK -CHUNK 4 -SOURCE sh1 -TARGET sh21
    The operation completed successfully
    GDSCTL>

    另一窗口:

    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         4         
    sh2                           7         10        
    sh21                          5         6         
    sh21                          11        12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------    
    4         sh1                           sh21                          started   
    
    (过一会观察)
    [oracle12c@sdb1 ~]$ gdsctl config chunks -show_reshard
    Chunks
    ------------------------
    Database                      From      To        
    --------                      ----      --        
    sh1                           1         3         
    sh2                           7         10        
    sh21                          4         6         
    sh21                          11        12        
    
    Ongoing chunk movement
    ------------------------
    Chunk     Source                        Target                        status    
    -----     ------                        ------                        ------

    做move shard,我们来看看对应的节点的alertlog:
    节点sh1的alertlog:

    2016-11-10T17:38:13.430097+08:00
    alter tablespace "C004TSP_SET_1" READ ONLY
    2016-11-10T17:38:13.462870+08:00
     Converting block 0 to version 10 format
    Completed: alter tablespace "C004TSP_SET_1" READ ONLY
    2016-11-10T17:38:15.202020+08:00
    DM00 started with pid=60, OS id=6099, job GSMADMIN_INTERNAL.SYS_EXPORT_TRANSPORTABLE_01
    2016-11-10T17:38:16.688903+08:00
    
    DW00 started with pid=61, OS id=6101, wid=1, job GSMADMIN_INTERNAL.SYS_EXPORT_TRANSPORTABLE_01
    2016-11-10T17:39:03.939903+08:00
    Some indexes or index [sub]partitions of table APP_SCHEMA.LINEITEMS have been marked unusable
     2016-11-10T17:39:04.578624+08:00
    Some indexes or index [sub]partitions of table APP_SCHEMA.ORDERS have been marked unusable
    2016-11-10T17:39:04.678710+08:00
    Some indexes or index [sub]partitions of table APP_SCHEMA.CUSTOMERS have been marked unusable
     drop tablespace "C004TSP_SET_1" including contents
    2016-11-10T17:39:06.693499+08:00
    Deleted Oracle managed file /u01/ora12c/app/oracle/oradata/SH1/datafile/o1_mf_c004tsp__d28dcj4w_.dbf
    Completed:  drop tablespace "C004TSP_SET_1" including contents
      2016-11-10T17:44:10.282821+08:00
    Starting control autobackup
    
    Control autobackup written to DISK device
    
    handle '/u01/ora12c/app/oracle/fast_recovery_area/SH1/autobackup/2016_11_10/o1_mf_s_927567850_d28jcbo5_.bkp'

    节点sh21的alertlog:

    2016-11-10T17:39:37.392725+08:00
    Full restore complete of datafile 10 to datafile copy /u01/ora12c/app/oracle/oradata/SH21/datafile/o1_mf_c004tsp__d28dcj4w_.dbf.  Elapsed time: 0:00:00
      checkpoint is 1471673
      last deallocation scn is 1431063
    2016-11-10T17:40:13.636724+08:00
    Incremental restore complete of datafile 10 to datafile copy /u01/ora12c/app/oracle/oradata/SH21/datafile/o1_mf_c004tsp__d28dcj4w_.dbf
      checkpoint is 1471679
      last deallocation scn is 1431063
    2016-11-10T17:40:14.957209+08:00
    DM00 started with pid=54, OS id=5456, job GSMADMIN_INTERNAL.SYS_IMPORT_TRANSPORTABLE_01
    2016-11-10T17:40:15.530391+08:00
    
    DW00 started with pid=55, OS id=5458, wid=1, job GSMADMIN_INTERNAL.SYS_IMPORT_TRANSPORTABLE_01
     2016-11-10T17:40:18.143905+08:00
    Plug in tablespace C004TSP_SET_1 with datafile
      '/u01/ora12c/app/oracle/oradata/SH21/datafile/o1_mf_c004tsp__d28dcj4w_.dbf'
    2016-11-10T17:40:20.382693+08:00
    ALTER TABLESPACE "C004TSP_SET_1" READ WRITE
    Completed: ALTER TABLESPACE "C004TSP_SET_1" READ WRITE
    ALTER TABLESPACE "C004TSP_SET_1" READ ONLY
    2016-11-10T17:40:20.925913+08:00
     Converting block 0 to version 10 format
    Completed: ALTER TABLESPACE "C004TSP_SET_1" READ ONLY
    2016-11-10T17:40:24.893850+08:00
     alter tablespace "C004TSP_SET_1" read write
    Completed:  alter tablespace "C004TSP_SET_1" read write
        2016-11-10T17:49:12.080018+08:00
    Starting control autobackup
    
    2016-11-10T17:49:13.293974+08:00
    Control autobackup written to DISK device
    
    handle '/u01/ora12c/app/oracle/fast_recovery_area/SH21/autobackup/2016_11_10/o1_mf_s_927568152_d28jnrnp_.bkp'

    从上面的alertlog,我们大致的可以看出,chunk migration的过程就是综合利用rman增量备份和TTS的过程:
    level 0备份源chunk相关的TS,还原到新shard->开始FAN(等待几秒)->将源chunk相关的TS置于read-only->level 1备份还原->chunk up(更新routing table连新shard)->chunk down(更新routing table断开源shard)->结束FAN(等待几秒)->删除原shard上的老chunk



沪ICP备19023445号-2号
友情链接