IT博客汇
  • 首页
  • 精华
  • 技术
  • 设计
  • 资讯
  • 扯淡
  • 权利声明
  • 登录 注册

    [原]GreenPlum 集群 gpfdist 实战

    mchdba发表于 2017-05-19 16:30:47
    love 0



    并行文件服务gpfdist组件模块,能够实现最大并行度、加载带宽,默认greenplum集群已经有了已经安装了gpfdist,但是如果在单独的服务器上,还是需要再次安装的单独的组件,需要下载一个loaders的组件安装包进行安装。



    1,下载

    下载地址:https://network.pivotal.io/products/pivotal-gpdb#/releases/4540/file_groups/561,选择和greenplumdatabase相同款的loaders,loaders里面包括有gpfdisk组件,下载显示如下:

    C:\pic\greenplum\005.png

     

     



    2,安装

    基础组件

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    wget http://pyyaml.org/download/libyaml/yaml-0.1.7.tar.gz

    tar -xvf yaml-0.1.7.tar.gz

    cd yaml-0.1.7

    ./configure                                                                             

    make

    make install

     

     

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    (1)解压缩

    unzip greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.zip                                          

     

    (2)创建软件目录

    mkdir /data/greenplum

    chown -R gpadmin:gpadmin /data/greenplum

     

    (3)开始安装

    sh greenplum-loaders-4.3.8.2-build-1-RHEL5-x86_64.bin -y

    (4)查看组件,可以看到gpfdist和gpload

    [gpadmin@db_m2_slave1 ~]$ ll /data/greenplum/bin

    total 756

    drwxr-xr-x 4 gpadmin gpadmin   4096 May 10  2016 ext

    -rwxr-xr-x 1 gpadmin gpadmin 663372 May 10  2016 gpfdist

    -rwxr-xr-x 1 gpadmin gpadmin    311 May 10  2016 gpload

    -rwxr-xr-x 1 gpadmin gpadmin 100338 May 10  2016 gpload.py                                  

    [gpadmin@db_m2_slave1 ~]$

     



    3,使用

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    启动命令:

    nohup /data/greenplum/bin/gpfdist -d /home/gpadmin/ -p 8090 > /home/gpadmin/gpfdist.log  &

     

    启动过程:

    [gpadmin@db_m2_slave1 ~]$ nohup /data/greenplum/bin/gpfdist -d /home/gpadmin/ -p 8090 > /home/gpadmin/gpfdist.log  &

    [1] 27003

    [gpadmin@db_m2_slave1 ~]$

    [gpadmin@db_m2_slave1 ~]$ more /home/gpadmin/gpfdist.log

    2017-05-12 14:10:31 27003 INFO Before opening listening sockets - following listening sockets are available:

    2017-05-12 14:10:31 27003 INFO IPV6 socket: [::]:8090

    2017-05-12 14:10:31 27003 INFO IPV4 socket: 0.0.0.0:8090

    2017-05-12 14:10:31 27003 INFO Trying to open listening socket:

    2017-05-12 14:10:31 27003 INFO IPV6 socket: [::]:8090

    2017-05-12 14:10:31 27003 INFO Opening listening socket succeeded

    2017-05-12 14:10:31 27003 INFO Trying to open listening socket:

    2017-05-12 14:10:31 27003 INFO IPV4 socket: 0.0.0.0:8090

    Serving HTTP on port 8090, directory /home/gpadmin

    [gpadmin@db_m2_slave1 ~]$

     



    4,通过gpfdist服务建立的外部表

    建立测试数据,准备2个txt数据,文件名字t01.txt/t02.txt

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    [gpadmin@db_m2_slave1 gpdextdata]$ pwd

    /home/gpadmin/gpdextdata

    [gpadmin@db_m2_slave1 gpdextdata]$ more t01.txt

    1|aaa

    2|zhangsan

     

    [gpadmin@db_m2_slave1 gpdextdata]$ more t02.txt                                       

    3|wanger

    4|mazi

     

    [gpadmin@db_m2_slave1 gpdextdata]$

     

    在greenplum db上建立外部表,指向gpfdist服务的t01.txt、t02.txt数据,建立外部表的sql语句如下,在psql命令窗口上执行:

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    create external table public.t01_ext_1 (

    id integer,

    name varchar(128)

    )

    location (

    /*'gpfdist://101.254.13.72:8090/gpextdata/test001.txt',

    'gpfdist:// 101.254.3.72:8090/gpextdata/test002.txt'*/

    /*'gpfdist:// 101.254.13.72:8090/gpextdata/*.txt'*/

    'gpfdist://101.254.13.72:8090/gpextdata/t01.txt',

    'gpfdist:// 101.254.13.72:8090/gpextdata/t02.txt'

    )

    Format 'TEXT' (delimiter as E'|' null as '' escape 'OFF')

    --Encoding 'GB18030' Log errors into public.test001_err segment reject limit 10 rows               

    ;

     


    执行过程:

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    (1)创建外部表成功:

    yueworld_db=# create external table public.t01_ext_1 (

    yueworld_db(# id integer,

    yueworld_db(# name varchar(128)

    yueworld_db(# )

    yueworld_db-# location (

    yueworld_db(# 'gpfdist://101.254.13.72:8090/gpextdata/t01.txt',

    yueworld_db(# 'gpfdist:// 101.254.13.72:8090/gpextdata/t02.txt'

    yueworld_db(# )

    yueworld_db-# Format 'TEXT' (delimiter as E'|' null as '' escape 'OFF')                                 

    yueworld_db-# ;

    CREATE EXTERNAL TABLE

    yueworld_db=# 

    yueworld_db=# select * from public.t01_ext_1;;
     id | name 
    ----+------
      1 | aaa
      2 | zhangsan
      3 | wanger
      4 | mazi
    (4 rows)


    yueworld_db=# 

     



沪ICP备19023445号-2号
友情链接