
[Original] Simulating a Hadoop 2.2.0 Production Environment

Posted by book_mmicky on 2014-05-13 15:22:48
1: Planning
Build a Hadoop 2.2.0 environment on CentOS 6.4; the Java version is JDK 7u21.
    192.168.100.201 product201.product (namenode)
    192.168.100.202 product202.product (datanode)
    192.168.100.203 product203.product (datanode)
    192.168.100.204 product204.product (datanode)
    192.168.100.200 productserver.product (DNS、NFS)
Guiding principles:
A: The Hadoop 2.2.0 deployment files are shared at productserver.product:/share/hadoop; each node copies them via script during installation.
B: The SSH public keys are shared by all nodes: each node appends its own public key to the shared authorized_keys file in productserver.product:/share/.ssh, which the nodes mount at /mnt/.ssh.
C: The Hadoop configuration files are shared by all nodes: they live in productserver.product:/share/.ssh, and each node references them through symlinks into the /mnt/.ssh mount.
D: Each node is set up with installation scripts that mount the shared directories, register the node in the slaves file, copy the Hadoop deployment files, create the symlinks, and so on.


2: Create the template virtual machine (VMware or VirtualBox both work)
A: Install the CentOS 6.4 virtual machine product201.product, enable the sshd service, and disable iptables:
[root@product201 ~]# chkconfig sshd on
[root@product201 ~]# chkconfig iptables off
[root@product201 ~]# chkconfig ip6tables off
[root@product201 ~]# chkconfig postfix off

B: Edit /etc/sysconfig/selinux:
    SELINUX=disabled
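The setting only takes effect after a reboot; to turn enforcement off in the running system as well, you can additionally run:
[root@product201 ~]# setenforce 0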

C: Edit the SSH configuration /etc/ssh/sshd_config and uncomment the following lines:
    RSAAuthentication yes
    PubkeyAuthentication yes
    AuthorizedKeysFile .ssh/authorized_keys
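After changing sshd_config, restart sshd so the settings take effect:
[root@product201 ~]# service sshd restart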

D: Install Java, then append the following to the end of the environment file /etc/profile:
    export JAVA_HOME=/usr/java/jdk1.7.0_21
    export JRE_HOME=/usr/java/jdk1.7.0_21/jre
    export HADOOP_PREFIX=/app/hadoop/hadoop220
    export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
    export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
    export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
    export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin:$PATH
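A quick way to verify the new environment (the java -version output depends on the JDK you actually installed):
[root@product201 ~]# source /etc/profile
[root@product201 ~]# java -version
[root@product201 ~]# echo $HADOOP_PREFIX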

E: Add the hadoop group and the hadoop user, and set the hadoop user's password, for example with the commands below.
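A minimal sketch (adjust UID/GID and password policy to your own conventions):
[root@product201 ~]# groupadd hadoop
[root@product201 ~]# useradd -g hadoop hadoop
[root@product201 ~]# passwd hadoop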

3: Create the virtual machines
A: Shut down the template VM and clone it into product202.product, product203.product, product204.product, and productserver.product.
In each clone, update the relevant entries in the following files so that the network configuration and hostname match the VM (a sketch for product202 follows the list):
    /etc/udev/rules.d/70-persistent-net.rules
    /etc/sysconfig/network
    /etc/sysconfig/network-scripts/ifcfg-eth0
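For example, for product202.product the edits might look roughly like this (the MAC address is a placeholder; copy the clone's actual MAC from 70-persistent-net.rules):
/etc/sysconfig/network:
NETWORKING=yes
HOSTNAME=product202.product
/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.100.202
NETMASK=255.255.255.0
HWADDR=08:00:27:XX:XX:XX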

B: Start product201.product, product202.product, product203.product, product204.product, and productserver.product, and make sure they can all ping each other.

4: Server installation (provides the DNS and NFS services)
A: Install DNS
[root@productserver ~]# yum install bind-libs bind bind-utils
     **************************************************************************
To harden the setup, you can confine bind's root to a specific directory (a chroot) by installing bind-chroot. Note that you must then add "$AddUnixListenSocket /var/named/chroot/dev/log" to /etc/rsyslog.conf, otherwise the rsyslog daemon cannot record bind's log messages.
[root@productserver ~]# yum install bind-libs bind bind-utils bind-chroot
[root@productserver ~]# vi /etc/rsyslog.conf
    $AddUnixListenSocket /var/named/chroot/dev/log
    **************************************************************************

B: Configure /etc/named.conf and /etc/named.rfc1912.zones
    [root@productserver ~]# vi /etc/named.conf 
    [root@productserver ~]# cat /etc/named.conf

    // Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
    // server as a caching only nameserver (as a localhost DNS resolver only).
    //
    // See /usr/share/doc/bind*/sample/ for example named configuration files.
    //

options {
        listen-on port 53 { any; };
        listen-on-v6 port 53 { ::1; };
        directory "/var/named";
        dump-file "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query { any; };
        recursion yes;
        forwarders { 202.101.172.35; };

        dnssec-enable yes;
        dnssec-validation yes;
        dnssec-lookaside auto;

        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";
};

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

zone "." IN {
        type hint;
        file "named.ca";
};

    include "/etc/named.rfc1912.zones";
    include "/etc/named.root.key";


    [root@productserver ~]# vi /etc/named.rfc1912.zones
    [root@productserver ~]# cat /etc/named.rfc1912.zones

    // named.rfc1912.zones:
    //
    // Provided by Red Hat caching-nameserver package
    //
    // ISC BIND named zone configuration for zones recommended by
    // RFC 1912 section 4.1 : localhost TLDs and address zones
    // and http://www.ietf.org/internet-drafts/draft-ietf-dnsop-default-local-zones-02.txt
    // (c)2007 R W Franks
    //
    // See /usr/share/doc/bind*/sample/ for example named configuration files.
    //

    zone "localhost.localdomain" IN {
    type master;
    file "named.localhost";
    allow-update { none; };
    };

    zone "localhost" IN {
    type master;
    file "named.localhost";
    allow-update { none; };
    };

//comment out the following lines
    //zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa" IN {
    // type master;
    // file "named.loopback";
    // allow-update { none; };
    //};

    //zone "1.0.0.127.in-addr.arpa" IN {
    // type master;
    // file "named.loopback";
    // allow-update { none; };
    //};

    zone "0.in-addr.arpa" IN {
    type master;
    file "named.empty";
    allow-update { none; };
    };

    zone "product" IN {
    type master;
    file "product.zone";
    };

    zone "100.168.192.in-addr.arpa" IN {
    type master;
    file "100.168.192.zone";
    };



C: Configure the forward and reverse zone files
The zone files are easy to get wrong; they can be checked with "named-checkzone <zone name> <zone file>" (an example follows the two files below).
    [root@productserver ~]# vi /var/named/product.zone
    [root@productserver ~]# cat /var/named/product.zone

$TTL 86400
@               IN SOA  productserver.product. root.productserver.product. (
                        2013122801      ; serial (d. adams)
                        3H              ; refresh
                        15M             ; retry
                        1W              ; expiry
                        1D )            ; minimum
@               IN NS   productserver.product.
productserver   IN A    192.168.100.200

; forward records
product201      IN A    192.168.100.201
product202      IN A    192.168.100.202
product203      IN A    192.168.100.203
product204      IN A    192.168.100.204
product211      IN A    192.168.100.211
product212      IN A    192.168.100.212
product213      IN A    192.168.100.213
product214      IN A    192.168.100.214

    [root@productserver ~]# vi /var/named/100.168.192.zone
    [root@productserver ~]# cat /var/named/100.168.192.zone 

$TTL 86400
@       IN SOA  productserver.product. root.productserver.product. (
                2013122801      ; serial (d. adams)
                3H              ; refresh
                15M             ; retry
                1W              ; expiry
                1D )            ; minimum
        IN NS   productserver.product.
200     IN PTR  productserver.product.

; reverse records
201     IN PTR  product201.product.
202     IN PTR  product202.product.
203     IN PTR  product203.product.
204     IN PTR  product204.product.
211     IN PTR  product211.product.
212     IN PTR  product212.product.
213     IN PTR  product213.product.
214     IN PTR  product214.product.
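With both zone files in place, they can be checked as suggested above, for example:
[root@productserver ~]# named-checkzone product /var/named/product.zone
[root@productserver ~]# named-checkzone 100.168.192.in-addr.arpa /var/named/100.168.192.zone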


D: Enable the DNS service and start it
    [root@productserver ~]# chkconfig named on
    [root@productserver ~]# /etc/init.d/named restart
    **************************************************************************
If starting named stops with an error at
Generating /etc/rndc.key:
the fix is to generate the key manually:
[root@productserver ~]# rndc-confgen -r /dev/urandom -a
and then start named again.
    **************************************************************************
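Once named is running, it is worth confirming forward and reverse resolution against the new server, for example:
[root@productserver ~]# nslookup product201.product 192.168.100.200
[root@productserver ~]# nslookup 192.168.100.201 192.168.100.200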

E: Install NFS
    **************************************************************************
Use the following commands to check whether the packages required by NFS are already installed:
    [root@productserver ~]# rpm -qa |grep rpcbind
    [root@productserver ~]# rpm -qa |grep nfs
If they are missing, install them with:
    [root@productserver ~]# yum install nfs-utils
    **************************************************************************
After installation, enable the NFS services and start them:
    [root@productserver ~]# chkconfig rpcbind on
    [root@productserver ~]# chkconfig nfs on
    [root@productserver ~]# chkconfig nfslock on
    [root@productserver ~]# service rpcbind restart
    [root@productserver ~]# service nfs restart
    [root@productserver ~]# service nfslock restart

Extract the Hadoop installation files to /app/hadoop/hadoop220, give ownership of the whole /app/hadoop directory to hadoop:hadoop, and create a mydata directory under /app/hadoop/hadoop220 for data and a logs directory for log files.
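A sketch of these steps, assuming the distribution tarball hadoop-2.2.0.tar.gz is sitting in /root:
[root@productserver ~]# mkdir -p /app/hadoop
[root@productserver ~]# tar -zxf /root/hadoop-2.2.0.tar.gz -C /app/hadoop
[root@productserver ~]# mv /app/hadoop/hadoop-2.2.0 /app/hadoop/hadoop220
[root@productserver ~]# mkdir -p /app/hadoop/hadoop220/mydata /app/hadoop/hadoop220/logs
[root@productserver ~]# chown -R hadoop:hadoop /app/hadoop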
Create the shared directories /share/hadoop (holds the Hadoop files; read-only for user hadoop) and /share/.ssh (holds the public keys of all nodes; read-write for user hadoop):
    [root@productserver ~]# mkdir -p /share/hadoop
    [root@productserver ~]# cp -r /app/hadoop/hadoop220 /share/hadoop/
    [root@productserver ~]# chown -R hadoop:hadoop /share/hadoop
    [root@productserver ~]# setfacl -m u:hadoop:rwx /share/hadoop
    [root@productserver ~]# mkdir -p /share/.ssh
    [root@productserver ~]# chmod 700 /share/.ssh
    [root@productserver ~]# chown -R hadoop:hadoop /share/.ssh
    [root@productserver ~]# setfacl -m u:hadoop:rwx /share/.ssh

Configure the exports file for the shared directories:
    [root@productserver ~]# vi /etc/exports
    [root@productserver ~]# cat /etc/exports 
    /share/hadoop 192.168.100.0/24(ro)
    /share/.ssh 192.168.100.0/24(rw)

Apply the settings in /etc/exports:
    [root@productserver ~]# exportfs -arv
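The exports can then be confirmed with:
[root@productserver ~]# showmount -e localhost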

5: Hadoop node installation and configuration
A: DNS client configuration
The order in which a client resolves a hostname to an IP address: it first consults /etc/nsswitch.conf; if "files" comes first, the name is resolved via /etc/hosts; if "dns" comes first, it is resolved via the name server configured in /etc/resolv.conf. In this environment resolution is done by DNS, so both /etc/nsswitch.conf and /etc/resolv.conf have to be configured. Because the NetworkManager service on CentOS 6.x sometimes produces rather odd behavior (see 《鸟哥的Linux私房菜-服务器架设篇(第三版)》, p. 597), the NetworkManager service is disabled at the same time.
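After the change the relevant lines look like this (the hadoop_root.sh script in step E below makes exactly these edits):
/etc/nsswitch.conf:  hosts: dns files
/etc/resolv.conf:    nameserver 192.168.100.200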

B: NFS client configuration
There are two ways to mount the NFS shares automatically:
add the mount commands to /etc/rc.d/rc.local, or
configure the autofs service's /etc/auto.master file and the mount map it points to, then start the autofs service.
Because Hadoop needs the public keys in productserver:/share/.ssh when the cluster starts, the shares must be mounted automatically whenever a node reboots; this setup uses the first method and disables the autofs service.

C: Generate the SSH public keys and register the node names

D: Copy the Hadoop installation files

E: Start each virtual machine (product201, product202, product203, product204) and create two scripts, hadoop_root.sh and hadoop_hadoop.sh, run by the root and hadoop users respectively; together they carry out steps A through D above.
The hadoop_root.sh script, run by root:
    [root@product201 ~]# vi /app/hadoop_root.sh

#!/bin/bash
echo "############ DNS client configuration ###############"
    sed -i 's/hosts: files dns/hosts: dns files/g' `find /etc/ -name nsswitch.conf`
    echo "nameserver 192.168.100.200" >>/etc/resolv.conf
    chkconfig NetworkManager off
    chkconfig autofs off

    echo "############NFS客户端配置###############"
    chkconfig rpcbind on
    chkconfig nfslock on
    service rpcbind restart
    service nfslock restart
    mkdir -p /mnt/hadoop
    mkdir -p /mnt/.ssh
    mount -t nfs productserver:/share/hadoop /mnt/hadoop
    mount -t nfs productserver:/share/.ssh /mnt/.ssh
    echo "mount -t nfs productserver:/share/hadoop /mnt/hadoop">>/etc/rc.d/rc.local
    echo "mount -t nfs productserver:/share/.ssh /mnt/.ssh">>/etc/rc.d/rc.local
    mkdir -p /app/hadoop/
    chown -R hadoop:hadoop /app/hadoop

    echo "----------END-----------------"


The hadoop_hadoop.sh script, run by the hadoop user:
    [root@product201 ~]# vi /app/hadoop_hadoop.sh

#!/bin/bash
echo "############ generate SSH public key ###############"
# make sure ~/.ssh exists before generating the key; note that the non-empty
# passphrase (123456) will be prompted for on every ssh unless ssh-agent is
# used -- pass -N "" instead for a fully passwordless setup
mkdir -p /home/hadoop/.ssh && chmod 700 /home/hadoop/.ssh
ssh-keygen -t rsa -N 123456 -f /home/hadoop/.ssh/id_rsa
    cat /home/hadoop/.ssh/id_rsa.pub>>/mnt/.ssh/authorized_keys
    ln -sf /mnt/.ssh/authorized_keys /home/hadoop/.ssh/authorized_keys
    hostname>>/mnt/.ssh/slaves

    echo "############复制hadoop文件###############"
    cp -r /mnt/hadoop/hadoop220 /app/hadoop/

    echo "############链接hadoop配置文件###############"
    rm -rf /app/hadoop/hadoop220/etc/hadoop/*
    ln -sf /mnt/.ssh/capacity-scheduler.xml /app/hadoop/hadoop220/etc/hadoop/capacity-scheduler.xml
    ln -sf /mnt/.ssh/configuration.xsl /app/hadoop/hadoop220/etc/hadoop/configuration.xsl
    ln -sf /mnt/.ssh/container-executor.cfg /app/hadoop/hadoop220/etc/hadoop/container-executor.cfg
    ln -sf /mnt/.ssh/core-site.xml /app/hadoop/hadoop220/etc/hadoop/core-site.xml
    ln -sf /mnt/.ssh/hadoop-env.cmd /app/hadoop/hadoop220/etc/hadoop/hadoop-env.cmd
    ln -sf /mnt/.ssh/hadoop-env.sh /app/hadoop/hadoop220/etc/hadoop/hadoop-env.sh
    ln -sf /mnt/.ssh/hadoop-metrics2.properties /app/hadoop/hadoop220/etc/hadoop/hadoop-metrics2.properties
    ln -sf /mnt/.ssh/hadoop-metrics.properties /app/hadoop/hadoop220/etc/hadoop/hadoop-metrics.properties
    ln -sf /mnt/.ssh/hadoop-policy.xml /app/hadoop/hadoop220/etc/hadoop/hadoop-policy.xml
    ln -sf /mnt/.ssh/hdfs-site.xml /app/hadoop/hadoop220/etc/hadoop/hdfs-site.xml
    ln -sf /mnt/.ssh/httpfs-env.sh /app/hadoop/hadoop220/etc/hadoop/httpfs-env.sh
    ln -sf /mnt/.ssh/httpfs-log4j.properties /app/hadoop/hadoop220/etc/hadoop/httpfs-log4j.properties
    ln -sf /mnt/.ssh/httpfs-signature.secret /app/hadoop/hadoop220/etc/hadoop/httpfs-signature.secret
    ln -sf /mnt/.ssh/httpfs-site.xml /app/hadoop/hadoop220/etc/hadoop/httpfs-site.xml
    ln -sf /mnt/.ssh/log4j.properties /app/hadoop/hadoop220/etc/hadoop/log4j.properties
    ln -sf /mnt/.ssh/mapred-env.cmd /app/hadoop/hadoop220/etc/hadoop/mapred-env.cmd
    ln -sf /mnt/.ssh/mapred-env.sh /app/hadoop/hadoop220/etc/hadoop/mapred-env.sh
    ln -sf /mnt/.ssh/mapred-queues.xml.template /app/hadoop/hadoop220/etc/hadoop/mapred-queues.xml.template
    ln -sf /mnt/.ssh/mapred-site.xml /app/hadoop/hadoop220/etc/hadoop/mapred-site.xml
    ln -sf /mnt/.ssh/mapred-site.xml.template /app/hadoop/hadoop220/etc/hadoop/mapred-site.xml.template
    ln -sf /mnt/.ssh/masters /app/hadoop/hadoop220/etc/hadoop/masters
    ln -sf /mnt/.ssh/slaves /app/hadoop/hadoop220/etc/hadoop/slaves
    ln -sf /mnt/.ssh/ssl-client.xml.example /app/hadoop/hadoop220/etc/hadoop/ssl-client.xml.example
    ln -sf /mnt/.ssh/ssl-server.xml.example /app/hadoop/hadoop220/etc/hadoop/ssl-server.xml.example
    ln -sf /mnt/.ssh/yarn-env.cmd /app/hadoop/hadoop220/etc/hadoop/yarn-env.cmd
    ln -sf /mnt/.ssh/yarn-env.sh /app/hadoop/hadoop220/etc/hadoop/yarn-env.sh
    ln -sf /mnt/.ssh/yarn-site.xml /app/hadoop/hadoop220/etc/hadoop/yarn-site.xml
    echo "----------END-----------------"



    [root@product201 ~]# chmod 777 /app/*.sh
    [root@product201 ~]# /app/hadoop_root.sh
    [root@product201 ~]# su - hadoop
    [hadoop@product201 ~]$ /app/hadoop_hadoop.sh
6: Start Hadoop
A: Because every node registered its hostname in productserver:/share/.ssh/slaves, the namenode's hostname has to be removed from that file. In this experiment product201.product is the namenode.
B: Change the permissions of productserver:/share/.ssh/ to 700 and of productserver:/share/.ssh/authorized_keys to 600, otherwise passwordless SSH will not work.
C: From product201.product, ssh once to each node so that the first-connection prompts are answered and the host keys are saved.
D: Format the namenode.
E: Start Hadoop (a sketch of D and E follows).
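A sketch of steps D and E, run as the hadoop user on product201.product (the HDFS web UI should then answer on port 50070 and the YARN web UI on port 8088):
[hadoop@product201 ~]$ hdfs namenode -format
[hadoop@product201 ~]$ start-dfs.sh
[hadoop@product201 ~]$ start-yarn.sh
[hadoop@product201 ~]$ jps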
    7:Tips
A: The commands in hadoop_hadoop.sh that link the Hadoop configuration files can be generated with awk, for example:
    cat ccc | awk '{ print "ln -sf " $10 }'
    cat ccc | awk '{ print "ln -sf /mnt/.ssh/" $10 " /app/hadoop/hadoop220/etc/hadoop/" $10 }'
B: When using DNS it is best for each node's hostname to be of the form XXX.YYY; with only a bare machine name, the resolution configuration becomes awkward.

