@Vany 2016-03-26T15:48:47.000000Z  Words: 6146  Views: 1537

Notes on Building and Using a Hadoop Cluster on CentOS 7 (Part 2): Script Overview

Hadoop Shell Bash CentOS Linux


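This part mainly covers the ideas and principles behind the scripts. If you only want the concrete steps, skip straight to the next part (Part 3).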

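Basic Idea

Viewed abstractly, deploying a cluster comes down to just two kinds of operations: running a local script once for each host in a list (for example, to send it a file), and running a script remotely on every host.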

Basic Scripts

Following the two kinds of operations described in the Basic Idea section, I use two scripts to abstract them: applyall.sh and remoterunall.sh.

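Transferring Files: tf (transfer file)

Before introducing applyall and remoterunall properly, I will first introduce a basic building block, tf, which transfers a file. Transferring files is a very common need during deployment, so it is factored out as a separate "command".

A transfer only works if the target directory exists, so each transfer first runs mkdir xxx to make sure the directory is there, and then sends the file.

During the transfer you may well be asked for a password, or asked to type yes to save the remote host's fingerprint, so the expect program is used to handle the interaction. If it is not installed, install it first: yum install expect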
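Before running tf it can help to confirm that expect is actually on the PATH. The following is a minimal sketch; the helper name require_cmd is my own and is not part of the original scripts.

```shell
# require_cmd is a hypothetical helper, not one of the original scripts.
# It succeeds if the named command can be found on PATH.
require_cmd() {
    command -v "$1" > /dev/null 2>&1
}

if require_cmd expect; then
    echo "expect is installed"
else
    echo "expect is missing; install it with: yum install expect"
fi
```

The same check could be placed at the top of any script that calls tf, so a missing dependency is reported once instead of once per host.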

The implementation of tf is as follows:

    #!/usr/bin/expect
    # Usage: ./tf source dest destfile serverip pwd
    set source [lindex $argv 0]
    set dest [lindex $argv 1]
    set destfile [lindex $argv 2]
    set serverip [lindex $argv 3]
    set pwd [lindex $argv 4]
    # Large files can take longer than expect's default 10-second timeout
    set timeout -1
    # Make sure the target directory exists (-p: no error if it already does)
    spawn ssh root@$serverip "mkdir -p $dest"
    expect {
        "*continue connecting*" { send "yes\r"; exp_continue }
        "*password*" { send "$pwd\r"; exp_continue }
        eof {}
    }
    # Copy the file (or directory) to dest, named destfile
    spawn scp -r $source root@$serverip:$dest$destfile
    expect {
        "*continue connecting*" { send "yes\r"; exp_continue }
        "*password*" { send "$pwd\r"; exp_continue }
        eof {}
    }
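Note that tf builds the scp target by concatenating dest and destfile directly, so dest needs its trailing slash. Here is a small sketch of just that string handling; the function name tf_target is hypothetical and 192.168.1.101 is a made-up address.

```shell
# tf_target is a hypothetical helper that mirrors how tf forms the scp target:
# root@<serverip>:<dest><destfile>, with no separator inserted between the two.
tf_target() {
    dest=$1
    destfile=$2
    serverip=$3
    printf 'root@%s:%s%s\n' "$serverip" "$dest" "$destfile"
}

# With the trailing slash the file lands inside /etc/
tf_target /etc/ hosts 192.168.1.101
# Without it the two parts run together into a different path
tf_target /etc hosts 192.168.1.101
```

The first call prints root@192.168.1.101:/etc/hosts and the second prints root@192.168.1.101:/etchosts, which is why the dest argument is written as /etc/ when tf is called.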

Apply-All

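Let's start with applyall.sh. Call it with a hostlist (same format as the hosts file) and the bashfile to execute, and it runs that file once for each host in turn. For example, to send the local hosts file to every other machine, the usage is:
    ./applyall.sh hostlist deployhosts.sh mypassword
Here hostlist is the predefined list of hosts (see the end of the previous article), and deployhosts.sh looks like this (the script must read its arguments in the order ip, name, pwd):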

    #!/usr/bin/bash
    # Arguments are passed by applyall.sh in this order: ip, name, pwd
    hostip=$1
    hostname=$2
    pwd=$3
    echo -e "\n===Sending hosts to client $hostip, $hostname..."
    ./tf hosts /etc/ hosts "$hostip" "$pwd"

The tf here is the file-transfer tf described above.

The applyall.sh script is shown below. If the pwd argument is not passed, it asks the user to type the password.

    #!/usr/bin/bash
    if [ $# -lt 2 ]; then
        echo "Error call format"
        echo "Format: ./applyall.sh hostlist bashfile [pwd]"
        exit 1
    fi
    hostlist=$1
    bashfile=$2
    if [ $# -lt 3 ]; then
        # No password on the command line: prompt for it without echoing
        read -p "Enter the client password:" -s pwd
    else
        pwd=$3
    fi
    echo
    echo "===Enter Apply-All hostlist: $hostlist, bashfile: $bashfile"
    # Each hostlist line is "ip name", as in /etc/hosts
    while read -r hostip hostname rest; do
        echo -e "\n===Applying Client $hostip $hostname..."
        # stdin is /dev/null so the child cannot swallow the rest of hostlist
        bash "$bashfile" "$hostip" "$hostname" "$pwd" < /dev/null
    done < "$hostlist"
    echo -e "\n===Leave Apply-All."
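To see how the hostlist is walked without touching any machine, here is a dry-run sketch. The addresses and host names are made up, and for_each_host is a hypothetical helper that only prints what would be executed.

```shell
# for_each_host is a hypothetical dry-run helper: it reads "ip name" lines
# from a hostlist file and prints the command applyall.sh would run for each.
for_each_host() {
    hostlist=$1
    bashfile=$2
    while read -r hostip hostname rest; do
        # Skip blank lines so an empty line does not become an empty host
        [ -n "$hostip" ] || continue
        printf 'bash %s %s %s <pwd>\n' "$bashfile" "$hostip" "$hostname"
    done < "$hostlist"
}

# Example hostlist in hosts format (made-up addresses and names)
demo_list=$(mktemp)
printf '%s\n' '192.168.1.101 node1' '192.168.1.102 node2' > "$demo_list"
for_each_host "$demo_list" deployhosts.sh
rm -f "$demo_list"
```

Printing the commands first is a cheap way to check a new hostlist for typos before any password is sent anywhere.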

Remote-Run-All

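remoterunall.sh, as the name suggests, runs a script remotely. Suppose I want to update python on every machine: I only need to write a small script, updatepython2.sh, in the same directory:

    yum update python -y

Then run remoterunall.sh hostlist updatepython2.sh and that is it.

remoterunall.sh is implemented as follows: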

    #!/usr/bin/bash
    if [ $# -lt 2 ]; then
        echo "Error call format"
        echo "Format: ./remoterunall.sh hostlist bashfile"
        exit 1
    fi
    hostlist=$1
    bashfile=$2
    echo
    echo "===Enter Remote-RunAll hostlist: $hostlist, bashfile: $bashfile"
    # Each hostlist line is "ip name", as in /etc/hosts
    while read -r hostip hostname rest; do
        echo -e "\n===Run script in client $hostip $hostname..."
        # Feed the local script to a remote bash over ssh's stdin
        ssh "root@$hostip" 'bash -s' < "$bashfile"
    done < "$hostlist"
    echo -e "\n===Leave Remote-RunAll."

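This assumes that by the time remoterunall.sh is run, ssh keys have already been set up between all machines so they can be reached without a password, which is why no password is needed here.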
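Whether that assumption holds can be checked up front. The sketch below is my own addition rather than one of the original scripts: it relies on ssh's BatchMode option, which makes ssh fail instead of prompting, and the helper names are hypothetical.

```shell
# can_login_without_password is a hypothetical helper. BatchMode=yes makes ssh
# fail instead of prompting when key-based login is not set up.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true < /dev/null
}

# check_hostlist prints one status line per host in the given hostlist file
check_hostlist() {
    while read -r hostip hostname rest; do
        [ -n "$hostip" ] || continue
        if can_login_without_password "$hostip"; then
            printf '%s %s: ok\n' "$hostip" "$hostname"
        else
            printf '%s %s: needs a password or is unreachable\n' "$hostip" "$hostname"
        fi
    done < "$1"
}
```

Calling check_hostlist hostlist before remoterunall.sh would show which hosts still need their keys deployed. The block only defines the functions, so nothing is contacted until check_hostlist is called.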

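Deployment Scripts

The following scripts are mostly built on top of the "basic scripts" above. Only the three main ones are covered here:
1. init.sh initializes a cluster
2. configmaster.sh configures the master node (mainly the Ambari-related setup)
3. addnewhost.sh handles a new node joining the cluster: it configures the new machine and redistributes hosts and other files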

init.sh

Usage: ./init.sh hostlist
where hostlist is a file that stores host names and IPs in hosts format

    #!/usr/bin/bash
    if [ $# -lt 1 ]; then
        echo "Error: parameter is not enough..."
        echo "Format: ./init.sh hostlist"
        exit 1
    fi
    hostlist=$1
    read -p "Enter the client password(all should be same):" -s pwd
    # Generate the server ssh key unless one already exists
    filepath=~/.ssh/id_rsa
    if [ -f "$filepath" ]; then
        echo
        echo "ssh-key already exists. it won't be generated again."
        echo "if you want to generate new key, please remove the key file first."
    else
        echo
        echo "===ssh key generating..."
        ssh-keygen -f "$filepath" -P ""
    fi
    echo
    echo "===Clearing public key gathering folder..."
    # Create the folder on first run so the rm and cp below cannot fail
    mkdir -p ~/.ssh/pub
    rm -f ~/.ssh/pub/*
    echo
    echo "===Copy this public key to that folder..."
    cp ~/.ssh/id_rsa.pub ~/.ssh/pub/id_rsa.pub
    echo
    echo "===Generating hosts file...."
    cat ./others/hosts_template "$hostlist" > hosts
    # Deploying Clients...
    ./applyall.sh "$hostlist" others/deployclient.sh "$pwd"
    echo
    echo "===Gathering pub keys..."
    cat ~/.ssh/pub/*.pub > authorized_keys
    echo "===Deploying pub keys..."
    ./applyall.sh "$hostlist" others/deploypubkeys.sh "$pwd"
    echo
    echo "===Cleaning..."
    rm -f hosts
    rm -f authorized_keys
    echo "===Clean finished."
    echo
    echo "===Reboot all the machine..."
    while read -r hostip hostname rest; do
        # /dev/null keeps ssh from reading the remaining hostlist lines
        ssh "root@$hostip" 'reboot' < /dev/null
    done < "$hostlist"
    echo "===reboot finished."
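The < /dev/null on the ssh line in the reboot loop matters: a command inside a while read loop shares the loop's stdin, and ssh would otherwise read the remaining hostlist lines itself. The sketch below shows the effect using cat as a stand-in for ssh, an assumption made so that it runs without any remote machine. The function name and addresses are made up.

```shell
# count_hosts loops over "ip name" lines on stdin and reports how many
# iterations ran. The stand-in command `cat` reads stdin just as ssh would;
# $1 selects whether its stdin is redirected away from the loop input.
count_hosts() {
    n=0
    while read -r hostip hostname rest; do
        n=$((n + 1))
        if [ "$1" = "shielded" ]; then
            cat < /dev/null > /dev/null   # stdin redirected: loop input untouched
        else
            cat > /dev/null               # swallows the remaining lines
        fi
    done
    echo "$n"
}

demo_list=$(mktemp)
printf '%s\n' '192.168.1.101 node1' '192.168.1.102 node2' '192.168.1.103 node3' > "$demo_list"
count_hosts shielded < "$demo_list"
count_hosts unshielded < "$demo_list"
rm -f "$demo_list"
```

The shielded run prints 3 and the unshielded run prints 1: without the redirect only the first host would ever be rebooted.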

configmaster.sh

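Run configmaster.sh and enter the master's IP and its password. It then automatically configures the Ambari repo address, installs Ambari, and copies the existing JDK to the master node (used when configuring Ambari).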

    #!/usr/bin/bash
    read -p "Enter your master ip:" masterip
    read -p "Enter your password:" -s pwd
    echo
    # Confirm the target without printing the password
    echo "Master: $masterip"
    ./tf others/ambari.repo /etc/yum.repos.d/ ambari.repo "$masterip" "$pwd"
    ssh "root@$masterip" 'bash -s' < others/installmaster.sh
    # tf's fifth argument is the password
    ./tf ../jdk/jdk-8u60-linux-x64.tar.gz /var/lib/ambari-server/resources/ jdk-8u60-linux-x64.tar.gz "$masterip" "$pwd"
    ./tf ../jdk/jce_policy-8.zip /var/lib/ambari-server/resources/ jce_policy-8.zip "$masterip" "$pwd"

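This calls installmaster.sh, which simply runs three commands (that is, it installs Ambari):

    yum clean all
    yum repolist
    yum install ambari-server -y

Note that running this script only installs Ambari. You still need to run the following commands on the master node to configure and start Ambari:

    ambari-server setup
    ambari-server start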

addnewhost.sh

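Usage is similar to init.sh, but note that you pass in two files: a new hostlist and the old hostlist. Note: the program does not merge the two hostlists permanently (this guards against a failed run), so you still need to rebuild the hostlist yourself afterwards.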

    #!/usr/bin/bash
    if [ $# -lt 2 ]; then
        echo "Error: parameter is not enough..."
        echo "Format: ./addnewhost.sh newhostlist orighostlist"
        exit 1
    fi
    newhostlist=$1
    orighostlist=$2
    echo
    echo "Enter addnew script: $newhostlist $orighostlist"
    read -p "Enter the client password(all should be same):" -s pwd
    echo
    echo "===Generating new hosts file...."
    # Temporary merged list: original hosts first, then the new ones
    updatedhostlist="updatedhostlist"
    cat "$orighostlist" "$newhostlist" > "$updatedhostlist"
    cat ./others/hosts_template "$updatedhostlist" > hosts
    # Deploying new clients...
    ./applyall.sh "$newhostlist" others/deployclient.sh "$pwd"
    echo
    echo "===reDeploying hosts file..."
    ./applyall.sh "$updatedhostlist" others/deployhosts.sh "$pwd"
    echo -e "\n===gathering pub keys..."
    cat ~/.ssh/pub/*.pub > authorized_keys
    echo -e "\n===reDeploying pub keys for all clients..."
    ./applyall.sh "$updatedhostlist" others/deploypubkeys.sh "$pwd"
    echo
    echo "===Cleaning..."
    rm -f hosts
    rm -f "$updatedhostlist"
    rm -f authorized_keys
    echo "===Clean finished."
    echo
    echo "===reboot all the new machine..."
    while read -r hostip hostname rest; do
        # /dev/null keeps ssh from reading the remaining hostlist lines
        ssh "root@$hostip" 'reboot' < /dev/null
    done < "$newhostlist"
    echo "===reboot finished."
    echo
    echo "Leave addnew script."
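The hosts-file generation step above is plain concatenation, so it is easy to try in a scratch directory. The sketch below makes several assumptions of its own: the addresses are made up, build_hosts and duplicate_ips are hypothetical helpers, and the one-line template is only a stand-in for others/hosts_template, whose real content is not shown in this article.

```shell
# build_hosts is a hypothetical helper mirroring the script's cat calls:
# template first, then the original hostlist, then the new hostlist.
build_hosts() {
    template=$1
    orighostlist=$2
    newhostlist=$3
    cat "$template" "$orighostlist" "$newhostlist"
}

# duplicate_ips prints any IP (first field) that appears more than once on stdin
duplicate_ips() {
    awk 'NF { seen[$1]++ } END { for (ip in seen) if (seen[ip] > 1) print ip }'
}

workdir=$(mktemp -d)
printf '127.0.0.1 localhost\n' > "$workdir/hosts_template"   # stand-in content
printf '192.168.1.101 node1\n192.168.1.102 node2\n' > "$workdir/hostlist"
printf '192.168.1.103 node3\n' > "$workdir/hostlist_new"
build_hosts "$workdir/hosts_template" "$workdir/hostlist" "$workdir/hostlist_new"
rm -rf "$workdir"
```

Piping the output of build_hosts through duplicate_ips is one way to catch a host that was accidentally listed in both the old and the new hostlist before anything is deployed.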

Directory Structure

The remaining scripts are not listed one by one here. The whole script directory looks like this:

    [root@yumsource script]# tree
    .
    ├── addnewhost.sh
    ├── applyall.sh
    ├── configmaster.sh
    ├── hostlist
    ├── hostlist_new
    ├── init.sh
    ├── others
    │   ├── ambari.repo
    │   ├── configclient_template.sh
    │   ├── deployclient.sh
    │   ├── deploypubkeys.sh
    │   ├── deploypythons.sh
    │   ├── disSELinuxCfg
    │   ├── hosts_template
    │   ├── installmaster.sh
    │   ├── installpython34.sh
    │   └── updatepython2.sh
    ├── remoterunall.sh
    └── tf