@nalan90
2017-08-24T10:43:07.000000Z
Big Data
Source: http://www.powerxing.com/install-hadoop-cluster/
Previous part: https://www.zybuluo.com/nalan90/note/854642
Environment Preparation
Configuration (master and slave nodes)
## /etc/hosts
[hadoop@dev-162 hadoop]$ cat /etc/hosts
172.16.1.162 master
172.16.1.163 slave1
----------
## Environment variables
[hadoop@dev-162 ~]$ cat .bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$PATH:/usr/local/hadoop/sbin:/usr/local/hadoop/bin
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
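After editing .bashrc, reload it so the new variables take effect in the current shell (a quick sanity check, assuming Hadoop is unpacked at /usr/local/hadoop):
[hadoop@dev-162 ~]$ source ~/.bashrc
[hadoop@dev-162 ~]$ which hadoop
/usr/local/hadoop/bin/hadoop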
----------
## Hadoop configuration files
For cluster/distributed mode, five configuration files under /usr/local/hadoop/etc/hadoop need to be modified; see the official documentation for the full list of options. Only the settings required for a normal start-up are covered here: slaves, core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml.
1. File slaves: write the hostname of every machine that will act as a DataNode into this file, one per line. It defaults to localhost, which is why in the pseudo-distributed setup the single node acts as both NameNode and DataNode. In a distributed setup you can keep localhost or delete it; deleting it means the master node serves only as the NameNode.
[hadoop@dev-162 hadoop]$ cat slaves
slave1
----------
2. Change core-site.xml to the following configuration:
[hadoop@dev-162 hadoop]$ cat core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>
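Optionally, confirm that the client picks up the new NameNode address with hdfs getconf (output assumes the file above is in effect):
[hadoop@dev-162 hadoop]$ hdfs getconf -confKey fs.defaultFS
hdfs://master:9000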
----------
3. File hdfs-site.xml: dfs.replication is normally set to 3, but since there is only one slave node here, dfs.replication stays at 1:
[hadoop@dev-162 hadoop]$ cat hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
----------
4. File mapred-site.xml (it may need to be renamed first; the default file name is mapred-site.xml.template), then change the configuration as follows:
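If only the template is present, copy it to the expected name first:
[hadoop@dev-162 hadoop]$ cp mapred-site.xml.template mapred-site.xml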
[hadoop@dev-162 hadoop]$ cat mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
----------
5. File yarn-site.xml:
[hadoop@dev-162 hadoop]$ cat yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
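slave1 needs the same configuration files. One simple way, assuming Hadoop is already unpacked at /usr/local/hadoop on the slave and owned by the hadoop user, is to copy the whole etc/hadoop directory from the master (a sketch; rsync or a packed tarball works just as well):
[hadoop@dev-162 hadoop]$ scp -r /usr/local/hadoop/etc/hadoop hadoop@slave1:/usr/local/hadoop/etc/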
Configure passwordless SSH from master to the slave
## Copy the contents of /home/hadoop/.ssh/id_rsa.pub on the master node into /home/hadoop/.ssh/authorized_keys on the slave node
## /home/hadoop/.ssh must be mode 700
[hadoop@dev-163 ~]$ ls -ld .ssh
drwx------ 2 hadoop hadoop 29 Aug 23 15:58 .ssh
## /home/hadoop/.ssh/authorized_keys must be mode 600
[hadoop@dev-163 ~]$ ls -l .ssh/authorized_keys
-rw------- 1 hadoop hadoop 396 Aug 23 15:58 .ssh/authorized_keys
[hadoop@dev-163 .ssh]$ pwd
/home/hadoop/.ssh
[hadoop@dev-163 .ssh]$ ls
authorized_keys
[hadoop@dev-163 .ssh]$ cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDIOOEMRSgX3OothfEzneBnoZqfIlD3a5oaDzRmqKDISFx1sXWTBAtKCKRocq4pWU7DKN82hwcskWFlPnxpz2zP42gohPPpz8SuXXMsDsSKbkVpHduaPG9QvKFJRqtPNNnZQ4A5jZ02lZCcvZ3FDzdpyFTecyRejqdS0Q2EfVswQ7Xc/MySrk2/c7DaC/Xrz1oxu/wsHf45vDj0NiXAadufyIGN0SIxJbW50IB3eAKABQuwNU5CQRkTAcJf59xGixarRo4gqtCFAdtdyHoP/RIYgC1dWafA5TIFGbHuwfFEWluJQJPwpQ1w5mIJkRoPgwWVLI2bscghSzEVIGrRBuZZ hadoop@dev-162
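The key pair and the copy can be produced with the standard OpenSSH tools (a sketch, assuming no key pair exists yet for the hadoop user on master; ssh-copy-id also sets the permissions noted above):
## on master
[hadoop@dev-162 ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[hadoop@dev-162 ~]$ ssh-copy-id hadoop@slave1
## verify: this should log in to slave1 without prompting for a password
[hadoop@dev-162 ~]$ ssh hadoop@slave1 hostname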
Start Hadoop (run on the master)
[hadoop@dev-162 hadoop]$ hdfs namenode -format
[hadoop@dev-162 hadoop]$ start-dfs.sh
[hadoop@dev-162 hadoop]$ start-yarn.sh
[hadoop@dev-162 hadoop]$ mr-jobhistory-daemon.sh start historyserver
## master
[hadoop@dev-162 hadoop]$ jps
3872 NameNode
4210 ResourceManager
5046 Jps
4488 JobHistoryServer
4059 SecondaryNameNode
## slave1
[hadoop@dev-163 .ssh]$ jps
3797 NodeManager
5813 Jps
3690 DataNode
## Check whether the DataNodes came up; if Live datanodes is not 0, the cluster started successfully
[hadoop@dev-162 hadoop]$ hdfs dfsadmin -report
Live datanodes (1):
Name: 172.16.1.163:50010 (slave1)
Hostname: slave1
......
Last contact: Wed Aug 23 16:19:15 CST 2017
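Besides jps and dfsadmin, the daemons serve web UIs on the standard Hadoop 2.x ports (assuming the defaults are untouched), which is often the quickest way to check the live DataNodes:
## NameNode web UI
http://master:50070
## ResourceManager web UI
http://master:8088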
[hadoop@dev-162 hadoop]$ hdfs dfs -mkdir -p /user/hadoop
[hadoop@dev-162 hadoop]$ hdfs dfs -mkdir input
[hadoop@dev-162 hadoop]$ hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input
[hadoop@dev-162 hadoop]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar grep input output 'dfs[a-z.]+'
[hadoop@dev-162 hadoop]$ hdfs dfs -cat output/*
1 dfsadmin
1 dfs.replication
1 dfs.namenode.secondary.http
1 dfs.namenode.name.dir
1 dfs.datanode.data.dir
[hadoop@dev-162 hadoop]$ hdfs dfs -ls output
Found 2 items
-rw-r--r-- 1 hadoop supergroup 0 2017-08-23 16:23 output/_SUCCESS
-rw-r--r-- 1 hadoop supergroup 107 2017-08-23 16:23 output/part-r-00000
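If the example is run a second time, HDFS will refuse to overwrite the existing output directory, so remove it first:
[hadoop@dev-162 hadoop]$ hdfs dfs -rm -r output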
Stop Hadoop (run on the master)
[hadoop@dev-162 hadoop]$ stop-yarn.sh
[hadoop@dev-162 hadoop]$ stop-dfs.sh
[hadoop@dev-162 hadoop]$ mr-jobhistory-daemon.sh stop historyserver