@nalan90 2017-08-24T10:43:07.000000Z

Hadoop Cluster Installation and Configuration

Big Data


Adapted from: http://www.powerxing.com/install-hadoop-cluster/

Previous part: https://www.zybuluo.com/nalan90/note/854642

Environment Preparation

Configuration (master and slave nodes)

  ## /etc/hosts
  [hadoop@dev-162 hadoop]$ cat /etc/hosts
  172.16.1.162 master
  172.16.1.163 slave1
  ----------
  ## Environment variables
  [hadoop@dev-162 ~]$ cat .bashrc
  export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
  export PATH=$PATH:/usr/local/hadoop/sbin:/usr/local/hadoop/bin
  export HADOOP_HOME=/usr/local/hadoop
  export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
  ----------
  ## Hadoop configuration files
  Cluster/distributed mode requires editing 5 configuration files in /usr/local/hadoop/etc/hadoop (see the official documentation for additional options; only the settings needed for a normal startup are configured here): slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml. After editing them on master, the same configuration must also reach the slave nodes; see the sync sketch after the yarn-site.xml listing.
  1. File slaves: write the hostnames of the DataNode machines into this file, one per line. It defaults to localhost, which is why in the pseudo-distributed setup the single node acts as both NameNode and DataNode. For a distributed setup you may keep localhost or remove it so that the master node serves only as the NameNode.
  [hadoop@dev-162 hadoop]$ cat slaves
  slave1
  ----------
  2. File core-site.xml: change it to the following configuration:
  [hadoop@dev-162 hadoop]$ cat core-site.xml
  <configuration>
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://master:9000</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>file:/usr/local/hadoop/tmp</value>
      <description>A base for other temporary directories.</description>
    </property>
  </configuration>
  ----------
  3. File hdfs-site.xml: dfs.replication is normally set to 3, but since there is only one slave node here, dfs.replication stays at 1:
  [hadoop@dev-162 hadoop]$ cat hdfs-site.xml
  <configuration>
    <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>master:50090</value>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
  </configuration>
  ----------
  4. File mapred-site.xml (it may need to be renamed first; the default file name is mapred-site.xml.template), then modify the configuration as follows:
  [hadoop@dev-162 hadoop]$ cat mapred-site.xml
  <configuration>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>master:10020</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>master:19888</value>
    </property>
  </configuration>
  ----------
  5. File yarn-site.xml:
  [hadoop@dev-162 hadoop]$ cat yarn-site.xml
  <configuration>
    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>master</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
  </configuration>
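
With all five files edited on master, the slave node needs the same configuration. A minimal sketch of one way to push it out, assuming Hadoop is already unpacked at /usr/local/hadoop on slave1 (first variant) or not yet present there (second variant), and that the hadoop user can reach slave1 over SSH (see the passwordless login setup below):

  ## Copy only the edited configuration directory from master to slave1
  [hadoop@dev-162 ~]$ scp -r /usr/local/hadoop/etc/hadoop/* hadoop@slave1:/usr/local/hadoop/etc/hadoop/
  ## Or pack the whole installation on master and unpack it on the slave
  [hadoop@dev-162 ~]$ tar -czf hadoop.master.tar.gz -C /usr/local hadoop
  [hadoop@dev-162 ~]$ scp hadoop.master.tar.gz hadoop@slave1:/home/hadoop/
  [hadoop@dev-163 ~]$ sudo tar -xzf /home/hadoop/hadoop.master.tar.gz -C /usr/local
  ## If extracted with sudo, hand ownership back to the hadoop user
  [hadoop@dev-163 ~]$ sudo chown -R hadoop:hadoop /usr/local/hadoop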

Configure passwordless SSH login from master to the slave
  ## Copy the contents of /home/hadoop/.ssh/id_rsa.pub on the master node into /home/hadoop/.ssh/authorized_keys on the slave node (a sketch using ssh-keygen and ssh-copy-id follows this block)
  ## /home/hadoop/.ssh must be mode 700
  [hadoop@dev-163 ~]$ ls -ld .ssh
  drwx------ 2 hadoop hadoop 29 Aug 23 15:58 .ssh
  ## /home/hadoop/.ssh/authorized_keys must be mode 600
  [hadoop@dev-163 ~]$ ls -l .ssh/authorized_keys
  -rw------- 1 hadoop hadoop 396 Aug 23 15:58 .ssh/authorized_keys
  [hadoop@dev-163 .ssh]$ pwd
  /home/hadoop/.ssh
  [hadoop@dev-163 .ssh]$ ls
  authorized_keys
  [hadoop@dev-163 .ssh]$ cat authorized_keys
  ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDIOOEMRSgX3OothfEzneBnoZqfIlD3a5oaDzRmqKDISFx1sXWTBAtKCKRocq4pWU7DKN82hwcskWFlPnxpz2zP42gohPPpz8SuXXMsDsSKbkVpHduaPG9QvKFJRqtPNNnZQ4A5jZ02lZCcvZ3FDzdpyFTecyRejqdS0Q2EfVswQ7Xc/MySrk2/c7DaC/Xrz1oxu/wsHf45vDj0NiXAadufyIGN0SIxJbW50IB3eAKABQuwNU5CQRkTAcJf59xGixarRo4gqtCFAdtdyHoP/RIYgC1dWafA5TIFGbHuwfFEWluJQJPwpQ1w5mIJkRoPgwWVLI2bscghSzEVIGrRBuZZ hadoop@dev-162
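
For reference, a minimal sketch of how that key could be generated and installed with standard OpenSSH tooling (ssh-keygen and ssh-copy-id); the manual copy shown above reaches the same end state:

  ## On master: generate an RSA key pair for the hadoop user (no passphrase)
  [hadoop@dev-162 ~]$ ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
  ## Append the public key to slave1's authorized_keys (creates ~/.ssh with the right permissions if needed)
  [hadoop@dev-162 ~]$ ssh-copy-id hadoop@slave1
  ## Verify that login now works without a password
  [hadoop@dev-162 ~]$ ssh slave1 hostname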

Start Hadoop (run on master)
  [hadoop@dev-162 hadoop]$ hdfs namenode -format
  [hadoop@dev-162 hadoop]$ start-dfs.sh
  [hadoop@dev-162 hadoop]$ start-yarn.sh
  [hadoop@dev-162 hadoop]$ mr-jobhistory-daemon.sh start historyserver
  ## master
  [hadoop@dev-162 hadoop]$ jps
  3872 NameNode
  4210 ResourceManager
  5046 Jps
  4488 JobHistoryServer
  4059 SecondaryNameNode
  ## slave1
  [hadoop@dev-163 .ssh]$ jps
  3797 NodeManager
  5813 Jps
  3690 DataNode
  ## Check whether the DataNodes started correctly: if Live datanodes is not 0, the cluster came up successfully (the web UIs are another way to check; see the sketch after this block)
  [hadoop@dev-162 hadoop]$ hdfs dfsadmin -report
  Live datanodes (1):
  Name: 172.16.1.163:50010 (slave1)
  Hostname: slave1
  ......
  Last contact: Wed Aug 23 16:19:15 CST 2017
  [hadoop@dev-162 hadoop]$ hdfs dfs -mkdir -p /user/hadoop
  [hadoop@dev-162 hadoop]$ hdfs dfs -mkdir input
  [hadoop@dev-162 hadoop]$ hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input
  [hadoop@dev-162 hadoop]$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar grep input output 'dfs[a-z.]+'
  [hadoop@dev-162 hadoop]$ hdfs dfs -cat output/*
  1 dfsadmin
  1 dfs.replication
  1 dfs.namenode.secondary.http
  1 dfs.namenode.name.dir
  1 dfs.datanode.data.dir
  [hadoop@dev-162 hadoop]$ hdfs dfs -ls output
  Found 2 items
  -rw-r--r-- 1 hadoop supergroup 0 2017-08-23 16:23 output/_SUCCESS
  -rw-r--r-- 1 hadoop supergroup 107 2017-08-23 16:23 output/part-r-00000
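
Besides jps and dfsadmin -report, the running daemons expose web UIs. A minimal sketch of probing them with curl, assuming the Hadoop 2.7 default web ports (NameNode 50070, ResourceManager 8088) and the JobHistory webapp address set in mapred-site.xml above:

  ## NameNode web UI (DataNode status is listed under the Datanodes tab)
  [hadoop@dev-162 hadoop]$ curl -s -o /dev/null -w '%{http_code}\n' http://master:50070/
  ## YARN ResourceManager web UI
  [hadoop@dev-162 hadoop]$ curl -s -o /dev/null -w '%{http_code}\n' http://master:8088/
  ## JobHistory server web UI (port from mapreduce.jobhistory.webapp.address)
  [hadoop@dev-162 hadoop]$ curl -s -o /dev/null -w '%{http_code}\n' http://master:19888/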

Stop Hadoop (run on master)
  [hadoop@dev-162 hadoop]$ stop-yarn.sh
  [hadoop@dev-162 hadoop]$ stop-dfs.sh
  [hadoop@dev-162 hadoop]$ mr-jobhistory-daemon.sh stop historyserver