@nalan90 · 2017-09-13T14:55:25Z · 11,236 words · 635 reads

Hadoop Pseudo-Distributed Setup

Big Data

Source: http://www.powerxing.com/install-hadoop/

Core Components

Preparing the Installation Packages

Add a User and Set Permissions

  ## Add the hadoop user
  [root@dev-162 zhangshuang]# useradd -m hadoop -s /bin/bash
  ## Set its password
  [root@dev-162 zhangshuang]# passwd hadoop
  Changing password for user hadoop.
  New password:
  BAD PASSWORD: The password is shorter than 7 characters
  Retype new password:
  passwd: all authentication tokens updated successfully.
  ## Grant sudo privileges (append the following line)
  [root@dev-162 zhangshuang]# visudo
  hadoop ALL=(ALL) NOPASSWD:ALL
  ## Switch to the hadoop user
  [root@dev-162 zhangshuang]# su - hadoop
  ## Generate an SSH key pair (just press Enter at every prompt)
  [hadoop@dev-162 ~]$ ssh-keygen -t rsa
  Generating public/private rsa key pair.
  Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
  Created directory '/home/hadoop/.ssh'.
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /home/hadoop/.ssh/id_rsa.
  Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
  The key fingerprint is:
  7a:13:7f:9d:80:37:38:56:e9:b5:de:53:c1:c0:5a:33 hadoop@dev-162
  The key's randomart image is:
  (randomart image omitted)
  ## Append the public key to authorized_keys
  [hadoop@dev-162 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
  ## Restrict the file's permissions (sshd ignores it otherwise)
  [hadoop@dev-162 ~]$ chmod 600 .ssh/authorized_keys
  ## Log in to this machine without a password
  [hadoop@dev-162 ~]$ ssh localhost
  The authenticity of host 'localhost (::1)' can't be established.
  ECDSA key fingerprint is fe:0f:40:d7:56:c8:c1:b4:29:c3:ce:d8:d6:12:66:2e.
  Are you sure you want to continue connecting (yes/no)? yes
  Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
  Last login: Wed Aug 23 13:56:50 2017
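
Once the key is installed, key-based login can be verified non-interactively. A minimal sketch (assumes sshd is running on this machine): with `BatchMode=yes`, ssh exits non-zero instead of falling back to a password prompt, so the result can be scripted.

```shell
# Sketch: check that key-based login to localhost works without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password.
if ssh -o BatchMode=yes -o ConnectTimeout=5 localhost true 2>/dev/null; then
  ssh_status=ok
else
  ssh_status=failed   # sshd not running, or the key was not accepted
fi
echo "passwordless ssh: $ssh_status"
```

If this prints `failed`, re-check the `authorized_keys` contents and the `chmod 600` step above before starting Hadoop, since `start-dfs.sh` relies on passwordless SSH to localhost.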

Install the Java Environment

  ## Install OpenJDK 8 (runtime plus development package)
  [hadoop@dev-162 ~]$ sudo yum -y install java-1.8.0-openjdk
  [hadoop@dev-162 ~]$ sudo yum -y install java-1.8.0-openjdk-devel
  ## Set the Java environment variables (note: .bashrc, not .bash_profile)
  [hadoop@dev-161 ~]$ vim .bashrc
  ## Append the following lines
  export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
  export PATH=$JAVA_HOME/bin:$PATH:/usr/local/hadoop/sbin:/usr/local/hadoop/bin
  export HADOOP_HOME=/usr/local/hadoop
  export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
  [hadoop@dev-161 ~]$ source .bashrc
  [hadoop@dev-161 ~]$ echo $JAVA_HOME
  /usr/lib/jvm/java-1.8.0-openjdk
  [hadoop@dev-161 ~]$ java -version
  openjdk version "1.8.0_141"
  OpenJDK Runtime Environment (build 1.8.0_141-b16)
  OpenJDK 64-Bit Server VM (build 25.141-b16, mixed mode)
  [hadoop@dev-161 ~]$ $JAVA_HOME/bin/java -version
  openjdk version "1.8.0_141"
  OpenJDK Runtime Environment (build 1.8.0_141-b16)
  OpenJDK 64-Bit Server VM (build 25.141-b16, mixed mode)
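
A quick sanity check that the `.bashrc` edits took effect: `JAVA_HOME` should point at a directory that actually contains `bin/java`, since Hadoop's scripts launch the JVM from there. A small sketch:

```shell
# Sketch: verify JAVA_HOME points at a directory with an executable java.
if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
  java_home_ok=yes
else
  java_home_ok=no    # JAVA_HOME unset, or no java binary under it
fi
echo "JAVA_HOME usable: $java_home_ok"
```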

Download Hadoop

  ## Download hadoop-2.7.4.tar.gz
  [hadoop@dev-162 ~]$ wget https://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz --no-check-certificate
  ## Extract hadoop-2.7.4.tar.gz into /usr/local
  [hadoop@dev-161 tmp]$ sudo tar -zxvf hadoop-2.7.4.tar.gz -C /usr/local/
  [hadoop@dev-161 tmp]$ cd /usr/local/
  ## Rename /usr/local/hadoop-2.7.4 to /usr/local/hadoop
  [hadoop@dev-161 local]$ sudo mv hadoop-2.7.4/ hadoop
  ## Make the hadoop user the owner of /usr/local/hadoop
  [hadoop@dev-161 local]$ sudo chown -R hadoop hadoop/
  [hadoop@dev-161 local]$ cd /usr/local/hadoop/
  ## Show the Hadoop version
  [hadoop@dev-161 hadoop]$ ./bin/hadoop version
  Hadoop 2.7.4
  Subversion https://shv@git-wip-us.apache.org/repos/asf/hadoop.git -r cd915e1e8d9d0131462a0b7301586c175728a282
  Compiled by kshvachk on 2017-08-01T00:29Z
  Compiled with protoc 2.5.0
  From source with checksum 50b0468318b4ce9bd24dc467b7ce1148
  This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.4.jar

Configure core-site.xml

  ## Edit core-site.xml
  [hadoop@dev-161 hadoop]$ pwd
  /usr/local/hadoop
  [hadoop@dev-161 hadoop]$ vim etc/hadoop/core-site.xml
  <configuration>
      <property>
          <name>hadoop.tmp.dir</name>
          <value>file:/usr/local/hadoop/tmp</value>
          <description>A base for other temporary directories.</description>
      </property>
      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost:9000</value>
      </property>
  </configuration>
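
To double-check a value in these `*-site.xml` files without starting any daemons, the `<name>`/`<value>` pairs can be pulled out with a few lines of awk. A rough sketch (assumes one tag per line, as in the file above; once the daemons are configured, `hdfs getconf -confKey <key>` is the official way to query effective values):

```shell
# Sketch: print the <value> that follows a given <name> in a Hadoop site file.
# Assumes each <name> and <value> tag sits on its own line.
get_prop() {
  awk -v n="$1" '
    $0 ~ ("<name>" n "</name>") { found = 1; next }
    found && match($0, /<value>[^<]*<\/value>/) {
      print substr($0, RSTART + 7, RLENGTH - 15); exit
    }'
}

# Demo on an inline copy of the configuration above:
get_prop fs.defaultFS <<'EOF'   # prints: hdfs://localhost:9000
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF
```

Pointing it at `etc/hadoop/core-site.xml` or `etc/hadoop/hdfs-site.xml` (e.g. `get_prop dfs.replication < etc/hadoop/hdfs-site.xml`) works the same way.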

Configure hdfs-site.xml

In pseudo-distributed mode there is only one DataNode, so dfs.replication is set to 1.

  ## Edit hdfs-site.xml
  [hadoop@dev-161 hadoop]$ vim etc/hadoop/hdfs-site.xml
  <configuration>
      <property>
          <name>dfs.replication</name>
          <value>1</value>
      </property>
      <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:/usr/local/hadoop/tmp/dfs/name</value>
      </property>
      <property>
          <name>dfs.datanode.data.dir</name>
          <value>file:/usr/local/hadoop/tmp/dfs/data</value>
      </property>
  </configuration>

Format and Start HDFS

  [hadoop@dev-161 hadoop]$ ./bin/hdfs namenode -format
  ## Output like the following indicates success
  17/08/23 14:21:21 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
  17/08/23 14:21:21 INFO util.ExitUtil: Exiting with status 0
  [hadoop@dev-161 ~]$ cd /usr/local/hadoop/
  ## Start HDFS
  [hadoop@dev-161 hadoop]$ ./sbin/start-dfs.sh
  Starting namenodes on [localhost]
  localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-dev-161.out
  localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-dev-161.out
  Starting secondary namenodes [0.0.0.0]
  The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
  ECDSA key fingerprint is fe:0f:40:d7:56:c8:c1:b4:29:c3:ce:d8:d6:12:66:2e.
  Are you sure you want to continue connecting (yes/no)? yes
  0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
  0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-dev-161.out
  ## Startup succeeded
  [hadoop@dev-161 hadoop]$ jps
  5714 DataNode
  5589 NameNode
  6022 Jps
  5881 SecondaryNameNode
  ## Web UI
  http://172.16.1.161:50070/dfshealth.html#tab-overview
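
The `jps` check above can be scripted. A small helper (a sketch, not part of Hadoop) that reports which expected daemon names are absent from `jps` output:

```shell
# Sketch: list the expected daemon names missing from a jps listing.
missing_daemons() {
  out=$1; shift
  for d in "$@"; do
    printf '%s\n' "$out" | grep -qw "$d" || printf '%s ' "$d"
  done
}

# Demo on the jps output shown above:
sample='5714 DataNode
5589 NameNode
6022 Jps
5881 SecondaryNameNode'
missing=$(missing_daemons "$sample" NameNode DataNode SecondaryNameNode)
[ -z "$missing" ] && echo "HDFS daemons up" || echo "missing: $missing"
```

On a live machine, replace `"$sample"` with `"$(jps)"`.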

(Screenshot: HDFS NameNode web UI at port 50070)


Common HDFS Commands

  [hadoop@dev-161 hadoop]$ ./bin/hdfs
  Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND
         where COMMAND is one of:
    dfs                  run a filesystem command on the file systems supported in Hadoop.
    classpath            prints the classpath
    namenode -format     format the DFS filesystem
    secondarynamenode    run the DFS secondary namenode
    namenode             run the DFS namenode
    journalnode          run the DFS journalnode
    zkfc                 run the ZK Failover Controller daemon
    datanode             run a DFS datanode
    dfsadmin             run a DFS admin client
    haadmin              run a DFS HA admin client
    fsck                 run a DFS filesystem checking utility
    balancer             run a cluster balancing utility
    jmxget               get JMX exported values from NameNode or DataNode.
    mover                run a utility to move block replicas across storage types
    oiv                  apply the offline fsimage viewer to an fsimage
    oiv_legacy           apply the offline fsimage viewer to a legacy fsimage
    oev                  apply the offline edits viewer to an edits file
    fetchdt              fetch a delegation token from the NameNode
    getconf              get config values from configuration
    groups               get the groups which users belong to
    snapshotDiff         diff two snapshots of a directory or diff the
                         current directory contents with a snapshot
    lsSnapshottableDir   list all snapshottable dirs owned by the current user
                         Use -help to see options
    portmap              run a portmap service
    nfs3                 run an NFS version 3 gateway
    cacheadmin           configure the HDFS cache
    crypto               configure HDFS encryption zones
    storagepolicies      list/get/set block storage policies
    version              print the version

  Most commands print help when invoked w/o parameters.

  ----------

  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs
  Usage: hadoop fs [generic options]
    [-appendToFile <localsrc> ... <dst>]
    [-cat [-ignoreCrc] <src> ...]
    [-checksum <src> ...]
    [-chgrp [-R] GROUP PATH...]
    [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
    [-chown [-R] [OWNER][:[GROUP]] PATH...]
    [-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
    [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-count [-q] [-h] <path> ...]
    [-cp [-f] [-p | -p[topax]] <src> ... <dst>]
    [-createSnapshot <snapshotDir> [<snapshotName>]]
    [-deleteSnapshot <snapshotDir> <snapshotName>]
    [-df [-h] [<path> ...]]
    [-du [-s] [-h] <path> ...]
    [-expunge]
    [-find <path> ... <expression> ...]
    [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-getfacl [-R] <path>]
    [-getfattr [-R] {-n name | -d} [-e en] <path>]
    [-getmerge [-nl] <src> <localdst>]
    [-help [cmd ...]]
    [-ls [-d] [-h] [-R] [<path> ...]]
    [-mkdir [-p] <path> ...]
    [-moveFromLocal <localsrc> ... <dst>]
    [-moveToLocal <src> <localdst>]
    [-mv <src> ... <dst>]
    [-put [-f] [-p] [-l] <localsrc> ... <dst>]
    [-renameSnapshot <snapshotDir> <oldName> <newName>]
    [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
    [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
    [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
    [-setfattr {-n name [-v value] | -x name} <path>]
    [-setrep [-R] [-w] <rep> <path> ...]
    [-stat [format] <path> ...]
    [-tail [-f] <file>]
    [-test -[defsz] <path>]
    [-text [-ignoreCrc] <src> ...]
    [-touchz <path> ...]
    [-truncate [-w] <length> <path> ...]
    [-usage [cmd ...]]

  Generic options supported are
    -conf <configuration file>                    specify an application configuration file
    -D <property=value>                           use value for given property
    -fs <local|namenode:port>                     specify a namenode
    -jt <local|resourcemanager:port>              specify a ResourceManager
    -files <comma separated list of files>        specify comma separated files to be copied to the map reduce cluster
    -libjars <comma separated list of jars>       specify comma separated jar files to include in the classpath.
    -archives <comma separated list of archives>  specify comma separated archives to be unarchived on the compute machines.

  The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]

  ----------

  ## Create a directory on HDFS
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -mkdir -p /user/hadoop
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -mkdir input
  ## List an HDFS directory
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -ls /user/hadoop
  Found 1 items
  drwxr-xr-x - hadoop supergroup 0 2017-08-23 14:40 /user/hadoop/input
  ## Upload local files to a directory on HDFS
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -put etc/hadoop/*.xml input
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -ls input
  Found 8 items
  -rw-r--r-- 1 hadoop supergroup 4436 2017-08-23 14:40 input/capacity-scheduler.xml
  -rw-r--r-- 1 hadoop supergroup 1115 2017-08-23 14:40 input/core-site.xml
  -rw-r--r-- 1 hadoop supergroup 9683 2017-08-23 14:40 input/hadoop-policy.xml
  -rw-r--r-- 1 hadoop supergroup 1180 2017-08-23 14:40 input/hdfs-site.xml
  -rw-r--r-- 1 hadoop supergroup 620 2017-08-23 14:40 input/httpfs-site.xml
  -rw-r--r-- 1 hadoop supergroup 3518 2017-08-23 14:40 input/kms-acls.xml
  -rw-r--r-- 1 hadoop supergroup 5540 2017-08-23 14:40 input/kms-site.xml
  -rw-r--r-- 1 hadoop supergroup 690 2017-08-23 14:40 input/yarn-site.xml
  ## Run the grep example job against the uploaded files
  [hadoop@dev-161 hadoop]$ ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar grep input output 'dfs[a-z.]+'
  ## Show the contents of HDFS files
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -cat output/*
  1 dfsadmin
  1 dfs.replication
  1 dfs.namenode.name.dir
  1 dfs.datanode.data.dir
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -ls
  Found 2 items
  drwxr-xr-x - hadoop supergroup 0 2017-08-23 14:40 input
  drwxr-xr-x - hadoop supergroup 0 2017-08-23 14:42 output
  ## Download an HDFS directory to the local filesystem
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -get output output
  [hadoop@dev-161 hadoop]$ cat output/*
  1 dfsadmin
  1 dfs.replication
  1 dfs.namenode.name.dir
  1 dfs.datanode.data.dir
  ## Delete an HDFS directory
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -rm -r output
  17/08/23 14:48:55 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
  Deleted output
  [hadoop@dev-161 hadoop]$ ./bin/hdfs dfs -ls
  Found 1 items
  drwxr-xr-x - hadoop supergroup 0 2017-08-23 14:40 input
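
The `hadoop-mapreduce-examples` grep job used above does, in distributed form, what this local pipeline does: extract every match of the regex `dfs[a-z.]+` and count occurrences of each distinct match. A sketch on inline sample lines (hypothetical input, for illustration only):

```shell
# Sketch: local equivalent of the MapReduce grep example job.
# -o prints each regex match on its own line; uniq -c counts duplicates.
printf '%s\n' \
  '<name>dfs.replication</name>' \
  '<name>dfs.namenode.name.dir</name>' \
  '<name>dfs.datanode.data.dir</name>' \
  | grep -oE 'dfs[a-z.]+' | sort | uniq -c | sort -rn
```

Running the same pipeline over `etc/hadoop/*.xml` reproduces the counts the job wrote into `output/`, which is a handy way to sanity-check the cluster's answer.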

Update the Configuration and Start YARN

  ## Create mapred-site.xml from the template
  [hadoop@dev-161 hadoop]$ mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
  [hadoop@dev-161 hadoop]$ vim etc/hadoop/mapred-site.xml
  <configuration>
      <property>
          <name>mapreduce.framework.name</name>
          <value>yarn</value>
      </property>
  </configuration>
  [hadoop@dev-161 hadoop]$ vim etc/hadoop/yarn-site.xml
  <configuration>
      <property>
          <name>yarn.nodemanager.aux-services</name>
          <value>mapreduce_shuffle</value>
      </property>
  </configuration>
  ## Now YARN can be started (./sbin/start-dfs.sh must have been run first)
  [hadoop@dev-161 hadoop]$ ./sbin/start-yarn.sh
  starting yarn daemons
  starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-resourcemanager-dev-161.out
  localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-dev-161.out
  [hadoop@dev-161 hadoop]$ ./sbin/mr-jobhistory-daemon.sh start historyserver
  starting historyserver, logging to /usr/local/hadoop/logs/mapred-hadoop-historyserver-dev-161.out
  [hadoop@dev-161 hadoop]$ jps
  5714 DataNode
  5589 NameNode
  7189 Jps
  6726 ResourceManager
  6838 NodeManager
  5881 SecondaryNameNode
  7150 JobHistoryServer
  ## Web UI
  http://172.16.1.161:8088/cluster
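
For completeness, the daemons are stopped with the matching stop scripts, roughly in reverse order of startup. A sketch (assumes the `/usr/local/hadoop` layout used throughout; the `stop-*.sh` scripts ship alongside the `start-*.sh` scripts in `sbin/`):

```shell
# Sketch: stop the history server, then YARN, then HDFS.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
if [ -x "$HADOOP_HOME/sbin/stop-dfs.sh" ]; then
  "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" stop historyserver
  "$HADOOP_HOME/sbin/stop-yarn.sh"
  "$HADOOP_HOME/sbin/stop-dfs.sh"
else
  echo "no Hadoop install found at $HADOOP_HOME" >&2
fi
```

Because the NameNode metadata lives under `/usr/local/hadoop/tmp` per the configuration above, stopping and restarting the daemons does not require reformatting HDFS.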

(Screenshot: YARN ResourceManager web UI at port 8088)
