@zhangyy
2019-12-11T03:33:36.000000Z
Hadoop section
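Hadoop overview: open-source software that is reliable, distributed, and scalable.

De-IOE (moving away from IBM, Oracle, and EMC)
---------
IBM     // IBM minicomputers
Oracle  // Oracle database servers
EMC     // shared storage arrays

Cluster
-----------
A group of hosts working together.

1T = 1024G
1P = 1024T
1E = 1024P
1Z = 1024E
1Y = 1024Z
1N = 1024Y  (N is an informal unit, not an SI or IEC one)

Massive data
------
PB scale

Big data addresses two problems
---------------------
1. Storage: distributed storage
2. Computation: distributed computing

Cloud computing
------
1. Services
2. Virtualization

Distributed
--------------
Processes running on different hosts cooperate, and only together do they make up the whole application.

B/S architecture
---------------------
Browser / HTTP server: the thin-client model

failover         // switching to a standby after a failure (disaster recovery)
fault tolerance  // continuing to run despite faults

The 4 Vs of big data
-------------------
Volume   : large capacity
Variety  : many kinds of data
Velocity : fast
Value    : low value density

The four Hadoop modules
------------------
1. Hadoop Common
2. HDFS
3. Hadoop YARN
4. MapReduce (MR)

Hadoop installation modes:
1. Standalone (local) mode: nothing to configure, no daemons run
2. Pseudo-distributed mode
3. Cluster mode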
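The unit ladder above multiplies by 1024 at every step, so the sizes are easy to work out directly. The sketch below is illustrative only: `pow1024` is a hypothetical helper name, and it assumes a shell with 64-bit integer arithmetic, which is why it stops at 1E (2^60).

```shell
#!/bin/sh
# Each step up the ladder multiplies by 1024 (2^10).
# pow1024 N prints 1024^N; 64-bit shell arithmetic is enough up to N=6 (1E = 2^60).
pow1024() {
    result=1
    i=0
    while [ "$i" -lt "$1" ]; do
        result=$((result * 1024))
        i=$((i + 1))
    done
    echo "$result"
}

echo "1T = $(pow1024 4) bytes"
echo "1P = $(pow1024 5) bytes"
echo "1E = $(pow1024 6) bytes"
```

Beyond 1E the values overflow a signed 64-bit integer, so an arbitrary-precision tool would be needed for Z and Y.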
1. jdk-8u151-linux-x64.tar.gz
2. hadoop-2.7.4.tar.gz
(1) Remove the existing JDK packages:

    rpm -e --nodeps \
        java-1.8.0-openjdk-devel-1.8.0.131-11.b12.el7.x86_64 \
        java-1.7.0-openjdk-headless-1.7.0.141-2.6.10.5.el7.x86_64 \
        java-1.8.0-openjdk-headless-1.8.0.131-11.b12.el7.x86_64 \
        copy-jdk-configs-2.2-3.el7.noarch \
        java-1.8.0-openjdk-1.8.0.131-11.b12.el7.x86_64 \
        java-1.6.0-openjdk-1.6.0.41-1.13.13.1.el7_3.x86_64 \
        java-1.7.0-openjdk-1.7.0.141-2.6.10.5.el7.x86_64 \
        java-1.6.0-openjdk-devel-1.6.0.41-1.13.13.1.el7_3.x86_64 \
        java-1.7.0-openjdk-devel-1.7.0.141-2.6.10.5.el7.x86_64

(2) Create the installation directory and unpack the JDK:

    mkdir /soft
    tar -zxvf jdk-8u151-linux-x64.tar.gz -C /soft
    cd /soft
    ln -s jdk1.8.0_151 jdk

Configure the environment variables:

    vim /etc/profile

Append at the end:

    # jdk
    export JAVA_HOME=/soft/jdk
    export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
    export PATH=$PATH:$HOME/bin:$JAVA_HOME/bin

Reload the profile and check the version:

    source /etc/profile
    java -version

    cd software
    tar -zxvf hadoop-2.7.4.tar.gz -C /soft
    cd /soft
    ln -s hadoop-2.7.4 hadoop

Configure the environment variables:

    vim /etc/profile

Append at the end:

    # hadoop
    export HADOOP_HOME=/soft/hadoop
    export PATH=$PATH:$HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Reload the profile and check the version:

    source /etc/profile
    cd /soft/hadoop/
    bin/hadoop version

    cd /soft/hadoop/etc/hadoop

Edit core-site.xml:

    vim core-site.xml

    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/soft/hadoop/data</value>
        <description>hadoop_temp</description>
      </property>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://node01.yangyang.com:8020</value>
        <description>hdfs_derect</description>
      </property>
    </configuration>

fs.default.name is the deprecated name of fs.defaultFS; Hadoop 2.x still accepts it.
Edit hdfs-site.xml (each name/value pair needs its own property element):

    vim hdfs-site.xml
    ------------------
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>num</description>
      </property>
      <property>
        <name>dfs.namenode.http-address</name>
        <value>node01.yangyang.com:50070</value>
      </property>
    </configuration>
Edit mapred-site.xml:

    cp -p mapred-site.xml.template mapred-site.xml
    vim mapred-site.xml
    ------
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>node01.yangyang.com:19888</value>
      </property>
    </configuration>
Configure yarn-site.xml:

    vim yarn-site.xml
    -----------------
    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>
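All of the site files share the same property/name/value layout, so a quick read-back can catch a typo before any daemon is started. The following is a minimal sketch: `get_prop` is a hypothetical helper, it is line-based rather than a real XML parser, and it assumes each `<name>` and `<value>` element sits on a single line. It runs against a throwaway file so that it works without a Hadoop installation.

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> of the first property called NAME.
# Hypothetical helper; a line-based sketch, not a real XML parser.
get_prop() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Demo against a temporary file holding two hdfs-site.xml style properties.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>node01.yangyang.com:50070</value>
  </property>
</configuration>
EOF

get_prop "$conf" dfs.replication             # prints 1
get_prop "$conf" dfs.namenode.http-address   # prints node01.yangyang.com:50070
rm -f "$conf"
```

Once Hadoop is on the PATH, `hdfs getconf -confKey dfs.replication` reports the value Hadoop itself resolves, which is the more reliable check.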
    echo "export JAVA_HOME=/soft/jdk" >> hadoop-env.sh
    echo "export JAVA_HOME=/soft/jdk" >> mapred-env.sh
    echo "export JAVA_HOME=/soft/jdk" >> yarn-env.sh
Format the file system:

    bin/hdfs namenode -format

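Start the NameNode and DataNode:

    hadoop-daemon.sh start namenode
    hadoop-daemon.sh start datanode

Then open the NameNode web UI in a browser (node01.yangyang.com:50070, as set in dfs.namenode.http-address):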

Start YARN:

    yarn-daemon.sh start resourcemanager
    yarn-daemon.sh start nodemanager

Then open the YARN web UI in a browser (port 8088):


    hdfs dfs -mkdir /input
    vim file1

    hdfs dfs -put file1 /input
    cd /soft/hadoop/share/hadoop/mapreduce
    yarn jar hadoop-mapreduce-examples-2.7.4.jar wordcount /input /output
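To sanity-check what the wordcount job should produce, the same counting can be done locally on a small file. This is only a rough local equivalent, not how the example jar is implemented: `local_wordcount` is a hypothetical name, and it assumes words are separated by whitespace and that the job writes one `word<TAB>count` line per word, ordered by word.

```shell
#!/bin/sh
# local_wordcount: read text on stdin, print "word<TAB>count" sorted by word.
local_wordcount() {
    tr -s '[:space:]' '\n' |      # one token per line, split on whitespace
        grep -v '^$' |            # drop the empty line left by leading whitespace
        LC_ALL=C sort |           # byte-order sort, so identical words are adjacent
        uniq -c |                 # count each run of identical words
        awk '{ print $2 "\t" $1 }'
}

printf 'hello world\nhello hadoop\n' | local_wordcount
# hadoop	1
# hello	2
# world	1
```

Running `local_wordcount < file1` and comparing against the job's output is a quick way to confirm that the cluster run behaved as expected.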

    hdfs dfs -ls /output
    hdfs dfs -get /output

Start the JobHistoryServer:

    mr-jobhistory-daemon.sh start historyserver
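With the daemons started one by one, it is easy to miss one. The JDK's `jps` tool lists running Java processes by main class, so its output can be compared against the expected daemon names. The sketch below is an illustration under assumptions: `missing_daemons` is a hypothetical helper, and the expected list covers only the five daemons started in these notes. The demo feeds it canned input so it runs without a cluster.

```shell
#!/bin/sh
# missing_daemons: read `jps` output on stdin and print every expected
# daemon that does not appear in it. Hypothetical helper for illustration.
missing_daemons() {
    listing=$(cat)
    for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
        # jps prints "<pid> <MainClass>"; compare the class name as a whole field
        # so that SecondaryNameNode is not mistaken for NameNode.
        echo "$listing" | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$d"
    done
}

# Typical use on the node (requires jps on the PATH):
#   jps | missing_daemons
# Demo with canned input:
printf '2101 NameNode\n2250 DataNode\n2480 Jps\n' | missing_daemons
# ResourceManager
# NodeManager
# JobHistoryServer
```

Empty output means all five daemons were found.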


50070  // NameNode HTTP port
50075  // DataNode HTTP port
50090  // SecondaryNameNode HTTP port
8020   // NameNode RPC port
50010  // DataNode data transfer port
8088   // YARN (ResourceManager) HTTP port
8042   // NodeManager HTTP port
19888  // JobHistoryServer HTTP port
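The HTTP ports in the list above translate directly into web UI addresses. As a small convenience sketch, the hypothetical helper `web_ui_urls` below prints them for a given host; it assumes the default Hadoop 2.x ports listed here have not been overridden in the site files.

```shell
#!/bin/sh
# web_ui_urls HOST: print the web UI address of each daemon on HOST,
# using the default Hadoop 2.x HTTP ports from the list above.
web_ui_urls() {
    host=$1
    echo "NameNode           http://$host:50070"
    echo "DataNode           http://$host:50075"
    echo "SecondaryNameNode  http://$host:50090"
    echo "ResourceManager    http://$host:8088"
    echo "NodeManager        http://$host:8042"
    echo "JobHistoryServer   http://$host:19888"
}

web_ui_urls node01.yangyang.com
```

The second column can be pasted into a browser or passed to a tool such as curl to confirm that each UI responds.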
common
hdfs    // NameNode + DataNode + SecondaryNameNode
mapred
yarn    // ResourceManager + NodeManager
1. start-all.sh   // start all daemons
2. stop-all.sh    // stop all daemons
3. start-dfs.sh   // NN, DN, SNN
4. start-yarn.sh  // RM, NM
