@tsing1226
2015-12-29T01:38:26.000000Z
HDFS, Hadoop's distributed file system, uses a master/slave architecture. It consists of a NameNode and DataNodes: the NameNode keeps the namespace metadata in an image file (fsimage) and an edit log (edits), while the DataNodes store the actual data blocks. When a client reads or writes a file, it first asks the NameNode which DataNodes hold (or should hold) the blocks, and then transfers the data directly with those DataNodes.
Because a single NameNode is a single point of failure, HDFS high availability (HA) runs two NameNodes, one Active and one Standby. Since every read and write goes through the NameNode to decide which DataNodes the client should access, keeping a second NameNode ready to take over is what makes the file system highly available.
An HDFS HA setup has four key requirements:
- Keep the two NameNodes in sync at all times. A NameNode replays the edits file on startup, so the edit log must be stored safely; it is therefore written to a set of JournalNode daemons instead of a single local directory.
- DataNodes register with the NameNode at startup and keep sending heartbeats and block reports, so every DataNode must report to both NameNodes at all times.
- A client should not need to know which NameNode is Active and which is Standby; a failover proxy provider resolves the logical nameservice to the current Active NameNode (see the sketch after this list).
- At any moment only one NameNode may serve the cluster, so a fencing mechanism is required; SSH fencing with passwordless keys is the usual choice.
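As a quick sanity check, assuming the nameservice and NameNode ids configured below (ns1, nn1, nn2), you can ask either NameNode for its HA state and access HDFS through the logical URI; the failover proxy provider picks the Active NameNode for the client:
# query the HA state of each NameNode (returns "active" or "standby")
$bin/hdfs haadmin -getServiceState nn1
$bin/hdfs haadmin -getServiceState nn2
# clients address the logical nameservice, not a specific host
$bin/hdfs dfs -ls hdfs://ns1/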
- hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>hadoop-senior01.grc.com:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>hadoop-senior02.grc.com:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>hadoop-senior01.grc.com:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>hadoop-senior02.grc.com:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-senior01.grc.com:8485;hadoop-senior02.grc.com:8485;hadoop-senior03.grc.com:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/app/hadoop-2.5.0/data/dfs/jn</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/grc/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-senior03.grc.com:50090</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
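The sshfence method only works if the user running the NameNodes can SSH to the other NameNode host with the private key configured above, without a password. A minimal check, assuming the grc user and the hosts used in this article:
# from hadoop-senior01, confirm passwordless ssh to the other NameNode host
$ssh -i /home/grc/.ssh/id_rsa grc@hadoop-senior02.grc.com hostname
# and the reverse direction from hadoop-senior02
$ssh -i /home/grc/.ssh/id_rsa grc@hadoop-senior01.grc.com hostname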
* core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/app/hadoop-2.5.0/data/tmp</value>
</property>
</configuration>
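With fs.defaultFS pointing at the logical nameservice, relative HDFS paths resolve against hdfs://ns1. A quick way to confirm the client picks this up (hdfs getconf ships with the standard distribution):
# print the effective default filesystem and the NameNodes it maps to
$bin/hdfs getconf -confKey fs.defaultFS
$bin/hdfs getconf -namenodes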
Manual startup sequence (QJM-based HA, no automatic failover yet). Start a JournalNode on each of the three machines:
$sbin/hadoop-daemon.sh start journalnode
Format HDFS on nn1 (hadoop-senior01) and start the first NameNode:
$bin/hdfs namenode -format
$sbin/hadoop-daemon.sh start namenode
On nn2 (hadoop-senior02), copy the formatted metadata over and start the second NameNode:
$bin/hdfs namenode -bootstrapStandby
$sbin/hadoop-daemon.sh start namenode
Both NameNodes come up in standby, so promote nn1 to active and then start the DataNodes on every node:
$bin/hdfs haadmin -transitionToActive nn1
$sbin/hadoop-daemon.sh start datanode
At this point nn1 is Active and nn2 is Standby. Verify the cluster by writing a few files:
$bin/hdfs dfs -mkdir -p tmp/conf
$bin/hdfs dfs -put etc/hadoop/*-site.xml tmp/conf/
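Since automatic failover is not enabled yet, a manual failover can also be exercised with haadmin; a short sketch, assuming the nn1/nn2 ids above:
# gracefully fail over from nn1 to nn2 (fencing is attempted on the old active)
$bin/hdfs haadmin -failover nn1 nn2
# verify the roles have swapped
$bin/hdfs haadmin -getServiceState nn1
$bin/hdfs haadmin -getServiceState nn2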
Automatic failover: instead of promoting a NameNode by hand, a ZKFailoverController (ZKFC) on each NameNode uses ZooKeeper to detect failures and elect the Active NameNode.
- hdfs-site.xml
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-senior01.grc.com:2181,hadoop-senior02.grc.com:2181,hadoop-senior03.grc.com:2181</value>
</property>
Start the ZooKeeper server on each of the three machines (from the ZooKeeper installation directory):
$bin/zkServer.sh start
Then initialize the HA state in ZooKeeper from one of the NameNodes:
$bin/hdfs zkfc -formatZK
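After restarting HDFS so that a ZKFC runs next to each NameNode, automatic failover can be verified by killing the Active NameNode; a sketch, assuming nn1 is currently active:
# find the NameNode process id on the active node and kill it
$jps | grep NameNode
$kill -9 <namenode-pid>
# within a few seconds the ZKFC should promote the other NameNode
$bin/hdfs haadmin -getServiceState nn2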
ResourceManager HA adds redundancy by running active/standby ResourceManagers, removing the old single point of failure for node crashes and system upgrades. ZooKeeper, a distributed coordination framework, is what makes ResourceManager HA work, in five main ways:
- storing the ResourceManager's state information;
- storing cluster resource usage;
- storing task-related (application) information;
- automatic failover;
- electing the Active ResourceManager.
ResourceManager HA uses a master/slave architecture: at any time only one ResourceManager is Active, and the remaining one or more are Standby. The Active ResourceManager writes its state into ZooKeeper; when it fails, one of the Standby ResourceManagers is switched to Active, takes over, and recovers that state so running jobs are not lost.
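A quick way to see which ResourceManager is currently Active, assuming the rm1/rm2 ids configured below:
# returns "active" or "standby" for each ResourceManager
$bin/yarn rmadmin -getServiceState rm1
$bin/yarn rmadmin -getServiceState rm2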
- yarn-site.xml
<configuration>
<!-- enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop-senior02.grc.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop-senior03.grc.com</value>
</property>
<!-- ZooKeeper quorum used by the ResourceManagers -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop-senior01.grc.com:2181,hadoop-senior02.grc.com:2181,hadoop-senior03.grc.com:2181</value>
</property>
<!-- enable ResourceManager restart/recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
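In Hadoop 2.x, start-yarn.sh only starts the ResourceManager on the local machine, so the second ResourceManager has to be brought up by hand; a sketch assuming the hosts above, plus an optional look at the recovery state kept in ZooKeeper (ZKRMStateStore uses the parent path /rmstore by default):
# on hadoop-senior02 (rm1): start YARN, which starts the local ResourceManager
$sbin/start-yarn.sh
# on hadoop-senior03 (rm2): start the second ResourceManager manually
$sbin/yarn-daemon.sh start resourcemanager
# optional: inspect the RM state store from the ZooKeeper installation's CLI
$bin/zkCli.sh -server hadoop-senior01.grc.com:2181
ls /rmstore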
References:
- http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/core-default.xml
- http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
- http://hadoop.apache.org/docs/r2.5.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
- http://hadoop.apache.org/docs/r2.5.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
- http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
- http://hadoop.apache.org/docs/r2.5.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
- http://hadoop.apache.org/docs/r2.5.2/hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html