@duguyiren3476
2015-08-04T08:37:40.000000Z
Hadoop HA here refers to the HDFS high-availability mechanism. To configure HA, start from an already working non-HA configuration and modify it. The environment:
- namenode:
  - nn1: 10.128.17.21
  - nn2: 10.128.17.39
- datanode:
  - dn1: 10.128.17.24
  - dn2: 10.128.17.25
- journalnode:
  - 10.128.17.39
  - 10.128.17.24
  - 10.128.17.25
- zookeeper:
  - 10.128.17.16
  - 10.128.17.17
  - 10.128.17.20
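A note on the node counts above: both the ZooKeeper ensemble and the JournalNode set stay available only while a majority of their members are up, so a three-node set tolerates exactly one failure. A quick sketch of the majority arithmetic:

```shell
# Quorum (majority) size for an ensemble of n nodes is floor(n/2) + 1.
n=3
echo $(( n / 2 + 1 ))   # prints 2, i.e. one of the three nodes may fail
```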
Modify the following files (the first block is core-site.xml, the second hdfs-site.xml):
```xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/oneapm/data/hadoop</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://onlinecluster</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>2880</value>
    <description>Keep trash for two days (value in minutes).</description>
  </property>
  <property>
    <name>hadoop.logfile.size</name>
    <value>104857600</value>
    <description>The maximum size of each log file (100 MB).</description>
  </property>
  <property>
    <name>hadoop.logfile.count</name>
    <value>10</value>
    <description>The maximum number of log files.</description>
  </property>
  <property>
    <name>hadoop.native.lib</name>
    <value>true</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>10.128.17.16:2181,10.128.17.17:2181,10.128.17.20:2181</value>
  </property>
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>1000</value>
    <description>Session timeout in milliseconds.</description>
  </property>
</configuration>
```
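Note that `fs.trash.interval` is specified in minutes. A quick check that the value 2880 really matches the two-day trash retention intended above:

```shell
# 2880 minutes / 60 minutes-per-hour / 24 hours-per-day = days of retention.
echo $(( 2880 / 60 / 24 ))   # prints 2
```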
```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>onlinecluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.onlinecluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.onlinecluster.nn1</name>
    <value>10.128.17.21:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.onlinecluster.nn2</name>
    <value>10.128.17.39:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.onlinecluster.nn1</name>
    <value>10.128.17.21:53310</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.onlinecluster.nn2</name>
    <value>10.128.17.39:53310</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.onlinecluster.nn1</name>
    <value>10.128.17.21:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.onlinecluster.nn2</name>
    <value>10.128.17.39:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://10.128.17.24:8485;10.128.17.25:8485;10.128.17.39:8485/onlinecluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.onlinecluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa_nn1</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/oneapm/data/hadoop/journaldata</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>60000</value>
  </property>
  <property>
    <name>ipc.client.connect.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>dfs.image.transfer.bandwidthPerSec</name>
    <value>4194304</value>
  </property>
</configuration>
```
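The `qjournal://` URI in `dfs.namenode.shared.edits.dir` packs all three JournalNode endpoints, separated by `;`, with the journal name (`onlinecluster`) as the path. A small sanity check that the value expands to the expected hosts:

```shell
# Split the qjournal URI into one JournalNode endpoint per line.
uri='qjournal://10.128.17.24:8485;10.128.17.25:8485;10.128.17.39:8485/onlinecluster'
printf '%s\n' "$uri" | sed -e 's|^qjournal://||' -e 's|/.*$||' | tr ';' '\n'
# prints:
# 10.128.17.24:8485
# 10.128.17.25:8485
# 10.128.17.39:8485
```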
--
`dfs.ha.fencing.methods` is set to the ssh method (`sshfence`) here; a shell-script fencer can be used instead, but it is more cumbersome to set up.
`dfs.ha.fencing.ssh.private-key-files`: /home/hadoop/.ssh/id_rsa_nn1. With sshfence, nn1 and nn2 each need a copy of the other's private key `id_rsa`, renamed with the nn1/nn2 suffix. After the configuration files are synchronized, this value must be adjusted on each host to `id_rsa_nn1` or `id_rsa_nn2` accordingly.
At the same time, the hosts files on nn1 and nn2 must contain complete entries for all DataNodes and NameNodes; afterwards, scp the configuration out to every node.
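A minimal sketch of the hosts entries implied above, using the node names from the environment list (nn1/nn2/dn1/dn2 as hostnames is an assumption; the fragment is written to a scratch file here, but on a real node these lines belong in /etc/hosts on every machine):

```shell
# Write the hosts fragment to a scratch file for inspection.
cat > /tmp/hosts.fragment <<'EOF'
10.128.17.21  nn1
10.128.17.39  nn2
10.128.17.24  dn1
10.128.17.25  dn2
EOF
grep -c . /tmp/hosts.fragment   # prints 4
```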
Start the JournalNode daemon on each of the three journalnode hosts:

```shell
sbin/hadoop-daemon.sh start journalnode
```

On nn1, initialize the HA state in ZooKeeper:

```shell
bin/hdfs zkfc -formatZK
```

On nn1, format the NameNode (the JournalNodes must already be running):

```shell
bin/hdfs namenode -format
```

Start the NameNode on nn1, bootstrap nn2 from it, then restart the whole cluster:

```shell
sbin/hadoop-daemon.sh start namenode
bin/hdfs namenode -bootstrapStandby   # run this one on nn2
sbin/stop-dfs.sh
sbin/start-dfs.sh
```
Finally, visit nn1:50070 and nn2:50070 in a browser to check each NameNode's state (it can also be queried from the command line with `bin/hdfs haadmin -getServiceState nn1`). Kill the process of whichever NameNode is active and verify that the standby NameNode transitions to active; repeat the test back and forth to confirm the HA failover works.