@zqbinggong
2018-06-16T20:19:40.000000Z
MapReduce Basics
Developing MapReduce Applications
YARN
hadoop
'The Definitive Guide'
! All pictures are screenshots from the book 'Hadoop: The Definitive Guide, Fourth Edition, by Tom White (O'Reilly). Copyright © 2015 Tom White, 978-1-491-90163-2'
Typical configuration
<property>
<name>dfs.namenode.name.dir</name>
<value>/disk1/hdfs/name,/remote/hdfs/name</value>
</property>
This property specifies a list of directories where the namenode stores its persistent filesystem metadata (the edit log and the filesystem image). Namenode metadata is typically written to one or two local disks plus one remote disk (such as an NFS-mounted directory).
Typical configuration
<property>
<name>dfs.datanode.data.dir</name>
<value>/disk1/hdfs/data,/disk2/hdfs/data</value>
</property>
This property specifies the list of directories where the datanode stores its blocks. Unlike for the namenode, multiple directories here are not for redundant copies; instead, the datanode writes to the directories in round-robin fashion. For best performance, specify one storage directory per local disk. This also spreads blocks across disks, so reads of different blocks can proceed concurrently, improving read performance.
Typical configuration
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>/disk1/hdfs/namesecondary,/disk2/hdfs/namesecondary</value>
</property>
Specifies the directories where the secondary namenode stores its filesystem checkpoints (as with the namenode, multiple directories are for redundant copies).
YARN has no tasktracker; instead it relies on shuffle handlers to serve map task output to reduce tasks.
—— How the HDFS components (namenode, secondary namenode, and datanodes) organize their persistent data on disk
${dfs.namenode.name.dir}/
├── current
│ ├── VERSION
│ ├── edits_0000000000000000001-0000000000000000019
│ ├── edits_inprogress_0000000000000000020
│ ├── fsimage_0000000000000000000
│ ├── fsimage_0000000000000000000.md5
│ ├── fsimage_0000000000000000019
│ ├── fsimage_0000000000000000019.md5
│ └── seen_txid
└── in_use.lock
VERSION: contains version information about the HDFS that is running
#Mon Sep 29 09:54:36 BST 2014
namespaceID=1342387246 // unique identifier for the filesystem namespace, created when the namenode is first formatted
clusterID=CID-01b5c398-959c-4ea8-aae6-1e0d9bd8b142 // unique identifier for the HDFS cluster as a whole (important for HDFS federation)
cTime=0 // creation time of the namenode's storage
storageType=NAME_NODE
blockpoolID=BP-526805057-127.0.0.1-1411980876842 // unique identifier for the block pool, which contains all the files in the namespace managed by this namenode
layoutVersion=-57 // a negative integer describing the version of HDFS's persistent data structures; whenever the layout changes, the value is decremented and HDFS must be upgraded
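Since VERSION is a plain key=value (Java properties) file, its fields can be inspected with standard shell tools. A minimal sketch, working on a made-up copy of the file (the values are the ones shown above; the real file lives under ${dfs.namenode.name.dir}/current/):

```shell
# Write a sample VERSION file for illustration (values copied from the example above).
cat > /tmp/VERSION <<'EOF'
#Mon Sep 29 09:54:36 BST 2014
namespaceID=1342387246
clusterID=CID-01b5c398-959c-4ea8-aae6-1e0d9bd8b142
cTime=0
storageType=NAME_NODE
blockpoolID=BP-526805057-127.0.0.1-1411980876842
layoutVersion=-57
EOF

# Extract a single field, e.g. the layout version:
grep '^layoutVersion=' /tmp/VERSION | cut -d= -f2
# prints: -57
```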
in_use.lock: a lock file that the namenode uses to lock its storage directory, preventing another namenode instance from using the same storage directory at the same time
This part concerns the interaction between the namenode and filesystem clients: the latter modify data, and the former must update its metadata according to those changes.
When a filesystem client performs a write operation (such as creating or moving a file), the transaction is first recorded in the edit log. As the edit log is modified, the corresponding metadata is updated in step (the namenode keeps the filesystem metadata in memory).
The fsimage file
- The secondary asks the primary to roll its in-progress edits file, so new edits go to a new file. The primary also updates the seen_txid file in all its storage directories.
- The secondary retrieves the latest fsimage and edits files from the primary (using HTTP GET).
- The secondary loads fsimage into memory, applies each transaction from edits, then creates a new merged fsimage file.
- The secondary sends the new fsimage back to the primary (using HTTP PUT), and the primary saves it as a temporary .ckpt file.
- The primary renames the temporary fsimage file to make it available.
Blocks are never duplicated across the disks of a single datanode; duplicate blocks exist only across different datanodes.
${dfs.datanode.data.dir}/
├── current
│ ├── BP-526805057-127.0.0.1-1411980876842 // corresponds to the block pool ID in the namenode's VERSION file
│ │ └── current
│ │ ├── VERSION
│ │ ├── finalized
│ │ │ ├── blk_1073741825
│ │ │ ├── blk_1073741825_1001.meta
│ │ │ ├── blk_1073741826
│ │ │ └── blk_1073741826_1002.meta
│ │ └── rbw
│ └── VERSION
└── in_use.lock
The goal of monitoring is to detect when the cluster fails to provide the expected level of service. The master daemons (the primary namenode, the secondary namenode, and the resource manager) are the most important to monitor.
That is, adding nodes to and removing nodes from the cluster. A node normally runs both a datanode and a node manager, so the two are typically commissioned and decommissioned together.
The file (or files) specified by the dfs.hosts and yarn.resourcemanager.nodes.include-path properties is different from the slaves file. The former is used by the namenode and resource manager to determine which worker nodes may connect. The slaves file is used by the Hadoop control scripts to perform cluster-wide operations, such as cluster restarts. It is never used by the Hadoop daemons.
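A sketch of wiring up the include file, assuming a hypothetical path /etc/hadoop/include; dfs.hosts goes in hdfs-site.xml and yarn.resourcemanager.nodes.include-path in yarn-site.xml (both may point at the same file):
<property>
<name>dfs.hosts</name>
<value>/etc/hadoop/include</value>
</property>
<property>
<name>yarn.resourcemanager.nodes.include-path</name>
<value>/etc/hadoop/include</value>
</property>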
To add new nodes to the cluster:
- Add the network addresses of the new nodes to the include file.
- Update the namenode with the new set of permitted datanodes using this command:
% hdfs dfsadmin -refreshNodes
- Update the resource manager with the new set of permitted node managers using:
% yarn rmadmin -refreshNodes
- Update the slaves file with the new nodes, so that they are included in future operations performed by the Hadoop control scripts.
- Start the new datanodes and node managers.
- Check that the new datanodes and node managers appear in the web UI.
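The file edits in the steps above can be sketched in shell. The config directory and hostnames here are made up for illustration, and the refresh commands are shown only as comments since they need a live cluster:

```shell
# Hypothetical config directory and new node names.
CONF=/tmp/demo_hadoop_conf
mkdir -p "$CONF"
: > "$CONF/include"
: > "$CONF/slaves"

# Steps 1 and 4: append the new nodes to the include and slaves files.
for n in node101 node102; do
  echo "$n" >> "$CONF/include"
  echo "$n" >> "$CONF/slaves"
done

# Steps 2-3 (require a running cluster):
#   hdfs dfsadmin -refreshNodes
#   yarn rmadmin -refreshNodes

cat "$CONF/include"
# prints: node101 and node102, one per line
```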
The user tells the namenode which datanodes are to be retired, and Hadoop replicates their data to other datanodes before those nodes are shut down.
The basic process is the reverse of commissioning new nodes.
To remove nodes from the cluster:
- Add the network addresses of the nodes to be decommissioned to the exclude file. Do not update the include file at this point.
- Update the namenode with the new set of permitted datanodes, using this command:
% hdfs dfsadmin -refreshNodes
- Update the resource manager with the new set of permitted node managers using:
% yarn rmadmin -refreshNodes
- Go to the web UI and check whether the admin state has changed to “Decommission In Progress” for the datanodes being decommissioned. They will start copying their blocks to other datanodes in the cluster.
- When all the datanodes report their state as “Decommissioned,” all the blocks have been replicated. Shut down the decommissioned nodes.
- Remove the nodes from the include file, and run:
% hdfs dfsadmin -refreshNodes
% yarn rmadmin -refreshNodes
- Remove the nodes from the slaves file.
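The decommissioning file edits can be sketched in the same way. The config directory and hostnames are made up for illustration; node102 is the node being decommissioned, and the cluster commands are shown only as comments:

```shell
# Hypothetical config directory; node102 is being decommissioned.
CONF=/tmp/demo_decommission
mkdir -p "$CONF"
printf 'node101\nnode102\n' > "$CONF/include"
printf 'node101\nnode102\n' > "$CONF/slaves"
: > "$CONF/exclude"

# Step 1: add the node to the exclude file; the include file stays untouched.
echo 'node102' >> "$CONF/exclude"

# Steps 2-4 (need a live cluster): refresh, then wait for "Decommissioned".
#   hdfs dfsadmin -refreshNodes
#   yarn rmadmin -refreshNodes

# Final steps: remove the node from the include and slaves files.
grep -v '^node102$' "$CONF/include" > "$CONF/include.new" && mv "$CONF/include.new" "$CONF/include"
grep -v '^node102$' "$CONF/slaves" > "$CONF/slaves.new" && mv "$CONF/slaves.new" "$CONF/slaves"

cat "$CONF/include"
# prints: node101
```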