@zhangyy 2020-11-19

Oozie task scheduling

A collaboration framework


  • 1: Running the oozie example scheduling case
  • 2: Running a custom mapreduce jar with oozie
  • 3: Scheduling a shell script with oozie
  • 4: Periodically scheduling a job with the oozie coordinator

1: Running the oozie example scheduling case

1.1 Unpack the examples package

    # unpack the examples shipped with oozie
    tar -zxvf oozie-examples.tar.gz
    cd /home/hadoop/yangyang/oozie/examples/apps/map-reduce

    job.properties  -- defines job-related properties, such as directory paths and the
                       namenode, and points to the location of the workflow
    workflow.xml    -- defines the workflow configuration (start -- end -- kill) and the
                       action (mapred.input.dir, mapred.output.dir)
    lib             -- directory holding the resources (jar files) the job needs

1.2 Edit job.properties

    nameNode=hdfs://namenode01.hadoop.com:8020
    jobTracker=namenode01.hadoop.com:8032
    queueName=default
    examplesRoot=examples
    oozie.wf.application.path=${nameNode}/user/hadoop/${examplesRoot}/apps/map-reduce/workflow.xml
    outputDir=map-reduce

1.3 Configure the workflow.xml file:

    <workflow-app xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">
        <start to="mr-node"/>
        <action name="mr-node">
            <map-reduce>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <prepare>
                    <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}"/>
                </prepare>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                    <property>
                        <name>mapred.mapper.class</name>
                        <value>org.apache.oozie.example.SampleMapper</value>
                    </property>
                    <property>
                        <name>mapred.reducer.class</name>
                        <value>org.apache.oozie.example.SampleReducer</value>
                    </property>
                    <property>
                        <name>mapred.map.tasks</name>
                        <value>1</value>
                    </property>
                    <property>
                        <name>mapred.input.dir</name>
                        <value>/user/${wf:user()}/${examplesRoot}/input-data/text</value>
                    </property>
                    <property>
                        <name>mapred.output.dir</name>
                        <value>/user/${wf:user()}/${examplesRoot}/output-data/${outputDir}</value>
                    </property>
                </configuration>
            </map-reduce>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>

1.4 Upload the examples directory to HDFS

    hdfs dfs -put examples examples
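
A quick way to confirm the upload (the listing path follows from job.properties above, since relative HDFS paths resolve under /user/hadoop):

    hdfs dfs -ls examples/apps/map-reduce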


1.5 Run the oozie scheduled job

    bin/oozie job -oozie http://namenode01.hadoop.com:11000/oozie -config examples/apps/map-reduce/job.properties -run
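
The job can also be watched from the command line; a sketch using the oozie CLI's -info option, where <job-id> stands for the ID printed by the -run command above:

    # query the state of the submitted workflow job
    bin/oozie job -oozie http://namenode01.hadoop.com:11000/oozie -info <job-id>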


Check the status:

(screenshot: workflow job status in the Oozie web console)

Output directory:

(screenshots: the job's output directory and result files in HDFS)

2: Running a custom mapreduce jar with oozie

2.1 Create the upload directory on HDFS

    cd /home/hadoop/yangyang/oozie/
    hdfs dfs -mkdir oozie-apps

2.2 Create a local directory to stage files for upload

    mkdir oozie-apps
    cd /home/hadoop/yangyang/oozie/examples/apps
    cp -ap map-reduce /home/hadoop/yangyang/oozie/oozie-apps/
    cd /home/hadoop/yangyang/oozie/oozie-apps/map-reduce
    mkdir input-data

2.3 Copy in the jar to run and the input file for the job

    cp -p mr-wordcount.jar yangyang/oozie/oozie-apps/map-reduce/lib/
    cp -p /home/hadoop/wc.input ./input-data

2.4 Configure job.properties and workflow.xml

vim job.properties

    nameNode=hdfs://namenode01.hadoop.com:8020
    jobTracker=namenode01.hadoop.com:8032
    queueName=default
    examplesRoot=oozie-apps/map-reduce
    oozie.wf.application.path=${nameNode}/user/hadoop/${examplesRoot}/workflow.xml
    outputDir=oozie-reduce


vim workflow.xml

    <workflow-app xmlns="uri:oozie:workflow:0.2" name="wc-map-reduce">
        <start to="mr-node"/>
        <action name="mr-node">
            <map-reduce>
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <prepare>
                    <delete path="${nameNode}/user/hadoop/${examplesRoot}/output-data/${outputDir}"/>
                </prepare>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                    <!-- 0: use the new MapReduce API -->
                    <property>
                        <name>mapred.mapper.new-api</name>
                        <value>true</value>
                    </property>
                    <property>
                        <name>mapred.reducer.new-api</name>
                        <value>true</value>
                    </property>
                    <!-- 1: input -->
                    <property>
                        <name>mapred.input.dir</name>
                        <value>/user/hadoop/${examplesRoot}/input-data</value>
                    </property>
                    <!-- 2: mapper class -->
                    <property>
                        <name>mapreduce.job.map.class</name>
                        <value>org.apache.hadoop.wordcount.WordCountMapReduce$WordCountMapper</value>
                    </property>
                    <property>
                        <name>mapreduce.map.output.key.class</name>
                        <value>org.apache.hadoop.io.Text</value>
                    </property>
                    <property>
                        <name>mapreduce.map.output.value.class</name>
                        <value>org.apache.hadoop.io.IntWritable</value>
                    </property>
                    <!-- 3: reducer class -->
                    <property>
                        <name>mapreduce.job.reduce.class</name>
                        <value>org.apache.hadoop.wordcount.WordCountMapReduce$WordCountReducer</value>
                    </property>
                    <property>
                        <name>mapreduce.job.output.key.class</name>
                        <value>org.apache.hadoop.io.Text</value>
                    </property>
                    <property>
                        <name>mapreduce.job.output.value.class</name>
                        <value>org.apache.hadoop.io.IntWritable</value>
                    </property>
                    <!-- 4: output -->
                    <property>
                        <name>mapred.output.dir</name>
                        <value>/user/hadoop/${examplesRoot}/output-data/${outputDir}</value>
                    </property>
                </configuration>
            </map-reduce>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>
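
Before uploading, the workflow definition can be sanity-checked against the bundled schemas; note that depending on the Oozie version, validate runs locally or needs an -oozie server URL, so treat this as a sketch:

    # validate the workflow XML before submitting
    bin/oozie validate oozie-apps/map-reduce/workflow.xml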

2.5 Upload the files to HDFS:

    hdfs dfs -put map-reduce oozie-apps

2.6 Run the job with the oozie command

    bin/oozie job -oozie http://namenode01.hadoop.com:11000/oozie -config oozie-apps/map-reduce/job.properties -run
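
When the workflow succeeds, the word count results should be in the output path assembled from job.properties and workflow.xml above; the part-r-* file name assumes the default new-API reducer output naming:

    # inspect the wordcount output
    hdfs dfs -cat /user/hadoop/oozie-apps/map-reduce/output-data/oozie-reduce/part-r-*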

2.7 View the test results in the browser

(screenshots: the test results as viewed in the browser)

3: Scheduling a shell script with oozie

3.1 Generate the configuration files:

    cd /home/hadoop/yangyang/oozie/examples/apps
    cp -ap shell/ ../../oozie-apps/
    cd ../../oozie-apps/
    mv shell mem-shell

3.2 Write the shell script:

    cd /home/hadoop/yangyang/oozie/oozie-apps/mem-shell

vim meminfo.sh

    #!/bin/bash
    /usr/bin/free -m >> /tmp/meminfo
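
A shell action runs inside a MapReduce launcher task, so the script executes on whichever cluster node hosts that task; /tmp/meminfo therefore appears on that node, not necessarily on the submitting machine. After a run, check it there:

    # on the node that executed the action
    cat /tmp/meminfo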

3.3 Configure job.properties and workflow.xml

vim job.properties

    nameNode=hdfs://namenode01.hadoop.com:8020
    jobTracker=namenode01.hadoop.com:8032
    queueName=default
    examplesRoot=oozie-apps/mem-shell
    oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/workflow.xml
    EXEC=meminfo.sh


vim workflow.xml

    <workflow-app xmlns="uri:oozie:workflow:0.4" name="mem-shell-wf">
        <start to="shell-node"/>
        <action name="shell-node">
            <shell xmlns="uri:oozie:shell-action:0.2">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
                <exec>${EXEC}</exec>
                <file>/user/hadoop/oozie-apps/mem-shell/${EXEC}#${EXEC}</file>
            </shell>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>

3.4 Upload the configuration to HDFS

    cd /home/hadoop/yangyang/oozie/oozie-apps
    hdfs dfs -put mem-shell oozie-apps


3.5 Run the oozie job to execute the shell script

    bin/oozie job -oozie http://namenode01.hadoop.com:11000/oozie -config oozie-apps/mem-shell/job.properties -run


4: Periodically scheduling a job with the oozie coordinator

4.1 Configure the timezone in the oozie configuration file

    cd /home/hadoop/yangyang/oozie/conf
    vim oozie-site.xml

Add the following properties:

    <property>
        <name>oozie.processing.timezone</name>
        <value>GMT+0800</value>
    </property>
    <property>
        <name>oozie.service.coord.check.maximum.frequency</name>
        <value>false</value>
    </property>
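
Here oozie.processing.timezone switches the server from the default UTC to GMT+0800, which is why the start/end times in job.properties later carry a +0800 suffix, and oozie.service.coord.check.maximum.frequency=false lifts the default five-minute floor on coordinator frequencies, which the two-minute schedule below needs. As a sketch, the CLI can list the timezone IDs the server accepts:

    # list known timezone IDs, narrowed to GMT offsets
    bin/oozie info -timezones | grep GMT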

4.2 Set the local time

    # do this as the root user
    cp -p /etc/localtime /etc/localtime.bak
    rm -rf /etc/localtime
    cd /usr/share/zoneinfo/Asia/
    cp -p Shanghai /etc/localtime
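
To confirm the change took effect (illustrative check only):

    date    # should now report the +0800 (CST) timezone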

4.3 Edit the oozie-console.js file

    cd /home/hadoop/yangyang/oozie/oozie-server/webapps/oozie
    vim oozie-console.js

Change getTimeZone() so it defaults to GMT+0800:

    function getTimeZone() {
        Ext.state.Manager.setProvider(new Ext.state.CookieProvider());
        return Ext.state.Manager.get("TimezoneId", "GMT+0800");
    }


4.4 Restart the oozie service

    bin/oozie-stop.sh
    bin/oozie-start.sh
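
Once restarted, a quick liveness check with the admin subcommand should report the system mode as NORMAL:

    bin/oozie admin -oozie http://namenode01.hadoop.com:11000/oozie -status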

4.5 Check oozie's current time

(screenshot: the current time reported by the Oozie web console)

4.6 Configure job.properties, workflow.xml, and coordinator.xml

    cd /home/hadoop/yangyang/oozie/examples/apps
    cp -ap cron ../../oozie-apps/
    cd ../../oozie-apps/cron
    rm -rf job.properties workflow.xml
    cd /home/hadoop/yangyang/oozie/oozie-apps/mem-shell
    cp -p * ../cron

Configure job.properties

vim job.properties

    nameNode=hdfs://namenode01.hadoop.com:8020
    jobTracker=namenode01.hadoop.com:8032
    queueName=default
    examplesRoot=oozie-apps/cron
    oozie.coord.application.path=${nameNode}/user/hadoop/${examplesRoot}/
    start=2016-06-06T16:57+0800
    end=2016-06-06T20:00+0800
    workflowAppUri=${nameNode}/user/hadoop/${examplesRoot}/
    EXEC=meminfo.sh


Configure workflow.xml

vim workflow.xml

    <workflow-app xmlns="uri:oozie:workflow:0.4" name="memcron-shell-wf">
        <start to="shell-node"/>
        <action name="shell-node">
            <shell xmlns="uri:oozie:shell-action:0.2">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
                <exec>${EXEC}</exec>
                <file>/user/hadoop/oozie-apps/cron/${EXEC}#${EXEC}</file>
            </shell>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>

Configure coordinator.xml

vim coordinator.xml

    <coordinator-app name="cron-coord" frequency="${coord:minutes(2)}" start="${start}" end="${end}"
                     timezone="GMT+0800" xmlns="uri:oozie:coordinator:0.2">
        <action>
            <workflow>
                <app-path>${workflowAppUri}</app-path>
                <configuration>
                    <property>
                        <name>jobTracker</name>
                        <value>${jobTracker}</value>
                    </property>
                    <property>
                        <name>nameNode</name>
                        <value>${nameNode}</value>
                    </property>
                    <property>
                        <name>queueName</name>
                        <value>${queueName}</value>
                    </property>
                    <property>
                        <name>EXEC</name>
                        <value>${EXEC}</value>
                    </property>
                </configuration>
            </workflow>
        </action>
    </coordinator-app>

4.7 Upload the configuration to HDFS:

    hdfs dfs -put cron oozie-apps
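
With the application directory on HDFS, the server can be asked for a dry run first: it parses the coordinator and reports what would be materialized without actually creating the job. A sketch using the standard -dryrun flag:

    bin/oozie job -oozie http://namenode01.hadoop.com:11000/oozie -config oozie-apps/cron/job.properties -dryrun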

4.8 Run the job with the oozie command

    bin/oozie job -oozie http://namenode01.hadoop.com:11000/oozie -config oozie-apps/cron/job.properties -run
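
Coordinator job IDs end in "-C" (workflow IDs end in "-W"); all coordinators known to the server can be listed as follows:

    # list coordinator jobs
    bin/oozie jobs -oozie http://namenode01.hadoop.com:11000/oozie -jobtype coordinator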

4.9 View the job in the web console

(screenshots: the coordinator job and its actions in the Oozie web console)
