Quick Start: Azkaban in Practice

3 Azkaban in Practice

Azkaban's built-in job types include command and java.

3.1 Single-job example

1) Create the job description file

[atguigu@hadoop102 jobs]$ vim first.job
#first.job
type=command
command=echo 'this is my first job'

2) Package the job file into a zip archive

[atguigu@hadoop102 jobs]$ zip first.zip first.job 
  adding: first.job (deflated 15%)
[atguigu@hadoop102 jobs]$ ll
total 8
-rw-rw-r--. 1 atguigu atguigu  60 Oct 18 17:42 first.job
-rw-rw-r--. 1 atguigu atguigu 219 Oct 18 17:43 first.zip

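Note:

Azkaban currently accepts uploaded workflow files only as .zip archives. The zip should contain the xxx.job files plus any other files the jobs need to run. A job file's name must end in .job, otherwise Azkaban will not recognize it, and job names must be unique within a project.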
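
The packaging rules in the note can be checked locally before uploading. The following is a minimal sketch, not part of Azkaban itself: the function name check_job_dir and the temporary demo directory are assumptions made for illustration. It verifies that a directory holds at least one file ending in .job and that every such file declares a type= property.

```shell
#!/usr/bin/env bash
# Hypothetical pre-upload check; check_job_dir is an illustrative name, not an Azkaban tool.
check_job_dir() {
    local dir="$1"
    local found=0
    local f
    for f in "$dir"/*.job; do
        # With no matches the glob stays literal, so skip anything that is not a real file.
        [ -f "$f" ] || continue
        found=1
        if ! grep -q '^type=' "$f"; then
            echo "missing type= in $(basename "$f")"
            return 1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "no .job files in $dir"
        return 1
    fi
    echo "ok"
}

# Demo on a throwaway directory that holds the same first.job as in this section.
demo_dir="$(mktemp -d)"
printf '%s\n' '#first.job' 'type=command' "command=echo 'this is my first job'" > "$demo_dir/first.job"
check_result="$(check_job_dir "$demo_dir")"
echo "$check_result"
rm -rf "$demo_dir"
```

For the demo directory the script prints ok. Pointing check_job_dir at the real jobs directory before running zip would catch a misnamed file early.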

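3) Create a project in the Azkaban web console and upload the job zip

First create the project.

Then upload the zip.

4) Run the job

Click to execute the workflow.

Click Continue.

5) The job runs successfully

6) Click through to view the job log

3.2 Multi-job workflow example

1) Create several job description files with dependencies between them

First job: start.job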

[atguigu@hadoop102 jobs]$ vim start.job
#start.job
type=command
command=touch /opt/module/kangkang.txt

Second job: step1.job, which depends on start.job

[atguigu@hadoop102 jobs]$ vim step1.job
#step1.job
type=command
dependencies=start
command=echo "this is step1 job"

Third job: step2.job, which depends on start.job

[atguigu@hadoop102 jobs]$ vim step2.job
#step2.job
type=command
dependencies=start
command=echo "this is step2 job"

Fourth job: finish.job, which depends on step1.job and step2.job

[atguigu@hadoop102 jobs]$ vim finish.job
#finish.job
type=command
dependencies=step1,step2
command=echo "this is finish job"

2) Package all the job files into a single zip

[atguigu@hadoop102 jobs]$ zip jobs.zip start.job step1.job step2.job finish.job
updating: start.job (deflated 16%)
  adding: step1.job (deflated 12%)
  adding: step2.job (deflated 12%)
  adding: finish.job (deflated 14%)
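
Each name in a dependencies= line refers to another job by its file name without the .job suffix, so a typo there only shows up after upload. The following is a rough sketch of a local check, assuming the four job files from this section; the function name missing_dependencies and the temporary directory are hypothetical and not part of Azkaban.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every dependency that has no matching .job file.
missing_dependencies() {
    local dir="$1"
    local f deps dep
    for f in "$dir"/*.job; do
        [ -f "$f" ] || continue
        # Take the value after "dependencies=", if the file has such a line.
        deps="$(sed -n 's/^dependencies=//p' "$f")"
        # Split the comma-separated list into separate names.
        for dep in $(printf '%s' "$deps" | tr ',' ' '); do
            [ -f "$dir/$dep.job" ] || echo "$(basename "$f"): $dep"
        done
    done
}

# Recreate the four jobs of this section in a throwaway directory.
flow_dir="$(mktemp -d)"
printf '%s\n' 'type=command' 'command=touch /opt/module/kangkang.txt' > "$flow_dir/start.job"
printf '%s\n' 'type=command' 'dependencies=start' 'command=echo "this is step1 job"' > "$flow_dir/step1.job"
printf '%s\n' 'type=command' 'dependencies=start' 'command=echo "this is step2 job"' > "$flow_dir/step2.job"
printf '%s\n' 'type=command' 'dependencies=step1,step2' 'command=echo "this is finish job"' > "$flow_dir/finish.job"

missing="$(missing_dependencies "$flow_dir")"
if [ -z "$missing" ]; then
    echo "all dependencies resolved"
else
    echo "$missing"
fi
```

With all four files present the script reports that all dependencies are resolved. If step1.job were left out of the directory, it would instead list finish.job together with the missing name step1.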

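3) Create a project in the Azkaban web console and upload the zip

4) Start the workflow (flow)

5) View the result

Exercise:

Upload student.txt to HDFS, create an external table over the uploaded file, then write the query results from that table to a local file.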

3.3 Java job

Use Azkaban to schedule a Java program.

1) Write the Java program

package com.atguigu.azkaban;

import java.io.FileOutputStream;
import java.io.IOException;

public class AzkabanTest {
	public void run() throws IOException {
		// Write the code your task needs here
		// try-with-resources closes the stream even if write() throws
		try (FileOutputStream fos = new FileOutputStream("/opt/module/azkaban/output")) {
			fos.write("this is a java progress".getBytes());
		}
	}

	public static void main(String[] args) throws IOException {
		AzkabanTest azkabanTest = new AzkabanTest();
		azkabanTest.run();
	}
}

2) Package the Java program as a jar, create a lib directory, and put the jar in it

[atguigu@hadoop102 azkaban]$ mkdir lib
[atguigu@hadoop102 azkaban]$ cd lib/
[atguigu@hadoop102 lib]$ ll
total 4
-rw-rw-r--. 1 atguigu atguigu 3355 Oct 18 20:55 azkaban-0.0.1-SNAPSHOT.jar

3) Write the job file

[atguigu@hadoop102 jobs]$ vim azkabanJava.job
#azkabanJava.job
type=javaprocess
java.class=com.atguigu.azkaban.AzkabanTest
classpath=/opt/module/azkaban/lib/*

4) Package the job file into a zip

[atguigu@hadoop102 jobs]$ zip azkabanJava.zip azkabanJava.job 
  adding: azkabanJava.job (deflated 19%)

5) Create a project in the Azkaban web console, upload the job archive, and run the job. When it finishes, the program's output file appears under /opt/module/azkaban:

[atguigu@hadoop102 azkaban]$ pwd
/opt/module/azkaban
[atguigu@hadoop102 azkaban]$ ll
total 24
drwxrwxr-x.  2 atguigu atguigu 4096 Oct 17 17:14 azkaban-2.5.0
drwxrwxr-x. 10 atguigu atguigu 4096 Oct 18 17:17 executor
drwxrwxr-x.  2 atguigu atguigu 4096 Oct 18 20:35 jobs
drwxrwxr-x.  2 atguigu atguigu 4096 Oct 18 20:54 lib
-rw-rw-r--.  1 atguigu atguigu   23 Oct 18 20:55 output
drwxrwxr-x.  9 atguigu atguigu 4096 Oct 18 17:17 server
[atguigu@hadoop102 azkaban]$ cat output 
this is a java progress

3.3 HDFS job

1) Create the job description file

[atguigu@hadoop102 jobs]$ vim fs.job
#hdfs job
type=command
command=/opt/module/hadoop-2.7.2/bin/hadoop fs -mkdir /azkaban

2) Package the job file into a zip archive

[atguigu@hadoop102 jobs]$ zip fs.zip fs.job 
  adding: fs.job (deflated 12%)

3) Create a project in the Azkaban web console and upload the job archive

4) Run the job

5) View the result

3.4 MapReduce job

MapReduce jobs can also be scheduled with Azkaban.

1) Create the job description file and the MapReduce program jar

[atguigu@hadoop102 jobs]$ vim mapreduce.job
#mapreduce job
type=command
command=/opt/module/hadoop-2.7.2/bin/hadoop jar /opt/module/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /wordcount/input /wordcount/output

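2) Package the job file into a zip archive

[atguigu@hadoop102 jobs]$ zip mapreduce.zip mapreduce.job 
  adding: mapreduce.job (deflated 43%)

3) Create a project in the Azkaban web console and upload the job archive

4) Start the job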

3.5 Hive script job

1) Create the job description file and the Hive script

(1) Hive script: student.sql

[atguigu@hadoop102 jobs]$ vim student.sql
use default;
drop table student;
create table student(id int, name string)
row format delimited fields terminated by '\t';
load data local inpath '/opt/module/datas/student.txt' into table student;
insert overwrite local directory '/opt/module/datas/student'
row format delimited fields terminated by '\t'
select * from student;

(2) Job description file: hive.job

[atguigu@hadoop102 jobs]$ vim hive.job
#hive job
type=command
command=/opt/module/hive/bin/hive -f /opt/module/azkaban/jobs/student.sql

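2) Package the job file into a zip archive

[atguigu@hadoop102 jobs]$ zip hive.zip hive.job 
  adding: hive.job (deflated 21%)

3) Upload the zip in the Azkaban web console, run the job, and check the exported result

[atguigu@hadoop102 student]$ cat /opt/module/datas/student/000000_0 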
1001    yangyang
1002    bobo
1003    banzhang
1004    pengpeng
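
The Hive script in this section loads /opt/module/datas/student.txt, whose contents are not listed here. The following sketch shows one way to produce a tab-delimited input file that matches the table definition. It assumes the input holds the same four rows as the exported result, and it writes to a temporary directory rather than to /opt/module/datas; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Assumption: the input rows match the four rows in the exported result.
# Written to a temporary directory here; the Hive script expects /opt/module/datas/student.txt.
data_dir="$(mktemp -d)"
student_file="$data_dir/student.txt"

# printf reuses the format for each pair of arguments, giving one "id<TAB>name" row per pair.
printf '%s\t%s\n' 1001 yangyang 1002 bobo 1003 banzhang 1004 pengpeng > "$student_file"

# Count rows that do not have exactly two tab-separated fields.
bad_rows="$(awk -F '\t' 'NF != 2 { n++ } END { print n + 0 }' "$student_file")"
echo "rows with wrong field count: $bad_rows"
```

The field check matters because the table is declared with fields terminated by '\t'. A row separated by spaces instead of a tab would typically load as a NULL id and a NULL name rather than fail loudly.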
