Oozie Job Executor

This job entry executes Oozie Workflows. It is a front end on top of the OozieClient Java API that submits jobs to an Oozie server using web service calls.

Oozie is a workflow/coordination system to manage Hadoop jobs. Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions. Oozie Coordinator jobs are recurrent Oozie Workflow jobs and can be configured so that a job is triggered by time (frequency) and data availability.

Oozie is integrated with the rest of the Hadoop stack and supports several types of Hadoop jobs out of the box (Java map-reduce, Streaming map-reduce, Pig, Distcp, etc.). To learn more about Oozie and Oozie Workflows, visit Oozie's website: http://oozie.apache.org/index.html.
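To make the "web service calls" mentioned above more concrete, here is a minimal Python sketch of what a job submission request might look like. It only builds the request and does not send it. The `/v1/jobs?action=start` endpoint and the `<configuration>` XML layout follow the Oozie Web Services API as commonly documented, but treat them as assumptions to verify against your Oozie version. The host names, user name, and the `build_submission` helper are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_submission(oozie_url, properties):
    """Return (endpoint, xml_body) for a hypothetical job-submission request."""
    # Assumed endpoint layout: <oozie_url>/v1/jobs?action=start
    endpoint = oozie_url.rstrip("/") + "/v1/jobs?action=start"

    # Job properties are sent as a Hadoop-style <configuration> document.
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return endpoint, ET.tostring(root, encoding="unicode")


# Hypothetical server and workflow location, for illustration only.
endpoint, body = build_submission(
    "http://oozie.example.com:11000/oozie",
    {
        "user.name": "etl",
        "oozie.wf.application.path": "hdfs://namenode.example.com:8020/user/etl/wf",
    },
)
print(endpoint)
print(body)
```

In the job entry itself, this work is delegated to the OozieClient Java API; the sketch only illustrates the shape of the underlying request.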

Oozie Job Executor (Quick Setup Mode)

Option

Definition

Name

The name of this job instance.

Hadoop Cluster

Allows you to create, edit, and select a Hadoop cluster configuration for use. Hadoop cluster configuration settings can be reused in transformation steps and job entries that support this feature. In a Hadoop cluster configuration, you can specify information like host names and ports for HDFS, Job Tracker, and other big data cluster components. The Edit button allows you to edit Hadoop cluster configuration information. The New button allows you to add a new Hadoop cluster configuration. Information on Hadoop Clusters can be found in Pentaho Help.

Enable Blocking

When checked, this option blocks the rest of the transformation from executing until the Oozie job finishes.

Polling Interval (ms)

Sets the interval, in milliseconds, at which the status of the Oozie workflow is checked.

Workflow Properties

Field to enter the path to the workflow properties file. This path is required and must point to a valid job properties file. In the properties file, the oozie.wf.application.path property must be set.
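The interaction between Enable Blocking and Polling Interval (ms) described in the table above can be sketched roughly as follows. This is an illustrative Python sketch, not the job entry's actual implementation: `get_status` stands in for whatever call retrieves the workflow status from the Oozie server, and the set of terminal states is an assumption based on the usual Oozie workflow statuses.

```python
import time

# Assumed terminal workflow states; non-terminal ones include PREP and RUNNING.
TERMINAL_STATES = {"SUCCEEDED", "KILLED", "FAILED"}


def wait_for_workflow(get_status, polling_interval_ms, sleep=time.sleep):
    """Poll get_status() until it reports a terminal state, then return it."""
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        # Polling Interval is given in milliseconds; sleep() expects seconds.
        sleep(polling_interval_ms / 1000.0)


def run_entry(get_status, enable_blocking, polling_interval_ms, sleep=time.sleep):
    """Return the final status when blocking, or None when not blocking."""
    if not enable_blocking:
        return None  # continue immediately without waiting for the workflow
    return wait_for_workflow(get_status, polling_interval_ms, sleep)


# Usage with a fake status source and a fake sleep that records the waits.
statuses = iter(["PREP", "RUNNING", "RUNNING", "SUCCEEDED"])
naps = []
result = run_entry(lambda: next(statuses), True, 500, sleep=naps.append)
print(result, naps)
```

A shorter polling interval detects completion sooner at the cost of more status requests to the Oozie server.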

Hadoop Cluster

The Hadoop cluster configuration dialog allows you to specify configuration details such as host names and ports for HDFS, Job Tracker, and other big data cluster components. These settings can be reused in transformation steps and job entries that support this feature.

Option

Definition

Cluster Name

Name that you assign to the cluster configuration.

Use MapR Client

Indicates that this configuration is for a MapR cluster. If this box is checked, the fields in the HDFS and JobTracker sections are disabled because those parameters are not needed to configure MapR.

Hostname (in HDFS section)

Hostname for the HDFS node in your Hadoop cluster.

Port (in HDFS section)

Port for the HDFS node in your Hadoop cluster. 

Username (in HDFS section)

Username for the HDFS node.

Password (in HDFS section)

Password for the HDFS node.

Hostname (in JobTracker section)

Hostname for the JobTracker node in your Hadoop cluster. If you have a separate JobTracker node, enter its hostname here. Otherwise, use the HDFS hostname.

Port (in JobTracker section)

Port for the JobTracker in your Hadoop cluster. This cannot be the same as the HDFS port number.

Hostname (in ZooKeeper section)

Hostname for the ZooKeeper node in your Hadoop cluster.

Port (in ZooKeeper section)

Port for the ZooKeeper node in your Hadoop cluster.

URL (in Oozie section)

Field to enter an Oozie URL. This must be a valid Oozie location.
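The constraints stated in the table above (MapR disabling the HDFS and JobTracker fields, the JobTracker port differing from the HDFS port, and a valid Oozie URL) can be expressed as a small validation routine. The Python sketch below is illustrative only: the `ClusterConfig` class and its field names are hypothetical, and "valid Oozie location" is approximated here by a purely syntactic http(s) URL check.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class ClusterConfig:
    """Hypothetical in-memory form of the dialog fields."""
    name: str
    use_mapr_client: bool
    hdfs_host: str
    hdfs_port: int
    jobtracker_host: str
    jobtracker_port: int
    oozie_url: str


def validate(cfg):
    """Return a list of human-readable problems; empty means no issues found."""
    problems = []
    if not cfg.name:
        problems.append("cluster name is empty")
    # HDFS and JobTracker fields are disabled (and ignored) for MapR.
    if not cfg.use_mapr_client:
        if not cfg.hdfs_host:
            problems.append("HDFS hostname is empty")
        if cfg.jobtracker_port == cfg.hdfs_port:
            problems.append("JobTracker port must differ from the HDFS port")
    # Assumption: only the URL syntax is checked, not server reachability.
    parsed = urlparse(cfg.oozie_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("Oozie URL is not a valid http(s) URL")
    return problems


good = ClusterConfig("dev", False, "namenode.example.com", 8020,
                     "jobtracker.example.com", 8021,
                     "http://oozie.example.com:11000/oozie")
bad = ClusterConfig("", False, "namenode.example.com", 8020,
                    "namenode.example.com", 8020, "oozie.example.com")
print(validate(good))
print(validate(bad))
```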

Oozie Job Executor (Advanced Setup Mode)

If you have not set the Oozie path (oozie.wf.application.path) within your workflow properties file, you can add it in Advanced Setup Mode within the Oozie Job Executor. To access Advanced Setup Mode, click Advanced Options in the Oozie Job Executor dialog.

Advanced Setup Mode lets you supply the needed Oozie path without editing your workflow properties file. The path is not written to the properties file; instead, the Oozie Job Executor adds it to the workflow configuration when the job runs.
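The idea of layering extra arguments over the properties file without modifying it can be sketched as follows. This Python sketch makes several assumptions: the parser handles only simple `key=value` lines (real Java properties files also allow other separators and line continuations), arguments added in the dialog are assumed to take precedence over values from the file, and the function names and paths are hypothetical.

```python
REQUIRED_KEY = "oozie.wf.application.path"


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def effective_configuration(file_text, extra_arguments):
    """Merge dialog arguments over file properties; the file is never rewritten."""
    merged = parse_properties(file_text)
    merged.update(extra_arguments)  # assumption: dialog arguments win
    if not merged.get(REQUIRED_KEY):
        raise ValueError(REQUIRED_KEY + " must be set")
    return merged


# A properties file that lacks the required path, plus one added argument.
file_text = "nameNode=hdfs://namenode.example.com:8020\nqueueName=default\n"
extra = {"oozie.wf.application.path": "${nameNode}/user/etl/wf"}
config = effective_configuration(file_text, extra)
print(config)
```

Because the merge happens in memory at execution time, the same properties file can be shared by several job entries that each supply a different application path.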

Option

Definition

Workflow Properties

Displays the arguments, and their values, that are set within the workflow properties file found at the Oozie URL specified within the Oozie URL field.

Add Argument (green plus button)

Allows you to add a workflow property argument. Use this button to add the required Oozie path if it is not already set. This does not add the path to the properties file; instead, it adds the path to the PDI job, which adds it to the workflow configuration when the job is executed.

Delete Argument (red "x" button)

Allows you to delete an argument. To delete an argument from the Oozie Job Executor, select the argument in Workflow Properties, then click the Delete Argument button.