Transformation (job entry)

PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.

Description

The Transformation job entry is used to execute a previously defined transformation.

For ease of use, you can also create a new transformation from within the dialog by pressing the New Transformation button.

Options

Transformation specification tab

Option

Description

Name of the Job Entry

The unique name of the job entry on the canvas. A job entry can be placed on the canvas several times; however, it will be the same job entry.

Transformation Filename

If you are not working in a repository, specify the XML file name of the transformation to start. Click the button to browse through your local files.

Specify by Name and Directory

If you are working in the DI Repository (or a database repository), specify the name of the transformation to start. Click the Browse In Jobs button to browse through the repository.

Specify by Reference

If you specify a transformation or job by reference, you can rename or move it within the repository without breaking the link, because the reference (identifier) is stored rather than the name and directory. This option is available when working with the DI Repository or a database repository.

Advanced

Option

Description

Copy previous results to args

The results from a previous transformation can be copied as arguments of the transformation using the "Copy rows to result" step. If Execute for every input row is enabled, each row is a set of command-line arguments to be passed into the transformation; otherwise, only the first row is used to generate the command-line arguments.

Copy previous results to parameters

The results from a previous transformation can be copied as parameters of the transformation using the "Copy rows to result" step.

Execute for every input row

Allows a transformation to be executed once for every input row (looping).

Clear the list of result rows before execution

Checking this makes sure that the list of result rows is cleared before the transformation is started.

Clear the list of result files before execution

Checking this makes sure that the list of result files is cleared before the transformation is started.

Run this transformation in a clustered mode

Allows you to execute the job or transformation in a clustered environment. See Running a Transformation for more details on how to execute a transformation in a clustered environment.

Log remote execution locally

If enabled, transfer the log lines from the cluster nodes to the local node.

Remote slave server

Specifies the slave server where the transformation will be run.

Wait for the remote transformation to finish

If enabled, the job is blocked until the transformation has completed on the slave server.

Follow local abort to remote transformation

If enabled, an abort signal sent locally will also be sent remotely.
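The interaction between "Copy previous results to args" and "Execute for every input row" described above can be sketched in a few lines of Python. This only illustrates the documented behavior and is not Pentaho code: the function name `argument_sets` and the sample rows are hypothetical, and running zero times when there are no result rows in looping mode is an assumption.

```python
def argument_sets(result_rows, execute_for_every_row):
    """Return one argument list per planned run of the transformation.

    result_rows: rows produced by a previous "Copy rows to result" step.
    execute_for_every_row: mirrors the "Execute for every input row" option.
    """
    if execute_for_every_row:
        # Looping mode: every row becomes its own set of arguments.
        # Assumption: with no rows, the transformation does not run at all.
        return [list(row) for row in result_rows]
    if result_rows:
        # Single run: only the first row supplies the arguments.
        return [list(result_rows[0])]
    # Single run with no copied arguments.
    return [[]]


# Hypothetical result rows from a previous transformation.
rows = [("a.csv", "2005"), ("b.csv", "2006")]
for args in argument_sets(rows, execute_for_every_row=True):
    print("run with arguments:", args)
```

With the option disabled, the same rows would yield a single run that uses only `["a.csv", "2005"]`.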

Logging Settings tab

By default, if you do not set logging, Pentaho Data Integration will take log entries that are being generated and create a log record inside the job. For example, suppose a job has three transformations to run and you have not set logging. The transformations will not output logging information to other files, locations, or special configuration. In this instance, the job executes and puts logging information into its master job log.

In most instances, it is acceptable for logging information to be available in the job log. For example, if you have load dimensions, you want logs for your load dimension runs to display in the job logs. If there are errors in the transformations, they will be displayed in the job logs. If, however, you want all your log information kept in one place, you must set up logging.

Option

Description

Specify logfile

Enable to specify a separate log file for the execution of this transformation.

Append logfile

Enable to append to the log file instead of creating a new one.

Name of log file

The directory and base name of the log file (for example, C:\logs).

Create parent folder

Enable to create a parent folder for the log file if it does not exist.

Extension of logfile

The file name extension; for example, log or txt.

Include date in filename

Adds the system date to the filename with format YYYYMMDD (_20051231).

Include time in filename

Adds the system time to the filename with format HHMMSS (_235959).

Logging level

Specifies the logging level for the execution of the transformation. See also the logging window in Logging.
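As an illustration of how the name, date, time, and extension options above combine, here is a minimal Python sketch. It is not Pentaho's implementation: the function `log_file_name` and the base name `C:\logs\load_dim` are hypothetical, and the ordering (date, then time, then a dot and the extension) is an assumption based on the `_20051231` and `_235959` examples.

```python
from datetime import datetime


def log_file_name(base, extension, add_date=False, add_time=False, now=None):
    """Build a log file name from the options on the Logging Settings tab."""
    now = now or datetime.now()
    name = base
    if add_date:
        # "Include date in filename": append _YYYYMMDD.
        name += "_" + now.strftime("%Y%m%d")
    if add_time:
        # "Include time in filename": append _HHMMSS.
        name += "_" + now.strftime("%H%M%S")
    if extension:
        # "Extension of logfile": assumed to follow a dot.
        name += "." + extension
    return name


# Hypothetical base name; the timestamp matches the examples in the table.
stamp = datetime(2005, 12, 31, 23, 59, 59)
print(log_file_name(r"C:\logs\load_dim", "log", add_date=True, add_time=True, now=stamp))
```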

Argument tab

Option

Description

Arguments

Specify which command-line arguments will be passed to the transformation.

Parameters tab

Specify which parameters will be passed to the transformation:

Option

Description

Pass all parameter values down to the sub-transformation

Enable this option to pass all parameters of the job down to the sub-transformation.

Parameters

Specify the parameter name that will be passed to the transformation.

Stream column name

Allows you to capture fields of incoming records of a result set as a parameter.

Value

Allows you to specify the values for the transformation's parameters. You can do this by:

  • Manually typing a value (Ex: ETL Job)
  • Using a parameter to set the value (Ex: ${Internal.Job.Name})
  • Using a combination of manually specified values and parameter values (Ex: ${FILE_PREFIX}_${FILE_DATE}.txt)
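The three ways of filling in the Value column can be pictured with a small Python sketch of `${NAME}` substitution. This is only an approximation for illustration, not Pentaho's variable resolver: the function `resolve`, the sample variable values, and the choice to leave unknown variables untouched are all assumptions.

```python
import re

# Matches ${NAME}, where NAME is any run of characters other than "}".
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve(value, variables):
    """Replace each ${NAME} in value with variables[NAME].

    Assumption: a variable that is not defined is left in place unchanged.
    """
    return _VARIABLE.sub(lambda m: variables.get(m.group(1), m.group(0)), value)


# Hypothetical variable values for the examples in the list above.
variables = {
    "Internal.Job.Name": "nightly_load",
    "FILE_PREFIX": "sales",
    "FILE_DATE": "20051231",
}

print(resolve("ETL Job", variables))                          # typed value
print(resolve("${Internal.Job.Name}", variables))             # parameter only
print(resolve("${FILE_PREFIX}_${FILE_DATE}.txt", variables))  # combination
```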