Transformation Executor

Note: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.

Description

The Transformation Executor step lets you execute a Pentaho Data Integration transformation from within another transformation. It is similar to the Job Executor step but runs transformations instead of jobs.
By default, the specified transformation is executed once for each input row. This row can be used to set parameters and variables, and it is passed to the transformation in the form of a result row.
You can also pass a group of rows at a time, either based on the value in a field (the transformation is executed when the value changes) or based on time. In these cases, the first row of the group of rows is used to set parameters or variables in the transformation.
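
The default behavior described above can be pictured with a small Python simulation. This is only a sketch of the semantics, not PDI code: `execute_per_row`, `run_child`, and `param_fields` are hypothetical names introduced for illustration, and `run_child` stands in for whatever actually runs the child transformation.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def execute_per_row(
    rows: List[Row],
    run_child: Callable[[Dict[str, str], List[Row]], object],
    param_fields: Dict[str, str],
) -> List[object]:
    """Simulate the default mode: one child execution per input row."""
    outcomes = []
    for row in rows:
        # Field values of the current row set the child's parameters/variables.
        params = {name: str(row[field]) for name, field in param_fields.items()}
        # The same row is handed to the child as its single result row.
        outcomes.append(run_child(params, [row]))
    return outcomes
```

With two input rows and a mapping such as `{"REGION": "region"}`, the stand-in child is called twice, each time with one result row and with `REGION` set from that row.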

It is possible to launch multiple copies of this step to facilitate parallel transformation processing.

Note: This step does not abort when the transformation it calls errors out. To control the flow or abort the parent transformation in case of errors, specify the fields and a target step on the "Execution results" tab to get the number of errors. This behavior was fixed by PDI-12759 in PDI version 5.3.

Note: In the current implementation, the log of the parent transformation contains only the last processed batch of data. It was implemented this way to limit the strain on the logging back end. To obtain the detailed log of the child transformation, define a target step on the "Execution results" tab and read the field that holds the execution logging text (by default ExecutionLogText).
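
The two notes above suggest checking the execution results yourself. In PDI this would normally be wired up with steps in the parent transformation; the Python sketch below only illustrates the logic. The field name `ExecutionLogText` comes from the note above, while `ExecutionNrErrors` is an assumed name for the error-count field, and `abort_on_child_errors` and `ChildTransformationFailed` are hypothetical names used for illustration.

```python
from typing import Dict, Iterable, List


class ChildTransformationFailed(Exception):
    """Raised when an execution-result row reports errors."""


def abort_on_child_errors(
    result_rows: Iterable[Dict[str, object]],
    errors_field: str = "ExecutionNrErrors",  # assumed field name
    log_field: str = "ExecutionLogText",      # default named in the note above
) -> List[Dict[str, object]]:
    """Pass result rows through, failing on the first one that reports errors."""
    checked = []
    for row in result_rows:
        errors = int(row.get(errors_field) or 0)
        if errors > 0:
            # Surface the child's log text so the failure can be diagnosed.
            raise ChildTransformationFailed(
                f"{errors} error(s): {row.get(log_field, '')}"
            )
        checked.append(row)
    return checked
```

Rows without errors pass through unchanged; the first row with a positive error count stops processing and carries the child's log text in the exception message.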

Options

Option

Description

Step name

Name of the step. Note: This name has to be unique in a single transformation.

Transformation

Use this section to specify the transformation to execute.  You have the following options to specify the transformation:

  • Use a file for the transformation: when this option is enabled, you can enter the .ktr file that is to be used as the transformation. The filename may contain variables (for example, you can use the built-in Internal.Transformation.Filename.Directory variable to construct a filename relative to the current transformation), or you can use the "Browse" button to select a file using a file browser.
  • Use a transformation from the repository: this option is available when connected to a repository. When enabled, you can enter the name and the repository path in the two fields corresponding to this option. Alternatively, you can use the "Select" button to browse the repository and point to the transformation stored in the repository.
  • Specify by reference:

The following two buttons in this section make it easier to work with the transformation:

  • New transformation: create a new transformation to be used. The new transformation will be opened in a new tab.
  • Edit transformation: open the currently selected transformation in a new tab so you can edit it.
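
To illustrate the variable-based filename mentioned above, the following Python sketch mimics how a `${...}` variable reference in a filename could be expanded. It is not PDI's own implementation: `resolve_filename` is a hypothetical helper, the directory `/opt/etl` and the file `child.ktr` are made-up values, and leaving unknown variables untouched is a choice made for this sketch.

```python
import re
from typing import Dict

# Matches Kettle-style variable references such as ${SOME.VARIABLE}.
_VARIABLE = re.compile(r"\$\{([^}]+)\}")


def resolve_filename(template: str, variables: Dict[str, str]) -> str:
    """Replace each ${NAME} in the template with its value, if one is known."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        # Unknown variables are left as written (a choice made for this sketch).
        return variables.get(name, match.group(0))

    return _VARIABLE.sub(substitute, template)
```

For example, with `Internal.Transformation.Filename.Directory` set to `/opt/etl`, the template `${Internal.Transformation.Filename.Directory}/child.ktr` resolves to `/opt/etl/child.ktr`.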

Parameter Options tab

On this tab you can specify which field to use to set a certain parameter or variable value. If multiple rows are passed to the transformation, the first row is used to set the parameters or variables.

Option

Description

Variable / Parameter name

Name of the Kettle variable or parameter to define or pass down to the transformation.

Field to use

Specify which field to use to set a certain parameter or variable value. If you specify an input field to use, the static input value is not used.

Static input value

Instead of a field to use you can specify a static value here.

If you enable the "Inherit all variables from the transformation" option, all the variables defined in the parent transformation are passed to the executed transformation.

A button in the lower right corner of the tab inserts all the defined parameters of the specified transformation. For reference, the description of each parameter is inserted into the static input value field.
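
The precedence rules on this tab (an input field overrides the static value, and the first row of a group supplies the values) can be summarized in a short Python sketch. The names `resolve_parameters`, `mappings`, `field`, and `static` are hypothetical and exist only for this illustration; they do not correspond to a PDI API.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def resolve_parameters(
    mappings: List[Dict[str, Optional[str]]],
    group: List[Row],
) -> Dict[str, str]:
    """Resolve parameter values for one execution of the child transformation.

    Each mapping has a "name" plus either a "field" (input field to read)
    or a "static" value. Only the first row of the group is consulted.
    """
    first_row = group[0] if group else {}
    resolved = {}
    for mapping in mappings:
        field = mapping.get("field")
        if field:
            # A configured input field wins; the static value is ignored.
            value = first_row.get(field)
            resolved[mapping["name"]] = "" if value is None else str(value)
        else:
            resolved[mapping["name"]] = mapping.get("static") or ""
    return resolved
```

Given a group of two rows, only the values of the first row end up in the resolved parameters, and a mapping that names both a field and a static value takes its value from the field.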

Row grouping Options tab

On this tab you can specify the number of input rows that are passed to the transformation in the form of result rows. You can read these result rows with a Get rows from result step in the executed transformation.

Option

Description

The number of rows to send to the transformation

After every X rows, the transformation is executed and these X rows are passed to it.

Field to group rows on

Rows will be accumulated in a group as long as the field value stays the same. If the value changes the transformation will be executed and the accumulated rows will be passed to the transformation.

The time to wait collecting rows before execution

The time, in milliseconds, that the step spends accumulating rows before executing the transformation.
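
The first two grouping options can be illustrated with a small Python simulation in which each yielded batch stands for one execution of the transformation. The function names `group_by_count` and `group_by_field` are hypothetical, and flushing the remaining rows when the input ends is an assumption of this sketch rather than something stated above. Time-based grouping is left out because it depends on wall-clock timing.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def group_by_count(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield a batch after every `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Assumption: leftover rows are still passed on at end of input.
        yield batch


def group_by_field(rows: Iterable[Row], field: str) -> Iterator[List[Row]]:
    """Yield a batch each time the value of `field` changes."""
    batch: List[Row] = []
    for row in rows:
        if batch and row[field] != batch[-1][field]:
            yield batch
            batch = []
        batch.append(row)
    if batch:
        # Assumption: the final group is passed on at end of input.
        yield batch
```

Note that grouping on a field only compares consecutive rows, so the input would need to be sorted on that field for each value to end up in a single group.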

Result tabs

Please see the Job executor step - the usage is identical.

Example

For an example, see http://jira.pentaho.com/browse/PDI-12204 (with current issues in 5.0.6).