Mapping

PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.

Description

When you want to re-use a certain sequence of steps, you can turn the repetitive part into a mapping.
A mapping is a regular transformation, except that it can contain Mapping Input and Mapping Output steps that act as placeholders.

  • Mapping input: the placeholder where the mapping expects input from the parent transformation.
  • Mapping output: the placeholder from which the parent transformation reads data.

A mapping is also known as a sub-transformation.

Note: To distinguish the log lines of a mapping, set the KETTLE_LOG_MARK_MAPPINGS variable to 'Y'. Log lines are then preceded by the mapping step name and the name of the mapping itself (available since PDI 5.1).

Options

Option

Description

Step name

Name of the step. Note: This name has to be unique in a single transformation.

Mapping transformation

Use this section to specify the sub-transformation to execute.  You have the following options to specify the sub-transformation:

  • Use a file for the mapping transformation: when this option is enabled, you can enter the .ktr file that is to be used as the sub-transformation. The filename may contain variables (for example, you can use the built-in Internal.Transformation.Filename.Directory variable to construct a filename relative to the current transformation), or you can use the "Browse" button to select a file using a file browser.
  • Use a mapping transformation from the repository: This option is available when connected to a repository. When enabled, you can enter the name and the repository path in the two fields corresponding to this option. Alternatively you can use the "Select" button to browse the repository and point to the transformation stored in the repository.
  • Specify by reference:  
    The following two buttons in this section make it easier to work with the mapped transformation:
  • New transformation: create a new transformation to be used as a mapping. The new transformation is opened in a new tab. It already has a Mapping Input step connected to a Mapping Output step; you have to edit it and add steps in between to create a functional sub-transformation.
  • Edit transformation: open the currently selected transformation in a new tab so you can edit it.
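
As a rough illustration of how a variable-based filename resolves, the sketch below expands `${...}` references in a path string. The `expandVariables` helper, the directory value, and the file name `sub_mapping.ktr` are hypothetical and chosen for illustration only; PDI performs this substitution itself, and this is not its implementation.

```javascript
// Hypothetical helper: expands ${NAME} references using a plain object of variables.
// Unknown variables are left untouched. This only mimics the idea; it is not PDI code.
function expandVariables(text, variables) {
  return text.replace(/\$\{([^}]+)\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(variables, name) ? variables[name] : match;
  });
}

// Assumed example value for the built-in variable.
var variables = {
  "Internal.Transformation.Filename.Directory": "/opt/etl/transformations"
};

// A filename relative to the directory of the current transformation.
var filename = expandVariables(
  "${Internal.Transformation.Filename.Directory}/sub_mapping.ktr",
  variables
);
console.log(filename);
```

Building the filename from the variable keeps the reference valid when the parent transformation and the mapping are moved together to another directory.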

Parameters

The Parameters tab allows you to define or pass Kettle variables down to the mapping.
This will allow you to reach a high degree of customization.
Note: it is possible to include variable expressions in the string values for the variable names.

  • Inherit all variables from the parent transformation: If this option is checked, all variables available in the parent transformation will be available in the sub-transformation, even if they are not explicitly specified in the Parameters tab.
    Important: If this option is not checked, only those variables/values that are specified in the Parameters tab are passed down to the sub-transformation.
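
The effect of the inherit option can be pictured with a small sketch. It reads the note above as "when the option is not checked, only the specified variables are passed down". The function name, the variable names, and the rule that explicitly specified values take precedence over inherited ones are assumptions made for illustration, not documented PDI behaviour.

```javascript
// Hypothetical model of which variables reach the sub-transformation.
// parentVariables: all variables visible in the parent transformation.
// specified: variables listed explicitly in the Parameters tab.
// inheritAll: state of the "Inherit all variables from the parent transformation" option.
function variablesForMapping(parentVariables, specified, inheritAll) {
  var result = {};
  if (inheritAll) {
    Object.keys(parentVariables).forEach(function (name) {
      result[name] = parentVariables[name];
    });
  }
  // Assumption: explicitly specified values win over inherited ones.
  Object.keys(specified).forEach(function (name) {
    result[name] = specified[name];
  });
  return result;
}

var parentVariables = { ENV: "prod", REGION: "eu" };
var specified = { SEPARATOR: "-" };

var withInherit = variablesForMapping(parentVariables, specified, true);
var withoutInherit = variablesForMapping(parentVariables, specified, false);
console.log(withInherit);
console.log(withoutInherit);
```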

Input tabs

Each Input tab corresponds to one Mapping Input specification step in the mapping (sub-transformation). There may also be no Input tabs at all.
That means you can have a number of these tabs in a single Mapping step.

By default, one Input tab is available. You can add Input tabs using the "Add Input" button. To add more than one Input tab, you first need to check the "Allow multiple 'Mapping Input' steps in the sub-transformation" option.
You can remove an Input tab by closing it (click the red X icon on the tab itself).

  • Input source step name: the name of the step in the parent transformation (not the mapping) to read from. This can be any step in the parent transformation with an outgoing hop that is connected to the Mapping step. You can use the "Choose" button to select the step from a list.
  • Mapping target step name: the name of the Mapping Input specification step inside the sub-transformation that is to receive the rows from the Input source step. You can use the "Choose" button to select this step from a list.
  • Is this the main data path?: check this if you have only one input mapping; you can then leave the two fields above empty.
  • Step mapping description: you can add a description to this input step mapping here.
  • Ask these values to be renamed back on output?: Fields get renamed before they are transferred to the mapping transformation. Enabling this option renames them back when they reach the Mapping output step. This makes your sub-transformations more transparent and reusable.
  • Mapping button: use this button to open the field mappings dialog. In this dialog you can specify exactly how the fields from the Input source step are connected to the fields of the Mapping target step.
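
The renaming described for the Input tab, and the optional renaming back on output, can be sketched as follows. The field names and the helpers `renameFields` and `invertMapping` are hypothetical; the sketch only models the idea of a field mapping that is applied on the way in and reversed on the way out.

```javascript
// Hypothetical field mapping for one Input tab: parent field name -> mapping field name.
var fieldMapping = { customer_name: "name", customer_city: "city" };

// Returns a copy of the row with fields renamed according to the mapping;
// fields that are not listed keep their names.
function renameFields(row, mapping) {
  var renamed = {};
  Object.keys(row).forEach(function (field) {
    var target = Object.prototype.hasOwnProperty.call(mapping, field) ? mapping[field] : field;
    renamed[target] = row[field];
  });
  return renamed;
}

// Swaps keys and values so the renaming can be undone on output.
function invertMapping(mapping) {
  var inverted = {};
  Object.keys(mapping).forEach(function (from) {
    inverted[mapping[from]] = from;
  });
  return inverted;
}

var parentRow = { customer_name: "Ada", customer_city: "London", id: 7 };
var rowInsideMapping = renameFields(parentRow, fieldMapping);
var rowBackInParent = renameFields(rowInsideMapping, invertMapping(fieldMapping));
console.log(rowInsideMapping);
console.log(rowBackInParent);
```

With the renaming reversed on output, the parent transformation sees its own field names again and does not need to know the names used inside the mapping.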

Output tabs

Each Output tab corresponds to one Mapping Output specification step in the mapping (sub-transformation). There may also be no Output tabs at all.
That means you can have a number of these tabs in a single Mapping step. 

By default, one Output tab is available. You can add Output tabs using the "Add Output" button. To add more than one Output tab, you first need to check the "Allow multiple 'Mapping Output' steps in the sub-transformation" option.
You can remove an Output tab by closing it (click the red X icon on the tab itself).

  • Mapping source step: the name of a Mapping output specification step in the sub-transformation where we will read from. You can use the "Choose" button to select this step from a list.
  • Output target step name: the name of the step in the current transformation (parent) that is to receive the rows from the Mapping source step. This can be any step whose incoming hop is connected to the Mapping step. You can use the "Choose" button to select this step from a list.
  • Is this the main data path?: check this if you have only one output mapping; you can then leave the two fields above empty.
  • Step mapping description: you can add a description to this output step mapping here.
  • Mapping button: use this button to open the field mappings dialog. In this dialog you can specify exactly how the fields from the Mapping source step are connected to the fields of the Output target step.
    Note: The 'Mapping' button is only enabled in the Output Mapping fields tab when multiple Output paths are available and the 'Is this the main data path?' option is disabled, so that both steps (Mapping source step and Output target step) are given for the mapping.

Allow multiple 'Mapping Input' steps in the sub-transformation

When checked, the sub-transformation can have multiple Mapping input specification steps that receive data from the parent transformation. In this case, the Add Input button is enabled so you can add multiple Input tabs to specify the mapping for each Mapping input specification. When not checked, 1 Mapping input specification step (and hence, 1 Input tab) is presumed.

Add input

Use this button to add a tab to specify an input mapping for the specified sub-transformation. This button will always be enabled when the "Allow multiple 'Mapping Input' steps in the sub-transformation" option is checked. When that option is not checked, the "Add Input" button will be enabled only in case there are no input tabs present.

Allow multiple 'Mapping Output' steps in the sub-transformation

When checked, the sub-transformation can have multiple Mapping output specification steps that send data to the parent transformation. In this case, the Add Output button is enabled so you can add multiple Output tabs to specify the mapping for each Mapping output specification. When not checked, 1 Mapping output specification step (and hence, 1 Output tab) is presumed.

Add output

Use this button to add a tab to specify an output mapping for the specified sub-transformation. This button will always be enabled when the "Allow multiple 'Mapping Output' steps in the sub-transformation" option is checked. When that option is not checked, the "Add Output" button will be enabled only in case there are no output tabs present.
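
The enablement rule stated for the "Add Input" and "Add Output" buttons can be summarised in a short sketch. The function name and parameters are hypothetical; the logic simply restates the two conditions described above.

```javascript
// Hypothetical model of when an "Add Input" / "Add Output" button is enabled.
// allowMultiple: state of the matching "Allow multiple ..." option.
// existingTabCount: number of Input (or Output) tabs currently present.
function isAddButtonEnabled(allowMultiple, existingTabCount) {
  return allowMultiple || existingTabCount === 0;
}

console.log(isAddButtonEnabled(true, 3));   // option checked: always enabled
console.log(isAddButtonEnabled(false, 1));  // option unchecked, one tab present
console.log(isAddButtonEnabled(false, 0));  // option unchecked, no tabs present
```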

Example

You can find the samples described below in your distribution:

samples/mapping/Mapping - simple mapping.ktr
samples/mapping/Mapping - use simple mapping.ktr

Suppose we have a JavaScript step that we want to re-use over and over in several transformations (a simple concatenation to demonstrate the point) :
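
The sample's actual script is not reproduced here. A minimal concatenation of the kind described might look like the sketch below; the sample values are assumptions, and in a PDI JavaScript step the two input fields would arrive from the incoming row rather than being declared in the script.

```javascript
// In a PDI JavaScript step the incoming fields are exposed as script variables.
// They are declared here with sample values only so the snippet runs standalone.
var leftValue = "Hello, ";
var rightValue = "world";

// The reusable logic: concatenate the two input strings into the output field "res".
var res = leftValue + rightValue;
console.log(res);
```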

As you can see, the input fields that the script needs are: leftValue and rightValue, 2 Strings.
What we can then do is define those in a "Mapping Input" step:

The calculated value "res" is a field we want to pass to the parent transformations, so we add a "Mapping Output" step as well.
Remember that Mapping Input and Mapping Output are placeholders; they contain no actual logic.
The resulting mapping looks like this:

Now that our mapping is done, let's try to use it...

In this example, there are 2 fields coming into the Mapping step "X=A+B": A and B.
That means that a mapping has to be made between:

  • "A" and "leftValue"
  • "B" and "rightValue"
  • "res" and "X", the result field
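
The three correspondences listed above can be traced end to end with a small sketch. The function `mappingStep` is a hypothetical stand-in for the Mapping step "X=A+B", and keeping A and B in the output row is an assumption made for readability.

```javascript
// Hypothetical stand-in for the Mapping step "X=A+B" in the parent transformation.
function mappingStep(parentRow) {
  // Input tab: A -> leftValue, B -> rightValue.
  var mappingRow = { leftValue: parentRow.A, rightValue: parentRow.B };

  // Logic of the sub-transformation: the concatenation that produces "res".
  mappingRow.res = mappingRow.leftValue + mappingRow.rightValue;

  // Output tab: res -> X. Assumption: A and B stay on the row in the parent.
  return { A: parentRow.A, B: parentRow.B, X: mappingRow.res };
}

var result = mappingStep({ A: "foo", B: "bar" });
console.log(result);
```

The parent transformation only ever sees A, B, and X; the names leftValue, rightValue, and res stay internal to the mapping.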

This can be achieved in the "Input" and "Output" tabs of the Mapping dialog. Here is a screenshot of the Input tab:

Note: In our sample, we use only one input and one output mapping. It is possible, however, to use zero, one, or more input or output mappings in a mapping transformation.
That means that we need to be able to specify which input or output we're addressing in the various tabs. That is where the various step name choices in the screenshot come from.
In our simple case, we simply checked the "Is this the main data path?" option.

Here is a screenshot of the Output tab: