Single Threader

PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.

Description

This step is similar to the Mapping step: it calls a sub transformation. The sub transformation must contain a Mapping Input Specification.

The Single Threader step executes the sub transformation with the single-threaded engine, which can help in several scenarios, for example:

1) It reduces the overhead of passing data between steps and of thread context switching, which causes problems in transformations with a lot of steps. More details can be found in Matt Casters' blog post about the Single Threader step.

2) Another use case is real-time streaming, e.g. sorting the data rows that arrive within a specific time frame, or sorting a specific number of rows at a time.

3) Processing data in chunks, where processing pauses after each chunk and then continues with the next one.
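The scenarios above can be pictured with a small Python sketch. This is not PDI code and does not reflect how the single-threaded engine is implemented: `chunks`, `run_single_threaded`, `sort_by_id`, and `add_flag` are hypothetical names chosen for illustration. It only shows the general idea of running all steps of a sub transformation over one chunk in a single thread before the next chunk is read, with a sort that applies per chunk.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List

Row = dict
Step = Callable[[List[Row]], List[Row]]


def chunks(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_single_threaded(rows: Iterable[Row], steps: List[Step],
                        batch_size: int) -> Iterator[Row]:
    """Run every step over one chunk, in the calling thread,
    before the next chunk is read (no per-step threads or queues)."""
    for chunk in chunks(rows, batch_size):
        for step in steps:
            chunk = step(chunk)
        yield from chunk


# Hypothetical steps standing in for the sub transformation.
def sort_by_id(chunk: List[Row]) -> List[Row]:
    return sorted(chunk, key=lambda r: r["id"])


def add_flag(chunk: List[Row]) -> List[Row]:
    return [{**r, "seen": True} for r in chunk]


rows = [{"id": i} for i in (3, 1, 2, 6, 5, 4)]
result = list(run_single_threaded(rows, [sort_by_id, add_flag], batch_size=3))
ids = [r["id"] for r in result]
```

With a chunk size of 3, the rows are sorted within each chunk only, so the ordering is per chunk rather than over the whole stream.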

Options

| Option | Description |
|---|---|
| Step name | Name of the step; this name has to be unique in a single transformation. |
| Mapping transformation | The sub transformation to execute. |
| Injector step | The step in the sub transformation that has the Mapping Input Specification. |
| Retrieval step | The step in the sub transformation that returns the rows. |
| Batch size | The number of rows to process in one chunk. |
| Batch time (ms) | The time in milliseconds to collect incoming rows before they are processed as one chunk. |
| Parameters | The parameters to define for, or pass on to, the sub transformation. |

Note: Processing of a chunk starts as soon as either the defined Batch size or the defined Batch time is reached.
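The interaction of Batch size and Batch time can be sketched in Python as follows. This is an illustration under stated assumptions, not the step's actual implementation: `batch_rows` is a hypothetical function, arrival times are passed in explicitly so the example is deterministic, and the time limit is only checked when a row arrives.

```python
from typing import Iterable, Iterator, List, Tuple


def batch_rows(timed_rows: Iterable[Tuple[float, str]],
               batch_size: int,
               batch_time_ms: float) -> Iterator[List[str]]:
    """Group (arrival_ms, row) pairs into chunks.

    A chunk is released when it holds `batch_size` rows, or when a row
    arrives `batch_time_ms` or more after the chunk's first row.
    """
    chunk: List[str] = []
    started = 0.0
    for arrival_ms, row in timed_rows:
        # Time limit reached: release what has been collected so far.
        if chunk and arrival_ms - started >= batch_time_ms:
            yield chunk
            chunk = []
        if not chunk:
            started = arrival_ms  # first row of a new chunk
        chunk.append(row)
        # Size limit reached: release the full chunk.
        if len(chunk) >= batch_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # remaining rows at end of input


timed = [(0, "a"), (10, "b"), (20, "c"), (30, "d"), (500, "e"), (510, "f")]
batches = list(batch_rows(timed, batch_size=3, batch_time_ms=100))
```

Here the first chunk is released because it reaches three rows, the second because more than 100 ms pass before the next row arrives, and the last because the input ends.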

See also the sample that is delivered with PDI 4.3 in samples/real-time-streaming/SingleThreaderTestOverlapping.ktr. The sample for 4.2 is also available over here.