...

As of version 3.1.0, Pentaho Data Integration provides a way to track the performance of individual steps in a transformation. This is an important tool for fine-tuning a transformation, because its overall performance is determined by its slowest step.

Enabling monitoring

You can enable the step performance monitoring in the transformation settings dialog:

...

As you can see, this option is NOT enabled by default because it may cause memory consumption problems for long-running transformations. By default, a performance snapshot is taken every second for all running steps. This is not a CPU-intensive operation and it should not have a big impact on performance. However, if you have a lot of steps or take a lot of snapshots (several per second, for example), there will be some negative impact on performance that you should be aware of.

Saving step performance logging

All the step performance data is kept in memory during the execution of the transformation. However, at the end of the transformation, you can opt to save the data into a logging table:

...

CREATE TABLE L_STEP
(
  ID_BATCH INT
, SEQ_NR INT
, LOGDATE DATETIME
, TRANSNAME VARCHAR(255)
, STEPNAME VARCHAR(255)
, STEP_COPY INT
, LINES_READ BIGINT
, LINES_WRITTEN BIGINT
, LINES_UPDATED BIGINT
, LINES_INPUT BIGINT
, LINES_OUTPUT BIGINT
, LINES_REJECTED BIGINT
, ERRORS BIGINT
, INPUT_BUFFER_ROWS BIGINT
, OUTPUT_BUFFER_ROWS BIGINT
)
;
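
Once snapshots have been saved, the table can be queried to look for the slowest step. The query below is only a sketch under stated assumptions: it assumes that each row in L_STEP is one snapshot of one step copy, that ID_BATCH identifies a single run of the transformation, and that a step whose input buffer stays full while its output buffer stays nearly empty is a likely bottleneck. The batch ID value is hypothetical.

```sql
-- Sketch only: assumes one row per snapshot per step copy,
-- and that ID_BATCH identifies a single transformation run.
SELECT
  STEPNAME
, STEP_COPY
, COUNT(*)                AS SNAPSHOTS
, AVG(INPUT_BUFFER_ROWS)  AS AVG_INPUT_BUFFER_ROWS
, AVG(OUTPUT_BUFFER_ROWS) AS AVG_OUTPUT_BUFFER_ROWS
FROM L_STEP
WHERE ID_BATCH = 1  -- hypothetical batch ID; replace with the run to inspect
GROUP BY STEPNAME, STEP_COPY
ORDER BY AVG_INPUT_BUFFER_ROWS DESC
;
```

Steps at the top of the result are the ones whose input buffers were fullest on average, which makes them reasonable first candidates when tuning.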

Performance graphs


If you configured step performance monitoring as shown above (the database logging is optional, of course), you can also get performance evolution graphs by using the "Graph" button in the logging tab of the running transformation.

...