ETL Metadata Injection

PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.

Introduction

The ETL Metadata Injection step injects step metadata into a template transformation.  Instead of entering ETL metadata statically in a step dialog, you pass it at run time.  This helps with repetitive ETL workloads such as loading text files and data migration.
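The idea can be illustrated outside of Kettle with a minimal sketch. It does not use the Pentaho API; the dictionary layout, the `inject` function, and the file and table names are all hypothetical and only model a template whose settings are left empty at design time and filled in at run time.

```python
import copy

# Hypothetical, simplified model of a template transformation: every step
# maps to its settings, and the metadata is left empty at design time.
TEMPLATE = {
    "CSV File Input": {"filename": None, "delimiter": None, "fields": []},
    "Table Output": {"table": None},
}


def inject(template, metadata):
    """Return a copy of the template with run-time metadata filled in."""
    result = copy.deepcopy(template)  # the template itself stays untouched
    for step_name, settings in metadata.items():
        if step_name not in result:
            raise KeyError(f"template has no step named {step_name!r}")
        result[step_name].update(settings)
    return result


# The same template serves several files; only the injected metadata differs.
customers = inject(TEMPLATE, {
    "CSV File Input": {
        "filename": "customers.csv",
        "delimiter": ";",
        "fields": [("id", "Integer"), ("name", "String")],
    },
    "Table Output": {"table": "customers"},
})
orders = inject(TEMPLATE, {
    "CSV File Input": {
        "filename": "orders.csv",
        "delimiter": ",",
        "fields": [("order_id", "Integer"), ("total", "Number")],
    },
    "Table Output": {"table": "orders"},
})
```

In this sketch one template yields two concrete transformations, which is the pattern the step is meant to support for repetitive loads.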

The following steps support Metadata Injection:

Step                           Version Introduced  Fields Supporting Metadata Injection
-----------------------------  ------------------  ------------------------------------
Concat Fields                  5.1                 All fields
CSV File Input                 4.1                 See CSV File Input for a list of supported fields
Data Grid                      5.1                 All fields
Fixed File Input               4.1                 See Fixed File Input for a list of supported fields
Get Data from XML              5.0                 See Get Data from XML for a list of supported fields
GZIP CSV Input                 5.1                 All fields
Group By                       5.0                 All fields
JSON Output                    5.2                 All fields
Microsoft Access Input         5.0                 See Microsoft Access Input for a list of supported fields
Microsoft Excel Input          4.1                 See Microsoft Excel Input for a list of supported fields
Microsoft Excel Output         5.1                 See Microsoft Excel Output for a list of supported fields
Microsoft Excel Writer         5.3                 See Microsoft Excel Writer for a list of supported fields
Pentaho Reporting Output       5.0                 All fields
PostgreSQL Bulk Loader         5.1                 All fields
Row Denormaliser               4.2                 See Row Denormaliser for a list of supported fields
Row Normaliser                 4.2                 See Row Normaliser for a list of supported fields
Select Values                  4.1                 All fields
Sort Rows                      5.0                 All fields
Split Field                    5.0                 See Split Field for a list of supported fields
Table Input                    5.2                 See Table Input for a list of supported fields
Table Output                   5.1                 See Table Output for a list of supported fields
Text File Input                5.0                 All fields
Text File Output (Deprecated)  5.2                 All fields
User Defined Java Expression   5.2                 All fields

Options

  • Transformation template: In this section of the dialog, specify the transformation to use as a template.  Once a transformation is specified, you can use the "Validate and Refresh" button (4.4).  The "Edit" button opens the specified template in a new tab in Spoon.
  • Source step to read from (optional): If you specify a step from the template here, the output of the "ETL Metadata Injection" step will be the output of that step.
  • Optional target file (ktr after injection): For debugging or transformation generation, you can save the resulting transformation, after metadata injection, to a file.  To do so, specify a file name, for example "result.ktr".
  • Don't execute resulting transformation: Select this option if you prefer not to execute the resulting transformation (after metadata injection).
  • Field mapping: Click any row in the metadata tree table to open a dialog for selecting the source step and field.
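How these options interact can be sketched roughly as follows. This is an illustration only, not the Kettle implementation: the step name "Read field layout", the attribute keys FIELD_NAME and FIELD_TYPE, the functions `resolve` and `apply_options`, and the XML written to the target file are hypothetical stand-ins (the XML is not the real .ktr schema).

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Rows produced by a hypothetical source step in the current transformation.
source_rows = {
    "Read field layout": [
        {"column": "id", "kind": "Integer"},
        {"column": "name", "kind": "String"},
    ],
}

# Field mapping: (target step, target attribute) -> (source step, source field).
mapping = {
    ("CSV File Input", "FIELD_NAME"): ("Read field layout", "column"),
    ("CSV File Input", "FIELD_TYPE"): ("Read field layout", "kind"),
}


def resolve(mapping, source_rows):
    """Collect, per target step and attribute, the values of the mapped source field."""
    injected = {}
    for (target_step, attribute), (source_step, source_field) in mapping.items():
        values = [row[source_field] for row in source_rows[source_step]]
        injected.setdefault(target_step, {})[attribute] = values
    return injected


def apply_options(injected, target_file=None, execute=True, run=None):
    """Optionally save the injected result, then optionally execute it."""
    if target_file:
        # Simplified XML stand-in for the saved transformation.
        root = ET.Element("transformation")
        for step_name, attributes in injected.items():
            step = ET.SubElement(root, "step", name=step_name)
            for key, values in attributes.items():
                ET.SubElement(step, "attribute", key=key).text = ",".join(values)
        ET.ElementTree(root).write(target_file, encoding="utf-8")
    if execute and run is not None:
        return run(injected)
    return None  # "Don't execute resulting transformation" was selected


injected = resolve(mapping, source_rows)
path = os.path.join(tempfile.mkdtemp(), "result.ktr")
outcome = apply_options(injected, target_file=path, execute=False)
```

With `execute=False` and a target file set, the sketch only writes the injected result, which mirrors the debugging use case described for the target file option.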

Data Streaming

Since version 5.1, this step can stream data from one transformation into another.

To pass data from your template transformation (after injection, during execution) to your current transformation, specify the "Template step to read from".  You can also specify the expected output fields so that it's easier to design the steps which come after the ETL Metadata Injection step.

To pass data from a source step into the template transformation (again, after injection) you can specify the "Streaming source step" and the "Streaming target step" in the template transformation.
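Both directions of streaming can be pictured with Python generators. This is a conceptual sketch, not Kettle code: the function names, the row layout, and the `multiplier` parameter (standing in for injected metadata) are assumptions made for illustration.

```python
def streaming_source():
    """Plays the "Streaming source step" in the current transformation."""
    yield {"id": 1, "amount": 10}
    yield {"id": 2, "amount": 25}


def template_transformation(rows, multiplier):
    """Stand-in for the template after injection.

    The loop input plays the "Streaming target step"; the yielded rows play
    the output of the template step that the current transformation reads from.
    """
    for row in rows:
        yield {**row, "amount": row["amount"] * multiplier}


# Expected output fields, declared up front so later steps can be designed
# before the template has ever run.
EXPECTED_FIELDS = ["id", "amount"]


def downstream(rows):
    """Steps that come after the ETL Metadata Injection step."""
    return [tuple(row[name] for name in EXPECTED_FIELDS) for row in rows]


result = downstream(template_transformation(streaming_source(), multiplier=2))
```

Rows are pulled through one at a time, so nothing is materialised between the two transformations until the downstream steps consume the output.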

Documentation links

Articles

http://diethardsteiner.blogspot.com/2011/07/metadata-driven-etl-and-reporting.html

http://www.ibridge.be/?p=194

http://www.ambientbi.co.uk/?p=634

Non Native metadata injection
