JSON Input


PLEASE NOTE: This documentation applies to an earlier version. For the most recent documentation, visit the Pentaho Enterprise Edition documentation site.


Description

The JSON Input step extracts relevant portions of JSON structures, read either from files or from incoming fields, and outputs them as rows.

Options

File Tab

The File tab is where you enter basic connection information for accessing a resource.

Option

Definition

Step name

Name of this step as it appears in the transformation workspace

Source is from a previous step

Retrieves the source from a previously defined field

Select field

The field name to use as a source from a previous step

Use field as file names

Indicates source is a filename

Read source as URL

Indicates a source should be accessed as a URL

Do not pass field downstream

The source field will be removed from the output stream. This improves performance and memory utilization with large JSON fields.

File or directory

Indicates the location of the source if the source is not defined in a field

Regular expression

All filenames that match this regular expression are selected if a directory is specified

Exclude regular expression

All filenames that match this regular expression are excluded if a directory is specified

Show filename

Displays the file names of the connected source
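
The interaction between the Regular expression and Exclude regular expression options can be sketched in a few lines of Python. This is an illustration only, not the step's implementation: the function name `select_files` and the sample file names are hypothetical, and the sketch assumes that a pattern must match the whole file name.

```python
import re

def select_files(names, include_regex, exclude_regex=None):
    """Keep names that match include_regex and do not match exclude_regex."""
    include = re.compile(include_regex)
    exclude = re.compile(exclude_regex) if exclude_regex else None
    selected = []
    for name in names:
        # Assumption: the pattern has to match the entire file name.
        if not include.fullmatch(name):
            continue
        if exclude is not None and exclude.fullmatch(name):
            continue
        selected.append(name)
    return selected

# Hypothetical directory listing
files = ["orders.json", "orders_old.json", "readme.txt"]
print(select_files(files, r".*\.json", r".*_old\.json"))
```

In this sketch the exclusion is applied after the inclusion, so a file is kept only if it passes both checks.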

Content Tab

The Content tab enables you to configure which data to collect.

Option

Definition

Ignore empty file

When checked, empty files are skipped. When unchecked, an empty file causes the process to fail and stop.

Do not raise an error if no files

When unchecked, causes the transformation to fail when there is no file to process. When checked, avoids failure when there is no file to process.

Ignore missing path

When unchecked, causes the transformation to fail when the JSON path is missing. When checked, avoids failure when there is no JSON path.

Limit

Sets a limit on the number of records generated from the step when set greater than zero

Include filename in output

Adds a string field with the filename in the result

Rownum in output

Adds an integer field with the row number in the result

Add files to result filesname

If checked, adds processed files to the result file list
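
As a rough illustration of how Limit, Include filename in output, and Rownum in output shape the generated rows, consider the Python sketch below. It is not the step's code: the function `emit_rows`, its parameter names, and the field names `filename` and `rownum` are hypothetical, and numbering rows from 1 is an assumption.

```python
def emit_rows(records_by_file, limit=0, filename_field=None, rownum_field=None):
    """Build output rows from (filename, records) pairs.

    limit          -- stop after this many rows when greater than zero
    filename_field -- if set, add the source file name under this field name
    rownum_field   -- if set, add a running row number under this field name
    """
    rows = []
    for filename, records in records_by_file:
        for record in records:
            if limit > 0 and len(rows) >= limit:
                return rows
            row = dict(record)  # copy so the input records stay untouched
            if filename_field:
                row[filename_field] = filename
            if rownum_field:
                row[rownum_field] = len(rows) + 1  # assumption: numbering starts at 1
            rows.append(row)
    return rows

# Hypothetical parsed input: two files with three records in total
data = [("a.json", [{"id": 1}, {"id": 2}]), ("b.json", [{"id": 3}])]
for row in emit_rows(data, limit=2, filename_field="filename", rownum_field="rownum"):
    print(row)
```

With a limit of 2, only the two records from the first file are emitted. A limit of zero leaves the output unrestricted.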

Fields Tab

The Fields tab displays field definitions to extract values from the JSON structure. This step uses JSONPath to extract fields from JSON structures.
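
To make the idea of a JSONPath field definition more concrete, here is a minimal Python sketch that evaluates a small subset of JSONPath (`$`, `.name`, `[n]`, and `[*]`) against a sample document. It is a simplified stand-in, not the JSONPath engine used by the step. The function `extract` and the sample data are hypothetical, and pairing the extracted values by position to form rows is an assumption made for illustration.

```python
import json
import re

# One path segment: .name, [n], or [*]
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]|\[(\*)\]")

def extract(document, path):
    """Return all values selected by a tiny JSONPath subset."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    nodes = [document]
    pos = 1
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at position {pos}")
        name, index, wildcard = match.groups()
        next_nodes = []
        for node in nodes:
            if name is not None and isinstance(node, dict) and name in node:
                next_nodes.append(node[name])
            elif index is not None and isinstance(node, list) and int(index) < len(node):
                next_nodes.append(node[int(index)])
            elif wildcard is not None and isinstance(node, list):
                next_nodes.extend(node)
        nodes = next_nodes
        pos = match.end()
    return nodes

# Hypothetical sample document
data = json.loads(
    '{"store": {"book": ['
    '{"title": "A", "price": 8.95}, {"title": "B", "price": 12.99}]}}'
)
titles = extract(data, "$.store.book[*].title")
prices = extract(data, "$.store.book[*].price")
for row in zip(titles, prices):  # assumption: values are paired by position
    print(row)
```

In this sketch a path that matches nothing yields an empty list, so the caller decides whether a missing path should be treated as an error.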

Additional Output Fields Tab

The Additional output fields tab enables you to provide additional information about the file to process.

Examples

Pentaho Data Integration ships with sample transformations you can run to demonstrate step functionality. To open a sample transformation in Spoon, go to the File menu and select Open. Browse to pentaho\design-tools\data-integration\samples\transformations, then select the sample transformation you want to run. This directory contains several sample transformations that demonstrate the functionality of this step:

JsonInput - read a dynamic file.ktr
JsonInput - read a file.ktr
JsonInput - read incoming stream.ktr

Metadata Injection Support (7.x and later)

All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.