Unique Rows

PLEASE NOTE: This documentation applies to Pentaho 8.0 and earlier. For Pentaho 8.1 and later, see Unique Rows on the Pentaho Enterprise Edition documentation site.

Description

The Unique rows step removes duplicate rows from the input stream(s).

Important: Make sure that the input stream is sorted; otherwise, only consecutive duplicate rows are detected and removed.

See also the Unique rows (HashSet) step that does not need the rows to be sorted.
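The difference between the two approaches can be illustrated with a minimal Python sketch. This is not Pentaho code: the function names unique_consecutive and unique_hashset are hypothetical and only model the behavior described above, with simple string values standing in for rows.

```python
from itertools import groupby


def unique_consecutive(rows):
    """Keep the first row of each run of identical consecutive rows.

    Models a step that only compares a row with the one before it.
    """
    return [key for key, _ in groupby(rows)]


def unique_hashset(rows):
    """Keep the first occurrence of each row, regardless of position.

    Models a step that remembers every row seen so far in a set.
    """
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


unsorted_rows = ["b", "a", "b", "a", "a"]

# Unsorted input: only the adjacent pair of "a" rows is collapsed.
print(unique_consecutive(unsorted_rows))          # ['b', 'a', 'b', 'a']

# Sorted input: all duplicates are adjacent, so all are removed.
print(unique_consecutive(sorted(unsorted_rows)))  # ['a', 'b']

# Set-based approach: no sorting needed, at the cost of keeping keys in memory.
print(unique_hashset(unsorted_rows))              # ['b', 'a']
```

The sketch suggests why sorting matters: a comparison against the previous row alone cannot detect a duplicate that appears later in the stream.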

The table below contains descriptions of all options for the Unique rows step:

Option	Description
Step name	Name of the step; this name has to be unique in a single transformation
Add counter to output?	Check this option to add a counter field to the stream.
Counter field	Define the counter field name.
Redirect duplicate row	Processes duplicate rows as errors and redirects them to the error stream of the step. Requires error handling to be configured for this step.
Error Description	Sets the error handling description to display when duplicate rows are detected. Only available when Redirect duplicate row is checked.
Fields to compare table	Specify the field names on which you want to enforce uniqueness, or click Get to insert all fields from the input stream(s). You can choose to ignore case by setting the Ignore case flag to Y. For example, Kettle, KETTLE, and kettle are treated as the same value when the comparison is case-insensitive. In that case, the first occurrence (Kettle) is passed to the next step(s).
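
The options above can be modeled with a short Python sketch. It is an illustration only, not the step's implementation: the names compare_key and unique_rows, the dictionary row format, and the (field name, ignore case) pairs are all hypothetical. It also assumes that the counter field holds the number of input rows collapsed into each output row, which the table above does not state.

```python
def compare_key(row, compare_fields):
    """Build the comparison key for a row.

    compare_fields is a list of (field_name, ignore_case) pairs,
    loosely mirroring the "Fields to compare" table (hypothetical format).
    """
    key = []
    for name, ignore_case in compare_fields:
        value = row[name]
        if ignore_case and isinstance(value, str):
            value = value.lower()
        key.append(value)
    return tuple(key)


def unique_rows(rows, compare_fields, counter_field=None):
    """Return (unique, duplicates) for input sorted on the compare fields.

    The first row of each run is kept; later rows of the run go to
    `duplicates`, standing in for the error stream.
    """
    unique, duplicates = [], []
    previous_key = None
    for row in rows:
        key = compare_key(row, compare_fields)
        if unique and key == previous_key:
            duplicates.append(row)
            if counter_field:
                # Assumption: the counter counts rows merged into this output row.
                unique[-1][counter_field] += 1
        else:
            out = dict(row)
            if counter_field:
                out[counter_field] = 1
            unique.append(out)
            previous_key = key
    return unique, duplicates


rows = [
    {"name": "Kettle"},
    {"name": "KETTLE"},
    {"name": "kettle"},
    {"name": "Spoon"},
]

unique, duplicates = unique_rows(rows, [("name", True)], counter_field="count")
print(unique)      # [{'name': 'Kettle', 'count': 3}, {'name': 'Spoon', 'count': 1}]
print(duplicates)  # [{'name': 'KETTLE'}, {'name': 'kettle'}]
```

With the ignore-case flag set to False for the name field, the same input would pass through unchanged, because the three spellings would compare as different values.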