PLEASE NOTE: This documentation applies to Pentaho 7.1 and earlier. For Pentaho 8.0 and later, see Hadoop File Input on the Pentaho Enterprise Edition documentation site.
Description
The Hadoop File Input step reads data from a variety of text-file types stored on a Hadoop cluster. The most commonly used formats are comma-separated values (CSV) files generated by spreadsheets and fixed-width flat files.
This step lets you specify a list of files to read, or a list of directories with wildcards written as regular expressions. It can also accept file names from a previous step.
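Because the wildcards are regular expressions rather than shell globs, a pattern such as `*.csv` does not work; the regular-expression form is `.*\.csv`. The sketch below illustrates the difference in plain Python. It is not part of Pentaho: the `select_files` helper and the file names are hypothetical, and it assumes the expression must match the whole file name.

```python
import re


def select_files(file_names, wildcard):
    """Return the names that the regular expression matches in full.

    Hypothetical helper for illustration; assumes whole-name matching.
    """
    pattern = re.compile(wildcard)
    return [name for name in file_names if pattern.fullmatch(name)]


# Hypothetical directory listing, for illustration only.
listing = [
    "sales_2016.csv",
    "sales_2017.csv",
    "sales_2017.csv.bak",
    "readme.txt",
]

# ".*\.csv" selects every name ending in ".csv"; the ".bak" copy is skipped.
print(select_files(listing, r".*\.csv"))

# A stricter expression: "sales_", a four-digit year, then ".csv".
print(select_files(listing, r"sales_\d{4}\.csv"))
```

Both calls select `sales_2016.csv` and `sales_2017.csv` from this listing. Escaping the dot (`\.`) matters: an unescaped `.` matches any character, so the pattern could select files you did not intend.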
...