Description
The Get File Names step retrieves information about files on the file system. Each retrieved file name is added as a row to the stream.
The output fields for this step are:
- filename - the complete filename, including the path (/tmp/kettle/somefile.txt)
- short_filename - only the filename, without the path (somefile.txt)
- path - only the path (/tmp/kettle/)
- type
- exists
- ishidden
- isreadable
- iswriteable
- lastmodifiedtime
- size
- extension
- uri
- rooturi
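As a rough illustration of how the first three fields and the extension field relate to each other, the following Java sketch splits a full file name into its parts. It uses plain string handling and is not the step's actual implementation; the class and method names are hypothetical, and the assumption that extension is returned without the leading dot is not confirmed by this page.

```java
// Hypothetical helper, for illustration only: not Kettle's own code.
final class FileNameParts {

    // short_filename: everything after the last '/' (somefile.txt)
    static String shortFilename(String filename) {
        return filename.substring(filename.lastIndexOf('/') + 1);
    }

    // path: everything up to and including the last '/' (/tmp/kettle/)
    static String path(String filename) {
        return filename.substring(0, filename.lastIndexOf('/') + 1);
    }

    // extension: text after the last '.' in the short name.
    // Assumption: no leading dot, and an empty string when there is no dot.
    static String extension(String filename) {
        String name = shortFilename(filename);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) {
        String filename = "/tmp/kettle/somefile.txt";
        System.out.println(shortFilename(filename));
        System.out.println(path(filename));
        System.out.println(extension(filename));
    }
}
```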
File tab
This tab defines the location of the files you want to retrieve filenames for. For more information about specifying file locations, see Selecting Files to read data from.
Note: the "Selecting Files to read data from" page referred to above does not appear to exist on this site (at least, I was unable to find it). In its absence, be aware that the "Wildcard" field does not accept the kind of wildcard you would normally use for directory listings in Unix or Windows (for example, a * to represent all files). It expects a regular expression, as understood by java.util.regex. For example, to get the names of all files in a directory, enter .+ in the Wildcard field. For full details of the regular expression syntax, see http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
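To make the difference between a shell wildcard and a regular expression concrete, here is a minimal Java sketch using java.util.regex.Pattern. It assumes the whole file name must match the expression, which this page does not state explicitly; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustration only: shows how java.util.regex treats typical Wildcard values.
final class WildcardDemo {

    // True if the entire file name matches the regular expression.
    // Assumption: the step requires a full match rather than a partial one.
    static boolean matches(String regex, String name) {
        return Pattern.matches(regex, name);
    }

    // True if the text compiles as a java.util.regex pattern.
    static boolean isValidRegex(String text) {
        try {
            Pattern.compile(text);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // ".+" matches any non-empty name, i.e. every file.
        System.out.println(matches(".+", "somefile.txt"));
        // The regex equivalent of the shell wildcard *.txt is .*\.txt
        // (the backslash is doubled here only because this is a Java string literal).
        System.out.println(matches(".*\\.txt", "somefile.txt"));
        // The shell wildcard itself is not a valid regular expression.
        System.out.println(isValidRegex("*.txt"));
    }
}
```

In the Wildcard field itself you would type the expression with a single backslash, for example .*\.txt to select only files ending in .txt.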
Filters
The Filters tab lets you restrict the retrieved entries to one of the following:
- All files and folders
- Files only
- Folders only
It also gives you:
- The ability to include a row number in the output
- The ability to limit the number of rows returned
- The ability to add the filename(s) to the result list
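The combined effect of the filter type and the row limit can be sketched in Java as follows. This is only an approximation built on java.nio.file, not the step's own code: the class, enum, and method names are hypothetical, and treating a limit of 0 or less as "no limit", applying the regular expression to folder names as well as file names, and sorting the names are all assumptions of the sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration only: mimics the filter type and row limit options.
final class FilterSketch {

    enum FilterType { ALL, FILES_ONLY, FOLDERS_ONLY }

    // Lists the entry names in dir that match regex and the chosen filter type.
    // Assumption: limit <= 0 means "return everything".
    static List<String> list(Path dir, String regex, FilterType type, long limit)
            throws IOException {
        Pattern pattern = Pattern.compile(regex);
        try (Stream<Path> entries = Files.list(dir)) {
            Stream<String> names = entries
                    .filter(p -> type == FilterType.ALL
                            || (type == FilterType.FILES_ONLY && Files.isRegularFile(p))
                            || (type == FilterType.FOLDERS_ONLY && Files.isDirectory(p)))
                    .map(p -> p.getFileName().toString())
                    .filter(n -> pattern.matcher(n).matches())
                    .sorted(); // sorted only to make the sketch's output predictable
            if (limit > 0) {
                names = names.limit(limit);
            }
            return names.collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Example: at most 10 regular files from the directory given as the first argument.
        Path dir = Path.of(args.length > 0 ? args[0] : ".");
        list(dir, ".+", FilterType.FILES_ONLY, 10).forEach(System.out::println);
    }
}
```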