Kitchen User Documentation

This page is superseded by the InfoCenter

The latest documentation is available at http://infocenter.pentaho.com/help/topic/pdi_user_guide/reference_kitchen.html



What is Kitchen?


Kitchen is a program that executes jobs designed in Spoon, stored either as XML files or in a database repository. Jobs are usually scheduled to run automatically in batch mode at regular intervals.

Installation


The first step is to install the Sun Microsystems Java Runtime Environment (JRE), version 1.5 or higher. You can download a JRE for free at http://www.java.com/.

After this, you can simply unzip the distribution zip-file in a directory of your choice.
In the Kettle directory where you unzipped the file, you will find a number of files.
Under Unix-like environments (Solaris, Linux, OSX, ...) you will need to make the shell scripts executable. Execute these commands to make all shell scripts in the Kettle directory executable:

cd Kettle
chmod +x *.sh



Launching Kitchen


The following scripts are provided to launch Kitchen on the different platforms:

  • Kitchen.bat: run Kitchen on the Windows platform.
  • kitchen.sh: run Kitchen on Unix platforms and Mac OSX.

Kitchen can be run on any platform that has a Java Runtime Environment version 1.5 or higher.



Command line options


These are the command line options that you can use.

IMPORTANT NOTES:
On Windows systems, the minus sign ("-") and the equal sign ("=") in the options cause problems. Because of this, from version 2.2.2 on, you can also use the format shown here, or any combination of / or - with : or =

/option:value

Fields in italic represent the values that the options use.
If an option value contains spaces, enclose it in single or double quotes to keep it together. Take a look at the examples below for more info.

Below are the valid options.

Display version information

-version

This option displays the version of the Kettle core library (kettle.jar).
The build version number and build date are shown as well.

Named parameters

-param

You can set the value of a named parameter, for example: -param:FOO=value

-listparam

List the named parameters (their name, default value and description) that are defined in the specified job.

See also: Named Parameters.

Launch XML File

-file=filename

This option runs the job defined in the XML file. (.kjb : Kettle Job)

Specify named parameters

-param:key=value

Specifies the value of a named parameter. For example: "-param:MASTER_HOST=192.168.1.3" "-param:MASTER_PORT=8181"

See also: Named Parameters

Set the logging file

-logfile=Logging Filename

Specifies the log file. The default is the standard output.

Set the logging level

-level=Logging Level

The level option sets the log level for the job that's being run.
These are the possible values:

  • Error: Only show errors
  • Nothing: Don't show any output
  • Minimal: Only use minimal logging
  • Basic: This is the default basic logging level
  • Detailed: Give detailed logging output
  • Debug: For debugging purposes, very detailed output.
  • Rowlevel: Logging at a row level, this can generate a lot of data.

Choose a repository

-rep=Repository name

Connect to the repository with name "Repository name".
You also need to specify the options -user, -pass, -dir and -job.
You can also specify this option in the form of environment variable KETTLE_REPOSITORY.

Set the repository user name

-user=Username

This is the username with which you want to connect to the repository.
You can also specify this option in the form of environment variable KETTLE_USER.

Set the repository password

-pass=Password

The password to use to connect to the repository.
You can also specify this option in the form of environment variable KETTLE_PASSWORD.

Select the repository job to run

-job=Job Name

Use this option to select the job to run from the repository. Please also select the directory with the "-dir" option.

List the directories in the repository

-listdir=Y

Print a listing of all the sub-directories in the repository directory specified with the option "-dir".

Set the repository directory

-dir=directory

Specifies the directory in the repository to use. Repository directories are specified like this:

  • The root directory: /
  • A subdirectory: /production/Dimensions

From version 2.2.2 on, a / (slash) is used to separate directories on all platforms.

List the repository jobs

-listjobs=Y

Show a list of all the jobs in the repository directory specified with the option "-dir".

List the available repositories

-listrep=Y

Print a listing of all the defined repositories.

Don't log in to the repository

-norep=Y

If you have set the environment variables KETTLE_REPOSITORY, KETTLE_USER and KETTLE_PASSWORD, this option prevents Kitchen from logging in to the repository, for example when you want to launch a job from an XML file.
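
As a minimal sketch of how -norep can be combined with -file in a shell script, the function below adds -norep=Y only when KETTLE_REPOSITORY is set. The KITCHEN variable and the run_file_job function name are assumptions made for this illustration; they are not part of Kettle.

```shell
# KITCHEN is a hypothetical variable naming the launcher script; adjust the path.
KITCHEN="${KITCHEN:-./kitchen.sh}"

# Run a job from an XML file. When KETTLE_REPOSITORY is set in the
# environment, pass -norep=Y so that no repository login is attempted.
run_file_job() {
    if [ -n "${KETTLE_REPOSITORY:-}" ]; then
        "$KITCHEN" -file="$1" -norep=Y
    else
        "$KITCHEN" -file="$1"
    fi
}
```

A call such as run_file_job /PRD/updateWarehouse.kjb would then work the same way whether or not the repository variables are exported in the calling environment.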



Path


Please make sure that you are positioned in the Kettle directory before running the samples below. If you put these commands into a batch file or shell script, first change to the installation directory:

If Kettle was installed on Windows on the D:\ drive:

D:
cd \Kettle

If Kettle was installed in the /product directory on a Unix system:

cd /product/Kettle/



Run a job from file


This example runs a job from file on a Windows platform:

kitchen.bat /file:D:\Jobs\updateWarehouse.kjb /level:Basic

This example runs a job from file on a Linux box:

kitchen.sh -file=/PRD/updateWarehouse.kjb -level=Minimal



Run a job from Repository


This example runs a job from the repository on a Windows platform:
(Enter on a single line without returns...)

kitchen.bat
                    /rep:"Production Repository"
                    /job:"Update dimensions"
                    /dir:/Dimensions
                    /user:matt
                    /pass:somepassword123
                    /level:Basic



Redirecting output


If you don't want the Kitchen output to appear on the screen but want it written to a log file instead, you can use redirection.

This example adds the Kitchen output to an ever-growing log file:

kitchen.sh -file="/PRD/updateWarehouse.kjb" -level=Minimal >> /LOG/trans.log

This example writes the Kitchen output to a file that gets overwritten every time:

kitchen.bat /file:C:\PRD\runAll.kjb /level:Basic > C:\LOG\trans.log



Return codes


Kitchen returns an error code based on how the execution went:

  • 0 : The job ran without a problem.
  • 1 : Errors occurred during processing.
  • 2 : An unexpected error occurred during loading or running of the job.
  • 7 : The job couldn't be loaded from XML or the repository.
  • 8 : Error loading steps or plugins (mostly an error in loading one of the plugins).
  • 9 : Command line usage printing.
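
A wrapper script can translate these return codes into log messages. The sketch below is illustrative only: the function name kitchen_status_message is hypothetical, and the messages simply restate the list above.

```shell
# Print a short description for a Kitchen return code.
# $1 = the return code, typically $? taken right after Kitchen finishes.
kitchen_status_message() {
    case "$1" in
        0) echo "The job ran without a problem" ;;
        1) echo "Errors occurred during processing" ;;
        2) echo "An unexpected error occurred during loading or running of the job" ;;
        7) echo "The job couldn't be loaded from XML or the repository" ;;
        8) echo "Error loading steps or plugins" ;;
        9) echo "Command line usage printing" ;;
        *) echo "Unknown return code: $1" ;;   # not in the documented list
    esac
}
```

For example, calling kitchen_status_message $? directly after a kitchen.sh command prints the description that matches the outcome of that run.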



Scheduling

Schedule a job on Windows

The best approach is to test the command at the DOS prompt first.
You can then use the Windows scheduler to launch this command.
Windows versions since Windows 2000 have a GUI for doing this, accessible through the Control Panel. However, it is also possible to do this from the command line:

at 23:30 /every:Monday,Wednesday,Friday "D:\updateWarehouse.bat"

To see a list of the scheduled commands simply type:

at

Schedule a job on Unix

First create a shell script that runs all the jobs you need. Then you can schedule this script to run.
On Unix-like systems the easiest way to schedule a command is by using the "cron table". You can edit it by entering the following command:

crontab -e

In the text file that is presented, enter the time at which the command needs to run, followed by the command itself, on a single line.
The first five fields of the line are:

  • Minute: The minute of the hour, 0-59
  • Hour: The hour of the day, 0-23
  • Month day: The day of the month, 1-31
  • Month: The month of the year, 1-12
  • Weekday: The day of the week, 0-6, 0=Sunday

You can specify more than one number for each of these values. Two numbers separated by a hyphen (-) form an inclusive range. Numbers separated by commas (,) are distinct values. If you use * instead of a number, it means every possible minute, hour, day, month or weekday.
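
To make the range and list notation concrete, the following sketch expands a single field into the values it stands for. It only illustrates the notation described above and is not how cron itself is implemented. The function name expand_cron_field and its MIN and MAX arguments are assumptions for this example.

```shell
# expand_cron_field FIELD MIN MAX
# Print the values that a crontab field stands for, separated by spaces.
# Supports single numbers, lists (15,45), ranges (1-5) and * (MIN to MAX).
# The body runs in a subshell so its variables do not leak to the caller.
expand_cron_field() (
    rest=$1
    min=$2
    max=$3
    out=""
    while [ -n "$rest" ]; do
        # Take the text up to the first comma as the next list item.
        part=${rest%%,*}
        if [ "$part" = "$rest" ]; then
            rest=""
        else
            rest=${rest#*,}
        fi
        case "$part" in
            '*') lo=$min;        hi=$max ;;          # every possible value
            *-*) lo=${part%-*};  hi=${part#*-} ;;    # inclusive range
            *)   lo=$part;       hi=$part ;;         # single value
        esac
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out $i"
            i=$((i + 1))
        done
    done
    echo "${out# }"
)
```

With this sketch, expand_cron_field 15,45 0 59 prints "15 45" and expand_cron_field 1-5 0 6 prints "1 2 3 4 5".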

So, if you want to update the dimensions every hour, at 15 and 45 minutes past the hour during the weekdays, you might enter these lines in a crontab:

#
# Launches the update of the dimensions in the warehouse
#
15,45 * * * 1-5 /PROD/update_dimensions.sh
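
The /PROD/update_dimensions.sh script named in the crontab entry is not shown in this document. As a rough sketch, such a wrapper could change to the Kettle directory, run a job and append the output to a log file. The KETTLE_DIR and LOG_FILE variables, the run_job function and the paths used here are assumptions borrowed from the earlier examples, not a prescribed layout.

```shell
#!/bin/sh
# Hypothetical wrapper script, for example /PROD/update_dimensions.sh.
# KETTLE_DIR, LOG_FILE and the job path are assumptions; adjust them.
KETTLE_DIR="${KETTLE_DIR:-/product/Kettle}"
LOG_FILE="${LOG_FILE:-/LOG/trans.log}"

# Run one job file from inside the Kettle directory and append the
# Kitchen output to the log file. The function returns Kitchen's return
# code, or 127 (not a Kitchen code) if the directory cannot be entered.
# The body runs in a subshell, so the caller's working directory is kept.
run_job() (
    cd "$KETTLE_DIR" || exit 127
    ./kitchen.sh -file="$1" -level=Minimal >> "$LOG_FILE" 2>&1
)

# Only launch the job when the launcher script is actually present.
if [ -x "$KETTLE_DIR/kitchen.sh" ]; then
    run_job /PRD/updateWarehouse.kjb
fi
```

Because cron starts commands with a minimal environment, setting the directory and log file inside the script keeps the crontab entry itself short.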