Writing your own Pentaho Data Integration Plug-In

If you want to write your own Pentaho Data Integration Plug-In, this is the right place for you.

This information was collected from posts by Matt in the forum and on the developer mailing lists.

Note: More extensive and up-to-date information about plug-in development can be found in the PDI SDK, in "Embedding and Extending PDI Functionality".


In General 

Basics for Plug-In development (until 2.5.x)

At minimum you need 4 classes implementing the following interfaces (package be.ibridge.kettle.trans.step):

  • StepMetaInterface: defines the metadata and takes care of the XML representation, saving to and loading from the repository, checks, etc.
  • StepInterface: makes the step execute. Inherit from BaseStep to make your life easier.
  • StepDataInterface: holds open cursors, result sets, files, etc.
  • StepDialogInterface: GUI/dialog code to edit the metadata.

Don't get too creative when naming these classes. Spoon expects the metadata class name to end in "Meta" and the dialog class name to end in "Dialog", with the same prefix. So if your metadata object is called MyStepMeta, your dialog class will have to be called MyStepDialog. StepMetaInterface.getDialogClassName() is used only for finding the package, not the class name itself.
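The naming rule can be illustrated with a small, self-contained sketch. It is not Spoon's actual lookup code: the class DialogNameSketch, the helper dialogClassNameFor and the package com.example.mystep are hypothetical names used only to show how a dialog class name follows from a metadata class name plus a package.

```java
public class DialogNameSketch {

    // Hypothetical helper (not part of Kettle): applies the naming rule
    // "XxxMeta -> XxxDialog" and prefixes the package the dialog lives in.
    static String dialogClassNameFor(String metaSimpleName, String dialogPackage) {
        String suffix = "Meta";
        if (!metaSimpleName.endsWith(suffix)) {
            throw new IllegalArgumentException(
                "Metadata class name must end in \"Meta\": " + metaSimpleName);
        }
        // Strip the "Meta" suffix to get the shared prefix, e.g. "MyStep".
        String prefix = metaSimpleName.substring(0, metaSimpleName.length() - suffix.length());
        return dialogPackage + "." + prefix + "Dialog";
    }

    public static void main(String[] args) {
        // Assumed example package; only the package part is freely chosen.
        System.out.println(dialogClassNameFor("MyStepMeta", "com.example.mystep"));
        // prints "com.example.mystep.MyStepDialog"
    }
}
```

In this sketch, a metadata class whose name does not end in "Meta" is rejected outright, which mirrors why a creative class name would leave the dialog undiscoverable.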

There are 50+ samples in the source code, because all steps are constructed this way. What you need to do is add a plugin.xml file and an icon to represent the step. Then package the 4 classes in a jar file. Put all of this (someplugin.jar, plugin.xml and someplugin.png) in a directory under a Kettle sub-directory: plugins/transformations/steps/SomePlugin.

You can also put any extra jar files you need in there and specify their use in the plugin.xml file, which is very simple.

See the attachment DummyPlugin.zip.

Changes from 2.5.x to 3.0

  • Metadata and Data in rows are now separated
  • The row data are now stored in an object array: Object[]
  • The primitives allowed in this Object array are the same as in the "old style" values:
    • String (String)
    • Double (Number)
    • Long (Integer)
    • Date (Date)
    • BigDecimal (BigNumber)
    • Boolean (Boolean)
    • byte[] (Binary)
  • Our aim for the 2.5-style engine was for empty Strings and empty values to be equal to null. This is now enforced by simply setting the corresponding elements in the Object array to null.
  • Java 5 is now used.
  • A step can now be included with annotations or referenced from an XML file (kettle-plugins.xml)

...and many more, so make sure to check the developer mailing lists mentioned above.
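As a rough illustration of the 3.0 row layout, the following sketch builds a row as a plain Object[] and maps each element back to the "old style" type name from the list above. It uses only the JDK; the class RowTypesSketch and the helper kettleTypeName are hypothetical names for this example and are not part of the Kettle API, which keeps the row metadata in its own separate classes.

```java
import java.math.BigDecimal;
import java.util.Date;

public class RowTypesSketch {

    // Hypothetical helper: returns the "old style" value type name
    // that corresponds to the Java type of a row element.
    static String kettleTypeName(Object value) {
        if (value == null) return "null"; // empty values are simply null
        if (value instanceof String) return "String";
        if (value instanceof Double) return "Number";
        if (value instanceof Long) return "Integer";
        if (value instanceof Date) return "Date";
        if (value instanceof BigDecimal) return "BigNumber";
        if (value instanceof Boolean) return "Boolean";
        if (value instanceof byte[]) return "Binary";
        throw new IllegalArgumentException(
            "Not an allowed row type: " + value.getClass().getName());
    }

    public static void main(String[] args) {
        // Row data only; the field names and types would live in separate metadata.
        Object[] row = {
            "Pentaho", 3.14, 42L, new Date(0L),
            new BigDecimal("10.50"), Boolean.TRUE, new byte[] {1, 2}, null
        };
        for (Object value : row) {
            System.out.println(kettleTypeName(value));
        }
    }
}
```

Note that a Java Integer or Float is not in the list of allowed types, so the sketch rejects them; whole numbers travel as Long and floating-point numbers as Double.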

See the attachment DummyPlugin3.zip

Committing your Plug-In

Whether or not you donate it to the public as open source, you can add your plug-in to the list of available plug-ins.

Resources