PDI Plugin Loading

Sep 24, 2007
Submitted by Alex Silva, Pentaho

Introduction

PDI allows users to create their own customized job entries and steps, which can then be deployed to the PDI platform.

These plugins are loaded somewhat differently from native job entries and steps.  This document explains how PDI handles and loads plugins.

Plugin Configuration Files

There are two files involved when configuring plugins:

  • kettle-config.xml: This is the global configuration file for PDI.  It lists several "config" elements, each with a specific purpose.  For plugin loading, look at the config element with id "plugins-config".  Here is a copy of the element:
    <config id="plugins-config">
    <config-class>org.pentaho.di.core.config.DigesterConfigManager</config-class>
    <property name="configURL" value="kettle-plugins.xml" />
    <property name="rulesURL" value="org/pentaho/di/core/config/plugin-rules.xml" />
    <property name="setNext" value="plugins/plugin" />
    </config>
    
    The only relevant sub-element for our purposes is the "configURL" property.  The value of this property indicates the location of the file that contains the locations from which plugins are going to be loaded.  The default location is "/kettle-plugins.xml" and there are two ways to override this property:
    • Manually edit the kettle-config.xml file and change the configURL property to the desired location.
    • Start PDI with a system property called 'pdi.plugins.config' set to the location where the plugin file is located.
  • kettle-plugins.xml: Even though the name and location of this file can be overridden (see above), its structure and format must stay the same.  In essence, this XML file contains "<plugin>" elements that point to the locations from which plugins will be loaded.  The file provided by default has the following content:
    <plugin id="PUBLIC_JOBENTRIES_DIR">
    <location>plugins/jobentries</location>
    </plugin>
    <plugin id="PUBLIC_STEPS_DIR">
    <location>plugins/steps</location>
    </plugin>
    <plugin id="PRIVATE_JOBENTRIES_DIR">
    <location>ognl:@org.pentaho.di.core.Const@getKettleDirectory()+"/plugins/"</location>
    </plugin>
    
    Each "<plugin>" element corresponds to a location that will be scanned for plugins.  Currently, two types of plugin locations are supported: file system folders or jar files.
    • A plugin folder may contain a combination of:
        1- One or more subfolders each with plugin definitions.
        2- Jar files that represent plugins themselves and follow the same file hierarchy as folder plugins.
    • Jar files containing the plugins can be located using any valid URI.  For instance, jar files can be loaded from a remote server using HTTP, FTP, or simply be loaded from the local file system or a network share.  Obviously, jar plugins should have the same file organization as a folder-based plugin entry.
    In order to work properly, jar locations resolving to the local file system should be prefixed with 'jar:'.  Otherwise, VFS will not recognize them as "proper" jar files and they will not be deployed properly.  For instance:
      jar:file:///c:/testplugin.jar -> GOOD
      file:///c:/testplugin.jar -> BAD
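
As an illustration of the two location types described above, a kettle-plugins.xml that scans one folder and one local jar might look like the following sketch.  The "<plugins>" root element is an assumption inferred from the "plugins/plugin" value of the setNext property shown earlier, and the ids MY_STEPS_DIR and MY_JAR_PLUGIN are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Root element name is assumed from the "plugins/plugin" rule path. -->
<plugins>
  <!-- Folder location: scanned for plugin subfolders and plugin jars. -->
  <plugin id="MY_STEPS_DIR">
    <location>plugins/steps</location>
  </plugin>
  <!-- Jar on the local file system: note the 'jar:' prefix. -->
  <plugin id="MY_JAR_PLUGIN">
    <location>jar:file:///c:/testplugin.jar</location>
  </plugin>
</plugins>
```

Each "<plugin>" entry is independent, so folder and jar locations can be mixed freely in the same file.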
 

Plugin Deployment

Jar files containing plugin classes are "exploded" and deployed into the user's working directory after being loaded.  Currently, this is the '~/.kettle/work' directory.  This way, after the initial load, all plugins are local to Kettle, regardless of where they were originally loaded from.

In contrast, "unjarred" plugins that reside on the file system are referenced and loaded from their original locations and not copied anywhere else.

Plugin.xml

This file, among other things, is responsible for specifying the plugin implementation class, icon locations, descriptions, and other configuration metadata. 

An example:

<?xml version="1.0" encoding="UTF-8"?>
<plugin
   id="DummyJob"
   iconfile="DPL.png"
   description="Dummy Job Entry"
   tooltip="This is a dummy plugin test job entry"
   category="JobEntry"
   classname="kettle.jobentry.dummy.JobEntryDummy">
   <libraries>
     <library name="dummyjob.jar"/>
   </libraries>
   
   <localized_category>
     <category locale="en_US">Transform</category>
     <category locale="nl_NL">Transform</category>
     <category locale="fr_FR">Transformation</category>
   </localized_category>
   <localized_description>
     <description locale="en_US">Transform</description>
     <description locale="nl_NL">Transform</description>
     <description locale="fr_FR">Transform</description>
   </localized_description>
   <localized_tooltip>
     <tooltip locale="en_US">This is a dummy plugin test step</tooltip>
     <tooltip locale="nl_NL">Dit is een voorbeeld plugin ook wel 'dummy plugin' geheten</tooltip>
     <tooltip locale="fr_FR">Ceci est un exemple de plugin</tooltip>
   </localized_tooltip>
  
</plugin>

The '<library>' element above specifies which jar files should be added to the plugin classpath.  Plugins are self-contained and have their own classloader; therefore, all required libraries should be added to the plugin distribution file, unless they are already present in the PDI distribution.

Wild cards

The '<library>' element supports wild cards.  You can use either '*' or '?' in the library path to match several files at once.  For instance:

  • 'lib/*.jar' --> Loads all the jar files located under the lib folder into the plugin's classpath.
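
A "<libraries>" section that relies on wild cards could be sketched as follows.  The 'drivers' folder and its file names are hypothetical, and the '?' entry assumes the usual convention that '?' matches exactly one character, which this document does not spell out.

```xml
<libraries>
  <!-- Explicit entry for the plugin's own jar, as in the example above. -->
  <library name="dummyjob.jar"/>
  <!-- Every jar file located under the lib folder. -->
  <library name="lib/*.jar"/>
  <!-- Assumed semantics: '?' stands for one character,
       so this would match driver1.jar, driver2.jar, and so on. -->
  <library name="drivers/driver?.jar"/>
</libraries>
```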

Plugin Directory Structure Rules

These rules apply to both jar and folder plugins.  In summary, all the resources needed by the plugin must be contained within the jar file or folder in question.

This includes any images, help files, localization resources, plugin implementation and required libraries.

PDI will load and add to the plugin's classpath all the jar files located under the 'lib' folder of any plugin distribution.
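
To make these rules concrete, here is a sketch of a plugin.xml together with one possible folder layout, shown as a comment.  The layout, the 'dummyjob' folder name, and the placement of plugin.xml at the top of the plugin folder are assumptions for illustration only; the attribute values are reused from the example above.

```xml
<!--
  Hypothetical layout of a folder plugin (names are illustrative):

    plugins/jobentries/dummyjob/
      plugin.xml       (this file)
      dummyjob.jar     (plugin implementation)
      DPL.png          (icon)
      lib/             (required third-party jars)
-->
<plugin
   id="DummyJob"
   iconfile="DPL.png"
   description="Dummy Job Entry"
   tooltip="This is a dummy plugin test job entry"
   category="JobEntry"
   classname="kettle.jobentry.dummy.JobEntryDummy">
   <libraries>
     <!-- Only the plugin's own jar is listed here; per the rule above,
          jars under the 'lib' folder are added to the classpath by PDI. -->
     <library name="dummyjob.jar"/>
   </libraries>
</plugin>
```

A jar plugin would carry the same files in the same relative positions inside the archive.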

Plugins and Annotations

Previously, all plugins were required to contain a plugin.xml file (described above).  Starting with 3.0, these files became optional, because plugins can be configured using annotations instead.  Two types of annotations are available:

  • @Step - Defines a "step" plugin. 
  • @Job - Defines a "job" plugin.
For more information on these annotations, see the documentation in the package org.pentaho.di.core.annotations.

When annotating a class, no plugin.xml is required.  As mentioned above, all the jar files inside the plugin distribution are automatically added to the plugin's classpath.

Annotations take precedence over the plugin.xml file.  In other words, if a plugin is configured via annotations, its plugin.xml file will not be relevant for the loading process, even if it is present in the distribution.

Lastly, with annotations, a single jar file can contain any number of plugins.  Previously, the limit was one plugin per jar, a restriction imposed by the plugin.xml file.