Extending LibFormula - Creating your own function

Purpose

The purpose of this document is to provide Java developers with a set of simple, cookbook-like instructions for adding custom functions to LibFormula for use in the Pentaho BI Platform, Report Designer, Metadata Editor, etc. It is not intended to be a guide to the internals of LibFormula.

Tools and prerequisites

To add your own function to LibFormula, you should be an experienced Java developer and have:

  • A Java IDE (Eclipse, IntelliJ, NetBeans, etc.)
  • The LibFormula .jar file in your classpath
  • The LibFormula source .jar file in your classpath

Before starting

Before you get started, you need to familiarize yourself with a few key classes and interfaces. After that, you need to examine the requirements of your new function and be prepared to answer some basic questions about what you're going to implement.

Key Classes and Interfaces

  • org.pentaho.reporting.libraries.formula.typing.DefaultTypeRegistry which provides the default implementation of org.pentaho.reporting.libraries.formula.typing.TypeRegistry
  • org.pentaho.reporting.libraries.formula.typing.Type which describes data types
  • org.pentaho.reporting.libraries.formula.lvalues.TypeValuePair
  • org.pentaho.reporting.libraries.formula.util which is a package of helper utilities that you'll use for various conversions
    • org.pentaho.reporting.libraries.formula.util.NumberUtil which contains important methods (especially getAsBigDecimal) for safely getting and converting BigDecimal numbers
    • org.pentaho.reporting.libraries.formula.util.DateUtil which contains important functions for safely getting and converting dates

Basic questions you need to ask yourself before coding

Input parameters

  • How many parameters does your function need? For example, a NOW function would need zero parameters, and a SIN function would need one parameter.
  • Are any of the parameters optional?
  • What is the expected data type of each input? For example, an LN function expects a numeric input parameter, and an UPPERCASE function expects a string input parameter.
  • Does your function operate on a single value, or a sequence of values? For example, a SQUAREROOT function works on a single numeric input, and an AVERAGE function works on a sequence of numbers.

Output parameters

  • What type of result does your function return? For example, a SUBSTRING function returns String data, and a TAN function returns numeric data.
  • Does your function return a single value, or multiple values (array or sequence)? For example, an ACOS function returns a single numeric type, and INDEX returns an array of numeric values.
  • What is the category of your result? Categorization of functions allows user-interfaces to list your function in the right place. At the time of this writing, there were the following pre-defined categories of functions:
    • DateTimeFunctionCategory (e.g. YEAR)
    • FinancialFunctionCategory (e.g. IRR - Internal Rate of Return)
    • InformationFunctionCategory (e.g. ISBLANK)
    • LogicalFunctionCategory (e.g. XOR)
    • MathFunctionCategory (e.g. ATAN)
    • RoundingFunctionCategory (e.g. INT)
    • TextFunctionCategory (e.g. TRIM)
    • UserDefinedFunctionCategory (e.g. NULL)

The LibFormula Cookbook

Extending LibFormula is as simple as implementing two Java interfaces, and creating a couple of .properties files that tell LibFormula about your new function. Here are the two Java interfaces you'll be implementing:

org.pentaho.reporting.libraries.formula.function.Function
org.pentaho.reporting.libraries.formula.function.FunctionDescription

To see a simple example that uses these classes, open up the LibFormula source in your IDE and go to the org.pentaho.reporting.libraries.formula.function.math package. Then open up the classes AbsFunction and AbsFunctionDescription, and the file Abs-Function.properties.

File Descriptions:

  • AbsFunction - This class performs the work. It evaluates the incoming parameters, and returns the result.
  • AbsFunctionDescription - This class describes the function to the outside world. It is the mechanism that allows user interfaces to recognize, categorize, and display your function in the correct place.
  • Abs-Function.properties - This file provides the name and description for your function as well as all the arguments to your function.

Fast-path to implementing Function

  1. Make sure you have a zero-argument constructor
  2. Check your parameter-count first, then check the types of each of your parameters
  3. Throw an EvaluationException if you have problems interpreting the parameters
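
The three steps above can be sketched as follows. This is a minimal illustration, not real LibFormula code: so that it compiles without the LibFormula .jar, it uses hypothetical stand-ins (SketchEvaluationException in place of EvaluationException, and a plain List in place of the parameter object that LibFormula passes to a function). The class name GcdFunctionSketch and its integers-only rule are also assumptions made for the example. For the real signatures, look at AbsFunction in the LibFormula source.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for LibFormula's EvaluationException, so that this
// sketch compiles without the LibFormula .jar.
class SketchEvaluationException extends Exception {
  SketchEvaluationException(String message) {
    super(message);
  }
}

// Illustrates the order of the checks only; it does not implement the real
// Function interface.
class GcdFunctionSketch {
  // Step 1: a zero-argument constructor.
  GcdFunctionSketch() {
  }

  String getCanonicalName() {
    return "GCD";
  }

  // The List stands in for the parameters LibFormula hands to a function.
  BigInteger evaluate(List<?> parameters) throws SketchEvaluationException {
    // Step 2a: check the parameter count first.
    if (parameters.size() != 2) {
      throw new SketchEvaluationException(
          "GCD expects exactly 2 parameters, got " + parameters.size());
    }
    // Step 2b: then check the type of each parameter.
    BigInteger first = toInteger(parameters.get(0));
    BigInteger second = toInteger(parameters.get(1));
    return first.gcd(second);
  }

  // Step 3: throw if a parameter cannot be interpreted.
  private static BigInteger toInteger(Object value) throws SketchEvaluationException {
    if (value instanceof BigInteger) {
      return (BigInteger) value;
    }
    if (value instanceof Integer || value instanceof Long) {
      return BigInteger.valueOf(((Number) value).longValue());
    }
    throw new SketchEvaluationException("Not an integer: " + value);
  }

  public static void main(String[] args) throws SketchEvaluationException {
    GcdFunctionSketch gcd = new GcdFunctionSketch();
    System.out.println(gcd.evaluate(Arrays.asList(12, 18))); // prints 6
    try {
      gcd.evaluate(Arrays.<Object>asList(12, "eighteen"));
    } catch (SketchEvaluationException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

In a real implementation you would implement the Function interface and throw LibFormula's own EvaluationException; the order of the checks stays the same.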

Fast-path to implementing FunctionDescription

  1. Subclass org.pentaho.reporting.libraries.formula.function.AbstractFunctionDescription and override only what's required.
  2. Create a zero-argument constructor, and call the super-class constructor with two parameters:
    • The string you return from your Function's getCanonicalName() method.
    • The name of the .properties file that contains the translatable strings used by user interfaces (explained below), written in package notation rather than as a file path, and without the .properties extension. For AbsFunctionDescription, this is org.pentaho.reporting.libraries.formula.function.math.Abs-Function, which tells LibFormula to find the file Abs-Function.properties in the org.pentaho.reporting.libraries.formula.function.math package.
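
If LibFormula resolves that second argument the way a standard java.util.ResourceBundle does (an assumption worth confirming against AbstractFunctionDescription in the source), package notation maps onto a classpath location as sketched below. BundleNameDemo is a hypothetical name used only for this example; ResourceBundle.Control.toResourceName is part of the standard Java library.

```java
import java.util.ResourceBundle;

class BundleNameDemo {
  // Converts a bundle name in package notation into the classpath path of
  // the matching .properties file, using the standard ResourceBundle rules.
  static String toPropertiesPath(String bundleName) {
    ResourceBundle.Control control =
        ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
    return control.toResourceName(bundleName, "properties");
  }

  public static void main(String[] args) {
    String bundleName =
        "org.pentaho.reporting.libraries.formula.function.math.Abs-Function";
    // Dots become slashes and the .properties extension is appended.
    System.out.println(toPropertiesPath(bundleName));
  }
}
```

This is why the name is given without the extension: the lookup adds it for you.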

Create your function .properties file

In the same package as your Function and FunctionDescription implementations, create the properties file you named in your FunctionDescription zero-argument constructor. This properties file provides names and descriptions for your function, and for all of your parameters. For a zero-parameter function, you'll only need two properties in the file:

display-name
description

Make sure that display-name is the same as the string you return from your Function's getCanonicalName() method.
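
As a quick illustration, here is what the file for a zero-parameter function might contain, together with a check that display-name matches the canonical name. The TAU function, its description text, and the FunctionPropertiesCheck class are all hypothetical; only the two property keys come from the text above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class FunctionPropertiesCheck {
  // Contents of a hypothetical Tau-Function.properties file.
  static final String TAU_PROPERTIES =
      "display-name=TAU\n"
      + "description=Returns the circle constant tau (2 * pi).\n";

  // Parses .properties text held in a String.
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  // True when display-name is present and equals the canonical name.
  static boolean displayNameMatches(Properties props, String canonicalName) {
    return canonicalName.equals(props.getProperty("display-name"));
  }

  public static void main(String[] args) {
    Properties props = parse(TAU_PROPERTIES);
    System.out.println("display-name matches: " + displayNameMatches(props, "TAU"));
    System.out.println("description: " + props.getProperty("description"));
  }
}
```

A check like this fits naturally into a unit test, so a renamed function and a stale properties file are caught early.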

Create libformula.properties

The class LibFormulaBoot looks for every libformula.properties file that sits at the root of the classpath (that is, not inside any package). This file tells LibFormula about your new function. Each new function requires two properties. Each property name is broken into three parts separated by dots:

  1. The fully qualified category. For example - org.pentaho.reporting.libraries.formula.functions.datetime for date/time functions, or org.pentaho.reporting.libraries.formula.functions.math for math functions
  2. The root name of your class (without Function or FunctionDescription). For example, if you created a class called VarianceFunction and VarianceFunctionDescription, then the root name of the class is Variance.
  3. The suffix: class for the Function, or description for the FunctionDescription. The value of the class property is the fully qualified package and class for the Function implementation. The value of the description property is the fully qualified package and class for the FunctionDescription implementation.

For example, let's assume the following:

  • You've created a new GCD (Greatest Common Divisor) function
  • You work for the company Acme
  • Your Function implementation is com.acme.libformula.GreatestCommonDenomFunction
  • Your FunctionDescription implementation is com.acme.libformula.GreatestCommonDenomFunctionDescription

Given the above, your libformula.properties file would have the following two lines in it:

org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom.class=com.acme.libformula.GreatestCommonDenomFunction
org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom.description=com.acme.libformula.GreatestCommonDenomFunctionDescription
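
To make the three-part naming scheme concrete, here is a small sketch that loads those two lines with java.util.Properties and splits each property name into category, root name, and suffix. It only illustrates the naming convention; it is not how LibFormulaBoot reads the file, and the RegistrationKeyDemo class and its splitKey helper are hypothetical names made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.TreeSet;

class RegistrationKeyDemo {
  // The two lines from the example above.
  static final String LIBFORMULA_PROPERTIES =
      "org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom.class="
      + "com.acme.libformula.GreatestCommonDenomFunction\n"
      + "org.pentaho.reporting.libraries.formula.functions.math.GreatestCommonDenom.description="
      + "com.acme.libformula.GreatestCommonDenomFunctionDescription\n";

  // Splits a property name into { category, root name, suffix }.
  static String[] splitKey(String key) {
    int suffixDot = key.lastIndexOf('.');
    int rootDot = key.lastIndexOf('.', suffixDot - 1);
    if (rootDot < 0) {
      throw new IllegalArgumentException(
          "Expected category.RootName.suffix but got: " + key);
    }
    return new String[] {
        key.substring(0, rootDot),
        key.substring(rootDot + 1, suffixDot),
        key.substring(suffixDot + 1)
    };
  }

  // Parses .properties text held in a String.
  static Properties parse(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  public static void main(String[] args) {
    Properties props = parse(LIBFORMULA_PROPERTIES);
    // Sort the keys so the output order is stable.
    for (String key : new TreeSet<>(props.stringPropertyNames())) {
      String[] parts = splitKey(key);
      System.out.println("category:  " + parts[0]);
      System.out.println("root name: " + parts[1]);
      System.out.println("suffix:    " + parts[2]);
      System.out.println("value:     " + props.getProperty(key));
    }
  }
}
```

Both entries share the same category and root name and differ only in the suffix, which is what ties a Function to its FunctionDescription.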

Sample Code

Attached is a sample Java project that implements a simple LibFormula function called sleep (SLEEP). The SLEEP function takes a numeric parameter and calls the Thread.sleep method, which causes the current thread to cease execution for the specified number of milliseconds.

To build the source code, you'll need Apache Ant. Run the resolve target first, followed by the jar target (for example, ant resolve jar).

Summary

The Pentaho LibFormula library was designed from the very beginning to allow extensions in many different ways; adding your own function is only one of many possible extension points. We hope that the extensibility of LibFormula encourages community contributions like financial libraries, algebraic expressions, and other useful operations.