Javascript and Java Object Access in Kettle

March 30, 2006
Submitted by Matt Castors, Chief of Data Integration, Pentaho

Just wanted to share a Kettle tip with you, this week on the subject of the Script Values step. Let's do ... 

Regular Expressions

Today I got a question from a user that likes the regular expressions support in JavaScript, but had troubles using it in the Script Values step.
The problem is that if you grab the String representation of a field using for example NAME.getString(), you can't use do a replace like in this script:

String str = NAME.getString();
String result = str.replace( /aaa/g, "a3" );

This fails because str is of type java.lang.String and is in fact NOT a JavaScript String.

A simple workaround to convert to a JavaScript String is by doing the following:

String str = "" + NAME.getString();
String result = str.replace( /aaa/g, "a3" );

This will work fine and it will make certain text replace scenarios a lot easier to write.

Access Java Objects

You too can have access to all the java objects that are in Kettle's classpath. There are a lot of utility libs and you can add your own if you like.
To activate this, you can simply place the following line of code on top of your javascript:

java

Then you can use all kind of code by specifying the complete name of a class, for example:

var systemDate = new java.util.Date();
dateField.setValue(systemDate);

Row object

If you want to access the complete row that is passing through and not just the individual fields, you can use the row object.
With this you can for example compare the current row with the previous for all kinds of scripting fun...

var prevRow;
var prevName;

if (prevRow != null )
{
  prevName = prevRow.getString("NAME", "");
}
else
{
  prevName = "";
}

prevRow = row.Clone();

Telephone Numbers

Someone asked recently: how can I format telephone numbers?
This is a complex question in itself, but more often then not, it can be solved by using a piece of javascript.
Here is what we could use in Belgium to format telephone numbers, but I guess it can be adapted for other countries as well:

Assuming you have a field called "tel" as input :

----tel.lpad(9, '0');

var newTel;

if (tel.getString().startsWith("02") || tel.getString().startsWith("03"))
{
  newTel=tel.getString().substr(0,2) + "/" + tel.getString().substr(2);
}
else
{
  newTel=tel.getString().substr(0,3) + "/" + tel.getString().substr(3);
}

For example when the input is 54250137, the output would be 054/250137.

I hope you found this useful. Please feel free to add your own samples below or suggest another topic for next week.

Kind regards,

Matt