October 11, 2006
|
This tech tip introduces a couple of key features in an action sequence that will help Pentaho platform users build more flexible, secure and scalable solutions.
The first feature we'll talk about is the {PREPARE:} option in the SQLLookupRule component. The {PREPARE:} option is a much more robust and secure replacement for using replaceable parameters in a SQL statement.
The second feature we'll cover is the IF Statement component, which allows you to conditionally execute one or more actions defined in an action sequence.
So, one final word of caution before we get started. This article requires that you retrieve the latest nightly build of both the Pentaho Design Studio and the PCI. The nightly builds can at times be problematic or just plain broken. They are built specifically to test what has been broken (or fixed) due to code added since the previous nightly build. That being said, we are in a release candidate cycle where we are very careful not to break the builds. If you are faint of heart or just commonly annoyed with temperamental software, you will be better off waiting until our RC3 release candidate is out, due sometime in the next week or so, before you try out this tech tip. If you are the adventurous sort, we encourage you to try this stuff, and are eager to get as many hands testing these new features as possible.
In many cases, when creating a Pentaho solution, you will want to be able to specify parts of an SQL query dynamically. For example, our samples have many places where we use a prompt (secure filter component) to allow the user to select a Region, then use the selected region value in the where clause of a SQLLookupRule component. The SQL for that example would look something like this:
select * from quadrant_actuals where region = '{REGION_VAR}' |
While this works fine on the "happy path", there are a couple of problems with this technique. The first and arguably the biggest problem is that it opens a security hole. If you do not use a secure filter component to be absolutely sure that the value for {REGION_VAR} is coming from a source that you have qualified, then your data can be compromized by a SQL injection attack. The second problem is the syntax used above favors strings, but does not account for date or numeric values.
This is where the prepare feature comes in. By using the PREPARE: prefix on your parameters, you instruct the platform to process this SQL query using a prepared statement. This is different from the original example because in the first example, the SQL string was pre-processed and handed to the SQL processor already built. Which is fine, if you have made sure to protect your parameters using a secure filter component. But sometimes that is not a viable option, as is the case where you want to report on three customers and the list of customers is over 200,000 records. In this instance, you use the PREPARE: option and your Customer parameter is protected. No rogue statements can be injected, and only where clause parameter values can be passed in to the SQL query. Here's what the prepare option looks like:
select * from quadrant_actuals where region = {PREPARE: REGION_VAR} |
Note that PREPARE must be all uppercase, and there should be no spaces on either side of the colon.
This also eliminates the data type syntax issue. Notice that there are no quotes around the PREPARE parameters. Again, the platform knows to build this SQL as a prepared statement, so the parameter values are passed to the SQL processor at runtime and the data types are determined at that time. No predetermined type limitations due to syntax restrictions.
Let's build a small action sequence and watch the prepare option in action.
The following software is what I used to build out this article. I highly recommend you set up these pieces in your environment before you get started. While it may seem like a bit of work, each project has a very simple install and setup.
Visit http://community.pentaho.orgfor instructions on how to get the specific (or the latest) nightly builds of any of the Pentaho projects.
You can also download the solution to this article to follow along if you like, although the article will step through creating each of these files.
Let's get started!
As I mentioned in the "Resources.." section, it's important at this point that you have the PCI, and the Pentaho Design Studio installed and ready to run somewhere on your computer. If you have any questions on how to set these things up, they all have installation and user guides located at http://www.pentaho.org/download/latest.html. From here forward I will assume you have the necessary tools for the exercise.
The first thing we need to do is set up the Design Studio so that we can work and debug directly from the sample solutions that come with the PCI. (If you want to use your own solution directories, that's fine too, just modify the following steps to accommodate your structure).
select * from quadrant_actuals where region = {PREPARE:REGION_VAR} |
That's all there is to it, not very different from the queries you have been using all along, but alot more secure, and flexible! Let's test what we have so far:
You see that the output is pretty much what was expected. The power behind the PREPARE is knowing that the parameters you have defined are safe, and that it's simple to use the same syntax for a number or a date/time field in the database.
I wanted to take this opportunity to demonstrate another very powerful feature, the IF statement. The IF statement is an intuitive conditional component. When the condition you specify evaluates to true, any actions associated with the IF statement are executed. The powerful part is in the evaluation of the conditional statement. A condition evaluates to true under the following circumstances:
Let's try it out in our action sequence.
The region selected is {REGION_VAR}. |
What we have coded here is "If the resultset has data, display the message "The selected region is Eastern"" . It is simply a matter of adding another If Statement component to display a different message if the resultset is empty. I encourage you the reader to extend the solution to display "The region {REGION_VAR} is invalid" if the resultset is empty. I will give you a hint - while much of our syntax is Java or JavaScript based in nature, the (!resultset) syntax will not work for this task. You must evaluate the resultset row count in order to get the right conditional, which looks like
(resultset.getRowCount() == 0). |
That's it for today, you can download the solution files for this article here.