Configuring for Oracle 10g

Important Overview

This document will walk through the steps needed to use Oracle 10g as the repository RDBMS for the Pentaho BI Server. Note that this is the database that the platform uses for the content, solution, and admin data storage. These instructions do not pertain to client Oracle databases upon which you might be reporting, analyzing, etc.

The steps outlined here are a guide for manually configuring the Pentaho BI Server to use Oracle as its RDBMS repository in a JBoss application server environment. Specifically, I will use the Pentaho BI Server PCI Demo as a sample server. You should be able to tailor these instructions for your individual use case. Please feel free to add any individual configurations to this Wiki page that deviate from this specific environment.

Caveat Regarding PCI Demo Sample Data

Also note that while this guide will use the Pentaho PCI demo as an example server, this does not include detailed setup instructions for using an Oracle database for the sample data that feeds the demo's sample application. This is an important distinction! We are interested in using Oracle as the backend repository to the server; the sample data and the samples that ship with the PCI demo are not required as part of that exercise. That being said, I do give some guidance on setting up the sample data on Oracle should you choose to, with the caveat that the sample action sequences are not tailored to Oracle, so while some actions run fine, others will fail with Oracle specific SQL issues.

Oracle OCI Versus Oracle Thin

The configuration files referenced here are set to use the Oracle OCI driver implementation. Oracle recommends that applications use the OCI driver. Should you choose to change the connection information to use the thin driver, be warned that we have not tested that configuration.

Prerequisites

To follow this guide step by step, you should have the following software/stuff on your machine. Download what you need to follow along.

  1. Pentaho Open BI Suite, Pre-configured Installation (PCI) - Hypersonic, version 1.6 RC2 or later.
  2. The latest Pentaho project source files, available from our SVN repository. It doesn't matter where you toss these files, we are just going to be copying a few config files out of the resource directories, so no need for the project to compile.
  3. Quartz Scheduler, full distribution, version 1.5.2
  4. Oracle's JDBC driver for Oracle 9i/10g, ojdbc14.jar
  5. Administrative access to an Oracle 10g database.

Configuration

1. Create the Oracle Databases

There are two mandatory databases for the Pentaho BI Server, the Hibernate database and the Quartz database.

The Hibernate database is used for storing content and runtime data for the Pentaho BI server. The Quartz database is used for job scheduling for solutions in the server.

NOTE: The Single Database solution

  1. The current SQL script for PCI sample data provides a single database solution. That is you do not need to create separate databases for quartz and hibernate and can work off of a single database. You would need to:

            a. Modify the sampledata-ora.sql to point to the correct database and execute the script.
            b. Modify the hibernate configuration (step 2 below) and the datasouce definition files (step 4 below) file to point to this single database instance.

The Hibernate Database

  1. Create a database in Oracle with the SID Hibernate.
  2. The default user in the configuration files for the server is hibuser, with a default password of password. Either create this user now with admin level privileges to the Hibernate database, or plan to change the user and password in the configuration files later (I'll tell you when (smile) ).

Once we have the server configured to use your Oracle databases, the Hibernate database will be automagically populated on server startup, so no load scripts are necessary for this database.

The Quartz Database

  1. Create a database in Oracle with the SID Quartz.
  2. The default user in the configuration files for the server is hibuser, with a default password of password. Either create this user now with admin level privileges to the Quartz database, or plan to change the user and password in the configuration files later.

To build and load the Quartz database, follow the JDBCJobStore instructions from the Quartz website for configuring Quartz to use an RDBMS.

Where does Pentaho hide the quartz.properties file for the server?

The setup instructions for Quartz requires that you specify the Oracle delegate in the quartz.properties file. You can find this file in the pentaho-solutions/system/quartz directory under the PCI demo root folder, or in your custom solutions folder.

Additional jar file needed for Quartz on Oracle

The Oracle delegate class for Quartz does not reside in the main Quartz jar! You need to copy the quartz-oracle-1.5.2.jar from the full Quartz distribution into the following directory in the PCI demo: jboss/server/default/deploy/pentaho.war/WEB-INF/lib . NOTE: Tomcat requires the jar to be placed in common/lib directory for example target-tomcatpreconfiguredinstall/common/lib and also in the WEB-INF/lib

What about Oracle 9i?

The Oracle 9i database itself is capable of being used as a hibernate repository for Pentaho. The Oracle 9i JDBC driver however is not up to the task. There are various problems with driver-imposed limits on the length of Blob fields and LongRaw fields that don't exist in the Oracle 10g driver. If you're using Oracle 9i for whatever reason, the only way we've been successful is by using the Oracle 10g JDBC thin driver.


Optional: The SampleData Database

If you are interested in setting up the PCI sample data in Oracle, you can create a Sampledata database, and use Pentaho Data Integration to migrate the Hypersonic sample data into Oracle. The default users for the samples are pentaho_user and pentaho_admin (admin privileges are necessary for this user), with default passwords of password.

This is NOT required for using Oracle as the repository, only an option if you wish to try to run the samples in the PCI demo.

2. Hibernate Configuration for Oracle in the Pentaho BI Server

The Pentaho BI Server uses Hibernate object persistence to talk to the repository. There are configuration files that need to be copied into the server in order for it to recognize your Oracle database.

  1. In the Pentaho source project (you should have retrieved this from the SVN repository already) navigate to pentaho/server/pentaho/res/hibernate/oracle10g.
  2. Open the hibernate.cfg.xml file, in the root of the oracle10g folder. This is where your connect information is specified.
  3. Check that the connection.url specifies the proper host and port (defaulted to localhost and 1521).
  4. If you changed the default username and password for the Hibernate database, specify the new values in this file.
  5. Save and close the file.
  6. select all files the oracle10g directory and copy them to the following directory in the PCI demo: jboss/server/default/deploy/pentaho.war/WEB-INF/classes. This will overlay the default files already in the server - choose yes to overwrite if you are prompted.

3. Add the Oracle JDBC Driver to the Pentaho BI Server

Pay close attention to what driver version you use

Check out the PentahoDoc:troubleshooting tip below to avoid a known issue with mismatching Oracle's driver versions!

You must add the JDBC driver for Oracle to the PCI Demo for it to have the classes to talk to Oracle. Drop the Oracle JDBC driver jar (ojdbc14.jar) into the following directory in the PCI demo: jboss/server/default/lib. For tomcat it is: target-tomcatpreconfiguredinstall/common/lib

4. Add the Datasource Definitions to the Pentaho BI Server

You now need to tell JBoss where to find your Oracle databases. These data source files are the JNDI mappings that the server uses to find your databases.

  1. In the Pentaho source project (you should have retrieved this from the SVN repository already) navigate to pentaho/server/pentaho/res/jboss/datasources/oracle10g.
  2. Notice that there are MANY datasource files in this directory. Most of them are used for the sample solutions, and are not necessary for this exercise. The two you DO need are PentahoHibernate-ds.xml and quartz-ds.xml.
  3. Before copying them into the PCI, open each file and edit the connection information to match your databases, as we did in the Hibernate Configuration step, checking the connect string, username and password.
  4. Once you are satisfied the connection information is correct, copy the two datasource files to the following directory in the PCI demo: jboss/server/default/deploy.

    What about the rest of the datasource files?

    If you are setting up the sample data in Oracle, you should go through the rest of the datasource files, verify the connection information and copy them over to the same location as specified above.

5. Change the quartz.properties file to use:

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.oracle.OracleDelegate

instead of

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate

6. Make Sure You Are Using the RDBMS Based Repository Server Configuration

The Pentaho BI Server allows two options for storage - either a file-based repository or an RDBMS-based repository. Obviously, if you are configuring Oracle as your repository, you want to make sure that the server is configured for RDBMS.

  1. Navigate to the pentaho-solutions/system directory in the PCI demo (or your custom solutions folder).
  2. Open the pentaho.xml file.
  3. Make sure the following line is commented out:
    <ISolutionRepository scope="session">org.pentaho.repository.filebased.solution.SolutionRepository</ISolutionRepository>
    
  4. Make sure the following line is uncommented:
    <ISolutionRepository scope="session">com.pentaho.repository.dbbased.solution.SolutionRepository</ISolutionRepository>
    

Additional Steps For Subscription Users

Database-based Audit Reporting

There are additional files that need to be added for audit reporting if you are modifying a subscription deployment. These files can be found on the subscription ftp site under the /VERSION/server/advanced folder, in the oracle_audit_patch archive. Unpack the oracle_audit_patch archive file to the jboss/server/default/deploy/pentaho.war/WEB-INF/classes directory of your deployment.

You will also need to modify the insertSQL found in the pentaho.xml file (in the root of your solution/system directory).

  1. Locate and open for edit your pentaho.xml file.
  2. Find the following XML:
    <auditConnection>
      <insertSQL>
      INSERT INTO PRO_AUDIT (JOB_ID, INST_ID, OBJ_ID, OBJ_TYPE, ACTOR, MESSAGE_TYPE, MESSAGE_NAME,
    MESSAGE_TEXT_VALUE, MESSAGE_NUM_VALUE, DURATION, AUDIT_TIME) values (?,?,?,?,?,?,?,?,?,?,?)
      </insertSQL>
      <JNDI>Hibernate</JNDI>
    </auditConnection>
    
  3. Replace the SQL query in that XML block with the following query:
    ...
    INSERT INTO PRO_AUDIT (AUDIT_ID, JOB_ID, INST_ID, OBJ_ID, OBJ_TYPE, ACTOR, MESSAGE_TYPE, MESSAGE_NAME,
    MESSAGE_TEXT_VALUE, MESSAGE_NUM_VALUE, DURATION, AUDIT_TIME) values (HIBERNATE_SEQUENCE.nextval,?,?,?,?,?,?,?,?,?,?,?)
    ...
    

Voila!

That should do it! Start the Pentaho BI Server, and you should see messages in the server console that indicate the Hibernate database is being built. If you experience any problems, please report them in our forums, and document the solutions here!

Troubleshooting

In setting this up, I ran into several Oracle related problems. Admittedly, some were my own Oracle-newbie fault. While in other cases, the blame could be put squarely on Oracle. I've documented some of my nightmares below:

Feel free to share your stories as well, if it will save other Pentaho brother some terror.

JVM Crash on Server Startup: Access Violation in OraClient10.Dll

This access violation occurs when the Oracle JDBC driver you are using in the server doesn't match the driver that your Oracle database is using. This match is necessary since we are using the Type2-OCI driver, where the JNI layer seems to be reliant on the driver classes matching.

Here are some details from Oracle's and Sun's forums:

I found this problem after upgrading my Oracle JDBC driver. It seems that I miraculously chose the correct version of the driver the first time around that EXACTLY matched the driver in my Oracle10g database installation. What are the chances, eh? Anyway, when I upgraded, the drivers no longer matched and my server went up in a binary plume of smoke.

So, the solution is to find the driver in your database installation; figure out the version; and drop that version in to your server's lib directory.

Error When Running a Simple Report or Xaction That Access the Oracle JNDI Connection

Executing a simple action sequence to retrieve data from an Oracle database yields the following error:
Error: ConnectFactory.ERROR_0001 - Database connection could not be established to: (the JNDI Connection Name goes here)
- org.jboss.util.NestedSQLException: Unexpected throwable while trying to create a connection: null;

Further Investigation of the server.log shows:

2007-09-26 15:51:53,133 WARN  [PentahoDoc:org.jboss.resource.connectionmanager.JBossManagedConnectionPool] Throwable while attempting to get a new connection: null

java.lang.NoSuchMethodError: oracle.i18n.util.GDKOracleMetaData.getDataPath()Ljava/lang/String;

            at oracle.i18n.text.converter.CharacterConverterOGS.<clinit>(CharacterConverterOGS.java:63)
            at java.lang.Class.forName0(Native Method)
            at java.lang.Class.forName(Unknown Source)
            at oracle.sql.CharacterSet.<clinit>(CharacterSet.java:351)
            at oracle.jdbc.driver.DBConversion.<init>(DBConversion.java:126)
            at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:733)
            at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:228)
            at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:348)
            at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:151)
            at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
            at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:563)
            at org.jboss.resource.adapter.jdbc.local.LocalManagedConnectionFactory.createManagedConnection

Cause: The default ojdbc14.jar file that Pentaho provides with the Pentaho BI Platform was replaced with an older version of that file and conflicted with the orai18n.jar file that was already provided in the directory.

More information on ora18n.jar can be found here: http://oraclelon1.oracle.com/docs/cd/B14117_01/java.101/b10979/global.htm

Solution 1: Remove the ori18n.jar file and restart the application server, note that you will lose Globalization Support for character sets retrieved from or inserted to the database.

Solution 2: Replace the ori18n.jar file with the correct version that is compatible with the version of the ojdbc14.jar file you are using if the default one is replaced.

Solution 3: Don't replace the provided Oracle JDBC Driver that comes with the Pentaho BI Platform.

OCI Connect String For Oracle9i Differs From 10g

When connecting to an Oracle9i repository, we had issues with the connect string in the resource files from the Pentaho project in Subversion. The connect string that works with Oracle9i requires the name for the database in the tnsnames.ora file, rather then the host:port:SID specification.

Example Oracle 10g connect string: jdbc:oracle:oci:@localhost:1521:HIBERNAT
Example Oracle 9i connect string: jdbc:oracle:oci:@HIBERNAT_LOCALHOST