Classic-DataSources

Data Processing and Functions

Introduction to the Data-Layer

Pentaho Reporting Classic strictly separates the data processing from the layout processing.

The Data-Layer provides a unified interface to the various data-sources and encapsulates all implementation specific details. In Pentaho Reporting Classic, there are three general classes of data-sources: (1) Tables provide Pentaho Reporting Classic with mass-data. Table-data is usually read from a database or similar data storage. (2) Parameters allow to feed single data values from the outside. (3) Functions and Expressions as third data-source compute values using all other data-sources as input.

General Architecture

The datarow is the central interface for accessing values from the various data sources. The datarow is a stacked and strictly separated structure. A global view grants access to the local data and prevents the direct manipulation of the datastructures of the backend.

The backend itself is a collection of all datasources. In the default implementation there are four different datasource types available:

  • Parameters
  • Report Data (Tables queried from the DataFactories)
  • Functions
  • Imported Parameters

Datarows are layered. Each report (either a master or a subreport) opens a new data-row context. Data from lower layers can be accessed from higher layers through the imported parameters mechanism.

Parameters

Each master and subreport can take an unlimited number of parameters. When a report process is started, the parameters are used to perform the query. Parameters are always available and it is assumed, that they do not change during the report processing.

Parameter values for subreports are read from the dataset of the parent report. To declare parameters, you can either list all parameters you want to import from the parent to the sub-report or you declare a default mapping, which imports all fields of the parent into the sub-report. In case you define the parameters manually, you can also rename the parameters on the import. Although the default-mapping is an easy short-cut, it can severely impact the performance of the sub-report processing, as it increases the chances to falsely invalidate all caches.

A subreport that defines export parameters maps those parameters into the datarow of the master report. This way deep traversing functions can access these parameters for their own computations. Export parameters stay available after the sub-report has been fully processed. As sub-reports are always executed after the root-level band, to which they have been attached, is printed, exported parameters will not be availbe in the sub-reports parent-band.

Report-Data and DataFactories

The report processing always starts with a (possibly empty) set of parameters and a query name. The parameters correspond to the report properties set on the JFreeReport object. Each parameter automaticly appears as static column in the data row. Sub-Reports have no explicit parameter set, they receive all parameters from the current data row context. Every report and subreport has a query name. The name, along with the current values from the data row, is used by the DataFactory to query the underlying datasource and to return a valid ReportData object.

The format and contents of the 'query'-string is not defined and depends on the actual DataFactory implementation. It can be anything, from valid SQL-statement to an handle to an arbitary complex operation.

Classical reporting is driven by uniform mass-data given in two dimensional tables. As the reporting process advances, this data is merged into the report definition. Additional computations. aggregation and grouping can be applied to provide additional information derived from that original data.

The Report-Engine makes two assumptions when dealing with tabular data.

  1. The data is constant. Reading the same position multiple times must always return the same result. The number or order of the rows does never change.
  2. The data is randomly accessible. The current cursor position can change to any previously read row at any time.
  3. The data that is retained is sorted in a suitable way. The reporting engine will not sort the data; it is the responsibility of the report author to provide a reasonably sorted data-set that fits the declared groupings of the report.

All report data factories described in this document can be created using the API or can be instantiated from the XML parser.

SQL-Data-Factories

The SQL-DataFactories allow Pentaho Reporting Classic to query JDBC-DataSources. By default, the query-string is an alias for the real SQL-Query-string.

SQLDataFactories can be defined in XML and can be referenced directly from the report definition. The queries can be parametrized. The engine autmatically translates named parameters into positional parameters. The actual query is always done using PreparedStatements, JDBC-Drivers which do not support prepared statements cannot be used with this implementation.

To provide support for named parameters, the data factories use a special parametrisation syntax. Parameters are specified using '${parameter}'. You can escape the special characters used by the parameterization mechanism using the backslash, if any of these characters is also part of your query.

Example: SELECT * FROM table WHERE column = ${parameter}

This query string is translated into a prepared statement and the value read from the column 'parameter' from the report parameters is used as argument for the query.

Pentaho Reporting Classic comes with two implementations of the ReportDataFactory. The SimpleSQLReportDataFactory expects a valid SQL- Query (optionally with parameter specifications) and executes these queries directly. The SQLReportDataFactory provides a naming mechanism so that each query is addressed using asymtotic name instead of the raw SQL string.

The SQLReportDatafactory can be fully defined using an XML-document.

All SQL-ReportDataFactories need a Connection Provider to gain access to a valid JDBC Connection object. We provide two implementations for the ConnectionProvider-interface

  1. StaticConnectionProvider carries an user-provided connection object. The connection contained in the provider must be open and will be controlled by the data factory.
  2. DriverConnectionProvider: A JDBC-Driver implementation is used to create a connection to the database.

Code Examples:

Defining a SQL-DataSource in XML:

<?xml version="1.0"?>
<sql-datasource
        xmlns="http://jfreereport.sourceforge.net/namespaces/datasources/sql"
        xmlns:html="http://www.w3.org/1999/xhtml">
  <connection>
    <driver>org.hsqldb.jdbcDriver</driver>
    <url>jdbc:hsqldb:./sql/sampledata</url>
    <properties>
      <property name="user">sa</property>
      <property name="pass"></property>
    </properties>
  </connection>

  <!-- First query: get all regions .. -->
  <query name="default">
      SELECT DISTINCT
           QUADRANT_ACTUALS.REGION
      FROM
           QUADRANT_ACTUALS
      ORDER BY
          REGION
  </query>

  <query name="actuals-by-region">
      SELECT
           QUADRANT_ACTUALS.REGION,
           QUADRANT_ACTUALS.DEPARTMENT,
           QUADRANT_ACTUALS.POSITIONTITLE,
           QUADRANT_ACTUALS.ACTUAL,
           QUADRANT_ACTUALS.BUDGET,
           QUADRANT_ACTUALS.VARIANCE
      FROM
           QUADRANT_ACTUALS
      WHERE
          REGION = ${REGION}
      ORDER BY
          REGION, DEPARTMENT, POSITIONTITLE
  </query>
</sql-datasource>

The XML data-source definition can be referenced from an XML report definition.

<report
    xmlns="http://jfreereport.sourceforge.net/namespaces/reports/legacy/simple">
  ..
  <data-factory href="sql-subreport.sqlds"/>
  ..
</report>

Parsing the file can be done using the common LibLoader code:

    JFreeReport report; // created elsewhere
    Object sourceObject; // either a valid URL, File or String object

    ResourceManager manager = new ResourceManager();
    Resource resource = manager.createDirectly (sourceObject, DataFactory.class);
    DataFactory dataFactory = (DataFactory) resource.getResource();
    report.setDataFactory(dataFactory);

The SQLDataFactory can also be created using the API.

    DriverConnectionProvider provider = new DriverConnectionProvider();
    provider.setDriver("your.database.jdbc.Driver");
    provider.setProperty("user", "joe_user");
    provider.setProperty("pass", "secret");
    provider.setUrl("jdbc:yourdb://host/database");
    SQLReportDataFactory dataFactory = new SQLReportDataFactory(provider);
    dataFactory.setQuery("default",
        "SELECT DISTINCT REGION FROM QUADRANT_ACTUALS");
    dataFactory.setQuery("actuals-by-region",
        "SELECT * FROM QUADRANT_ACTUALS WHERE REGION=${REGION}");
    report.setDataFactory(dataFactory);

Static-DataFactory

A static datafactory uses the Java-Reflection-API to call a method of a Java-Bean or a static method on any Java-Class. The query can be parameterized. The StaticDataSource translates all parameters into parameters of the method or constructor that is called. For ease of use, you can omit the parameter-paranthesis if the parameter list for the method is empty.

The query can be in one of the following formats:

  • full-qualified-classname#methodName(Parameters)
  • full-qualified-classname(constructorparams)#methodName(Parameters)
  • full-qualified-classname(constructorparams)

Example:

package org.jfree.report.test;

public class StaticDataSourceFactoryTestSupport extends DefaultTableModel
{
  /**
   * Constructs a default <code>DefaultTableModel</code> which is a table of
   * zero columns and zero rows.
   */
  public StaticDataSourceFactoryTestSupport()
  {
  }

  /**
   * A parameterized constructor.
   */
  public StaticDataSourceFactoryTestSupport(String parameter, int parameter2)
  {
  }

  /**
   * A parameterized method.
   */
  public TableModel createParametrizedTableModel (int i1, String s1)
  {
    TestCase.assertEquals("Passing primitive parameters failed", 5, i1);
    TestCase.assertEquals("Passing object parameters failed", "test", s1);
    return new DefaultTableModel();
  }

  /**
   * A un-parameterized method.
   */
  public TableModel createSimpleTableModel ()
  {
    return new DefaultTableModel();
  }

  public static TableModel createStaticSimpleTableModel ()
  {
  }

  public static TableModel createStaticParametrizedTableModel (int i1, String s1)
  {
  }
}

The Static-Datafactory will accept the following query strings to access the bean above:

org.jfree.report.test.StaticDataSourceFactoryTestSupport
org.jfree.report.test.StaticDataSourceFactoryTestSupport()

Calls the non-parameter constructor. As the class itself is a TableModel, the constructed object is used as new data-table.

org.jfree.report.test.StaticDataSourceFactoryTestSupport#createSimpleTableModel
org.jfree.report.test.StaticDataSourceFactoryTestSupport()#createSimpleTableModel
org.jfree.report.test.StaticDataSourceFactoryTestSupport#createSimpleTableModel()
org.jfree.report.test.StaticDataSourceFactoryTestSupport()#createSimpleTableModel()

Calls the non-parameter constructor to create the bean and then calls the "createSimpleTableModel" method on the bean. The return value of this method is used as data-table.

org.jfree.report.test.StaticDataSourceFactoryTestSupport#createParametrizedTableModel(parameter2,parameter1)
org.jfree.report.test.StaticDataSourceFactoryTestSupport()#createParametrizedTableModel(parameter2,parameter1)

Parameterizes the method call.

org.jfree.report.test.StaticDataSourceFactoryTestSupport(parameter1,parameter2)
org.jfree.report.test.StaticDataSourceFactoryTestSupport(parameter1,parameter2)#createSimpleTableModel
org.jfree.report.test.StaticDataSourceFactoryTestSupport(parameter1,parameter2)#createSimpleTableModel()
org.jfree.report.test.StaticDataSourceFactoryTestSupport(parameter1,parameter2)#createParametrizedTableModel(parameter2,parameter1)

Same as above, but uses the parameterized constructor instead.

The static-datasource-factory can be defined in XML and using the API.

<static-datasource xmlns="http://jfreereport.sourceforge.net/namespaces/datasources/static">
  <query name="default">org.jfree.report.demo.features.datasource.StaticDataSourceDemo#createMainQuery</query>
  <query name="sub-report">org.jfree.report.demo.features.datasource.StaticDataSourceDemo#createSubReportQuery(parameter)</query>
</static-datasource>
    NamedStaticDataFactory nsdf = new NamedStaticDataFactory();
    nsdf.setQuery("default", "org.jfree.report.demo.features.datasource.StaticDataSourceDemo#createMainQuery");
    nsdf.setQuery("sub-report", "org.jfree.report.demo.features.datasource.StaticDataSourceDemo#createSubReportQuery(parameter)");
    report.setDataFactory(nsdf);

Table-Data-Factories

Older versions of the Pentaho-Reporting Classic engine relied on the user to perform the database queries and to supply a suitable data-table to the report engine. If your application already provides TableModels which meet the engines requirements and if you do not need parameterization, then you can use the TableModelDataFactory to feed these tablemodels into the report engine.