04. MQLQuery Architecture Documentation

The MQLQuery and pms.mql packages have been deprecated, this documentation is relevant to Metadata releases 3.0 and earlier.

Important Classes and Interfaces within the org.pentaho.pms.mql package:

MQLQuery

This is the public facing interface of the MQL SQL generation code. This interface defines the basic object model of mql queries: selections, constraints, and orderings. The getQuery() call returns the necessary sql.

MQLQueryFactory

This is the factory class used by other systems for determining the instance of MQLQuery to use when generating SQL from MQL.

MQLQueryImpl

This is the default implementation of MQLQuery. This class contains the necessary members and methods for serializing and de-serializing MQL. It delegates SQL generation to the SQLGenerator class.

Selection

This is a single Business Column selection within an MQL query

WhereCondition

This is an open formula expression which constrains the MQL query

OrderBy

This is a single business column and type of ordering used by the MQL query

SQLGenerator

This class generates SQL based on MQL. The SQL Generator determines the join path for the selected business columns, and also determines the necessary group by organization if aggregates are used.

PMSFormula

PMSFormula translates an open formula expression into SQL. This is used in many contexts, including constraints, exact business column definitions, and complex join definitions.

Path

This class is used as a data structure when determining the correct join path of selected business columns.

Query

This is an interface returned by MQLQuery that allows access to the SQL generated. It also provides an API for generating Extended Metadata.

MappedQuery

This is the implementation of the Query interface. the SQL generated by MQL uses COL0 to COLN for sql columns, and these columns are mapped back to Business Columns via the ExtendedMetaData.

ExtendedMetaData

This class enhances MemoryMetaData by providing more information on each column selected by the query.

MQL Query Model

Domain Type

the type of domain, either relational or olap. Olap is not supported at this time.

Domain

the Metadata domain selected

Model

the Metadata Business Model selected within the domain

Options

options allow for customization of the SQL generation
disableDistinct - disables the placement of DISTINCT within the SQL SELECT statement.

Selections

Selections are the set of business columns returned in the generated SQL query

Constraints

Constraints are open formula expressions which constrain the SQL query

Orders

Orders are used for sorting the SQL generated.

See the document Pentaho Metadata MQL Schema for more details on the XML configuration.

Important Algorithms

Deserialization of MQL XML

The MQLQueryImpl class provides the functionality to parse MQL XML. First an XML DOM is generated from a String, and then that DOM is traversed via the fromXML() methods. The dom is broken into five segments.

The first section includes the parsing of domain_type, domain_id, and model_id. These are all required.

The options element contains customized options that impact the generation of the SQL. The only option defined within MQLQueryImpl is disable_distinct.

Selection elements contain the business column selections for the query in the column child element. Selection objects are created for each xml element.

Constraint elements contain an open formula expression in the condition child element, and an operator definition as a child element. A new WhereCondition object is generated for each constraint.

Order elements contain direction, view_id, and column_id child elements. direction may equal "asc" or "desc", and the view_id and column_id elements must resolve to an existing business column. An OrderBy object is generated for each order element.

Serialization of MQL XML

The MQLQueryImpl class provides functionality to generate MQL XML based on it's object model of selections, constraints, orders, and options. The getXML() call first generates an XML DOM, and then prints that out to a string.

SQL Generation

The SQLGeneration class combined with the PMSFormula class contain the code necessary for generating SQL based on the MQL model. First, an optimal join path is determined that contains all selected and referenced business tables. From there, the SQL is generated in seven generation steps. These generation steps include Select, From, Join, Where, Group By, Having, and Order By.

Join Path Algorithm

The getShortestPathBetween() method determines the shortest path between the list of included tables within the MQL Query.

The algorithm first determines if there is an existing path between all selected tables. If not, the algorithm continues to add new tables to the list until a path is discovered. If more than one path is available with a certain number of tables, the algorithm uses the relative size values if specified to determine which path to traverse in the SQL Join.

Select

The generateSelect() method traverses the set of selections and renders those selections to the SQL string buffer. This method determines the SQL column aliases. It also calls getBusinessColumnSQL() which renders each individual business column in three different ways. Either as an MQL Formula, an aggregate function, or as a standard SQL column.

From

The generateFrom() method traverses the set of included business tables and renders those tables to the SQL string buffer.

Joins

The generateJoins() method traverses the set of included RelationMetas and renders them to the SQL string buffer via the getJoin() method

Where

The generateWhere() method renders all WhereCondition's that are not part of an aggregate column.

Group By

The generateGroupBy() method renders the non aggregate columns in this section, creating the correct SQL for group by statements.

Having

The generateHaving() method renders all the Where Conditions of aggregate columns.

Order By

The generateOrderBy() method renders all the Order By business columns

PMSFormula Class

The PMSFormula class is used in various contexts while generating SQL from MQL. These contexts include Business Column Exact Formula rendering, Relation Complex Join rendering, and Constraint rendering.

Business Column Exact Formula Rendering

Used within the SQLGeneration.getBusinessColumnSQL() method when a business column is configured as "exact", the formula specified as part of the business column is parsed as an open formula and then rendered as SQL.

Relation Complex Join Rendering

Used within the SQLGeneration.getJoin() method when a Relation is set to complex, the complex join string is parsed as open formula and rendered to SQL.

Constraint Rendering

Constraints are rendered using PMSFormula. Each WhereCondition class contains a PMSFormula object which manages the SQL rendering. Constraints are parsed and resolved before SQL generation, to determine which business tables are referenced. This is an important step so that the Join algorithm contains the correct set of tables.

 Extending MQL 

Specifying your own MQLQuery class

The default MQLQuery behavior may be extended or replaced.  To accomplish this:

Step 1

Extend MQLQueryImpl or implement the MQLQuery interface

Step 2

Make sure to have a constructor with the following signature:
            (String XML, DatabaseMeta databaseMeta, String locale, CwmSchemaFactoryInterface factory)

Step 3

Either pass in the fully qualified class name into MQLQueryFactory.getMQLQuery(),
            or set the system property "org.pentaho.pms.mql.MQLQueryClassName" to your
            fully qualified path.

Extending MQLQueryImpl

To easily add modifications or additional information to the MQL XML syntax, one must extend MQLQueryImpl, SQLGenerator and PMSFormula.  The MQL Query Model can be extended by overriding specific methods within these classes.  In the org.pentaho.pms.example package, an AdvancedMQLQuery class is available that demonstrates the extension of Selections XML parsing.