PA_CR_PA-3.7.0.0-752_impalad-1.3.0_simba_JDBC4-2.5.5.1007

Preface

This report covers the following products.

  • Pentaho Analysis 5.1.0.0-752 (3.7.0.0-752)
  • Impala impalad version cdh5-1.3.0 RELEASE
  • Simba driver for Impala JDBC4 2.5.5.1007

Feature               Status    Notes
Degenerate Schemas
Star Schemas
Snowflake Schemas     Failure   Not all join structures are handled properly. The JDBC driver fails to parse the queries for certain types of schemas.
Filters               Failure   Complex compound filters produce wrong numbers.
Top Count
Aggregation Tables
Null Values & Keys
Inline Tables
Distinct Count        Warning   Not all forms of distinct count are supported, although the supported subset is sufficient for Mondrian.
Grouping Sets         Warning   Grouping sets are not supported.

Failures

Complex filtering clauses cause wrong numbers to be returned

Symptom

When a SQL query contains a complex compound predicate, the driver parses it incorrectly and the query returns wrong numbers.

Failed tests

Test

Result

org.pentaho.mondrian.tck.NativeFilterTest.testCompoundPredicate

Using a compound predicate like the one below produces wrong numbers.

select sum(sales_fact_1997.unit_sales) as m0
  from store store
     , product product
     , sales_fact_1997 sales_fact_1997
 where sales_fact_1997.store_id  = store.store_id
   and sales_fact_1997.product_id = product.product_id
   and ((
           store.store_country     = 'USA'
       and store.first_opened_date = '1981-01-03'
       and store.last_remodel_date = TIMESTAMP '1991-03-13 00:00:00'
       )
    or (
         store.store_city          = 'San Diego'
     and store.store_state         = 'CA'
       )
    or (
         store.store_state         = 'WA'
     and store.store_sqft          > 50000
     and product.gross_weight      = 17.1
       )
    or (
         store.store_sqft is null
       )
    )


org.junit.ComparisonFailure: Row content doesn't match. expected:<[60,662]> but was:<[39,329]>
	at org.junit.Assert.assertEquals(Assert.java:115)
	at org.pentaho.mondrian.tck.SqlExpectation.validateRows(SqlExpectation.java:142)
	at org.pentaho.mondrian.tck.SqlExpectation.verify(SqlExpectation.java:64)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:75)
	at org.pentaho.mondrian.tck.NativeFilterTest.testCompoundPredicate(NativeFilterTest.java:224)

org.pentaho.mondrian.tck.NativeFilterTest.testCompoundPredicateNoJoinsDateLiteralSyntax

Using a compound predicate like the one below produces wrong numbers.

select sum(store.store_sqft) as m0
  from store store
 where (
           store.store_country     = 'USA'
       and store.first_opened_date = '1981-01-03'
       and store.last_remodel_date = TIMESTAMP '1991-03-13 00:00:00'
       )
    or (
         store.store_city        = 'San Diego'
     and store.store_state       = 'CA'
       )
    or (
         store.store_state       = 'WA'
     and store.store_sqft        > 30000
       )
    or ( store.store_sqft is null)


org.junit.ComparisonFailure: Row content doesn't match. expected:<1[27,510]> but was:<1[03822]>
	at org.junit.Assert.assertEquals(Assert.java:115)
	at org.pentaho.mondrian.tck.SqlExpectation.validateRows(SqlExpectation.java:142)
	at org.pentaho.mondrian.tck.SqlExpectation.verify(SqlExpectation.java:64)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:75)
	at org.pentaho.mondrian.tck.NativeFilterTest.testCompoundPredicateNoJoinsDateLiteralSyntax(NativeFilterTest.java:230)

Support for database joins

Symptom

The database does not support all forms of joins, which prevents some types of schemas from working on the database evaluated. In this case, the driver fails on implicit crossjoins and on complex snowflake schemas, which impairs Mondrian's ability to use snowflake schemas with this driver.

Failed tests

Test

Result

org.pentaho.mondrian.tck.JoinTest.testThreeWayJoinPlusDoubleStarLeaf89
org.pentaho.mondrian.tck.JoinTest.testThreeWayJoinPlusDoubleStarLeaf92

The driver throws an exception when the following arrangement of tables is used. The error occurs with both SQL-89 and SQL-92 style joins.

D1'<--
      \
D1''<--D1<--      ----->D2
            \    /
             Fact
D3 <-------/



Caused by: java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: Unqualified column reference 'description' is ambiguous), Query: SELECT `warehouse_class`.`description`, `wc2`.`description`, `warehouse`.`warehouse_id`, `inventory_fact_1997`.`store_id`, `time_by_day`.`time_id`, `product`.`product_id` FROM `default`.`warehouse`, `default`.`inventory_fact_1997`, `default`.`product`, `default`.`time_by_day`, `default`.`warehouse_class`, `default`.`warehouse_class` `wc2` WHERE ((`wc2`.`warehouse_class_id` = `warehouse`.`warehouse_class_id`) AND ((`warehouse_class`.`warehouse_class_id` = `warehouse`.`warehouse_class_id`) AND ((`inventory_fact_1997`.`product_id` = `product`.`product_id`) AND ((`inventory_fact_1997`.`warehouse_id` = `warehouse`.`warehouse_id`) AND (`inventory_fact_1997`.`time_id` = `time_by_day`.`time_id`))))) ORDER BY `warehouse`.`warehouse_id` ASC, `inventory_fact_1997`.`store_id` ASC, `time_by_day`.`time_id` ASC, `product`.`product_id` ASC.
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatementInternal(HS2Client.java:832)
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatement(HS2Client.java:287)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(HiveJDBCNativeQueryExecutor.java:487)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDSIExtQueryExecutor.execute(HiveJDBCDSIExtQueryExecutor.java:165)
	at com.cloudera.impala.jdbc.common.SStatement.executeNoParams(SStatement.java:2695)
	at com.cloudera.impala.jdbc.common.SStatement.execute(SStatement.java:566)
	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:269)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:73)
	at org.pentaho.mondrian.tck.JoinTest.testThreeWayJoinPlusDoubleStarLeaf89(JoinTest.java:343)



Caused by: java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: Unqualified column reference 'description' is ambiguous), Query: SELECT `warehouse_class`.`description`, `wc2`.`description`, `warehouse`.`warehouse_id`, `inventory_fact_1997`.`store_id`, `time_by_day`.`time_id`, `product`.`product_id` FROM `default`.`inventory_fact_1997` INNER JOIN `default`.`warehouse` ON (`inventory_fact_1997`.`warehouse_id` = `warehouse`.`warehouse_id`) INNER JOIN `default`.`product` ON (`inventory_fact_1997`.`product_id` = `product`.`product_id`) INNER JOIN `default`.`time_by_day` ON (`inventory_fact_1997`.`time_id` = `time_by_day`.`time_id`) INNER JOIN `default`.`warehouse_class` ON (`warehouse_class`.`warehouse_class_id` = `warehouse`.`warehouse_class_id`) INNER JOIN `default`.`warehouse_class` `wc2` ON (`wc2`.`warehouse_class_id` = `warehouse`.`warehouse_class_id`) ORDER BY `warehouse`.`warehouse_id` ASC, `inventory_fact_1997`.`store_id` ASC, `time_by_day`.`time_id` ASC, `product`.`product_id` ASC.
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatementInternal(HS2Client.java:832)
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatement(HS2Client.java:287)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(HiveJDBCNativeQueryExecutor.java:487)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDSIExtQueryExecutor.execute(HiveJDBCDSIExtQueryExecutor.java:165)
	at com.cloudera.impala.jdbc.common.SStatement.executeNoParams(SStatement.java:2695)
	at com.cloudera.impala.jdbc.common.SStatement.execute(SStatement.java:566)
	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:269)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:73)
	at org.pentaho.mondrian.tck.JoinTest.testThreeWayJoinPlusDoubleStarLeaf92(JoinTest.java:381)

org.pentaho.mondrian.tck.JoinTest.testImplicitJoin

Implicit joins are not supported. When Mondrian evaluates a crossjoin of the members of two levels in a context that allows empty cells, it omits the fact table from the SQL query and joins the two tables with what is called an 'implicit' join.

Caused by: java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:NotImplementedException: Join with 'default.warehouse_class' requires at least one conjunctive equality predicate. To perform a Cartesian product between two tables, use a CROSS JOIN.), Query: SELECT `warehouse`.`warehouse_id`, `warehouse_class`.`description` FROM `default`.`warehouse`, `default`.`warehouse_class`.
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatementInternal(HS2Client.java:832)
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatement(HS2Client.java:287)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(HiveJDBCNativeQueryExecutor.java:487)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDSIExtQueryExecutor.execute(HiveJDBCDSIExtQueryExecutor.java:165)
	at com.cloudera.impala.jdbc.common.SStatement.executeNoParams(SStatement.java:2695)
	at com.cloudera.impala.jdbc.common.SStatement.execute(SStatement.java:566)
	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:269)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:73)
	at org.pentaho.mondrian.tck.JoinTest.testImplicitJoin(JoinTest.java:120)
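
The error message above suggests using a CROSS JOIN for a Cartesian product. As a minimal sketch of what that rewrite could look like, the helper below builds the same query with an explicit CROSS JOIN instead of the comma-separated table list. The class name CrossJoinSketch and its method are hypothetical and are not part of Mondrian, the TCK, or the Simba driver, and this report does not verify that the rewritten query passes on this driver.

```java
import java.util.List;

/** Hypothetical helper for illustration only; not Mondrian or Simba API. */
final class CrossJoinSketch {
    private CrossJoinSketch() {}

    /**
     * Builds "SELECT c1, c2 FROM t1 CROSS JOIN t2 ..." for tables that
     * have no join predicate between them. Identifiers are expected to be
     * quoted already by the caller.
     */
    static String crossJoin(List<String> columns, List<String> tables) {
        if (columns.isEmpty() || tables.isEmpty()) {
            throw new IllegalArgumentException("need at least one column and one table");
        }
        return "SELECT " + String.join(", ", columns)
            + " FROM " + String.join(" CROSS JOIN ", tables);
    }
}
```

Called with the columns and tables from the failing query, this yields SELECT `warehouse`.`warehouse_id`, `warehouse_class`.`description` FROM `default`.`warehouse` CROSS JOIN `default`.`warehouse_class`.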

Warnings

Distinct Count

Symptom

Not all forms of distinct count queries are supported. However, one form of distinct count over multiple columns is supported, so Mondrian can still batch the queries as needed. The integration tests have also shown that the dialect issues the distinct count queries correctly.

Failed tests

Test

Result

org.pentaho.mondrian.tck.AggregationTest.testDistinctTwoCount

Multiple distinct count columns cannot be batched with the following syntax.

Caused by: java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:AnalysisException: all DISTINCT aggregate functions need to have the same set of parameters as count(DISTINCT sales_fact_1997.unit_sales); deviating function: count(DISTINCT sales_fact_1997.store_id)), Query: SELECT COUNT(DISTINCT `sales_fact_1997`.`unit_sales`) as `count_sales`, COUNT(DISTINCT `sales_fact_1997`.`store_id`) as `count_store_id` FROM `default`.`sales_fact_1997`.
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatementInternal(HS2Client.java:832)
	at com.cloudera.impala.hivecommon.api.HS2Client.executeStatement(HS2Client.java:287)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(HiveJDBCNativeQueryExecutor.java:487)
	at com.cloudera.impala.hivecommon.dataengine.HiveJDBCDSIExtQueryExecutor.execute(HiveJDBCDSIExtQueryExecutor.java:165)
	at com.cloudera.impala.jdbc.common.SStatement.executeNoParams(SStatement.java:2695)
	at com.cloudera.impala.jdbc.common.SStatement.execute(SStatement.java:566)
	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:269)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:73)
	at org.pentaho.mondrian.tck.AggregationTest.testDistinctTwoCount(AggregationTest.java:200)
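
The report does not say which batched form is accepted. The simplest fallback, shown here only as a sketch, is to issue one query per distinct count column, so that each statement contains a single DISTINCT aggregate. The class DistinctCountSplitSketch is hypothetical and is not how Mondrian's dialect is necessarily implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not Mondrian or Simba API. */
final class DistinctCountSplitSketch {
    private DistinctCountSplitSketch() {}

    /**
     * Returns one COUNT(DISTINCT ...) query per entry of aliasToColumn,
     * in the map's iteration order. Identifiers are expected to be
     * quoted already, except the alias, which is quoted here.
     */
    static List<String> split(String table, Map<String, String> aliasToColumn) {
        List<String> queries = new ArrayList<>();
        for (Map.Entry<String, String> e : aliasToColumn.entrySet()) {
            queries.add("SELECT COUNT(DISTINCT " + e.getValue() + ") AS `"
                + e.getKey() + "` FROM " + table);
        }
        return queries;
    }
}
```

Passing a LinkedHashMap keeps the queries in the same order as the original select list. The cost of this fallback is one table scan per column instead of one scan in total.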

Grouping sets

Symptom

Queries that use grouping sets are not supported. Grouping sets are an optimization feature supported by some more advanced databases. They allow Mondrian to batch cell requests and improve overall performance.

Failed tests

Test

Result

org.pentaho.mondrian.tck.GroupingSetTest.testEmptyEntry

Grouping set queries are not supported.

select
    customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((customer.gender),())

org.pentaho.mondrian.tck.GroupingSetTest.testPlainEntry

Grouping set queries are not supported.

select
    customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((customer.gender))

org.pentaho.mondrian.tck.GroupingSetTest.testComplexEntry

Grouping set queries are not supported.

select
    time_by_day.the_year as the_year, customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((time_by_day.the_year, customer.gender))

org.pentaho.mondrian.tck.GroupingSetTest.testMultipleEntries

Grouping set queries are not supported.

select
    time_by_day.the_year as the_year, customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((time_by_day.the_year, customer.gender), (time_by_day.the_year),())
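
Without grouping sets, the same cells can still be fetched with one plain GROUP BY query per set, at the cost of extra round trips. The sketch below illustrates that expansion under stated assumptions: the class GroupingSetExpansionSketch and its parameters are hypothetical, and it is not a description of how Mondrian itself falls back.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not Mondrian or Simba API. */
final class GroupingSetExpansionSketch {
    private GroupingSetExpansionSketch() {}

    /**
     * Expands a grouping sets request into one query per set.
     *
     * @param aggregate    the aggregate expression, e.g. "sum(x) as sum_cost"
     * @param fromAndWhere the shared "from ... where ..." text
     * @param sets         the grouping sets; an empty list means the grand total "()"
     */
    static List<String> expand(String aggregate, String fromAndWhere,
                               List<List<String>> sets) {
        List<String> queries = new ArrayList<>();
        for (List<String> set : sets) {
            StringBuilder sql = new StringBuilder("select ");
            if (!set.isEmpty()) {
                // Grouping columns go first in the select list.
                sql.append(String.join(", ", set)).append(", ");
            }
            sql.append(aggregate).append(' ').append(fromAndWhere);
            if (!set.isEmpty()) {
                // The empty set "()" needs no group by clause at all.
                sql.append(" group by ").append(String.join(", ", set));
            }
            queries.add(sql.toString());
        }
        return queries;
    }
}
```

For testEmptyEntry, the sets ((customer.gender),()) would expand into two queries: one grouped by customer.gender and one ungrouped query that returns the grand total.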