PA_CR_PA-3.7.0.0-752_HDP-2.1_Hive-0.13

Preface

This report covers the following products.

  • Pentaho Analysis 5.1.0.0-752 ( 3.7.0.0-752 )
  • Hortonworks Sandbox 2.1 (Hive 0.13)
  • Pentaho Big Data Shim for HDP21 (5.1.0.0-752)

Feature               Status    Notes
Degenerate Schemas
Star Schemas
Snowflake Schemas
Filters                         The columns of type TIMESTAMP can't be filtered on correctly.
Top Count
Aggregation Tables              The JDBC driver doesn't return the proper metadata when providing a list of the tables present in a database. Aggregation tables defined in the schema still work.
Null Values & Keys
Inline Tables                   Inline tables are not supported.
Distinct Count
Grouping Sets                   Grouping sets are not supported.

Failures

Native Filters

Symptom

Hive does not recognize the TIMESTAMP keyword that Mondrian emits in front of timestamp literals, so the query fails to compile with a ParseException. This needs to be fixed within Mondrian's Hive dialect.
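
One possible shape for such a dialect fix is to emit the timestamp as a cast of a string literal instead of the ANSI TIMESTAMP '...' form. This is a minimal sketch, not Mondrian's actual code. The class and method names are hypothetical, and it assumes that Hive 0.13 accepts cast('...' as timestamp), which should be verified against the target cluster.

```java
// Hypothetical helper, not part of Mondrian: rewrites ANSI timestamp
// literals into a cast expression that Hive is assumed to accept.
final class HiveTimestampLiteral {

    private HiveTimestampLiteral() {
    }

    /** Quotes a single timestamp value as cast('...' as timestamp). */
    static String quote(String value) {
        // Double any embedded single quotes so the literal stays well formed.
        return "cast('" + value.replace("'", "''") + "' as timestamp)";
    }

    /**
     * Replaces every TIMESTAMP '...' literal in a SQL string.
     * Naive on purpose: it does not skip matches inside other string literals.
     */
    static String rewrite(String sql) {
        // (?i) makes the keyword case-insensitive; group 1 is the quoted body.
        return sql.replaceAll(
            "(?i)\\bTIMESTAMP\\s+'([^']*)'", "cast('$1' as timestamp)");
    }

    public static void main(String[] args) {
        String predicate =
            "store.last_remodel_date = TIMESTAMP '1991-03-13 00:00:00'";
        System.out.println(rewrite(predicate));
    }
}
```

In a real dialect the literal would be produced in the cast form at generation time, as quote does, rather than patched into finished SQL. The rewrite method is shown only because it can be applied to the failing predicate from the test below.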

Failed tests

Test

Result

org.pentaho.mondrian.tck.NativeFilterTest.testCompoundPredicate



java.lang.Exception: Query failed to run successfully:
select sum(sales_fact_1997.unit_sales) as m0
  from store store
     , product product
     , sales_fact_1997 sales_fact_1997
 where sales_fact_1997.store_id  = store.store_id
   and sales_fact_1997.product_id = product.product_id
   and ((
           store.store_country     = 'USA'
       and store.first_opened_date = '1981-01-03'
       and store.last_remodel_date = TIMESTAMP '1991-03-13 00:00:00'
       )
    or (
         store.store_city          = 'San Diego'
     and store.store_state         = 'CA'
       )
    or (
         store.store_state         = 'WA'
     and store.store_sqft          > 50000
     and product.gross_weight      = 17.1
       )
    or (
         store.store_sqft is null
       )
    )

	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:120)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:125)
	at org.pentaho.mondrian.tck.NativeFilterTest.testCompoundPredicate(NativeFilterTest.java:224)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.sql.SQLException: Error while compiling statement: FAILED: ParseException line 10:47 cannot recognize input near 'TIMESTAMP' ''1991-03-13 00:00:00'' ')' in expression specification
	at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:167)
	at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:155)
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:210)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.pentaho.hadoop.shim.common.DriverProxyInvocationChain$CaptureResultSetInvocationHandler.invoke(DriverProxyInvocationChain.java:513)
	at com.sun.proxy.$Proxy8.execute(Unknown Source)
	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:118)
	... 26 more

Automatic recognition of aggregation tables

Symptom

The JDBC driver does not return properly formatted data when Mondrian asks for the list of available tables. As a result, Mondrian cannot automatically discover any aggregation tables that might be present. This does not affect aggregation tables that are declared explicitly in the schema.
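
To make the expected shape concrete, the sketch below compares a list of result-set column labels against the first four columns that the JDBC API specifies for DatabaseMetaData.getTables (TABLE_CAT, TABLE_SCHEM, TABLE_NAME, TABLE_TYPE). The class and method names are hypothetical and are not taken from the TCK. The single "name" label in main mirrors the column list reported by the failing test.

```java
// Illustrative check, not taken from the TCK: compares the column labels of
// a getTables() result against the leading columns required by the JDBC API.
final class GetTablesContract {

    // First four columns of DatabaseMetaData.getTables, in specified order.
    static final String[] REQUIRED = {
        "TABLE_CAT", "TABLE_SCHEM", "TABLE_NAME", "TABLE_TYPE"
    };

    private GetTablesContract() {
    }

    /** Reads the column labels of any result set, in positional order. */
    static String[] labelsOf(java.sql.ResultSet rs)
            throws java.sql.SQLException {
        java.sql.ResultSetMetaData md = rs.getMetaData();
        String[] labels = new String[md.getColumnCount()];
        for (int i = 0; i < labels.length; i++) {
            labels[i] = md.getColumnLabel(i + 1); // JDBC columns are 1-based
        }
        return labels;
    }

    /** Lists the required columns that are absent or in the wrong position. */
    static String missing(String[] actualLabels) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < REQUIRED.length; i++) {
            boolean ok = i < actualLabels.length
                && REQUIRED[i].equalsIgnoreCase(actualLabels[i]);
            if (!ok) {
                if (sb.length() > 0) {
                    sb.append(", ");
                }
                sb.append(REQUIRED[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The single-column shape reported by the failing test.
        System.out.println(missing(new String[] {"name"}));
    }
}
```

Against a live connection, the labels would come from labelsOf(connection.getMetaData().getTables(null, null, "%", null)). An empty string returned by missing means the leading columns match the specification.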

Failed tests

Test

Result

org.pentaho.mondrian.tck.AggregationTablesRecognitionTest.testAggregationRecognition

The method to obtain a list of tables isn't implemented properly in the Pentaho shim. It returns only one column, containing the table names, whereas the JDBC API says it must return at least four columns (TABLE_CAT, TABLE_SCHEM, TABLE_NAME, TABLE_TYPE), the third being the name.

Caused by: java.sql.SQLException: Invalid columnIndex: 3
	at org.apache.hive.jdbc.HiveBaseResultSet.getColumnValue(HiveBaseResultSet.java:491)
	at org.apache.hive.jdbc.HiveBaseResultSet.getString(HiveBaseResultSet.java:629)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.pentaho.hadoop.shim.common.DriverProxyInvocationChain$ResultSetInvocationHandler.invoke(DriverProxyInvocationChain.java:682)
	at com.sun.proxy.$Proxy8.getString(Unknown Source)
	at mondrian.rolap.aggmatcher.JdbcSchema.addTable(JdbcSchema.java:1282)
	at mondrian.rolap.aggmatcher.JdbcSchema.loadTablesOfType(JdbcSchema.java:1265)
	at mondrian.rolap.aggmatcher.JdbcSchema.loadTables(JdbcSchema.java:1231)
	at mondrian.rolap.aggmatcher.JdbcSchema.load(JdbcSchema.java:1100)
	at mondrian.rolap.aggmatcher.AggTableManager.loadRolapStarAggregates(AggTableManager.java:178)
	at mondrian.rolap.aggmatcher.AggTableManager.initialize(AggTableManager.java:91)

org.pentaho.mondrian.tck.AggregationTablesRecognitionTest.testGetTablesJdbc

The method to obtain a list of tables isn't implemented properly in the Pentaho shim. It returns only one column, containing the table names, whereas the JDBC API says it must return at least four columns (TABLE_CAT, TABLE_SCHEM, TABLE_NAME, TABLE_TYPE), the third being the name.

java.lang.AssertionError: Column 'table_cat' doesn't exist in the columns result set '[name]'
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.pentaho.mondrian.tck.SqlExpectation.validateColumns(SqlExpectation.java:75)
	at org.pentaho.mondrian.tck.SqlExpectation.verify(SqlExpectation.java:62)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:75)
	at org.pentaho.mondrian.tck.AggregationTablesRecognitionTest.testGetTablesJdbc(AggregationTablesRecognitionTest.java:53)

Inline tables

Symptom

Inline tables are not supported. Hive fails to compile the generated queries and reports a HiveAuthzPluginException for _dummy_database._dummy_table.
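
For context, an inline table is a small set of rows declared in the schema rather than stored in the database. In the failing query reported for testInlineTable, these rows appear as one FROM-less SELECT per row, joined by UNION ALL. The sketch below builds that shape from column names and literal values. It is a hypothetical generator written only to illustrate the query form, not Mondrian's implementation.

```java
// Hypothetical generator, written only to illustrate the query shape; it is
// not Mondrian's implementation.
final class InlineTableSql {

    private InlineTableSql() {
    }

    /**
     * Builds "select v1 c1, v2 c2 union all select ..." from column names
     * and rows of already-quoted SQL literals.
     */
    static String build(String[] columns, String[][] rows) {
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < rows.length; r++) {
            if (r > 0) {
                sb.append(" union all ");
            }
            sb.append("select ");
            for (int c = 0; c < columns.length; c++) {
                if (c > 0) {
                    sb.append(", ");
                }
                // Each value is followed by its column alias.
                sb.append(rows[r][c]).append(' ').append(columns[c]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sql = build(
            new String[] {"promo_id", "promo_name"},
            new String[][] {{"0", "'Promo0'"}, {"1", "'Promo1'"}});
        // Wrapped as a derived table, as in the failing test query.
        System.out.println("(" + sql + ") alt_promotion");
    }
}
```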

Failed tests

Test

Result

org.pentaho.mondrian.tck.InlineTablesTest.testInlineTable

Simple inline tables fail to run successfully.

java.lang.Exception: Query failed to run successfully:
select
    alt_promotion.promo_id promo_id,
    alt_promotion.promo_name promo_name
from
    (select 0 promo_id, 'Promo0' promo_name union all select 1 promo_id, 'Promo1' promo_name) alt_promotion
group by
    alt_promotion.promo_id,
    alt_promotion.promo_name

	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:120)
	at org.pentaho.mondrian.tck.SqlContext.verify(SqlContext.java:125)
	at org.pentaho.mondrian.tck.InlineTablesTest.testInlineTable(InlineTablesTest.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: java.sql.SQLException: Error while compiling statement: FAILED: HiveAuthzPluginException Error getting object from metastore for Object [type=TABLE_OR_VIEW, name=_dummy_database._dummy_table]
	at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:167)
	at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:155)
	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:210)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.pentaho.hadoop.shim.common.DriverProxyInvocationChain$CaptureResultSetInvocationHandler.invoke(DriverProxyInvocationChain.java:513)
	at com.sun.proxy.$Proxy8.execute(Unknown Source)
	at org.pentaho.mondrian.tck.SqlExpectation$Builder$1.getData(SqlExpectation.java:118)
	... 26 more

Warnings

Grouping sets

Symptom

Queries which use grouping sets are not supported. This is an optimization feature supported by some more advanced databases. It allows Mondrian to batch cell requests and improves overall performance.
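
To illustrate what the optimization provides, a grouping-sets query returns several levels of aggregation in a single result. The sketch below reproduces in plain Java what "group by grouping sets ((gender), ())" computes: one sum per gender plus the grand total. The class name and the sample rows are invented for illustration. Without grouping sets, each aggregation level presumably needs its own query.

```java
// Illustration only: emulates the result of
// "group by grouping sets ((gender), ())" over in-memory rows.
final class GroupingSetsDemo {

    private GroupingSetsDemo() {
    }

    /**
     * Sums cost per gender and stores the grand total under a null key,
     * mirroring the NULL that SQL returns for the empty grouping set.
     */
    static java.util.Map<String, Double> aggregate(
            String[] genders, double[] costs) {
        // LinkedHashMap keeps insertion order and permits a null key.
        java.util.Map<String, Double> sums = new java.util.LinkedHashMap<>();
        double total = 0.0;
        for (int i = 0; i < genders.length; i++) {
            sums.merge(genders[i], costs[i], Double::sum);
            total += costs[i];
        }
        sums.put(null, total);
        return sums;
    }

    public static void main(String[] args) {
        // Made-up rows: (gender, store_cost).
        String[] genders = {"F", "M", "F"};
        double[] costs = {10.0, 20.0, 5.5};
        System.out.println(aggregate(genders, costs));
    }
}
```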

Failed tests

Test

Result

org.pentaho.mondrian.tck.GroupingSetTest.testEmptyEntry

Grouping set queries are not supported.

select
    customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((customer.gender),())

org.pentaho.mondrian.tck.GroupingSetTest.testPlainEntry

Grouping set queries are not supported.

select
    customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((customer.gender))

org.pentaho.mondrian.tck.GroupingSetTest.testComplexEntry

Grouping set queries are not supported.

select
    time_by_day.the_year as the_year, customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((time_by_day.the_year, customer.gender))

org.pentaho.mondrian.tck.GroupingSetTest.testMultipleEntries

Grouping set queries are not supported.

select
    time_by_day.the_year as the_year, customer.gender as gender, sum(sales_fact_1997.store_cost) as sum_cost
from
    time_by_day, sales_fact_1997, customer
where
    (sales_fact_1997.time_id = time_by_day.time_id and time_by_day.the_year = 1997
    and sales_fact_1997.customer_id = customer.customer_id)
group by grouping sets
    ((time_by_day.the_year, customer.gender), (time_by_day.the_year),())