Pentaho Big Data Plugin
The Pentaho Big Data Plugin Project provides support for an ever-expanding Big Data community within the Pentaho ecosystem. It is a plugin for the Pentaho Kettle engine which can be used within Pentaho Data Integration (Kettle), Pentaho Reporting, and the Pentaho BI Platform.
...
This project contains the implementations for connecting to or performing the following:
- Pentaho MapReduce: visually design MapReduce jobs as Kettle transformations
- HDFS File Operations: read and write HDFS files directly from any Kettle step, made possible by the ubiquitous use of Apache VFS throughout Kettle
- Data Sources
- JDBC connectivity
- Apache Hive
- Native RPC connectivity for reading/writing
- Apache HBase
- Cassandra
- MongoDB
- CouchDB
- JDBC connectivity
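Because Kettle resolves files through Apache VFS, a step that accepts a file location can generally be pointed at HDFS by changing the URL rather than the step. The sketch below is only an illustration of that idea in Python: it does not call Kettle or VFS, `describe_vfs_url` is a helper invented for this example, and the host name, port, and paths are hypothetical.

```python
from urllib.parse import urlsplit


def describe_vfs_url(url):
    """Split a VFS-style URL into the parts a file-based step would need."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # selects the file system, e.g. "file" or "hdfs"
        "host": parts.hostname,   # None for local files
        "port": parts.port,       # None when the URL has no port
        "path": parts.path,
    }


# Hypothetical locations: the same kind of setting can address local disk or HDFS.
local_file = describe_vfs_url("file:///tmp/sales.csv")
hdfs_file = describe_vfs_url(
    "hdfs://namenode.example.com:8020/user/pentaho/sales.csv"
)

print(local_file)
print(hdfs_file)
```

Only the scheme and authority differ between the two locations; the path portion is handled the same way in both cases.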
Key Links
- Git Repository: https://github.com/pentaho/big-data-plugin
- CI: pentaho-big-data-plugin
- Download the latest development build: pentaho-big-data-plugin-TRUNK-SNAPSHOT.tar.gz
Community and where to find help
...
Here's a short list of resources to help you learn and master Git:
Documentation
Kettle Plugin Development
Getting started with the Pentaho Data Integration Java API
Step Documentation
Job Entry Documentation
Hadoop Configuration
Community Plugins
Here's a list of known community plugins that fall into the "big data" category:
...