Sqoop Export

The Sqoop Export job allows you to export data from Hadoop into an RDBMS using Apache Sqoop. This job has two setup modes:

  • Quick Mode provides the minimum options necessary to perform a successful Sqoop export.
  • Advanced Mode's default view provides options to better control your Sqoop export. Advanced Mode also has a command line view, which allows you to reuse an existing Sqoop command from the command line.

For additional information about Apache Sqoop, visit http://sqoop.apache.org/.

Quick Setup

Option

Definition

Name

The name of this job as it appears in the transformation workspace.

Hadoop Cluster

Allows you to create, edit, and select a Hadoop cluster configuration for use. Hadoop cluster configuration settings can be reused in transformation steps and job entries that support this feature. In a Hadoop cluster configuration, you can specify information such as host names and ports for HDFS, Job Tracker, and other big data cluster components. The Edit button allows you to edit an existing Hadoop cluster configuration. The New button allows you to add a new Hadoop cluster configuration. Information on Hadoop clusters can be found in Pentaho Help.

Export Directory

Path of the directory within HDFS to export from.

Database Connection

Select the database connection to export to. Click Edit... to modify an existing connection, or click New... to create a new connection from this dialog.

Table

Destination table to export into. If the destination database requires it, a schema may be supplied in the format SCHEMA.TABLE_NAME. This table must already exist, and its structure must match the format of the input data.
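As a rough illustration of how the Quick Setup values relate to a Sqoop command, the following Python sketch assembles a `sqoop export` argument list from a JDBC URL, a table name, and an export directory. The mapping of dialog fields to the `--connect`, `--table`, and `--export-dir` arguments is an assumption made for illustration, and the job entry may build its arguments differently. The function name, host, table, and path are hypothetical.

```python
import shlex


def build_sqoop_export_command(jdbc_url, table, export_dir, username=None):
    """Assemble a `sqoop export` argument list from Quick Setup style values.

    Assumption: Database Connection maps to --connect (and --username),
    Table maps to --table, and Export Directory maps to --export-dir.
    """
    args = ["sqoop", "export", "--connect", jdbc_url]
    if username:
        args += ["--username", username]
    args += ["--table", table, "--export-dir", export_dir]
    return args


# Hypothetical values; replace them with your own connection details.
command = build_sqoop_export_command(
    jdbc_url="jdbc:mysql://db.example.com/sales",
    table="SALES.ORDERS",
    export_dir="/user/pentaho/orders",
)

# Print the command as a single shell-quoted string (Python 3.8 or later).
print(shlex.join(command))
```

The sketch only builds and prints the command. It does not run Sqoop or check that the table exists.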

Hadoop Cluster

Open File

Option

Definition

Open from Folder

Specifies the path and name of the MapRFS or HDFS directory you want to browse. This directory becomes the active directory.

Up One Level

Displays the parent directory of the active directory shown in the Open from Folder field.

Delete

Deletes a folder from the active directory.

Create Folder

Creates a new folder in the active directory.

Active Directory Contents (no label)

Displays the contents of the active directory, which is the directory listed in the Open from Folder field.

Filter

Applies a filter to the results displayed in the active directory contents.

Advanced Setup

Option

Definition

Default/List view

A list of property and value pairs that you can modify to suit your needs, including options to configure an export from Hive or HBase.

Command line view

A field that accepts command line arguments, typically used to paste an existing Sqoop command.
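To show how the two Advanced Mode views relate, the Python sketch below splits an existing Sqoop command into property and value pairs, similar in spirit to what the list view displays. It is a simplified illustration and not the parser used by the job entry. It handles only long options of the form `--name value`, and it would misread a value that itself begins with `--`. The function name and the sample command are hypothetical.

```python
import shlex


def parse_sqoop_arguments(command_line):
    """Split a Sqoop command line into a dict of long-option names and values.

    Options without a value (flags such as --verbose) map to an empty string.
    Short options such as -m are ignored in this simplified sketch.
    """
    tokens = shlex.split(command_line)

    # Drop the leading "sqoop" executable and the tool name (e.g. "export").
    if tokens and tokens[0] == "sqoop":
        tokens = tokens[1:]
    if tokens and not tokens[0].startswith("-"):
        tokens = tokens[1:]

    properties = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            name = token[2:]
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
            properties[name] = tokens[i + 1] if has_value else ""
            i += 2 if has_value else 1
        else:
            i += 1  # skip anything that is not a long option
    return properties


# Hypothetical command, as it might be pasted into the command line view.
pairs = parse_sqoop_arguments(
    "sqoop export --connect jdbc:mysql://db.example.com/sales "
    "--table ORDERS --export-dir /user/pentaho/orders --verbose"
)
for name, value in pairs.items():
    print(name, "=", value)
```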