Pentaho Data Integration Job Entries

Introduction

This page contains the index for the documentation on all the standard job entries in Pentaho Data Integration.
We invite everyone to add more details, tips and samples to these job entry pages.
In the Main class column, the prefix opdje stands for the package org.pentaho.di.job.entries.

NOTE

You may not be viewing the most up-to-date documentation for these steps. View the most recent Pentaho documentation here.

Name

Category

ID

Description

Main class

Abort job

Utility

ABORT

Abort the job

opdje.abort.JobEntryAbort

Add filenames to result

File management

ADD_RESULT_FILENAMES

Add filenames to result

opdje.addresultfilenames.JobEntryAddResultFilenames

Amazon EMR Job Executor

Big Data

EMRJobExecutorPlugin

Execute MapReduce jobs in Amazon EMR

org.pentaho.amazon.emr.job.AmazonElasticMapReduceJobExecutor

Amazon Hive Job Executor

Big Data

HiveJobExecutorPlugin

Execute Hive jobs in Amazon EMR

org.pentaho.amazon.hive.job.AmazonHiveJobExecutor

BulkLoad from Mysql into file

Bulk loading

MYSQL_BULK_FILE

Load data from a MySQL table into a file

opdje.mysqlbulkfile.JobEntryMysqlBulkFile

BulkLoad into MSSQL

Bulk loading

MSSQL_BULK_LOAD

Load data from a file into an MSSQL table

opdje.mssqlbulkload.JobEntryMssqlBulkLoad

BulkLoad into Mysql

Bulk loading

MYSQL_BULK_LOAD

Load data from a file into a MySQL table

opdje.mysqlbulkload.JobEntryMysqlBulkLoad

Check Db connections

Conditions

CHECK_DB_CONNECTIONS

Check if we can connect to one or several databases.

opdje.checkdbconnection.JobEntryCheckDbConnections

Check files locked

Conditions

CHECK_FILES_LOCKED

Check if one or several files are locked by another process

opdje.checkfilelocked.JobEntryCheckFilesLocked

Check if a folder is empty

Conditions

FOLDER_IS_EMPTY

Check if a folder is empty

opdje.folderisempty.JobEntryFolderIsEmpty

Check if connected to repository

Repository

CONNECTED_TO_REPOSITORY

Return true if we are connected to a repository

opdje.connectedtorepository.JobEntryConnectedToRepository

Check if XML file is well formed

XML

XML_WELL_FORMED

Check if one or several XML files are well formed

opdje.xmlwellformed.JobEntryXMLWellFormed

Check webservice availability

Conditions

WEBSERVICE_AVAILABLE

Check if a webservice is available

opdje.webserviceavailable.JobEntryWebServiceAvailable

Checks if files exist

Conditions

FILES_EXIST

Checks if files exist

opdje.filesexist.JobEntryFilesExist

Columns exist in a table

Conditions

COLUMNS_EXIST

Check if one or several columns exist in a table on a specified connection

opdje.columnsexist.JobEntryColumnsExist

Compare folders

File management

FOLDERS_COMPARE

Compare two folders (or two files)

opdje.folderscompare.JobEntryFoldersCompare

Convert file between Windows and Unix

File management

DOS_UNIX_CONVERTER

Convert file content between Windows and Unix. Converting to Unix will replace CRLF (Carriage Return and Line Feed) with LF (Line Feed).

opdje.dostounix.JobEntryDosToUnix

Copy Files

File management

COPY_FILES

Copy Files

opdje.copyfiles.JobEntryCopyFiles

Copy or Move result filenames

File management

COPY_MOVE_RESULT_FILENAMES

Copy or Move result filenames (since version 5.0, this job entry has been renamed to Process result filenames and handles Delete as well)

opdje.copymoveresultfilenames.JobEntryCopyMoveResultFilenames

Create a folder

File management

CREATE_FOLDER

Create a folder

opdje.createfolder.JobEntryCreateFolder

Create file

File management

CREATE_FILE

Create (an empty) file

opdje.createfile.JobEntryCreateFile

Decrypt files with PGP

File encryption

PGP_DECRYPT_FILES

Decrypt files encrypted with PGP (Pretty Good Privacy). This job entry needs GnuPG to work properly.

opdje.pgpdecryptfiles.JobEntryPGPDecryptFiles

Delete file

File management

DELETE_FILE

Delete a file

opdje.deletefile.JobEntryDeleteFile

Delete filenames from result

File management

DELETE_RESULT_FILENAMES

Delete filenames from result

opdje.deleteresultfilenames.JobEntryDeleteResultFilenames

Delete files

File management

DELETE_FILES

Delete files

opdje.deletefiles.JobEntryDeleteFiles

Delete folders

File management

DELETE_FOLDERS

Delete specified folders. Attention: if a folder contains files, PDI will delete them all!

opdje.deletefolders.JobEntryDeleteFolders

Display Msgbox Info

Utility

MSGBOX_INFO

Display a simple informational message box

opdje.msgboxinfo.JobEntryMsgBoxInfo

DTD Validator

XML

DTD_VALIDATOR

DTD Validator

opdje.dtdvalidator.JobEntryDTDValidator

Dummy

General

DUMMY (internal: SPECIAL)

Use the Dummy job entry to do nothing in a job.

opdje.special.JobEntrySpecial

Encrypt files with PGP

File encryption

PGP_ENCRYPT_FILES

Encrypt files with PGP (Pretty Good Privacy). This job entry needs GnuPG to work properly.

opdje.pgpencryptfiles.JobEntryPGPEncryptFiles

Evaluate files metrics

Conditions

EVAL_FILES_METRICS

Evaluate files size or files count

opdje.evalfilesmetrics.JobEntryEvalFilesMetrics

Evaluate rows number in a table

Conditions

EVAL_TABLE_CONTENT

Evaluate the content of a table. You can also specify a SQL query.

opdje.evaluatetablecontent.JobEntryEvalTableContent

Example plugin

General

DummyJob

This is an example test job entry for a plugin

pdi.jobentry.dummy.JobEntryDummy

Export repository to XML file

Repository

EXPORT_REPOSITORY

Export repository to XML file

opdje.exportrepository.JobEntryExportRepository

File Compare

File management

FILE_COMPARE

Compare 2 files

opdje.filecompare.JobEntryFileCompare

File Exists

Conditions

FILE_EXISTS

Checks if a file exists

opdje.fileexists.JobEntryFileExists

FTP Delete

File transfer

FTP_DELETE

Delete files in a remote host

opdje.ftpdelete.JobEntryFTPDelete

Get a file with FTP

File transfer

FTP

Get files using FTP (File Transfer Protocol)

opdje.ftp.JobEntryFTP

Get a file with FTPS

File transfer

FTPS_GET

Get a file with FTP secure

opdje.ftpsget.JobEntryFTPSGet

Get a file with SFTP

File transfer

SFTP

Get files using Secure FTP (Secure File Transfer Protocol)

opdje.sftp.JobEntrySFTP

Get mails (POP3/IMAP)

Mail

GET_POP

Get mails from a POP3/IMAP server and save them into a local folder

opdje.getpop.JobEntryGetPOP

Hadoop Copy Files

Big Data




Hadoop job executor

Big Data

HadoopJobExecutorPlugin

Execute a map/reduce job contained in a jar file

opdje.hadoopjobexecutor.JobEntryHadoopJobExecutor

HL7 MLLP Acknowledge

Utility




HL7 MLLP Input

Utility




HTTP

File management

HTTP

Gets or uploads a file using HTTP (HyperText Transfer Protocol)

opdje.http.JobEntryHTTP

JavaScript

Scripting

EVAL

Evaluates the result of the execution of a previous job entry

opdje.eval.JobEntryEval

Job

General

JOB

Executes a job

opdje.job.JobEntryJob

Mail

Mail

MAIL

Sends an e-Mail

opdje.mail.JobEntryMail

Mail validator

Mail

MAIL_VALIDATOR

Check the validity of an email address

opdje.mailvalidator.JobEntryMailValidator

Move Files

File management

MOVE_FILES

Move Files

opdje.movefiles.JobEntryMoveFiles

MS Access Bulk Load

Deprecated

MS_ACCESS_BULK_LOAD

Load data from a CSV file into a Microsoft Access table. ATTENTION: at the moment only insertion is available! If the target table exists, a new one will be created and the data inserted.

opdje.msaccessbulkload.JobEntryMSAccessBulkLoad

Oozie Job Executor

Big Data




Palo Cube Create (Deprecated)

Deprecated

PALO_CUBE_CREATE

Creates a cube on a Palo server

opdje.palo.JobEntryCubeCreate

Palo Cube Delete (Deprecated)

Deprecated

PALO_CUBE_DELETE

Deletes a cube on a Palo server

opdje.palo.JobEntryCubeDelete

Pentaho MapReduce

Big Data

HadoopTransJobExecutorPlugin

Execute Transformation Based MapReduce Jobs in Hadoop

opdje.hadooptransjobexecutor.JobEntryHadoopTransJobExecutor

Pig Script Executor

Big Data

HadoopPigScriptExecutorPlugin

Execute a Pig script on a Hadoop cluster

opdje.pig.JobEntryPigScriptExecutor

Ping a host

Utility

PING

Ping a host

opdje.ping.JobEntryPing

Put a file with FTP

File transfer

FTP_PUT

Put a file with FTP

opdje.ftpput.JobEntryFTPPUT

Process result filenames

File management

COPY_MOVE_RESULT_FILENAMES

Copy, Move or Delete result filenames

opdje.copymoveresultfilenames.JobEntryCopyMoveResultFilenames

Put a file with SFTP

File transfer

SFTPPUT

Put files using SFTP (Secure File Transfer Protocol)

opdje.sftpput.JobEntrySFTPPUT

Send information using Syslog

Utility

SYSLOG

Sends information to another server using the Syslog protocol.

opdje.syslog.JobEntrySyslog

Send Nagios passive check

Utility

SEND_NAGIOS_PASSIVE_CHECK

Send Nagios passive checks

opdje.sendnagiospassivecheck.JobEntrySendNagiosPassiveCheck

Send SNMP trap

Utility

SNMP_TRAP

Send SNMP trap to a target host

opdje.snmptrap.JobEntrySNMPTrap

Set variables

General

SET_VARIABLES

Set one or several variables.

opdje.setvariables.JobEntrySetVariables

Shell

Scripting

SHELL

Executes a shell script

opdje.shell.JobEntryShell

Simple evaluation

Conditions

SIMPLE_EVAL

Evaluate one field or variable

opdje.simpleeval.JobEntrySimpleEval

Spark Submit

Big Data


Submit Spark jobs


SQL

Scripting

SQL

Executes SQL on a certain database connection

opdje.sql.JobEntrySQL

Sqoop Export

Big Data

SqoopExport

Export data from the Hadoop Distributed File System (HDFS) into a relational database (RDBMS) using Apache Sqoop

opdje.sqoop.SqoopExportJobEntry

Sqoop Import

Big Data

SqoopImport

Import data from a relational database (RDBMS) into the Hadoop Distributed File System (HDFS) using Apache Sqoop

opdje.sqoop.SqoopImportJobEntry

SSH2 Get

Deprecated

SSH2_GET

Get files using SSH2 (Deprecated in 5.0 in favor of the SFTP job entry)

opdje.ssh2get.JobEntrySSH2GET

SSH2 Put

Deprecated

SSH2_PUT

Put files in a remote host using SSH2 (Deprecated in 5.0 in favor of the SFTP job entry)

opdje.ssh2put.JobEntrySSH2PUT

Start

General

START (internal: SPECIAL)

Start is where the job starts to execute and is required before the job can be executed.

opdje.special.JobEntrySpecial

Start a PDI Cluster on YARN

Big Data

StartYarnKettleCluster

Starts a YARN Kettle Cluster

org.pentaho.di.job.entries.startyarncluster.StartYarnCluster

Stop a PDI Cluster on YARN

Big Data

StopYarnKettleCluster

Stops a YARN Kettle Cluster

org.pentaho.di.job.entries.stopyarncluster.StopYarnCluster

Success

General

SUCCESS

Success

opdje.success.JobEntrySuccess

Table exists

Conditions

TABLE_EXISTS

Checks if a table exists on a database connection

opdje.tableexists.JobEntryTableExists

Talend Job Execution

Deprecated

TALEND_JOB_EXEC

This job entry executes an exported Talend Job

opdje.talendjobexec.JobEntryTalendJobExec

Telnet a file

File transfer

TELNET

This job entry transfers a file using Telnet

opdje.telnet.JobEntryTelnet

Transformation

General

TRANS

Executes a transformation

opdje.trans.JobEntryTrans

Truncate tables

Utility

TRUNCATE_TABLES

Truncate one or several tables.

opdje.truncatetables.JobEntryTruncateTables

Unzip file

File management

UNZIP

Unzip file in a target folder

opdje.unzip.JobEntryUnZip

Upload files to FTPS

File transfer

FTPS_PUT

Upload files to a secure FTP (FTPS) server

opdje.ftpsput.JobEntryFTPSPUT

Verify file signature with PGP

File encryption

PGP_VERIFY_FILES

Verify file signature with PGP (Pretty Good Privacy). This job entry needs GnuPG to work properly.

opdje.pgpverify.JobEntryPGPVerify

Wait for

Conditions

DELAY

Wait for a delay

opdje.delay.JobEntryDelay

Wait for file

File management

WAIT_FOR_FILE

Wait for a file

opdje.waitforfile.JobEntryWaitForFile

Wait for SQL

Utility

WAIT_FOR_SQL

Scan a database and succeed when a specified condition on the returned rows is true.

opdje.waitforsql.JobEntryWaitForSQL

Write to file

File management

WRITE_TO_FILE

Write text content to file.

opdje.writetofile.JobEntryWriteToFile

Write To Log

Utility

WRITE_TO_LOG

Write message to log

opdje.writetolog.JobEntryWriteToLog

XSD Validator

XML

XSD_VALIDATOR

XSD Validator

opdje.xsdvalidator.JobEntryXSDValidator

XSL Transformation

XML

XSLT

Make an XSL Transformation

opdje.xslt.JobEntryXSLT

Zip file

File management

ZIP_FILE

Zip files from a directory and process files

opdje.zipfile.JobEntryZipFile
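
Most values in the Main class column use the short prefix opdje, while a few (for example org.pentaho.di.job.entries.startyarncluster.StartYarnCluster) are written out in full. The following is a minimal sketch of how the abbreviated values could be expanded into fully qualified class names. It assumes that opdje is shorthand for org.pentaho.di.job.entries, as inferred from those full names. The class JobEntryClassNames and its expand method are hypothetical helpers written for this illustration and are not part of Pentaho Data Integration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JobEntryClassNames {
    // Assumption: "opdje" is shorthand for this package, inferred from the
    // fully qualified names that appear elsewhere in the table.
    static final String SHORT_PREFIX = "opdje.";
    static final String FULL_PREFIX = "org.pentaho.di.job.entries.";

    /** Expands an abbreviated Main class value; other values are returned trimmed. */
    static String expand(String mainClass) {
        String name = mainClass.trim();
        if (name.startsWith(SHORT_PREFIX)) {
            return FULL_PREFIX + name.substring(SHORT_PREFIX.length());
        }
        return name;
    }

    public static void main(String[] args) {
        // A few ID -> Main class pairs copied from the table above.
        Map<String, String> entries = new LinkedHashMap<>();
        entries.put("ABORT", "opdje.abort.JobEntryAbort");
        entries.put("COPY_FILES", "opdje.copyfiles.JobEntryCopyFiles");
        entries.put("EMRJobExecutorPlugin",
                "org.pentaho.amazon.emr.job.AmazonElasticMapReduceJobExecutor");

        // Print each ID next to its expanded class name.
        for (Map.Entry<String, String> e : entries.entrySet()) {
            System.out.println(e.getKey() + " -> " + expand(e.getValue()));
        }
    }
}
```

Values that do not start with the short prefix, such as the Amazon EMR entry, pass through unchanged, so the helper can be applied to every row of the table.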