...
Category | ID | Description | Metadata Java class
---|---|---|---
Flow | Abort | Abort a transformation | opdts.abort.AbortMeta
Transform | CheckSum | Add a checksum column for each input row | opdts.checksum.CheckSumMeta
Transform | Constant | Add one or more constants to the input rows | opdts.constant.ConstantMeta
Transform | Sequence | Get the next value from a sequence | opdts.addsequence.AddSequenceMeta
Transform | FieldsChangeSequence | Add a sequence that depends on field value changes. Each time the value of at least one field changes, PDI resets the sequence. | opdts.fieldschangesequence.FieldsChangeSequenceMeta
Transform | AddXML | Encode several fields into an XML fragment | opdts.addxml.AddXMLMeta
Statistics | AnalyticQuery | Execute analytic queries over a sorted dataset (LEAD/LAG/FIRST/LAST) | opdts.analyticquery.AnalyticQueryMeta
Flow | Append | Append 2 streams in an ordered way | opdts.append.AppendMeta
Data Mining | Arff Output | Writes data in ARFF format to a file | opdts.append.arff.ArffOutputMeta
Output | AutoDoc | This step automatically generates documentation based on input in the form of a list of transformations and jobs | opdts.autodoc.AutoDocMeta
Deprecated (pre-v8.0), Input (v8.0 and after) | AvroInput | Decode binary or JSON Avro data from a file or a field | opdts.avroinput.AvroInputMeta
Output | AvroOutput | Encode binary or JSON Avro data to a file | opdts.avrooutput.AvroOutputMeta
Flow | BlockUntilStepsFinish | Block this step until selected steps finish. | opdts.blockuntilstepsfinish.BlockUntilStepsFinishMeta
Flow | BlockingStep | This step blocks until all incoming rows have been processed. Subsequent steps only receive the last input row to this step. | opdts.blockingstep.BlockingStepMeta
Transform | Calculator | Create new fields by performing simple calculations | opdts.calculator.CalculatorMeta
Lookup | DBProc | Get back information by calling a database procedure. | opdts.dbproc.DBProcMeta
BA Server | CallEndpointStep | Calls API endpoints from the BA server within a PDI transformation. | org.pentaho.di.baserver.utils.CallEndpointMeta
Utility | ChangeFileEncoding | Change file encoding and create a new file | opdts.changefileencoding.ChangeFileEncodingMeta
Big Data | CassandraInput | Read from a Cassandra column family | opdts.cassandrainput.CassandraInputMeta
Big Data | CassandraOutput | Write to a Cassandra column family | opdts.cassandraoutput.CassandraOutputMeta
Lookup | ColumnExists | Check if a column exists in a table on a specified connection. | opdts.columnexists.ColumnExistsMeta
Lookup | FileLocked | Check if a file is locked by another process | opdts.filelocked.FileLockedMeta
Lookup | WebServiceAvailable | Check if a web service is available | opdts.webserviceavailable.WebServiceAvailableMeta
Utility | CloneRow | Clone a row as many times as needed | opdts.clonerow.CloneRowMeta
Transform | ClosureGenerator | This step generates a closure table using parent-child relationships. | opdts.closure.ClosureGeneratorMeta
Data Warehouse | CombinationLookup | Update a junk dimension in a data warehouse. Alternatively, look up information in this dimension. The primary key of a junk dimension consists of all its fields. | opdts.combinationlookup.CombinationLookupMeta
Transform | ConcatFields | The Concat Fields step is used to concatenate multiple fields into one target field. The fields can be separated by a separator and the enclosure logic is completely compatible with the Text File Output step. | opdts.concatfields.ConcatFieldsMeta
Job | RowsToResult | Use this step to write rows to the executing job. The information will then be passed to the next entry in this job. | opdts.rowstoresult.RowsToResultMeta
Big Data | CouchDbInput | Retrieves all documents from a given view in a given design document from a given database | opdts.couchdbinput.CouchDbInputMeta
Validation | CreditCardValidator | The Credit Card Validator step tells you: (1) whether a credit card number is valid (using the LUHN10 (MOD-10) algorithm) and (2) which credit card vendor handles the number (VISA, MasterCard, Diners Club, EnRoute, American Express (AMEX), ...) | opdts.creditcardvalidator.CreditCardValidatorMeta
Input | CsvInput | Simple CSV file input | opdts.csvinput.CsvInputMeta
Input | DataGrid | Enter rows of static data in a grid, usually for testing, reference or demo purposes | opdts.datagrid.DataGridMeta
Validation | Validator | Validates passing data based on a set of rules | opdts.validator.ValidatorMeta
Lookup | DBJoin | Execute a database query using stream values as parameters | opdts.databasejoin.DatabaseJoinMeta
Lookup | DBLookup | Look up values in a database using field values | opdts.databaselookup.DatabaseLookupMeta
Input | CubeInput | Read rows of data from a data cube. | opdts.cubeinput.CubeInputMeta
Utility | Delay | Output each input row after a delay | opdts.delay.DelayMeta
Output | Delete | Delete data in a database table based upon keys | opdts.delete.DeleteMeta
Flow | DetectEmptyStream | This step outputs one empty row if the input stream is empty (i.e., when the input stream does not contain any rows) | opdts.detectemptystream.DetectEmptyStreamMeta
Data Warehouse | DimensionLookup | Update a slowly changing dimension in a data warehouse. Alternatively, look up information in this dimension. | opdts.dimensionlookup.DimensionLookupMeta
Flow | Dummy | This step type doesn't do anything. It's useful, however, when testing things or in certain situations where you want to split streams. | opdts.dummytrans.DummyTransMeta
Lookup | DynamicSQLRow | Execute a dynamic SQL statement built in a previous field | opdts.dynamicsqlrow.DynamicSQLRowMeta
Utility | TypeExitEdi2XmlStep | Converts an EDIFACT message to XML to simplify data extraction (available in PDI 4.4, already present in CI trunk builds) | opdts.edi2xml.Edi2XmlMeta
Bulk loading | ElasticSearchBulk | Performs bulk inserts into ElasticSearch | opdts.elasticsearchbulk.ElasticSearchBulkMeta
Input | MailInput | Read POP3/IMAP server and retrieve messages | opdts.mailinput.MailInputMeta
Input | ShapeFileReader | Reads shape file data from an ESRI shape file and linked DBF file | org.pentaho.di.shapefilereader.ShapeFileReaderMeta
Flow | MetaInject | This step allows you to inject metadata into an existing transformation prior to execution. This allows for the creation of dynamic and highly flexible data integration solutions. | opdts.metainject.MetaInjectMeta
Utility | ExecProcess | Execute a process and return the result | opdts.execprocess.ExecProcessMeta
Scripting | ExecSQLRow | Execute an SQL script extracted from a field created in a previous step. | opdts.execsqlrow.ExecSQLRowMeta
Scripting | ExecSQL | Execute an SQL script, optionally parameterized using input rows | opdts.sql.ExecSQLMeta
Lookup | FileExists | Check if a file exists | opdts.fileexists.FileExistsMeta
Flow | FilterRows | Filter rows using simple equations | opdts.filterrows.FilterRowsMeta
Input | FixedInput | Fixed file input | opdts.fixedinput.FixedInputMeta
Scripting | Formula | Calculate a formula using Pentaho's libformula | opdts.formula.FormulaMeta
Lookup | FuzzyMatch | Find approximate matches to a string using matching algorithms. Reads a field from the main stream and outputs the approximate value from the lookup stream. | opdts.fuzzymatch.FuzzyMatchMeta
Input | RandomCCNumberGenerator | Generate random valid (Luhn check) credit card numbers | opdts.randomccnumber.RandomCCNumberGeneratorMeta
Input | RandomValue | Generate random values | opdts.randomvalue.RandomValueMeta
Input | RowGenerator | Generate a number of empty or equal rows. | opdts.rowgenerator.RowGeneratorMeta
Input | getXMLData | Get data from an XML file by using XPath. This step also allows you to parse XML defined in a previous field. | opdts.getxmldata.GetXMLDataMeta
Input | GetFileNames | Get file names from the operating system and send them to the next step. | opdts.getfilenames.GetFileNamesMeta
Job | FilesFromResult | This step allows you to read filenames used or generated in a previous entry in a job. | opdts.filesfromresult.FilesFromResultMeta
Input | GetFilesRowsCount | Get Files Rows Count | opdts.getfilesrowscount.GetFilesRowsCountMeta
Transform | GetSlaveSequence | Retrieves unique IDs in blocks from a slave server. The referenced sequence needs to be configured on the slave server in the XML configuration file. | opdts.getslavesequence.GetSlaveSequenceMeta
Input | GetRepositoryNames | Lists detailed information about transformations and/or jobs in a repository | opdts.getrepositorynames.GetRepositoryNamesMeta
Job | RowsFromResult | This allows you to read rows from a previous entry in a job | opdts.rowsfromresult.RowsFromResultMeta
BA Server | GetSessionVariableStep | Retrieves the value of a session variable | org.pentaho.di.baserver.utils.GetSessionVariableMeta
Input | GetSubFolders | Read a parent folder and return all subfolders | opdts.getsubfolders.GetSubFoldersMeta
Input | SystemInfo | Get information from the system like system date, arguments, etc. | opdts.systemdata.SystemDataMeta
Input | GetTableNames | Get table names from a database connection and send them to the next step | opdts.gettablenames.GetTableNamesMeta
Job | GetVariable | Determine the values of certain (environment or Kettle) variables and put them in field values. | opdts.getvariable.GetVariableMeta
Input | TypeExitGoogleAnalyticsInputStep | Fetches data from a Google Analytics account | opdts.googleanalytics.GaInputStepMeta
Deprecated | GPBulkLoader | Greenplum Bulk Loader | opdts.gpbulkloader.GPBulkLoaderMeta
Bulk loading | GPLoad | Greenplum Load |
Statistics | GroupBy | Builds aggregates in a group by fashion. This works only on a sorted input. If the input is not sorted, only double consecutive rows are handled correctly. | opdts.groupby.GroupByMeta
Input | ParallelGzipCsvInput | Parallel GZIP CSV file input reader | opdts.parallelgzipcsv.ParGzipCsvInputMeta
Big Data | HadoopFileInputPlugin | Read data from a variety of different text-file types stored on a Hadoop cluster | opdts.hadoopfileinput.HadoopFileInputMeta
Big Data | HadoopFileOutputPlugin | Write data to a variety of different text-file types stored on a Hadoop cluster | opdts.hadoopfileoutput.HadoopFileOutputMeta
Big Data | HbaseInput | Read from an HBase column family | opdts.hbaseinput.HBaseInputMeta
Big Data | HbaseOutput | Write to an HBase column family | opdts.hbaseoutput.HBaseOutputMeta
Big Data | HBaseRowDecoder | Decodes an incoming key and HBase result object according to a mapping | opdts.hbaserowdecoder.HBaseRowDecoderMeta
Input | HL7Input | Read data from HL7 data streams. | opdt.hl7.plugins.hl7input
Lookup | HTTP | Call a web service over HTTP, supplying a base URL and allowing parameters to be set dynamically | opdts.http.HTTPMeta
Lookup | HTTPPOST | Call a web service over HTTP POST, supplying a base URL and allowing parameters to be set dynamically | opdts.httppost.HTTPPOSTMeta
Deprecated | MQInput | Receive messages from any IBM WebSphere MQ server |
Deprecated | MQOutput | Send messages to any IBM WebSphere MQ server |
Flow | DetectLastRow | The last row will be marked | opdts.detectlastrow.DetectLastRowMeta
Utility | IfNull | Sets a field value to a constant if it is null. | opdts.ifnull.IfNullMeta
Bulk loading | InfobrightOutput | Load data into an Infobright database table | opdts.infobrightoutput.InfobrightLoaderMeta
Bulk loading | VectorWiseBulkLoader | This step interfaces with the Ingres VectorWise Bulk Loader "COPY TABLE" command. | opdts.ivwloader.IngresVectorwiseLoaderMeta
Inline | Injector | Injector step to allow rows to be injected into the transformation through the Java API | opdts.injector.InjectorMeta
Output | InsertUpdate | Update or insert rows in a database based upon keys. | opdts.insertupdate.InsertUpdateMeta
Flow | JavaFilter | Filter rows using Java code | opdts.javafilter.JavaFilterMeta
Deprecated (pre-v8.1), Input (v8.1 and after) | JmsInput | Receive messages from a JMS server |
Deprecated (pre-v8.1), Output (v8.1 and after) | JmsOutput | Send messages to a JMS server |
Flow | JobExecutor | This step executes a Pentaho Data Integration job, passing parameters and rows. | opdts.jobexecutor.JobExecutorMeta
Joins | JoinRows | The output of this step is the Cartesian product of the input streams. The number of rows is the multiplication of the number of rows in the input streams. | opdts.joinrows.JoinRowsMeta
Input | JsonInput | Extract relevant portions out of JSON structures (file or incoming field) and output rows | opdts.jsoninput.JsonInputMeta
Output | JsonOutput | Create a JSON block and output it to a field or a file. | opdts.jsonoutput.JsonOutputMeta
Data Mining | KF | Executes a Knowledge Flow data mining process | org.pentaho.di.kf.KFMeta
Input | LDAPInput | Read data from an LDAP host | opdts.ldapinput.LDAPInputMeta
Output | LDAPOutput | Perform insert, upsert, update, add, or delete operations on records based on their DN (Distinguished Name). | opdts.ldapoutput.LDAPOutputMeta
Input | LDIFInput | Read data from LDIF files | opdts.ldifinput.LDIFInputMeta
Input | LoadFileInput | Load file content in memory | opdts.loadfileinput.LoadFileInputMeta
Deprecated | LucidDBStreamingLoader | Load data into LucidDB by using Remote Rows UDX. | opdts.luciddbstreamingloader.LucidDBStreamingLoaderMeta
Utility | Mail | Send email. | opdts.mail.MailMeta
Validation | MailValidator | Check if an email address is valid. | opdts.mailvalidator.MailValidatorMeta
Mapping | Mapping | Run a mapping (sub-transformation), use MappingInput and MappingOutput to specify the fields interface | opdts.mapping.MappingMeta
Mapping | MappingInput | Specify the input interface of a mapping | opdts.mappinginput.MappingInputMeta
Mapping | MappingOutput | Specify the output interface of a mapping | opdts.mappingoutput.MappingOutputMeta
Big Data | HadoopEnterPlugin | Key Value pairs enter here from Hadoop MapReduce | opdts.hadoopenter.HadoopEnterMeta
Big Data | HadoopExitPlugin | Key Value pairs exit here and are pushed into Hadoop MapReduce | opdts.hadoopexit.HadoopExitMeta
Lookup | MaxMindGeoIPLookup | Look up an IPv4 address in a MaxMind database and add fields such as geography, ISP, or organization. | com.maxmind.geoip.MaxMindGeoIPLookupMeta
Statistics | MemoryGroupBy | Builds aggregates in a group by fashion. This step doesn't require sorted input. | opdts.memgroupby.MemoryGroupByMeta
Joins | MergeJoin | Joins two streams on a given key and outputs a joined set. The input streams must be sorted on the join key | opdts.mergejoin.MergeJoinMeta
Joins | MergeRows | Merge two streams of rows, sorted on a certain key. The two streams are compared and the equal, changed, deleted and new rows are flagged. | opdts.mergerows.MergeRowsMeta
Utility | StepMetastructure | This is a step to read the metadata of the incoming stream. | opdts.stepmeta.StepMetastructureMeta
Input | AccessInput | Read data from a Microsoft Access file | opdts.accessinput.AccessInputMeta
Output | AccessOutput | Stores records into an MS-Access database table. | opdts.accessoutput.AccessOutputMeta
Input | ExcelInput | Read data from Excel and OpenOffice Workbooks (XLS, XLSX, ODS). | opdts.excelinput.ExcelInputMeta
Output | ExcelOutput | Stores records into an Excel (XLS) document with formatting information. | opdts.exceloutput.ExcelOutputMeta
Output | TypeExitExcelWriterStep | Writes or appends data to an Excel file | opdts.excelwriter.ExcelWriterStepMeta
Scripting | ScriptValueMod | This step allows the execution of JavaScript programs (and much more) | opdts.scriptvalues_mod.ScriptValuesMetaMod
Input | MondrianInput | Execute and retrieve data using an MDX query against a Pentaho Analysis OLAP server (Mondrian) | opdts.mondrianinput.MondrianInputMeta
Bulk loading | MonetDBBulkLoader | Load data into MonetDB by using their bulk load command in streaming mode. | opdts.monetdbbulkloader.MonetDBBulkLoaderMeta
Big Data | MongoDbInput | Reads all entries from a MongoDB collection in the specified database. | opdts.mongodbinput.MongoDbInputMeta
Big Data | MongoDbOutput | Write to a MongoDB collection. | opdts.mongodboutput.MongoDbOutputMeta
Joins | MultiwayMergeJoin | Multiway Merge Join | opdts.multimerge.MultiMergeJoinMeta
Bulk loading | MySQLBulkLoader | MySQL bulk loader step, loading data over a named pipe (not available on MS Windows) | opdts.mysqlbulkloader.MySQLBulkLoaderMeta
Utility | NullIf | Sets a field value to null if it is equal to a constant value | opdts.nullif.NullIfMeta
Transform | NumberRange | Create ranges based on a numeric field | opdts.numberrange.NumberRangeMeta
Input | OlapInput | Execute and retrieve data using an MDX query against any XML/A OLAP datasource using olap4j | opdts.olapinput.OlapInputMeta
Deprecated | OpenERPObjectDelete | Deletes data from the OpenERP server using the XMLRPC interface with the 'unlink' function. | opdts.openerp.objectdelete.OpenERPObjectDeleteMeta
Deprecated | OpenERPObjectInput | Retrieves data from the OpenERP server using the XMLRPC interface with the 'read' function. | opdts.openerp.objectinput.OpenERPObjectInputMeta
Deprecated | OpenERPObjectOutputImport | Updates data on the OpenERP server using the XMLRPC interface and the 'import' function | opdts.openerp.objectoutput.OpenERPObjectOutputMeta
Bulk loading | OraBulkLoader | Use Oracle Bulk Loader to load data | opdts.orabulkloader.OraBulkLoaderMeta
Statistics | StepsMetrics | Return metrics for one or several steps | opdts.stepsmetrics.StepsMetricsMeta
Deprecated | PaloCellInput | Retrieves all cell data from a Palo cube | opdts.palo.cellinput
Deprecated | PaloCellOutput | Updates cell data in a Palo cube | opdts.palo.celloutput
Deprecated | PaloDimInput | Returns elements from a dimension in a Palo database | opdts.palo.diminput
Deprecated | PaloDimOutput | Creates/updates dimension elements and element consolidations in a Palo database | opdts.palo.dimoutput
Output | PentahoReportingOutput | Executes an existing report (PRPT) | opdts.pentahoreporting.PentahoReportingOutputMeta
Bulk loading | PGBulkLoader | PostgreSQL Bulk Loader | opdts.pgbulkloader.PGBulkLoaderMeta
Flow | PrioritizeStreams | Prioritize streams in an ordered way. | opdts.prioritizestreams.PrioritizeStreamsMeta
Utility | ProcessFiles | Process one file per row (copy, move, or delete). This step only accepts a filename as input. | opdts.processfiles.ProcessFilesMeta
Output | PropertyOutput | Write data to a properties file | opdts.propertyoutput.PropertyOutputMeta
Input | PropertyInput | Read data (key, value) from properties files. | opdts.propertyinput.PropertyInputMeta
Statistics | RScriptExecutor | Executes an R script within a PDI transformation |
Scripting | RegexEval | Regular expression evaluation. This step uses a regular expression to evaluate a field. It can also extract new fields out of an existing field with capturing groups. | opdts.regexeval.RegexEvalMeta
Transform | ReplaceString | Replace all occurrences of a word in a string with another word. | opdts.replacestring.ReplaceStringMeta
Statistics | ReservoirSampling | Samples a fixed number of rows from the incoming stream | opdts.reservoirsampling.ReservoirSamplingMeta
Lookup | Rest | Consume RESTful services. REpresentational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which the web services are viewed as resources and can be identified by their URLs | opdts.rest.RestMeta
Transform | Denormaliser | Denormalises rows by looking up key-value pairs and by assigning them to new fields in the output rows. This method aggregates and needs the input rows to be sorted on the grouping fields | opdts.denormaliser.DenormaliserMeta
Transform | Flattener | Flattens consecutive rows based on the order in which they appear in the input stream | opdts.flattener.FlattenerMeta
Transform | Normaliser | De-normalised information can be normalised using this step type. | opdts.normaliser.NormaliserMeta
Input | RssInput | Read RSS feeds | opdts.rssinput.RssInputMeta
Output | RssOutput | Write RSS streams. | opdts.rssoutput.RssOutputMeta
Scripting | RuleExecutor | Execute a rule against each row (using Drools) | opdts.rules.RulesExecutorMeta
Scripting | RuleAccumulator | Execute a rule against a set of rows (using Drools) | opdts.rules.RulesAccumulatorMeta
Utility | SSH | Run SSH commands and return the result. | opdts.ssh.SSHMeta
Input | S3CSVINPUT | S3 CSV Input | opdts.s3csvinput.S3CsvInputMeta
Output | S3FileOutputPlugin | Exports data to a text file on an Amazon Simple Storage Service (S3) | com.pentaho.amazon.s3.S3FileOutputMeta
Bulk loading | HanaBulkLoader | Bulk load data into SAP HANA | org.pentaho.di.trans.steps.hanabulkloader.HanaBulkLoaderMeta
Output | SalesforceDelete | Delete records in a Salesforce module. | opdts.salesforcedelete.SalesforceDeleteMeta
Input | SalesforceInput | Reads information from Salesforce | opdts.salesforceinput.SalesforceInputMeta
Output | SalesforceInsert | Insert records in a Salesforce module. | opdts.salesforceinsert.SalesforceInsertMeta
Output | SalesforceUpdate | Update records in a Salesforce module. | opdts.salesforceupdate.SalesforceUpdateMeta
Output | SalesforceUpsert | Insert or update records in a Salesforce module. | opdts.salesforceupsert.SalesforceUpsertMeta
Statistics | SampleRows | Filter rows based on the line number. | opdts.samplerows.SampleRowsMeta
Deprecated | SapInput | Read data from SAP ERP, optionally with parameters | opdts.sapinput.SapInputMeta
Input | SASInput | This step reads files in sas7bdat (SAS) native format | opdts.sasinput.SasInputMeta
Cryptography | SecretKeyGenerator | Generate secret keys for algorithms such as DES, AES, TripleDES. | opdts.symmetriccrypto.secretkeygenerator.SecretKeyGeneratorMeta
Transform | SelectValues | Select or remove fields in a row. Optionally, set the field meta-data: type, length and precision. | opdts.selectvalues.SelectValuesMeta
Utility | SyslogMessage | Send messages to a Syslog server | opdts.syslog.SyslogMessageMeta
Output | CubeOutput | Write rows of data to a data cube | opdts.cubeoutput.CubeOutputMeta
Transform | SetValueField | Replace the value of a field with the value of another field | opdts.setvaluefield.SetValueFieldMeta
Transform | SetValueConstant | Replace the value of a field with a constant | opdts.setvalueconstant.SetValueConstantMeta
Job | FilesToResult | This step allows you to set filenames in the result of this transformation. Subsequent job entries can then use this information. | opdts.filestoresult.FilesToResultMeta
BA Server | SetSessionVariableStep | Allows you to set the value of a session variable | org.pentaho.di.baserver.utils.SetSessionVariableMeta
Job | SetVariable | Set environment variables based on a single input row. | opdts.setvariable.SetVariableMeta
Mapping | SimpleMapping | Turn a repetitive, re-usable part of a transformation (a sequence of steps) into a mapping (sub-transformation). | opdts.simplemapping.SimpleMapping
Flow | SingleThreader | Executes a transformation snippet in a single thread. You need a standard mapping or a transformation with an Injector step where data from the parent transformation will arrive in blocks. | opdts.singlethreader.SingleThreaderMeta
Inline | SocketReader | Socket reader. A socket client that connects to a server (Socket Writer step). | opdts.socketreader.SocketReaderMeta
Inline | SocketWriter | Socket writer. A socket server that can send rows of data to a socket reader. | opdts.socketwriter.SocketWriterMeta
Transform | SortRows | Sort rows based upon field values (ascending or descending) | opdts.sort.SortRowsMeta
Joins | SortedMerge | Sorted Merge | opdts.sortedmerge.SortedMergeMeta
Transform | SplitFieldToRows3 | Splits a single string field by delimiter and creates a new row for each split term | opdts.splitfieldtorows.SplitFieldToRowsMeta
Transform | FieldSplitter | When you want to split a single field into more than one, use this step type. | opdts.fieldsplitter.FieldSplitterMeta
Transform | SplunkInput | Reads data from Splunk. | opdts.splunk.SplunkInputMeta
Transform | SplunkOutput | Writes data to Splunk. | opdts.splunk.SplunkOutputMeta
Output | SQLFileOutput | Output SQL INSERT statements to a file | opdts.sqlfileoutput.SQLFileOutputMeta
Lookup | StreamLookup | Look up values coming from another stream in the transformation. | opdts.streamlookup.StreamLookupMeta
Big Data | SSTableOutput | Writes to a filesystem directory as a Cassandra SSTable | opdts.cassandrasstableoutput.SSTableOutputMeta
Transform | StringOperations | Apply certain operations like trimming, padding, and others to string values. | opdts.stringoperations.StringOperationsMeta
Transform | StringCut | Strings cut (substring). | opdts.stringcut.StringCutMeta
Flow | SwitchCase | Switch a row to a certain target step based on the case value in a field. | opdts.switchcase.SwitchCaseMeta
Cryptography | SymmetricCryptoTrans | Encrypt or decrypt a string using symmetric encryption. Available algorithms are DES, AES, TripleDES. | opdts.symmetriccrypto.symmetriccryptotrans.SymmetricCryptoTransMeta
Output | SynchronizeAfterMerge | This step performs insert/update/delete operations in one go based on the value of a field. | opdts.synchronizeaftermerge.SynchronizeAfterMergeMeta
Utility | TableCompare | This step compares the data from two tables (provided they have the same layout). It finds differences between the data in the two tables and logs them. | opdts.tablecompare.TableCompareMeta
Lookup | TableExists | Check if a table exists on a specified connection | opdts.tableexists.TableExistsMeta
Input | TableInput | Read information from a database table. | opdts.tableinput.TableInputMeta
Output | TableOutput | Write information to a database table | opdts.tableoutput.TableOutputMeta
Bulk loading | TeraFast | The Teradata Fastload bulk loader | opdts.terafast.TeraFastMeta
Bulk loading | TeraDataBulkLoader | Bulk loading via TPT using the tbuild command. |
Input | TextFileInput | Read data from a text file in several formats. This data can then be passed on to the next step(s)... | opdts.textfileinput.TextFileInputMeta
Deprecated | TextFileOutput | Write rows to a text file. | opdts.textfileoutput.TextFileOutputMeta
Flow | TransExecutor | This step executes a Pentaho Data Integration transformation, sets parameters, and passes rows. | opdts.transexecutor.TransExecutorMeta
Transform | Unique | Remove double rows and leave only unique occurrences. This works only on a sorted input. If the input is not sorted, only double consecutive rows are handled correctly. | opdts.uniquerows.UniqueRowsMeta
Transform | UniqueRowsByHashSet | Remove double rows and leave only unique occurrences by using a HashSet. | opdts.uniquerowsbyhashset.UniqueRowsByHashSetMeta
Statistics | UnivariateStats | This step computes some simple stats based on a single input field | opdts.univariatestats.UnivariateStatsMeta
Output | Update | Update data in a database table based upon keys | opdts.update.UpdateMeta
Scripting | UserDefinedJavaClass | This step allows you to program a step using Java code | opdts.userdefinedjavaclass.UserDefinedJavaClassMeta
Scripting | Janino | Calculate the result of a Java Expression using Janino | opdts.janino.JaninoMeta
Transform | ValueMapper | Maps values of a certain field from one value to another | opdts.valuemapper.ValueMapperMeta
Bulk loading | VerticaBulkLoader | Bulk loads data into a Vertica table using their high performance COPY feature | opdts.verticabulkload.VerticaBulkLoaderMeta
Lookup | WebServiceLookup | Look up information using web services (WSDL) | opdts.webservices.WebServiceMeta
Utility | WriteToLog | Write data to log | opdts.writetolog.WriteToLogMeta
Input | XBaseInput | Reads records from an XBase type of database file (DBF) | opdts.xbaseinput.XBaseInputMeta
Input | XMLInputStream | This step is capable of processing very large and complex XML files very fast. | opdts.xmlinputstream.XMLInputStreamMeta
Joins | XMLJoin | Joins a stream of XML tags into a target XML string | opdts.xmljoin.XMLJoinMeta
Output | XMLOutput | Write data to an XML file | opdts.xmloutput.XMLOutputMeta
Validation | XSDValidator | Validate XML sources (files or streams) against an XML Schema Definition. | opdts.xsdvalidator.XsdValidatorMeta
Transform | XSLT | Transform an XML stream using XSL (eXtensible Stylesheet Language). | opdts.xslt.XsltMeta
Input | YamlInput | Read YAML sources (files or streams), parse them, convert them to rows, and write these to one or more outputs. | opdts.yamlinput.YamlInputMeta
Utility | ZipFile | Creates a standard ZIP archive from the data stream fields | opdts.zipfile.ZipFileMeta
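Most entries in the Metadata Java class column abbreviate the common package prefix: judging from known Kettle classes such as `org.pentaho.di.trans.steps.abort.AbortMeta`, `opdts.` appears to stand for `org.pentaho.di.trans.steps.`. A minimal helper to expand the shorthand when processing this table programmatically (the function name is my own, not part of PDI):

```python
# "opdts." is assumed shorthand for the common Kettle step package prefix.
# Entries that are already fully qualified (e.g. org.pentaho.di.baserver.utils.*)
# are returned unchanged.
OPDTS_PREFIX = "org.pentaho.di.trans.steps."

def expand_meta_class(meta_class: str) -> str:
    """Expand the table's 'opdts.' shorthand to a fully qualified class name."""
    if meta_class.startswith("opdts."):
        return OPDTS_PREFIX + meta_class[len("opdts."):]
    return meta_class

print(expand_meta_class("opdts.abort.AbortMeta"))
# → org.pentaho.di.trans.steps.abort.AbortMeta
```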