...
Category | ID | Description | Metadata Java class
---|---|---|---
Flow | Abort | Abort a transformation | opdts.abort.AbortMeta
Transform | CheckSum | Add a checksum column for each input row | opdts.checksum.CheckSumMeta
Transform | Constant | Add one or more constants to the input rows | opdts.constant.ConstantMeta
Transform | Sequence | Get the next value from a sequence | opdts.addsequence.AddSequenceMeta
Transform | FieldsChangeSequence | Add a sequence that depends on field value changes. Each time the value of at least one field changes, PDI resets the sequence. | opdts.fieldschangesequence.FieldsChangeSequenceMeta
Transform | AddXML | Encode several fields into an XML fragment | opdts.addxml.AddXMLMeta
Statistics | AnalyticQuery | Execute analytic queries over a sorted dataset (LEAD/LAG/FIRST/LAST) | opdts.analyticquery.AnalyticQueryMeta
Flow | Append | Append 2 streams in an ordered way | opdts.append.AppendMeta
Data Mining | Arff Output | Writes data in ARFF format to a file | opdts.append.arff.ArffOutputMeta
Output | AutoDoc | This step automatically generates documentation based on input in the form of a list of transformations and jobs | opdts.autodoc.AutoDocMeta
Deprecated (pre-v8.0), Input (v8.0 and after) | AvroInput | Decode binary or JSON Avro data from a file or a field | opdts.avroinput.AvroInputMeta
Output | AvroOutput | Encode binary or JSON Avro data to a file | opdts.avrooutput.AvroOutputMeta
Flow | BlockUntilStepsFinish | Block this step until selected steps finish. | opdts.blockuntilstepsfinish.BlockUntilStepsFinishMeta
Flow | BlockingStep | This step blocks until all incoming rows have been processed. Subsequent steps only receive the last input row to this step. | opdts.blockingstep.BlockingStepMeta
Transform | Calculator | Create new fields by performing simple calculations | opdts.calculator.CalculatorMeta
Lookup | DBProc | Get back information by calling a database procedure. | opdts.dbproc.DBProcMeta
BA Server | CallEndpointStep | Calls API endpoints from the BA server within a PDI transformation. | org.pentaho.di.baserver.utils.CallEndpointMeta
Utility | ChangeFileEncoding | Change file encoding and create a new file | opdts.changefileencoding.ChangeFileEncodingMeta
Big Data | CassandraInput | Read from a Cassandra column family | opdts.cassandrainput.CassandraInputMeta
Big Data | CassandraOutput | Write to a Cassandra column family | opdts.cassandraoutput.CassandraOutputMeta
Lookup | ColumnExists | Check if a column exists in a table on a specified connection. | opdts.columnexists.ColumnExistsMeta
Lookup | FileLocked | Check if a file is locked by another process | opdts.filelocked.FileLockedMeta
Lookup | WebServiceAvailable | Check if a web service is available | opdts.webserviceavailable.WebServiceAvailableMeta
Utility | CloneRow | Clone a row as many times as needed | opdts.clonerow.CloneRowMeta
Transform | ClosureGenerator | This step generates a closure table using parent-child relationships. | opdts.closure.ClosureGeneratorMeta
Data Warehouse | CombinationLookup | Update a junk dimension in a data warehouse. Alternatively, look up information in this dimension. The primary key of a junk dimension consists of all its fields. | opdts.combinationlookup.CombinationLookupMeta
Transform | ConcatFields | The Concat Fields step is used to concatenate multiple fields into one target field. The fields can be separated by a separator and the enclosure logic is completely compatible with the Text File Output step. | opdts.concatfields.ConcatFieldsMeta
Job | RowsToResult | Use this step to write rows to the executing job. The information will then be passed to the next entry in this job. | opdts.rowstoresult.RowsToResultMeta
Big Data | CouchDbInput | Retrieves all documents from a given view in a given design document from a given database | opdts.couchdbinput.CouchDbInputMeta
Validation | CreditCardValidator | The Credit Card Validator step tells you: (1) whether a credit card number is valid (using the LUHN10 (MOD-10) algorithm) and (2) which credit card vendor handles the number (VISA, MasterCard, Diners Club, EnRoute, American Express (AMEX), ...) | opdts.creditcardvalidator.CreditCardValidatorMeta
Input | CsvInput | Simple CSV file input | opdts.csvinput.CsvInputMeta
Input | DataGrid | Enter rows of static data in a grid, usually for testing, reference or demo purposes | opdts.datagrid.DataGridMeta
Validation | Validator | Validates passing data based on a set of rules | opdts.validator.ValidatorMeta
Lookup | DBJoin | Execute a database query using stream values as parameters | opdts.databasejoin.DatabaseJoinMeta
Lookup | DBLookup | Look up values in a database using field values | opdts.databaselookup.DatabaseLookupMeta
Input | CubeInput | Read rows of data from a data cube. | opdts.cubeinput.CubeInputMeta
Utility | Delay | Output each input row after a delay | opdts.delay.DelayMeta
Output | Delete | Delete data in a database table based upon keys | opdts.delete.DeleteMeta
Flow | DetectEmptyStream | This step outputs one empty row if the input stream is empty (i.e., when the input stream does not contain any rows) | opdts.detectemptystream.DetectEmptyStreamMeta
Data Warehouse | DimensionLookup | Update a slowly changing dimension in a data warehouse. Alternatively, look up information in this dimension. | opdts.dimensionlookup.DimensionLookupMeta
Flow | Dummy | This step type doesn't do anything. It's useful, however, when testing things or in certain situations where you want to split streams. | opdts.dummytrans.DummyTransMeta
Lookup | DynamicSQLRow | Execute a dynamic SQL statement built in a previous field | opdts.dynamicsqlrow.DynamicSQLRowMeta
Utility | TypeExitEdi2XmlStep | Converts an EDIFACT message to XML to simplify data extraction (available in PDI 4.4, already present in CI trunk builds) | opdts.edi2xml.Edi2XmlMeta
Bulk loading | ElasticSearchBulk | Performs bulk inserts into ElasticSearch | opdts.elasticsearchbulk.ElasticSearchBulkMeta
Input | MailInput | Read POP3/IMAP server and retrieve messages | opdts.mailinput.MailInputMeta
Input | ShapeFileReader | Reads shape file data from an ESRI shape file and linked DBF file | org.pentaho.di.shapefilereader.ShapeFileReaderMeta
Flow | MetaInject | This step allows you to inject metadata into an existing transformation prior to execution. This allows for the creation of dynamic and highly flexible data integration solutions. | opdts.metainject.MetaInjectMeta
Utility | ExecProcess | Execute a process and return the result | opdts.execprocess.ExecProcessMeta
Scripting | ExecSQLRow | Execute an SQL script extracted from a field created in a previous step. | opdts.execsqlrow.ExecSQLRowMeta
Scripting | ExecSQL | Execute an SQL script, optionally parameterized using input rows | opdts.sql.ExecSQLMeta
Lookup | FileExists | Check if a file exists | opdts.fileexists.FileExistsMeta
Flow | FilterRows | Filter rows using simple equations | opdts.filterrows.FilterRowsMeta
Input | FixedInput | Fixed file input | opdts.fixedinput.FixedInputMeta
Scripting | Formula | Calculate a formula using Pentaho's libformula | opdts.formula.FormulaMeta
Lookup | FuzzyMatch | Find approximate matches to a string using matching algorithms. Reads a field from the main stream and outputs the approximate value from the lookup stream. | opdts.fuzzymatch.FuzzyMatchMeta
Input | RandomCCNumberGenerator | Generate random valid (Luhn check) credit card numbers | opdts.randomccnumber.RandomCCNumberGeneratorMeta
Input | RandomValue | Generate random values | opdts.randomvalue.RandomValueMeta
Input | RowGenerator | Generate a number of empty or equal rows. | opdts.rowgenerator.RowGeneratorMeta
Input | getXMLData | Get data from an XML file by using XPath. This step also allows you to parse XML defined in a previous field. | opdts.getxmldata.GetXMLDataMeta
Input | GetFileNames | Get file names from the operating system and send them to the next step. | opdts.getfilenames.GetFileNamesMeta
Job | FilesFromResult | This step allows you to read filenames used or generated in a previous entry in a job. | opdts.filesfromresult.FilesFromResultMeta
Input | GetFilesRowsCount | Get Files Rows Count | opdts.getfilesrowscount.GetFilesRowsCountMeta
Transform | GetSlaveSequence | Retrieves unique IDs in blocks from a slave server. The referenced sequence needs to be configured on the slave server in the XML configuration file. | opdts.getslavesequence.GetSlaveSequenceMeta
Input | GetRepositoryNames | Lists detailed information about transformations and/or jobs in a repository | opdts.getrepositorynames.GetRepositoryNamesMeta
Job | RowsFromResult | This allows you to read rows from a previous entry in a job | opdts.rowsfromresult.RowsFromResultMeta
BA Server | GetSessionVariableStep | Retrieves the value of a session variable | org.pentaho.di.baserver.utils.GetSessionVariableMeta
Input | GetSubFolders | Read a parent folder and return all subfolders | opdts.getsubfolders.GetSubFoldersMeta
Input | SystemInfo | Get information from the system like system date, arguments, etc. | opdts.systemdata.SystemDataMeta
Input | GetTableNames | Get table names from a database connection and send them to the next step | opdts.gettablenames.GetTableNamesMeta
Job | GetVariable | Determine the values of certain (environment or Kettle) variables and put them in field values. | opdts.getvariable.GetVariableMeta
Input | TypeExitGoogleAnalyticsInputStep | Fetches data from a Google Analytics account | opdts.googleanalytics.GaInputStepMeta
Deprecated | GPBulkLoader | Greenplum Bulk Loader | opdts.gpbulkloader.GPBulkLoaderMeta
Bulk loading | GPLoad | Greenplum Load |
Statistics | GroupBy | Builds aggregates in a group by fashion. This works only on a sorted input. If the input is not sorted, only double consecutive rows are handled correctly. | opdts.groupby.GroupByMeta
Input | ParallelGzipCsvInput | Parallel GZIP CSV file input reader | opdts.parallelgzipcsv.ParGzipCsvInputMeta
Big Data | HadoopFileInputPlugin | Read data from a variety of different text-file types stored on a Hadoop cluster | opdts.hadoopfileinput.HadoopFileInputMeta
Big Data | HadoopFileOutputPlugin | Write data to a variety of different text-file types stored on a Hadoop cluster | opdts.hadoopfileoutput.HadoopFileOutputMeta
Big Data | HbaseInput | Read from an HBase column family | opdts.hbaseinput.HBaseInputMeta
Big Data | HbaseOutput | Write to an HBase column family | opdts.hbaseoutput.HBaseOutputMeta
Big Data | HBaseRowDecoder | Decodes an incoming key and HBase result object according to a mapping | opdts.hbaserowdecoder.HBaseRowDecoderMeta
Input | HL7Input | Read data from HL7 data streams. | opdt.hl7.plugins.hl7input
Lookup | HTTP | Call a web service over HTTP, supplying a base URL and allowing parameters to be set dynamically | opdts.http.HTTPMeta
Lookup | HTTPPOST | Call a web service over HTTP POST, supplying a base URL and allowing parameters to be set dynamically | opdts.httppost.HTTPPOSTMeta
Deprecated | MQInput | Receive messages from any IBM WebSphere MQ server |
Deprecated | MQOutput | Send messages to any IBM WebSphere MQ server |
Flow | DetectLastRow | The last row will be marked | opdts.detectlastrow.DetectLastRowMeta
Utility | IfNull | Sets a field value to a constant if it is null. | opdts.ifnull.IfNullMeta
Bulk loading | InfobrightOutput | Load data into an Infobright database table | opdts.infobrightoutput.InfobrightLoaderMeta
Bulk loading | VectorWiseBulkLoader | This step interfaces with the Ingres VectorWise Bulk Loader "COPY TABLE" command. | opdts.ivwloader.IngresVectorwiseLoaderMeta
Inline | Injector | Injector step to allow rows to be injected into the transformation through the Java API | opdts.injector.InjectorMeta
Output | InsertUpdate | Update or insert rows in a database based upon keys. | opdts.insertupdate.InsertUpdateMeta
Flow | JavaFilter | Filter rows using Java code | opdts.javafilter.JavaFilterMeta
Deprecated (pre-v8.1), Input (v8.1 and after) | JmsInput | Receive messages from a JMS server |
Deprecated (pre-v8.1), Output (v8.1 and after) | JmsOutput | Send messages to a JMS server |
Flow | JobExecutor | This step executes a Pentaho Data Integration job, passing parameters and rows. | opdts.jobexecutor.JobExecutorMeta
Joins | JoinRows | The output of this step is the Cartesian product of the input streams. The number of rows is the multiplication of the number of rows in the input streams. | opdts.joinrows.JoinRowsMeta
Input | JsonInput | Extract relevant portions out of JSON structures (file or incoming field) and output rows | opdts.jsoninput.JsonInputMeta
Output | JsonOutput | Create a JSON block and output it to a field or a file. | opdts.jsonoutput.JsonOutputMeta
Data Mining | KF | Executes a Knowledge Flow data mining process | org.pentaho.di.kf.KFMeta
Input | LDAPInput | Read data from an LDAP host | opdts.ldapinput.LDAPInputMeta
Output | LDAPOutput | Perform insert, upsert, update, add, or delete operations on records based on their DN (Distinguished Name). | opdts.ldapoutput.LDAPOutputMeta
Input | LDIFInput | Read data from LDIF files | opdts.ldifinput.LDIFInputMeta
Input | LoadFileInput | Load file content in memory | opdts.loadfileinput.LoadFileInputMeta
Deprecated | LucidDBStreamingLoader | Load data into LucidDB by using Remote Rows UDX. | opdts.luciddbstreamingloader.LucidDBStreamingLoaderMeta
Utility | Mail | Send email. | opdts.mail.MailMeta
Validation | MailValidator | Check if an email address is valid. | opdts.mailvalidator.MailValidatorMeta
Mapping | Mapping | Run a mapping (sub-transformation), use MappingInput and MappingOutput to specify the fields interface | opdts.mapping.MappingMeta
Mapping | MappingInput | Specify the input interface of a mapping | opdts.mappinginput.MappingInputMeta
Mapping | MappingOutput | Specify the output interface of a mapping | opdts.mappingoutput.MappingOutputMeta
Big Data | HadoopEnterPlugin | Key Value pairs enter here from Hadoop MapReduce | opdts.hadoopenter.HadoopEnterMeta
Big Data | HadoopExitPlugin | Key Value pairs exit here and are pushed into Hadoop MapReduce | opdts.hadoopexit.HadoopExitMeta
Lookup | MaxMindGeoIPLookup | Look up an IPv4 address in a MaxMind database and add fields such as geography, ISP, or organization. | com.maxmind.geoip.MaxMindGeoIPLookupMeta
Statistics | MemoryGroupBy | Builds aggregates in a group by fashion. This step doesn't require sorted input. | opdts.memgroupby.MemoryGroupByMeta
Joins | MergeJoin | Joins two streams on a given key and outputs a joined set. The input streams must be sorted on the join key | opdts.mergejoin.MergeJoinMeta
Joins | MergeRows | Merge two streams of rows, sorted on a certain key. The two streams are compared and the equal, changed, deleted and new rows are flagged. | opdts.mergerows.MergeRowsMeta
Utility | StepMetastructure | This is a step to read the metadata of the incoming stream. | opdts.stepmeta.StepMetastructureMeta
Input | AccessInput | Read data from a Microsoft Access file | opdts.accessinput.AccessInputMeta
Output | AccessOutput | Stores records into an MS-Access database table. | opdts.accessoutput.AccessOutputMeta
Input | ExcelInput | Read data from Excel and OpenOffice Workbooks (XLS, XLSX, ODS). | opdts.excelinput.ExcelInputMeta
Output | ExcelOutput | Stores records into an Excel (XLS) document with formatting information. | opdts.exceloutput.ExcelOutputMeta
Output | TypeExitExcelWriterStep | Writes or appends data to an Excel file | opdts.excelwriter.ExcelWriterStepMeta
Scripting | ScriptValueMod | This step allows the execution of JavaScript programs (and much more) | opdts.scriptvalues_mod.ScriptValuesMetaMod
Input | MondrianInput | Execute and retrieve data using an MDX query against a Pentaho Analysis OLAP server (Mondrian) | opdts.mondrianinput.MondrianInputMeta
Bulk loading | MonetDBBulkLoader | Load data into MonetDB by using their bulk load command in streaming mode. | opdts.monetdbbulkloader.MonetDBBulkLoaderMeta
Big Data | MongoDbInput | Reads all entries from a MongoDB collection in the specified database. | opdts.mongodbinput.MongoDbInputMeta
Big Data | MongoDbOutput | Write to a MongoDB collection. | opdts.mongodboutput.MongoDbOutputMeta
Joins | MultiwayMergeJoin | Multiway Merge Join | opdts.multimerge.MultiMergeJoinMeta
Bulk loading | MySQLBulkLoader | MySQL bulk loader step, loading data over a named pipe (not available on MS Windows) | opdts.mysqlbulkloader.MySQLBulkLoaderMeta
Utility | NullIf | Sets a field value to null if it is equal to a constant value | opdts.nullif.NullIfMeta
Transform | NumberRange | Create ranges based on a numeric field | opdts.numberrange.NumberRangeMeta
Input | OlapInput | Execute and retrieve data using an MDX query against any XML/A OLAP datasource using olap4j | opdts.olapinput.OlapInputMeta
Deprecated | OpenERPObjectDelete | Deletes data from the OpenERP server using the XMLRPC interface with the 'unlink' function. | opdts.openerp.objectdelete.OpenERPObjectDeleteMeta
Deprecated | OpenERPObjectInput | Retrieves data from the OpenERP server using the XMLRPC interface with the 'read' function. | opdts.openerp.objectinput.OpenERPObjectInputMeta
Deprecated | OpenERPObjectOutputImport | Updates data on the OpenERP server using the XMLRPC interface and the 'import' function | opdts.openerp.objectoutput.OpenERPObjectOutputMeta
Bulk loading | OraBulkLoader | Use Oracle Bulk Loader to load data | opdts.orabulkloader.OraBulkLoaderMeta
Statistics | StepsMetrics | Return metrics for one or several steps | opdts.stepsmetrics.StepsMetricsMeta
Deprecated | PaloCellInput | Retrieves all cell data from a Palo cube | opdts.palo.cellinput
Deprecated | PaloCellOutput | Updates cell data in a Palo cube | opdts.palo.celloutput
Deprecated | PaloDimInput | Returns elements from a dimension in a Palo database | opdts.palo.diminput
Deprecated | PaloDimOutput | Creates/updates dimension elements and element consolidations in a Palo database | opdts.palo.dimoutput
Output | PentahoReportingOutput | Executes an existing report (PRPT) | opdts.pentahoreporting.PentahoReportingOutputMeta
Bulk loading | PGBulkLoader | PostgreSQL Bulk Loader | opdts.pgbulkloader.PGBulkLoaderMeta
Flow | PrioritizeStreams | Prioritize streams in an ordered way. | opdts.prioritizestreams.PrioritizeStreamsMeta
Utility | ProcessFiles | Process one file per row (copy, move, or delete). This step only accepts a filename as input. | opdts.processfiles.ProcessFilesMeta
Output | PropertyOutput | Write data to a properties file | opdts.propertyoutput.PropertyOutputMeta
Input | PropertyInput | Read data (key, value) from properties files. | opdts.propertyinput.PropertyInputMeta
Statistics | RScriptExecutor | Executes an R script within a PDI transformation |
Scripting | RegexEval | Regular expression evaluation. This step uses a regular expression to evaluate a field. It can also extract new fields out of an existing field with capturing groups. | opdts.regexeval.RegexEvalMeta
Transform | ReplaceString | Replace all occurrences of a word in a string with another word. | opdts.replacestring.ReplaceStringMeta
Statistics | ReservoirSampling | Samples a fixed number of rows from the incoming stream | opdts.reservoirsampling.ReservoirSamplingMeta
Lookup | Rest | Consume RESTful services. REpresentational State Transfer (REST) is a key design idiom that embraces a stateless client-server architecture in which the web services are viewed as resources and can be identified by their URLs | opdts.rest.RestMeta
Transform | Denormaliser | Denormalises rows by looking up key-value pairs and by assigning them to new fields in the output rows. This method aggregates and needs the input rows to be sorted on the grouping fields | opdts.denormaliser.DenormaliserMeta
Transform | Flattener | Flattens consecutive rows based on the order in which they appear in the input stream | opdts.flattener.FlattenerMeta
Transform | Normaliser | De-normalised information can be normalised using this step type. | opdts.normaliser.NormaliserMeta
Input | RssInput | Read RSS feeds | opdts.rssinput.RssInputMeta
Output | RssOutput | Write RSS streams. | opdts.rssoutput.RssOutputMeta
Scripting | RuleExecutor | Execute a rule against each row (using Drools) | opdts.rules.RulesExecutorMeta
Scripting | RuleAccumulator | Execute a rule against a set of rows (using Drools) | opdts.rules.RulesAccumulatorMeta
Utility | SSH | Run SSH commands and return the result. | opdts.ssh.SSHMeta
Input | S3CSVINPUT | S3 CSV Input | opdts.s3csvinput.S3CsvInputMeta
Output | S3FileOutputPlugin | Exports data to a text file on an Amazon Simple Storage Service (S3) | com.pentaho.amazon.s3.S3FileOutputMeta
Bulk loading | HanaBulkLoader | Bulk load data into SAP HANA | org.pentaho.di.trans.steps.hanabulkloader.HanaBulkLoaderMeta
Output | SalesforceDelete | Delete records in a Salesforce module. | opdts.salesforcedelete.SalesforceDeleteMeta
Input | SalesforceInput | Reads information from Salesforce | opdts.salesforceinput.SalesforceInputMeta
Output | SalesforceInsert | Insert records in a Salesforce module. | opdts.salesforceinsert.SalesforceInsertMeta
Output | SalesforceUpdate | Update records in a Salesforce module. | opdts.salesforceupdate.SalesforceUpdateMeta
Output | SalesforceUpsert | Insert or update records in a Salesforce module. | opdts.salesforceupsert.SalesforceUpsertMeta
Statistics | SampleRows | Filter rows based on the line number. | opdts.samplerows.SampleRowsMeta
Deprecated | SapInput | Read data from SAP ERP, optionally with parameters | opdts.sapinput.SapInputMeta
Input | SASInput | This step reads files in sas7bdat (SAS) native format | opdts.sasinput.SasInputMeta
Cryptography | SecretKeyGenerator | Generate secret keys for algorithms such as DES, AES, TripleDES. | opdts.symmetriccrypto.secretkeygenerator.SecretKeyGeneratorMeta
Transform | SelectValues | Select or remove fields in a row. Optionally, set the field meta-data: type, length and precision. | opdts.selectvalues.SelectValuesMeta
Utility | SyslogMessage | Send messages to a Syslog server | opdts.syslog.SyslogMessageMeta
Output | CubeOutput | Write rows of data to a data cube | opdts.cubeoutput.CubeOutputMeta
Transform | SetValueField | Replace the value of a field with the value of another field | opdts.setvaluefield.SetValueFieldMeta
Transform | SetValueConstant | Replace the value of a field with a constant | opdts.setvalueconstant.SetValueConstantMeta
Job | FilesToResult | This step allows you to set filenames in the result of this transformation. Subsequent job entries can then use this information. | opdts.filestoresult.FilesToResultMeta
BA Server | SetSessionVariableStep | Allows you to set the value of a session variable | org.pentaho.di.baserver.utils.SetSessionVariableMeta
Job | SetVariable | Set environment variables based on a single input row. | opdts.setvariable.SetVariableMeta
Mapping | SimpleMapping | Turn a repetitive, re-usable part of a transformation (a sequence of steps) into a mapping (sub-transformation). | opdts.simplemapping.SimpleMapping
Flow | SingleThreader | Executes a transformation snippet in a single thread. You need a standard mapping or a transformation with an Injector step where data from the parent transformation will arrive in blocks. | opdts.singlethreader.SingleThreaderMeta
Inline | SocketReader | Socket reader. A socket client that connects to a server (Socket Writer step). | opdts.socketreader.SocketReaderMeta
Inline | SocketWriter | Socket writer. A socket server that can send rows of data to a socket reader. | opdts.socketwriter.SocketWriterMeta
Transform | SortRows | Sort rows based upon field values (ascending or descending) | opdts.sort.SortRowsMeta
Joins | SortedMerge | Sorted Merge | opdts.sortedmerge.SortedMergeMeta
Transform | SplitFieldToRows3 | Splits a single string field by delimiter and creates a new row for each split term | opdts.splitfieldtorows.SplitFieldToRowsMeta
Transform | FieldSplitter | When you want to split a single field into more than one, use this step type. | opdts.fieldsplitter.FieldSplitterMeta
Transform | SplunkInput | Reads data from Splunk. | opdts.splunk.SplunkInputMeta
Transform | SplunkOutput | Writes data to Splunk. | opdts.splunk.SplunkOutputMeta
Output | SQLFileOutput | Output SQL INSERT statements to a file | opdts.sqlfileoutput.SQLFileOutputMeta
Lookup | StreamLookup | Look up values coming from another stream in the transformation. | opdts.streamlookup.StreamLookupMeta
Big Data | SSTableOutput | Writes to a filesystem directory as a Cassandra SSTable | opdts.cassandrasstableoutput.SSTableOutputMeta
Transform | StringOperations | Apply certain operations like trimming, padding, and others to string values. | opdts.stringoperations.StringOperationsMeta
Transform | StringCut | Strings cut (substring). | opdts.stringcut.StringCutMeta
Flow | SwitchCase | Switch a row to a certain target step based on the case value in a field. | opdts.switchcase.SwitchCaseMeta
Cryptography | SymmetricCryptoTrans | Encrypt or decrypt a string using symmetric encryption. Available algorithms are DES, AES, TripleDES. | opdts.symmetriccrypto.symmetriccryptotrans.SymmetricCryptoTransMeta
Output | SynchronizeAfterMerge | This step performs insert/update/delete operations in one go based on the value of a field. | opdts.synchronizeaftermerge.SynchronizeAfterMergeMeta
Utility | TableCompare | This step compares the data from two tables (provided they have the same layout). It finds differences between the data in the two tables and logs them. | opdts.tablecompare.TableCompareMeta
Lookup | TableExists | Check if a table exists on a specified connection | opdts.tableexists.TableExistsMeta
Input | TableInput | Read information from a database table. | opdts.tableinput.TableInputMeta
Output | TableOutput | Write information to a database table | opdts.tableoutput.TableOutputMeta
Bulk loading | TeraFast | The Teradata Fastload bulk loader | opdts.terafast.TeraFastMeta
Bulk loading | TeraDataBulkLoader | Bulk loading via TPT using the tbuild command. |
Input | TextFileInput | Read data from a text file in several formats. This data can then be passed on to the next step(s)... | opdts.textfileinput.TextFileInputMeta
Deprecated | TextFileOutput | Write rows to a text file. | opdts.textfileoutput.TextFileOutputMeta
Flow | TransExecutor | This step executes a Pentaho Data Integration transformation, sets parameters, and passes rows. | opdts.transexecutor.TransExecutorMeta
Transform | Unique | Remove double rows and leave only unique occurrences. This works only on a sorted input. If the input is not sorted, only double consecutive rows are handled correctly. | opdts.uniquerows.UniqueRowsMeta
Transform | UniqueRowsByHashSet | Remove double rows and leave only unique occurrences by using a HashSet. | opdts.uniquerowsbyhashset.UniqueRowsByHashSetMeta
Statistics | UnivariateStats | This step computes some simple stats based on a single input field | opdts.univariatestats.UnivariateStatsMeta
Output | Update | Update data in a database table based upon keys | opdts.update.UpdateMeta
Scripting | UserDefinedJavaClass | This step allows you to program a step using Java code | opdts.userdefinedjavaclass.UserDefinedJavaClassMeta
Scripting | Janino | Calculate the result of a Java Expression using Janino | opdts.janino.JaninoMeta
Transform | ValueMapper | Maps values of a certain field from one value to another | opdts.valuemapper.ValueMapperMeta
Bulk loading | VerticaBulkLoader | Bulk loads data into a Vertica table using their high performance COPY feature | opdts.verticabulkload.VerticaBulkLoaderMeta
Lookup | WebServiceLookup | Look up information using web services (WSDL) | opdts.webservices.WebServiceMeta
Utility | WriteToLog | Write data to log | opdts.writetolog.WriteToLogMeta
Input | XBaseInput | Reads records from an XBase type of database file (DBF) | opdts.xbaseinput.XBaseInputMeta
Input | XMLInputStream | This step is capable of processing very large and complex XML files very fast. | opdts.xmlinputstream.XMLInputStreamMeta
Joins | XMLJoin | Joins a stream of XML tags into a target XML string | opdts.xmljoin.XMLJoinMeta
Output | XMLOutput | Write data to an XML file | opdts.xmloutput.XMLOutputMeta
Validation | XSDValidator | Validate XML sources (files or streams) against an XML Schema Definition. | opdts.xsdvalidator.XsdValidatorMeta
Transform | XSLT | Transform an XML stream using XSL (eXtensible Stylesheet Language). | opdts.xslt.XsltMeta
Input | YamlInput | Read YAML sources (files or streams), parse them, convert them to rows, and write these to one or more outputs. | opdts.yamlinput.YamlInputMeta
Utility | ZipFile | Creates a standard ZIP archive from the data stream fields | opdts.zipfile.ZipFileMeta
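Most entries in the Metadata Java class column abbreviate the common package prefix: judging from known Kettle classes such as `org.pentaho.di.trans.steps.abort.AbortMeta`, `opdts.` appears to stand for `org.pentaho.di.trans.steps.`. A minimal helper to expand the shorthand when processing this table programmatically (the function name is my own, not part of PDI):

```python
# "opdts." is assumed shorthand for the common Kettle step package prefix.
# Entries that are already fully qualified (e.g. org.pentaho.di.baserver.utils.*)
# are returned unchanged.
OPDTS_PREFIX = "org.pentaho.di.trans.steps."

def expand_meta_class(meta_class: str) -> str:
    """Expand the table's 'opdts.' shorthand to a fully qualified class name."""
    if meta_class.startswith("opdts."):
        return OPDTS_PREFIX + meta_class[len("opdts."):]
    return meta_class

print(expand_meta_class("opdts.abort.AbortMeta"))
# → org.pentaho.di.trans.steps.abort.AbortMeta
```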