Kettle Telemetry and Usage Statistics

Introduction

There are many reasons to collect usage statistics, for example:

It helps improve the product in its most-used areas and features (steps, job entries, database types, etc.).
It helps users determine whether the features they rely on are affected by a planned upgrade (the upgrade notes for each release list the affected steps, job entries, etc.).
Combined with usage statistics from development, test, and production environments, it can also reveal jobs and transformations that are never used.

Solutions

Analyze the used steps, job entries and database types

Download the solution _analyze_trans_job
In PDI/Kettle, open the job _analyze_trans_job/transformations_jobs/0_analyze_trans_job.kjb
Read the comment within the job; it contains all the usage information. For example, file names, transformation names, and step names can be anonymized: see the anonymize_names option in the parameters.txt file.
Sending the results (../data/analyze_result_*.zip) to the given e-mail address is anonymous; you can opt in to include your personal data if you like.
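
To illustrate what anonymizing names can look like in general, here is a minimal Python sketch that replaces a name with a short, stable token. It is not how the anonymize_names option is implemented in the solution; the function anonymize_name, the salt parameter, and the "anon_" prefix are hypothetical names chosen for this example.

```python
import hashlib


def anonymize_name(name, salt=""):
    """Return a stable token that hides the original name.

    Hypothetical helper: the same name (with the same salt) always maps to
    the same token, so usage counts per name remain comparable.
    """
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    # Keep only a short prefix of the hash to make reports readable.
    return "anon_" + digest[:12]


# Example usage with made-up names:
# anonymize_name("load_customers.ktr")
# anonymize_name("Table input", salt="my-secret")
```

Using a private salt makes it harder to guess the original name by hashing likely candidates.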

An option for uploading your file (instead of sending it by e-mail) is described in this blog post.

If you want to contribute to this solution, the jobs/transformations are hosted on GitHub.

Note: This solution is currently limited to the file system and does not yet support a repository or a repository export file (todo).
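
The idea behind this kind of analysis can be sketched in a few lines of Python. This is not the solution's own implementation (which is a set of Kettle jobs and transformations); it is a rough illustration that assumes transformations (.ktr) and jobs (.kjb) are stored on the file system as XML, with step types in step/type, job entry types in entries/entry/type, and database types in connection/type directly below the root element. The function name count_usage and the result keys are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def _types(elements):
    """Yield the non-empty <type> text of each given element."""
    for element in elements:
        type_name = (element.findtext("type") or "").strip()
        if type_name:
            yield type_name


def count_usage(root_dir):
    """Count step, job entry and database types below root_dir.

    Assumption: .ktr/.kjb files are XML with <step>, <entries><entry>
    and <connection> elements that each carry a <type> child.
    """
    usage = {
        "steps": Counter(),
        "job_entries": Counter(),
        "databases": Counter(),
    }
    for path in sorted(Path(root_dir).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in (".ktr", ".kjb"):
            continue
        try:
            root = ET.parse(str(path)).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        usage["steps"].update(_types(root.findall("step")))
        usage["job_entries"].update(_types(root.findall("entries/entry")))
        usage["databases"].update(_types(root.findall("connection")))
    return usage


# Example usage (the directory name is a placeholder):
# for kind, counter in count_usage("my_kettle_files").items():
#     print(kind, counter.most_common())
```

Like the solution itself, this sketch only covers files on the file system, not a repository.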

Pentaho Operations Mart

Within the PDI Enterprise Edition, the Pentaho Operations Mart collects a wide range of information, including usage statistics. These can be combined to see which jobs/transformations are used, how often, by which user, etc.