Pentaho MapReduce Vizor

NOTE: VIZOR IS AN EXPERIMENTAL TOOL. IT IS NOT RECOMMENDED FOR PRODUCTION USE.

Vizor monitors running Pentaho MapReduce (PMR) jobs that are distributed across Hadoop or Carte clusters. It helps you debug PMR jobs by displaying errors, metrics, Hadoop output, Carte information, and other job execution details in a Spoon perspective. Because Vizor is experimental, it is not bundled with PDI. You must install it before you can use it.

Installation

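To install Vizor, download it from the Pentaho Support Portal, unzip it, copy the jar into the Spoon plugins directory, and then configure Carte and the big data plugin as described in the steps below.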

  1. Download the pentaho-pmr-vizor-plugin-package<version>.dist.zip file from the Pentaho Support Portal.
  2. Unzip the file, then follow the onscreen instructions to extract the Vizor jar from the package.
  3. Move the Vizor jar file to this directory: pentaho/design-tools/data-integration/plugins/Vizor. If the Vizor directory doesn't exist, create it.
  4. Set the VIZOR_SPOON_CARTE_HOSTNAME Kettle variable to the local IP address for Carte. See Set Kettle Variables to learn how to do this.
  5. Open the plugin.properties file in the pentaho/design-tools/data-integration/plugins/pentaho-big-data-plugin folder and add this line: pmr.kettle.additional.plugins=Vizor
  6. If Spoon is running, stop and restart it.
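
Steps 3 through 5 can also be scripted. The following is a minimal Python sketch, not an official installer. The function names (set_property, install_vizor) and all arguments are hypothetical. It assumes the Vizor jar has already been extracted, that pdi_home points at your pentaho/design-tools/data-integration directory, and that you pass the path of your own kettle.properties file (commonly the .kettle directory in your home directory, but verify this for your installation).

```python
from pathlib import Path
import shutil


def set_property(props_file: Path, key: str, value: str) -> None:
    """Set key=value in a .properties file, replacing an existing entry or appending a new one."""
    lines = props_file.read_text().splitlines() if props_file.exists() else []
    entry = f"{key}={value}"
    for i, line in enumerate(lines):
        # Only uncommented lines whose key matches exactly are replaced.
        if "=" in line and line.split("=", 1)[0].strip() == key:
            lines[i] = entry
            break
    else:
        lines.append(entry)
    props_file.parent.mkdir(parents=True, exist_ok=True)
    props_file.write_text("\n".join(lines) + "\n")


def install_vizor(jar: Path, pdi_home: Path, kettle_props: Path, carte_ip: str) -> None:
    # Step 3: create plugins/Vizor if it does not exist and copy the jar into it.
    vizor_dir = pdi_home / "plugins" / "Vizor"
    vizor_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(jar, vizor_dir / jar.name)

    # Step 4: point Vizor at the local Carte IP address.
    set_property(kettle_props, "VIZOR_SPOON_CARTE_HOSTNAME", carte_ip)

    # Step 5: register Vizor with the big data plugin.
    # Note: this overwrites any value already set for the key; merge by hand
    # if your plugin.properties already lists other additional plugins.
    big_data_props = pdi_home / "plugins" / "pentaho-big-data-plugin" / "plugin.properties"
    set_property(big_data_props, "pmr.kettle.additional.plugins", "Vizor")
```

Restart Spoon after running a script like this, as in step 6.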

Using Vizor

Once Vizor is installed, it is available any time you run a job. To use Vizor, complete these steps.

  1. In Spoon, open a job that has a Pentaho MapReduce entry in it.  
  2. Run the Pentaho MapReduce job.
  3. A message appears asking whether you want to use Vizor. Click Yes.
  • As the job runs, notes about execution progress and step statuses appear in the logging tab near the bottom of the window.
  • The Vizor perspective, which you open by clicking the button in the upper right corner of the Spoon main window, shows more detail about the job. It lists each transformation in the job along with its statistics.
  • The statistics shown in the Vizor perspective are the common PDI statistics, such as the number of rows read and written, input and output, the number of errors, and the execution status. Column types mirror those in other PDI steps.
  • Two context-dependent tabs at the bottom of the screen show logging messages for the selected node and the step's output data, respectively.

NOTE: You might need to clear the big data plugin storage in Hadoop before Vizor works. To do this, use OpenURL to connect to your Hadoop server, then remove the contents of /opt/pentaho/mapreduce.
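
As an alternative to a graphical client, the same cleanup can be sketched from the command line. The Python snippet below is only an illustration and rests on assumptions not stated above: the hadoop command-line client is on your PATH, it is configured for the cluster in question, and your Hadoop version supports `hadoop fs -rm -r` (older releases use `-rmr` instead). The function names are hypothetical.

```python
import subprocess

# Directory named in the note above.
PMR_CACHE_DIR = "/opt/pentaho/mapreduce"


def build_clear_command(cache_dir=PMR_CACHE_DIR):
    """Return the hadoop CLI command that removes the contents of cache_dir."""
    # The glob is passed through unexpanded; the Hadoop shell resolves it on HDFS.
    return ["hadoop", "fs", "-rm", "-r", f"{cache_dir.rstrip('/')}/*"]


def clear_pmr_cache(cache_dir=PMR_CACHE_DIR):
    """Run the cleanup command and raise CalledProcessError if it fails."""
    subprocess.run(build_clear_command(cache_dir), check=True)
```

Printing the result of build_clear_command() before calling clear_pmr_cache() lets you confirm the target path, since the removal is recursive.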

Demo Video

You can download the video here: https://pentaho.box.com/PMRVizorUpdate1.  
