Package Manager

Weka 3.7.2 moves away from a single monolithic executable jar file to a modular, package-based system. Although Weka's single jar file was only ~6 MB in size, it had become bloated in terms of the number of algorithms and options available. The plan is to have a stripped-down "core" jar file that contains all the infrastructure plus a handful of the best-known algorithms from each of the main learning categories. All other algorithms are available to the user as downloads via a package management system.

The main benefits of this approach are twofold. From the user's perspective, Weka is less overwhelming (in terms of what is available initially) and easier to get started with. From the Weka maintainers' perspective, maintenance becomes less of a burden because it is made explicit which packages are external contributions and which come from the Weka team. Community members seeking help with an algorithm can either ask on the Weka forums (Pentaho or the Weka mailing list) or contact the author of the package in question.

Packages in 3.7.2 are hosted by either the Weka team (for internal code) or the author (for contributed code). The Weka team maintains a repository of metadata on all the available packages (not unlike the CRAN system used for the R statistical software). Both command-line and graphical package management clients are available. The package management system subsumes the existing plugin mechanisms in Weka (visualization plugins in the Explorer and the Knowledge Flow's plugin system).

To alleviate library duplication, packages are able to depend on other packages (as well as on a given version of the core system). The package management software takes care of resolving dependencies and detecting conflicts. This approach makes it possible for contributors to Weka to easily make use of external libraries. In the past we have avoided the use of external libraries because of the added complication they introduce to the maintenance, installation and use of Weka. Under the new system, it is the responsibility of the contributor to make sure that their package(s) stay compatible with changes to external libraries (if used).
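To make the idea of dependency resolution concrete, here is a minimal sketch in Java. It is not Weka's actual implementation: the class name PackageResolverSketch, the method installOrder and the package names are all hypothetical, and checks against package versions or the core version are left out. It only shows how a package manager can derive an install order in which every package comes after the packages it depends on, and how it can detect two simple kinds of problem (an unknown dependency and a dependency cycle).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names are hypothetical and this is not Weka's own code.
public class PackageResolverSketch {

    // Returns the packages needed for 'target', ordered so that each package
    // appears after all of its dependencies. 'deps' maps a package name to the
    // names of the packages it depends on.
    static List<String> installOrder(Map<String, List<String>> deps, String target) {
        List<String> order = new ArrayList<>();
        visit(target, deps, new HashSet<>(), new HashSet<>(), order);
        return order;
    }

    // Depth-first traversal: dependencies are added to 'order' before the
    // package that needs them.
    private static void visit(String pkg, Map<String, List<String>> deps,
                              Set<String> inProgress, Set<String> done,
                              List<String> order) {
        if (done.contains(pkg)) {
            return; // already scheduled for installation
        }
        if (!deps.containsKey(pkg)) {
            throw new IllegalArgumentException("Unknown package: " + pkg);
        }
        if (!inProgress.add(pkg)) {
            // We came back to a package whose dependencies are still being resolved.
            throw new IllegalStateException("Cyclic dependency involving: " + pkg);
        }
        for (String dependency : deps.get(pkg)) {
            visit(dependency, deps, inProgress, done, order);
        }
        inProgress.remove(pkg);
        done.add(pkg);
        order.add(pkg);
    }

    public static void main(String[] args) {
        // Hypothetical metadata: two packages sharing one library package.
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("sharedLib", Collections.<String>emptyList());
        deps.put("classifierA", Arrays.asList("sharedLib"));
        deps.put("classifierB", Arrays.asList("sharedLib", "classifierA"));

        // Prints [sharedLib, classifierA, classifierB]
        System.out.println(installOrder(deps, "classifierB"));
    }
}
```

Because the shared library is a package in its own right, it is installed once and reused by both dependents, which is the duplication the dependency mechanism is meant to avoid.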

More information on using the package management system, the structure of packages, and how to contribute a package is available from the Weka Wiki on Wikispaces:

http://weka.wikispaces.com/How+do+I+use+the+package+manager%3F
