GridSearch

Package

weka.classifiers.meta

Synopsis

Performs a grid search over parameter pairs for a classifier (Y-axis; the default is LinearRegression with the "Ridge" parameter) and the PLSFilter (X-axis, "# of Components"), and chooses the best pair found for the actual prediction.

The initial grid is evaluated with 2-fold CV to score each parameter pair under the selected type of evaluation (e.g., accuracy). The best point in the grid is then taken and a 10-fold CV is performed with the adjacent parameter pairs. If a better pair is found, it becomes the new center and another 10-fold CV is performed (a form of hill climbing). This process is repeated until no better pair is found or the best pair is on the border of the grid.
If the best pair is on the border, GridSearch can automatically extend the grid and continue the search. See the properties 'gridIsExtendable' (option '-extend-grid') and 'maxGridExtensions' (option '-max-grid-extensions <num>').
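
The two-phase procedure described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Weka's implementation: the class GridHillClimb, its methods, and the two scoring functions are hypothetical, with cheapScore standing in for the 2-fold CV pass and carefulScore for the 10-fold CV pass. Grid extension is not modelled; the sketch simply stops when the center reaches the border.

```java
import java.util.function.DoubleBinaryOperator;

/** Toy sketch of the two-phase search; not Weka's actual code. */
class GridHillClimb {

    /** Phase 1: score every grid point with the cheap evaluation; return the best {x, y}. */
    static int[] bestOnGrid(int width, int height, DoubleBinaryOperator cheapScore) {
        int[] best = {0, 0};
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                double s = cheapScore.applyAsDouble(x, y);
                if (s > bestScore) {
                    bestScore = s;
                    best = new int[] {x, y};
                }
            }
        }
        return best;
    }

    /** True if the point lies on the outer edge of the grid. */
    static boolean onBorder(int[] p, int width, int height) {
        return p[0] == 0 || p[1] == 0 || p[0] == width - 1 || p[1] == height - 1;
    }

    /** Phase 2: move to the best adjacent point until nothing improves or the border is hit. */
    static int[] climb(int[] start, int width, int height, DoubleBinaryOperator carefulScore) {
        int[] center = start;
        while (true) {
            int[] next = null;
            double nextScore = carefulScore.applyAsDouble(center[0], center[1]);
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int x = center[0] + dx;
                    int y = center[1] + dy;
                    if ((dx == 0 && dy == 0) || x < 0 || y < 0 || x >= width || y >= height) {
                        continue; // skip the center itself and points outside the grid
                    }
                    double s = carefulScore.applyAsDouble(x, y);
                    if (s > nextScore) {
                        nextScore = s;
                        next = new int[] {x, y};
                    }
                }
            }
            if (next == null) {
                return center; // no adjacent pair is better
            }
            center = next;
            if (onBorder(center, width, height)) {
                return center; // a real search could extend the grid here
            }
        }
    }

    public static void main(String[] args) {
        // Made-up score surfaces: the cheap pass peaks at (2,2), the careful pass at (3,2).
        DoubleBinaryOperator cheap = (x, y) -> -((x - 2) * (x - 2) + (y - 2) * (y - 2));
        DoubleBinaryOperator careful = (x, y) -> -((x - 3) * (x - 3) + (y - 2) * (y - 2));
        int[] start = bestOnGrid(7, 5, cheap);
        int[] best = climb(start, 7, 5, careful);
        System.out.println("start=(" + start[0] + "," + start[1] + ") best=(" + best[0] + "," + best[1] + ")");
    }
}
```

The cheap pass only picks a starting point; the more expensive evaluation is spent on the handful of neighbours around the current center rather than on the whole grid.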

GridSearch can handle doubles, integers (values are just cast to int) and booleans (0 is false, otherwise true). float, char and long are supported as well.
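
As a rough illustration of these conversions, the sketch below maps a numeric grid value onto a property type. The class GridValueCast and its method are hypothetical and not part of Weka; the int and boolean rules follow the description above, while the plain casts used for float, long and char are an assumption.

```java
/** Hypothetical helper: converts a numeric grid value to the type of the target property. */
class GridValueCast {

    static Object toPropertyType(double value, Class<?> type) {
        if (type == double.class) return value;
        if (type == float.class) return (float) value;    // assumed: plain cast
        if (type == int.class) return (int) value;        // cast to int, truncating toward zero
        if (type == long.class) return (long) value;      // assumed: plain cast
        if (type == char.class) return (char) value;      // assumed: plain cast
        if (type == boolean.class) return value != 0;     // 0 is false, otherwise true
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        System.out.println(toPropertyType(3.7, int.class));
        System.out.println(toPropertyType(0.0, boolean.class));
        System.out.println(toPropertyType(2.0, boolean.class));
    }
}
```

Because integer values are cast rather than rounded, a fractional step size on an integer property can produce repeated grid values.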

The best filter/classifier setup can be accessed after the buildClassifier call via the getBestFilter/getBestClassifier methods.
Note on the implementation: after the data has been passed through the filter, a default NumericCleaner filter is applied to the data in order to avoid numbers that become too small and might produce NaNs in other schemes.

Available in Weka 3.6.x - 3.7.1. Available via the package management system for Weka >= 3.7.2 (gridSearch).

Options

The table below describes the options available for GridSearch.

Option               Description
XBase                The base of X.
XExpression          The expression for the X value (parameters: BASE, FROM, TO, STEP, I).
XMax                 The maximum of X.
XMin                 The minimum of X.
XProperty            The X property to test (normally the filter).
XStep                The step size of X.
YBase                The base of Y.
YExpression          The expression for the Y value (parameters: BASE, FROM, TO, STEP, I).
YMax                 The maximum of Y.
YMin                 The minimum of Y.
YProperty            The Y property to test (normally the classifier).
YStep                The step size of Y.
classifier           The base classifier to be used.
debug                If set to true, classifier may output additional info to the console.
evaluation           Sets the criterion for evaluating the classifier performance and choosing the best one.
filter               The filter to be used (only used for setup).
gridIsExtendable     Whether the grid can be extended.
logFile              The log file to log the messages to.
maxGridExtensions    The maximum number of grid extensions, -1 for unlimited.
sampleSizePercent    The sample size (in percent) to use in the initial grid search.
seed                 The random number seed to be used.
traversal            Sets type of traversal of the grid, either by rows or columns.
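
To make the interplay of the Min, Max, Step, Base and Expression options concrete, the following sketch generates the values for one axis. It is not Weka's code: the class GridAxis and the AxisExpression interface are hypothetical stand-ins for a parsed expression string, and it assumes that FROM and TO correspond to the minimum and maximum and that I runs from the minimum to the maximum in increments of the step size.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of how one grid axis could be expanded into concrete values. */
class GridAxis {

    /** Stand-in for a parsed expression over BASE, FROM, TO, STEP and I. */
    interface AxisExpression {
        double eval(double base, double from, double to, double step, double i);
    }

    static List<Double> values(double min, double max, double step, double base, AxisExpression expr) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<Double> out = new ArrayList<>();
        // Use an integer counter so repeated additions do not accumulate rounding error.
        int steps = (int) Math.floor((max - min) / step + 1e-9);
        for (int k = 0; k <= steps; k++) {
            double i = min + k * step;
            out.add(expr.eval(base, min, max, step, i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Exponential axis, comparable to an expression such as pow(BASE,I).
        System.out.println(values(-2, 2, 1, 10, (b, f, t, s, i) -> Math.pow(b, i)));
        // Linear axis, comparable to the expression I.
        System.out.println(values(5, 20, 1, 10, (b, f, t, s, i) -> i));
    }
}
```

An exponential expression suits parameters such as a ridge value that vary over orders of magnitude, while a linear one suits counts such as a number of components.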

Capabilities

The table below describes the capabilities of GridSearch.

Capability            Supported
Class                 Date class, Numeric class
Attributes            Missing values, Numeric attributes, Date attributes
Min # of instances    1