ThresholdSelector

Package

weka.classifiers.meta

Synopsis

A metaclassifier that selects a midpoint threshold on the probability output by a Classifier. The midpoint threshold is set so that a given performance measure is optimized. Currently this is the F-measure. Performance is measured either on the training data, on a hold-out set, or using cross-validation. In addition, the probabilities returned by the base learner can have their range expanded so that the output probabilities span the interval from 0 to 1 (this is useful if the scheme normally produces probabilities in a very narrow range).
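The threshold search described above can be illustrated with a small, self-contained sketch. This is not Weka's implementation: the class name `FMeasureThreshold` and its methods are hypothetical, and the choice of candidates (midpoints between adjacent distinct predicted probabilities, with 0.5 as the starting guess) is an assumption made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch, not Weka code: choose the threshold that maximizes
// the F-measure of the designated (positive) class.
public class FMeasureThreshold {

    /** F-measure when an instance is predicted positive iff prob >= threshold. */
    public static double fMeasure(double[] probs, boolean[] positive, double threshold) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < probs.length; i++) {
            boolean predicted = probs[i] >= threshold;
            if (predicted && positive[i]) {
                tp++;
            } else if (predicted) {
                fp++;
            } else if (positive[i]) {
                fn++;
            }
        }
        // F = 2TP / (2TP + FP + FN); defined as 0 when there are no true positives.
        return tp == 0 ? 0.0 : 2.0 * tp / (2.0 * tp + fp + fn);
    }

    /** Tries the midpoint between each pair of adjacent distinct probabilities. */
    public static double selectThreshold(double[] probs, boolean[] positive) {
        double[] sorted = probs.clone();
        Arrays.sort(sorted);
        double best = 0.5; // assumed default if no candidate does better
        double bestF = fMeasure(probs, positive, best);
        for (int i = 0; i + 1 < sorted.length; i++) {
            if (sorted[i] == sorted[i + 1]) {
                continue; // no gap between equal values, so no midpoint to test
            }
            double candidate = (sorted[i] + sorted[i + 1]) / 2.0;
            double f = fMeasure(probs, positive, candidate);
            if (f > bestF) {
                bestF = f;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up probabilities in a narrow range, all below 0.5.
        double[] probs = {0.10, 0.20, 0.30, 0.35, 0.40, 0.45};
        boolean[] positive = {false, false, true, false, true, true};
        double t = selectThreshold(probs, positive);
        System.out.println("threshold = " + t + ", F = " + fMeasure(probs, positive, t));
    }
}
```

In this made-up data every probability is below 0.5, so the default cut-off would never predict the positive class; a lower threshold recovers the positives. Computing the scores on the same data used to pick the threshold corresponds to optimizing on the training set, which the option descriptions note may result in overfitting.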

Options

The table below describes the options available for ThresholdSelector.

Option

Description

classifier

The base classifier to be used.

debug

If set to true, classifier may output additional info to the console.

designatedClass

Sets the class value for which the optimization is performed. The options are: pick the first class value; pick the second class value; pick whichever class value is least frequent; pick whichever class value is most frequent; pick the first class named any of "yes", "pos(itive)", or "1" (or the least frequent class if there are no matches).

evaluationMode

Sets the method used to determine the threshold/performance curve. The options are: perform optimization based on the entire training set (may result in overfitting); perform an n-fold cross-validation (may be time consuming); perform one fold of an n-fold cross-validation (faster but likely less accurate).

manualThresholdValue

Sets a manual threshold value to use. If this is set (non-negative value between 0 and 1), then all options pertaining to automatic threshold selection are ignored.

measure

Sets the measure for determining the threshold.

numXValFolds

Sets the number of folds used during full cross-validation and tuned fold evaluation. This number will be automatically reduced if there are insufficient positive examples.

rangeCorrection

Sets the type of prediction range correction performed. The options are: do not do any range correction; expand predicted probabilities so that the minimum probability observed during the optimization maps to 0, and the maximum maps to 1 (values outside this range are clipped to 0 and 1).

seed

The random number seed to be used.
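The rangeCorrection and manualThresholdValue descriptions above can be sketched in plain Java. This is an illustration under stated assumptions, not Weka's code: the class `RangeCorrectionSketch`, its method names, the `>=` comparison, and the handling of a degenerate range are all hypothetical.

```java
// Hypothetical sketch, not Weka code: min-max range expansion with clipping,
// followed by a fixed threshold test.
public class RangeCorrectionSketch {

    /**
     * Maps low to 0 and high to 1, where low and high stand for the minimum
     * and maximum probability observed during optimization. Values outside
     * [low, high] are clipped to 0 and 1.
     */
    public static double expand(double p, double low, double high) {
        if (high <= low) {
            return p; // assumed behavior: nothing to expand for an empty range
        }
        double scaled = (p - low) / (high - low);
        return Math.max(0.0, Math.min(1.0, scaled));
    }

    /** Predicts the designated class when the probability reaches the threshold. */
    public static boolean predictsDesignated(double p, double threshold) {
        return p >= threshold;
    }

    public static void main(String[] args) {
        double low = 0.40;  // made-up observed minimum
        double high = 0.60; // made-up observed maximum
        double manualThreshold = 0.5; // stands in for manualThresholdValue
        for (double p : new double[] {0.30, 0.45, 0.55, 0.90}) {
            double corrected = expand(p, low, high);
            System.out.println(p + " -> " + corrected
                + ", designated: " + predictsDesignated(corrected, manualThreshold));
        }
    }
}
```

With a base scheme whose outputs stay between 0.40 and 0.60, the expansion spreads them over the whole unit interval, which makes the corrected probabilities easier to compare against a threshold.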

Capabilities

The table below describes the capabilities of ThresholdSelector.

Capability

Supported

Class

Missing class values, Binary class

Attributes

Binary attributes, Date attributes, Nominal attributes, Numeric attributes, Empty nominal attributes, Unary attributes, Missing values

Min # of instances

1