Package
weka.filters.unsupervised.attribute
Synopsis
A filter for detecting outliers and extreme values based on interquartile ranges. The filter skips the class attribute.
Outliers:
Q3 + OF*IQR < x <= Q3 + EVF*IQR
or
Q1 - EVF*IQR <= x < Q1 - OF*IQR
Extreme values:
x > Q3 + EVF*IQR
or
x < Q1 - EVF*IQR
Key:
Q1 = first quartile (25th percentile)
Q3 = third quartile (75th percentile)
IQR = Interquartile Range, Q3 - Q1
OF = Outlier Factor
EVF = Extreme Value Factor
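The rules above can be sketched in plain Java. This is only an illustration of the inequalities in the synopsis, not the filter's actual source: the class `IqrThresholdSketch`, the `Label` enum, and the `classify` method are hypothetical names, the quartiles are passed in rather than computed, and the factors 3.0 and 6.0 are example values chosen for the demonstration. The sketch assumes OF <= EVF.

```java
final class IqrThresholdSketch {

    enum Label { NORMAL, OUTLIER, EXTREME_VALUE }

    /**
     * Classifies x with the synopsis rules. q1 and q3 are the quartiles,
     * of is the outlier factor, evf is the extreme value factor.
     * Assumes of <= evf (hypothetical helper, not part of Weka).
     */
    static Label classify(double x, double q1, double q3, double of, double evf) {
        double iqr = q3 - q1;
        // Extreme values: x > Q3 + EVF*IQR  or  x < Q1 - EVF*IQR
        if (x > q3 + evf * iqr || x < q1 - evf * iqr) {
            return Label.EXTREME_VALUE;
        }
        // Outliers: Q3 + OF*IQR < x <= Q3 + EVF*IQR  or  Q1 - EVF*IQR <= x < Q1 - OF*IQR
        // (the EVF bounds are already guaranteed by the check above)
        if (x > q3 + of * iqr || x < q1 - of * iqr) {
            return Label.OUTLIER;
        }
        return Label.NORMAL;
    }

    public static void main(String[] args) {
        double q1 = 10.0, q3 = 20.0;   // example quartiles, so IQR = 10
        double of = 3.0, evf = 6.0;    // example factors
        // Upper bounds: outlier above 50, extreme above 80.
        // Lower bounds: outlier below -20, extreme below -50.
        System.out.println(classify(15.0, q1, q3, of, evf));   // NORMAL
        System.out.println(classify(50.0, q1, q3, of, evf));   // NORMAL (bound is strict)
        System.out.println(classify(60.0, q1, q3, of, evf));   // OUTLIER
        System.out.println(classify(80.0, q1, q3, of, evf));   // OUTLIER (bound is inclusive)
        System.out.println(classify(-45.0, q1, q3, of, evf));  // OUTLIER
        System.out.println(classify(-60.0, q1, q3, of, evf));  // EXTREME_VALUE
    }
}
```

Note that a value sitting exactly on an outlier threshold is not flagged, while a value exactly on an extreme value threshold still counts as an outlier rather than an extreme value.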
Options
The table below describes the options available for InterquartileRange.
Option | Description |
---|---|
attributeIndices | Specify the range of attributes to act on. This is a comma-separated list of attribute indices, where "first" and "last" are valid values; specify an inclusive range with "-", e.g. "first-3,5,6-10,last". |
debug | Turns on output of debugging information. |
detectionPerAttribute | Generates an Outlier/ExtremeValue attribute pair for each numeric attribute, rather than a single pair for all numeric attributes together. |
extremeValuesAsOutliers | Whether to tag extreme values also as outliers. |
extremeValuesFactor | The factor for determining the thresholds for extreme values. |
outlierFactor | The factor for determining the thresholds for outliers. |
outputOffsetMultiplier | Generates an additional attribute 'Offset' that contains the multiplier by which the value deviates from the median, measured in units of IQR: value = median + 'multiplier' * IQR |
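The relation given for outputOffsetMultiplier, value = median + 'multiplier' * IQR, can be solved for the multiplier. The following is a minimal sketch of that arithmetic only; `OffsetMultiplierSketch` and `offsetMultiplier` are hypothetical names, and rejecting an IQR of zero is an assumption of this sketch, not documented behaviour of the filter.

```java
final class OffsetMultiplierSketch {

    /**
     * Solves value = median + multiplier * IQR for the multiplier.
     * Hypothetical helper; the zero-IQR handling is an assumption of this sketch.
     */
    static double offsetMultiplier(double value, double median, double iqr) {
        if (iqr == 0.0) {
            throw new IllegalArgumentException("IQR must be non-zero");
        }
        return (value - median) / iqr;
    }

    public static void main(String[] args) {
        double median = 15.0, iqr = 10.0;  // example statistics for one attribute
        System.out.println(offsetMultiplier(45.0, median, iqr));  // 3.0
        System.out.println(offsetMultiplier(10.0, median, iqr));  // -0.5
    }
}
```

A positive multiplier means the value lies above the median and a negative one means it lies below.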
Capabilities
The table below describes the capabilities of InterquartileRange.
Capability | Supported |
---|---|
Class | Unary class, Relational class, Date class, Missing class values, Numeric class, No class, String class, Empty nominal class, Binary class, Nominal class |
Attributes | Missing values, Nominal attributes, String attributes, Empty nominal attributes, Relational attributes, Binary attributes, Date attributes, Unary attributes, Numeric attributes |
Min # of instances | 0 |