MulticlassMetrics

class pyspark.mllib.evaluation.MulticlassMetrics(predictionAndLabels: pyspark.rdd.RDD[Tuple[float, float]])

Evaluator for multiclass classification.

Parameters
predictionAndLabels : pyspark.RDD

An RDD of (prediction, label) tuples, each optionally followed by a weight and then by an array of class probabilities.

Examples

>>> from pyspark.mllib.evaluation import MulticlassMetrics
>>> predictionAndLabels = sc.parallelize([(0.0, 0.0), (0.0, 1.0), (0.0, 0.0),
...     (1.0, 0.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)])
>>> metrics = MulticlassMetrics(predictionAndLabels)
>>> metrics.confusionMatrix().toArray()
array([[ 2.,  1.,  1.],
       [ 1.,  3.,  0.],
       [ 0.,  0.,  1.]])
>>> metrics.falsePositiveRate(0.0)
0.2...
>>> metrics.precision(1.0)
0.75...
>>> metrics.recall(2.0)
1.0...
>>> metrics.fMeasure(0.0, 2.0)
0.52...
>>> metrics.accuracy
0.66...
>>> metrics.weightedFalsePositiveRate
0.19...
>>> metrics.weightedPrecision
0.68...
>>> metrics.weightedRecall
0.66...
>>> metrics.weightedFMeasure()
0.66...
>>> metrics.weightedFMeasure(2.0)
0.65...
>>> predAndLabelsWithOptWeight = sc.parallelize([(0.0, 0.0, 1.0), (0.0, 1.0, 1.0),
...      (0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
...      (2.0, 2.0, 1.0), (2.0, 0.0, 1.0)])
>>> metrics = MulticlassMetrics(predAndLabelsWithOptWeight)
>>> metrics.confusionMatrix().toArray()
array([[ 2.,  1.,  1.],
       [ 1.,  3.,  0.],
       [ 0.,  0.,  1.]])
>>> metrics.falsePositiveRate(0.0)
0.2...
>>> metrics.precision(1.0)
0.75...
>>> metrics.recall(2.0)
1.0...
>>> metrics.fMeasure(0.0, 2.0)
0.52...
>>> metrics.accuracy
0.66...
>>> metrics.weightedFalsePositiveRate
0.19...
>>> metrics.weightedPrecision
0.68...
>>> metrics.weightedRecall
0.66...
>>> metrics.weightedFMeasure()
0.66...
>>> metrics.weightedFMeasure(2.0)
0.65...
>>> predictionAndLabelsWithProbabilities = sc.parallelize([
...      (1.0, 1.0, 1.0, [0.1, 0.8, 0.1]), (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
...      (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]), (1.0, 1.0, 1.0, [0.3, 0.65, 0.05])])
>>> metrics = MulticlassMetrics(predictionAndLabelsWithProbabilities)
>>> metrics.logLoss()
0.9682...

Methods

call(name, *a)

Calls a method of java_model.

confusionMatrix()

Returns the confusion matrix. Predicted classes are in columns, ordered by ascending class label, as in “labels”.

fMeasure(label[, beta])

Returns f-measure.

falsePositiveRate(label)

Returns false positive rate for a given label (category).

logLoss([eps])

Returns weighted logLoss.

precision(label)

Returns precision.

recall(label)

Returns recall.

truePositiveRate(label)

Returns true positive rate for a given label (category).

weightedFMeasure([beta])

Returns weighted averaged f-measure.

Attributes

accuracy

Returns accuracy (the number of correctly classified instances divided by the total number of instances).

weightedFalsePositiveRate

Returns weighted false positive rate.

weightedPrecision

Returns weighted averaged precision.

weightedRecall

Returns weighted averaged recall.

weightedTruePositiveRate

Returns weighted true positive rate.

Methods Documentation

call(name: str, *a: Any) → Any

Calls a method of java_model.

confusionMatrix() → pyspark.mllib.linalg.Matrix

Returns the confusion matrix. Predicted classes are in columns, ordered by ascending class label, as in “labels”.
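The layout can be checked without a Spark session. The following is a minimal pure-Python sketch, not the library's implementation; the names pairs, labels, counts and matrix are illustrative, and the class list is assumed to be the distinct values that occur as labels. It rebuilds the matrix from the first example, with actual labels in rows and predicted labels in columns.

```python
from collections import Counter

# Same (prediction, label) pairs as in the first example above.
pairs = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0), (1.0, 1.0),
         (1.0, 1.0), (1.0, 1.0), (2.0, 2.0), (2.0, 0.0)]

# Assumption: classes are the distinct label values, in ascending order.
labels = sorted({label for _, label in pairs})

# Count how often each (actual, predicted) combination occurs.
counts = Counter((label, prediction) for prediction, label in pairs)

# Row index = actual label, column index = predicted label.
matrix = [[float(counts[(actual, predicted)]) for predicted in labels]
          for actual in labels]

for row in matrix:
    print(row)
```

Each row sums to the number of instances that truly belong to that class, and each column sums to the number of times that class was predicted.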

fMeasure(label: float, beta: Optional[float] = None) → float

Returns f-measure.
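As a worked illustration, the standard F-beta formula reproduces the 0.52... shown for fMeasure(0.0, 2.0) in the examples. This is a sketch under the assumption that the method uses the usual definition with beta defaulting to 1; the helper f_measure is hypothetical and not part of the API.

```python
def f_measure(precision, recall, beta=1.0):
    """Standard F-beta score; returns 0.0 when both inputs are zero."""
    beta_sq = beta * beta
    denom = beta_sq * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta_sq) * precision * recall / denom

# Label 0.0 in the first example: predicted 3 times (2 correct),
# and 4 instances truly carry that label.
precision_0 = 2 / 3
recall_0 = 2 / 4

f2_label_0 = f_measure(precision_0, recall_0, beta=2.0)
print(f2_label_0)
```

A beta greater than 1 weights recall more heavily than precision; a beta below 1 does the opposite.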

falsePositiveRate(label: float) → float

Returns false positive rate for a given label (category).
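The false positive rate for a label is the share of instances that do not carry the label but were predicted as it. A minimal sketch over the confusion matrix from the examples follows; false_positive_rate is a hypothetical helper, and the zero-denominator guard is an assumption rather than documented behaviour.

```python
# Confusion matrix from the examples: rows = actual, columns = predicted.
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]

def false_positive_rate(matrix, i):
    """FP / (FP + TN) for the class at index i."""
    total = sum(sum(row) for row in matrix)
    actual_i = sum(matrix[i])                    # instances truly in class i
    predicted_i = sum(row[i] for row in matrix)  # instances predicted as class i
    false_positives = predicted_i - matrix[i][i]
    negatives = total - actual_i                 # instances not in class i
    return false_positives / negatives if negatives else 0.0

fpr_0 = false_positive_rate(matrix, 0)
print(fpr_0)
```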

logLoss(eps: float = 1e-15) → float

Returns weighted logLoss.
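The value 0.9682... in the examples can be reproduced by hand. The sketch below assumes the loss is the weighted mean of the negative log of the probability assigned to the true label, with that probability clamped to [eps, 1 - eps]; the clamping detail and the helper name log_loss are assumptions, not statements about the library's internals.

```python
import math

def log_loss(rows, eps=1e-15):
    """Weighted mean of -log(p_true), with p_true clamped to [eps, 1 - eps]."""
    loss_sum = 0.0
    weight_sum = 0.0
    for _prediction, label, weight, probabilities in rows:
        p_true = probabilities[int(label)]       # probability of the true class
        p_true = min(max(p_true, eps), 1 - eps)  # avoid log(0)
        loss_sum += -math.log(p_true) * weight
        weight_sum += weight
    return loss_sum / weight_sum

# (prediction, label, weight, probabilities), as in the last example above.
rows = [(1.0, 1.0, 1.0, [0.1, 0.8, 0.1]), (0.0, 2.0, 1.0, [0.9, 0.05, 0.05]),
        (0.0, 0.0, 1.0, [0.8, 0.2, 0.0]), (1.0, 1.0, 1.0, [0.3, 0.65, 0.05])]

loss = log_loss(rows)
print(loss)
```

The second row dominates the result: the true label 2.0 received a probability of only 0.05.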

precision(label: float) → float

Returns precision.

recall(label: float) → float

Returns recall.
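Per-label precision and recall follow directly from the confusion matrix: precision divides the diagonal entry by its column sum, recall divides it by its row sum. The helpers below are a hypothetical sketch and not part of the API; the zero-denominator guards are assumptions.

```python
# Confusion matrix from the examples: rows = actual, columns = predicted.
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]

def precision(matrix, i):
    """Correct predictions of class i / all predictions of class i."""
    predicted_i = sum(row[i] for row in matrix)
    return matrix[i][i] / predicted_i if predicted_i else 0.0

def recall(matrix, i):
    """Correct predictions of class i / all instances truly in class i."""
    actual_i = sum(matrix[i])
    return matrix[i][i] / actual_i if actual_i else 0.0

precision_1 = precision(matrix, 1)  # label 1.0
recall_2 = recall(matrix, 2)        # label 2.0
print(precision_1, recall_2)
```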

truePositiveRate(label: float) → float

Returns true positive rate for a given label (category).

weightedFMeasure(beta: Optional[float] = None) → float

Returns weighted averaged f-measure.
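The weighted values 0.66... and 0.65... in the examples are consistent with averaging the per-label F-beta scores, each weighted by the fraction of instances that truly carry that label. The sketch below assumes that weighting scheme; weighted_f_measure is a hypothetical helper.

```python
# Confusion matrix from the examples: rows = actual, columns = predicted.
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]

def weighted_f_measure(matrix, beta=1.0):
    """Per-class F-beta, averaged with weights = class frequency among labels."""
    total = sum(sum(row) for row in matrix)
    beta_sq = beta * beta
    score = 0.0
    for i, row in enumerate(matrix):
        actual_i = sum(row)
        predicted_i = sum(r[i] for r in matrix)
        p = matrix[i][i] / predicted_i if predicted_i else 0.0
        r = matrix[i][i] / actual_i if actual_i else 0.0
        denom = beta_sq * p + r
        f = (1 + beta_sq) * p * r / denom if denom else 0.0
        score += f * actual_i / total
    return score

weighted_f1 = weighted_f_measure(matrix)
weighted_f2 = weighted_f_measure(matrix, beta=2.0)
print(weighted_f1, weighted_f2)
```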

Attributes Documentation

accuracy

Returns accuracy (the number of correctly classified instances divided by the total number of instances).
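In terms of the confusion matrix, accuracy is the sum of the diagonal divided by the sum of all entries. A short check against the example data (variable names are illustrative):

```python
# Confusion matrix from the examples: rows = actual, columns = predicted.
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]

correct = sum(matrix[i][i] for i in range(len(matrix)))  # diagonal entries
total = sum(sum(row) for row in matrix)                  # all instances

accuracy = correct / total
print(accuracy)
```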

weightedFalsePositiveRate

Returns weighted false positive rate.

weightedPrecision

Returns weighted averaged precision.

weightedRecall

Returns weighted averaged recall (equal to accuracy, and therefore to micro-averaged precision, recall, and f-measure).

weightedTruePositiveRate

Returns weighted true positive rate (the same quantity as weighted averaged recall).
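The weighted attributes can be reproduced from the example confusion matrix by averaging each per-label metric with weights equal to the label's share of all instances. This is a sketch under that assumption, with illustrative variable names. It omits zero-division guards because every class in this example has at least one instance and one prediction.

```python
# Confusion matrix from the examples: rows = actual, columns = predicted.
matrix = [[2, 1, 1],
          [1, 3, 0],
          [0, 0, 1]]
n = len(matrix)
total = sum(sum(row) for row in matrix)

actual = [sum(matrix[i]) for i in range(n)]                    # row sums
predicted = [sum(row[i] for row in matrix) for i in range(n)]  # column sums
weights = [a / total for a in actual]                          # label frequencies

recalls = [matrix[i][i] / actual[i] for i in range(n)]
precisions = [matrix[i][i] / predicted[i] for i in range(n)]
fprs = [(predicted[i] - matrix[i][i]) / (total - actual[i]) for i in range(n)]

weighted_recall = sum(w * r for w, r in zip(weights, recalls))
weighted_precision = sum(w * p for w, p in zip(weights, precisions))
weighted_fpr = sum(w * f for w, f in zip(weights, fprs))

accuracy = sum(matrix[i][i] for i in range(n)) / total
print(weighted_recall, weighted_precision, weighted_fpr, accuracy)
```

With these weights, the class counts cancel in the weighted recall, leaving the number of correct predictions over the total, which is why it coincides with accuracy.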