KMeansSummary

class pyspark.ml.clustering.KMeansSummary(java_obj: Optional[JavaObject] = None)

Summary of KMeans.

Attributes

`cluster`	DataFrame of predicted cluster centers for each training data point.
`clusterSizes`	Size of (number of data points in) each cluster.
`featuresCol`	Name for column of features in predictions.
`k`	The number of clusters the model was trained with.
`numIter`	Number of iterations.
`predictionCol`	Name for column of predicted clusters in predictions.
`predictions`	DataFrame produced by the model’s transform method.
`trainingCost`	K-means cost (sum of squared distances to the nearest centroid for all points in the training dataset).

Attributes Documentation

cluster: DataFrame of predicted cluster centers for each training data point.

clusterSizes: Size of (number of data points in) each cluster.

featuresCol: Name for column of features in predictions.

k: The number of clusters the model was trained with.

numIter: Number of iterations.

predictionCol: Name for column of predicted clusters in predictions.

predictions: DataFrame produced by the model’s transform method.

trainingCost: K-means cost (sum of squared distances to the nearest centroid for all points in the training dataset). This is equivalent to the `inertia_` attribute of scikit-learn’s KMeans.
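
The definitions of `trainingCost` and `clusterSizes` can be illustrated without Spark. The following is a minimal plain-Python sketch, not the library's implementation: the helper names `squared_distance` and `kmeans_cost` and the sample data are hypothetical, and it assumes unweighted points and Euclidean distance.

```python
from typing import List, Sequence, Tuple


def squared_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(
    points: Sequence[Sequence[float]],
    centers: Sequence[Sequence[float]],
) -> Tuple[float, List[int]]:
    """Return (cost, sizes) for a fixed set of centers.

    cost  : sum over all points of the squared distance to the nearest center
    sizes : number of points assigned to each center
    (Hypothetical helper for illustration; assumes Euclidean distance.)
    """
    cost = 0.0
    sizes = [0] * len(centers)
    for p in points:
        dists = [squared_distance(p, c) for c in centers]
        # Index of the nearest center; ties go to the lowest index.
        nearest = min(range(len(centers)), key=dists.__getitem__)
        sizes[nearest] += 1
        cost += dists[nearest]
    return cost, sizes


# Illustrative data: two well-separated groups and their centroids.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centers = [(0.5, 0.5), (8.5, 8.5)]
cost, sizes = kmeans_cost(points, centers)
print(cost, sizes)
```

Here every point lies at squared distance 0.5 from its nearest center, so the cost is the sum of four such terms and each cluster holds two points. A model trained with a non-Euclidean distance measure would report a different cost.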
