GaussianMixture
class pyspark.mllib.clustering.GaussianMixture
 Learning algorithm for Gaussian Mixtures using the expectation-maximization algorithm.
Methods
train(rdd, k[, convergenceTol, …]) Train a Gaussian Mixture clustering model.
Methods Documentation
classmethod train(rdd: pyspark.rdd.RDD[VectorLike], k: int, convergenceTol: float = 0.001, maxIterations: int = 100, seed: Optional[int] = None, initialModel: Optional[pyspark.mllib.clustering.GaussianMixtureModel] = None) → pyspark.mllib.clustering.GaussianMixtureModel
 Train a Gaussian Mixture clustering model.
Parameters
- rdd : pyspark.RDD
 Training points as an RDD of pyspark.mllib.linalg.Vector or convertible sequence types.
- k : int
 Number of independent Gaussians in the mixture model.
- convergenceTol : float, optional
 Maximum change in log-likelihood at which convergence is considered to have occurred. (default: 1e-3)
- maxIterations : int, optional
 Maximum number of iterations allowed. (default: 100)
- seed : int, optional
 Random seed for initial Gaussian distribution. Set as None to generate seed based on system time. (default: None)
- initialModel : GaussianMixtureModel, optional
 Initial GMM starting point, bypassing the random initialization. (default: None)
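
Example (illustrative sketch, not Spark's implementation)
 To make the roles of k, convergenceTol, maxIterations, seed and initialModel concrete, the following is a minimal pure-Python sketch of expectation-maximization for a one-dimensional mixture. It is an assumption-laden simplification: the function name train_gmm_1d and the initial_means argument (standing in for initialModel) are hypothetical, the random initialization scheme is chosen for brevity, and Spark itself operates on multivariate vectors in an RDD.

```python
import math
import random


def train_gmm_1d(points, k, convergence_tol=1e-3, max_iterations=100,
                 seed=None, initial_means=None):
    """Fit a k-component 1-D Gaussian mixture with EM (illustrative sketch).

    Hypothetical helper: parameter names mirror GaussianMixture.train, but
    this is not how Spark implements it.
    """
    n = len(points)
    if initial_means is not None:
        # Analogous to initialModel: skip the random initialization.
        means = list(initial_means)
    else:
        # seed=None lets Python pick a system-derived seed.
        rng = random.Random(seed)
        means = rng.sample(points, k)

    # Start every component with the overall variance and equal weight.
    overall_mean = sum(points) / n
    overall_var = sum((x - overall_mean) ** 2 for x in points) / n
    variances = [max(overall_var, 1e-6)] * k
    weights = [1.0 / k] * k

    def density(x, mu, var):
        return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    prev_ll = None
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        # E-step: per-point responsibilities and the total log-likelihood.
        resp = []
        ll = 0.0
        for x in points:
            dens = [w * density(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against log(0)
            ll += math.log(total)
            resp.append([d / total for d in dens])

        # Stop once the log-likelihood changes by less than the tolerance.
        if prev_ll is not None and abs(ll - prev_ll) < convergence_tol:
            break
        prev_ll = ll

        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component

    return weights, means, variances, iterations


# Two well-separated groups of points, centred on 0 and 10.
points = [-1.0, -0.5, 0.0, 0.5, 1.0, 9.0, 9.5, 10.0, 10.5, 11.0]
weights, means, variances, n_iter = train_gmm_1d(points, k=2, initial_means=[1.0, 9.0])
```

 With an existing RDD of points, the corresponding call per the signature above would be GaussianMixture.train(rdd, k=2, convergenceTol=1e-3, maxIterations=100, seed=10). As in the sketch, a looser tolerance or a smaller iteration cap trades accuracy for fewer passes over the data.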