LassoWithSGD
class pyspark.mllib.regression.LassoWithSGD
Train a regression model with L1-regularization using Stochastic Gradient Descent.
Deprecated: use pyspark.ml.regression.LinearRegression with elasticNetParam = 1.0 instead. Note that the default regParam is 0.01 for LassoWithSGD but 0.0 for LinearRegression.
Methods
train(data[, iterations, step, regParam, …])
Train a regression model with L1-regularization using Stochastic Gradient Descent.
Methods Documentation
classmethod train(data: pyspark.rdd.RDD[pyspark.mllib.regression.LabeledPoint], iterations: int = 100, step: float = 1.0, regParam: float = 0.01, miniBatchFraction: float = 1.0, initialWeights: Optional[VectorLike] = None, intercept: bool = False, validateData: bool = True, convergenceTol: float = 0.001) → pyspark.mllib.regression.LassoModel
Train a regression model with L1-regularization using Stochastic Gradient Descent. This solves the L1-regularized least squares regression formulation:
f(weights) = 1/(2n) ||A weights - y||^2 + regParam ||weights||_1
Here the data matrix A has n rows, and the input RDD holds the rows of A, each paired with its corresponding right-hand-side label y. See also the documentation for the precise formulation.
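As a concrete reading of the objective, the following minimal sketch evaluates f(weights) on a tiny in-memory data set in plain Python. The helper name `lasso_objective` and the sample data are hypothetical and used only for illustration; this is not part of the PySpark API and says nothing about how Spark computes the value internally.

```python
# Illustrative only: evaluates the documented objective in plain Python.
# `lasso_objective` is a hypothetical helper, not a PySpark function.

def lasso_objective(rows, labels, weights, reg_param=0.01):
    """Return 1/(2n) * ||A w - y||^2 + reg_param * ||w||_1."""
    n = len(rows)
    # Sum of squared residuals, one term per row of A.
    squared_error = sum(
        (sum(a * w for a, w in zip(row, weights)) - y) ** 2
        for row, y in zip(rows, labels)
    )
    # L1 norm of the weight vector.
    l1_norm = sum(abs(w) for w in weights)
    return squared_error / (2 * n) + reg_param * l1_norm


# Three rows of A with two features each, and their labels y.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]

exact_fit = lasso_objective(A, y, [1.0, 2.0])  # zero residual, penalty term only
all_zero = lasso_objective(A, y, [0.0, 0.0])   # zero penalty, residual term only
```

With weights that fit the data exactly, only the penalty term regParam ||weights||_1 remains; with all-zero weights, only the squared-error term remains. Increasing regParam shifts the balance toward smaller, sparser weights.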
- Parameters
- data : pyspark.RDD
The training data, an RDD of LabeledPoint.
- iterations : int, optional
The number of iterations. (default: 100)
- step : float, optional
The step parameter used in SGD. (default: 1.0)
- regParam : float, optional
The regularizer parameter. (default: 0.01)
- miniBatchFraction : float, optional
Fraction of data to be used for each SGD iteration. (default: 1.0)
- initialWeights : pyspark.mllib.linalg.Vector or convertible, optional
The initial weights. (default: None)
- intercept : bool, optional
Whether to use the augmented representation for training data (i.e. whether bias features are activated). (default: False)
- validateData : bool, optional
Whether the algorithm should validate data before training. (default: True)
- convergenceTol : float, optional
A condition which decides iteration termination. (default: 0.001)
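To show where each parameter might enter an SGD loop, here is a simplified, single-machine sketch in plain Python. It is not Spark's implementation: the function `lasso_sgd_sketch`, the step decay of step / sqrt(i), the soft-thresholding update for the L1 penalty, and the relative-change stopping rule are all assumptions chosen for illustration, and the `intercept` and `validateData` options are left out.

```python
import random


def soft_threshold(value, shrink):
    """Move value toward zero by `shrink`, clamping at zero (L1 proximal step)."""
    if value > shrink:
        return value - shrink
    if value < -shrink:
        return value + shrink
    return 0.0


def lasso_sgd_sketch(rows, labels, iterations=100, step=1.0, reg_param=0.01,
                     mini_batch_fraction=1.0, initial_weights=None,
                     convergence_tol=0.001, seed=0):
    """Hypothetical helper mirroring the parameter names of train()."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # initialWeights: start from the given vector, or from zeros.
    weights = list(initial_weights) if initial_weights is not None else [0.0] * dim
    for i in range(1, iterations + 1):
        # miniBatchFraction: keep each row with this probability.
        batch = [k for k in range(len(rows)) if rng.random() < mini_batch_fraction]
        if not batch:
            continue
        # Average least-squares gradient over the sampled rows.
        grad = [0.0] * dim
        for k in batch:
            residual = sum(a * w for a, w in zip(rows[k], weights)) - labels[k]
            for j in range(dim):
                grad[j] += residual * rows[k][j] / len(batch)
        # step: base step size, decayed per iteration (assumed schedule).
        step_i = step / i ** 0.5
        # regParam: strength of the L1 shrinkage applied after the gradient step.
        updated = [soft_threshold(w - step_i * g, reg_param * step_i)
                   for w, g in zip(weights, grad)]
        # convergenceTol: stop once the weights barely move (assumed rule).
        change = sum((u - w) ** 2 for u, w in zip(updated, weights)) ** 0.5
        norm = sum(u * u for u in updated) ** 0.5
        weights = updated
        if change < convergence_tol * max(norm, 1.0):
            break
    return weights


# Labels are exactly 2 * feature, so the fitted weight should approach 2.
rows = [[1.0], [2.0], [3.0]]
labels = [2.0, 4.0, 6.0]
fitted = lasso_sgd_sketch(rows, labels, iterations=100, step=0.1)
```

In this sketch a smaller `step` trades speed for stability, a larger `reg_param` pushes more weights to exactly zero, and a looser `convergence_tol` ends the loop earlier than the `iterations` cap.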