LogisticRegressionWithSGD
class pyspark.mllib.classification.LogisticRegressionWithSGD
Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent.
Deprecated since version 2.0.0: Use ml.classification.LogisticRegression or LogisticRegressionWithLBFGS.
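To make the idea of binary logistic regression trained by stochastic gradient descent concrete, here is a minimal pure-Python sketch. It is not Spark's implementation: the names `sigmoid`, `train_sgd`, and `predict` are hypothetical, and the `step / sqrt(t)` decay, the L2 penalty form, and the relative-change stopping rule are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_sgd(points, iterations=100, step=1.0, mini_batch_fraction=1.0,
              reg_param=0.01, convergence_tol=0.001, seed=0):
    """Fit weights for binary logistic regression with mini-batch SGD.

    `points` is a list of (label, features) pairs with label in {0.0, 1.0}.
    Illustrative only: no intercept, L2 penalty, assumed step / sqrt(t) decay.
    """
    rng = random.Random(seed)
    dim = len(points[0][1])
    weights = [0.0] * dim
    for t in range(1, iterations + 1):
        # Sample roughly mini_batch_fraction of the data for this iteration.
        batch = [p for p in points if rng.random() < mini_batch_fraction]
        if not batch:
            continue
        # Average gradient of the logistic loss over the batch.
        grad = [0.0] * dim
        for label, x in batch:
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - label
            for j, xi in enumerate(x):
                grad[j] += err * xi / len(batch)
        # Decaying step size (an assumption), then an L2-regularized update.
        lr = step / math.sqrt(t)
        new_weights = [w - lr * (g + reg_param * w)
                       for w, g in zip(weights, grad)]
        # Stop once the relative change in the weights falls below tolerance.
        delta = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(new_weights, weights)))
        norm = math.sqrt(sum(a * a for a in new_weights))
        weights = new_weights
        if delta < convergence_tol * max(norm, 1.0):
            break
    return weights


def predict(weights, x, threshold=0.5):
    """Return 1 if the predicted probability exceeds the threshold, else 0."""
    prob = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
    return 1 if prob > threshold else 0


# Tiny, made-up dataset: the first feature signals class 1, the second class 0.
data = [(0.0, [0.0, 1.0]), (1.0, [1.0, 0.0]),
        (0.0, [0.1, 0.9]), (1.0, [0.9, 0.2])]
weights = train_sgd(data, iterations=50)
print(predict(weights, [1.0, 0.0]), predict(weights, [0.0, 1.0]))
```

The keyword arguments of `train_sgd` are named after the parameters of `train` only to show roughly where each one would enter such a loop; the actual behavior in Spark may differ in its details.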
Methods
train(data[, iterations, step, …]): Train a logistic regression model on the given data.
Methods Documentation
classmethod train(data: pyspark.rdd.RDD[pyspark.mllib.regression.LabeledPoint], iterations: int = 100, step: float = 1.0, miniBatchFraction: float = 1.0, initialWeights: Optional[VectorLike] = None, regParam: float = 0.01, regType: str = 'l2', intercept: bool = False, validateData: bool = True, convergenceTol: float = 0.001) → pyspark.mllib.classification.LogisticRegressionModel
Train a logistic regression model on the given data.
- Parameters
- data : pyspark.RDD
  The training data, an RDD of pyspark.mllib.regression.LabeledPoint.
- iterations : int, optional
  The number of iterations. (default: 100)
- step : float, optional
  The step parameter used in SGD. (default: 1.0)
- miniBatchFraction : float, optional
  Fraction of data to be used for each SGD iteration. (default: 1.0)
- initialWeights : pyspark.mllib.linalg.Vector or convertible, optional
  The initial weights. (default: None)
- regParam : float, optional
  The regularizer parameter. (default: 0.01)
- regType : str, optional
  The type of regularizer used for training the model. Supported values:
  - "l1" for L1 regularization
  - "l2" for L2 regularization (default)
  - None for no regularization
- intercept : bool, optional
  Whether to use the augmented representation of the training data (i.e., whether bias features are activated). (default: False)
- validateData : bool, optional
  Whether the algorithm should validate the data before training. (default: True)
- convergenceTol : float, optional
  The tolerance that decides when iteration terminates. (default: 0.001)