object LinearDataGenerator
Generates sample data for linear models. This object draws uniformly random values for every feature and adds zero-mean Gaussian noise, scaled by eps, to the response variable Y.
 Annotations
 @Since( "0.8.0" )
Value Members

final
def
!=(arg0: Any): Boolean
 Definition Classes
 AnyRef → Any

final
def
##(): Int
 Definition Classes
 AnyRef → Any

final
def
==(arg0: Any): Boolean
 Definition Classes
 AnyRef → Any

final
def
asInstanceOf[T0]: T0
 Definition Classes
 Any

def
clone(): AnyRef
 Attributes
 protected[lang]
 Definition Classes
 AnyRef
 Annotations
 @throws( ... ) @native()

final
def
eq(arg0: AnyRef): Boolean
 Definition Classes
 AnyRef

def
equals(arg0: Any): Boolean
 Definition Classes
 AnyRef → Any

def
finalize(): Unit
 Attributes
 protected[lang]
 Definition Classes
 AnyRef
 Annotations
 @throws( classOf[java.lang.Throwable] )

def
generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double, sparsity: Double): Seq[LabeledPoint]
 intercept
Data intercept
 weights
Weights to be applied.
 xMean
The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.
 xVariance
the variance of the generated features.
 nPoints
Number of points in sample.
 seed
Random seed
 eps
Epsilon scaling factor.
 sparsity
The ratio of zero elements. If it is 0.0, LabeledPoints with DenseVector features are returned.
 returns
Seq of input.
 Annotations
 @Since( "1.6.0" )
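As a usage sketch, the call below requests 100 points with three features and a sparsity of 0.5. It assumes the Spark MLlib artifact is on the classpath. The object name `SparseInputExample` and every argument value are illustrative and do not come from the original documentation.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.LinearDataGenerator

// Hypothetical wrapper object; only generateLinearInput comes from the API above.
object SparseInputExample {
  def sample(): Seq[LabeledPoint] =
    LinearDataGenerator.generateLinearInput(
      intercept = 2.0,
      weights = Array(1.5, -0.5, 3.0),   // one weight per feature
      xMean = Array(0.0, 1.0, -1.0),     // per-feature mean
      xVariance = Array(1.0, 0.5, 2.0),  // per-feature variance
      nPoints = 100,
      seed = 42,
      eps = 0.1,
      sparsity = 0.5)                    // ratio of zero elements

  def main(args: Array[String]): Unit = {
    val points = sample()
    // Each LabeledPoint pairs a response value with its feature vector.
    points.take(3).foreach(p => println(s"${p.label} <- ${p.features}"))
  }
}
```

The three arrays should have the same length, since each position describes one feature.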

def
generateLinearInput(intercept: Double, weights: Array[Double], xMean: Array[Double], xVariance: Array[Double], nPoints: Int, seed: Int, eps: Double): Seq[LabeledPoint]
 intercept
Data intercept
 weights
Weights to be applied.
 xMean
The mean of the generated features. If the features are not properly standardized, a poorly implemented algorithm will often have difficulty converging.
 xVariance
the variance of the generated features.
 nPoints
Number of points in sample.
 seed
Random seed
 eps
Epsilon scaling factor.
 returns
Seq of input.
 Annotations
 @Since( "0.8.0" )
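To make the roles of xMean, xVariance and eps concrete, here is a minimal, Spark-free sketch of one way such data could be produced. It assumes that each feature is a uniform draw rescaled to the requested mean and variance, and that the label is the linear response plus eps times a standard Gaussian draw. `LinearInputSketch` and `Point` are hypothetical names, and the sketch is not claimed to reproduce the library's output for a given seed.

```scala
import scala.util.Random

object LinearInputSketch {
  // Hypothetical stand-in for LabeledPoint, so the sketch needs no Spark dependency.
  final case class Point(label: Double, features: Array[Double])

  def generate(
      intercept: Double,
      weights: Array[Double],
      xMean: Array[Double],
      xVariance: Array[Double],
      nPoints: Int,
      seed: Int,
      eps: Double): Seq[Point] = {
    require(weights.length == xMean.length && weights.length == xVariance.length,
      "weights, xMean and xVariance must have the same length")
    val rnd = new Random(seed)
    Seq.fill(nPoints) {
      // Uniform on [-0.5, 0.5) has variance 1/12, so scaling by sqrt(12 * v)
      // yields variance v; adding the mean then shifts the centre.
      val x = Array.tabulate(weights.length) { i =>
        (rnd.nextDouble() - 0.5) * math.sqrt(12.0 * xVariance(i)) + xMean(i)
      }
      val linear = intercept + x.zip(weights).map { case (xi, wi) => xi * wi }.sum
      // Assumed noise model: zero-mean Gaussian scaled by eps.
      Point(linear + eps * rnd.nextGaussian(), x)
    }
  }
}
```

With eps set to 0.0 the labels in this sketch lie exactly on the hyperplane defined by intercept and weights, which is a convenient sanity check for a regression implementation.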

def
generateLinearInput(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double = 0.1): Seq[LabeledPoint]
For compatibility, data generated without specifying the mean and variance will have zero mean and a variance of (1.0/3.0), since the original output range is [-1, 1] with uniform distribution, and the variance of a uniform distribution is (b - a)^2 / 12, which is (1.0/3.0) here.
 intercept
Data intercept
 weights
Weights to be applied.
 nPoints
Number of points in sample.
 seed
Random seed
 eps
Epsilon scaling factor.
 returns
Seq of input.
 Annotations
 @Since( "0.8.0" )
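The default mean and variance quoted for this overload follow from the standard formulas for a uniform distribution on [a, b], worked here for a = -1 and b = 1:

```latex
\mathbb{E}[X] = \frac{a + b}{2} = \frac{-1 + 1}{2} = 0,
\qquad
\operatorname{Var}(X) = \frac{(b - a)^{2}}{12}
                      = \frac{(1 - (-1))^{2}}{12}
                      = \frac{4}{12}
                      = \frac{1}{3}.
```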

def
generateLinearInputAsList(intercept: Double, weights: Array[Double], nPoints: Int, seed: Int, eps: Double): List[LabeledPoint]
Return a Java List of synthetic data randomly generated according to a multi collinear model.
 intercept
Data intercept
 weights
Weights to be applied.
 nPoints
Number of points in sample.
 seed
Random seed
 returns
Java List of input.
 Annotations
 @Since( "0.8.0" )

def
generateLinearRDD(sc: SparkContext, nexamples: Int, nfeatures: Int, eps: Double, nparts: Int = 2, intercept: Double = 0.0): RDD[LabeledPoint]
Generate an RDD containing sample data for Linear Regression models, including Ridge, Lasso, and unregularized variants.
 sc
SparkContext to be used for generating the RDD.
 nexamples
Number of examples that will be contained in the RDD.
 nfeatures
Number of features to generate for each example.
 eps
Epsilon factor by which examples are scaled.
 nparts
Number of partitions in the RDD. Default value is 2.
 returns
RDD of LabeledPoint containing sample data.
 Annotations
 @Since( "0.8.0" )
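A short usage sketch for this method follows. It assumes a local Spark installation with MLlib on the classpath. The object name `LinearRDDExample`, the `local[2]` master and all argument values are illustrative choices, not part of the original documentation.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.LinearDataGenerator
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object; only generateLinearRDD comes from the API above.
object LinearRDDExample {
  // 1000 examples with 10 features each, spread over 4 partitions.
  def generate(sc: SparkContext): RDD[LabeledPoint] =
    LinearDataGenerator.generateLinearRDD(
      sc, nexamples = 1000, nfeatures = 10, eps = 0.1, nparts = 4, intercept = 1.0)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LinearRDDExample").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val data = generate(sc)
      println(s"examples: ${data.count()}, partitions: ${data.getNumPartitions}")
      println(s"first: ${data.first()}")
    } finally {
      sc.stop() // release local Spark resources even if generation fails
    }
  }
}
```

Choosing nexamples as a multiple of nparts keeps the split across partitions even.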

final
def
getClass(): Class[_]
 Definition Classes
 AnyRef → Any
 Annotations
 @native()

def
hashCode(): Int
 Definition Classes
 AnyRef → Any
 Annotations
 @native()

final
def
isInstanceOf[T0]: Boolean
 Definition Classes
 Any

def
main(args: Array[String]): Unit
 Annotations
 @Since( "0.8.0" )

final
def
ne(arg0: AnyRef): Boolean
 Definition Classes
 AnyRef

final
def
notify(): Unit
 Definition Classes
 AnyRef
 Annotations
 @native()

final
def
notifyAll(): Unit
 Definition Classes
 AnyRef
 Annotations
 @native()

final
def
synchronized[T0](arg0: ⇒ T0): T0
 Definition Classes
 AnyRef

def
toString(): String
 Definition Classes
 AnyRef → Any

final
def
wait(): Unit
 Definition Classes
 AnyRef
 Annotations
 @throws( ... )

final
def
wait(arg0: Long, arg1: Int): Unit
 Definition Classes
 AnyRef
 Annotations
 @throws( ... )

final
def
wait(arg0: Long): Unit
 Definition Classes
 AnyRef
 Annotations
 @throws( ... ) @native()