object RandomForest extends Serializable with Logging
- Annotations: @Since( "1.2.0" )
- Inheritance: RandomForest → Logging → Serializable → Serializable → AnyRef → Any
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
  - Attributes: protected
  - Definition Classes: Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
  - Attributes: protected
  - Definition Classes: Logging
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- def isTraceEnabled(): Boolean
  - Attributes: protected
  - Definition Classes: Logging
- def log: Logger
  - Attributes: protected
  - Definition Classes: Logging
- def logDebug(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logDebug(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logError(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logError(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logInfo(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logInfo(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logName: String
  - Attributes: protected
  - Definition Classes: Logging
- def logTrace(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logTrace(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logWarning(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logWarning(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- val supportedFeatureSubsetStrategies: Array[String]
  List of supported feature subset sampling strategies. See the validation sketch below.
  - Annotations: @Since( "1.2.0" )
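  The snippet below is a minimal sketch, not part of the Spark API docs: it checks a candidate strategy against this list before passing it to one of the train* methods further down. The value "log2" is purely an illustrative choice.

  ```scala
  // Hedged sketch: validate a candidate featureSubsetStrategy against the
  // published list before calling trainClassifier/trainRegressor.
  import org.apache.spark.mllib.tree.RandomForest

  val candidate = "log2" // illustrative value only
  require(
    RandomForest.supportedFeatureSubsetStrategies.contains(candidate),
    s"Unsupported featureSubsetStrategy: $candidate, expected one of " +
      RandomForest.supportedFeatureSubsetStrategies.mkString(", "))
  ```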
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: AnyRef → Any
- def trainClassifier(input: JavaRDD[LabeledPoint], numClasses: Int, categoricalFeaturesInfo: Map[Integer, Integer], numTrees: Int, featureSubsetStrategy: String, impurity: String, maxDepth: Int, maxBins: Int, seed: Int): RandomForestModel
  Java-friendly API for org.apache.spark.mllib.tree.RandomForest.trainClassifier.
  - Annotations: @Since( "1.2.0" )
- def trainClassifier(input: RDD[LabeledPoint], numClasses: Int, categoricalFeaturesInfo: Map[Int, Int], numTrees: Int, featureSubsetStrategy: String, impurity: String, maxDepth: Int, maxBins: Int, seed: Int = Utils.random.nextInt()): RandomForestModel
  Method to train a random forest model for binary or multiclass classification (usage sketch below).
  - input: Training dataset: RDD of org.apache.spark.mllib.regression.LabeledPoint. Labels should take values {0, 1, ..., numClasses-1}.
  - numClasses: Number of classes for classification.
  - categoricalFeaturesInfo: Map storing the arity of categorical features. An entry (n -> k) indicates that feature n is categorical with k categories indexed from 0: {0, 1, ..., k-1}.
  - numTrees: Number of trees in the random forest.
  - featureSubsetStrategy: Number of features to consider for splits at each node. Supported values: "auto", "all", "sqrt", "log2", "onethird". If "auto" is set, this parameter is chosen based on numTrees: if numTrees == 1, it is set to "all"; if numTrees > 1 (forest), it is set to "sqrt".
  - impurity: Criterion used for information gain calculation. Supported values: "gini" (recommended) or "entropy".
  - maxDepth: Maximum depth of the tree (e.g. depth 0 means 1 leaf node, depth 1 means 1 internal node + 2 leaf nodes). (suggested value: 4)
  - maxBins: Maximum number of bins used for splitting features. (suggested value: 100)
  - seed: Random seed for bootstrapping and choosing feature subsets.
  - returns: RandomForestModel that can be used for prediction.
  - Annotations: @Since( "1.2.0" )
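  The sketch below shows one plausible call to this overload; it is not taken from the Spark sources. The data path, the SparkContext `sc` (e.g. as provided by spark-shell), and the hyperparameter values are assumptions for illustration.

  ```scala
  // Hedged sketch: train a binary classifier on LIBSVM data, assuming `sc` is an
  // existing SparkContext and the path below points at a real file.
  import org.apache.spark.mllib.tree.RandomForest
  import org.apache.spark.mllib.util.MLUtils

  val data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")

  val model = RandomForest.trainClassifier(
    input = data,
    numClasses = 2,
    categoricalFeaturesInfo = Map.empty[Int, Int], // all features treated as continuous
    numTrees = 10,
    featureSubsetStrategy = "auto",                // resolves to "sqrt" since numTrees > 1
    impurity = "gini",
    maxDepth = 4,
    maxBins = 100,
    seed = 42)

  // Predict the label of a single example.
  val prediction = model.predict(data.first().features)
  ```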
- def trainClassifier(input: RDD[LabeledPoint], strategy: Strategy, numTrees: Int, featureSubsetStrategy: String, seed: Int): RandomForestModel
  Method to train a random forest model for binary or multiclass classification (usage sketch below).
  - input: Training dataset: RDD of org.apache.spark.mllib.regression.LabeledPoint. Labels should take values {0, 1, ..., numClasses-1}.
  - strategy: Parameters for training each tree in the forest.
  - numTrees: Number of trees in the random forest.
  - featureSubsetStrategy: Number of features to consider for splits at each node. Supported values: "auto", "all", "sqrt", "log2", "onethird". If "auto" is set, this parameter is chosen based on numTrees: if numTrees == 1, it is set to "all"; if numTrees > 1 (forest), it is set to "sqrt".
  - seed: Random seed for bootstrapping and choosing feature subsets.
  - returns: RandomForestModel that can be used for prediction.
  - Annotations: @Since( "1.2.0" )
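  A hedged sketch of the Strategy-based overload follows. It assumes `trainingData` is an RDD[LabeledPoint] prepared elsewhere; Strategy.defaultStrategy and the tweaked field values are shown only as one plausible configuration.

  ```scala
  // Hedged sketch: configure per-tree parameters through a Strategy, then train.
  // `trainingData: RDD[LabeledPoint]` is assumed to exist already.
  import org.apache.spark.mllib.tree.RandomForest
  import org.apache.spark.mllib.tree.configuration.Strategy

  val strategy = Strategy.defaultStrategy("Classification")
  strategy.numClasses = 2   // binary classification
  strategy.maxDepth = 4     // depth of each tree in the forest

  val model = RandomForest.trainClassifier(
    trainingData, strategy, numTrees = 10, featureSubsetStrategy = "sqrt", seed = 42)
  ```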
- def trainRegressor(input: JavaRDD[LabeledPoint], categoricalFeaturesInfo: Map[Integer, Integer], numTrees: Int, featureSubsetStrategy: String, impurity: String, maxDepth: Int, maxBins: Int, seed: Int): RandomForestModel
  Java-friendly API for org.apache.spark.mllib.tree.RandomForest.trainRegressor.
  - Annotations: @Since( "1.2.0" )
- def trainRegressor(input: RDD[LabeledPoint], categoricalFeaturesInfo: Map[Int, Int], numTrees: Int, featureSubsetStrategy: String, impurity: String, maxDepth: Int, maxBins: Int, seed: Int = Utils.random.nextInt()): RandomForestModel
  Method to train a random forest model for regression (usage sketch below).
  - input: Training dataset: RDD of org.apache.spark.mllib.regression.LabeledPoint. Labels are real numbers.
  - categoricalFeaturesInfo: Map storing the arity of categorical features. An entry (n -> k) indicates that feature n is categorical with k categories indexed from 0: {0, 1, ..., k-1}.
  - numTrees: Number of trees in the random forest.
  - featureSubsetStrategy: Number of features to consider for splits at each node. Supported values: "auto", "all", "sqrt", "log2", "onethird". If "auto" is set, this parameter is chosen based on numTrees: if numTrees == 1, it is set to "all"; if numTrees > 1 (forest), it is set to "onethird".
  - impurity: Criterion used for information gain calculation. The only supported value for regression is "variance".
  - maxDepth: Maximum depth of the tree (e.g., depth 0 means 1 leaf node, depth 1 means 1 internal node + 2 leaf nodes). (suggested value: 4)
  - maxBins: Maximum number of bins used for splitting features. (suggested value: 100)
  - seed: Random seed for bootstrapping and choosing feature subsets.
  - returns: RandomForestModel that can be used for prediction.
  - Annotations: @Since( "1.2.0" )
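  The sketch below is one plausible call to this overload; `trainingData`, the categorical-feature map, and the hyperparameter values are illustrative assumptions.

  ```scala
  // Hedged sketch: train a regression forest. `trainingData: RDD[LabeledPoint]`
  // with real-valued labels is assumed to exist; feature 0 is assumed categorical
  // with 3 categories purely for illustration.
  import org.apache.spark.mllib.tree.RandomForest

  val model = RandomForest.trainRegressor(
    input = trainingData,
    categoricalFeaturesInfo = Map(0 -> 3),
    numTrees = 20,
    featureSubsetStrategy = "onethird",
    impurity = "variance",   // the only impurity supported for regression
    maxDepth = 4,
    maxBins = 100,
    seed = 12345)
  ```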
- def trainRegressor(input: RDD[LabeledPoint], strategy: Strategy, numTrees: Int, featureSubsetStrategy: String, seed: Int): RandomForestModel
  Method to train a random forest model for regression (usage sketch below).
  - input: Training dataset: RDD of org.apache.spark.mllib.regression.LabeledPoint. Labels are real numbers.
  - strategy: Parameters for training each tree in the forest.
  - numTrees: Number of trees in the random forest.
  - featureSubsetStrategy: Number of features to consider for splits at each node. Supported values: "auto", "all", "sqrt", "log2", "onethird". If "auto" is set, this parameter is chosen based on numTrees: if numTrees == 1, it is set to "all"; if numTrees > 1 (forest), it is set to "onethird".
  - seed: Random seed for bootstrapping and choosing feature subsets.
  - returns: RandomForestModel that can be used for prediction.
  - Annotations: @Since( "1.2.0" )
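  A hedged sketch of the Strategy-based regression overload, again assuming `trainingData` is an RDD[LabeledPoint] built elsewhere; the overridden Strategy field value is illustrative only.

  ```scala
  // Hedged sketch: regression with a Strategy object; `trainingData` is assumed.
  import org.apache.spark.mllib.tree.RandomForest
  import org.apache.spark.mllib.tree.configuration.Strategy

  val regStrategy = Strategy.defaultStrategy("Regression")
  regStrategy.maxDepth = 5   // illustrative override of the default depth

  val model = RandomForest.trainRegressor(
    trainingData, regStrategy, numTrees = 20, featureSubsetStrategy = "auto", seed = 7)
  ```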
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()