| Method or attribute | Description |
| --- | --- |
| `SparkSession.builder.appName(name)` | Sets a name for the application, which will be shown in the Spark web UI. |
| `SparkSession.builder.config([key, value, conf])` | Sets a config option. |
| `SparkSession.builder.enableHiveSupport()` | Enables Hive support, including connectivity to a persistent Hive metastore, support for Hive SerDes, and Hive user-defined functions. |
| `SparkSession.builder.getOrCreate()` | Gets an existing SparkSession or, if there is no existing one, creates a new one based on the options set in this builder. |
| `SparkSession.builder.master(master)` | Sets the Spark master URL to connect to, such as “local” to run locally, “local[4]” to run locally with 4 cores, or “spark://master:7077” to run on a Spark standalone cluster. |
| `SparkSession.catalog` | Interface through which the user may create, drop, alter, or query underlying databases, tables, functions, etc. |
| `SparkSession.conf` | Runtime configuration interface for Spark. |
| `SparkSession.createDataFrame(data[, schema, …])` | Creates a DataFrame from an RDD, a list, a `pandas.DataFrame`, or a `numpy.ndarray`. |
| `SparkSession.getActiveSession()` | Returns the active SparkSession for the current thread, as returned by the builder. |
| `SparkSession.newSession()` | Returns a new SparkSession that has a separate SQLConf, registered temporary views, and UDFs, but shares the SparkContext and table cache. |
| `SparkSession.range(start[, end, step, …])` | Creates a DataFrame with a single `pyspark.sql.types.LongType` column named `id`, containing elements in the range from `start` to `end` (exclusive), incremented by `step`. |
| `SparkSession.read` | Returns a DataFrameReader that can be used to read data in as a DataFrame. |
| `SparkSession.readStream` | Returns a DataStreamReader that can be used to read data streams as a streaming DataFrame. |
| `SparkSession.sparkContext` | Returns the underlying SparkContext. |
| `SparkSession.sql(sqlQuery, args, **kwargs)` | Returns a DataFrame representing the result of the given query. |
| `SparkSession.stop()` | Stops the underlying SparkContext. |
| `SparkSession.streams` | Returns a StreamingQueryManager that allows managing all the StreamingQuery instances active on this context. |
| `SparkSession.table(tableName)` | Returns the specified table as a DataFrame. |
| `SparkSession.udf` | Returns a UDFRegistration for UDF registration. |
| `SparkSession.version` | The version of Spark on which this application is running. |
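
As a rough illustration of how several of the entries above fit together, the following sketch builds a local session, creates two small DataFrames, queries one through a temporary view, and stops the session. It assumes PySpark and a compatible Java runtime are installed. The function name `demo_session_api`, the application name, the view name `people`, the sample rows, and the chosen config value are all hypothetical and are not taken from the reference itself.

```python
def demo_session_api():
    """Exercise a few SparkSession entries; returns plain Python values."""
    # Imported lazily so that defining this function does not require PySpark.
    from pyspark.sql import SparkSession

    # builder.master / appName / config / getOrCreate
    spark = (
        SparkSession.builder
        .master("local[2]")                           # run locally with 2 cores
        .appName("session-api-sketch")                # hypothetical application name
        .config("spark.sql.shuffle.partitions", "4")  # illustrative config option
        .getOrCreate()
    )
    try:
        # range: a single LongType column named "id"; the end value is exclusive.
        ids = spark.range(0, 10, 2)

        # createDataFrame: built here from a list of tuples plus column names.
        people = spark.createDataFrame(
            [("Alice", 34), ("Bob", 45)],  # hypothetical sample rows
            schema=["name", "age"],
        )

        # Register a temporary view so that sql() and table() can refer to it.
        people.createOrReplaceTempView("people")
        older = spark.sql("SELECT name FROM people WHERE age > 40")
        same_rows = spark.table("people")

        return {
            "version": spark.version,
            "ids": [row.id for row in ids.collect()],
            "older": [row.name for row in older.collect()],
            "table_rows": same_rows.count(),
        }
    finally:
        # stop: shuts down the underlying SparkContext.
        spark.stop()
```

Calling `demo_session_api()` returns a dictionary whose `ids` entry holds the even numbers from 0 through 8, because the end of the range is exclusive. Stopping the session in a `finally` block is a design choice of this sketch, so that the underlying SparkContext is released even if one of the intermediate steps raises.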