Input/Output

DataFrameReader.csv(path[, schema, sep, …])

Loads a CSV file and returns the result as a DataFrame.

DataFrameReader.format(source)

Specifies the input data source format.

DataFrameReader.jdbc(url, table[, column, …])

Constructs a DataFrame representing the database table named table, accessible via the JDBC URL url and the given connection properties.
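
As a minimal sketch of a JDBC read, the helper below forwards a URL, a table name, and connection properties to DataFrameReader.jdbc. The function name read_jdbc_table, the PostgreSQL driver class, and the credential parameters are illustrative assumptions, not part of the API listing; spark is assumed to be an existing SparkSession with the matching JDBC driver on its classpath.

```python
def read_jdbc_table(spark, url, table, user, password):
    """Return a DataFrame backed by `table` at the JDBC `url`.

    `spark` is assumed to be an existing SparkSession.
    """
    # Connection properties are handed through to the JDBC driver.
    # The driver class is an assumption (PostgreSQL); substitute your own.
    properties = {
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }
    return spark.read.jdbc(url=url, table=table, properties=properties)
```

Without the partitioning arguments (column, lowerBound, upperBound, numPartitions), the table is read through a single connection, which may be slow for large tables.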

DataFrameReader.json(path[, schema, …])

Loads JSON files and returns the results as a DataFrame.

DataFrameReader.load([path, format, schema])

Loads data from a data source and returns it as a DataFrame.

DataFrameReader.option(key, value)

Adds an input option for the underlying data source.

DataFrameReader.options(**options)

Adds input options for the underlying data source.
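
The format, option, options, and load methods are typically chained. The sketch below shows one possible chain for a semicolon-delimited CSV file; the helper name load_csv_generic and the chosen option values are assumptions for illustration, and spark is assumed to be an existing SparkSession.

```python
def load_csv_generic(spark, path):
    """Read a CSV file through the generic format/option/load chain."""
    reader = (
        spark.read
        .format("csv")                            # pick the data source
        .option("sep", ";")                       # set a single option
        .options(header=True, inferSchema=True)   # set several options at once
    )
    return reader.load(path)
```

Each call returns the reader itself, so options accumulate until load triggers the read.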

DataFrameReader.orc(path[, mergeSchema, …])

Loads ORC files, returning the result as a DataFrame.

DataFrameReader.parquet(*paths, **options)

Loads Parquet files, returning the result as a DataFrame.

DataFrameReader.schema(schema)

Specifies the input schema.
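
Supplying a schema up front avoids the extra pass over the data that schema inference needs. The following sketch passes a DDL-formatted string to schema before reading a CSV file; the helper name and the column names (id, name, signup_date) are hypothetical, and spark is assumed to be an existing SparkSession.

```python
def read_csv_with_schema(spark, path):
    """Read a CSV file with an explicit schema instead of inference."""
    # schema() accepts either a StructType or a DDL-formatted string.
    ddl = "id INT, name STRING, signup_date DATE"
    return spark.read.schema(ddl).csv(path, header=True)
```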

DataFrameReader.table(tableName)

Returns the specified table as a DataFrame.

DataFrameReader.text(paths[, wholetext, …])

Loads text files and returns a DataFrame whose schema starts with a string column named “value”, followed by partition columns if there are any.
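
To illustrate the two reading granularities of the text reader, the sketch below wraps DataFrameReader.text with a flag that maps to its wholetext parameter. The helper name read_text and its whole_files parameter are assumptions; spark is assumed to be an existing SparkSession.

```python
def read_text(spark, path, whole_files=False):
    """Read text into a DataFrame with a single string column "value".

    whole_files=False: one row per line (the default behaviour).
    whole_files=True:  one row per file, holding the entire file content.
    """
    return spark.read.text(path, wholetext=whole_files)
```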

DataFrameWriter.bucketBy(numBuckets, col, *cols)

Buckets the output by the given columns.

DataFrameWriter.csv(path[, mode, …])

Saves the content of the DataFrame in CSV format at the specified path.

DataFrameWriter.format(source)

Specifies the underlying output data source.

DataFrameWriter.insertInto(tableName[, …])

Inserts the content of the DataFrame into the specified table.
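
Because insertInto matches columns by position rather than by name, it can help to reorder the DataFrame to the target table's column order first. The sketch below is one way to do that; the helper name insert_by_position is an assumption, spark is assumed to be an existing SparkSession, and the target table is assumed to exist already.

```python
def insert_by_position(spark, df, table_name):
    """Append `df` to an existing table, aligning column order first."""
    # insertInto ignores column names, so select the columns in the
    # order the target table declares them.
    target_columns = spark.table(table_name).columns
    df.select(*target_columns).write.insertInto(table_name)
```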

DataFrameWriter.jdbc(url, table[, mode, …])

Saves the content of the DataFrame to an external database table via JDBC.

DataFrameWriter.json(path[, mode, …])

Saves the content of the DataFrame in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path.

DataFrameWriter.mode(saveMode)

Specifies the behavior when data or table already exists.
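
The save mode is given as a string: append, overwrite, ignore, or error (also spelled errorifexists, the default). As a hedged sketch, the helper below validates the mode before writing Parquet output; the names VALID_SAVE_MODES and write_parquet are illustrative, and df is assumed to be an existing DataFrame.

```python
VALID_SAVE_MODES = {"append", "overwrite", "ignore", "error", "errorifexists"}


def write_parquet(df, path, save_mode="errorifexists"):
    """Write `df` as Parquet, failing early on an unknown save mode."""
    if save_mode not in VALID_SAVE_MODES:
        raise ValueError(f"unsupported save mode: {save_mode!r}")
    # mode() returns the writer, so it chains into the format method.
    df.write.mode(save_mode).parquet(path)
```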

DataFrameWriter.option(key, value)

Adds an output option for the underlying data source.

DataFrameWriter.options(**options)

Adds output options for the underlying data source.

DataFrameWriter.orc(path[, mode, …])

Saves the content of the DataFrame in ORC format at the specified path.

DataFrameWriter.parquet(path[, mode, …])

Saves the content of the DataFrame in Parquet format at the specified path.

DataFrameWriter.partitionBy(*cols)

Partitions the output by the given columns on the file system.
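
Partitioned output is laid out as nested directories of the form column=value. A minimal sketch, assuming df is an existing DataFrame that contains the hypothetical columns year and month:

```python
def write_partitioned(df, path):
    """Write `df` as Parquet, one directory per (year, month) pair."""
    # Produces paths such as <path>/year=2024/month=1/part-....parquet
    df.write.partitionBy("year", "month").mode("overwrite").parquet(path)
```

Partitioning on low-cardinality columns keeps the number of directories and files manageable.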

DataFrameWriter.save([path, format, mode, …])

Saves the contents of the DataFrame to a data source.
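
The generic save method pairs with format in the same way load does on the reader side. The sketch below writes gzip-compressed JSON; the helper name save_as_json and the choice of compression codec are assumptions for illustration, and df is assumed to be an existing DataFrame.

```python
def save_as_json(df, path):
    """Write `df` as gzip-compressed JSON Lines via the generic save()."""
    (
        df.write
        .format("json")
        .option("compression", "gzip")
        .mode("overwrite")
        .save(path)
    )
```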

DataFrameWriter.saveAsTable(name[, format, …])

Saves the content of the DataFrame as the specified table.

DataFrameWriter.sortBy(col, *cols)

Sorts the output in each bucket by the given columns on the file system.
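
Bucketing and in-bucket sorting are used together with saveAsTable, since the bucket layout is recorded in the table metadata. The sketch below shows one possible combination; the helper name write_bucketed_table and the columns user_id and event_time are hypothetical, and df is assumed to be an existing DataFrame.

```python
def write_bucketed_table(df, table_name, num_buckets=8):
    """Save `df` as a table bucketed by user_id and sorted within buckets."""
    (
        df.write
        .bucketBy(num_buckets, "user_id")  # hash rows into a fixed number of buckets
        .sortBy("event_time")              # order rows inside each bucket
        .mode("overwrite")
        .saveAsTable(table_name)
    )
```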

DataFrameWriter.text(path[, compression, …])

Saves the content of the DataFrame in a text file at the specified path.