pyspark.sql.DataFrameReader.schema

DataFrameReader.schema(schema: Union[pyspark.sql.types.StructType, str]) → pyspark.sql.readwriter.DataFrameReader

Specifies the input schema.

Some data sources (e.g. JSON) can infer the input schema automatically from data. Specifying the schema here lets the underlying data source skip the schema inference step, which speeds up data loading.