pyspark.sql.DataFrame.pandas_api
DataFrame.pandas_api(index_col: Union[str, List[str], None] = None) → PandasOnSparkDataFrame
Converts the existing DataFrame into a pandas-on-Spark DataFrame.
If a pandas-on-Spark DataFrame is converted to a Spark DataFrame and then back to pandas-on-Spark, it will lose the index information and the original index will be turned into a normal column.
This method is only available if pandas is installed.
- Parameters
- index_col: str or list of str, optional, default: None
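Name(s) of the Spark column(s) to use as the index of the resulting pandas-on-Spark DataFrame. If None, a default index is attached.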
See also
pyspark.pandas.frame.DataFrame.to_spark
Examples
>>> df.show()
+----+----+
|Col1|Col2|
+----+----+
|   a|   1|
|   b|   2|
|   c|   3|
+----+----+
>>> df.pandas_api()
  Col1  Col2
0    a     1
1    b     2
2    c     3
We can specify the index columns.
>>> df.pandas_api(index_col="Col1")
      Col2
Col1
a        1
b        2
c        3