pyspark.sql.DataFrame.toLocalIterator

DataFrame.toLocalIterator(prefetchPartitions: bool = False) → Iterator[pyspark.sql.types.Row]

Returns an iterator over all of the rows in this DataFrame. The iterator consumes as much memory as the largest partition in this DataFrame. With prefetching enabled, it may consume up to the memory of the two largest partitions.

Parameters

prefetchPartitions : bool, optional
    Whether Spark should prefetch the next partition before it is needed.

Examples

>>> list(df.toLocalIterator())
[Row(age=2, name='Alice'), Row(age=5, name='Bob')]
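
Calling list() on the iterator, as above, pulls every row onto the driver at once, which gives up the memory bound described earlier. The following is a minimal sketch of consuming the iterator in fixed-size batches instead. The helper name iter_row_batches and its batch_size and prefetch parameters are hypothetical and not part of PySpark; only toLocalIterator and its prefetchPartitions argument come from the API documented here.

```python
from itertools import islice
from typing import Any, Iterator, List


def iter_row_batches(df: Any, batch_size: int, prefetch: bool = False) -> Iterator[List[Any]]:
    """Yield lists of up to `batch_size` rows without collecting the whole DataFrame.

    Hypothetical helper (not part of PySpark). `df` is assumed to be a
    pyspark.sql.DataFrame, or any object exposing
    toLocalIterator(prefetchPartitions=...).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Rows are delivered to the driver one partition at a time.
    rows = df.toLocalIterator(prefetchPartitions=prefetch)
    while True:
        # islice takes at most batch_size rows from the shared iterator.
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        yield batch
```

With a DataFrame df, a loop such as `for batch in iter_row_batches(df, 1000, prefetch=True): ...` would process 1000 rows at a time. On top of the partitions held by the iterator, the helper itself only keeps one batch in memory.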
