pyspark.RDD.cleanShuffleDependencies
RDD.cleanShuffleDependencies(blocking: bool = False) → None
Removes an RDD's shuffles and its non-persisted ancestors.
When running without a shuffle service, cleaning up shuffle files enables downscaling. If you intend to use the RDD after this call, checkpoint and materialize it first.
Parameters
blocking : bool, optional
    Whether to block on shuffle cleanup tasks. Disabled by default.
Notes
This API is a developer API.
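Examples
The following is a minimal sketch of the usage pattern described above: checkpoint the RDD, materialize it with an action, then clean its shuffle dependencies. The helper name `checkpoint_then_clean` and the `demo` driver are hypothetical and not part of PySpark. The sketch assumes a PySpark release that exposes `cleanShuffleDependencies`, that `checkpoint()` is called before any job has run on the RDD, and that a temporary local directory is acceptable as the checkpoint location.

```python
import tempfile


def checkpoint_then_clean(rdd, blocking=False):
    """Hypothetical helper: keep `rdd` usable, then remove its shuffle files.

    Assumes a checkpoint directory is already set on the SparkContext and
    that no job has run on `rdd` yet.
    """
    rdd.checkpoint()  # mark the RDD for checkpointing
    rdd.count()       # run an action so the checkpoint is actually written
    rdd.cleanShuffleDependencies(blocking=blocking)
    return rdd


def demo():
    """Illustrative driver; needs a working PySpark installation."""
    # Imported here so the helper above can be used without importing Spark.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    # Temporary local directory, for illustration only.
    sc.setCheckpointDir(tempfile.mkdtemp())

    counts = (
        sc.parallelize(range(100), 4)
        .map(lambda x: (x % 10, 1))
        .reduceByKey(lambda a, b: a + b)  # introduces a shuffle
    )

    # Block until the shuffle cleanup tasks have finished.
    checkpoint_then_clean(counts, blocking=True)

    # The RDD was checkpointed first, so it should still be usable here.
    return sorted(counts.collect())
```

Calling `demo()` runs the sketch against a local or already running `SparkContext`. Passing `blocking=True` makes the call wait for cleanup to finish, which may be convenient when executors are about to be released; the default returns without waiting.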