pyspark.RDDBarrier

class pyspark.RDDBarrier(rdd: pyspark.rdd.RDD[T])

Wraps an RDD in a barrier stage, which forces Spark to launch tasks of this stage together. RDDBarrier instances are created by RDD.barrier().

Notes

This API is experimental.

Methods

mapPartitions(f[, preservesPartitioning])

Returns a new RDD by applying a function to each partition of the wrapped RDD, where tasks are launched together in a barrier stage.
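As a minimal sketch of how this method might be used, the example below sums each partition inside a barrier stage. The helper names (sum_partition, synced_sum, run_barrier_sums) are hypothetical and not part of PySpark. The sketch assumes a local Spark installation and gives the local master as many slots as there are partitions, since a barrier stage can only start when all of its tasks can be scheduled at once.

```python
from typing import Iterable, Iterator


def sum_partition(rows: Iterable[int]) -> Iterator[int]:
    """Yield a single value for the partition: the sum of its elements."""
    yield sum(rows)


def synced_sum(rows: Iterable[int]) -> Iterator[int]:
    """Wait for every task in the barrier stage, then sum this partition."""
    # Imported lazily so the pure-Python helper above works without Spark.
    from pyspark import BarrierTaskContext

    # Blocks until all tasks in the same barrier stage reach this call.
    BarrierTaskContext.get().barrier()
    return sum_partition(rows)


def run_barrier_sums(num_partitions: int = 2) -> list:
    """Run synced_sum in a barrier stage on a local Spark (assumed installed)."""
    from pyspark import SparkContext

    # Assumption: one local slot per partition so all tasks can launch together.
    sc = SparkContext(f"local[{num_partitions}]", "barrier-sketch")
    try:
        rdd = sc.parallelize(range(8), num_partitions)
        # RDD.barrier() returns the RDDBarrier wrapper described on this page.
        return rdd.barrier().mapPartitions(synced_sum).collect()
    finally:
        sc.stop()
```

Calling run_barrier_sums() should return one sum per partition. The function passed to mapPartitions receives an iterator over the partition's elements and must itself return an iterable.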

mapPartitionsWithIndex(f[, …])

Returns a new RDD by applying a function to each partition of the wrapped RDD, while tracking the index of the original partition. As with mapPartitions, all tasks are launched together in a barrier stage.
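The index-aware variant can be sketched in the same way. Here the function takes the partition index as its first argument and the partition's iterator as its second. The names tag_with_index and run_barrier_tagging are hypothetical, and the sketch assumes a local Spark installation whose PySpark version provides RDDBarrier.mapPartitionsWithIndex.

```python
from typing import Iterable, Iterator, Tuple


def tag_with_index(index: int, rows: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Pair every element with the index of the partition it came from."""
    for row in rows:
        yield (index, row)


def run_barrier_tagging(num_partitions: int = 2) -> list:
    """Tag elements with their partition index inside a barrier stage."""
    from pyspark import SparkContext

    # Assumption: one local slot per partition so all tasks can launch together.
    sc = SparkContext(f"local[{num_partitions}]", "barrier-index-sketch")
    try:
        rdd = sc.parallelize(["a", "b", "c", "d"], num_partitions)
        return rdd.barrier().mapPartitionsWithIndex(tag_with_index).collect()
    finally:
        sc.stop()
```

Each returned tuple carries the partition index alongside the original element, which can help when debugging how data is distributed across the tasks of the stage.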