pyspark.RDD.barrier

RDD.barrier() → pyspark.rdd.RDDBarrier[T]

Marks the current stage as a barrier stage, in which Spark must launch all tasks together. If a task fails, Spark aborts the entire stage and relaunches all of its tasks, instead of restarting only the failed task. Barrier execution mode is experimental and handles only limited scenarios. Read the linked SPIP and design docs to understand its limitations and future plans.

Returns
RDDBarrier
    Instance that provides actions within a barrier stage.

Notes

For additional information, see the SPIP and design docs for barrier execution mode.

This API is experimental.