pyspark.Broadcast
class pyspark.Broadcast(sc: Optional[SparkContext] = None, value: Optional[T] = None, pickle_registry: Optional[BroadcastPickleRegistry] = None, path: Optional[str] = None, sock_file: Optional[BinaryIO] = None)

A broadcast variable created with SparkContext.broadcast(). Access its value through value.

Examples
>>> from pyspark.context import SparkContext
>>> sc = SparkContext('local', 'test')
>>> b = sc.broadcast([1, 2, 3, 4, 5])
>>> b.value
[1, 2, 3, 4, 5]
>>> sc.parallelize([0, 0]).flatMap(lambda x: b.value).collect()
[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
>>> b.unpersist()
>>> large_broadcast = sc.broadcast(range(10000))
Methods
destroy([blocking])
    Destroy all data and metadata related to this broadcast variable.
dump(value, f)
init_with_process_isolation(sc, value, …)
    Initializes the broadcast variable through a trusted file path.
load(file)
load_from_path(path)
unpersist([blocking])
    Delete cached copies of this broadcast on the executors.
Attributes
value
    Return the broadcasted value.