pyspark.SparkContext.pickleFile
SparkContext.pickleFile(name: str, minPartitions: Optional[int] = None) → pyspark.rdd.RDD[Any]

Load an RDD previously saved using the RDD.saveAsPickleFile() method.

Examples
>>> from tempfile import NamedTemporaryFile
>>> tmpFile = NamedTemporaryFile(delete=True)
>>> tmpFile.close()
>>> sc.parallelize(range(10)).saveAsPickleFile(tmpFile.name, 5)
>>> sorted(sc.pickleFile(tmpFile.name, 3).collect())
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]