pyspark.RDD.persist
RDD.persist(storageLevel: pyspark.storagelevel.StorageLevel = StorageLevel(False, True, False, False, 1)) → pyspark.rdd.RDD[T]

Set this RDD’s storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet. If no storage level is specified, it defaults to MEMORY_ONLY.
Examples
>>> rdd = sc.parallelize(["b", "a", "c"])
>>> rdd.persist().is_cached
True