pyspark.streaming.StreamingContext.checkpoint

StreamingContext.checkpoint(directory: str) → None

Configures the context to periodically checkpoint the DStream operations so that the driver (master) can recover after a failure. The DStream graph is checkpointed at every batch interval.

Parameters
directory : str

HDFS-compatible directory where the checkpoint data will be reliably stored
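
Examples

The following is a minimal sketch of how checkpoint is typically combined with StreamingContext.getOrCreate so that a restarted driver can rebuild its context from the checkpoint data. The helper names make_context and get_or_recover, the one-second batch interval, and the use of SparkContext.getOrCreate are assumptions made for illustration; they are not part of this API.

```python
def make_context(checkpoint_dir: str, batch_seconds: int = 1):
    """Build a new StreamingContext with checkpointing enabled.

    Hypothetical helper for illustration; not part of PySpark.
    """
    # Imported lazily so this module can be loaded without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext.getOrCreate()
    ssc = StreamingContext(sc, batch_seconds)
    # Enable periodic checkpointing of the DStream graph to the given
    # HDFS-compatible directory.
    ssc.checkpoint(checkpoint_dir)
    # Define the DStream operations here, before the context is started.
    return ssc


def get_or_recover(checkpoint_dir: str):
    """Recover a context from checkpoint data, or create a fresh one.

    Hypothetical helper for illustration; not part of PySpark.
    """
    from pyspark.streaming import StreamingContext

    # If checkpoint data exists in checkpoint_dir, the context is rebuilt
    # from it; otherwise the setup function is called to create a new one.
    return StreamingContext.getOrCreate(
        checkpoint_dir, lambda: make_context(checkpoint_dir)
    )
```

A driver program would then call get_or_recover with its checkpoint directory, followed by start() and awaitTermination() on the returned context. On a cluster, the directory should live on a fault-tolerant file system rather than on local disk.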