pyspark.streaming.StreamingContext.remember

StreamingContext.remember(duration: int) → None

Set each DStream in this context to remember the RDDs it generated in the last given duration. DStreams remember RDDs only for a limited duration and then release them for garbage collection. This method allows the developer to specify how long to remember the RDDs (if the developer wishes to query old data outside the DStream computation).

Parameters
duration : int

Minimum duration (in seconds) that each DStream should remember its RDDs
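
Examples

The following is a minimal sketch, not taken from the Spark documentation, of one way to choose the duration: as a whole number of batch intervals. The helper `remember_batches`, the function `demo`, the app name, and the 5-second batch interval are all hypothetical names and values picked for illustration. Only `SparkContext`, `StreamingContext`, and `remember` are part of PySpark. The sketch sets the duration before any stream is started.

```python
def remember_batches(ssc, batch_interval_s, n_batches):
    """Ask `ssc` to retain RDDs for `n_batches` batch intervals.

    Hypothetical helper: it only converts a batch count to seconds
    and forwards the result to StreamingContext.remember.
    """
    retention_s = batch_interval_s * n_batches
    ssc.remember(retention_s)  # duration is given in seconds
    return retention_s


def demo():
    # Imported here so that defining the helper above does not
    # require a Spark installation.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext("local[2]", "RememberSketch")  # assumed app name
    ssc = StreamingContext(sc, 5)  # assumed 5-second batch interval

    # Keep roughly the last 12 batches (60 seconds) of RDDs.
    remember_batches(ssc, 5, 12)

    # No streams are defined or started in this sketch; just clean up.
    sc.stop()
```

Calling `demo()` requires a local Spark installation. Expressing the retention as a multiple of the batch interval is a design choice of this sketch, not a requirement stated by the API.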