pyspark.streaming.DStream.updateStateByKey¶
-
DStream.
updateStateByKey
(updateFunc: Callable[[Iterable[V], Optional[S]], S], numPartitions: Optional[int] = None, initialRDD: Union[pyspark.rdd.RDD[Tuple[K, S]], Iterable[Tuple[K, S]], None] = None) → pyspark.streaming.dstream.DStream[Tuple[K, S]]¶ Return a new “state” DStream where the state for each key is updated by applying the given function on the previous state of the key and the new values of the key.
- Parameters
- updateFuncfunction
State update function. If this function returns None, then corresponding state key-value pair will be eliminated.