pyspark.streaming.DStream.transformWith

DStream.transformWith(func: Union[Callable[[pyspark.rdd.RDD[T], pyspark.rdd.RDD[U]], pyspark.rdd.RDD[V]], Callable[[datetime.datetime, pyspark.rdd.RDD[T], pyspark.rdd.RDD[U]], pyspark.rdd.RDD[V]]], other: pyspark.streaming.dstream.DStream[U], keepSerializer: bool = False) → pyspark.streaming.dstream.DStream[V]

Return a new DStream in which each RDD is generated by applying a function on each RDD of this DStream and ‘other’ DStream.

func can have two arguments of (rdd_a, rdd_b) or have three arguments of (time, rdd_a, rdd_b)