pyspark.RDD.cartesian
RDD.cartesian(other: pyspark.rdd.RDD[U]) → pyspark.rdd.RDD[Tuple[T, U]]

Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in self and b is in other.

Examples
>>> rdd = sc.parallelize([1, 2])
>>> sorted(rdd.cartesian(rdd).collect())
[(1, 1), (1, 2), (2, 1), (2, 2)]
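
The pair semantics can also be illustrated without a Spark cluster. The following is a minimal local sketch, not Spark's implementation: it uses the standard library's itertools.product on plain lists to produce the same set of (a, b) pairs that cartesian would return for two RDDs holding those elements. The names cartesian_local, left, and right are hypothetical and not part of PySpark.

```python
from itertools import product


def cartesian_local(left, right):
    """Local, non-distributed analogue of the pairing rule:
    every (a, b) with a taken from left and b taken from right."""
    return list(product(left, right))


# Two different inputs, to show that the first tuple slot always
# comes from the first collection and the second slot from the other.
left = [1, 2]
right = ["x", "y", "z"]

pairs = cartesian_local(left, right)
print(sorted(pairs))
# [(1, 'x'), (1, 'y'), (1, 'z'), (2, 'x'), (2, 'y'), (2, 'z')]

# The result has len(left) * len(right) elements.
print(len(pairs))
# 6
```

Because the number of pairs is the product of the two input sizes, the result can grow very quickly for large inputs.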