pyspark.RDD.cartesian
RDD.cartesian(other: pyspark.rdd.RDD[U]) → pyspark.rdd.RDD[Tuple[T, U]]

Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in self and b is in other.

Examples
>>> rdd = sc.parallelize([1, 2])
>>> sorted(rdd.cartesian(rdd).collect())
[(1, 1), (1, 2), (2, 1), (2, 2)]
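
The pairing described above can also be pictured without a Spark context. The following is a minimal sketch that models the two RDDs as plain Python lists and builds the pairs with `itertools.product`. The helper `cartesian_local` and the sample data are hypothetical names introduced for illustration. They are not part of the PySpark API, and the sketch says nothing about how Spark computes the result across partitions.

```python
from itertools import product


def cartesian_local(left, right):
    """Hypothetical helper: model the result of RDD.cartesian on local lists."""
    # Pair every element a of `left` with every element b of `right`.
    return [(a, b) for a, b in product(left, right)]


# Illustrative data: two collections with different element types.
numbers = [1, 2]
letters = ["x", "y", "z"]

pairs = cartesian_local(numbers, letters)

# Sort before printing, as the doctest does, so the output does not
# depend on the order in which the pairs were produced.
print(sorted(pairs))
# [(1, 'x'), (1, 'y'), (1, 'z'), (2, 'x'), (2, 'y'), (2, 'z')]

# The number of pairs is the product of the two input sizes.
print(len(pairs) == len(numbers) * len(letters))
# True
```

Because the result size is the product of the two input sizes, it grows quickly with either input, which is worth keeping in mind before calling `collect()` on a large product.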