pyspark.sql.functions.zip_with

pyspark.sql.functions.zip_with(left: ColumnOrName, right: ColumnOrName, f: Callable[[pyspark.sql.column.Column, pyspark.sql.column.Column], pyspark.sql.column.Column]) → pyspark.sql.column.Column

Merge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.

Parameters
leftColumn or str

name of the first column or expression

rightColumn or str

name of the second column or expression

ffunction

a binary function (x1: Column, x2: Column) -> Column... Can use methods of Column, functions defined in pyspark.sql.functions and Scala UserDefinedFunctions. Python UserDefinedFunctions are not supported (SPARK-27052).

Returns
Column

Examples

>>> df = spark.createDataFrame([(1, [1, 3, 5, 8], [0, 2, 4, 6])], ("id", "xs", "ys"))
>>> df.select(zip_with("xs", "ys", lambda x, y: x ** y).alias("powers")).show(truncate=False)
+---------------------------+
|powers                     |
+---------------------------+
|[1.0, 9.0, 625.0, 262144.0]|
+---------------------------+
>>> df = spark.createDataFrame([(1, ["foo", "bar"], [1, 2, 3])], ("id", "xs", "ys"))
>>> df.select(zip_with("xs", "ys", lambda x, y: concat_ws("_", x, y)).alias("xs_ys")).show()
+-----------------+
|            xs_ys|
+-----------------+
|[foo_1, bar_2, 3]|
+-----------------+