pyspark.sql.functions.zip_with¶
-
pyspark.sql.functions.
zip_with
(left: ColumnOrName, right: ColumnOrName, f: Callable[[pyspark.sql.column.Column, pyspark.sql.column.Column], pyspark.sql.column.Column]) → pyspark.sql.column.Column¶ Merge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.
- Parameters
- left
Column
or str name of the first column or expression
- right
Column
or str name of the second column or expression
- ffunction
a binary function
(x1: Column, x2: Column) -> Column...
Can use methods ofColumn
, functions defined inpyspark.sql.functions
and ScalaUserDefinedFunctions
. PythonUserDefinedFunctions
are not supported (SPARK-27052).
- left
- Returns
Examples
>>> df = spark.createDataFrame([(1, [1, 3, 5, 8], [0, 2, 4, 6])], ("id", "xs", "ys")) >>> df.select(zip_with("xs", "ys", lambda x, y: x ** y).alias("powers")).show(truncate=False) +---------------------------+ |powers | +---------------------------+ |[1.0, 9.0, 625.0, 262144.0]| +---------------------------+
>>> df = spark.createDataFrame([(1, ["foo", "bar"], [1, 2, 3])], ("id", "xs", "ys")) >>> df.select(zip_with("xs", "ys", lambda x, y: concat_ws("_", x, y)).alias("xs_ys")).show() +-----------------+ | xs_ys| +-----------------+ |[foo_1, bar_2, 3]| +-----------------+