pyspark.sql.functions.array_except

pyspark.sql.functions.array_except(col1: ColumnOrName, col2: ColumnOrName) → pyspark.sql.column.Column

Collection function: returns an array of the elements in col1 but not in col2, without duplicates.

Parameters
col1 : Column or str
    column, or name of a column, containing the array to take elements from

col2 : Column or str
    column, or name of a column, containing the array of elements to exclude

Examples

>>> from pyspark.sql import Row
>>> from pyspark.sql.functions import array_except
>>> df = spark.createDataFrame([Row(c1=["b", "a", "c"], c2=["c", "d", "a", "f"])])
>>> df.select(array_except(df.c1, df.c2)).collect()
[Row(array_except(c1, c2)=['b'])]
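
The "in col1 but not in col2, without duplicates" rule can be easier to reason about with a small model that runs without a Spark session. The sketch below is plain Python and is not Spark's implementation; the helper name array_except_model is hypothetical. It assumes that the result keeps the first-occurrence order of col1, that a null (None) input array yields a null result, and that elements are hashable. Verify these assumptions against your Spark version before relying on them.

```python
from typing import Hashable, List, Optional, Sequence


def array_except_model(
    a: Optional[Sequence[Hashable]],
    b: Optional[Sequence[Hashable]],
) -> Optional[List[Hashable]]:
    """Pure-Python model of array_except (hypothetical helper, not a Spark API).

    Returns the elements of `a` that do not occur in `b`, each at most once.
    """
    # Assumption: a null input array produces a null result.
    if a is None or b is None:
        return None

    excluded = set(b)   # elements that must not appear in the result
    seen = set()        # elements already emitted, used to drop duplicates
    result = []
    for item in a:
        if item in excluded or item in seen:
            continue
        seen.add(item)
        result.append(item)  # assumption: first-occurrence order of `a` is kept
    return result


# Same data as the example above.
print(array_except_model(["b", "a", "c"], ["c", "d", "a", "f"]))  # ['b']
# Duplicates in the first array are collapsed.
print(array_except_model(["b", "a", "b", "c"], ["c"]))  # ['b', 'a']
```

The operation is not symmetric: swapping the two arguments in the first call would give the elements of the second array that are missing from the first.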