pyspark.pandas.DataFrame.isin

DataFrame.isin(values: Union[List, Dict]) → pyspark.pandas.frame.DataFrame

Whether each element in the DataFrame is contained in values.

Parameters
values : iterable or dict

The sequence of values to test. If values is a dict, its keys must be column names of the DataFrame, and each column is tested against the values listed under its key. Series and DataFrame are not supported.

Returns
DataFrame

DataFrame of booleans showing whether each element in the DataFrame is contained in values.

Examples

>>> df = ps.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
...                   index=['falcon', 'dog'],
...                   columns=['num_legs', 'num_wings'])
>>> df
        num_legs  num_wings
falcon         2          2
dog            4          0

When values is a list, check whether every value in the DataFrame is present in the list (which animals have 0 or 2 legs or wings):

>>> df.isin([0, 2])
        num_legs  num_wings
falcon      True       True
dog        False       True

When values is a dict, we can pass values to check for each column separately:

>>> df.isin({'num_wings': [0, 3]})
        num_legs  num_wings
falcon     False      False
dog        False       True
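
The two call forms above can be modelled in plain Python to make the element-wise semantics explicit. This is only an illustrative sketch, not the pandas-on-Spark implementation: `isin_model` and the dict-of-lists `table` are hypothetical stand-ins for the DataFrame, and the model skips any validation of dict keys that the real method may perform.

```python
from typing import Dict, List, Union

# Hypothetical stand-in for a DataFrame: column name -> list of cell values.
Table = Dict[str, List[object]]


def isin_model(table: Table, values: Union[list, dict]) -> Dict[str, List[bool]]:
    """Pure-Python model of the element-wise membership test (illustration only)."""
    if isinstance(values, dict):
        # Dict form: each column is tested only against its own list;
        # columns that are not keys of the dict come out all False.
        return {
            col: [cell in values[col] if col in values else False for cell in cells]
            for col, cells in table.items()
        }
    # List form: every cell is tested against the same collection.
    return {col: [cell in values for cell in cells] for col, cells in table.items()}


# Same data as the examples above (rows: falcon, dog).
table = {'num_legs': [2, 4], 'num_wings': [2, 0]}
list_result = isin_model(table, [0, 2])
dict_result = isin_model(table, {'num_wings': [0, 3]})
```

Reading the results column by column reproduces the boolean frames shown in the examples: with the dict form, `num_legs` is False throughout because it has no entry in the dict.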