pyspark.pandas.Index.symmetric_difference

Index.symmetric_difference(other: pyspark.pandas.indexes.base.Index, result_name: Union[Any, Tuple[Any, …], None] = None, sort: Optional[bool] = None) → pyspark.pandas.indexes.base.Index

Compute the symmetric difference of two Index objects.

Parameters

other : Index or array-like
result_name : str
sort : True or None, default None
    Whether to sort the resulting index.
    * True : Attempt to sort the result.
    * None : Do not sort the result.

Returns

symmetric_difference : Index

Notes

symmetric_difference contains the elements that appear in either idx1 or idx2, but not in both. It is equivalent to the Index created by idx1.difference(idx2) | idx2.difference(idx1), with duplicates dropped.
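
The identity in this note can be checked with a minimal sketch using plain Python sets rather than pandas-on-Spark itself. The lists idx1 and idx2 are illustrative stand-ins for the labels of two Index objects, and the built-in set operators are assumed here only to mirror the Index semantics, not to reproduce the library's implementation.

```python
# Illustrative stand-ins for the labels held by two Index objects.
idx1 = [1, 2, 3, 4]
idx2 = [2, 3, 4, 5]

# Labels found only in idx1, and labels found only in idx2.
only_in_1 = set(idx1) - set(idx2)
only_in_2 = set(idx2) - set(idx1)

# Their union is the symmetric difference; using sets drops duplicates.
sym_diff = only_in_1 | only_in_2

# Python's set operator ^ yields the same result directly.
assert sym_diff == set(idx1) ^ set(idx2)

print(sorted(sym_diff))  # [1, 5]
```

Sets are unordered, so the sketch sorts before printing, much as sort=True is needed to get a deterministic order from the Index method.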

Examples

>>> s1 = ps.Series([1, 2, 3, 4], index=[1, 2, 3, 4])
>>> s2 = ps.Series([1, 2, 3, 4], index=[2, 3, 4, 5])

>>> s1.index.symmetric_difference(s2.index)  
Int64Index([5, 1], dtype='int64')

You can set the name of the resulting Index.

>>> s1.index.symmetric_difference(s2.index, result_name='pandas-on-Spark')  
Int64Index([5, 1], dtype='int64', name='pandas-on-Spark')

Set sort to True to sort the resulting index.

>>> s1.index.symmetric_difference(s2.index, sort=True)
Int64Index([1, 5], dtype='int64')

You can also use the ^ operator:

>>> s1.index ^ s2.index  
Int64Index([5, 1], dtype='int64')
