pyspark.pandas.DataFrame.replace

DataFrame.replace(to_replace: Union[Any, List, Tuple, Dict, None] = None, value: Optional[Any] = None, inplace: bool = False, limit: Optional[int] = None, regex: bool = False, method: str = 'pad') → Optional[pyspark.pandas.frame.DataFrame]

Returns a new DataFrame in which occurrences of to_replace are replaced with value.

Parameters
to_replace : int, float, string, list, tuple or dict

Value to be replaced.

value : int, float, string, list or tuple

Value to use as the replacement. A scalar value must be an int, float, or string. If value is a list or tuple, it must have the same length as to_replace.

inplace : boolean, default False

Replace values in place (do not create a new object).

Returns
DataFrame

Object after replacement, or None if inplace=True.

Examples

>>> df = ps.DataFrame({"name": ['Ironman', 'Captain America', 'Thor', 'Hulk'],
...                    "weapon": ['Mark-45', 'Shield', 'Mjolnir', 'Smash']},
...                   columns=['name', 'weapon'])
>>> df
              name   weapon
0          Ironman  Mark-45
1  Captain America   Shield
2             Thor  Mjolnir
3             Hulk    Smash

Scalar to_replace and value

>>> df.replace('Ironman', 'War-Machine')
              name   weapon
0      War-Machine  Mark-45
1  Captain America   Shield
2             Thor  Mjolnir
3             Hulk    Smash

List-like to_replace and value

>>> df.replace(['Ironman', 'Captain America'], ['Rescue', 'Hawkeye'], inplace=True)
>>> df
      name   weapon
0   Rescue  Mark-45
1  Hawkeye   Shield
2     Thor  Mjolnir
3     Hulk    Smash

Dicts can be used to specify different replacement values for different existing values. To use a dict in this way, the value parameter should be None.

>>> df.replace({'Mjolnir': 'Stormbuster'})
      name       weapon
0   Rescue      Mark-45
1  Hawkeye       Shield
2     Thor  Stormbuster
3     Hulk        Smash

A dict can specify that different values should be replaced in different columns. The value parameter should not be None in this case.

>>> df.replace({'weapon': 'Mjolnir'}, 'Stormbuster')
      name       weapon
0   Rescue      Mark-45
1  Hawkeye       Shield
2     Thor  Stormbuster
3     Hulk        Smash

Nested dictionaries of the form {column: {old: new}} are also accepted. The value parameter should be None to use a nested dict in this way.

>>> df.replace({'weapon': {'Mjolnir': 'Stormbuster'}})
      name       weapon
0   Rescue      Mark-45
1  Hawkeye       Shield
2     Thor  Stormbuster
3     Hulk        Smash
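
The call forms shown above differ only in how to_replace and value are read. The following is a minimal pure-Python sketch of that interpretation, using a plain dict of lists as a stand-in for a DataFrame. The names replace_model and Table are hypothetical and exist only for illustration; this is not how pandas-on-Spark implements replace, and it models only the five forms shown in the examples.

```python
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a DataFrame: column name -> list of cell values.
Table = Dict[str, List[Any]]


def replace_model(table: Table, to_replace: Any, value: Optional[Any] = None) -> Table:
    """Illustrative model of the documented call forms (not the library's code)."""
    # Normalise every form to a per-column mapping {column: {old: new}}.
    if isinstance(to_replace, dict):
        if value is not None:
            # {'column': old} with a value: replace `old` only in that column.
            per_column = {col: {old: value} for col, old in to_replace.items()}
        elif all(isinstance(v, dict) for v in to_replace.values()):
            # Nested dict {'column': {old: new}} with value=None.
            per_column = dict(to_replace)
        else:
            # Flat dict {old: new} with value=None: applies to every column.
            per_column = {col: to_replace for col in table}
    elif isinstance(to_replace, (list, tuple)):
        # This sketch models only the paired list form; lengths must match.
        if not isinstance(value, (list, tuple)) or len(to_replace) != len(value):
            raise ValueError("value must be a list or tuple as long as to_replace")
        mapping = dict(zip(to_replace, value))
        per_column = {col: mapping for col in table}
    else:
        # Scalar to_replace and scalar value: applies to every column.
        per_column = {col: {to_replace: value} for col in table}

    # Build a new table; the input is left unchanged (like inplace=False).
    return {
        col: [per_column.get(col, {}).get(cell, cell) for cell in cells]
        for col, cells in table.items()
    }


heroes = {
    "name": ["Rescue", "Hawkeye", "Thor", "Hulk"],
    "weapon": ["Mark-45", "Shield", "Mjolnir", "Smash"],
}
flat = replace_model(heroes, {"Mjolnir": "Stormbuster"})
by_column = replace_model(heroes, {"weapon": "Mjolnir"}, "Stormbuster")
nested = replace_model(heroes, {"weapon": {"Mjolnir": "Stormbuster"}})
print(flat == by_column == nested)  # True
```

On this data the three dict forms give the same result because 'Mjolnir' occurs only in the weapon column. The flat form would also replace a matching value in any other column, while the other two are restricted to the named column.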