pyspark.pandas.MultiIndex.from_frame

static MultiIndex.from_frame(df: pyspark.pandas.frame.DataFrame, names: Optional[List[Union[Any, Tuple[Any, ...]]]] = None) → pyspark.pandas.indexes.multi.MultiIndex

Make a MultiIndex from a DataFrame.

Parameters

df (DataFrame): DataFrame to be converted to MultiIndex.
names (list-like, optional): If no names are provided, the column names are used, or tuples of column names if the columns are a MultiIndex. If a sequence is given, it replaces the column names.

Returns

MultiIndex: The MultiIndex representation of the given DataFrame.

See also

MultiIndex.from_arrays: Convert list of arrays to MultiIndex.
MultiIndex.from_tuples: Convert list of tuples to MultiIndex.
MultiIndex.from_product: Make a MultiIndex from cartesian product of iterables.

Examples

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame([['HI', 'Temp'], ['HI', 'Precip'],
...                    ['NJ', 'Temp'], ['NJ', 'Precip']],
...                   columns=['a', 'b'])
>>> df
      a       b
0    HI    Temp
1    HI  Precip
2    NJ    Temp
3    NJ  Precip

>>> ps.MultiIndex.from_frame(df)
MultiIndex([('HI',   'Temp'),
            ('HI', 'Precip'),
            ('NJ',   'Temp'),
            ('NJ', 'Precip')],
           names=['a', 'b'])

Using explicit names instead of the column names:

>>> ps.MultiIndex.from_frame(df, names=['state', 'observation'])
MultiIndex([('HI',   'Temp'),
            ('HI', 'Precip'),
            ('NJ',   'Temp'),
            ('NJ', 'Precip')],
           names=['state', 'observation'])
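
The remaining case from the `names` description, a frame whose columns are themselves a MultiIndex, can be sketched as follows. This is a minimal illustration and makes two assumptions: it uses plain pandas, whose `MultiIndex.from_frame` this method mirrors, so that it runs without a Spark session, and the two-level column labels (`'loc'`, `'obs'`, and so on) are made up for the example. Swapping in `import pyspark.pandas as ps` is expected to give the same names, but that is not verified here.

```python
import pandas as pd

# Hypothetical two-level column labels, chosen only for illustration.
columns = pd.MultiIndex.from_tuples([("loc", "state"), ("obs", "kind")])
df = pd.DataFrame(
    [["HI", "Temp"], ["HI", "Precip"], ["NJ", "Temp"], ["NJ", "Precip"]],
    columns=columns,
)

# No names given: each index level is named by the full column label tuple.
mi_default = pd.MultiIndex.from_frame(df)
default_names = list(mi_default.names)

# Explicit names replace the tuple labels.
mi_named = pd.MultiIndex.from_frame(df, names=["state", "observation"])
explicit_names = list(mi_named.names)

print(default_names)
print(explicit_names)
```

Passing `names` is usually the more readable choice when the columns are a MultiIndex, since tuple-valued level names are awkward to refer to later.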
