pyspark.pandas.DataFrame.corrwith
DataFrame.corrwith(other: Union[DataFrame, Series], drop: bool = False, method: str = 'pearson') → Series
Compute pairwise correlation.
Pairwise correlation is computed between rows or columns of DataFrame with rows or columns of Series or DataFrame. DataFrames are first aligned along both axes before computing the correlations.
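The label alignment described above can be illustrated without Spark. The following is a minimal plain-Python sketch, not the library's implementation: `align_labels` is a hypothetical helper that only shows which column labels have a partner in both objects (and so receive a computed correlation) and which do not (and so appear as NaN, or are omitted when `drop=True`). Row alignment is not modelled here.

```python
def align_labels(left_cols, right_cols, drop=False):
    """Hypothetical helper: split column labels into shared and unmatched ones."""
    # Labels present in both frames can be correlated pairwise.
    shared = [c for c in left_cols if c in right_cols]
    if drop:
        # With drop=True, labels without a partner are left out of the result.
        return shared, []
    # Labels present in only one of the two frames have no partner column.
    unmatched = [c for c in left_cols if c not in right_cols]
    unmatched += [c for c in right_cols if c not in left_cols]
    return shared, unmatched


shared, unmatched = align_labels(["A", "X", "C"], ["A", "B", "C"])
print(shared)     # labels that get a correlation value
print(unmatched)  # labels reported as NaN unless drop=True
```

With the column sets used in the Examples section, the shared labels are `A` and `C`, while `X` and `B` have no partner.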
- Parameters
- other : DataFrame, Series
Object with which to compute correlations.
- drop : bool, default False
Drop missing indices from result.
- method : str, default 'pearson'
Method of correlation, one of:
pearson : standard correlation coefficient
- Returns
- Series
Pairwise correlations.
See also
DataFrame.corr
Compute pairwise correlation of columns.
Examples
>>> df1 = ps.DataFrame({
...         "A":[1, 5, 7, 8],
...         "X":[5, 8, 4, 3],
...         "C":[10, 4, 9, 3]})
>>> df1.corrwith(df1[["X", "C"]])
X    1.0
C    1.0
A    NaN
dtype: float64

>>> df2 = ps.DataFrame({
...         "A":[5, 3, 6, 4],
...         "B":[11, 2, 4, 3],
...         "C":[4, 3, 8, 5]})

>>> with ps.option_context("compute.ops_on_diff_frames", True):
...     df1.corrwith(df2)
A   -0.041703
C    0.395437
X         NaN
B         NaN
dtype: float64

>>> with ps.option_context("compute.ops_on_diff_frames", True):
...     df2.corrwith(df1.X)
A   -0.597614
B   -0.151186
C   -0.642857
dtype: float64
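To see where the non-NaN values in these examples come from, the Pearson coefficient can be recomputed by hand. This is a plain-Python sketch that assumes the rows are already aligned by position; `pearson` is a hypothetical helper written for illustration, not part of pyspark.pandas, and the dictionaries simply mirror the columns of `df1` and `df2`.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Plain dictionaries standing in for the columns of df1 and df2.
df1 = {"A": [1, 5, 7, 8], "X": [5, 8, 4, 3], "C": [10, 4, 9, 3]}
df2 = {"A": [5, 3, 6, 4], "B": [11, 2, 4, 3], "C": [4, 3, 8, 5]}

# Columns shared by both frames, as in df1.corrwith(df2).
for name in ("A", "C"):
    print(name, round(pearson(df1[name], df2[name]), 6))

# Every column of df2 against the single series df1.X, as in df2.corrwith(df1.X).
for name in ("A", "B", "C"):
    print(name, round(pearson(df2[name], df1["X"]), 6))
```

The normalization by `n` (or `n - 1`) cancels between numerator and denominator, which is why the sketch works directly with unnormalized sums.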