pyspark.sql.functions.corr
pyspark.sql.functions.corr(col1: ColumnOrName, col2: ColumnOrName) → pyspark.sql.column.Column

Returns a new Column for the Pearson Correlation Coefficient for col1 and col2.

Examples
>>> from pyspark.sql.functions import corr
>>> a = range(20)
>>> b = [2 * x for x in range(20)]
>>> df = spark.createDataFrame(zip(a, b), ["a", "b"])
>>> df.agg(corr("a", "b").alias('c')).collect()
[Row(c=1.0)]
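
To make the result above easier to interpret, the following is a minimal plain-Python sketch of the textbook Pearson formula (the sum of products of deviations divided by the square root of the product of the summed squared deviations). It is an illustration only, not Spark's implementation; the helper name `pearson` is hypothetical and is not part of PySpark.

```python
import math


def pearson(xs, ys):
    """Textbook Pearson correlation of two equal-length numeric sequences.

    Hypothetical helper for illustration; not a PySpark API.
    """
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: summed squared deviations of each sequence.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    # Raises ZeroDivisionError if either sequence is constant.
    return cov / math.sqrt(var_x * var_y)


# Same data as the example above: b is an exact linear function of a.
a = range(20)
b = [2 * x for x in range(20)]
r = pearson(a, b)
```

Because b is a positive multiple of a, `r` comes out as 1.0 up to floating-point rounding, which agrees with the `Row(c=1.0)` result shown in the example. Reversing one of the sequences would give a value of about -1.0 instead.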