pyspark.RDD.stats

RDD.stats() → pyspark.statcounter.StatCounter

Return a StatCounter object that captures the count, mean, and variance of the RDD’s elements, computed in a single pass over the data.