Grouping

`GroupedData.agg`(*exprs)	Computes aggregates and returns the result as a `DataFrame`.
`GroupedData.apply`(udf)	It is an alias of `pyspark.sql.GroupedData.applyInPandas()`; however, it takes a `pyspark.sql.functions.pandas_udf()` whereas `pyspark.sql.GroupedData.applyInPandas()` takes a Python native function.
`GroupedData.applyInPandas`(func, schema)	Maps each group of the current `DataFrame` using a pandas udf and returns the result as a DataFrame.
`GroupedData.avg`(*cols)	Computes the average value for each numeric column for each group.
`GroupedData.cogroup`(other)	Cogroups this group with another group so that we can run cogrouped operations.
`GroupedData.count`()	Counts the number of records for each group.
`GroupedData.max`(*cols)	Computes the max value for each numeric column for each group.
`GroupedData.mean`(*cols)	Computes the average value for each numeric column for each group.
`GroupedData.min`(*cols)	Computes the min value for each numeric column for each group.
`GroupedData.pivot`(pivot_col[, values])	Pivots a column of the current `DataFrame` and performs the specified aggregation.
`GroupedData.sum`(*cols)	Computes the sum for each numeric column for each group.
`PandasCogroupedOps.applyInPandas`(func, schema)	Applies a function to each cogroup using pandas and returns the result as a DataFrame.
