pyspark.sql.functions.grouping_id¶

pyspark.sql.functions.grouping_id(*cols: ColumnOrName) → pyspark.sql.column.Column¶

Aggregate function: returns the level of grouping, equal to
(grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + … + grouping(cn)
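With n = 2 grouping columns the formula reduces to (grouping(c1) << 1) + grouping(c2), so the four cube combinations map to the integers 0 through 3. A plain-Python sketch of that bit arithmetic (the helper name `grouping_id_value` is illustrative, not part of the PySpark API):

```python
# Illustrative arithmetic only: grouping(c) is 1 when column c is
# aggregated away (rolled up) and 0 when it is a real group key.
def grouping_id_value(grouping_bits):
    """Combine per-column grouping bits into the grouping_id integer."""
    n = len(grouping_bits)
    return sum(bit << (n - 1 - i) for i, bit in enumerate(grouping_bits))

# For two grouping columns (c1, c2), cube produces four combinations:
print(grouping_id_value([0, 0]))  # both are group keys      -> 0
print(grouping_id_value([0, 1]))  # c2 aggregated away       -> 1
print(grouping_id_value([1, 0]))  # c1 aggregated away       -> 2
print(grouping_id_value([1, 1]))  # grand-total row          -> 3
```

The leftmost grouping column occupies the highest bit, so rows can be filtered by which columns were rolled up using simple bit masks.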
Notes
The list of columns must match the grouping columns exactly, or be empty (meaning all the grouping columns).
Examples
>>> df.cube("name").agg(grouping_id(), sum("age")).orderBy("name").show()
+-----+-------------+--------+
| name|grouping_id()|sum(age)|
+-----+-------------+--------+
| null|            1|       7|
|Alice|            0|       2|
|  Bob|            0|       5|
+-----+-------------+--------+