pyspark.pandas.groupby.GroupBy.cumsum
GroupBy.cumsum() → FrameLike
Cumulative sum for each group.
Returns
    Series or DataFrame
See also
Series.cumsum
DataFrame.cumsum
Examples
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame(
...     [[1, None, 4], [1, 0.1, 3], [1, 20.0, 2], [4, 10.0, 1]],
...     columns=list('ABC'))
>>> df
   A     B  C
0  1   NaN  4
1  1   0.1  3
2  1  20.0  2
3  4  10.0  1
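By default, the sum accumulates down the rows within each group, separately for each column. The grouping column is left out of the result, and a missing value stays missing at its own position.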
>>> df.groupby("A").cumsum().sort_index()
      B  C
0   NaN  4
1   0.1  7
2  20.1  9
3  10.0  1
It works the same way on a Series:
>>> df.B.groupby(df.A).cumsum().sort_index()
0     NaN
1     0.1
2    20.1
3    10.0
Name: B, dtype: float64
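To make the per-group semantics concrete, here is a minimal plain-Python sketch that reproduces the values in the examples above. It is only an illustration of the result, not how pandas-on-Spark computes it. The helper name `grouped_cumsum` is hypothetical, and the sketch assumes missing values are skipped in the running total while remaining missing in the output, as the example output for column B suggests.

```python
import math
from collections import defaultdict


def grouped_cumsum(keys, values):
    """Running sum of `values` within each group named by `keys`.

    Hypothetical helper for illustration only; it is not part of pyspark.
    """
    totals = defaultdict(float)  # one running total per group key
    out = []
    for key, value in zip(keys, values):
        # Assumption: a missing value does not change the running total
        # and is reported as NaN at its own position.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            out.append(math.nan)
            continue
        totals[key] += value
        out.append(totals[key])
    return out


# Same data as the DataFrame in the examples (columns A, B, C).
a = [1, 1, 1, 4]
b = [None, 0.1, 20.0, 10.0]
c = [4, 3, 2, 1]

cumsum_b = grouped_cumsum(a, b)
cumsum_c = grouped_cumsum(a, c)
```

Rows 0 to 2 share the key `A == 1`, so their values accumulate together, while row 3 (`A == 4`) starts a new running total. The output keeps the original row order, which is why the documented examples call `sort_index()` only to make the displayed order deterministic.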