pyspark.pandas.DataFrame.last

DataFrame.last(offset: Union[str, pandas._libs.tslibs.offsets.DateOffset]) → pyspark.pandas.frame.DataFrame

Select final periods of time series data based on a date offset.

For a DataFrame with a DatetimeIndex, this function selects the trailing rows whose index values fall within the given date offset, measured back from the last index value.

Parameters
offset : str or DateOffset

The offset length of the data to select. For instance, '3D' selects all rows whose index falls within the last 3 days.

Returns
DataFrame

A subset of the caller.

Raises
TypeError

If the index is not a DatetimeIndex.

Examples

>>> import pandas as pd
>>> import pyspark.pandas as ps
>>> index = pd.date_range('2018-04-09', periods=4, freq='2D')
>>> psdf = ps.DataFrame({'A': [1, 2, 3, 4]}, index=index)
>>> psdf
            A
2018-04-09  1
2018-04-11  2
2018-04-13  3
2018-04-15  4

Get the rows for the last 3 days:

>>> psdf.last('3D')
            A
2018-04-13  3
2018-04-15  4

Notice that the data for the last 3 calendar days was returned, not the last 3 observed days in the dataset; therefore, the row for 2018-04-11 was not returned.
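
The calendar-window behavior described above can be reproduced by hand. The following is a minimal sketch in plain pandas, not the pandas-on-Spark implementation: it assumes a fixed-length offset (here 3 days, written as a `pd.Timedelta`) and a sorted index, and the names `cutoff` and `selected` are illustrative only.

```python
import pandas as pd

# Same data as in the example above, built as a plain pandas DataFrame.
index = pd.date_range('2018-04-09', periods=4, freq='2D')
pdf = pd.DataFrame({'A': [1, 2, 3, 4]}, index=index)

# Assumption: a fixed-length offset of 3 days, standing in for '3D'.
offset = pd.Timedelta(days=3)

# The window is measured back from the last index value (2018-04-15),
# so the cutoff falls on 2018-04-12.
cutoff = pdf.index[-1] - offset

# Keep only rows strictly after the cutoff: 2018-04-13 and 2018-04-15.
selected = pdf[pdf.index > cutoff]
print(selected)
```

Because the cutoff is computed from the last timestamp rather than from a row count, the row for 2018-04-11 lies outside the window even though it is one of the last three observations.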