pyspark.pandas.Series.asof

Series.asof(where: Union[Any, List]) → Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None, pyspark.pandas.series.Series]

Return the last row(s) without any NaNs before where.

For each element in where (if a list), the last row without any NaN is taken.

If there is no good value, NaN is returned.
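The lookup rule above can be sketched in plain Python. This is an illustrative helper (not part of the pyspark.pandas API): for a single where value, it scans a sorted index and keeps the last non-NaN value whose label is less than or equal to where.

```python
import math

def asof_lookup(index, values, where):
    """Return the last non-NaN value whose index label is <= where;
    NaN if no such value exists. Assumes `index` is sorted ascending.
    Illustrative sketch only, not the pyspark.pandas implementation."""
    result = math.nan
    for label, value in zip(index, values):
        if label > where:
            break  # past `where`: keep the last good value seen
        if not math.isnan(value):
            result = value
    return result

# Mirrors the doctest below: values [1, 2, NaN, 4] at index [10, 20, 30, 40]
print(asof_lookup([10, 20, 30, 40], [1.0, 2.0, math.nan, 4.0], 20))  # 2.0
print(asof_lookup([10, 20, 30, 40], [1.0, 2.0, math.nan, 4.0], 30))  # 2.0 (NaN at 30 is skipped)
print(asof_lookup([10, 20, 30, 40], [1.0, 2.0, math.nan, 4.0], 5))   # nan (no index value <= 5)
```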

Note

This API is dependent on Index.is_monotonic_increasing() which is expensive.

Parameters
where : index or array-like of indices
Returns
scalar or Series

The return can be:

  • scalar: when self is a Series and where is a scalar

  • Series: when self is a Series and where is an array-like


Notes

Indices are assumed to be sorted. Raises if this is not the case and the config ‘compute.eager_check’ is True. If ‘compute.eager_check’ is False, pandas-on-Spark proceeds anyway, ignoring the order of the indices, which may produce an incorrect result.
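The trade-off described above can be sketched with a hypothetical helper (pure Python, not part of the pyspark.pandas API): with eager checking on, an unsorted index is rejected up front; with it off, the lookup simply proceeds.

```python
def monotonic_check(index, eager_check=True):
    """Sketch of the eager-check behavior described in the note above
    (hypothetical helper, not the actual pyspark.pandas implementation)."""
    is_sorted = all(a <= b for a, b in zip(index, index[1:]))
    if eager_check and not is_sorted:
        # Eager mode: fail fast instead of returning a possibly wrong answer.
        raise ValueError("asof requires a sorted index")
    return is_sorted

print(monotonic_check([10, 20, 30, 40]))                     # True
print(monotonic_check([10, 30, 20, 40], eager_check=False))  # False, but no error
```

With `eager_check=False` the unsorted index passes through silently, mirroring what ‘compute.eager_check’ controls.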

Examples

>>> import numpy as np
>>> import pyspark.pandas as ps
>>> s = ps.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])
>>> s
10    1.0
20    2.0
30    NaN
40    4.0
dtype: float64

A scalar where.

>>> s.asof(20)
2.0

For a sequence where, a Series is returned. The first value is NaN, because the first element of where is before the first index value.

>>> s.asof([5, 20]).sort_index()
5     NaN
20    2.0
dtype: float64

Missing values are not considered. The following is 2.0, not NaN, even though NaN is at the index location for 30.

>>> s.asof(30)
2.0

If the index is not sorted and ‘compute.eager_check’ is False, no error is raised and the lookup proceeds on the index as-is:

>>> s = ps.Series([1, 2, np.nan, 4], index=[10, 30, 20, 40])
>>> with ps.option_context("compute.eager_check", False):
...     s.asof(20)
...
1.0