pyspark.pandas.Series.quantile
Series.quantile(q: Union[float, Iterable[float]] = 0.5, accuracy: int = 10000) → Union[int, float, bool, str, bytes, decimal.Decimal, datetime.date, datetime.datetime, None, pyspark.pandas.series.Series]
Return value at the given quantile.
Note
Unlike pandas, pandas-on-Spark returns an approximate quantile, derived from an approximate percentile computation, because computing an exact quantile across a large dataset is extremely expensive.
- Parameters
- q : float or array-like, default 0.5 (50% quantile)
0 <= q <= 1, the quantile(s) to compute.
- accuracy : int, optional
Accuracy of the approximation, default 10000. A larger value gives better accuracy; the relative error is 1.0 / accuracy.
- Returns
- float or Series
If the current object is a Series and q is an array, a Series will be returned where the index is q and the values are the quantiles, otherwise a float will be returned.
Examples
>>> s = ps.Series([1, 2, 3, 4, 5])
>>> s.quantile(.5)
3.0
>>> (s + 1).quantile(.5)
4.0
>>> s.quantile([.25, .5, .75])
0.25    2.0
0.50    3.0
0.75    4.0
dtype: float64
>>> (s + 1).quantile([.25, .5, .75])
0.25    3.0
0.50    4.0
0.75    5.0
dtype: float64
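To get a feel for what the relative error 1.0 / accuracy means on a large Series, the following pure-Python sketch converts it into a range of ranks. It does not call Spark. The helper name rank_bounds is hypothetical, and reading the relative error as a bound on the rank of the returned value is an assumption borrowed from how Spark documents DataFrame.approxQuantile; this page does not state that Series.quantile guarantees the same bound.

```python
import math
from typing import Tuple


def rank_bounds(n_rows: int, q: float, accuracy: int = 10000) -> Tuple[int, int]:
    """Hypothetical helper: range of ranks an approximate quantile may fall in.

    Assumes the relative error 1.0 / accuracy is an error on the rank of the
    returned value, not on the value itself.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must satisfy 0 <= q <= 1")
    if accuracy <= 0:
        raise ValueError("accuracy must be a positive integer")

    target = q * n_rows          # rank an exact quantile would aim for
    slack = n_rows / accuracy    # allowed rank deviation: n_rows * (1.0 / accuracy)

    low = max(0, math.floor(target - slack))
    high = min(n_rows, math.ceil(target + slack))
    return low, high


# Median of one million rows with the default accuracy of 10000
print(rank_bounds(1_000_000, 0.5))                 # (499900, 500100)
# A lower accuracy widens the range of acceptable ranks
print(rank_bounds(1_000_000, 0.5, accuracy=100))   # (490000, 510000)
```

Under this assumption, raising accuracy narrows the rank window, which is the sense in which a larger value means better accuracy. For small inputs such as the five-element Series in the examples, the window is already tight at the default setting.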