Hi Team,
gh-22352 proposes the addition of an array API compatible quantile function to scipy.stats. Rationale includes:
- Based on data-apis/array-api#795, it's not looking like the array API standard will converge on a quantile function, or at least not one that would do everything users might reasonably expect of a quantile function.
- Some SciPy functions will need an array API compatible quantile or median function if they are to be translated to the array API.
- Looking toward gh-22194, we’ll need a function to replace mquantiles, which is one of the most-used functions in stats.mstats. scoreatpercentile is not a great substitute for a few reasons, including the unfamiliar name.
Advantages compared to np.quantile include:
- Rather than computing quantiles at all combinations of probability and data slice, it follows more familiar and flexible broadcasting rules. (Example use case: BCa bootstrap, which needs quantiles at different probabilities for each slice.)
- Regarding NaNs in the data array, it supports the standard nan_policy options rather than requiring different functions for NaN propagation/omission.
- NaNs in the probability array produce NaNs in the output rather than causing the entire operation to fail.
- (to be added in immediate follow-up) Support for Harrell-Davis quantiles.
- (to be added in immediate follow-up, after gh-22393 merges) Support for masked arrays.
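
For anyone who wants to see the current NumPy behavior that the first two points refer to, here is a minimal sketch. It uses only np.quantile and np.nanquantile; the data and variable names are made up for illustration, and it deliberately does not show the proposed function, whose signature is defined in gh-22352.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 100))   # 3 data slices with 100 observations each
p = np.array([0.1, 0.5, 0.9])   # one desired probability per slice

# np.quantile evaluates every probability against every slice,
# so the result has shape (len(p), number of slices).
all_combinations = np.quantile(x, p, axis=-1)

# Only the diagonal pairs slice i with probability p[i];
# the other entries are computed and then discarded.
per_slice = np.diagonal(all_combinations)

# The same result with an explicit loop over slices.
looped = np.array([np.quantile(x[i], p[i]) for i in range(x.shape[0])])

# NaN handling in NumPy is selected by calling a different function.
y = np.array([1.0, 2.0, np.nan, 4.0])
propagated = np.quantile(y, 0.5)    # NaN in the data propagates to the result
omitted = np.nanquantile(y, 0.5)    # NaN in the data is ignored
```

In a case like the BCa bootstrap, where each slice needs its own probabilities, only the per-slice pairs are wanted, which is the situation the broadcasting rules in the proposal are meant to handle directly.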
All are welcome to join the discussion in gh-22352!
Thanks!
Matt