Deprecating the return of object arrays from stats.obrientransform

j-bowhay · June 27, 2024, 5:16pm

Hi all,

In the ongoing effort to convert stats functions to support the array API, we have come up against the slightly awkward stats.obrientransform. Currently, this function has the signature obrientransform(*samples), it takes an arbitrary number of samples. The issue is the return type. If all the samples are the same size then it will return a 2d array where the transformed sample is a row of the array. However, when the samples are not the same size, an object array of arrays is returned. This is problematic as the array API does not support object arrays.

Two possible (of many) deprecation strategies put forward in https://github.com/scipy/scipy/pull/21055#discussion_r1655386350 are:

Add an axis argument and warn that specifying axis is required because the default is changing from None to axis=0 to match other stats functions. Document that if the user specifies axis, the return type is a tuple rather than an array. At the end of the deprecation period, nothing needs to be removed, the return type is always a tuple, and the default axis is 0 instead of None.
Deprecate obrientransform and replace a different function. Since the function just loops over each sample there is an argument to be made that the function should only really take one sample.

To me, 1. seems marginally preferable as we would like to add an axis argument anyway; however, please do share your thoughts below.

Regards,
Jake

fancidev · June 30, 2024, 1:38pm

Are there other functions that take an axis argument and return a tuple? It seems a bit unusual to me. In addition, I’m under the impression that an axis argument usually specifies the dimension to reduce, which is not the case here. Therefore, I think Option 2 might be more suitable.