Hi Team,
anderson is unusual among scipy.stats functions in that its result object does not include a p-value. Instead, the result object holds a table of significance levels, the corresponding critical values of the test statistic, and a FitResult object[1]. The scipy.stats infrastructure is not equipped to handle this case, which makes it much more challenging to add array API support and other standard features like axis, nan_policy, and keepdims to the function.
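For concreteness, here is roughly what the current interface looks like (attribute names as in the current documentation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2549824)
x = rng.normal(size=100)

res = stats.anderson(x, dist='norm')
print(res.statistic)           # Anderson-Darling test statistic
print(res.critical_values)     # critical values at fixed significance levels
print(res.significance_level)  # the corresponding significance levels (in %)
print(res.fit_result)          # FitResult with the fitted null distribution
```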
To address this, gh-24030 introduces a method parameter, which would allow the user to select a p-value computation method: either on-the-fly Monte Carlo simulation or interpolation from published tables (which were themselves originally produced via Monte Carlo simulation). When the new argument is not provided, the function would emit a FutureWarning advising the user to specify a method, but would return an object with the non-standard attributes as usual. When method is provided, the result object would include a pvalue attribute but not the non-standard attributes. This strategy is intended to gracefully replace the non-standard attributes with p-values, notifying users of the upcoming change while giving them two SciPy versions to adjust their code. At the end of the transition period, only the new result object would be available.
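As a rough sketch of what opting in might look like from the user's side (the exact accepted values of method are part of the gh-24030 discussion, so the resampling-object spelling below is only an assumption):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(885)
x = rng.normal(size=100)

# During the transition, stats.anderson(x) without `method` would emit a
# FutureWarning and return the old-style result shown above.

# Proposed opt-in (not available in a released SciPy; whether `method`
# accepts a string or a resampling object is still under discussion):
res = stats.anderson(x, dist='norm', method=stats.MonteCarloMethod())
print(res.statistic, res.pvalue)  # standard attributes; no critical_values
```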
Similarly, anderson_ksamp returns a critical_values attribute that should be phased out for the same reasons. We cannot follow exactly the same strategy with that function because it already accepts a method argument. Instead, gh-24031 introduces a variant parameter to replace the less flexible midrank option. When variant is not provided, a warning advises the user of the upcoming change; when it is provided, the result object would not include the critical_values attribute.
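Again, only a sketch; the accepted values of variant are part of the gh-24031 discussion, so 'midrank' below is purely illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
samples = [rng.normal(size=50), rng.normal(loc=0.5, size=60)]

# Proposed opt-in (not available in a released SciPy):
res = stats.anderson_ksamp(samples, variant='midrank',
                           method=stats.PermutationMethod())
print(res.statistic, res.pvalue)  # no critical_values attribute
```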
These changes would also phase out the ability to (partially) unpack the result object as a tuple[2]. After the transition period, the functions can be made compatible with the array API, and support for axis, nan_policy, and keepdims can be added.
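For reference, the unpacking pattern being phased out looks like this today; after the transition, only attribute access would be supported:

```python
import numpy as np
from scipy import stats

x = np.random.default_rng(1).normal(size=100)

# (Partial) tuple unpacking, slated for removal after the transition:
statistic, critical_values, significance_level = stats.anderson(x)

# Attribute access, which continues to work:
res = stats.anderson(x)
statistic = res.statistic
```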
Please consider joining the discussion in gh-24030 (anderson) and gh-24031 (anderson_ksamp).
Thanks!
Matt Haberland
Before computers, practitioners would manually compute a test statistic and assess significance by comparing the result against tables of critical values and significance levels. Computers made it feasible to compute p-values, superseding direct use of tables in practice. ↩︎
As discussed elsewhere, this has a number of benefits, including improved static typing, more explicit user code, and the opportunity for future enhancements without backward-incompatible changes. ↩︎