Hi team,
scipy.stats.linregress is currently very flexible about how the independent and dependent variables are specified. Per the docstring:

x, y : array
    Two sets of measurements. Both arrays should have the same length. If only x is given (and y=None), then it must be a two-dimensional array where one dimension has length 2. The two sets of measurements are then found by splitting the array along the length-2 dimension. In the case where y=None and x is a 2x2 array, linregress(x) is equivalent to linregress(x[0], x[1]).

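To make the splitting rule concrete, here is a rough sketch of what the text above describes, written so that the regression itself is always called with two arguments. The helper name split_xy and the sample data are made up for illustration; this is not the library's code, just one way a caller could do the split themselves.

```python
import numpy as np
from scipy import stats


def split_xy(a):
    """Hypothetical helper: split a 2-D array into (x, y) along its length-2 dimension."""
    a = np.asarray(a)
    if a.ndim != 2:
        raise ValueError("expected a two-dimensional array")
    if a.shape[0] == 2:
        # Covers 2xN and the ambiguous 2x2 case: the rows are x and y.
        return a[0], a[1]
    if a.shape[1] == 2:
        # Nx2: the columns are x and y.
        return a[:, 0], a[:, 1]
    raise ValueError("one dimension must have length 2")


# Illustrative Nx2 data lying exactly on y = 2x + 1.
data = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0]])
x, y = split_xy(data)

# Explicit two-argument call, with no guessing about the layout of `data`.
result = stats.linregress(x, y)
```

Note that for a 2x2 input the rows win over the columns, which is the kind of implicit convention that would be hard to reconcile with an axis argument.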
If we want to add an axis argument and array-API support, this would get even more confusing, so I’d propose deprecating the one-argument use of stats.linregress and requiring that the user pass x and y as separate arguments. To implement this, the stats.linregress implementation would be split from the stats.mstats.linregress implementation, and we’d deprecate use of masked arrays in the stats version at the same time. I’ll open a PR for this shortly; in the meantime, thanks for your thoughts!

Matt