I would have been open to including bigger ambitions in 2.0 (given that major version bumps seem to happen only every ~10-15 years), but in any case, I’d like us to take the opportunity to conclude the make-kwargs-actually-kwargs-not-positional-arguments clean-up across all APIs (i.e. confirming @lucascolley and @j-bowhay’s mentions above).
This was essentially my original concern. Having reflected on the proposal, I am fairly happy with it; I just share the concern you have raised. Either way, a learning point for version 3.0.
Yeah, this is exactly the sort of thing I was thinking of when I said we should focus on making sure everything works correctly. When I worked on cleaning up situations like this in signal, I also thought it might be helpful to have a simple static check that all array creation functions like xp.asarray, xp.empty etc. always explicitly use the dtype and device kwargs. It wouldn’t guarantee that the correct values have been passed for these, but it would at least keep everyone mindful that these need to be set explicitly for internal array creation.
I think this would complement tests like you mentioned that have the default device set as GPU but use CPU arrays as inputs.
I had a look at the current state of the non-default device behavior. This is the only thing currently labeled as defect for all issues with the array types label: scipy#21736 and scipy#22680 (which are largely duplicates; I can write a better summary on one of those after my holiday and close the other).
For now I did a quick estimate with a grep of array creation functions, subtracting the calls that already use device= or that apply asarray to input that’s already guaranteed to be an array. Rough estimate: about 10% of GPU-enabled public functions do not correctly support non-default device usage, mostly in signal, integrate and stats (special is clean now). This doesn’t look super hard to fix (the fix will be adding something like device=xp_device(a) to internal array creation calls, as in scipy#22756). Auditing existing code is fairly mechanical; having generic test coverage that’s usable and performant enough is the more challenging part - that needs an actual attempt at implementing it. There are some potential issues with fixtures mentioned in the linked issues; one idea I just had is that it should be possible to write a linter for this, given there’s a finite number of pure Python array creation functions to check. It should be supplemented by array-api-strict checks with a non-default device for GPU-enabled functions, but that’s easier than fully generic fixture-based machinery.
(to be continued on scipy#21736 in a week or two)
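To make the failure mode above concrete, here is a minimal sketch of why an internal creation call without device= ends up on the wrong device. Everything in it is a toy stand-in rather than SciPy code: ToyArray and ToyNamespace are hypothetical, and xp_device is reduced to a one-line placeholder for the helper mentioned in the post.

```python
from dataclasses import dataclass


@dataclass
class ToyArray:
    """Hypothetical array type that only tracks its data and device."""
    data: list
    device: str


class ToyNamespace:
    """Hypothetical namespace whose creation functions fall back to a default device."""

    def __init__(self, default_device):
        self.default_device = default_device

    def zeros(self, n, *, device=None):
        # With no explicit device, the namespace default is used.
        return ToyArray([0.0] * n, device or self.default_device)


def xp_device(a):
    # Placeholder for the helper named in the post; here it just reads `.device`.
    return a.device


def make_buffer_buggy(a, n, xp):
    # Internal creation call without device=: lands on the default device.
    return xp.zeros(n)


def make_buffer_fixed(a, n, xp):
    # Same call, but the buffer follows the device of the input array.
    return xp.zeros(n, device=xp_device(a))


xp = ToyNamespace(default_device="gpu")
a = ToyArray([1.0, 2.0], device="cpu")
print(make_buffer_buggy(a, 2, xp).device)
print(make_buffer_fixed(a, 2, xp).device)
```

This is also the shape of the test setup discussed earlier in the thread: default device set to GPU, input arrays on CPU, and an assertion that the output device matches the input device.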
The actual issue wasn’t linked from the wiki or above; I think this is about scipy#18703, which relitigated the rejected scipy#14714. There was never an in-depth proposal, so I sketched out in scipy#18703-comment how I’d approach making the case for this idea.
Yes, this is what I meant by a “simple static check” above. I think it would be straightforward to write something that checks that all array creation functions explicitly set the device and dtype.
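A rough sketch of what such a static check could look like, using only the standard-library ast module. The set of creation function names, the assumption that the namespace is always bound to the name xp, and the function name find_implicit_creation_calls are all illustrative choices, not an existing SciPy tool.

```python
import ast

# Assumed set of creation functions to check; the real list would need agreeing on.
CREATION_FUNCS = {"asarray", "empty", "zeros", "ones", "full", "arange", "linspace", "eye"}
REQUIRED_KWARGS = ("dtype", "device")


def find_implicit_creation_calls(source, namespace="xp"):
    """Return sorted (lineno, func_name, missing_kwargs) for flagged calls."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only look at calls of the form `<namespace>.<creation_func>(...)`.
        if not (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == namespace
            and func.attr in CREATION_FUNCS
        ):
            continue
        # A `**kwargs` splat has `arg is None`; its contents are unknown, so skip.
        if any(kw.arg is None for kw in node.keywords):
            continue
        present = {kw.arg for kw in node.keywords}
        missing = tuple(k for k in REQUIRED_KWARGS if k not in present)
        if missing:
            problems.append((node.lineno, func.attr, missing))
    return sorted(problems)


example = '''
def f(a, xp):
    b = xp.zeros(3)
    c = xp.asarray(a, dtype=a.dtype, device=a.device)
    d = xp.empty(4, dtype=xp.float64)
    return b, c, d
'''

for lineno, name, missing in find_implicit_creation_calls(example):
    print(f"line {lineno}: xp.{name} is missing {', '.join(missing)}")
```

As noted above, a check like this only verifies that the keywords are present, not that the values passed are correct, so it would complement rather than replace device-aware tests.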
More keyword-only parameters would make things a lot easier for scipy-stubs.
If params like return_*=True and keepdims=False are allowed to be passed both positionally and by keyword, then you need two separate overloads to handle both these cases in scipy-stubs.
This can quickly become a mess, especially if you consider that every input combination requires an overload, so it’s not a case of +1 overload, but closer to x2. Making those keyword-only would avoid this.
So I’m +64 on this (which coincidentally is the highest number of overloads for a single function in scipy-stubs at the moment).
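To illustrate the overload doubling described above, here is a small sketch with a made-up function, unique_sorted, whose return type depends on a return_index flag that may be passed positionally or by keyword. The function and its parameters are hypothetical and not taken from scipy-stubs.

```python
from __future__ import annotations

from typing import Literal, overload


# Default case (return_index=False): one overload covers every call style.
@overload
def unique_sorted(
    values: list[int], reverse: bool = ..., return_index: Literal[False] = ...
) -> list[int]: ...
# return_index=True passed positionally: `reverse` must then be given as well.
@overload
def unique_sorted(
    values: list[int], reverse: bool, return_index: Literal[True]
) -> tuple[list[int], list[int]]: ...
# return_index=True passed by keyword: `reverse` may keep its default.
@overload
def unique_sorted(
    values: list[int], reverse: bool = ..., *, return_index: Literal[True]
) -> tuple[list[int], list[int]]: ...


def unique_sorted(values, reverse=False, return_index=False):
    """Sorted unique values, optionally with the index of each first occurrence."""
    uniq = sorted(set(values), reverse=reverse)
    if return_index:
        return uniq, [values.index(v) for v in uniq]
    return uniq


print(unique_sorted([3, 1, 3, 2]))
print(unique_sorted([3, 1, 3, 2], return_index=True))
print(unique_sorted([3, 1, 3, 2], False, True))
```

If return_index were keyword-only in the implementation, the second overload would disappear. With several such flags on one function, the positional variants multiply, which is where the "closer to x2" growth comes from.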
Re: keyword-only parameters, I would usually be one to push here, but I think asking for it in 2.0 is going to be a hard sell for some.
So maybe we make 2.0 be a surprisingly low-effort major release for users. (TBH, I think it will be zero for many / most.)
Then we make 3.0 a royal pain in the ass like Python 3.0.
That’s probably the most compelling argument given so far. It applies to only a small subset of all functions though, so a proposal to make keyword-only just those keywords that affect the output type(s) would get you all of the gain here at a much more limited cost. In addition, all you need for scipy-stubs is deprecation, not removal. And all you’d need is to put the * before the type-changing keyword, which is often at the end. A quick count says that this is about 70 functions, which is a lot more minimal than ~750 functions + ~1,500 methods. Deprecating positional usage of those problematic keywords in the next release seems quite feasible; the cost/benefit ratio is a lot better here.
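For illustration, deprecating positional use of a type-changing keyword could look roughly like the following. This is a generic sketch: the deprecate_positional decorator and the toy_mean function are hypothetical names invented here, not SciPy's actual deprecation machinery.

```python
import functools
import warnings


def deprecate_positional(n_allowed):
    """Warn when more than `n_allowed` arguments are passed positionally."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > n_allowed:
                warnings.warn(
                    f"Passing more than {n_allowed} positional argument(s) to "
                    f"{func.__name__} is deprecated; use keywords instead.",
                    DeprecationWarning,
                    stacklevel=2,
                )
            # Behaviour is unchanged during the deprecation period.
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecate_positional(n_allowed=2)
def toy_mean(values, scale=1.0, keepdims=False):
    # `keepdims` changes the return type, so it is the keyword to protect.
    result = scale * sum(values) / len(values)
    return [result] if keepdims else result


print(toy_mean([1, 2, 3], keepdims=True))  # keyword use: no warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    toy_mean([1, 2, 3], 1.0, True)  # positional use: DeprecationWarning
print(len(caught))
```

Once the deprecation period ends, the wrapper can be dropped and a bare * placed before keepdims in the signature.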
Aside: somehow this argument didn’t come up for NumPy 2.0, since we still have positional `keepdims` in `all` and other reductions, `return_*` in `unique`, and `compute_uv` in `svd`?
From a scikit-learn perspective, having Array API support (here called array types support, where do the different names come from?) public in scipy helps a lot. It is a requirement for us to move our own Array API support out of experimental.
For a scipy 2.0 release, I would (like, strongly) expect that the deprecation of sparse matrices is carried out. Only sparse arrays from then on. They are just sooooo much nicer. (Thanks @dschult for your great work here).
I just read the full thread here. I’m happy to manage the 2.0.0 release as the next release if that’s what the team ultimately decides. I already discussed it a bit with Ralf et al. in recent conference calls.
The default device handling thing seems useful to fix indeed. I just accidentally opened a triplicate issue about it (scipy#25474, “DOC/Query: default device handling outside the testsuite”) because I was trying to benchmark some SciPy functions (mostly rankdata) on supercomputer GPUs and spent an hour or so trying to sort out why the testsuite was passing with a CUDA device but my benchmark was failing with the torch default device set to the host.
While it seems like we’re mostly aligned on this thread, we’re probably not fully aligned yet - and threads like this also aren’t an optimal way for everyone in our pretty large maintainer team (and stakeholders beyond that) to weigh in. For NumPy 2.0 we had a 4-hour call, with some presentations from folks who wanted to propose a headline feature/change, and space for open discussion. I believe everyone involved found that very useful. So, I’d like to suggest that we do the same for SciPy 2.0: have a one-off call (2, 3, or 4 hours, depending on interest), with an agenda of a few topics, each with someone signed up to present and lead that part of the discussion. And then time for whatever else is on anyone’s mind. We could also touch on a potential timeline for a more breaking release (3.0), given that several folks have an interest in that, and in that happening much sooner than a decade from now.
WDYT?