Update on array types support in SciPy

Hi all, this is a short update following the hard work many people put into the support for non-NumPy array types that went into 1.16.x. I just posted this comment on gh-18867. It is a very brief summary of where we’re at today, plus a “plan for a plan” for more discussion about how things look now and the road ahead. I’ll repost that comment unchanged below.

Note: this is more of a status update than a request for in-depth discussion right now. I think we can have more constructive discussions in a few weeks, after writing some more docs and more detailed summaries.


After the discussion in gh-22960 on the current status, changes in infrastructure, and documentation, and now that the maintenance/1.16.x branch has been created with a lot of changes and improvements related to array types support, it’s time for a summary and for getting everyone on the same page about where we are and who does what.

My take on the current state is that:

  1. All the machinery, conventions, and other things we needed are in place and should no longer be changing much.
  2. It has been changing a lot over the past year, so we need to slow down, document it better, and discuss it more as a team.
  3. My rough guess is that we’re about 40-50% through with implementing support across all public functionality. It’d be nice to have data to back this up.

Short term, these are the next actions to ensure we can all get on the same page:

  • Summarize the tradeoffs of adding JAX support (performance benefits, maintenance costs).
    Who: @crusaderky. Where: gh-22246.
  • Write a tool that generates a single-page overview of our coverage across submodules/functions, in an easy-to-understand format (e.g. % of coverage per array library for each submodule).
    Who: @steppi. Where: gh-23021.
  • Ensure the developer docs at “Support for the array API standard” (SciPy v1.17.0.dev Manual) are fully up to date and cover any new machinery, conventions or code patterns to use, and design rules (e.g., when to delegate to other libraries) added over the past release cycle.
    Who: (not assigned yet, I can probably do this soon if no one volunteers - happy to collaborate too)
  • Ensure the tracker in this issue, and the sub-issues for it, are up to date.
    Who: (not assigned yet, some requests: @ev-br can you do signal and interpolate, @lucascolley can you do this tracker?)
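
For illustration only, the per-submodule coverage overview mentioned in the list above could be computed along these lines. This is a minimal sketch, not the tool tracked in gh-23021: the `SUPPORT` data, the function names `func_a` to `func_d`, and the helpers `coverage_by_submodule` and `format_table` are all hypothetical, and a real tool would need to collect the support information from SciPy itself.

```python
# Hypothetical support data: {submodule: {function: set of array libraries}}.
# All names and entries here are made up for illustration only.
SUPPORT = {
    "signal": {
        "func_a": {"numpy", "cupy"},
        "func_b": {"numpy"},
    },
    "special": {
        "func_c": {"numpy", "cupy", "torch"},
        "func_d": {"numpy", "torch"},
    },
}
LIBRARIES = ["numpy", "cupy", "torch"]


def coverage_by_submodule(support, libraries):
    """Return {submodule: {library: percent of functions supporting it}}."""
    table = {}
    for submodule, functions in support.items():
        total = len(functions)
        row = {}
        for lib in libraries:
            # Count the functions whose support set includes this library.
            supported = sum(lib in libs for libs in functions.values())
            row[lib] = 100.0 * supported / total if total else 0.0
        table[submodule] = row
    return table


def format_table(table, libraries):
    """Render the coverage table as aligned plain text."""
    header = f"{'submodule':<12}" + "".join(f"{lib:>8}" for lib in libraries)
    lines = [header]
    for submodule in sorted(table):
        cells = "".join(f"{table[submodule][lib]:>7.0f}%" for lib in libraries)
        lines.append(f"{submodule:<12}{cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(coverage_by_submodule(SUPPORT, LIBRARIES), LIBRARIES))
```

Keeping the percentage computation separate from the rendering would let the same numbers feed a plain-text summary, a docs page, or a tracker issue.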

The above will hopefully not take more than 2-3 weeks. After that, we should do the following:

  • Evaluate the tradeoffs for JAX support based on the to-be-written summary and decide whether we’re happy.
  • After learning some lessons from the JAX evaluation, do the same for Dask (partial overlap between them, but also significant differences).
    • The same will apply to other array libraries in the future (e.g., MLX, ndonnx, PyTorch with MPS backend), so ensure we have a standard template for making this evaluation, covering topics like expected maintenance costs and performance benefits of adding explicit support.
  • Have a little brainstorm about remaining open issues, missing docs, opportunities, and concerns. It may be useful to do this synchronously as well as async - higher-bandwidth communication can be helpful.
  • Write a proposal about the path to making all of this public, rather than hidden behind an environment variable, and discuss that.