A proposed design for supporting multiple array types across SciPy, scikit-learn, scikit-image and beyond

rgommers · December 2, 2021, 1:51pm

Thanks @seberg, I like your structure, what you wrote all makes sense to me (even if I had a slightly different structure in my head). A few short thoughts:

Re duck-typing protocol: mostly agree, it’s about a form of duck typing for the core array functionality. The change I’d like to make to your description is that it’s not array-api vs. __array_function__ but (a) we need a well-defined API which only the array API standard provides, and (b) we need some kind of dispatch/access mechanism - where we choose __array_namespace__ over __array_function__. In principle though, the functions in the array API standard could be combined with __array_function__.

In my mind (2) and (3) are combined because it’s a single dispatch layer put in front of compiled code (unlike (1)), and hence no Python code reuse is possible. (2) isn’t really “one library, one array object”, for example JAX has multiple array objects, and PyTorch is also growing new things like MaskedTensor which may or may not be subclasses of the main array/tensor object.

I’m not sure the with statement is the critical part here (nor is it something we necessarily want to keep - see the other opt-in vs. opt-out thread); you’re asking "is it possible to unambiguously choose an implementation I think (?). In the backend selection part, we have multiple implementations of a function for the same array object - so we need some kind of rule to select the desired backend, whether a with statement, or the order in which backends were registered.