Requirements and discussion of a type dispatcher for the ecosystem

Continuing the discussion from A proposed design for supporting multiple array types across SciPy, scikit-learn, scikit-image and beyond:

Trying to split this out, and to continue with a bit more in-depth discussion about type dispatching. Also to throw in some concepts (sorry if these concepts were already introduced differently and I missed it).

Type dispatching has a few pretty clear design goals that we want to achieve:

  • Type dispatching is implicit; the end-user does nothing, but gets an implementation that works with whatever they passed in (cupy, Dask, NumPy, torch) – so long as one is available.
  • We want “generic implementations”, that is, implementations that may work for multiple array objects.
  • Generic implementations have to take care to return the expected output type (in case they don’t get the expected inputs). We could try to provide helpers for this, but it is probably not generically possible (e.g. whether an operation returns a sparse or dense array is non-trivial).

Design space

Let me jump straight into uarray vs. multiple-dispatchers here and introduce a (maybe) new term. In the other post I intentionally called this step “type dispatching” and not multiple-dispatching.

I propose to use these terms to delineate different dispatching implementations:

  • multiple-dispatching: A type dispatcher which uses the type hierarchy to find the best implementation
  • Last come, first served (LCFS): the uarray/__array_function__ design, which finds an implementation by asking all “candidates”. If one says it can handle it, it gets it. Since we have to assume that later-registered backends are more specific, we ask them in reverse order. So the last registered backend gets it if it wants it.

As opposed to multiple-dispatchers, our LCFS dispatchers also have domains:

  • Usual multiple-dispatchers register directly with a specific function
  • These LCFS dispatchers have a concept of “domains”. For __array_function__ the “domain” is all of NumPy. uarray extends this to arbitrarily nested/name-spaced domains.

The introduction of domains means that it is possible to override a function without registering with it specifically. To make this clear: Dask does not have an implementation for all of NumPy, but it always provides a fallback. This fallback issues a warning and converts the Dask array to a NumPy array.
This means Dask “supports” functions that it does not even know about.

One thing I am not sure about: does uarray have some other magic, such as “registration” without actually importing the library that is overridden? (Which would use the “domains” for the implementation.)

(Other features?)

uarray currently(?) also has a with statement to disable/enable backends. This is targeted at backend selection but spills into type dispatching. It is hard to discuss, though, if this is a feature that is considered for removal.
For now, I wish to ignore this, but to me that was one of the main points of why the uarray discussion stalled the last time. So after nailing down what backend-selection API we want, we need to look at how it plays together with type dispatching.


What do we want/need?

I think the above points basically describe our design space with respect to “type dispatching”, let us introduce “backend selection” later. Backend selection is an argument why multiple-dispatchers may not work for us, but it is not an argument for uarray’s design choices!

Domains?

There are two points above; the easier one IMO is “domains”. To me it seems unimportant to provide function-generic override capability (i.e. being able to override multiple functions with a single implementation). For now, for all practical purposes, “domains” are to me only an implementation detail.

multiple-dispatching vs. LCFS

This is a difficult discussion, and I am not yet sure how much it matters in practice, but let me put forward two points:

  • This point I strongly disagree with: a multiple-dispatcher can insert a DuckArray into the type hierarchy. This could be based on an ABC checking for dunders or based on registration. If it helps, I can provide a proof-of-concept implementation.

  • The disadvantage of LCFS is that it has no notion of a “best” implementation. This means that registration order is important. As an example (a small sketch of this lookup follows the list):

    • skimage.function has a generic “CPU” implementation
    • pytorch adds a specific pytorch implementation to skimage.function
    • cupy adds a generic “GPU” implementation to skimage.function.

    If the user now passes a pytorch GPU array in, the cupy implementation will provide the implementation even though presumably the pytorch one is better.
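
To make the LCFS lookup concrete, here is a minimal sketch (the names are made up; this is not uarray’s actual code). Backends are kept in registration order and asked in reverse, so whichever backend registered last and claims the inputs wins, regardless of how specific it is:

_backends = []  # registration order, earliest first


def register_backend(backend):
    _backends.append(backend)


def dispatch(func_name, *args, **kwargs):
    for backend in reversed(_backends):  # last registered backend is asked first
        impl = backend.get_implementation(func_name, args)
        if impl is not None:  # the backend claims it can handle these inputs
            return impl(*args, **kwargs)
    raise TypeError(f"no backend found for {func_name}")

In the skimage example above, cupy registered last, so its backend is asked before the pytorch one; a pytorch GPU tensor that the cupy backend happens to accept never reaches the pytorch-specific implementation.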

Smaller points are maybe:

  • LCFS is simpler to reason about for the implementer
  • multiple-dispatching (to me!) seems actually easier to reason about for the person writing the functions. They just list the types that they want to handle, e.g. DuckArray, MyArray. The only step they really need to understand is that you probably want to use MyArrayOrCoercable rather than MyArray, and we need to provide an easy way to get MyArrayOrCoercable.
  • multiple-dispatching may sometimes be tedious/bloated to register the right combinations of types (although I doubt this is a huge problem). If no implementation is found, multiple-dispatching may also be harder to debug/understand.

Sorry, I actually did not reply to any of your comments @rgommers. I agree with combining 2+3 in a single dispatching step, because it looks like everything else will be more confusing/limited. But I would like to wade through laying out where the advantages come from after figuring out the type-dispatching mostly. I do think there is more to it than code reuse.

And yes, type-dispatching is not per-library, since libraries can have multiple types. Although, libraries can also have abstract types to describe all of these, so while interesting, I don’t think it has any effect on the design.

Thanks @seberg, more interesting food for thought.

It’s not quite that simple, there are multiple design axes here. We knew this is an important discussion, so @IvanYashchuk started a separate thread on this: Default dispatching behavior for supporting multiple array types across SciPy, scikit-learn, scikit-image.

Also remember, NEP 37 was written because the “fully implicit, always on” model of __array_function__ was deemed non-ideal by its original author.

Could you review that other thread? I think it may make more sense to continue there, at least to discuss the desired user-observable behavior.

This is incorrect. I think you are assuming that libraries like CuPy use asarray to consume non-native array types. That is not the case - NumPy is the only library that does this (unfortunately). No matter the registration order, a CuPy array will end up with the CuPy backend here, a PyTorch tensor with the PyTorch backend, etc.

I believe you can make this work, but I don’t fully understand yet based on this sentence. Should the ABC here be inherited from by the duck array to make that work? Then that makes sense, it basically imposes a class relationship where today we don’t have one. That’d indeed be a reasonable thing to do/require.


Not quite hard-removing that feature, but basically preferring the desired behavior. So if that’s auto-registration of backends and implicit dispatching, with an opt-in knob for each array-consuming library, then that’s what uarray can/will document (and implement if something is missing). It seems kind of strange to do it the other way around. For multiple dispatch one would have to add the opt-in knob, for uarray the auto-registration. Each of these methods can do whatever is desired I believe. The with-statement would then be an extra feature, but one wouldn’t need it for regular usage.

Thanks, Ralf. I will try to make a list with some of these possibilities, since I am not sure this thread is all that helpful for you in its current form. (I guess that will be crude as well, but it is the best way I can think of to help – I want to try not to have strong opinions, so I was hoping to ask questions that may help converge to a solution.)

There are two main points that I wanted to make in this thread and hoped to give something to work with:

  • Is type-hierarchy based multiple-dispatching maybe a better match, at least for the more basic usage? Yes, it will not be quite enough! E.g. even with a type-hierarchy, backends probably need to be able to bail out – which I doubt is a typical feature as it requires searching for the next-best match.
  • I am unsure about some other choices in the current implementation of uarray, and wonder if we could do without them for simplicity:
    • Could we do without the with statement? (I remember this as one of the larger unclear points during the NEP 36 discussion, so I am a bit wary.)
    • Is there an important use case for the domains/backend concept that I am missing? Or is it mainly an implementation detail?

      Aside from the fact that we need to put __ua_function__ somewhere? In general, I think you could write a generic __ua_function__ to allow a multiple-dispatching-like syntax:

      _dispatcher = UArrayDomain("scipy.linalg")
      
      @_dispatcher.add(baseimpl=scipy.linalg.inv, if_matches=cupy.ndarray)
      def inv(arr):
          ...  # work
      

      But in that example, the domain will be irrelevant for simple dispatching (we could just as well register with the function directly). It is a way to allow enabling/disabling multiple functions (i.e. the backend) at once, but I think there are other options for that which should work sufficiently well (i.e. a flag/context-var to check).


Replying to the other points directly:

To me this thread reads as being about how libraries will either transition or implement the ability to toggle dispatching on/off. But not about how the dispatching will be done. I.e. I assume we have an agreement that this will look something like:

with opt_in():
    scipy.linalg.eigh(numpy_arr) -> numpy_arr
    scipy.linalg.eigh(cupy_arr) -> cupy_arr  (using the cupy backend)

Whether/how the first line exists is what the linked thread asks about, I think? But the other two lines will always work by looking at the input type. I have no input/thoughts on it right now; I think it is a choice that (adopting) library authors have to weigh in on.

Let me give an example of what I thought cupy (or someone else) could do:

@scipy.linalg.inv.override  # register a new override using the type hints
def inv(arr: SupportsCudaArrayInterface):
    cupy_arr = cupy.from_cuda_interface(arr)
    result = arr.__array_namespace__().empty_like(arr)
    cupy_res = cupy.from_cuda_interface(result)
    # solve inv using `cupy_arr` and `cupy_res`
    return result  # will have the same type as `arr`.

The above function works for all array-like objects that also support the cuda-array-interface! pytorch could be one of those objects (I am not sure it is). A type-hierarchy would be a way to “sort” which loop to prefer, even if it comes at the cost of possible ambiguity about which one is better (although we could decide not to care: presumably all backends give the same result).

That is one possibility; you can also create an ABC for which isinstance(obj, ArrayAPILike) == hasattr(type(obj), "__array_namespace__") (without inheritance or ABC.register() being required). But yes, the point is that we can impose class relations even where they do not currently exist.
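
A minimal sketch of such an ABC (ArrayAPILike is just the name from the sentence above), using __subclasshook__ so that any object whose type defines __array_namespace__ passes the isinstance check, with no inheritance or ABC.register() required:

from abc import ABC


class ArrayAPILike(ABC):
    @classmethod
    def __subclasshook__(cls, subclass):
        # isinstance(obj, ArrayAPILike) is True whenever type(obj) defines
        # __array_namespace__; no inheritance or ABC.register() needed.
        if cls is ArrayAPILike:
            return hasattr(subclass, "__array_namespace__")
        return NotImplemented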

Answering the separate points first:

Yes, and yes. For the former, it’s about two things actually: whether or not there should be an opt-in, and who does the opting-in.


This should be strongly discouraged. NumPy does this, and it’s one of the larger design mistakes made in NumPy. No other array/tensor library works this way, nor will consider any design change like this I believe. They all work only on their own array types, or in the case of PyTorch on its own tensor type plus objects implementing __torch_function__/__torch_dispatch__. Foreign array objects should be explicitly converted by the user, for which there are specific functions like asarray and from_dlpack.

Same for lists/tuples/generators/etc. by the way. The NumPy design, accepting anything that can possibly be converted to an ndarray, is a really poor one.

There should not be any type hierarchy between array objects from different libraries. They’re all at the same level, and unrelated to each other.

For dealing with duck arrays I think that may be a useful idea, with the constraints that it works per array library. So if some duck array object wants to be dispatched to NumPy and NumPy accepts such duck arrays by design, it could expose an ABC to inherit from.

The question is what that would gain us if it’s not enough (which I agree is the case)? The one thing I can think of is that the implementation in SciPy et al. is simpler. But then the ABC idea is needed, and once we add back things like bailing out, will it still be simpler? I suspect that it will start looking a lot more like uarray than like the simple examples with ints and floats that one typically finds in the docs of a type-based multiple dispatch library.

Maybe (?). I’m not 100% sure, especially about the domains question. IIRC there was a pragmatic reason for adding numpy. to the domain, but I forgot. The docs say: Additionally, __ua_domain__ can be a sequence of domains, such as a tuple or list of strings. This allows a single backend to implement functions from more than one domain. - probably hard to do without the domains concept. @hameerabbasi maybe you could answer this one?

Domains were built so that one could use one „global-ish“ NumPy backend to register against, that would work for SciPy, NumPy, Scikit-Learn, et al. One would just do with(dask_backend): … where the domain is numpy and it would work for numpy.* as well, without needing to design a separate backend for each separate library.

I haven’t given some deep thought to whether this would be possible in the ABC world, my gut says „yes“, by using get_namespace(str) instead of get_namespace(), and if we all decided to hop on the same ABC train.

Regarding the with statement: It’s a deliberate choice to allow for array creation functions (for example) without the like= bolt-on.

For context, let me explain why I call it a bolt-on: it has to be changed all the way down the call chain for anything to work without modifications; like= has to be added everywhere. I consider that not nice.

(Sorry for the book-sized answer :frowning:)

Let me be clear: my issue as of now is that I am currently aware of barely any reasons for many of the design choices as soon as you start scratching the surface. The only argument I am aware of is against any existing multiple-dispatcher because they cannot do backend-selection out-of-the-box. And that is just not enough.

Maybe the main problem is that right now, we are missing any play-ground/examples for testing and to help with formulating alternatives. The existing uarray examples are all strange to me: I am not aware of a single example, not even a mockup, that uses implicit type-dispatching!

The alternative (or additional) approach to having a play-ground and examples is to clarify the design-space and choices. And while I am happy to argue, it might be better to just embrace our disagreement for now!

For example, we disagree, or have not quite made up our minds, on the following questions:

  • Should we prefer with or a local (e.g. like=) approach? (For type dispatching, not backend-selection!)
  • Should we accept “generic implementations” written by a 3rd party, and sort them into the dispatching order?
  • (Maybe also: How much overhead do we accept for dispatching?)
  • (Stolen from Stéfan: should the base implementation be able to validate, e.g. which arguments are passed, e.g. to deprecate them?)

But I do hope we can agree that they are reasonable questions, and that if we can find an answer to them it would be helpful for converging on a concrete implementation.


With that, let me discuss the questions/answers and API decisions a bit more

Let me reply to two points first:

  1. [generic implementations] This should be strongly discouraged. NumPy does this and it’s one of the larger design mistakes made in NumPy.
  2. There should not be any type-hierarchy between array objects from different libraries. They’re all at the same level, and unrelated to each other.

Both yes. There is no hierarchy between pytorch and NumPy, but I think we agree that there can be a rudimentary hierarchy? Without nailing it all down, e.g.:

          SupportsArrayApi
              /    \
             /      \
            / SupportsArrayAPI_IfCythonCompatible  # maybe?!
           /     |     \ 
          /      |      \
        cupy   numpy   dask.Array   
          \      |      /
           \     |     /
 (dask.Array | numpy | cupy)    # this is what a dask backend supports?

And we can or could have generic implementations:

  • An Array-API implementation which works for all “ducks”
  • The Dask backend is happy to work if the input is all cupy (it should take care to return a cupy array though).
  • I write a backend for a custom ABC. As you said, registration will always (often?) be required here, though.

I do not think we can quite argue against generic implementations per-se, but you are right that subclasses tend to break Liskov’s substitution principle, so the restrictive approach may be needed. In my example, the crucial line of code is:

    result = arr.__array_namespace__().empty_like(arr)

which may not be correct for an xarray in this example!

How I see we can resolve this:

  • We decide that only the original “base” library can provide generic (e.g. duck-typed) implementations. (E.g. a dask backend must only match if at least one input is a dask array, but unlike __array_function__, __ua_function__ does not enforce this!)
  • We define some form of hierarchy based on the types (yes, this may require more thought than just taking plum…)
  • We define some rudimentary hierarchy, e.g. signalling “generic” versions and always sorting them to the end (hoping this is enough).
  • We ignore the problem, because in practice it will rarely matter that I register pytorch with an ABC to make A work, and by that also do it for B. But B has a pytorch specific implementation that may get overshadowed.

I concede that the fact that subclasses often break Liskov’s may be a good enough argument to say that it is OK if we do not have a hierarchy. Although, I am not quite there yet myself.

Further reasons why I am digging into the “how” of dispatching:

Simpler is worth quite a lot IMO! Although, you could write convenience functions to make uarray simpler for these simple cases! The point is that every bit of API complexity we remove gives us additional flexibility to improve things now or in the future.

Concretely, one reason why I am bringing this up (together with domains) is that AFAIK the uarray implementation formalizes that the dispatcher:

  • knows nothing about how an implementation decides whether it will match (except maybe by convention)
  • has to scan all “domains” which may span many projects

This means that if I do skimage.some_function(...) we will probably search all backends in the numpy domain also? Since we have to scan numpy as well, let's say we could end up with up to 10 backends?!

Because we know nothing at all (the dispatching is fully parametric!) there is no way to do any caching.
And asking 10 backends won’t be super fast, but if we notice down the road that it is a problem, it will be very hard to do anything about it from the dispatcher side?

If you instead:

  • Scope on each function individually
  • Incentivise doing (most of) the matching using non-parametric, registration-time information (such as which types to match exactly, and for which parameters).

You should be able to write a fast caching mechanism so that it will not matter even if you have dozens of backends, because you will normally only ever ask one of them (or at least very few).
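
As a rough sketch of the kind of caching I mean (the decorator and registry are made up for illustration, not an existing library): if backends register with concrete types at registration time, the dispatcher can memoize which implementation handled a given combination of argument types, so repeated calls do not re-scan all backends:

import functools


def dispatchable(func):
    registry = {}  # maps a tuple of argument types -> implementation
    cache = {}     # memoized lookups

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Dispatch only on the positional argument types in this sketch.
        key = tuple(type(a) for a in args)
        impl = cache.get(key)
        if impl is None:
            impl = registry.get(key, func)  # fall back to the default implementation
            cache[key] = impl
        return impl(*args, **kwargs)

    def register(*types):
        def decorator(impl):
            registry[tuple(types)] = impl
            cache.clear()  # invalidate memoized lookups on new registrations
            return impl
        return decorator

    wrapper.register = register
    return wrapper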

So, by adopting the current implementation as is, I think you would:

  • Limit the easy ability of caching in the dispatcher to speed up things in the future.
  • Would also need to tag on new API to allow features like skimage.function.get_implementations() to list all relevant backends rather than all potential backends. (And I like having that information, if just to report errors in that annoying C++ style list: none of these options matched!)

Both of these are things we probably get for free if we design the API in a more strict form, but will have trouble with if we start out with uarray as is. And tagging things on top of uarray to solve this seems like the wrong approach to me right now.

Note that this is not relevant for __array_function__ because it only queries involved types (which is usually just one). I will also point out that the __array_function__ NEP argues against multiple-dispatching based on:

The main reason is that NumPy already has a well-proven dispatching mechanism with __array_ufunc__

which is a fair reason for __array_function__, but does not translate to uarray.

The with statement :confused:

Yes, I realize that the with statement largely comes from there, but I currently still see like= as both a safer and better API choice. I remember this as maybe the biggest reason for NEP 37 vs. uarray! NEP 37 says:

In the case of duck typing for NumPy’s public API, we think non-local or global control would be mistakes, mostly because they don’t compose well. If one library sets/needs one set of overrides and then internally calls a routine that expects another set of overrides, the resulting behavior may be very surprising. Higher order functions are especially problematic, because the context in which functions are evaluated may not be the context in which they are defined.

All of this applies just as much here. I still think the comfort you gain by not doing it locally leads to exactly the reverse bugs compared to only being able to do it locally. Except that the context manager makes it all much more complicated to reason about.
If the __array_function__ people say that like= is a big nuisance in practice, I may reconsider. But for now I feel the ball is currently firmly in uarray's court to defend the choice more, if that is how we are to proceed…

Domains

That is what I expected, and I expect it may help for the with statement usage. Take that away and I think it might still help slightly by enabling toggling many functions at once, but I do think at that point it becomes a trade-off, and my feeling is the scales tip the other way.

Here is one for scipy.fft: ENH: Set fft backend with try_last=True by AnirudhDagar · Pull Request #14359 · scipy/scipy · GitHub

Neither! (and both are bad UX-wise). This is the wrong comparison. Uarray will not be used for this purpose, you’re mixing it up with the array API standard.

I think we have prior art here. There were some complaints about __array_function__'s overhead, but not that many. And I believe all other alternatives have less overhead.

I’d say yes. It must remain possible to deprecate whole APIs or individual keywords/options, just like today. As well as add new keyword arguments.

There’s a lot about “generic implementations” in your answer, so just to make sure we’re on the same page, you mean:

# Generic API
y1_mylib = mylib.somefunc(x1_mylib)
y2_mylib = mylib.somefunc(x1_otherlib)
# the second line above is shorthand for:
y2_mylib = mylib.somefunc(mylib.asarray(x1_otherlib))

?

I guess so yes, there can be one. I’d like to avoid it though unless there’s a good reason to introduce one - because it’s not a small ask from array libraries.

Cool, this is more concrete; I had tried scipy.fft, but things evolved a bit in main. So you are deciding to put the generic backend last with try_last=True (I was not aware of that feature).

To make my point about performance:

from scipy import fft; import numpy as np
arr = np.random.random(20)

%timeit fft.rfft(arr)
# 3.78 µs ± 14 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

from cupyx.scipy import fft as cfft; fft.register_backend(cfft)
%timeit fft.rfft(arr)
# 5.31 µs ± 15.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Lets "register" some more backends (duplicating cupy, since that is what I have):
fft.register_backend(cfft)  # JAX
fft.register_backend(cfft)  # tensorflow
fft.register_backend(cfft)  # pytorch
fft.register_backend(cfft)  # Dask

%timeit fft.rfft(arr)
# 9.44 µs ± 30.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Is 6µs small enough to not worry about? How many backends are plausible: 5, 10, more? Do we bake in an expectation that only end-users register it so that N will always be in the 2-3 range?

__array_function__ does not have this problem, because it is effectively constant with the number of backends. Yes, __array_function__'s implementation is not super-fast, but it is not slow by design. uarray’s implementation is probably fast, but it can’t currently avoid the linear scaling with the number of backends.

Note that one reason why I never went far with optimizing __array_function__ is that it passes args, kwargs which made some attempts just not all that worthwhile. If __array_function__ would pass *args, **kwargs instead it would be much more worthwhile to optimize probably (maybe not a year ago, but today we can expect Python 3.8).
This is also true for __ua_function__, of course.

How to write a correct backend?

To make one more point here, maybe more about documentation clarity. But __ua_convert__ states:

By convention, operations larger than O(log n) (where n is the size of the object in memory) should only be done if coerce is True .

This would mean that a JAX backend should coerce NumPy arrays even if coerce=False, since that is O(1). However, doing so and registering it would break type-dispatching, because now the JAX backend will be selected for NumPy arrays.
The cupy backend is safe here fortunately, since it correctly rejects coercion, as that would be O(n).

More generally, we are using __ua_convert__ here to decide where to type-dispatch, but I am not actually sure we need __ua_convert__ for anything else… I think it exists mainly/only to unlock context-manager based features; if we don’t end up with one, we may not even need it.


So SciPy, skimage, and sklearn all do not have any array creation functions? But well, then I guess we agree on not having a with statement for type-dispatching purposes, but maybe needing it in some form for opt-in or backend-selection.

Yes, and uarray may be a bit faster than __array_function__ for one backend. If you have to step through multiple backends __array_function__ is much faster (and probably easier to optimize a bit).

Close, but no. If the above was how it was implemented, it would return y2_mylib not y2_otherlib. If I use the Array-API to implement somefunc I can write a single somefunc implementation that returns:

# Generic API
y1_mylib = mylib.somefunc(x1_mylib)
y2_otherlib = mylib.somefunc(x1_otherlib)

And say I could write a pure Array-API version for scipy.fft, maybe I should consider contributing it to scipy.fft or registering it, so any duck that speaks the “Array-API” will work?
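
For illustration, this is roughly what I mean by a pure Array-API implementation (somefunc is hypothetical; the point is that it only uses the namespace obtained from the input, so the result comes back in the caller’s own array type):

def somefunc(x):
    # Get the array-API namespace of the input (numpy, cupy, torch, ...).
    xp = x.__array_namespace__()
    # Only standard Array-API functions are used below, so the result has
    # the same array type as the input.
    return xp.exp(x) / xp.sum(xp.exp(x))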

I agree with the last comment about avoiding hierarchies as much as possible, but I do not think it has to be a big ask of array-libraries, especially not for now.
The backend author interested in a generic implementation could do all of the work. Or it might be just a request: “Can you please add XY.register() because then I will be able to visualize my data using seaborn without doing the registration myself.”

By the way, I am warming up to the limitation of only allowing concrete “ducks”, like Array-API supported, or maybe an implementation that I can register with explicitly. And otherwise always using exact type checking (i.e. maybe not even allowing subclasses, at least not normally/initially).

I think if we bake that in, it will allow extremely fast implementations, especially if we add syntax for marking what to dispatch on using a decorator, like:

@dispatch_on(x=True, other_arr=True, argument1=False, argument2=False)
def function(x, /, other_arr=None, argument1="option1", *, argument2="option2"):
    pass

for which I could write a fast C-layer, that should beat uarray nicely for one backend. But if we continue with:

# this could mean: one array must be cupy array, but it is OK if another is a NumPy one (mixed case):
@function._dispatch_if(cupy.ndarray, can_coerce=(numpy.ndarray,))
def cupy_version(x, /, other_arr=None, argument1="option1", *, argument2="option2"):
    pass

(or similar; the point is that the logic is non-parametric – i.e. type based – and explicit at registration time). It will have almost O(1) scaling with the number of backends.

One interesting feature of uarray in this direction is that I think it allows marking e.g. dtypes as “dispatchable”. Do you have any experience with that, with the necessity of supporting it, or just with how it should be used?

Having just read this thread it’s entirely unclear to me what it’s about and what it actually means in practical terms for “the ecosystem”.

Perhaps this is completely irrelevant but I have been thinking recently about the role of symbolics (e.g. SymPy) in “the ecosystem” and how poorly integrated things are in Python. In Julia you can just use the same sin function for anything e.g. sin(int), sin(float), sin(array), sin(symbolics) etc. In Python we apparently need different functions like math.sin, cmath.sin, np.sin, sympy.sin and so on. This is a constant source of confusion for new users who try to mix and match e.g. math.sin(np.array(...)) or math.sin(sympy.pi) etc. It would be much better if there was a stdlib module with functions that could be overloaded by different libraries, so users could just do something like from mathfunctions import sin, cos, ... and downstream libraries could overload those functions in some reasonable way.

I’m not sure from the above whether this is even slightly relevant to this thread though :slight_smile:


I’m not sure from the above whether this is even slightly relevant to this thread though

Yes, that is what type-dispatching means. It seems that typical Python duck-typing is stretched beyond its limits for array-like objects, e.g. when it comes to functionality like sklearn, skimage, or SciPy, because array-likes are just too different (numpy, dask, cupy, torch, tensorflow, sparse arrays, …).

But nobody is aiming for things like modifying the standard library, or a scope beyond array objects, I think. So it is the same principle, but a much narrower focus/scope.

I don’t want to derail this discussion but would there be general interest in having overloadable stdlib functions for this so that all downstream libraries can make use of them? I would personally be happy to write a PEP etc but I would want to do it on the basis that there is support for the idea from multiple libraries in the scientific Python ecosystem and e.g. draft it with people here before taking it to the main Python mailing lists.

Searching for an indicator that growing this into a Python stdlib feature is so far-fetched that it is not worthwhile, I did not find one. But I did find:

which lists some more existing modules that implement type (or more generic) dispatching in Python. And the PEP and those links seem to have some interesting notes on implementation/design as well. May be very worthwhile to look at.

Personally, I doubt it is worthwhile to push Python here. Few in the scientific Python ecosystem use math.sin anyway; we write np.sin (which actually can already dispatch), and I am not sure there are many functions aside from math where it is a good fit in any case.
To me it seems not very likely that Python core will be interested, and on first sight at least, I am not sure it is worthwhile anyway. (largely because working with arrays is a whole different approach in some sense)

And as a general mechanism, it can be implemented; I think we just need to figure out how much complexity we want to implement. E.g. even the current logic of singledispatch doesn’t seem bad, so long as we can boil down the “relevant arguments” (i.e. array-like objects) to a single representative object (which is how __array_function__ works also).
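
For example, a minimal sketch with functools.singledispatch (the overloadable sin here is hypothetical; it just shows dispatching on a single representative argument):

import functools
import math


@functools.singledispatch
def sin(x):
    # Default implementation; dispatch happens on the type of the first
    # (single representative) argument.
    return math.sin(x)


# An array library could then register its own implementation, e.g.:
# @sin.register(numpy.ndarray)
# def _(x):
#     return numpy.sin(x)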

Another big argument against it is that, due to the overhead, it probably isn’t a great match for the language at large anyway. No matter how fast, it will always be slow compared to things like math.sin that work on floats.
Languages that heavily invest into multiple dispatching are usually compiled or heavily just-in-time specialized, I expect?

I don’t know whether we should take this to a separate thread (or how to do that within this forum) but let me know if I should stop derailing this thread.

It’s true that few use math.sin and that np.sin is more common but there’s an obvious reason for that: math.sin only works with floats. My proposal would be to add a new module let’s say ma.py so that you can do

import ma
import numpy as np
import sympy

ma.sin(1j)
ma.sin(1)
ma.sin(np.array([1, 2]))
ma.sin(sympy.Symbol('x'))
# etc

I’ve always presumed that part of the reason for explicitly writing np.sin(x) everywhere rather than just importing the function and using sin(x) is to be explicit about what function you are using. That’s necessary because it really matters when there are so many different sin functions that are not interchangeable for different types of arguments. If there was an agreed global sin function (as there is in other scientific languages) then we could just drop the np. and write sin(x). I’m sure you’re very used to instinctively writing np.sin everywhere, but I personally think that the np. is boilerplate that clutters mathematical expressions: do you really want to write np.sin(np.arctan(x)+np.exp(y)) rather than sin(arctan(x)+exp(y))?

Another problem this could fix is:

In [3]: np.sin(np.array([1], dtype=object))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'int' object has no attribute 'sin'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-3-4ff984083e3b> in <module>
----> 1 np.sin(np.array([1], dtype=object))

TypeError: loop of ufunc does not support argument 0 of type int which has no callable sin method

If there was a globally dispatched sin function then np.sin could work happily with object arrays provided the associated objects overload sin (as is the case for symbolics in Matlab and for all types in Julia).

You seem skeptical but I think that core Python could be amenable to this if it was presented clearly and if it was clear that the scientific Python ecosystem was more or less united in thinking that it would be beneficial. The same problems exist to some extent in the stdlib with math, cmath, decimal, fractions. The problem doesn’t come up as much there though because almost everyone who would want these things uses third party modules in the scientific Python ecosystem. Unlike previous PEPs for core Python coming from the scientific side though this isn’t a major language change: it’s just adding a lightweight module based on single dispatch.

Hi everyone. I have been reading this thread (though couldn’t complete it all). I stumbled across the following statement and wanted to share some questions and thoughts in my mind. It is possible that these points might have been already discussed and resolved here or else where. Please feel free to point me to the resources/links of replies in that case.

First of all, I think the above statement assumes that for a single array type there will always be one backend, technically a one-to-one mapping between array types and backends. However, what if there is a one-to-many mapping between array types and backends? Say, for example, I am implementing matrix multiplication APIs for my own analysis purposes. The APIs accept CuPy arrays but deal with the elements in their own way (details don’t matter for now). The signature of these APIs is exactly the same, but they belong to different namespaces or domains in uarray terminology, for example mat_mul.impl1 and mat_mul.impl2. Now, I register both of these with uarray (first mat_mul.impl1 and second mat_mul.impl2). Due to the LCFS order followed here, mat_mul.impl2 will always be executed first.

Now, what if I want to direct some of my API calls to mat_mul.impl1 for some cases of inputs? Is the only way to deal with this to de-register both backends and then change the order of registration in uarray? Is something like modifying priorities of backends in uarray possible? Say something like uarray.register_backend(mat_mul.impl1), then later on uarray.set_priority(mat_mul.impl1, 1000) (upgrading the priority and putting it above other backends). Conceptually, could we use a priority queue instead of a stack to determine the backend to be used, instead of assigning these priorities on the basis of how backends were registered originally?
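
To illustrate the idea (uarray has no set_priority API today; the registry below is only a sketch of the concept, not real uarray code), a priority-ordered backend registry could look roughly like this:

_registry = []  # list of (priority, backend); higher priority is asked first


def register_backend(backend, priority=0):
    _registry.append((priority, backend))


def set_priority(backend, priority):
    # Re-prioritize an already registered backend in place.
    for i, (_, b) in enumerate(_registry):
        if b is backend:
            _registry[i] = (priority, b)


def backends_in_order():
    # Higher priority first; among equal priorities, fall back to LCFS
    # order (last registered first), matching the current behaviour.
    return [b for _, b in sorted(reversed(_registry), key=lambda e: -e[0])]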

P.S. - I am just a newbie, so please feel free to correct where you find me wrong. In addition, it would be great if you can provide a detailed explanation for the same (will be helpful for my learnings). Thanks.