The goal of backend dispatching in scikit-image is to provide compatibility with various backends, written in different languages, for all types of hardware and architectures, without breaking any existing code or the current user-facing API. This is crucial for extending the library’s capabilities and ensuring it runs well in diverse environments.
Current Implementation in scikit-image #7520
- Uses `importlib.metadata.entry_points()` to list and filter available backends via the `skimage_backends` and `skimage_backend_infos` entry point groups. NetworkX has a similar Python `entry_point`-based dispatching implementation (see the NetworkX `backends.py` link in the references below).
- Dispatching can be disabled by setting the environment variable `SKIMAGE_NO_DISPATCHING=1`.
- Functions are marked as dispatchable using a `@dispatchable` decorator, which checks for backend implementations of the function when it is called. (This would become a class in the future to enhance the dispatching capabilities and to better manage dispatchable algorithms.)
- `can_has` is the mechanism that allows a backend to accept or reject a function call based on the function name and the arguments passed in. If it returns `False`, dispatch moves on to the next backend.
- Currently, users cannot explicitly specify the backend they wish to use. Instead, all installed backends are sorted alphabetically, and the dispatch mechanism selects the first backend that both implements the function and whose `can_has` method does not return `False`. This backend’s implementation is then used for the function call.
- The `BackendInformation` class allows backends to specify additional information about the functions they support.
- A `DispatchNotification` warning is issued when a function is dispatched to a backend, informing the user about the dispatch.
- Backend discovery is cached using `functools.cache` to avoid repeated lookups.
- If no backend implements the function or dispatching is disabled, the original scikit-image function is used.
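To make the flow above concrete, here is a minimal sketch of how these pieces could fit together. It is not the code from #7520: the backend interface (an `implementations` mapping plus a `can_has` callable), the use of the bare function name as the lookup key, and the `make_dispatchable` helper are all assumptions made for illustration. Only `importlib.metadata.entry_points`, `functools.cache`, the `skimage_backends` group name, and the `SKIMAGE_NO_DISPATCHING` variable come from the description above.

```python
import functools
import os
import warnings
from importlib.metadata import entry_points
from types import SimpleNamespace


class DispatchNotification(RuntimeWarning):
    """Stand-in for the warning issued when a call is dispatched."""


@functools.cache
def discover_backends():
    """Load installed backends once, sorted alphabetically by name."""
    found = entry_points(group="skimage_backends")  # keyword form needs Python 3.10+
    return {ep.name: ep.load() for ep in sorted(found, key=lambda ep: ep.name)}


def make_dispatchable(discover=discover_backends):
    """Build a decorator; `discover` is injectable (hypothetical helper)."""

    def dispatchable(func):
        name = func.__name__  # hypothetical lookup key for backends

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if os.environ.get("SKIMAGE_NO_DISPATCHING") == "1":
                return func(*args, **kwargs)
            for backend_name, backend in discover().items():
                # Assumed backend interface: `implementations` and `can_has`.
                impl = backend.implementations.get(name)
                if impl is None or not backend.can_has(name, *args, **kwargs):
                    continue  # move on to the next backend
                warnings.warn(
                    f"'{name}' was dispatched to backend '{backend_name}'",
                    DispatchNotification,
                    stacklevel=2,
                )
                return impl(*args, **kwargs)
            return func(*args, **kwargs)  # no backend accepted the call

        return wrapper

    return dispatchable


# A fake in-memory backend that only accepts integer inputs.
fake_backend = SimpleNamespace(
    implementations={"double": lambda x: ("fake", x * 2)},
    can_has=lambda name, *args, **kwargs: isinstance(args[0], int),
)
dispatchable = make_dispatchable(lambda: {"fake": fake_backend})


@dispatchable
def double(x):
    return ("skimage", x * 2)
```

In this sketch `double(2)` is handled by the fake backend, while `double(1.5)` is rejected by `can_has` and falls through to the original function. Setting `SKIMAGE_NO_DISPATCHING=1` skips the backend loop entirely.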
Future Enhancements
To make the system more flexible and robust, several improvements are proposed:
- Enabling dispatching mechanisms other than environment variable-based, such as:
  - Kwarg-based dispatching: Checks the function signature for a `backend=` kwarg and dispatches based on the kwarg’s value.
  - Type-based dispatching: Dispatch based on the input array types (e.g., `cupy -> cupy`).
- Backend-Specific Arguments: If necessary, introduce support for backend-specific arguments, ensuring clear guidelines and documentation for their usage.
  - A few far-reaching extensions (ideas) of this:
    - Support for non-`scikit-image` algorithms in the backend.
    - Allow dispatching from one backend to another if supported. A single backend supporting multiple array types. Ensure that different array types (e.g., `cupy`, `numpy`) work seamlessly with the backends.
- Fallback: Instead of falling back to scikit-image’s implementation when the selected backend lacks the required support or doesn’t have the implementation, we fall back to some other backend(s) based on a backend priority list provided by the user.
  - Compatibility among multiple array types.
- Testing the dispatching system should focus on:
  - General tests applicable to all backends, like simple unit tests ensuring they are correctly discovered through `scikit-image`’s entry points. (relevant PR)
  - Running scikit-image tests for backends to test their algorithms (opt-in).
    - Implemented in NetworkX; might or might not be a preferred feature for the scikit-image backends.
- Dispatching Docs:
  - Displaying which backend implementations exist for an algorithm on scikit-image’s user-facing docs website.
  - Key points from the related issue #7550:
    - Using JSON documents to track backend-supported functions and version compatibility, allowing real-time updates without new scikit-image releases. (But is this possible with Sphinx?)
    - Suggests dynamically updating function docstrings to show available backends, with concerns about potential side effects.
    - Explores handling new backends after release and providing more tools for inspecting backend states.
  - Docs for backend developers (how to create a backend, etc.), docs for backend users, and general docs (how dispatching is set up and how it works, etc.; this can be an enhancement proposal doc).
- Better Introspection
- Version Compatibility: Define guidelines for version compatibility between `scikit-image` and its backends, including how new versions of backends are supported. Additionally, some general guidelines for backend developers will be beneficial.
- Challenges:
  - Functions that don’t fit in this dispatching machinery?
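As a rough illustration of the kwarg-based dispatching, type-based dispatching, and priority-list fallback ideas listed above, here is a small self-contained sketch. Nothing in it is an existing or agreed scikit-image API: the `BACKENDS` registry, the `TYPE_TO_BACKEND` mapping, the `FakeGpuArray` class (standing in for something like a CuPy array), and the choice to accept either a string or a list for `backend=` are all hypothetical.

```python
import functools


class FakeGpuArray(list):
    """Stand-in for a non-NumPy array type such as a CuPy array."""


# Hypothetical registry: backend name -> {function name: implementation}.
BACKENDS = {
    "gpu": {"invert": lambda image: FakeGpuArray(255 - px for px in image)},
    "partial": {},  # installed, but does not implement `invert`
}

# Hypothetical mapping used for type-based dispatching.
TYPE_TO_BACKEND = {FakeGpuArray: "gpu"}


def dispatchable(func):
    @functools.wraps(func)
    def wrapper(*args, backend=None, **kwargs):
        if backend is not None:
            # Kwarg-based: a single name or a user-provided priority list.
            candidates = [backend] if isinstance(backend, str) else list(backend)
        else:
            # Type-based: pick backends from the types of positional inputs.
            candidates = [
                TYPE_TO_BACKEND[type(arg)]
                for arg in args
                if type(arg) in TYPE_TO_BACKEND
            ]
        for name in candidates:
            impl = BACKENDS.get(name, {}).get(func.__name__)
            if impl is not None:
                return impl(*args, **kwargs)
        # No candidate backend implements the function.
        return func(*args, **kwargs)

    return wrapper


@dispatchable
def invert(image):
    """Reference implementation operating on plain lists."""
    return [255 - px for px in image]
```

With this sketch, `invert(FakeGpuArray([0]))` is routed to the `gpu` backend by input type, `invert([10], backend="gpu")` by keyword, and `invert([10], backend=["partial", "gpu"])` skips `partial` and uses `gpu`. Whether the final fallback should be the reference implementation or an error is one of the open design questions.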
Any thoughts on the broader scientific Python ecosystem and entry-point-based dispatching (`spatch`, SPEC 2, etc.)?
- What kind of projects should adopt it?
- What are the parts of this `entry-point`-based dispatching that would be common to most libraries (like backend discovery) and what are the parts that would be specific to a particular library?
Some References and Related Resources:
- networkx/networkx/utils/backends.py at main · networkx/networkx · GitHub
- Basic infrastructure for dispatching to a backend by betatim · Pull Request #7520 · scikit-image/scikit-image · GitHub
- Adding dispatching to scikit-image by Schefflera-Arboricola · Pull Request #7513 · scikit-image/scikit-image · GitHub
- WIP: cucim backend for skimage by JoOkuma · Pull Request #7466 · scikit-image/scikit-image · GitHub
- META Backend dispatching discussion · Issue #7550 · scikit-image/scikit-image · GitHub
- Dispatching and backend selection - HackMD
- Design space of dispatching an - HackMD
- Array API Discussion - HackMD
- Array API adoption · Issue #12 · scientific-python/summit-2024 · GitHub
- Spatch requirements for reuse in other libraries · Issue #1 · scientific-python/spatch · GitHub
- Start of a prototype for string-based, very strict, type dispatching by seberg · Pull Request #2 · scientific-python/spatch · GitHub
- https://youtu.be/16rB-fosAWw?si=n1pdlQwAhVNcyvlp
- specs/spec-0002/index.md at main · scientific-python/specs · GitHub
- Requirements and discussion of a type dispatcher for the ecosystem
- SPEC 2 — API Dispatch
- A proposed design for supporting multiple array types across SciPy, scikit-learn, scikit-image and beyond
- Computational engine plugin API with a "generic API" by betatim · Pull Request #24826 · scikit-learn/scikit-learn · GitHub
- [DRAFT] Engine plugin API and engine entry point for Lloyd's KMeans by ogrisel · Pull Request #24497 · scikit-learn/scikit-learn · GitHub
- https://youtu.be/HVLPJnvInzM?si=bKc57pYAwSbqL8Tn
- https://youtu.be/57DXVHOxdAI?si=Ltm28YNR083BITub
- Array Libraries Interoperability | Quansight Labs
- Dispatch by types example · GitHub
Thank you