I created a repo that demonstrates dispatching to different backends with a uarray-based approach.
Benchmark results and some discussion are given in the README there. I mostly autogenerated the uarray multimethods on the scikit-image side using a script provided in that repository. The multimethods do add a layer of indirection, though, which can make the code a little harder for new contributors to understand. The Dask backend there offers a more user-friendly way of using apply_parallel: a user would just have to register a dask_backend and would never have to call apply_parallel (or the equivalent dask.array.map_blocks) directly.
We should make some corresponding demos of what the array_namespace approach allows; it will become possible with NumPy 1.22. That approach is relatively simple to understand, but it doesn’t help with any of our functions that currently use Cython code.
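As a rough idea of what such a demo could look like, here is a minimal sketch of a pure-Python function written against the array's own namespace. The function name rescale_to_unit is hypothetical, and the fallback to plain NumPy for arrays that lack __array_namespace__ is my own assumption about how we might handle that case, not something that has been decided.

```python
import numpy as np


def rescale_to_unit(image):
    """Linearly rescale intensities to [0, 1] using the array's own namespace.

    Hypothetical example function, not part of scikit-image.
    Assumes the image is not constant (max > min).
    """
    if hasattr(image, "__array_namespace__"):
        # Array API compliant input: ask the array for its namespace.
        xp = image.__array_namespace__()
    else:
        # Assumed fallback for arrays without the protocol.
        xp = np
    lo = xp.min(image)
    hi = xp.max(image)
    return (image - lo) / (hi - lo)


out = rescale_to_unit(np.asarray([0.0, 5.0, 10.0]))
```

Only min, max and arithmetic operators are used, so any library that implements those parts of the array API standard should work without a separate backend. A function whose inner loop lives in Cython has no equivalent shortcut, which is the limitation noted above.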