Putting on my skimage hat: we’d certainly love to be more performant. The question is always at what cost to the mostly volunteer development team.
We tend to rely on more foundational libraries in the ecosystem (NumPy, Cython, Pythran) to implement optimizations. We are also quite interested in wholesale dispatching of some sort, so that GPU-optimized implementations of skimage functions can be called through the skimage API.
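To make the dispatching idea a bit more concrete, here is a rough sketch of what I mean. This is not how skimage does (or will do) it. The registry, the `dispatchable` and `register_backend` helpers, and the `fake_gpu` backend are all hypothetical names made up for illustration, and plain nested lists stand in for image arrays so the snippet runs with only the standard library.

```python
import functools

# Hypothetical registry: function name -> {backend name: implementation}.
_IMPLEMENTATIONS = {}
_active_backend = "default"


def dispatchable(func):
    """Mark a public function as overridable by an alternative backend."""
    _IMPLEMENTATIONS.setdefault(func.__name__, {})["default"] = func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Fall back to the reference implementation if the active
        # backend does not provide this function.
        impl = _IMPLEMENTATIONS[func.__name__].get(_active_backend, func)
        return impl(*args, **kwargs)

    return wrapper


def register_backend(func_name, backend):
    """Register an alternative implementation for a dispatchable function."""
    def decorator(impl):
        _IMPLEMENTATIONS.setdefault(func_name, {})[backend] = impl
        return impl
    return decorator


def set_backend(backend):
    global _active_backend
    _active_backend = backend


@dispatchable
def flip_horizontal(image):
    """Reference implementation: reverse each row of a nested-list image."""
    return [row[::-1] for row in image]


backend_calls = []  # records which backend handled each call


@register_backend("flip_horizontal", "fake_gpu")
def _flip_horizontal_fake_gpu(image):
    # Stand-in for an accelerated implementation with the same signature.
    backend_calls.append("fake_gpu")
    return [list(reversed(row)) for row in image]


image = [[1, 2, 3], [4, 5, 6]]
default_result = flip_horizontal(image)   # handled by the reference code
set_backend("fake_gpu")
gpu_result = flip_horizontal(image)       # same call, different backend
set_backend("default")
```

The point is that user code keeps calling the same function, and the choice of implementation happens behind the API. A real design would also need to deal with array types, fallbacks and testing across backends, which is where most of the cost lies.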
The more generic enhancements you propose, like building for more modern chipsets or improving NumPy, all sound feasible and worth pursuing.