SPEC 1 — Lazy Loading for Submodules

Recommends lazy loading of submodules so that a package's namespace is easily accessible without compromising performance.

I think this is missing an important piece of the history with respect to scipy. Some of the earliest releases may have imported all of the sub-modules in scipy/__init__.py, but we very quickly moved to a lazy import mechanism (PackageLoader). We eventually dropped the lazy import mechanism because it failed too frequently in very confusing ways, especially at interactive prompts. So the history was “Greedy Imports → Lazy Imports → No Imports”.

I’m sure modern tooling is better, since the Python import system has gained functionality that we can rely on, but the SPEC should acknowledge that this is what might let us return to an approach we once abandoned.


TensorFlow may also be a useful precedent here, e.g., see:

https://github.com/tensorflow/tensorflow/blob/v2.5.0/tensorflow/api_template.init.py

In the past, my experience with the lazy module loader in TensorFlow was not great (it placed restrictions on how you could do imports) but that seems to have been fixed now.

Still, there are lots of potential subtle incompatibility issues. For example, the editor in Google’s own Colab IPython notebook environment doesn’t recognize some TensorFlow sub-modules as valid:
[image: screenshot from the Colab editor]

Thank you, Robert. I wasn’t aware (or forgot about?) this part of the SciPy history. The implementation I have now is very simple, and does not make use of any magic other than overriding __getattr__ on the module (which is the relevant development on the Python side). My hope is that this would mean a low likelihood of breakage.
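For readers unfamiliar with the mechanism, a module-level __getattr__ (PEP 562, available since Python 3.7) can be sketched roughly as follows. This is only an illustration and not the skimage implementation: the helper name make_lazy_getattr and the stand-in namespace lazy_xml are hypothetical, and the standard library's xml package is used purely so that the example runs on its own.

```python
import importlib
import sys
import types


def make_lazy_getattr(package_name, submodules):
    """Build a module-level __getattr__ that imports submodules on first access."""

    def __getattr__(name):
        if name in submodules:
            # The real import happens here, not when the package is imported.
            return importlib.import_module(f"{package_name}.{name}")
        raise AttributeError(f"module {package_name!r} has no attribute {name!r}")

    return __getattr__


# Stand-in for a package's __init__.py: an empty module object that exposes
# the stdlib `xml` submodules lazily (hypothetical demo namespace).
lazy_xml = types.ModuleType("lazy_xml")
lazy_xml.__getattr__ = make_lazy_getattr("xml", {"dom", "etree", "sax"})

etree = lazy_xml.etree  # triggers the import of xml.etree
print(etree is sys.modules["xml.etree"])  # → True
```

In a real package the function would simply be defined as __getattr__ at the top level of __init__.py; attaching it to a module object is done here only to keep the sketch in one file.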

After a suggestion by Jon Crall, I will also add an environment variable to enable greedy importing, for debugging purposes.
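One way such a switch could work is sketched below. The variable name DEMO_EAGER_IMPORT and the helper attach_lazy are made up for this illustration (the thread does not say what the variable will be called), and the stdlib xml package again stands in for a real library.

```python
import importlib
import os
import types


def attach_lazy(module, package_name, submodules, env_var="DEMO_EAGER_IMPORT"):
    """Install a lazy __getattr__ on `module`; import eagerly if env_var is set."""

    def __getattr__(name):
        if name in submodules:
            return importlib.import_module(f"{package_name}.{name}")
        raise AttributeError(f"module {module.__name__!r} has no attribute {name!r}")

    module.__getattr__ = __getattr__

    # Debugging escape hatch (hypothetical variable name): any non-empty value
    # forces every submodule to be imported immediately, so import errors
    # surface when the package is imported rather than on first use.
    if os.environ.get(env_var, ""):
        for name in sorted(submodules):
            setattr(module, name, __getattr__(name))
    return module


SUBMODULES = {"dom", "etree", "sax"}

os.environ.pop("DEMO_EAGER_IMPORT", None)
lazy_pkg = attach_lazy(types.ModuleType("lazy_pkg"), "xml", SUBMODULES)

os.environ["DEMO_EAGER_IMPORT"] = "1"
eager_pkg = attach_lazy(types.ModuleType("eager_pkg"), "xml", SUBMODULES)
os.environ.pop("DEMO_EAGER_IMPORT", None)  # leave the environment clean

print(sorted(SUBMODULES & set(vars(lazy_pkg))))   # → []
print(sorted(SUBMODULES & set(vars(eager_pkg))))  # → ['dom', 'etree', 'sax']
```

With the variable set, a broken submodule fails loudly at import time, which is the behaviour one wants when tracking down a confusing lazy-import failure.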

Thanks for making me aware of TensorFlow’s approach, Stephan.

I can understand how installing lazy modules can cause all sorts of issues with editors. We started that way, but the latest implementation is a very simple override of only __dir__ and __getattr__. This means that editors can introspect the way they normally would, and that they would encounter the same objects they would with non-lazy imports (i.e., no proxy objects).
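To make the "no proxy objects" point concrete, here is a small sketch under the same assumptions as before: the names pkg, _lazy_getattr, and _lazy_dir are hypothetical, and the stdlib xml package stands in for a real library. It checks that dir() advertises the submodules before they are loaded and that attribute access returns the very same module object an ordinary import would.

```python
import importlib
import types
import xml.sax  # ordinary eager import, used only for comparison

SUBMODULES = ["dom", "etree", "sax"]
pkg = types.ModuleType("lazy_pkg")  # stand-in for a package's __init__.py


def _lazy_getattr(name):
    if name in SUBMODULES:
        submodule = importlib.import_module(f"xml.{name}")
        setattr(pkg, name, submodule)  # cache so later lookups skip __getattr__
        return submodule
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


def _lazy_dir():
    # Advertise lazily available submodules alongside what is already defined.
    return sorted(set(vars(pkg)) | set(SUBMODULES))


pkg.__getattr__ = _lazy_getattr
pkg.__dir__ = _lazy_dir

print("sax" in dir(pkg))                  # → True, listed before being loaded
print(pkg.sax is xml.sax)                 # → True, the real module, not a proxy
print(type(pkg.sax) is types.ModuleType)  # → True
```

Because the object handed back is the actual module, tools that introspect it see exactly what they would see after a plain import; whether a given editor's static analysis follows __getattr__ is a separate question that this sketch does not answer.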

I have tested the skimage PR with IPython, but not yet with any of the other editors. If you have one of those set up and could give it a try, that would be great.

Also, would you like me to mention this in the SPEC, test the mechanism in some specific way, or are you expressing concern about the approach overall?

I commented primarily to clear up the history being told in the SPEC.

I’m not especially looking forward to reading code invoking scipy.linalg.whatever() all over the place (though I am looking forward to interactively tab-completing my way to the skimage functions I want but never remember if they are in skimage.filters or skimage.morphology).

@rkern I have started a PR to correct the history. I’ve used your words directly—is that OK? Maybe you would consider being co-author on the SPEC and letting me know of any other context I am missing.

W.r.t. scipy.linalg.whatever, hopefully people will at least use sp.linalg.whatever. But we could also write a recommendation that submodules still be imported explicitly, unless there is a naming conflict of some sort (numpy.linalg, networkx.linalg, etc.).