Please try out the latest main branch, and let us know if you uncover any issues. This change should improve import speed and, more importantly, allow interactive exploration of the entire namespace.
import skimage as ski
ski.filters.gaussian(...) # This now works!
If you’d prefer all modules to be imported on load, you can set the environment variable EAGER_IMPORT=1.
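For readers who have not seen the mechanism, here is a minimal sketch of how a package can expose a submodule lazily using only a module-level `__getattr__` (PEP 562). It is not the scikit-image implementation; the package name `lazy_demo`, its `filters` submodule, and the stand-in `gaussian` function are all made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout; "lazy_demo" and "filters" are illustrative names.
INIT_SOURCE = '''\
import importlib

_SUBMODULES = ["filters"]


def __getattr__(name):
    # PEP 562 hook: runs only when normal attribute lookup on the module fails.
    if name in _SUBMODULES:
        return importlib.import_module(f"{__name__}.{name}")
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")


def __dir__():
    # Advertise lazy submodules so tab completion can find them.
    return sorted(set(globals()) | set(_SUBMODULES))
'''

FILTERS_SOURCE = '''\
def gaussian(values):
    """Stand-in for a real filter; returns its input as a list."""
    return list(values)
'''

# Write the throwaway package to a temporary directory and make it importable.
root = Path(tempfile.mkdtemp())
pkg = root / "lazy_demo"
pkg.mkdir()
(pkg / "__init__.py").write_text(INIT_SOURCE)
(pkg / "filters.py").write_text(FILTERS_SOURCE)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import lazy_demo

loaded_before = "lazy_demo.filters" in sys.modules  # submodule not imported yet
result = lazy_demo.filters.gaussian([1, 2, 3])       # first access triggers the import
loaded_after = "lazy_demo.filters" in sys.modules
print(loaded_before, loaded_after, result)
```

After the first access, the import machinery binds the submodule as an attribute of the package, so later lookups no longer go through `__getattr__`.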
This looks great! Are you planning to do this for SciPy? If not, I am happy to do it. I quickly looked at the scikit-image PR and it seems like there is not much to do. Or do you advise waiting a bit?
When I originally proposed this, I don’t think the reception from SciPy was very warm; perhaps because we got burnt in that project with trying this early on. But, perhaps worth having a discussion about it now that we have a proof of concept on the table?
The evidence will be stronger once we’ve released scikit-image and have a few months of use under our belt, which should allow us to surface any complaints.
Having said that, we haven’t had any complaints about it on the napari side.
For what it’s worth, I would support this. It’s IMO a good idea and we should be able to do it.
It is so convenient for users to have this, especially newcomers.
I also think it’s good for the respective “brands”. NumPy has np and we should have the equivalent for SciPy, scikit-image, etc.
If PEP 690 existed, I think lazy.attach etc would not be required, and e.g. skimage/filters/__init__.py could just contain a series of normal import statements of its submodules, which PEP 690 could make lazy.
Currently the PEP is geared towards application developers making imports in their application lazy, not so much libraries making their own internal imports lazy by default, e.g. for interactive use. If “lazy by default” is important here, it may require adding something to the PEP to allow a module to declare all imports within it lazy by default.
Would welcome comments on the PEP from anyone involved in lazy importing in the scientific Python community.
Not sure what others think about it. Not having to attach seems great, but e.g. NetworkX tried to use LazyLoader on NumPy and that did not go well (I think this is because of the missing implicit from . import submodule which adds submodule to the namespace).
That is easy to fix (and mentioned in the NEP), but we are wondering if that is something that can be fixed in LazyLoader itself. From that perspective, would it be possible to add an opt-in at the module rather than user level? Say that networkx could add from __future__ import lazy_load_this_package to mark that user opt-in is not required for using lazy loading?
Right now, I think there is no way for a library author to say that the library should always be lazy-loaded? That might be useful, since it would cover the current use-case of lazy-loading that is implemented in networkx, sklearn, and scipy.
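For reference, this is roughly what using the standard library's LazyLoader looks like. The helper below follows the recipe in the importlib documentation; the choice of `colorsys` as the target module is arbitrary. It only shows the basic mechanism on a single top-level module and does not attempt to reproduce the submodule problem mentioned above.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose body is executed on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up deferral; the module body has not run yet
    return module


colorsys = lazy_import("colorsys")        # module body not executed here
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # first attribute access triggers the load
print(hsv)
```

Note that the decision to be lazy sits with whoever calls `lazy_import`, which is the point raised above: the library itself has no declarative way to request this.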
IIUC in PEP 690, submodule imports of the form from . import submod will become lazy themselves (assuming laziness is enabled). However I think that from .submod import func will break the laziness.
What I am unsure about is whether the -L flag is strictly necessary to use PEP 690? Making -L necessary requires a decision by the user/application. Which is still very useful!
However, with the lazy-loading as done here, the decision/implementation is within networkx, scipy, etc. to lazy-load their submodules. They know that this is always OK, so application/user opt-in is not necessary. It is nice to not (strictly) require the user to know whether using -L would make their use-case faster.
This is true for LazyLoader, but not true for PEP 690. With PEP 690 all forms of import can be lazy; the latter form will delay the import until func is referenced.
Yes, this is I think the current mis-alignment between PEP 690 and the scikit et al use case. PEP 690 (and LazyLoader) are missing a way for a library to say “I want all imports within this module to be lazy, whether the application/end user requests it or not.” This is not hard to implement within PEP 690, the main issues are a) syntax (a __future__ import doesn’t make sense unless we plan to make it default behavior in the future), and b) do we then need a way for end users to override that library choice? (I would think no? But lazy.attach does provide this via the EAGER_IMPORT env var, so maybe we do.)
The EAGER_IMPORT environment variable is not intended for library users to override the lazy importing in scikit-image. We only introduced it to enable use on one of our CI testing jobs to rule out the presence of potential circular imports.
Right, so our lazy loader is a bit less lenient than the default importer when it comes to circular imports, so we added that option to make imports happen immediately and generate early failures in CI jobs.
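To illustrate the idea, here is a rough sketch of how such an eager switch can sit on top of a lazy `__getattr__`. It is an assumption-laden illustration, not the actual scikit-image code: `make_lazy_getattr`, its `importer` and `environ` parameters, and the package name `pkg` are hypothetical, and the variable name follows the `EAGER_IMPORT=1` spelling given at the top of the thread.

```python
import importlib
import os


def make_lazy_getattr(package_name, submodules,
                      importer=importlib.import_module, environ=os.environ):
    """Build a module-level __getattr__ that imports submodules on demand.

    Hypothetical helper: `importer` and `environ` are injectable only so the
    behaviour can be demonstrated without a real package on disk.
    """
    def __getattr__(name):
        if name in submodules:
            return importer(f"{package_name}.{name}")
        raise AttributeError(f"module {package_name!r} has no attribute {name!r}")

    # With the switch set, resolve every submodule right away so that broken
    # or circular imports fail at package import time, e.g. in a CI job.
    if environ.get("EAGER_IMPORT", ""):
        for name in submodules:
            __getattr__(name)

    return __getattr__


imported = []


def recording_importer(name):
    # Stand-in for importlib.import_module that only records what was requested.
    imported.append(name)
    return name


make_lazy_getattr("pkg", ["a", "b"], importer=recording_importer, environ={})
lazy_calls = list(imported)   # nothing imported without the switch

make_lazy_getattr("pkg", ["a", "b"], importer=recording_importer,
                  environ={"EAGER_IMPORT": "1"})
eager_calls = list(imported)  # both submodules imported immediately
print(lazy_calls, eager_calls)
```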
Regarding whether this should be optional or not: one of the motivating reasons behind SPEC-1 is for libraries to import subpackages into their main namespace by default. Libraries typically don’t do this because it is slow, but it turns out to be very helpful for interactive exploration & teaching to have all the subpackages there from the start. It also avoids repeatedly having to jump to the beginning of a source file to keep adding imports, when referencing skimage.filters.gaussian is perfectly clear without the matching from skimage import filters.
I suspect if lazy loading cannot be enabled by default, libraries will go back to not importing those subpackages, since it will slow down imports for so many users.