From brett at python.org Sat Jul 8 19:56:05 2017
From: brett at python.org (Brett Cannon)
Date: Sat, 08 Jul 2017 23:56:05 +0000
Subject: [Import-SIG] Proposal for a lazy-loading finder
Message-ID:

At PyCon US I found out that even though I tried to minimize people using
importlib.util.LazyLoader, a bunch of people are using it beyond the intended
audience, which was advanced Python users such as the Mercurial team.
Knowing that people were somewhat ignoring the warnings about the dangers
of using LazyLoader, I figured it was finally time to implement a
lazy-loading finder to make sure people don't duplicate the same work and
to make sure that it is implemented properly and can change as importlib
itself does. (Plus I needed a coding break from workflow stuff while I was
stuck in the US for a conference :) .

Please have a look at
https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
and let me know if I'm missing anything. I asked on Twitter if this would
work for Mercurial and it turns out it closely mirrors what they already do:
https://twitter.com/sid0/status/882775009051123712 (which is great since I
didn't look at their code to avoid any GPL issues even though our
relationship with the Mercurial devs is good enough to not have them sue us
over code re-use :) . With independent verification that the approach works
I'm fairly confident this can go into Python 3.7, but I still wanted to
double-check with this mailing list to make sure the API design and approach
seem sound.
From ncoghlan at gmail.com Sun Jul 9 03:02:46 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 9 Jul 2017 17:02:46 +1000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References:
Message-ID:

On 9 July 2017 at 09:56, Brett Cannon wrote:
> Please have a look at
> https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
> and let me know if I'm missing anything. I asked on Twitter if this would
> work for Mercurial and it turns out it closely mirrors what they already do:
> https://twitter.com/sid0/status/882775009051123712 (which is great since I
> didn't look at their code to avoid any GPL issues even though our
> relationship with the Mercurial devs is good enough to not have them sue us
> over code re-use :) . With independent verification that the approach works
> I'm fairly confident this can go into Python 3.7, but I still wanted to
> double-check with this mailing list to make sure the API design and approach
> seem sound.

The technical approach seems sound, but from a naming perspective I'd
suggest going with the more self-explanatory "ensure_lazy" and
"ensure_eager" rather than "whitelist" and "blacklist".

I'd also suggest that it would be nice to be able to do prefix
matching rather than having to separately list every submodule of a
package in either ensure_lazy or ensure_eager.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sun Jul 9 03:14:08 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 9 Jul 2017 17:14:08 +1000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References:
Message-ID:

On 9 July 2017 at 17:02, Nick Coghlan wrote:
> The technical approach seems sound, but from a naming perspective I'd
> suggest going with the more self-explanatory "ensure_lazy" and
> "ensure_eager" rather than "whitelist" and "blacklist".
One minor technical nit that I missed on first reading: rather than
rebinding sys.path_hooks and sys.path_importer_cache when activating
lazy loading, I'd suggest modifying them in-place via slice assignment
and dict.update.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Sun Jul 9 06:03:41 2017
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 9 Jul 2017 12:03:41 +0200
Subject: [Import-SIG] Proposal for a lazy-loading finder
References:
Message-ID: <20170709120341.2647c8df@fsol>

On Sat, 08 Jul 2017 23:56:05 +0000 Brett Cannon wrote:
> At PyCon US I found out that even though I tried to minimize people using
> importlib.util.LazyLoader, a bunch of people are using it beyond the intended
> audience, which was advanced Python users such as the Mercurial team.
> Knowing that people were somewhat ignoring the warnings about the dangers
> of using LazyLoader, I figured it was finally time to implement a
> lazy-loading finder to make sure people don't duplicate the same work and
> to make sure that it is implemented properly and can change as importlib
> itself does. (Plus I needed a coding break from workflow stuff while I was
> stuck in the US for a conference :) .
>
> Please have a look at
> https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb

Are LazyLoadingFinder() and activate_lazy_loading() as quoted there
really all that's needed? Sounds good :-)

I would suggest allowing people to fully customize the blacklist /
whitelist logic using a callable (because looking up by fullname is a
bit inflexible).

Regards

Antoine.
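
For reference, a minimal sketch of the in-place modification Nick describes
above, using slice assignment and dict.update. The function name and the
stand-in list and dict are illustrative assumptions, not code from Brett's
notebook. The thread does not state the motivation; presumably the point is
that any code already holding a reference to the list or dict keeps seeing
the current state, which rebinding the sys attributes would not guarantee.

```python
def replace_in_place(hooks, cache, new_hooks, new_cache_entries):
    """Swap in new hooks and cache entries without rebinding either object."""
    hooks[:] = new_hooks             # slice assignment keeps the same list object
    cache.update(new_cache_entries)  # dict.update keeps the same dict object


# Stand-ins for sys.path_hooks and sys.path_importer_cache, so the sketch
# can be run without touching real import state.
path_hooks = ["eager_hook"]
path_importer_cache = {"/some/dir": "eager_finder"}

# References taken earlier by some other piece of code.
hooks_alias = path_hooks
cache_alias = path_importer_cache

replace_in_place(path_hooks, path_importer_cache,
                 ["lazy_hook"], {"/some/dir": "lazy_finder"})
```

After the call, hooks_alias and cache_alias refer to the same objects as
before and reflect the new contents. With the real import state the arguments
would be sys.path_hooks and sys.path_importer_cache.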
From ncoghlan at gmail.com Sun Jul 9 09:59:18 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 9 Jul 2017 23:59:18 +1000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To: <20170709120341.2647c8df@fsol>
References: <20170709120341.2647c8df@fsol>
Message-ID:

On 9 July 2017 at 20:03, Antoine Pitrou wrote:
> I would suggest allowing people to fully customize the blacklist /
> whitelist logic using a callable (because looking up by fullname is a
> bit inflexible).

Oh, I like that - and then we'd have "ensure_lazy" and "ensure_eager"
as callback factories that accepted a predefined list of names.

I think this is going to be one of the key changes in going from
"utility a project implements for itself if it needs it" to "general
purpose language level facility" - I could easily see the simple
predefined list approach working for a *particular* project, but I
think once we open this capability up to all Python users (rather than
only those that are comfortable with customising the import system at
runtime) it will prove to be inadequate.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From brett at python.org Mon Jul 10 14:48:22 2017
From: brett at python.org (Brett Cannon)
Date: Mon, 10 Jul 2017 18:48:22 +0000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References: <20170709120341.2647c8df@fsol>
Message-ID:

On Sun, 9 Jul 2017 at 06:59 Nick Coghlan wrote:

> On 9 July 2017 at 20:03, Antoine Pitrou wrote:
> > I would suggest allowing people to fully customize the blacklist /
> > whitelist logic using a callable (because looking up by fullname is a
> > bit inflexible).
>
> Oh, I like that - and then we'd have "ensure_lazy" and "ensure_eager"
> as callback factories that accepted a predefined list of names.
>

If we provide an optional callable argument then I would drop the
whitelist/ensure_lazy option.
It's easier to explain and the common case will be blacklisting a module
for lazy loading if you're implicitly flipping it on. For the whitelist
case we can add importlib.util.lazy_import() and people can just be
explicit (if this is important enough to even care about).

And I purposefully didn't do a prefix match for ensure_eager as it's only
meant for specific modules that have some try/except block which fails in
the face of lazy loading. And since that should be a per-module thing
instead of a per-package thing I don't want it over-extending. Plus
providing a callback solution lets people engineer their own prefix
matching solution if that's what they need.

>
> I think this is going to be one of the key changes in going from
> "utility a project implements for itself if it needs it" to "general
> purpose language level facility" - I could easily see the simple
> predefined list approach working for a *particular* project, but I
> think once we open this capability up to all Python users (rather than
> only those that are comfortable with customising the import system at
> runtime) it will prove to be inadequate.
>

Probably, which is partially why I have put off proposing the idea of
importlib.util.activate_lazy_loading() for so long (and yes, Antoine, that
function and LazyLoadingFinder are all that's needed :) . But I met multiple
people at PyCon US this year who thanked me for the lazy loader, which
suggests the note warning people against using the lazy loader is being
ignored, so I figured it was finally time to help make sure people at least
do it correctly.

-Brett

>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> https://mail.python.org/mailman/listinfo/import-sig
>
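
The importlib.util.lazy_import() that Brett mentions in the message above was
only an idea at this point in the thread. As a rough sketch of what such an
explicit helper might look like, the following builds on the existing
importlib.util.LazyLoader, find_spec() and module_from_spec(). The name
lazy_import, its signature, and the early return for already-imported modules
are assumptions made for illustration.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return module `name`, deferring its execution until first attribute access."""
    if name in sys.modules:
        # Already imported (eagerly or lazily); nothing left to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    # Wrap the real loader so exec_module() only prepares the lazy module.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module


# The module body runs on first attribute access, not here.
bisect = lazy_import("bisect")
```

The sketch does not handle namespace packages or loaders without
exec_module(); LazyLoader rejects those with a TypeError.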
From ericsnowcurrently at gmail.com Mon Jul 10 15:40:40 2017
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 10 Jul 2017 13:40:40 -0600
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References:
Message-ID:

On Sat, Jul 8, 2017 at 5:56 PM, Brett Cannon wrote:
> [snip]
> I figured it was finally time to implement a lazy-loading finder
> to make sure people don't duplicate the same work and to make sure that it
> is implemented properly and can change as importlib itself does.
> [snip]
>
> Please have a look at
> https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
> and let me know if I'm missing anything.

LGTM. It's good that the class mirrors FileFinder.

The only things I'd possibly suggest are:

* name the class LazyLoadingFileFinder
* make activate_lazy_loading() be a classmethod on PathFinder

Both make it clear that they only relate to path-entry finders.

I have some other concerns but they aren't problems with your
proposal. :) Mostly, I still want to see a better high-level
interface to the import machinery (i.e. "ImportSystem"). Having to
poke things directly into the import state (e.g. sys.path_hooks) isn't
ideal. However, I don't think my concerns are critical for this
proposal so I won't elaborate here. :)

-eric

From brett at python.org Mon Jul 10 15:49:21 2017
From: brett at python.org (Brett Cannon)
Date: Mon, 10 Jul 2017 19:49:21 +0000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References:
Message-ID:

On Mon, 10 Jul 2017 at 12:40 Eric Snow wrote:

> On Sat, Jul 8, 2017 at 5:56 PM, Brett Cannon wrote:
> > [snip]
> > I figured it was finally time to implement a lazy-loading finder
> > to make sure people don't duplicate the same work and to make sure that it
> > is implemented properly and can change as importlib itself does.
> > [snip]
> >
> > Please have a look at
> > https://notebooks.azure.com/Brett/libraries/di2Btqj7zSI/html/Lazy%20importing.ipynb
> > and let me know if I'm missing anything.
>
> LGTM. It's good that the class mirrors FileFinder.
>
> The only things I'd possibly suggest are:
>
> * name the class LazyLoadingFileFinder
>

This actually isn't restricted to FileFinder instances, it just so happens
that's the common case. The design is such that any finder will work where
the loader doesn't use a special object as the module instance, which is
most loaders.

> * make activate_lazy_loading() be a classmethod on PathFinder
>

That's an idea since it is only tweaking stuff that PathFinder cares about.
It will hurt discoverability, though.

>
> Both make it clear that they only relate to path-entry finders.
>

False for the first, true for the second. :)

>
> I have some other concerns but they aren't problems with your
> proposal. :) Mostly, I still want to see a better high-level
> interface to the import machinery (i.e. "ImportSystem"). Having to
> poke things directly into the import state (e.g. sys.path_hooks) isn't
> ideal. However, I don't think my concerns are critical for this
> proposal so I won't elaborate here. :)
>

Beyond activate_lazy_loading() I don't think anything else would require
tweaking in a restructuring of the import system.
From ncoghlan at gmail.com Mon Jul 10 21:55:15 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Jul 2017 11:55:15 +1000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References: <20170709120341.2647c8df@fsol>
Message-ID:

On 11 July 2017 at 04:48, Brett Cannon wrote:
>
>
> On Sun, 9 Jul 2017 at 06:59 Nick Coghlan wrote:
>>
>> On 9 July 2017 at 20:03, Antoine Pitrou wrote:
>> > I would suggest allowing people to fully customize the blacklist /
>> > whitelist logic using a callable (because looking up by fullname is a
>> > bit inflexible).
>>
>> Oh, I like that - and then we'd have "ensure_lazy" and "ensure_eager"
>> as callback factories that accepted a predefined list of names.
>
>
> If we provide an optional callable argument then I would drop the
> whitelist/ensure_lazy option. It's easier to explain and the common case
> will be blacklisting a module for lazy loading if you're implicitly flipping
> it on. For the whitelist case we can add importlib.util.lazy_import() and
> people can just be explicit (if this is important enough to even care
> about).

Sorry, I was overly terse.
By callback factories, I meant something like:

    def ensure_eager_exact(names):
        """Disable lazy loading for named modules"""
        names = set(names)
        def lazy_load_filter(fullname):
            return fullname not in names
        return lazy_load_filter

    def ensure_lazy_exact(names):
        """Only enable lazy loading for named modules"""
        names = set(names)
        def lazy_load_filter(fullname):
            return fullname in names
        return lazy_load_filter

    def ensure_eager_by_prefix(names):
        """Disable lazy loading for named modules and their submodules"""
        names = set(names)
        def lazy_load_filter(fullname):
            parts = fullname.split(".")
            # Check every parent package name and the full name itself.
            prefixes = (".".join(parts[:i]) for i in range(1, len(parts) + 1))
            return all(prefix not in names for prefix in prefixes)
        return lazy_load_filter

    def ensure_lazy_by_prefix(names):
        """Only enable lazy loading for named modules and their submodules"""
        names = set(names)
        def lazy_load_filter(fullname):
            parts = fullname.split(".")
            # Check every parent package name and the full name itself.
            prefixes = (".".join(parts[:i]) for i in range(1, len(parts) + 1))
            return any(prefix in names for prefix in prefixes)
        return lazy_load_filter

Those could even just be recipes in the documentation rather than
actual standard library functions.

> And I purposefully didn't do a prefix match for ensure_eager as it's only
> meant for specific modules that have some try/except block which fails in
> the face of lazy loading. And since that should be a per-module thing
> instead of a per-package thing I don't want it over-extending. Plus
> providing a callback solution lets people engineer their own prefix matching
> solution if that's what they need.

Yep, that's why I like Antoine's callback suggestion.

Cheers,
Nick.
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From brett at python.org Tue Jul 11 12:50:50 2017
From: brett at python.org (Brett Cannon)
Date: Tue, 11 Jul 2017 16:50:50 +0000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References: <20170709120341.2647c8df@fsol>
Message-ID:

On Mon, 10 Jul 2017 at 18:55 Nick Coghlan wrote:

> On 11 July 2017 at 04:48, Brett Cannon wrote:
> >
> >
> > On Sun, 9 Jul 2017 at 06:59 Nick Coghlan wrote:
> >>
> >> On 9 July 2017 at 20:03, Antoine Pitrou wrote:
> >> > I would suggest allowing people to fully customize the blacklist /
> >> > whitelist logic using a callable (because looking up by fullname is a
> >> > bit inflexible).
> >>
> >> Oh, I like that - and then we'd have "ensure_lazy" and "ensure_eager"
> >> as callback factories that accepted a predefined list of names.
> >
> >
> > If we provide an optional callable argument then I would drop the
> > whitelist/ensure_lazy option. It's easier to explain and the common case
> > will be blacklisting a module for lazy loading if you're implicitly flipping
> > it on. For the whitelist case we can add importlib.util.lazy_import() and
> > people can just be explicit (if this is important enough to even care
> > about).
>
> Sorry, I was overly terse.
> By callback factories, I meant something like:
>
>     def ensure_eager_exact(names):
>         """Disable lazy loading for named modules"""
>         names = set(names)
>         def lazy_load_filter(fullname):
>             return fullname not in names
>         return lazy_load_filter
>
>     def ensure_lazy_exact(names):
>         """Only enable lazy loading for named modules"""
>         names = set(names)
>         def lazy_load_filter(fullname):
>             return fullname in names
>         return lazy_load_filter
>
>     def ensure_eager_by_prefix(names):
>         """Disable lazy loading for named modules and their submodules"""
>         names = set(names)
>         def lazy_load_filter(fullname):
>             parts = fullname.split(".")
>             # Check every parent package name and the full name itself.
>             prefixes = (".".join(parts[:i]) for i in range(1, len(parts) + 1))
>             return all(prefix not in names for prefix in prefixes)
>         return lazy_load_filter
>
>     def ensure_lazy_by_prefix(names):
>         """Only enable lazy loading for named modules and their submodules"""
>         names = set(names)
>         def lazy_load_filter(fullname):
>             parts = fullname.split(".")
>             # Check every parent package name and the full name itself.
>             prefixes = (".".join(parts[:i]) for i in range(1, len(parts) + 1))
>             return any(prefix in names for prefix in prefixes)
>         return lazy_load_filter
>
> Those could even just be recipes in the documentation rather than
> actual standard library functions.
>

Possibly. The first two are rather simple and should be obvious for anyone
wanting to use lazy loading (there will continue to be a warning in the
docs that you should only use lazy loading if you know what you're doing).

>
> > And I purposefully didn't do a prefix match for ensure_eager as it's only
> > meant for specific modules that have some try/except block which fails in
> > the face of lazy loading. And since that should be a per-module thing
> > instead of a per-package thing I don't want it over-extending. Plus
> > providing a callback solution lets people engineer their own prefix
> > matching solution if that's what they need.
>
> Yep, that's why I like Antoine's callback suggestion.
>

So are you suggesting dropping even the ensure_eager convenience argument?

From ncoghlan at gmail.com Tue Jul 11 23:48:01 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Jul 2017 13:48:01 +1000
Subject: [Import-SIG] Proposal for a lazy-loading finder
In-Reply-To:
References: <20170709120341.2647c8df@fsol>
Message-ID:

On 12 July 2017 at 02:50, Brett Cannon wrote:
> On Mon, 10 Jul 2017 at 18:55 Nick Coghlan wrote:
>> Those could even just be recipes in the documentation rather than
>> actual standard library functions.
>
> Possibly. The first two are rather simple and should be obvious for anyone
> wanting to use lazy loading (there will continue to be a warning in the docs
> that you should only use lazy loading if you know what you're doing).

Right, I only added them to help make it clearer which parts of the
prefix-matching ones were related to the prefix matching, and which
were related to the callback interface. Without the
as-simple-as-possible examples as a point of comparison, I think the
prefix matching examples are harder to follow than they need to be.

>> > And I purposefully didn't do a prefix match for ensure_eager as it's
>> > only meant for specific modules that have some try/except block which
>> > fails in the face of lazy loading. And since that should be a
>> > per-module thing instead of a per-package thing I don't want it
>> > over-extending. Plus providing a callback solution lets people engineer
>> > their own prefix matching solution if that's what they need.
>>
>> Yep, that's why I like Antoine's callback suggestion.
>
> So are you suggesting dropping even the ensure_eager convenience argument?
Aye, I think so - the filter is pretty trivial to write, especially
since you can even hardcode the blacklist or just put it in a module
global:

    _ALWAYS_EAGER = set("module1 module2 module3 module4".split())
    def _lazy_loading_filter(fullname):
        return fullname not in _ALWAYS_EAGER

It's also easy enough to add a convenience shortcut for that later if
we decide we really want to, while if we start with one, then we
introduce a bit of an awkward transition from simple module
blacklisting to arbitrary filtering when learning the API.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia