From brett at python.org Thu Aug 1 15:18:33 2013
From: brett at python.org (Brett Cannon)
Date: Thu, 1 Aug 2013 09:18:33 -0400
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

[SNIP to stop Mailman from holding up the email for moderation because of size]

>>> A Module Attribute to Expose Contributing Ref Files
>>> ---------------------------------------------
>>>
>>> Knowing the origin of a module is important when tracking down problems,
>>> particularly import-related ones. Currently, that entails looking at
>>> `.__file__` and `.__path__` (or `sys.path`).
>>>
>>> With this PEP there can be a chain of ref files in between the currently
>>> available path and a module's __file__. Having access to that list of ref
>>> files is important in order to determine why one file was selected over
>>> another as the origin for the module. When an unexpected file gets used
>>> for one of your imports, you'll care about this!
>>>
>>> In order to facilitate that, modules will have a new attribute:
>>> `__indirect__`. It will be a tuple comprised of the chain of ref files, in
>>> order, used to locate the module's __file__. An empty tuple or one with a
>>> single item will be the most common case. An empty tuple indicates that no
>>> ref files were used to locate the module.
>>>
>>
>> This complicates things even further. How are you going to pass this info
>> along a call chain through find_loader()? Are we going to have to add
>> find_loader3() to support this (nasty side-effect of using tuples instead
>> of types.SimpleNamespace for the return value)? Some magic second value or
>> type from find_loader() which flags the values in the iterable are from a
>> .ref file and not any other possible place? This requires an API change and
>> there isn't any mention of how that would look or work.
>>
>
> This is the big open question in my mind. I suppose having find_loader()
> return a SimpleNamespace would help. Then the indirect path we aggregate
> in find_loader() could be passed as a new argument to loaders (when
> instantiated in either FileFinder.find_loader() or in
> PathFinder.find_module()).
>
> Here are the options I see, some more realistic than others:
>
> 1. Build __indirect__ after the fact (in init_module_attrs()?).
> 2. Change FileFinder.find_loader() to return a types.SimpleNamespace
> instance.
>

You can't do that; it would change the method signature in a way that would
break code automatically unpacking the tuple. I was lamenting the fact that
no one thought to use types.SimpleNamespace in the first place, not
suggesting it now be used.

> 3. Change FileFinder.find_loader() to return a namedtuple subclass with an
> extra "loader" attribute.
>

Only if it also subclassed dict or types.SimpleNamespace so that it could be
documented that in Python 4 the tuple usage will be removed but the new,
alternative access approach would continue to work. Plus something in
importlib.util to help construct this monstrosity of an object so people
future-proof their code.

> 4. Piggy-back the indirect path on the loader returned by
> FileFinder.find_loader() in an "_indirect" attribute (or in the loader spot
> in the case of namespace packages).
>

Doesn't that tie this very tightly to FileFinder and not allow alternative
finders to participate?

> 5. Something along the lines of Nick's IndirectReference.
>

I would avoid having to do any change that requires an isinstance check. New
code can use getattr() (like with a namedtuple hybrid) w/o issue since that's
a common way of dealing with API expansion, but having changing types in the
return is something even Guido has said he doesn't care for.

> 6. Wrap the loader in a proxy that also sets __indirect__ when
> load_module() is called.
>

Ew.

> 7. Totally refactor the import system so that ModuleSpec objects are
> passed to metapath finders rather than (name, path) and simply store the
> indirect path on the spec (which is used directly to load the module rather
> than the loader).
>
>

Yeah, that ain't going to happen for backwards-compatibility reasons unless
you're ready to make this new API work in a fully compatible way with the
current API.

> 4 feels too much like a hack, particularly when we have other options. 7
> would need a PEP of its own (forthcoming ).
>
> I see 2 as the best one. Is it really too late to change the return type
> of FileFinder.find_loader()? If we simply can't bear the backward
> compatibility risk (no matter how small ),
>

We unfortunately can't. It would require a new method which as a stub would
call the old API to return the proper object (which is fine if you can come
up with a reasonable name).

> I'd advocate for one of 1, 3, 5, or 6.
>

I would try 1 and 3.

From ncoghlan at gmail.com Thu Aug 1 16:00:24 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 2 Aug 2013 00:00:24 +1000
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On 1 August 2013 23:18, Brett Cannon wrote:
>> I see 2 as the best one. Is it really too late to change the return type
>> of FileFinder.find_loader()? If we simply can't bear the backward
>> compatibility risk (no matter how small ),
>
> We unfortunately can't. It would require a new method which as a stub would
> call the old API to return the proper object (which is fine if you can come
> up with a reasonable name).

Just musing on this one for a bit.

1. We still have the silliness where we call "find_module" on metapath
importers to ask them for a loader.
2. We have defined an inflexible signature for find_loader on path
entry finders (oops)
3. There's other interesting metadata finders could expose *without*
loading the module

So, how does this sound: add a new API called "find_module_info" for
both metapath importers and path entry finders (falling back to the
legacy APIs). This would return a simple namespace potentially
providing the following pieces of information, using the same rules as
the corresponding loader does for setting the module attributes
(http://docs.python.org/3/reference/import.html#loaders):

__loader__
__name__
__package__
__path__
__file__
__cached__
__indirect__

(We could also lose the double underscores for the namespace
attributes, but I quite like the symmetry of keeping them)

Thoughts?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com Thu Aug 1 16:36:20 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 1 Aug 2013 08:36:20 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On Aug 1, 2013 8:00 AM, "Nick Coghlan" wrote:
>
> On 1 August 2013 23:18, Brett Cannon wrote:
> >> I see 2 as the best one. Is it really too late to change the return type
> >> of FileFinder.find_loader()? If we simply can't bear the backward
> >> compatibility risk (no matter how small ),
> >
> > We unfortunately can't. It would require a new method which as a stub would
> > call the old API to return the proper object (which is fine if you can come
> > up with a reasonable name).
>
> Just musing on this one for a bit.
>
> 1. We still have the silliness where we call "find_module" on metapath
> importers to ask them for a loader.
> 2. We have defined an inflexible signature for find_loader on path
> entry finders (oops)
> 3. There's other interesting metadata finders could expose *without*
> loading the module
>
> So, how does this sound: add a new API called "find_module_info" for
> both metapath importers and path entry finders (falling back to the
> legacy APIs). This would return a simple namespace potentially
> providing the following pieces of information, using the same rules as
> the corresponding loader does for setting the module attributes
> (http://docs.python.org/3/reference/import.html#loaders):
>
> __loader__
> __name__
> __package__
> __path__
> __file__
> __cached__
> __indirect__
>
> (We could also lose the double underscores for the namespace
> attributes, but I quite like the symmetry of keeping them)
>
> Thoughts?

This is basically what I've been thinking of as a new ModuleSpec type,
though with some methods as well.

-eric

From brett at python.org Thu Aug 1 16:35:57 2013
From: brett at python.org (Brett Cannon)
Date: Thu, 1 Aug 2013 10:35:57 -0400
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On Thu, Aug 1, 2013 at 10:00 AM, Nick Coghlan wrote:

> On 1 August 2013 23:18, Brett Cannon wrote:
> >> I see 2 as the best one. Is it really too late to change the return type
> >> of FileFinder.find_loader()? If we simply can't bear the backward
> >> compatibility risk (no matter how small ),
> >
> > We unfortunately can't. It would require a new method which as a stub would
> > call the old API to return the proper object (which is fine if you can come
> > up with a reasonable name).
>
> Just musing on this one for a bit.
>
> 1. We still have the silliness where we call "find_module" on metapath
> importers to ask them for a loader.
> 2. We have defined an inflexible signature for find_loader on path
> entry finders (oops)
> 3. There's other interesting metadata finders could expose *without*
> loading the module
>
> So, how does this sound: add a new API called "find_module_info" for
> both metapath importers and path entry finders (falling back to the
> legacy APIs). This would return a simple namespace potentially
> providing the following pieces of information, using the same rules as
> the corresponding loader does for setting the module attributes
> (http://docs.python.org/3/reference/import.html#loaders):
>
> __loader__
> __name__
> __package__
> __path__
> __file__
> __cached__
> __indirect__
>
> (We could also lose the double underscores for the namespace
> attributes, but I quite like the symmetry of keeping them)
>
> Thoughts?

If you're going to do that, why stop at types.SimpleNamespace and not move
all the way to a module object? Then you can simply start moving to APIs
which take the module object to be operated on and the various methods in
the loader, etc. and just fill in details as necessary; that's what I would
do if I got to redesign the loader API today since it would simplify
load_module() and almost everything would just become a static method which
set the attribute on the module (e.g.
ExecutionLoader.get_filename('some.module') would become
ExecutionLoader.filename(module) or even ExecutionLoader.__file__(module)
which gets really meta as you can then have a decorator which checks for a
non-None value for that attribute on the module and then returns it as a
short-circuit instead of calling the method). Only drawback I see is it not
being easy to tell if a module has been initialized or not, but I don't view
that as a critical issue. IOW introduce new_module()/fresh_module().

Even if types.SimpleNamespace is kept I do like the idea. Loaders could
shift to working only off of the object and have their __init__ method
standardized to take a single argument so what import is told about and what
loaders work with is the same. Basically it becomes a caching mechanism of
what finders can infer so that loaders can save themselves the hassle
without complicated init call signatures.

From ericsnowcurrently at gmail.com Thu Aug 1 16:44:28 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 1 Aug 2013 08:44:28 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On Aug 1, 2013 8:36 AM, "Brett Cannon" wrote:
> If you're going to do that, why stop at types.SimpleNamespace and not move all the way to a module object? Then you can simply start moving to APIs which take the module object to be operated on and the various methods in the loader, etc. and just fill in details as necessary; that's what I would do if I got to redesign the loader API today since it would simplify load_module() and almost everything would just become a static method which set the attribute on the module (e.g. ExecutionLoader.get_filename('some.module') would become ExecutionLoader.filename(module) or even ExecutionLoader.__file__(module) which gets really meta as you can then have a decorator which checks for a non-None value for that attribute on the module and then returns it as a short-circuit instead of calling the method). Only drawback I see is it not being easy to tell if a module has been initialized or not, but I don't view that as a critical issue. IOW introduce new_module()/fresh_module().
>
> Even if types.SimpleNamespace is kept I do like the idea. Loaders could shift to working only off of the object and have their __init__ method standardized to take a single argument so what import is told about and what loaders work with is the same. Basically it becomes a caching mechanism of what finders can infer so that loaders can save themselves the hassle without complicated init call signatures.

This is pretty much exactly what I've been thinking about since PyCon. The only difference is that I have a distinct ModuleSpec class and modules would get a new __spec__ attribute.

-eric

From ncoghlan at gmail.com Fri Aug 2 02:56:41 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 2 Aug 2013 10:56:41 +1000
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On 2 Aug 2013 00:44, "Eric Snow" wrote:
>
> On Aug 1, 2013 8:36 AM, "Brett Cannon" wrote:
> > If you're going to do that, why stop at types.SimpleNamespace and not move all the way to a module object? Then you can simply start moving to APIs which take the module object to be operated on and the various methods in the loader, etc. and just fill in details as necessary; that's what I would do if I got to redesign the loader API today since it would simplify load_module() and almost everything would just become a static method which set the attribute on the module (e.g. ExecutionLoader.get_filename('some.module') would become ExecutionLoader.filename(module) or even ExecutionLoader.__file__(module) which gets really meta as you can then have a decorator which checks for a non-None value for that attribute on the module and then returns it as a short-circuit instead of calling the method). Only drawback I see is it not being easy to tell if a module has been initialized or not, but I don't view that as a critical issue. IOW introduce new_module()/fresh_module().
> >
> > Even if types.SimpleNamespace is kept I do like the idea. Loaders could shift to working only off of the object and have their __init__ method standardized to take a single argument so what import is told about and what loaders work with is the same. Basically it becomes a caching mechanism of what finders can infer so that loaders can save themselves the hassle without complicated init call signatures.
>
> This is pretty much exactly what I've been thinking about since PyCon. The only difference is that I have a distinct ModuleSpec class and modules would get a new __spec__ attribute.

And we can quit adding ever more magic attributes directly to the module namespace. I like it.

With that model, things might look vaguely like:

1. Finders would optionally offer "get_module_spec" (although a better name would be nice!)
2. Specs would have a load() method for the import system to call that optionally accepted an existing module object (this would then cover reload).
3. The responsibility for checking the sys.modules cache would move to the import system.
4. We'd create a "SpecLoader" to offer backwards compatibility in the old __loader__ attribute.

Slight(!) tangent from the original problem, but a worthwhile refactoring issue to tackle, I think :)

Cheers,
Nick.

>
> -eric

From ericsnowcurrently at gmail.com Fri Aug 2 05:34:38 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 1 Aug 2013 21:34:38 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On Thu, Aug 1, 2013 at 6:56 PM, Nick Coghlan wrote:

> On 2 Aug 2013 00:44, "Eric Snow" wrote:
> > This is pretty much exactly what I've been thinking about since PyCon.
> > The only difference is that I have a distinct ModuleSpec class and modules
> > would get a new __spec__ attribute.
>
> And we can quit adding ever more magic attributes directly to the module
> namespace. I like it.
>

Yeah, that was part of what led me to the idea. This could be taken to some
pretty great lengths (I've given it a lot of thought), but I'm trying hard
to not do too much at once. I wasn't even planning on pursuing ModuleSpec
until 3.5, much less any of my more drastic ideas.

> With that model, things might look vaguely like:
>
> 1. Finders would optionally offer "get_module_spec" (although a better
> name would be nice!)
>

How about "find_module"? <.5 wink> Actually, I'm pretty sure this can be
done in a backward-compatible way (in not too much time I've roughed out an
implementation that should work). I would rather not introduce more API to
the import system, but if that's preferable to hijacking (or improving )
find_module() then I can live with that. However, given the crowd that takes
advantage of the import system APIs, I wouldn't consider the change
disruptive as long as it's backward compatible.

This would also allow us to deprecate PathEntryFinder.find_loader() which we
wouldn't have needed if we'd had something like ModuleSpec.

> 2. Specs would have a load() method for the import system to call that
> optionally accepted an existing module object (this would then cover
> reload).
>

That's been my plan from the get-go. Good call on the reload case.

> 3. The responsibility for checking the sys.modules cache would move to the
> import system.
>

To me it makes sense to go even further. ModuleSpec could easily take over a
bunch of the responsibilities of loaders, particularly related to the
management of the module objects.
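
To make points 2 and 3 above concrete, here is a minimal sketch of how a spec
with a load() method and a cache check in the import machinery might fit
together. Everything in it is an assumption for illustration: SketchSpec and
import_from_spec are hypothetical names, the module source is held inline as
a string, and this is not the attached patch or any real importlib API.

```python
import sys
import types


class SketchSpec:
    """Hypothetical stand-in for the ModuleSpec idea (not a real API)."""

    def __init__(self, name, source, origin=None, indirect=()):
        self.name = name
        self.source = source             # module source text, inline for the demo
        self.origin = origin             # would end up as the module's __file__
        self.indirect = tuple(indirect)  # chain of ref files, per the PEP draft

    def load(self, module=None):
        # Reuse the module passed in (the reload case) or create a fresh one.
        if module is None:
            module = types.ModuleType(self.name)
        module.__spec__ = self
        module.__file__ = self.origin
        exec(self.source, module.__dict__)
        return module


def import_from_spec(spec):
    # The sys.modules check lives in the import machinery, not in the spec.
    # Simplification: real import registers the module before executing it.
    if spec.name in sys.modules:
        return sys.modules[spec.name]
    module = spec.load()
    sys.modules[spec.name] = module
    return module
```

Calling import_from_spec() twice with the same spec returns the cached module
the second time, while spec.load(existing_module) re-executes the source in
place, which is the behaviour a reload would want.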

Also, Loader.init_module_attrs() and importlib.util.module_to_load() could be
pulled before the 3.4 release (since they are new in 3.4). It would stink if
we found we no longer needed them after they get locked in by the release.
Note, however, that they can co-exist with ModuleSpec just fine so it's not
as big a deal.

> 4. We'd create a "SpecLoader" to offer backwards compatibility in the old
> __loader__ attribute.
>

Interesting. I had anticipated loaders still sticking around, still exposed
by module.__loader__ and filling most of their current role, especially with
regard to the optional PEP 302 APIs. I suppose we could deprecate the
__loader__ attribute, and maybe even __package__, in favor of __spec__, but
I don't think there's any rush to do so before Python 4000.

> Slight(!) tangent from the original problem, but a worthwhile refactoring
> issue to tackle, I think :)
>

Yeah, even if it proves too big a change for 3.4 and we take some other
approach for indirections, I think there's a lot to gain from separating the
module specification from the module and from the loader. I've attached a
patch that does the bare minimum of what I think we'd want from ModuleSpec.
I'll probably flesh out more of my ideas for it later.

Of course, I don't want anything here to get in the way of the .ref PEP
which I think has more concrete value. So if this tangent threatens any
chance at getting indirection files for 3.4, I'd rather defer any effort on
these extras until 3.5 in favor of a simpler (if less desirable) approach.

-eric

-------------- next part --------------
A non-text attachment was scrubbed...
Name: modulespec.diff
Type: application/octet-stream
Size: 9849 bytes
Desc: not available
URL:

From ncoghlan at gmail.com Fri Aug 2 11:32:45 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 2 Aug 2013 19:32:45 +1000
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On 2 Aug 2013 13:34, "Eric Snow" wrote:
>
> On Thu, Aug 1, 2013 at 6:56 PM, Nick Coghlan wrote:
>>
>> On 2 Aug 2013 00:44, "Eric Snow" wrote:
>> > This is pretty much exactly what I've been thinking about since PyCon. The only difference is that I have a distinct ModuleSpec class and modules would get a new __spec__ attribute.
>>
>> And we can quit adding ever more magic attributes directly to the module namespace. I like it.
>
> Yeah, that was part of what led me to the idea. This could be taken to some pretty great lengths (I've given it a lot of thought), but I'm trying hard to not do too much at once. I wasn't even planning on pursuing ModuleSpec until 3.5, much less any of my more drastic ideas.
>>
>> With that model, things might look vaguely like:
>>
>> 1. Finders would optionally offer "get_module_spec" (although a better name would be nice!)
>
> How about "find_module"? <.5 wink> Actually, I'm pretty sure this can be done in a backward-compatible way (in not too much time I've roughed out an implementation that should work). I would rather not introduce more API to the import system, but if that's preferable to hijacking (or improving ) find_module() then I can live with that. However, given the crowd that takes advantage of the import system APIs, I wouldn't consider the change disruptive as long as it's backward compatible.
>
> This would also allow us to deprecate PathEntryFinder.find_loader() which we wouldn't have needed if we'd had something like ModuleSpec.

If you can make find_module handle this in a backwards compatible way, cool :)

>>
>> 2. Specs would have a load() method for the import system to call that optionally accepted an existing module object (this would then cover reload).
>
> That's been my plan from the get-go. Good call on the reload case.
>>
>> 3. The responsibility for checking the sys.modules cache would move to the import system.
>
> To me it makes sense to go even further.
> ModuleSpec could easily take over a bunch of the responsibilities of loaders, particularly related to the management of the module objects.
>
> Also, Loader.init_module_attrs() and importlib.util.module_to_load() could be pulled before the 3.4 release (since they are new in 3.4). It would stink if we found we no longer needed them after they get locked in by the release. Note, however, that they can co-exist with ModuleSpec just fine so it's not as big a deal.
>>
>> 4. We'd create a "SpecLoader" to offer backwards compatibility in the old __loader__ attribute.
>
> Interesting. I had anticipated loaders still sticking around, still exposed by module.__loader__ and filling most of their current role, especially with regard to the optional PEP 302 APIs. I suppose we could deprecate the __loader__ attribute, and maybe even __package__, in favor of __spec__, but I don't think there's any rush to do so before Python 4000.

I was thinking of finders returning customised types for module specs, but I guess you could get the same effect defining a new "exec_module" API on loaders.

>>
>> Slight(!) tangent from the original problem, but a worthwhile refactoring issue to tackle, I think :)
>
> Yeah, even if it proves too big a change for 3.4 and we take some other approach for indirections, I think there's a lot to gain from separating the module specification from the module and from the loader. I've attached a patch that does the bare minimum of what I think we'd want from ModuleSpec. I'll probably flesh out more of my ideas for it later.
>
> Of course, I don't want anything here to get in the way of the .ref PEP which I think has more concrete value. So if this tangent threatens any chance at getting indirection files for 3.4, I'd rather defer any effort on these extras until 3.5 in favor of a simpler (if less desirable) approach.

I suspect ref files will be an easier sell with an elegant way to handle the indirection tracking.
I'm not aware of anyone that actually *likes* the current amount of work loaders have to do; it's just that we only figured that out with the benefit of hindsight :)

Cheers,
Nick.

>
> -eric

From ncoghlan at gmail.com Sun Aug 4 07:07:47 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 4 Aug 2013 15:07:47 +1000
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On 2 August 2013 13:34, Eric Snow wrote:
>
> On Thu, Aug 1, 2013 at 6:56 PM, Nick Coghlan wrote:
>>
>> On 2 Aug 2013 00:44, "Eric Snow" wrote:
>> > This is pretty much exactly what I've been thinking about since PyCon.
>> > The only difference is that I have a distinct ModuleSpec class and modules
>> > would get a new __spec__ attribute.
>>
>> And we can quit adding ever more magic attributes directly to the module
>> namespace. I like it.
>
> Yeah, that was part of what led me to the idea. This could be taken to
> some pretty great lengths (I've given it a lot of thought), but I'm trying
> hard to not do too much at once. I wasn't even planning on pursuing
> ModuleSpec until 3.5, much less any of my more drastic ideas.
>>
>> With that model, things might look vaguely like:
>>
>> 1. Finders would optionally offer "get_module_spec" (although a better
>> name would be nice!)
>
> How about "find_module"? <.5 wink> Actually, I'm pretty sure this can be
> done in a backward-compatible way (in not too much time I've roughed out an
> implementation that should work). I would rather not introduce more API to
> the import system, but if that's preferable to hijacking (or improving
> ) find_module() then I can live with that. However, given the crowd
> that takes advantage of the import system APIs, I wouldn't consider the
> change disruptive as long as it's backward compatible.
>
> This would also allow us to deprecate PathEntryFinder.find_loader() which we
> wouldn't have needed if we'd had something like ModuleSpec.

I finally had a chance to look at your draft implementation. That's a
neat attempt at backwards compatibility, but I'm not sure it will work
properly - you already had to block out several interesting methods
for compatibility reasons, and there's a potential for conflict even
with the methods you did keep (since custom loaders may have
additional methods beyond those in the specs).
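
The conflict described above can be sketched with a toy proxy. This is only
an illustration under stated assumptions: LegacyLoader and SpecProxy are
hypothetical names, and the forwarding via __getattr__ is a guess at the
general shape of such a compatibility layer, not the draft implementation
under discussion.

```python
class LegacyLoader:
    """Stand-in for a third-party loader with an extra method of its own."""

    def load_module(self, name):
        return "loaded " + name

    def name(self):
        # Custom helper that happens to collide with a spec attribute.
        return "custom-loader-name"


class SpecProxy:
    """Hypothetical spec that forwards unknown attributes to its loader."""

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, so spec attributes always
        # win over loader attributes of the same name.
        return getattr(self.loader, attr)


loader = LegacyLoader()
spec = SpecProxy("pkg.mod", loader)

forwarded = spec.load_module("pkg.mod")          # forwarded to the loader
shadowed = spec.name                             # the spec's str, not the loader's method
looks_like_loader = isinstance(spec, LegacyLoader)
```

Here the loader's own name() method becomes unreachable through the proxy,
and code that type-checks or compares identity against the original loader
sees a different object, so a forwarding proxy can only approximate the old
loader interface.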

YAFM is annoying (Yet Another Method, I'll let you fill in the rest),
but I think it's better than trying to be too clever and accidentally
breaking things.

How about "find_import" as a new method name? And ImportSpec as the
class name, rather than ModuleSpec?

>> 4. We'd create a "SpecLoader" to offer backwards compatibility in the old
>> __loader__ attribute.
>
> Interesting. I had anticipated loaders still sticking around, still exposed
> by module.__loader__ and filling most of their current role, especially with
> regard to the optional PEP 302 APIs. I suppose we could deprecate the
> __loader__ attribute, and maybe even __package__, in favor of __spec__, but
> I don't think there's any rush to do so before Python 4000.

Yeah, I think having the spec as something people *don't* customise is
a good idea.

>> Slight(!) tangent from the original problem, but a worthwhile refactoring
>> issue to tackle, I think :)
>
> Yeah, even if it proves too big a change for 3.4 and we take some other
> approach for indirections, I think there's a lot to gain from separating the
> module specification from the module and from the loader. I've attached a
> patch that does the bare minimum of what I think we'd want from ModuleSpec.
> I'll probably flesh out more of my ideas for it later.
>
> Of course, I don't want anything here to get in the way of the .ref PEP
> which I think has more concrete value. So if this tangent threatens any
> chance at getting indirection files for 3.4, I'd rather defer any effort on
> these extras until 3.5 in favor of a simpler (if less desirable) approach.

I just realised there's another added bonus to this approach:
__spec__.__name__ will let us record the *real* name of modules
executed via -m, even with __name__ set to "__main__". So it could
also greatly simplify some aspects of PEP 395 :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com Thu Aug 8 03:08:01 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 7 Aug 2013 19:08:01 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To:
References:

Message-ID:

On Sat, Aug 3, 2013 at 11:07 PM, Nick Coghlan wrote:

> On 2 August 2013 13:34, Eric Snow wrote:
> I finally had a chance to look at your draft implementation. That's a
> neat attempt at backwards compatibility, but I'm not sure it will work
> properly - you already had to block out several interesting methods
> for compatibility reasons, and there's a potential for conflict even
> with the methods you did keep (since custom loaders may have
> additional methods beyond those in the specs).
>

Yeah, that was a pretty rough stab at it. I've since done a little more,
including implementing __getattr__() and getting a little clever for
is_package. And I'm still not sure it will work. isinstance checks will
fail (duck-typing FTW) and id() gives a different value for the spec and
for the loader. I suppose that's the rub with proxies. So I'm not sure it
will work, but it *could* be close enough. We'll see.

> YAFM is annoying (Yet Another Method, I'll let you fill in the rest),
> but I think it's better than trying to be too clever and accidentally
> breaking things.
>

That's my concern too.

>
> How about "find_import" as a new method name? And ImportSpec as the
> class name, rather than ModuleSpec?
>

To me "ImportSpec" says "spec for the import system".

> >> 4. We'd create a "SpecLoader" to offer backwards compatibility in the old
> >> __loader__ attribute.
> >
> > Interesting. I had anticipated loaders still sticking around, still exposed
> > by module.__loader__ and filling most of their current role, especially with
> > regard to the optional PEP 302 APIs. I suppose we could deprecate the
> > __loader__ attribute, and maybe even __package__, in favor of __spec__, but
> > I don't think there's any rush to do so before Python 4000.
>
> Yeah, I think having the spec as something people *don't* customise is
> a good idea.
>

I tried it both ways and it's a *lot* simpler if the spec is not designed
for modification.
I expect the case for modifying a spec would be pretty uncommon.

> >> Slight(!) tangent from the original problem, but a worthwhile refactoring
> >> issue to tackle, I think :)
> >
> > Yeah, even if it proves too big a change for 3.4 and we take some other
> > approach for indirections, I think there's a lot to gain from separating the
> > module specification from the module and from the loader. I've attached a
> > patch that does the bare minimum of what I think we'd want from ModuleSpec.
> > I'll probably flesh out more of my ideas for it later.
> >
> > Of course, I don't want anything here to get in the way of the .ref PEP
> > which I think has more concrete value. So if this tangent threatens any
> > chance at getting indirection files for 3.4, I'd rather defer any effort on
> > these extras until 3.5 in favor of a simpler (if less desirable) approach.
>
> I just realised there's another added bonus to this approach:
> __spec__.__name__ will let us record the *real* name of modules
> executed via -m, even with __name__ set to "__main__". So it could
> also greatly simplify some aspects of PEP 395 :)
>

That's a good one. I'll give it a try.

The patch I've got is pretty hefty. Should I keep it low key and just post
it here, or would it be worth logging a ticket and posting it there for
review?

Once I'm comfortable with the patch I'll try sticking my .ref patch on top
and see how it looks. I'll probably whip up a PEP for ModuleSpec at that
point if things are looking good. I'm just worried about getting this done
in time for 3.4. On top of this I'm really close on OrderedDict, ordered
class definition namespace, .__definition_order__, and
locals('__kworder__'), so I'm still kind of nervous about taking on two
non-trivial changes to the import system with so little time before beta 1.
However, at this point I still think it's doable. :)

-eric

p.s. I hadn't realized this list was "closed".
Should we change that, or take this (both ModuleSpec and .ref) to python-ideas (or off-line)? -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Aug 8 14:18:16 2013 From: brett at python.org (Brett Cannon) Date: Thu, 8 Aug 2013 08:18:16 -0400 Subject: [Import-SIG] PEP proposal: Per-Module Import Path In-Reply-To: References:

Message-ID: On Wed, Aug 7, 2013 at 9:08 PM, Eric Snow wrote: > > > > On Sat, Aug 3, 2013 at 11:07 PM, Nick Coghlan wrote: > >> On 2 August 2013 13:34, Eric Snow wrote: >> I finally had a chance to look at your draft implementation. That's a >> neat attempt at backwards compatibility, but I'm not sure it will work >> properly - you already had to block out several interesting methods >> for compatibility reasons, and there's a potential for conflict even >> with the methods you did keep (since custom loaders may have >> additional methods beyond those in the specs). >> > > Yeah, that was a pretty rough stab at it. I've since done a little more, > including implementing __getattr__() and getting a little clever for > is_package. And I'm still not sure it will work. isinstance checks will > fail (duck-typing FTW) and id() gives a different value for the spec and > for the loader. I suppose that's the rub with proxies. So I'm not sure it > will work, but it *could* be close enough. We'll see. > > >> YAFM is annoying (Yet Another Method, I'll let you fill in the rest), >> but I think it's better than trying to be too clever and accidentally >> breaking things. >> > > That's my concern too. > > >> >> How about "find_import" as a new method name? And ImportSpec as the >> class name, rather than ModuleSpec? >> > > To me "ImportSpec" says "spec for the import system". > > >> >> 4. We'd create a "SpecLoader" to offer backwards compatibility in the >> old >> >> __loader__ attribute. >> > >> > Interesting. I had anticipated loaders still sticking around, still >> exposed >> > by module.__loader__ and filling most of their current role, especially >> with >> > regard to the optional PEP 302 APIs. I suppose we could deprecate the >> > __loader__ attribute, and maybe even __package__, in favor of __spec__, >> but >> > I don't think there's any rush to do so before Python 4000. >> >> Yeah, I think having the spec as something people *don't* customise is >> a good idea. 
>> > > I tried it both ways and it's a *lot* simpler if the spec is not designed > for modification. I expect the case for modifying a spec would be pretty > uncommon. > > >> >> >> Slight(!) tangent from the original problem, but a worthwhile >> refactoring >> >> issue to tackle, I think :) >> > >> > Yeah, even if it proves too big a change for 3.4 and we take some other >> > approach for indirections, I think there's a lot to gain from >> separating the >> > module specification from the module and from the loader. I've >> attached a >> > patch that does the bare minimum of what I think we'd want from >> ModuleSpec. >> > I'll probably flesh out more of my ideas for it later. >> > >> > Of course, I don't want anything here to get in the way of the .ref PEP >> > which I think has more concrete value. So if this tangent threatens any >> > chance at getting indirection files for 3.4, I'd rather defer any >> effort on >> > these extras until 3.5 in favor of a simpler (if less desirable) >> approach. >> >> I just realised there's another added bonus to this approach: >> __spec__.__name__ will let us record the *real* name of modules >> executed via -m, even with __name__ set to "__main__". So it could >> also greatly simplify some aspects of PEP 395 :) >> > > That's a good one. I'll give it a try. > > The patch I've got is pretty hefty. Should I keep it low key and just > post it here, or would it be worth logging a ticket and posting it there > for review? > Once it's all written up in a PEP you can post an issue for the code. > Once I'm comfortable with the patch I'll try sticking my .ref patch on > top and see how it looks. I'll probably whip up a PEP for ModuleSpec at > that point if things are looking good. > > I'm just worried about getting this done in time for 3.4. 
On top of this > I'm really close on OrderedDict, ordered class definition namespace, > .__definition_order__, and locals('__kworder__'), so I'm still kind > of nervous about taking on two non-trivial changes to the import system > with so little time before beta 1. However, at this point I still think > it's doable. :) > I personally view all of this as bonus stuff that is in no way required to make Python function or make some new class of solution available, so I wouldn't stress about getting in for 3.4. > > -eric > > p.s. I hadn't realized this list was "closed". Should we change that, or > take this (both ModuleSpec and .ref) to python-ideas (or off-line)? > Eric or Barry are the admins so they can change the wording. I say just leave it here for now until people are happy with the proposal and then it can be kicked up to python-dev (python-ideas isn't needed in this case since we have this mailing list specifically for import discussions). -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Aug 9 08:34:34 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 00:34:34 -0600 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System Message-ID: This is an outgrowth of discussions on the .ref PEP, but it's also something I've been thinking about for over a year and starting toying with at the last PyCon. I have a patch that passes all but a couple unit tests and should pass though when I get a minute to take another pass at it. I'll probably end up adding a bunch more unit tests before I'm done as well. However, the functionality is mostly there. BTW, I gotta say, Brett, I have a renewed appreciation for the long and hard effort you put into importlib. There are just so many odd corner cases that I never would have looked for if not for that library. And those unit tests do a great job of covering all of that. Thanks! 
-eric

-------------------------------------------------------------------------------

PEP: 4XX
Title: A ModuleSpec Type for the Import System
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow
BDFL-Delegate: ???
Discussions-To: import-sig at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 8-Aug-2013
Python-Version: 3.4
Post-History: 8-Aug-2013
Resolution:


Abstract
========

This PEP proposes to add a new class to ``importlib.machinery`` called ``ModuleSpec``. It will contain all the import-related information about a module without needing to load the module first. Finders will now return a module's spec rather than a loader. The import system will use the spec to load the module.


Motivation
==========

The import system has evolved over the lifetime of Python. In late 2002 PEP 302 introduced standardized import hooks via ``finders`` and ``loaders`` and ``sys.meta_path``. The ``importlib`` module, introduced with Python 3.1, now exposes a pure Python implementation of the APIs described by PEP 302, as well as of the full import system. It is now much easier to understand and extend the import system. While a benefit to the Python community, this greater accessibility also presents a challenge.

As more developers come to understand and customize the import system, any weaknesses in the finder and loader APIs will be more impactful. So the sooner we can address any such weaknesses in the import system, the better...and there are a couple we can take care of with this proposal.

Firstly, any time the import system needs to save information about a module we end up with more attributes on module objects that are generally only meaningful to the import system and occasionally to some people. It would be nice to have a per-module namespace to put future import-related information.

Secondly, there's an API void between finders and loaders that causes undue complexity when encountered.
Finders are strictly responsible for providing the loader which the import system will use to load the module. The loader is then responsible for doing some checks, creating the module object, setting import-related attributes, "installing" the module to ``sys.modules``, and loading the module, along with some cleanup. This all takes place during the import system's call to ``Loader.load_module()``. Loaders also provide some APIs for accessing data associated with a module.

Loaders are not required to provide any of the functionality of ``load_module()`` through other methods. Thus, though the import-related information about a module is likely available without loading the module, it is not otherwise exposed.

Furthermore, the requirements associated with ``load_module()`` are common to all loaders and mostly are implemented in exactly the same way. This means every loader has to duplicate the same boilerplate code. ``importlib.util`` provides some tools that help with this, but it would be more helpful if the import system simply took charge of these responsibilities. The trouble is that this would limit the degree of customization that ``load_module()`` facilitates. This is a gap between finders and loaders which this proposal aims to fill.

Finally, when the import system calls a finder's ``find_module()``, the finder makes use of a variety of information about the module that is useful outside the context of the method. Currently the options are limited for persisting that per-module information past the method call, since it only returns the loader. Either store it in a module-to-info mapping somewhere like on the finder itself, or store it on the loader. Unfortunately, loaders are not required to be module-specific. On top of that, some of the useful information finders could provide is common to all finders, so ideally the import system could take care of that. This is the same gap as before between finders and loaders.
As an example of complexity attributable to this flaw, the implementation of namespace packages in Python 3.3 (see PEP 420) added ``FileFinder.find_loader()`` because there was no good way for ``find_module()`` to provide the namespace path.

The answer to this gap is a ``ModuleSpec`` object that contains the per-module information and takes care of the boilerplate functionality of loading the module.

(The idea grew feet during discussions related to another PEP.[1])


Specification
=============

ModuleSpec
----------

A new class which defines the import-related values to use when loading the module. It closely corresponds to the import-related attributes of module objects. ``ModuleSpec`` objects may also be used by finders and loaders and other import-related APIs to hold extra import-related information about the module. This greatly reduces the need to add any new import-related attributes to module objects.

Attributes:

* ``name`` - the module's name (compare to ``__name__``).
* ``loader`` - the loader to use during loading and for module data (compare to ``__loader__``).
* ``package`` - the name of the module's parent (compare to ``__package__``).
* ``is_package`` - whether or not the module is a package.
* ``origin`` - the location from which the module originates.
* ``filename`` - like origin, but limited to a path-based location (compare to ``__file__``).
* ``cached`` - the location where the compiled module should be stored (compare to ``__cached__``).
* ``path`` - the list of path entries in which to search for submodules or ``None``. (compare to ``__path__``). It should be in sync with ``is_package``.

Those are also the parameters to ``ModuleSpec.__init__()``, in that order. The last three are optional. When passed the values are taken as-is. The ``from_loader()`` method offers calculated values.

Methods:

* ``from_loader(cls, ...)`` - returns a new ``ModuleSpec`` derived from the arguments.
  The parameters are the same as with ``__init__``, except ``package`` is excluded and only ``name`` and ``loader`` are required.
* ``module_repr()`` - returns a repr for the module.
* ``init_module_attrs(module)`` - sets the module's import-related attributes.
* ``load(module=None, *, is_reload=False)`` - calls the loader's ``exec_module()``, falling back to ``load_module()`` if necessary. This method performs the former responsibilities of loaders for managing modules before actually loading and for cleaning up. The reload case is facilitated by the ``module`` and ``is_reload`` parameters.

Values Derived by from_loader()
-------------------------------

As implied above, ``from_loader()`` makes a best effort at calculating any of the values that are not passed in. It duplicates the behavior that was formerly provided by several ``importlib.util`` functions as well as the ``init_module_attrs()`` method of several of ``importlib``'s loaders. Just to be clear, here is a more detailed description of those calculations:

``is_package`` is derived from ``path``, if passed. Otherwise the loader's ``is_package()`` is tried. Finally, it defaults to False.

``filename`` is pulled from the loader's ``get_filename()``, if possible.

``path`` is set to an empty list if ``is_package`` is true, and the directory from ``filename`` is appended to it, if available.

``cached`` is derived from ``filename`` if it's available.

``origin`` is set to ``filename``.

``package`` is set to ``name`` if the module is a package and to ``name.rpartition('.')[0]`` otherwise. Consequently, a top-level module will have ``package`` set to the empty string.

Backward Compatibility
----------------------

Since finder ``find_module()`` methods would now return a module spec instead of a loader, specs must act like the loader that would have been returned instead. This is relatively simple to solve since the loader is available as an attribute of the spec.
However, ``ModuleSpec.is_package`` (an attribute) conflicts with ``InspectLoader.is_package()`` (a method). Working around this requires a more complicated solution but is not a large obstacle.

Unfortunately, the ability to proxy does not extend to ``id()`` comparisons and ``isinstance()`` tests. In the case of the return value of ``find_module()``, we accept that break in backward compatibility.

Subclassing
-----------

.. XXX Allowed but discouraged?

Module Objects
--------------

Module objects will now have a ``__spec__`` attribute to which the module's spec will be bound. None of the other import-related module attributes will be changed or deprecated, though some of them could be. Any such deprecation can wait until Python 4.

``ModuleSpec`` objects will not be kept in sync with the corresponding module object's import-related attributes. They may differ, though in practice they will be the same.

Finders
-------

Finders will now return ModuleSpec objects when ``find_module()`` is called rather than loaders. For backward compatibility, ``ModuleSpec`` objects proxy the attributes of their ``loader`` attribute.

Adding another similar method to avoid backward-compatibility issues is undesirable if avoidable. The import APIs have suffered enough. The approach taken by this PEP should be sufficient.

The change to ``find_module()`` applies to both ``MetaPathFinder`` and ``PathEntryFinder``.

``PathEntryFinder.find_loader()`` will be deprecated and, for backward compatibility, implicitly special-cased if the method exists on a finder.

Loaders
-------

Loaders will have a new method, ``exec_module(module)``. Its only job is to "exec" the module and consequently populate the module's namespace. It is not responsible for creating or preparing the module object, nor for any cleanup afterward. It has no return value.

The ``load_module()`` of loaders will still work and be an active part of the loader API.
It is still useful for cases where the default module creation/preparation/cleanup is not appropriate for the loader.

A loader must have ``exec_module()`` or ``load_module()`` defined. If both exist on the loader, ``exec_module()`` is used and ``load_module()`` is ignored.

PEP 420 introduced the optional ``module_repr()`` loader method to limit the amount of special-casing in the module type's ``__repr__()``. Since this method is part of ``ModuleSpec``, it will be deprecated on loaders. However, if it exists on a loader it will be used exclusively.

The loader ``init_module_attrs()`` method, added for Python 3.4, will be eliminated in favor of the same method on ``ModuleSpec``.

However, ``InspectLoader.is_package()`` will not be deprecated even though the same information is found on ``ModuleSpec``. ``ModuleSpec`` can use it to populate its own ``is_package`` if that information is not otherwise available. Still, it will be made optional.

In addition to executing a module during loading, loaders will still be directly responsible for providing APIs concerning module-related data.

Other Changes
-------------

* The various finders and loaders provided by ``importlib`` will be updated to comply with this proposal.
* The spec for the ``__main__`` module will reflect how the interpreter was started. For instance, with ``-m`` the spec's name will be that of the run module, while ``__main__.__name__`` will still be "__main__".
* We add ``importlib.find_module()`` to mirror ``importlib.find_loader()`` (which becomes deprecated).
* Deprecations in ``importlib.util``: ``set_package()``, ``set_loader()``, and ``module_for_loader()``. ``module_to_load()`` (introduced in 3.4) can be removed.
* ``importlib.reload()`` is changed to use ``ModuleSpec.load()``.
* ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of the per-module import lock, whereas ``Loader.load_module()`` did not.
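
[Editorial illustration, not part of the draft.] The split described under "Loaders", where the loader only executes code and something else creates and prepares the module, can be sketched roughly as follows. This is a minimal sketch that assumes the ``importlib`` API as it later shipped (``importlib.util.spec_from_loader()`` and ``importlib.util.module_from_spec()``, the latter available from Python 3.5), not the ``ModuleSpec.load()`` method proposed in this draft. ``StringLoader`` and ``demo_mod`` are hypothetical names made up for the example.

```python
import importlib.abc
import importlib.util


class StringLoader(importlib.abc.Loader):
    """Hypothetical loader that executes source code held in memory."""

    def __init__(self, source):
        self.source = source

    def exec_module(self, module):
        # Only populate the module's namespace. Creating the module and
        # setting its import-related attributes is left to the machinery.
        exec(self.source, module.__dict__)


loader = StringLoader("answer = 6 * 7")

# The spec carries the import-related information before anything is loaded.
spec = importlib.util.spec_from_loader("demo_mod", loader)

# The import machinery (here, a helper) creates and prepares the module...
module = importlib.util.module_from_spec(spec)

# ...and the loader's only remaining job is to execute it.
spec.loader.exec_module(module)

print(module.answer, repr(module.__package__))  # 42 ''
```

Note that the module is deliberately not placed in ``sys.modules`` here; the sketch only shows who is responsible for which step.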
Reference Implementation
------------------------

A reference implementation is available at .


References
==========

[1] http://mail.python.org/pipermail/import-sig/2013-August/000658.html


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ericsnowcurrently at gmail.com Fri Aug 9 08:38:18 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 9 Aug 2013 00:38:18 -0600
Subject: [Import-SIG] PEP proposal: Per-Module Import Path
In-Reply-To: References:

Message-ID: On Thu, Aug 8, 2013 at 6:18 AM, Brett Cannon wrote: > On Wed, Aug 7, 2013 at 9:08 PM, Eric Snow wrote: >> >> The patch I've got is pretty hefty. Should I keep it low key and just >> post it here, or would it be worth logging a ticket and posting it there >> for review? >> > > Once it's all written up in a PEP you can post an issue for the code. > PEP sent to list. I want to pass a couple lingering unit tests before I post the patch. > > >> Once I'm comfortable with the patch I'll try sticking my .ref patch on >> top and see how it looks. I'll probably whip up a PEP for ModuleSpec at >> that point if things are looking good. >> >> I'm just worried about getting this done in time for 3.4. On top of this >> I'm really close on OrderedDict, ordered class definition namespace, >> .__definition_order__, and locals('__kworder__'), so I'm still kind >> of nervous about taking on two non-trivial changes to the import system >> with so little time before beta 1. However, at this point I still think >> it's doable. :) >> > > I personally view all of this as bonus stuff that is in no way required to > make Python function or make some new class of solution available, so I > wouldn't stress about getting in for 3.4. > > >> >> -eric >> >> p.s. I hadn't realized this list was "closed". Should we change that, or >> take this (both ModuleSpec and .ref) to python-ideas (or off-line)? >> > > Eric or Barry are the admins so they can change the wording. I say just > leave it here for now until people are happy with the proposal and then it > can be kicked up to python-dev (python-ideas isn't needed in this case > since we have this mailing list specifically for import discussions). > Fine with me. :) -eric -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Fri Aug 9 10:28:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 9 Aug 2013 10:28:03 +0200 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System References: Message-ID: <20130809102803.5615941d@pitrou.net> Hi, On Fri, 9 Aug 2013 00:34:34 -0600, Eric Snow wrote: > Abstract > ======== > > This PEP proposes to add a new class to ``importlib.machinery`` called > ``ModuleSpec``. It will contain all the import-related information > about a module without needing to load the module first. Finders will > now return a module's spec rather than a loader. The import system > will use the spec to load the module. Looks good on the principle. > Attributes: > > * ``name`` - the module's name (compare to ``__name__``). > * ``loader`` - the loader to use during loading and for module data > (compare to ``__loader__``). Should it be the loader or just a factory to build it? I'm wondering if in some cases creating a loader is costly. > * ``package`` - the name of the module's parent (compare to > ``__package__``). Is it None if there is no parent? > * ``is_package`` - whether or not the module is a package. > * ``origin`` - the location from which the module originates. > * ``filename`` - like origin, but limited to a path-based location > (compare to ``__file__``). Can you explain the difference between origin and filename (or, better, give an example)? > * ``load(module=None, *, is_reload=False)`` - calls the loader's > ``exec_module()``, falling back to ``load_module()`` if necessary. > This method performs the former responsibilities of loaders for > managing modules before actually loading and for cleaning up. The > reload case is facilitated by the ``module`` and ``is_reload`` > parameters. So how about separate load() and reload() methods? > However, ``ModuleSpec.is_package`` (an attribute) conflicts with > ``InspectLoader.is_package()`` (a method). 
Working around this > requires a more complicated solution but is not a large obstacle. Or how about keeping the method API? > Module Objects > -------------- > > Module objects will now have a ``__spec__`` attribute to which the > module's spec will be bound. Nice! > Loaders will have a new method, ``exec_module(module)``. Its only job > is to "exec" the module and consequently populate the module's > namespace. It is not responsible for creating or preparing the module > object, nor for any cleanup afterward. It has no return value. Does it work with extension modules as well? Generally, extension modules are populated when created (i.e. the two steps aren't separate at the C API level, IIRC). Regards Antoine. From brett at python.org Fri Aug 9 16:43:10 2013 From: brett at python.org (Brett Cannon) Date: Fri, 9 Aug 2013 10:43:10 -0400 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: <20130809102803.5615941d@pitrou.net> References: <20130809102803.5615941d@pitrou.net> Message-ID: On Fri, Aug 9, 2013 at 4:28 AM, Antoine Pitrou wrote: > > Hi, > > On Fri, 9 Aug 2013 00:34:34 -0600, > Eric Snow wrote: > > Abstract > > ======== > > > > This PEP proposes to add a new class to ``importlib.machinery`` called > > ``ModuleSpec``. It will contain all the import-related information > > about a module without needing to load the module first. Finders will > > now return a module's spec rather than a loader. The import system > > will use the spec to load the module. > > Looks good on the principle. > > > Attributes: > > > > * ``name`` - the module's name (compare to ``__name__``). > > * ``loader`` - the loader to use during loading and for module data > > (compare to ``__loader__``). > > Should it be the loader or just a factory to build it? > I'm wondering if in some cases creating a loader is costly. > Theoretically it could be costly, but up to this point I have not seen a single loader that cost a lot to create. 
Every loader I have ever written just stores details that the finder had to calculate for it's work and potentially stores something, e.g. an open zipfile that the finder used to see if a module was there. > > > * ``package`` - the name of the module's parent (compare to > > ``__package__``). > > Is it None if there is no parent? > Top-level modules have the value of '' for __package__. None is used to represent an unknown value. -Brett > > > * ``is_package`` - whether or not the module is a package. > > * ``origin`` - the location from which the module originates. > > * ``filename`` - like origin, but limited to a path-based location > > (compare to ``__file__``). > > Can you explain the difference between origin and filename (or, better, > give an example)? > > > * ``load(module=None, *, is_reload=False)`` - calls the loader's > > ``exec_module()``, falling back to ``load_module()`` if necessary. > > This method performs the former responsibilities of loaders for > > managing modules before actually loading and for cleaning up. The > > reload case is facilitated by the ``module`` and ``is_reload`` > > parameters. > > So how about separate load() and reload() methods? > > > However, ``ModuleSpec.is_package`` (an attribute) conflicts with > > ``InspectLoader.is_package()`` (a method). Working around this > > requires a more complicated solution but is not a large obstacle. > > Or how about keeping the method API? > > > Module Objects > > -------------- > > > > Module objects will now have a ``__spec__`` attribute to which the > > module's spec will be bound. > > Nice! > > > Loaders will have a new method, ``exec_module(module)``. Its only job > > is to "exec" the module and consequently populate the module's > > namespace. It is not responsible for creating or preparing the module > > object, nor for any cleanup afterward. It has no return value. > > Does it work with extension modules as well? Generally, extension > modules are populated when created (i.e. 
the two steps aren't separate > at the C API level, IIRC). > > Regards > > Antoine. > > > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Aug 9 18:45:22 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 10:45:22 -0600 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: <20130809102803.5615941d@pitrou.net> References: <20130809102803.5615941d@pitrou.net> Message-ID: On Fri, Aug 9, 2013 at 2:28 AM, Antoine Pitrou wrote: > On Fri, 9 Aug 2013 00:34:34 -0600, > Eric Snow wrote: > > Attributes: > > > > * ``name`` - the module's name (compare to ``__name__``). > > * ``loader`` - the loader to use during loading and for module data > > (compare to ``__loader__``). > > Should it be the loader or just a factory to build it? > I'm wondering if in some cases creating a loader is costly. > The finder is currently responsible for creating the loader and this PEP does not propose changing that. So any such loader already has to deal with this. I suppose some loader could be expensive to create, but none of the existing loaders in the stdlib are that costly. If some future loader runs into this problem they can pretty easily write the loader in such a way that it defers the costly operations. I'll make a note in the PEP about this. > > * ``package`` - the name of the module's parent (compare to > > ``__package__``). > > Is it None if there is no parent? > As Brett noted, it is ''. This is the same as the __package__ attribute of modules. The goal is to keep the same behavior, as much as possible, for all the features that are moved into ModuleSpec. I'll make this objective more clear in the PEP. > > > * ``is_package`` - whether or not the module is a package. 
> > * ``origin`` - the location from which the module originates. > > * ``filename`` - like origin, but limited to a path-based location > > (compare to ``__file__``). > > Can you explain the difference between origin and filename (or, better, > give an example)? > Yeah, that wasn't too clear, was it? filename maps directly to the module's __file__ attribute, which is not set for all modules. For instance, built-in modules do not set it nor do namespace packages. In those cases it is still nice to be able to indicate where the module came from. For built-in modules origin will be set to 'built-in' and for namespace packages 'namespace'. For any module with a filename, origin is set to the filename. Having both origin and filename is meant to provide for different usage. filename is used to populate a module's __file__ attribute. If set, it indicates a path-based module (along with cached and path). In contrast, origin has a broader meaning and is used by the module_repr() method. I suppose there could be a flag to indicate the module is path-based, but I went with a separate spec attribute. Likewise, I toyed with the idea of a path-based subclass, perhaps PathModuleSpec, but wanted to stick with a one-size-fits-all spec class since it is meant to be used almost exclusively for state rather than functionality. In some ways it's like types.SimpleNamespace, but with a couple of import-related methods and some dedicated state. I'll make sure the PEP reflects this. > > * ``load(module=None, *, is_reload=False)`` - calls the loader's > > ``exec_module()``, falling back to ``load_module()`` if necessary. > > This method performs the former responsibilities of loaders for > > managing modules before actually loading and for cleaning up. The > > reload case is facilitated by the ``module`` and ``is_reload`` > > parameters. > > So how about separate load() and reload() methods? > I thought about that too, but found it simpler to keep them together. 
Also, reload is a pretty specialized activity and I plan on leaving some of the boilerplate of it to importlib.reload(). However, I'm not convinced either way actually. I'll think about that some more and update the PEP regardless. Do you have a case to make for making them separate? > > > However, ``ModuleSpec.is_package`` (an attribute) conflicts with > > ``InspectLoader.is_package()`` (a method). Working around this > > requires a more complicated solution but is not a large obstacle. > > Or how about keeping the method API? > Because it is a static piece of data. At the point that we can remove the backward compatibility support, we would be stuck with a method when it should be just a normal attribute. > > > Module Objects > > -------------- > > > > Module objects will now have a ``__spec__`` attribute to which the > > module's spec will be bound. > > Nice! > Ironic that this PEP adds yet another import-related attribute to modules. :) Hopefully it's the last one. > > > Loaders will have a new method, ``exec_module(module)``. Its only job > > is to "exec" the module and consequently populate the module's > > namespace. It is not responsible for creating or preparing the module > > object, nor for any cleanup afterward. It has no return value. > > Does it work with extension modules as well? Generally, extension > modules are populated when created (i.e. the two steps aren't separate > at the C API level, IIRC). > Yeah, it works great. We simply don't implement exec_module() on ExtensionFileLoader and things just stay the same. There is room to add an exec_module() and update the C-API for extension modules to support it, but I'll leaving that out of the PEP. However, I will mention that in the PEP because your question is quite relevant and not well answered there. -eric > Regards > > Antoine. 
> > > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Aug 9 20:03:32 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 12:03:32 -0600 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: References:

Message-ID: On Fri, Aug 9, 2013 at 8:40 AM, Brett Cannon wrote: > On Fri, Aug 9, 2013 at 2:34 AM, Eric Snow wrote: > >> Finally, when the import system calls a finder's ``find_module()``, the >> > finder makes use of a variety of information about the module that is >> useful outside the context of the method. Currently the options are >> limited for persisting that per-module information past the method call, >> since it only returns the loader. Either store it in a module-to-info >> mapping somewhere like on the finder itself, or store it on the loader. >> > > The two previous sentences are hard to read; I think you were after > something like, > "Popular options for this limitation are to store the information is in a > module-to-info > mapping somewhere on the finder itself, or store it on the loader. > Sounds good. > > >> (The idea grew feet during discussions related to another PEP.[1]) >> > > "(This PEP grew out of discussions related to another PEP [1])" > Yeah, this was one of the last things I added to the PEP and my brain was starting to get a little fuzzy. :) > * ``is_package`` - whether or not the module is a package. >> > > I think is_package() is redundant in the face of 'name'/'package' or > 'path' as you can introspect the same information. I honestly have always > found it a weakness of InspectLoader.is_package() that it didn't return the > value for __path__. > I see what you mean, but I also think it's nice to be able to explicitly see if a spec is for a package without having to know about underlying rules. However, I'll just make it a property instead of something set on the spec (and remove it from __init__). > > >> * ``origin`` - the location from which the module originates. >> > > Don't quite follow what this is meant to represent? Like the path to the > zipfile if loaded that way, otherwise it's the file path? > Yeah, Antoine had the same question. I'll make sure the PEP is clearer. 
Basically filename maps to the module's __file__ and origin is used for the module's repr if filename isn't set. > > >> * ``filename`` - like origin, but limited to a path-based location >> (compare to ``__file__``). >> * ``cached`` - the location where the compiled module should be stored >> (compare to ``__cached__``). >> * ``path`` - the list of path entries in which to search for submodules >> or ``None``. (compare to ``__path__``). It should be in sync with >> ``is_package``. >> > > Why is 'path' the only attribute with a default value? Should probably say > everything has a default value of None if not set/known. > Good point. > > >> >> Those are also the parameters to ``ModuleSpec.__init__()``, in that >> order. >> > > I would consider arguing all arguments should be keyword-only past 'name' > since there is no way most people will remember that order correctly. > Makes sense, though I'll make everything but name and loader keyword-only. > * ``from_loader(cls, ...)`` - returns a new ``ModuleSpec`` derived from the >> arguments. The parameters are the same as with ``__init__``, except >> ``package`` is excluded and only ``name`` and ``loader`` are required. >> > > Why the switch in requirements compared to __init__()? > Because package is always calculated and only name and loader are necessary to calculate the remaining attributes. Perhaps from_loader() is the wrong name (I'm open to alternatives). Perhaps __init__() should take over some of the calculating. My intention is to provide one API for what-you-pass-in-is-what-you-get (__init__) and another for calculating attributes. Of course, one could simply modify the spec after creating it, but I like idea of explicitly opting in to calculated values. I'll add this point to the PEP. Also I'll probably also drop package as a parameter of __init__ and make the attribute a property. 
I've also toyed with the idea of making all the attributes properties (aka read-only) since changing a module's spec later on could lead to headaches, but I'm not convinced that is an easy problem to cause. It's better to not get in the way of those who have needs I haven't anticipated (consenting adults, etc.). What do you think?

>
>
>> * ``module_repr()`` - returns a repr for the module.
>> * ``init_module_attrs(module)`` - sets the module's import-related
>>   attributes.
>>
>
> Specify what those attributes are and how they are set.
>

Will do.

>
>
>> * ``load(module=None, *, is_reload=False)`` - calls the loader's
>>   ``exec_module()``, falling back to ``load_module()`` if necessary.
>>   This method performs the former responsibilities of loaders for
>>   managing modules before actually loading and for cleaning up. The
>>   reload case is facilitated by the ``module`` and ``is_reload``
>>   parameters.
>>
>
> If a module is provided and there is already a matching key in
> sys.modules, what happens?
> What if is_reload is True but there is no module provided or in
> sys.modules; KeyError, ValueError, ImportError? Do you follow having None
> in sys.modules and raise ImportError, or do you overwrite (same question if
> a module is explicitly provided)?
>

That's a good point. I thought I had addressed this in the PEP, but apparently not. For Loader.load_module(), as you know, the existence of the key in sys.modules indicates a reload should happen. The is_reload parameter is meant to provide an explicit indicator. The module you pass in is simply the one to use. If a module is not passed in and is_reload is true, the module in sys.modules will be used. If that module is None or not there, ImportError would be raised. If a module is passed in and is_reload is false, I was planning on just ignoring that module. However, raising ValueError in that case would be more useful, indicating that the method was called incorrectly.
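The argument rules described in the preceding paragraph can be sketched roughly as follows. This is only an illustration of the rules as stated in this message, not the PEP's implementation; resolve_target() and its modules parameter are hypothetical names, and the modules mapping stands in for sys.modules so the sketch can be exercised without touching the real cache.

```python
import sys
import types


def resolve_target(name, module=None, *, is_reload=False, modules=None):
    """Pick the module object that load() would operate on (sketch)."""
    modules = sys.modules if modules is None else modules
    if not is_reload:
        if module is not None:
            # A module passed without is_reload is treated as a misuse.
            raise ValueError("module may only be passed with is_reload=True")
        return types.ModuleType(name)
    if module is not None:
        # An explicitly passed module is simply the one to use.
        return module
    found = modules.get(name)
    if found is None:
        # Covers both a missing key and a None entry.
        raise ImportError("cannot reload %r: not in sys.modules" % name)
    return found


cache = {"spam": types.ModuleType("spam"), "blocked": None}
print(resolve_target("spam", is_reload=True, modules=cache) is cache["spam"])
```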
Having just the module parameter and letting it indicate a reload is doable, but that would mean losing the option of having load() look up the module (and it's less explicit). Another option is to have a separate reload() method. Antoine mentioned it and I'd considered it early on. I'm considering it again since it makes the API less complicated. Do you have a preference between the current proposal (load() does it all) and a separate reload() method? ``is_package`` is derived from ``path``, if passed. Otherwise the >> loader's ``is_package()`` is tried. Finally, it defaults to False. >> > > It can also be calculated based on whether ``name`` == ``package``: ``True > if path is not None else name == package``. > Good point, though at this point I don't think package will be something you set. Always need to watch out for [] for path as that is valid and signals the > module is a package. > Yeah, I've got that covered in from_loader(). This is where defining exactly what details need to be passed in and which > ones are optional are going to be critical in determining what represents > ambiguity/unknown details vs. what is flat-out known to be true/false. > Agreed. I'll be sure to spell it out. > ``cached`` is derived from ``filename`` if it's available. >> > > Derived how? > cache_from_source() > methods would now return a module spec >> instead of loader, specs must act like the loader that would have been >> returned instead. This is relatively simple to solve since the loader >> is available as an attribute of the spec. >> > > Are you going to define a __getattr__ to delegate to the loader? Or are > you going to specifically define equivalent methods, e.g. get_filename() is > obviously solvable by getting the attribute from the spec (as long as > filename is a required value)? > __getattr__(). I don't want to guess what methods a loader might have. 
And if someone wants to call get_filename() on what they think is the loader, I think it's better to just call the loader's get_filename(). I'd left this stuff out as an implementation detail. Do you think it should be in the PEP? I could simply elaborate on "specs must act like the loader". > > >> >> However, ``ModuleSpec.is_package`` (an attribute) conflicts with >> ``InspectLoader.is_package()`` (a method). Working around this requires >> a more complicated solution but is not a large obstacle. >> >> Unfortunately, the ability to proxy does not extend to ``id()`` >> comparisons and ``isinstance()`` tests. In the case of the return value >> of ``find_module()``, we accept that break in backward compatibility. >> > > Mention that ModuleSpec can be added to the proper ABCs in importlib.abc > to help alleviate this issue. > Good point. > > >> >> Subclassing >> ----------- >> >> .. XXX Allowed but discouraged? >> > > Why should it matter if they are subclassed? > My goal was for ModuleSpec to be the container for module definition state with some common attributes as a baseline and a minimal number of methods for the import system to use. Loaders would be where you would do extra stuff or customize functionality, which is basically what happens now. It seemed correct before but now it's feeling like a very artificial and unnecessary objective. Finders >> ------- >> >> Finders will now return ModuleSpec objects when ``find_module()`` is >> called rather than loaders. For backward compatility, ``Modulespec`` >> objects proxy the attributes of their ``loader`` attribute. >> >> Adding another similar method to avoid backward-compatibility issues >> is undersireable if avoidable. The import APIs have suffered enough. >> > > in lieu of the fact that find_loader() was just introduced in Python 3.3. > Are you suggesting additional wording or making a comment? > >> Loaders >> ------- >> >> Loaders will have a new method, ``exec_module(module)``. 
Its only job
>> is to "exec" the module and consequently populate the module's
>> namespace. It is not responsible for creating or preparing the module
>> object, nor for any cleanup afterward. It has no return value.
>>
>> The ``load_module()`` of loaders will still work and be an active part
>> of the loader API. It is still useful for cases where the default
>> module creation/preparation/cleanup is not appropriate for the loader.
>>
>
> But will it still be required? Obviously importlib.abc.Loader can grow a
> default load_module() defined around exec_module(), but it should be clear
> if we expect the method to always be manually defined or if it will
> eventually go away.
>

load_module() will no longer be required. However, it still serves a real purpose: the loader may still need to control more of the loading process. By implementing load_module() but not exec_module(), a loader gets that. I'll make sure that's clear.

>
>
>>
>> A loader must have ``exec_module()`` or ``load_module()`` defined. If
>> both exist on the loader, ``exec_module()`` is used and
>> ``load_module()`` is ignored.
>>
>
> Ignored by whom? Should specify that the import system is the one doing
> the ignoring.
>

Got it.

>> * Deprecations in ``importlib.util``: ``set_package()``,
>>   ``set_loader()``, and ``module_for_loader()``. ``module_to_load()``
>>   (introduced in 3.4) can be removed.
>>
>
> "(introduced prior to Python 3.4's release)"; remember, PEPs are timeless
> and will outlive 3.4 so specifying it never went public is important.
>

Good catch. You should be a PEP editor.

-eric

From ericsnowcurrently at gmail.com Fri Aug 9 20:15:32 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Fri, 9 Aug 2013 12:15:32 -0600
Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System
In-Reply-To:
References:

Message-ID:

Would it be worth deprecating the current signature and attributes of FileLoader, NamespaceLoader, etc.? FileLoader.get_filename() uses self.path, but otherwise the only use for the attributes is already covered by the info in the spec.

Also, should we have timelines for the deprecations in the PEP? I'm inclined to not worry about it, but it *would* be nice to remove at least some of the backward compatibility hackery that this PEP will introduce.

-eric

From brett at python.org Fri Aug 9 20:23:39 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 9 Aug 2013 14:23:39 -0400
Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System
In-Reply-To:
References:

Message-ID:

On Fri, Aug 9, 2013 at 2:15 PM, Eric Snow wrote:

> Would it be worth deprecating the current signature and attributes of
> FileLoader, NamespaceLoader, etc. FileLoader.get_filename() uses
> self.path, but otherwise the only use for the attributes is already covered
> by the info in the spec.
>

Probably, or at least provide a Spec-only signature of the __init__().

>
> Also, should we have timelines for the deprecations in the PEP. I'm
> inclined to not worry about it, but it *would* be nice to remove at least
> some of the backward compatibility hackery that this PEP will introduce.
>

Since the backwards-compatibility hacks don't sound like they will be ridiculously complex or getting in the way I say just put in proper PendingDeprecationWarnings and assume they will be there until Python 4 (no later than 8 years away! =).

From solipsis at pitrou.net Fri Aug 9 21:22:49 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 9 Aug 2013 21:22:49 +0200
Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System
In-Reply-To:
References: <20130809102803.5615941d@pitrou.net>
Message-ID: <20130809212249.04db6a5c@fsol>

On Fri, 9 Aug 2013 10:45:22 -0600 Eric Snow wrote:
> > So how about separate load() and reload() methods?
> >
>
> I thought about that too, but found it simpler to keep them together.
> Also, reload is a pretty specialized activity and I plan on leaving some
> of the boilerplate of it to importlib.reload(). However, I'm not convinced
> either way actually. I'll think about that some more and update the PEP
> regardless. Do you have a case to make for making them separate?

Well, is there another way to use load() than:
- load(): load a new module
- load(existing_module, is_reload=True): reload an existing module

I mean, does it make sense to call e.g.
- load(some_existing_module, is_reload=False)
- load(is_reload=True)

?

Regards

Antoine.
From ericsnowcurrently at gmail.com Sat Aug 10 00:28:06 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 16:28:06 -0600 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: References:

Message-ID: On Fri, Aug 9, 2013 at 12:20 PM, Brett Cannon wrote: > On Fri, Aug 9, 2013 at 2:03 PM, Eric Snow wrote: > > Having just the module parameter and letting it indicate a reload is >> doable, but that would mean losing the option of having load() look up the >> module (and it's less explicit). Another option is to have a separate >> reload() method. Antoine mentioned it and I'd considered it early on. I'm >> considering it again since it makes the API less complicated. Do you have >> a preference between the current proposal (load() does it all) and a >> separate reload() method? >> > > Nope, no preference. > Okay. I'll probably try it out a separate reload() and see how things look. > > >> >> ``is_package`` is derived from ``path``, if passed. Otherwise the >>>> loader's ``is_package()`` is tried. Finally, it defaults to False. >>>> >>> >>> It can also be calculated based on whether ``name`` == ``package``: >>> ``True if path is not None else name == package``. >>> >> >> Good point, though at this point I don't think package will be something >> you set. >> > > So you would set 'name' and 'path' to decide if something is a package and > use that to calculate 'package'? > That and the loader's is_package(), if available. > cache_from_source() >> > > I figured, but I know too much about this stuff. =) I would spell it out > in the PEP. > Done. > __getattr__(). I don't want to guess what methods a loader might have. >> And if someone wants to call get_filename() on what they think is the >> loader, I think it's better to just call the loader's get_filename(). I'd >> left this stuff out as an implementation detail. Do you think it should be >> in the PEP? I could simply elaborate on "specs must act like the loader". >> > > I would elaborate that it's going to be __getattr__() since it influences > the level of backwards-compatibility. > Done. 
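A minimal sketch of the __getattr__() delegation mentioned above, assuming nothing beyond what the thread says (the spec keeps the loader as an attribute and forwards unknown lookups to it). SpecSketch and FakeLoader are hypothetical stand-ins, not the PEP's classes. The sketch also shows the limitation noted in the PEP text: proxying does not make isinstance() checks against the loader's type succeed.

```python
class FakeLoader:
    """Hypothetical loader exposing one loader-specific method."""

    def get_filename(self, fullname):
        return "/tmp/" + fullname + ".py"


class SpecSketch:
    """Hypothetical spec that acts like its loader for unknown names."""

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader

    def __getattr__(self, attr):
        # Called only when normal lookup fails. Guard against infinite
        # recursion if 'loader' itself has not been set yet.
        if attr == "loader":
            raise AttributeError(attr)
        return getattr(self.loader, attr)


spec = SpecSketch("spam", FakeLoader())
print(spec.get_filename("spam"))     # forwarded to the loader
print(isinstance(spec, FakeLoader))  # False: type checks are not proxied
```

Because __getattr__() is consulted only after ordinary lookup fails, attributes defined on the spec itself (such as name) always win over same-named loader attributes, which is why a name clash like is_package needs separate handling.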
> My goal was for ModuleSpec to be the container for module definition state >> with some common attributes as a baseline and a minimal number of methods >> for the import system to use. Loaders would be where you would do extra >> stuff or customize functionality, which is basically what happens now. >> >> It seemed correct before but now it's feeling like a very artificial and >> unnecessary objective. >> > > I totally get where you are coming from and if we were working in a > language that pushed for read-only attributes I would agree, but we aren't > so I wouldn't. =) It just becomes more hassle than it's worth to enforce. > Agreed. > in lieu of the fact that find_loader() was just introduced in Python 3.3. >>> >> >> Are you suggesting additional wording or making a comment? >> > > Both? =) > Okay. I clarified that. I'll probably be posting an updated PEP shortly. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Sat Aug 10 00:36:49 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 16:36:49 -0600 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: References:

Message-ID: On Fri, Aug 9, 2013 at 12:23 PM, Brett Cannon wrote: > > > > On Fri, Aug 9, 2013 at 2:15 PM, Eric Snow wrote: > >> Would it be worth deprecating the current signature and attributes of >> FileLoader, NamespaceLoader, etc. FileLoader.get_filename() uses >> self.path, but otherwise the only use for the attributes is already covered >> by the info in the spec. >> > > Probably, or at least provide a Spec-only signature of the __init__(). > > >> >> Also, should we have timelines for the deprecations in the PEP. I'm >> inclined to not worry about it, but it *would* be nice to remove at least >> some of the backward compatibility hackery that this PEP will introduce. >> > > Since the backwards-compatibility hacks don't sound like they will be > ridiculously complex or getting in the way I say just put in proper > PendingDeprecationWarnings and assume they will be there until Python 4 (no > later than 8 years away! =). > Sounds good. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Sat Aug 10 00:44:55 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 16:44:55 -0600 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: <20130809212249.04db6a5c@fsol> References: <20130809102803.5615941d@pitrou.net> <20130809212249.04db6a5c@fsol> Message-ID: On Fri, Aug 9, 2013 at 1:22 PM, Antoine Pitrou wrote: > Well, is there another way to use load() than: > - load(): load a new module > - load(existing_module, is_reload=True): reload an existing module > > I mean, does it make sense to call e.g. > - load(some_existing_module, is_reload=False) > This would be a ValueError. The module argument is meant just for reload. I'm not sure it makes sense otherwise. Perhaps so you could prepare your own new module prior to calling load()? I'd like to leave that off the table for this PEP. 
> - load(is_reload=True) > This was always okay in my mind, but I realized it did not make it to the PEP until Brett had some similar questions. :) The updated PEP covers this. Like I told Brett, I'm going to see how a separate reload() looks and go from there. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Sat Aug 10 01:19:21 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 17:19:21 -0600 Subject: [Import-SIG] 40k limit on this list Message-ID: Apparently I blew past the size limit for posting to this list. FYI, I posted an updated PEP for ModuleSpec and it should be showing up at some point. :) -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Sat Aug 10 06:58:01 2013 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 10 Aug 2013 06:58:01 +0200 Subject: [Import-SIG] 40k limit on this list In-Reply-To: References: Message-ID: I'm traveling and without access to a real computer. I'll release your message in the next 48 hours, if no one beats me to it. -- Eric. On Aug 10, 2013, at 1:19 AM, Eric Snow wrote: > Apparently I blew past the size limit for posting to this list. FYI, I posted an updated PEP for ModuleSpec and it should be showing up at some point. 
:)
>
> -eric
> _______________________________________________
> Import-SIG mailing list
> Import-SIG at python.org
> http://mail.python.org/mailman/listinfo/import-sig

From ncoghlan at gmail.com Sat Aug 10 12:50:25 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 10 Aug 2013 20:50:25 +1000
Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System
In-Reply-To:
References: <20130809102803.5615941d@pitrou.net> <20130809212249.04db6a5c@fsol>
Message-ID:

On 10 August 2013 08:44, Eric Snow wrote:
> On Fri, Aug 9, 2013 at 1:22 PM, Antoine Pitrou wrote:
>>
>> Well, is there another way to use load() than:
>> - load(): load a new module
>> - load(existing_module, is_reload=True): reload an existing module
>>
>> I mean, does it make sense to call e.g.
>> - load(some_existing_module, is_reload=False)
>
>
> This would be a ValueError. The module argument is meant just for reload.
> I'm not sure it makes sense otherwise. Perhaps so you could prepare your
> own new module prior to calling load()? I'd like to leave that off the
> table for this PEP.

The advantage of offering that API over telling people to call spec.loader.exec_module(m) directly is that it gives us more control over the loading process (by updating ModuleSpec.load), avoiding the current problem we have where providing new load time behaviour is difficult because we don't control the loader implementations.

>>
>> - load(is_reload=True)
>
>
> This was always okay in my mind, but I realized it did not make it to the
> PEP until Brett had some similar questions. :) The updated PEP covers this.
> Like I told Brett, I'm going to see how a separate reload() looks and go
> from there.

A separate reload that works something like this sounds good to me:

    def reload(self, module=None):
        if module is None:
            module = sys.modules[self.name]
        self.load(module)

Cheers,
Nick.
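To make the shape of that suggestion easier to try out, here is a small runnable sketch built around the same reload() body. Everything other than that body is an assumption for illustration: SpecSketch, its body callable, the simplified load() that registers a new module in sys.modules, and the demo module name are all hypothetical, and both methods return the module purely for convenience.

```python
import sys
import types


class SpecSketch:
    """Hypothetical spec with separate load() and reload() methods."""

    def __init__(self, name, body):
        self.name = name
        self.body = body  # callable that populates a module namespace

    def load(self, module=None):
        if module is None:
            # Fresh load: create the module and register it.
            module = types.ModuleType(self.name)
            sys.modules[self.name] = module
        self.body(module)
        return module

    def reload(self, module=None):
        # Same logic as the snippet in the message above; a missing
        # entry in sys.modules surfaces as KeyError here.
        if module is None:
            module = sys.modules[self.name]
        return self.load(module)


def body(module):
    # Count executions so the re-execution is observable.
    module.runs = getattr(module, "runs", 0) + 1


spec = SpecSketch("_reload_sketch_demo", body)
first = spec.load()
second = spec.reload()
del sys.modules[spec.name]  # leave the real module cache clean

print(first is second, first.runs)
```

In this sketch reload() reuses the existing module object and simply runs the body again, so references held elsewhere keep pointing at the same module.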
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Aug 10 13:02:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 10 Aug 2013 21:02:44 +1000 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: References: Message-ID: This generally looks good to me. Something I'm wondering: Q1. Can we experiment with this as a custom metapath importer? A1. Not really, because we want to use it to avoid some of the other importlib additions made in 3.4. However, a backport to 3.3 as a custom metapath hook may still be interesting. Q2. Given this idea as a foundation, could we experiment with ref file support as a custom importer? A2. Quite possibly, which may make that a good thing to defer to 3.5 (for stdlib inclusion, anyway). I'll wait until the updated version gets through before commenting further :) Cheers, Nick. From ericsnowcurrently at gmail.com Sat Aug 10 19:57:18 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 10 Aug 2013 11:57:18 -0600 Subject: [Import-SIG] 40k limit on this list In-Reply-To: References:

Message-ID: Thanks, Eric. -eric On Fri, Aug 9, 2013 at 10:58 PM, Eric V. Smith wrote: > I'm traveling and without access to a real computer. I'll release your > message in the next 48 hours, if no one beats me to it. > > -- > Eric. > > On Aug 10, 2013, at 1:19 AM, Eric Snow > wrote: > > > Apparently I blew past the size limit for posting to this list. FYI, I > posted an updated PEP for ModuleSpec and it should be showing up at some > point. :) > > > > -eric > > _______________________________________________ > > Import-SIG mailing list > > Import-SIG at python.org > > http://mail.python.org/mailman/listinfo/import-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Aug 9 16:40:10 2013 From: brett at python.org (Brett Cannon) Date: Fri, 9 Aug 2013 10:40:10 -0400 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: References: Message-ID: I like the idea and I think it can be more-or-less safe. Just need more specification/clarification on things. On Fri, Aug 9, 2013 at 2:34 AM, Eric Snow wrote: > This is an outgrowth of discussions on the .ref PEP, but it's also > something I've been thinking about for over a year and starting toying with > at the last PyCon. I have a patch that passes all but a couple unit tests > and should pass though when I get a minute to take another pass at it. > I'll probably end up adding a bunch more unit tests before I'm done as > well. However, the functionality is mostly there. > > BTW, I gotta say, Brett, I have a renewed appreciation for the long and > hard effort you put into importlib. There are just so many odd corner > cases that I never would have looked for if not for that library. And > those unit tests do a great job of covering all of that. Thanks! > Welcome! And yes, importlib didn't take multiple years out of laziness, but just how much work had to go in to cover corner cases along with pauses from frustration with the semantics. 
:P > > -eric > > > ------------------------------------------------------------------------------- > > PEP: 4XX > Title: A ModuleSpec Type for the Import System > Version: $Revision$ > Last-Modified: $Date$ > Author: Eric Snow > BDFL-Delegate: ??? > Discussions-To: import-sig at python.org > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 8-Aug-2013 > Python-Version: 3.4 > Post-History: 8-Aug-2013 > Resolution: > > > Abstract > ======== > > This PEP proposes to add a new class to ``importlib.machinery`` called > ``ModuleSpec``. It will contain all the import-related information > about a module without needing to load the module first. Finders will > now return a module's spec rather than a loader. The import system will > use the spec to load the module. > > > Motivation > ========== > > The import system has evolved over the lifetime of Python. In late 2002 > PEP 302 introduced standardized import hooks via ``finders`` and > ``loaders`` and ``sys.meta_path``. The ``importlib`` module, introduced > with Python 3.1, now exposes a pure Python implementation of the APIs > described by PEP 302, as well as of the full import system. It is now > much easier to understand and extend the import system. While a benefit > to the Python community, this greater accessibilty also presents a > challenge. > > As more developers come to understand and customize the import system, > any weaknesses in the finder and loader APIs will be more impactful. So > the sooner we can address any such weaknesses the import system, the > better...and there are a couple we can take care of with this proposal. > > Firstly, any time the import system needs to save information about a > module we end up with more attributes on module objects that are > generally only meaningful to the import system and occoasionally to some > people. It would be nice to have a per-module namespace to put future > import-related information. 
Secondly, there's an API void between > finders and loaders that causes undue complexity when encountered. > > Finders are strictly responsible for providing the loader which the > import system will use to load the module. The loader is then > responsible for doing some checks, creating the module object, setting > import-related attributes, "installing" the module to ``sys.modules``, > and loading the module, along with some cleanup. This all takes place > during the import system's call to ``Loader.load_module()``. Loaders > also provide some APIs for accessing data associated with a module. > > Loaders are not required to provide any of the functionality of > ``load_module()`` through other methods. Thus, though the import- > related information about a module is likely available without loading > the module, it is not otherwise exposed. > > Furthermore, the requirements assocated with ``load_module()`` are > common to all loaders and mostly are implemented in exactly the same > way. This means every loader has to duplicate the same boilerplate > code. ``importlib.util`` provides some tools that help with this, but > it would be more helpful if the import system simply took charge of > these responsibilities. The trouble is that this would limit the degree > of customization that ``load_module()`` facilitates. This is a gap > between finders and loaders which this proposal aims to fill. > > Finally, when the import system calls a finder's ``find_module()``, the > finder makes use of a variety of information about the module that is > useful outside the context of the method. Currently the options are > limited for persisting that per-module information past the method call, > since it only returns the loader. Either store it in a module-to-info > mapping somewhere like on the finder itself, or store it on the loader. 
> The two previous sentences are hard to read; I think you were after something like, "Popular options for this limitation are to store the information is in a module-to-info mapping somewhere on the finder itself, or store it on the loader. > Unfortunately, loaders are not required to be module-specific. On top > of that, some of the useful information finders could provide is > common to all finders, so ideally the import system could take care of > that. This is the same gap as before between finders and loaders. > > As an example of complexity attributable to this flaw, the > implementation of namespace packages in Python 3.3 (see PEP 420) added > ``FileFinder.find_loader()`` because there was no good way for > ``find_module()`` to provide the namespace path. > > The answer to this gap is a ``ModuleSpec`` object that contains the > per-module information and takes care of the boilerplate functionality > of loading the module. > > (The idea grew feet during discussions related to another PEP.[1]) > "(This PEP grew out of discussions related to another PEP [1])" > > > Specification > ============= > > ModuleSpec > ---------- > > A new class which defines the import-related values to use when loading > the module. It closely corresponds to the import-related attributes of > module objects. ``ModuleSpec`` objects may also be used by finders and > loaders and other import-related APIs to hold extra import-related > information about the module. This greatly reduces the need to add any > new import-related attributes to module objects. > > Attributes: > > * ``name`` - the module's name (compare to ``__name__``). > * ``loader`` - the loader to use during loading and for module data > (compare to ``__loader__``). > * ``package`` - the name of the module's parent (compare to > ``__package__``). > * ``is_package`` - whether or not the module is a package. 
> I think is_package() is redundant in the face of 'name'/'package' or 'path' as you can introspect the same information. I honestly have always found it a weakness of InspectLoader.is_package() that it didn't return the value for __path__. > * ``origin`` - the location from which the module originates. > Don't quite follow what this is meant to represent? Like the path to the zipfile if loaded that way, otherwise it's the file path? > * ``filename`` - like origin, but limited to a path-based location > (compare to ``__file__``). > * ``cached`` - the location where the compiled module should be stored > (compare to ``__cached__``). > * ``path`` - the list of path entries in which to search for submodules > or ``None``. (compare to ``__path__``). It should be in sync with > ``is_package``. > Why is 'path' the only attribute with a default value? Should probably say everything has a default value of None if not set/known. > > Those are also the parameters to ``ModuleSpec.__init__()``, in that > order. > I would consider arguing all arguments should be keyword-only past 'name' since there is no way most people will remember that order correctly. > The last three are optional. > (filename, cached, and path). And that definitely makes is_package redundant if that's true. > When passed the values are taken > as-is. The ``from_loader()`` method offers calculated values. > "(see below)." > > Methods: > > * ``from_loader(cls, ...)`` - returns a new ``ModuleSpec`` derived from the > arguments. The parameters are the same as with ``__init__``, except > ``package`` is excluded and only ``name`` and ``loader`` are required. > Why the switch in requirements compared to __init__()? > * ``module_repr()`` - returns a repr for the module. > * ``init_module_attrs(module)`` - sets the module's import-related > attributes. > Specify what those attributes are and how they are set. 
> * ``load(module=None, *, is_reload=False)`` - calls the loader's > ``exec_module()``, falling back to ``load_module()`` if necessary. > This method performs the former responsibilities of loaders for > managing modules before actually loading and for cleaning up. The > reload case is facilitated by the ``module`` and ``is_reload`` > parameters. > If a module is provided and there is already a matching key in sys.modules, what happens? What if is_reload is True but there is no module provided or in sys.modules; KeyError, ValueError, ImportError? Do you follow having None in sys.modules and raise ImportError, or do you overwrite (same question if a module is explicitly provided)? > > Values Derived by from_loader() > ------------------------------- > > As implied above, ``from_loader()`` makes a best effort at calculating > any of the values that are not passed in. It duplicates the behavior > that was formerly provided the several ``importlib.util`` functions as > well as the ``init_module_attrs()`` method of several of ``importlib``'s > loaders. Just to be clear, here is a more detailed description of those > calculations: > > ``is_package`` is derived from ``path``, if passed. Otherwise the > loader's ``is_package()`` is tried. Finally, it defaults to False. > It can also be calculated based on whether ``name`` == ``package``: ``True if path is not None else name == package``. Always need to watch out for [] for path as that is valid and signals the module is a package. This is where defining exactly what details need to be passed in and which ones are optional are going to be critical in determining what represents ambiguity/unknown details vs. what is flat-out known to be true/false. > > ``filename`` is pulled from the loader's ``get_filename()``, if > possible. > > ``path`` is set to an empty list if ``is_package`` is true, and the > directory from ``filename`` is appended to it, if available. > > ``cached`` is derived from ``filename`` if it's available. 
> Derived how? > > ``origin`` is set to ``filename``. > > ``package`` is set to ``name`` if the module is a package and > "... is a package, else to ..." > to ``name.rpartition('.')[0]`` otherwise. Consequently, a > top-level module will have ``package`` set to the empty string. > > Backward Compatibility > ---------------------- > > Since finder ``find_module()`` > ``Finder.find_module()`` > methods would now return a module spec > instead of loader, specs must act like the loader that would have been > returned instead. This is relatively simple to solve since the loader > is available as an attribute of the spec. > Are you going to define a __getattr__ to delegate to the loader? Or are you going to specifically define equivalent methods, e.g. get_filename() is obviously solvable by getting the attribute from the spec (as long as filename is a required value)? > > However, ``ModuleSpec.is_package`` (an attribute) conflicts with > ``InspectLoader.is_package()`` (a method). Working around this requires > a more complicated solution but is not a large obstacle. > > Unfortunately, the ability to proxy does not extend to ``id()`` > comparisons and ``isinstance()`` tests. In the case of the return value > of ``find_module()``, we accept that break in backward compatibility. > Mention that ModuleSpec can be added to the proper ABCs in importlib.abc to help alleviate this issue. > > Subclassing > ----------- > > .. XXX Allowed but discouraged? > Why should it matter if they are subclassed? > > Module Objects > -------------- > > Module objects will now have a ``__spec__`` attribute to which the > module's spec will be bound. None of the other import-related module > attributes will be changed or deprecated, though some of them could be. > Any such deprecation can wait until Python 4. > "... could be; any such ..." > > ``ModuleSpec`` objects will not be kept in sync with the corresponding > module object's import-related attributes. 
They may differ, though in > practice they will be the same. > "Though they may differ, in practice they will typically be the same." > > Finders > ------- > > Finders will now return ModuleSpec objects when ``find_module()`` is > called rather than loaders. For backward compatibility, ``ModuleSpec`` > objects proxy the attributes of their ``loader`` attribute. > > Adding another similar method to avoid backward-compatibility issues > is undesirable if avoidable. The import APIs have suffered enough. > in light of the fact that find_loader() was just introduced in Python 3.3. > The approach taken by this PEP should be sufficient. > > The change to ``find_module()`` applies to both ``MetaPathFinder`` and > ``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be > deprecated and, for backward compatibility, implicitly special-cased if > the method exists on a finder. > > Loaders > ------- > > Loaders will have a new method, ``exec_module(module)``. Its only job > is to "exec" the module and consequently populate the module's > namespace. It is not responsible for creating or preparing the module > object, nor for any cleanup afterward. It has no return value. > > The ``load_module()`` of loaders will still work and be an active part > of the loader API. It is still useful for cases where the default > module creation/preparation/cleanup is not appropriate for the loader. > But will it still be required? Obviously importlib.abc.Loader can grow a default load_module() defined around exec_module(), but it should be clear if we expect the method to always be manually defined or if it will eventually go away. > > A loader must have ``exec_module()`` or ``load_module()`` defined. If > both exist on the loader, ``exec_module()`` is used and > ``load_module()`` is ignored. > Ignored by whom? Should specify that the import system is the one doing the ignoring.
> > PEP 420 introduced the optional ``module_repr()`` loader method to limit > the amount of special-casing in the module type's ``__repr__()``. Since > this method is part of ``ModuleSpec``, it will be deprecated on loaders. > However, if it exists on a loader it will be used exclusively. > > The loader ``init_module_attr()`` method, added for Python 3.4 will be > eliminated in favor of the same method on ``ModuleSpec``. > "method, added prior to Python 3.4's release, will be removed ..." > > However, ``InspectLoader.is_package()`` will not be deprecated even > though the same information is found on ``ModuleSpec``. ``ModuleSpec`` > can use it to populate its own ``is_package`` if that information is > not otherwise available. Still, it will be made optional. > > In addition to executing a module during loading, loaders will still be > directly responsible for providing APIs concerning module-related data. > > Other Changes > ------------- > > * The various finders and loaders provided by ``importlib`` will be > updated to comply with this proposal. > > * The spec for the ``__main__`` module will reflect how the interpreter > was started. For instance, with ``-m`` the spec's name will be that of > the run module, while ``__main__.__name__`` will still be "__main__". > > * We add ``importlib.find_module()`` to mirror > ``importlib.find_loader()`` (which becomes deprecated). > > * Deprecations in ``importlib.util``: ``set_package()``, > ``set_loader()``, and ``module_for_loader()``. ``module_to_load()`` > (introduced in 3.4) can be removed. > "(introduced prior to Python 3.4's release)"; remember, PEPs are timeless and will outlive 3.4 so specifying it never went public is important. > > * ``importlib.reload()`` is changed to use ``ModuleSpec.load()``. > > * ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of > the per-module import lock, whereas ``Loader.load_module()`` did not. 
> > Reference Implementation > ------------------------ > > A reference implementation is available at . > > > References > ========== > > [1] http://mail.python.org/pipermail/import-sig/2013-August/000658.html > > > Copyright > ========= > > This document has been placed in the public domain. > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > >
From brett at python.org Fri Aug 9 20:20:45 2013 From: brett at python.org (Brett Cannon) Date: Fri, 9 Aug 2013 14:20:45 -0400 Subject: [Import-SIG] Rough PEP: A ModuleSpec Type for the Import System In-Reply-To: References:

Message-ID: On Fri, Aug 9, 2013 at 2:03 PM, Eric Snow wrote: > On Fri, Aug 9, 2013 at 8:40 AM, Brett Cannon wrote: > >> On Fri, Aug 9, 2013 at 2:34 AM, Eric Snow wrote: >> >>> Finally, when the import system calls a finder's ``find_module()``, the >>> >> finder makes use of a variety of information about the module that is >>> useful outside the context of the method. Currently the options are >>> limited for persisting that per-module information past the method call, >>> since it only returns the loader. Either store it in a module-to-info >>> mapping somewhere like on the finder itself, or store it on the loader. >>> >> >> The two previous sentences are hard to read; I think you were after >> something like, >> "Popular options for this limitation are to store the information is in a >> module-to-info >> mapping somewhere on the finder itself, or store it on the loader. >> > > Sounds good. > > >> >> >>> (The idea grew feet during discussions related to another PEP.[1]) >>> >> >> "(This PEP grew out of discussions related to another PEP [1])" >> > > Yeah, this was one of the last things I added to the PEP and my brain was > starting to get a little fuzzy. :) > > >> * ``is_package`` - whether or not the module is a package. >>> >> >> I think is_package() is redundant in the face of 'name'/'package' or >> 'path' as you can introspect the same information. I honestly have always >> found it a weakness of InspectLoader.is_package() that it didn't return the >> value for __path__. >> > > I see what you mean, but I also think it's nice to be able to explicitly > see if a spec is for a package without having to know about underlying > rules. However, I'll just make it a property instead of something set on > the spec (and remove it from __init__). > > >> >> >>> * ``origin`` - the location from which the module originates. >>> >> >> Don't quite follow what this is meant to represent? Like the path to the >> zipfile if loaded that way, otherwise it's the file path? 
>> > > Yeah, Antoine had the same question. I'll make sure the PEP is clearer. > Basically filename maps to the module's __file__ and origin is used for > the module's repr if filename isn't set. > > >> >> >>> * ``filename`` - like origin, but limited to a path-based location >>> (compare to ``__file__``). >>> * ``cached`` - the location where the compiled module should be stored >>> (compare to ``__cached__``). >>> * ``path`` - the list of path entries in which to search for submodules >>> or ``None``. (compare to ``__path__``). It should be in sync with >>> ``is_package``. >>> >> >> Why is 'path' the only attribute with a default value? Should probably >> say everything has a default value of None if not set/known. >> > > Good point. > > >> >> >>> >>> Those are also the parameters to ``ModuleSpec.__init__()``, in that >>> order. >>> >> >> I would consider arguing all arguments should be keyword-only past 'name' >> since there is no way most people will remember that order correctly. >> > > Makes sense, though I'll make everything but name and loader keyword-only. > > >> * ``from_loader(cls, ...)`` - returns a new ``ModuleSpec`` derived from >>> the >>> arguments. The parameters are the same as with ``__init__``, except >>> ``package`` is excluded and only ``name`` and ``loader`` are required. >>> >> >> Why the switch in requirements compared to __init__()? >> > > Because package is always calculated and only name and loader are > necessary to calculate the remaining attributes. Perhaps from_loader() is > the wrong name (I'm open to alternatives). Perhaps __init__() should take > over some of the calculating. My intention is to provide one API for > what-you-pass-in-is-what-you-get (__init__) and another for calculating > attributes. Of course, one could simply modify the spec after creating it, > but I like idea of explicitly opting in to calculated values. I'll add > this point to the PEP. 
I'll probably also drop package as a parameter > of __init__ and make the attribute a property. > > I've also toyed with the idea of making all the attributes properties (aka > read-only) since changing a module's spec later on could lead to headaches, > but I'm not convinced that is an easy problem to cause. It's better to not > get in the way of those who have needs I haven't anticipated (consenting > adults, etc.). What do you think? > I agree with your thinking that you shouldn't necessarily block usage just because it might be a bad idea; consenting adults and all is right. > > >> >> >>> * ``module_repr()`` - returns a repr for the module. >>> * ``init_module_attrs(module)`` - sets the module's import-related >>> attributes. >>> >> >> Specify what those attributes are and how they are set. >> > > Will do. > > >> >> >>> * ``load(module=None, *, is_reload=False)`` - calls the loader's >>> ``exec_module()``, falling back to ``load_module()`` if necessary. >>> This method performs the former responsibilities of loaders for >>> managing modules before actually loading and for cleaning up. The >>> reload case is facilitated by the ``module`` and ``is_reload`` >>> parameters. >>> >> >> If a module is provided and there is already a matching key in >> sys.modules, what happens? >> > What if is_reload is True but there is no module provided or in >> sys.modules; KeyError, ValueError, ImportError? Do you follow having None >> in sys.modules and raise ImportError, or do you overwrite (same question if >> a module is explicitly provided)? >> > > That's a good point. I thought I had addressed this in the PEP, but > apparently not. For Loader.load_module(), as you know, the existence of > the key in sys.modules indicates a reload should happen. The is_reload > parameter is meant to provide an explicit indicator. The module you pass > in is simply the one to use. If a module is not passed in and is_reload is > true, the module in sys.modules will be used.
If that module is None or > not there, ImportError would be raised. If a module is passed in and > is_reload is false, I was planning on just ignoring that module. However > raising ValueError in that case would be more useful, indicating that the > method was called incorrectly. > > Having just the module parameter and letting it indicate a reload is > doable, but that would mean losing the option of having load() look up the > module (and it's less explicit). Another option is to have a separate > reload() method. Antoine mentioned it and I'd considered it early on. I'm > considering it again since it makes the API less complicated. Do you have > a preference between the current proposal (load() does it all) and a > separate reload() method? > Nope, no preference. > > ``is_package`` is derived from ``path``, if passed. Otherwise the >>> loader's ``is_package()`` is tried. Finally, it defaults to False. >>> >> >> It can also be calculated based on whether ``name`` == ``package``: >> ``True if path is not None else name == package``. >> > > Good point, though at this point I don't think package will be something > you set. > So you would set 'name' and 'path' to decide if something is a package and use that to calculate 'package'? > > Always need to watch out for [] for path as that is valid and signals the >> module is a package. >> > > Yeah, I've got that covered in from_loader(). > > This is where defining exactly what details need to be passed in and which >> ones are optional are going to be critical in determining what represents >> ambiguity/unknown details vs. what is flat-out known to be true/false. >> > > Agreed. I'll be sure to spell it out. > > >> ``cached`` is derived from ``filename`` if it's available. >>> >> >> Derived how? >> > > cache_from_source() > I figured, but I know too much about this stuff. =) I would spell it out in the PEP. 
> > >> methods would now return a module spec >>> instead of loader, specs must act like the loader that would have been >>> returned instead. This is relatively simple to solve since the loader >>> is available as an attribute of the spec. >>> >> >> Are you going to define a __getattr__ to delegate to the loader? Or are >> you going to specifically define equivalent methods, e.g. get_filename() is >> obviously solvable by getting the attribute from the spec (as long as >> filename is a required value)? >> > > __getattr__(). I don't want to guess what methods a loader might have. > And if someone wants to call get_filename() on what they think is the > loader, I think it's better to just call the loader's get_filename(). I'd > left this stuff out as an implementation detail. Do you think it should be > in the PEP? I could simply elaborate on "specs must act like the loader". > I would elaborate that it's going to be __getattr__() since it influences the level of backwards-compatibility. > > >> >> >>> >>> However, ``ModuleSpec.is_package`` (an attribute) conflicts with >>> ``InspectLoader.is_package()`` (a method). Working around this requires >>> a more complicated solution but is not a large obstacle. >>> >>> Unfortunately, the ability to proxy does not extend to ``id()`` >>> comparisons and ``isinstance()`` tests. In the case of the return value >>> of ``find_module()``, we accept that break in backward compatibility. >>> >> >> Mention that ModuleSpec can be added to the proper ABCs in importlib.abc >> to help alleviate this issue. >> > > Good point. > > >> >> >>> >>> Subclassing >>> ----------- >>> >>> .. XXX Allowed but discouraged? >>> >> >> Why should it matter if they are subclassed? >> > > My goal was for ModuleSpec to be the container for module definition state > with some common attributes as a baseline and a minimal number of methods > for the import system to use. 
Loaders would be where you would do extra > stuff or customize functionality, which is basically what happens now. > > It seemed correct before but now it's feeling like a very artificial and > unnecessary objective. > I totally get where you are coming from and if we were working in a language that pushed for read-only attributes I would agree, but we aren't so I wouldn't. =) It just becomes more hassle than it's worth to enforce. > > Finders >>> ------- >>> >>> Finders will now return ModuleSpec objects when ``find_module()`` is >>> called rather than loaders. For backward compatibility, ``ModuleSpec`` >>> objects proxy the attributes of their ``loader`` attribute. >>> >>> Adding another similar method to avoid backward-compatibility issues >>> is undesirable if avoidable. The import APIs have suffered enough. >>> >> >> in light of the fact that find_loader() was just introduced in Python 3.3. >> > > Are you suggesting additional wording or making a comment? > Both? =) > > >> >>> Loaders >>> ------- >>> >>> Loaders will have a new method, ``exec_module(module)``. Its only job >>> is to "exec" the module and consequently populate the module's >>> namespace. It is not responsible for creating or preparing the module >>> object, nor for any cleanup afterward. It has no return value. >>> >>> The ``load_module()`` of loaders will still work and be an active part >>> of the loader API. It is still useful for cases where the default >>> module creation/preparation/cleanup is not appropriate for the loader. >>> >> >> But will it still be required? Obviously importlib.abc.Loader can grow a >> default load_module() defined around exec_module(), but it should be clear >> if we expect the method to always be manually defined or if it will >> eventually go away. >> > > load_module() will no longer be required. However, it still serves a real > purpose: the loader may still need to control more of the loading process.
> By implementing load_module() but not exec_module(), a loader gets that. > I'll make sure that's clear. > > >> >> >>> >>> A loader must have ``exec_module()`` or ``load_module()`` defined. If >>> both exist on the loader, ``exec_module()`` is used and >>> ``load_module()`` is ignored. >>> >> >> Ignored by whom? Should specify that the import system is the one doing >> the ignoring. >> > > Got it. > > >> * Deprecations in ``importlib.util``: ``set_package()``, >>> >> ``set_loader()``, and ``module_for_loader()``. ``module_to_load()`` >>> (introduced in 3.4) can be removed. >>> >> >> "(introduced prior to Python 3.4's release)"; remember, PEPs are timeless >> and will outlive 3.4 so specifying it never went public is important. >> > > Good catch. You should be a PEP editor. > Ha! Being a PEP editor means I know how to use hg, run a make command, and can count.

From ericsnowcurrently at gmail.com Sat Aug 10 00:58:09 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 9 Aug 2013 16:58:09 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" Message-ID:

Here's an updated version of the PEP for ModuleSpec which addresses the
feedback I've gotten. Thanks for the help. The big open question, to
me, is whether or not to have a separate reload() method. I'll be
looking into that when I get a chance. There's also the question of a
path-based subclass, but I'm currently not convinced it's worth it.

-eric

-----------------------------------

PEP: 4XX
Title: A ModuleSpec Type for the Import System
Version: $Revision$
Last-Modified: $Date$
Author: Eric Snow
BDFL-Delegate: ???
Discussions-To: import-sig at python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 8-Aug-2013
Python-Version: 3.4
Post-History: 8-Aug-2013
Resolution:


Abstract
========

This PEP proposes to add a new class to ``importlib.machinery`` called
``ModuleSpec``.
It will contain all the import-related information about a module
without needing to load the module first. Finders will now return a
module's spec rather than a loader. The import system will use the spec
to load the module.


Motivation
==========

The import system has evolved over the lifetime of Python. In late 2002
PEP 302 introduced standardized import hooks via ``finders`` and
``loaders`` and ``sys.meta_path``. The ``importlib`` module, introduced
with Python 3.1, now exposes a pure Python implementation of the APIs
described by PEP 302, as well as of the full import system. It is now
much easier to understand and extend the import system. While a benefit
to the Python community, this greater accessibility also presents a
challenge.

As more developers come to understand and customize the import system,
any weaknesses in the finder and loader APIs will be more impactful. So
the sooner we can address any such weaknesses in the import system, the
better, and there are a couple we can take care of with this proposal.

Firstly, any time the import system needs to save information about a
module we end up with more attributes on module objects that are
generally only meaningful to the import system and occasionally to some
people. It would be nice to have a per-module namespace in which to put
future import-related information.

Secondly, there's an API void between finders and loaders that causes
undue complexity when encountered. Finders are strictly responsible for
providing the loader which the import system will use to load the
module. The loader is then responsible for doing some checks, creating
the module object, setting import-related attributes, "installing" the
module to ``sys.modules``, and loading the module, along with some
cleanup. This all takes place during the import system's call to
``Loader.load_module()``. Loaders also provide some APIs for accessing
data associated with a module.
Loaders are not required to provide any of the functionality of
``load_module()`` through other methods. Thus, though the
import-related information about a module is likely available without
loading the module, it is not otherwise exposed.

Furthermore, the requirements associated with ``load_module()`` are
common to all loaders and are mostly implemented in exactly the same
way. This means every loader has to duplicate the same boilerplate
code. ``importlib.util`` provides some tools that help with this, but
it would be more helpful if the import system simply took charge of
these responsibilities. The trouble is that this would limit the degree
of customization that ``load_module()`` facilitates. This is a gap
between finders and loaders which this proposal aims to fill.

Finally, when the import system calls a finder's ``find_module()``, the
finder makes use of a variety of information about the module that is
useful outside the context of the method. Currently the options are
limited for persisting that per-module information past the method call,
since it only returns the loader. Popular options for this limitation
are to store the information in a module-to-info mapping somewhere on
the finder itself, or store it on the loader.

Unfortunately, loaders are not required to be module-specific. On top
of that, some of the useful information finders could provide is common
to all finders, so ideally the import system could take care of that.
This is the same gap as before between finders and loaders.

As an example of complexity attributable to this flaw, the
implementation of namespace packages in Python 3.3 (see PEP 420) added
``FileFinder.find_loader()`` because there was no good way for
``find_module()`` to provide the namespace path.

The answer to this gap is a ``ModuleSpec`` object that contains the
per-module information and takes care of the boilerplate functionality
of loading the module.

(The idea gained momentum during discussions related to another PEP.[1])


Specification
=============

The goal is to address the gap between finders and loaders while
changing as little of their semantics as possible. Though some
functionality and information are moved to the new ``ModuleSpec`` type,
their semantics should remain the same. However, for the sake of
clarity, those semantics will be explicitly identified.

A High-Level View
-----------------

...

ModuleSpec
----------

A new class which defines the import-related values to use when loading
the module. It closely corresponds to the import-related attributes of
module objects. ``ModuleSpec`` objects may also be used by finders and
loaders and other import-related APIs to hold extra import-related
state about the module. This greatly reduces the need to add any new
import-related attributes to module objects, and loader ``__init__``
methods won't need to accommodate such per-module state.

Creating a ModuleSpec:

``ModuleSpec(name, loader, *, origin=None, filename=None, cached=None,
path=None)``

The parameters have the same meaning as the attributes described below.
However, not all ``ModuleSpec`` attributes are also parameters. The
passed values are set as-is. For calculated values use the
``from_loader()`` method.

ModuleSpec Attributes
---------------------

Each of the following names is an attribute on ``ModuleSpec`` objects.
A value of ``None`` indicates "not set". This contrasts with module
objects where the attribute simply doesn't exist.

While ``package`` and ``is_package`` are read-only properties, the
remaining attributes can be replaced after the module spec is created
and after import is complete. This allows for unusual cases where
modifying the spec is the best option. However, typical use should not
involve changing the state of a module's spec.

Most of the attributes correspond to the import-related attributes of
modules. Here is the mapping, followed by a description of the
attributes.
The reverse of this mapping is used by ``init_module_attrs()``.

============= ===========
On ModuleSpec On Modules
============= ===========
name          __name__
loader        __loader__
package       __package__
is_package    -
origin        -
filename      __file__
cached        __cached__
path          __path__
============= ===========

``name``

The module's fully resolved and absolute name. It must be set.

``loader``

The loader to use during loading and for module data. These specific
functionalities do not change for loaders. Finders are still
responsible for creating the loader and this attribute is where it is
stored. The loader must be set.

``package``

The name of the module's parent. This is a dynamic attribute with a
value derived from ``name`` and ``is_package``. For packages it is the
value of ``name``. Otherwise it is equivalent to
``name.rpartition('.')[0]``. Consequently, a top-level module will have
the empty string for ``package``.

``is_package``

Whether or not the module is a package. This dynamic attribute is true
if ``path`` is set (even if empty); otherwise it is false.

``origin``

A string for the location from which the module originates. If
``filename`` is set, ``origin`` should be set to the same value unless
some other value is more appropriate. ``origin`` is used in
``module_repr()`` if it does not match the value of ``filename``.

Using ``filename`` for this meaning would be inaccurate, since not all
modules have path-based locations. For instance, built-in modules do
not have ``__file__`` set. Yet it is useful to have a descriptive
string indicating that it originated from the interpreter as a built-in
module. So built-in modules will have ``origin`` set to ``"built-in"``.

Path-based attributes:

If any of these is set, it indicates that the module is path-based. For
reference, a path entry is a string for a location where the import
system will look for modules (e.g. the path entries in ``sys.path`` or a
package's ``__path__``).

``filename``

Like ``origin``, but limited to a path-based location. If ``filename``
is set, ``origin`` should be set to the same string, unless ``origin``
is explicitly set to something else. ``filename`` is not necessarily an
actual file name, but could be any location string based on a path
entry. Regarding the attribute name, while it is potentially
inaccurate, it is both consistent with the equivalent module attribute
and generally accurate.

.. XXX Would a different name be better? ``path_location``?

``cached``

The path-based location where the compiled code for a module should be
stored. If ``filename`` is set to a source file, this should be set to
the corresponding path that PEP 3147 specifies. The
``importlib.util.cache_from_source()`` function facilitates getting the
correct value.

``path``

The list of path entries in which to search for submodules if this
module is a package. Otherwise it is ``None``.

.. XXX add a path-based subclass?

ModuleSpec Methods
------------------

``from_loader(name, loader, *, is_package=None, origin=None, filename=None,
cached=None, path=None)``

.. XXX use a different name?

A factory classmethod that returns a new ``ModuleSpec`` derived from the
arguments. ``is_package`` is used inside the method to indicate that
the module is a package. If not explicitly passed in, it is set to
``True`` if ``path`` is passed in. It falls back to using the result of
the loader's ``is_package()``, if available. Finally, it defaults to
``False``. The remaining parameters have the same meaning as the
corresponding ``ModuleSpec`` attributes.

In contrast to ``ModuleSpec.__init__()``, which takes the arguments
as-is, ``from_loader()`` calculates missing values from the ones passed
in, as much as possible. This replaces the behavior that is currently
provided by several ``importlib.util`` functions as well as the optional
``init_module_attrs()`` method of loaders.
Just to be clear, here is a more detailed description of those
calculations::

   If not passed in, ``filename`` is set to the result of calling the
   loader's ``get_filename()``, if available. Otherwise it stays
   unset (``None``).

   If not passed in, ``path`` is set to an empty list if
   ``is_package`` is true. Then the directory from ``filename`` is
   appended to it, if possible. If ``is_package`` is false, ``path``
   stays unset.

   If ``cached`` is not passed in and ``filename`` is passed in,
   ``cached`` is derived from it. For filenames with a source suffix,
   it is set to the result of calling
   ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g.
   ``.pyc``), ``cached`` is set to the value of ``filename``. If
   ``filename`` is not passed in or ``cache_from_source()`` raises
   ``NotImplementedError``, ``cached`` stays unset.

   If not passed in, ``origin`` is set to ``filename``. Thus if
   ``filename`` is unset, ``origin`` stays unset.

``module_repr()``

Returns a repr string for the module if ``origin`` is set and
``filename`` is not set. The string refers to the value of ``origin``.
Otherwise ``module_repr()`` returns ``None``. This indicates to the
module type's ``__repr__()`` that it should fall back to the default
repr.

We could also have ``module_repr()`` produce the repr for the case where
``filename`` is set or where ``origin`` is not set, mirroring the repr
that the module type produces directly. However, the repr string is
derived from the import-related module attributes, which might be out of
sync with the spec.

.. XXX Is using the spec close enough? Probably not.

The implementation of the module type's ``__repr__()`` will change to
accommodate this PEP. However, the current functionality will remain to
handle the case where a module does not have a ``__spec__`` attribute.

``init_module_attrs(module)``

Sets the module's import-related attributes to the corresponding values
in the module spec. If a path-based attribute is not set on the spec,
it is not set on the module.
For the rest, a ``None`` value on the spec (aka "not set") means
``None`` will be set on the module. If any of the attributes are
already set on the module, the existing values are replaced. The
module's own ``__spec__`` is not consulted but does get replaced with
the spec on which ``init_module_attrs()`` was called. The earlier
mapping of ``ModuleSpec`` attributes to module attributes indicates
which attributes are involved on both sides.

``load(module=None, *, is_reload=False)``

This method captures the current functionality of and requirements on
``Loader.load_module()`` without any semantic changes, except one.
Reloading a module when ``exec_module()`` is available actually uses
``module`` rather than ignoring it in favor of the one in
``sys.modules``, as ``Loader.load_module()`` does.

``module`` is only allowed when ``is_reload`` is true. This means that
``is_reload`` could be dropped as a parameter. However, doing so would
mean we could not use ``None`` to indicate that the module should be
pulled from ``sys.modules``. Furthermore, ``is_reload`` makes the
intent of the call clear.

There are two parts to what happens in ``load()``. First, the module is
prepared, loaded, updated appropriately, and left available for the
second part. This is described in more detail shortly.

Second, in the case of error during a normal load (not reload) the
module is removed from ``sys.modules``. If no error happened, the
module is pulled from ``sys.modules``. This is the module returned by
``load()``. Before it is returned, if it is a different object than the
one produced by the first part, attributes of the module from
``sys.modules`` are updated to reflect the spec.

Returning the module from ``sys.modules`` accommodates the ability of
the module to replace itself there while it is executing (during load).

As already noted, this is what already happens in the import system.
``load()`` is not meant to change any of this behavior.
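
The two parts of a normal (non-reload) load can be sketched roughly as
follows. This is only an illustration, not the reference
implementation: ``load_sketch`` and its ``exec_module`` callable
argument are hypothetical names, and the calls to
``init_module_attrs()`` and the per-module import lock are left out.

```python
import sys
import types


def load_sketch(name, exec_module):
    """Hypothetical helper mirroring the two parts of ``load()``."""
    # Part one: create the module, publish it, then execute it.
    module = types.ModuleType(name)
    sys.modules[name] = module
    try:
        exec_module(module)
    except BaseException:
        # Part two, error case: a failed normal load must not leave a
        # half-initialised module behind in sys.modules.
        sys.modules.pop(name, None)
        raise
    # Part two, success case: return whatever is in sys.modules now,
    # since the module may have replaced itself while executing.
    return sys.modules[name]


def populate(module):
    # Stands in for a loader's exec_module(); it only fills the namespace.
    module.answer = 42


mod = load_sketch("sketch_demo", populate)
```

Looking the module up in ``sys.modules`` again at the end, instead of
returning the local ``module``, is what lets a module swap itself out
during execution.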
Regarding the first part of ``load()``, the following describes what
happens. It depends on whether ``is_reload`` is true and whether the
loader has ``exec_module()``.

For normal load with ``exec_module()`` available::

   A new module is created, ``init_module_attrs()`` is called to set
   its attributes, and it is set on sys.modules. At that point
   the loader's ``exec_module()`` is called, after which the module
   is ready for the second part of loading.

.. XXX What if the module already exists in sys.modules?

For normal load without ``exec_module()`` available::

   The loader's ``load_module()`` is called and the attributes of the
   module it returns are updated to match the spec.

For reload with ``exec_module()`` available::

   If ``module`` is ``None``, it is pulled from ``sys.modules``. If
   still ``None``, ImportError is raised. Otherwise ``exec_module()``
   is called, passing in the module-to-be-reloaded.

For reload without ``exec_module()`` available::

   The loader's ``load_module()`` is called and the attributes of the
   module it returns are updated to match the spec.

There is some boilerplate involved when ``exec_module()`` is available,
but only the boilerplate that the import system uses currently.

If ``loader`` is not set (``None``), ``load()`` raises a ValueError. If
``module`` is passed in but ``is_reload`` is false, a ValueError is also
raised to indicate that ``load()`` was called incorrectly. There may be
use cases for calling ``load()`` in that way, but they are outside the
scope of this PEP.

.. XXX add reload(module=None) and drop load()'s parameters entirely?
.. XXX add more of importlib.reload()'s boilerplate to load()/reload()?

Backward Compatibility
----------------------

Since ``Finder.find_module()`` methods would now return a module spec
instead of loader, specs must act like the loader that would have been
returned instead. This is relatively simple to solve since the loader
is available as an attribute of the spec. We will use
``__getattr__()`` to do it.
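
As a rough illustration of that proxying, the sketch below forwards
unknown attribute lookups from a spec-like object to its loader. The
class names ``ProxyingSpec`` and ``DemoLoader`` are hypothetical and
exist only for this example; the real ``ModuleSpec`` would carry more
state and may be implemented differently.

```python
class ProxyingSpec:
    """Hypothetical stand-in for ModuleSpec; only the proxying is shown."""

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader

    def __getattr__(self, attr):
        # __getattr__ runs only when normal lookup fails, so the spec's
        # own attributes always win over the loader's.
        if attr == "loader":
            # Guard against infinite recursion if ``loader`` is not set
            # yet (for instance on a copy created without __init__).
            raise AttributeError(attr)
        return getattr(self.loader, attr)


class DemoLoader:
    """Hypothetical legacy loader with PEP 302 style methods."""

    def load_module(self, fullname):
        return "loaded " + fullname

    def get_source(self, fullname):
        return "x = 1\n"


spec = ProxyingSpec("demo", DemoLoader())
result = spec.load_module("demo")   # forwarded to the loader
source = spec.get_source("demo")    # forwarded to the loader
```

Code that only calls loader methods on the return value of
``find_module()`` keeps working with such an object, although
``isinstance(spec, DemoLoader)`` is still false.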
However, ``ModuleSpec.is_package`` (an attribute) conflicts with
``InspectLoader.is_package()`` (a method). Working around this requires
a more complicated solution but is not a large obstacle. Simply making
``ModuleSpec.is_package`` a method does not reflect that it is a
relatively static piece of data. ``module_repr()`` also conflicts with
the same method on loaders, but that workaround is not complicated since
both are methods.

Unfortunately, the ability to proxy does not extend to ``id()``
comparisons and ``isinstance()`` tests. In the case of the return value
of ``find_module()``, we accept that break in backward compatibility.
However, we will mitigate the problem with ``isinstance()`` somewhat by
registering ``ModuleSpec`` on the loaders in ``importlib.abc``.

Subclassing
-----------

Subclasses of ModuleSpec are allowed, but should not be necessary.
Adding functionality to a custom finder or loader will likely be a
better fit and should be tried first. However, as long as a subclass
still fulfills the requirements of the import system, objects of that
type are completely fine as the return value of ``find_module()``.

Module Objects
--------------

Module objects will now have a ``__spec__`` attribute to which the
module's spec will be bound. None of the other import-related module
attributes will be changed or deprecated, though some of them could be;
any such deprecation can wait until Python 4.

``ModuleSpec`` objects will not be kept in sync with the corresponding
module object's import-related attributes. Though they may differ, in
practice they will typically be the same.

Finders
-------

Finders will now return ModuleSpec objects when ``find_module()`` is
called rather than loaders. For backward compatibility, ``ModuleSpec``
objects proxy the attributes of their ``loader`` attribute.

Adding another similar method to avoid backward-compatibility issues
is undesirable if avoidable.
The import APIs have suffered enough,
especially considering ``PathEntryFinder.find_loader()`` was just
added in Python 3.3. The approach taken by this PEP should be
sufficient to address backward-compatibility issues for
``find_module()``.

The change to ``find_module()`` applies to both ``MetaPathFinder`` and
``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be
deprecated and, for backward compatibility, implicitly special-cased if
the method exists on a finder.

Finders are still responsible for creating the loader. That loader will
now be stored in the module spec returned by ``find_module()`` rather
than returned directly. As is currently the case without the PEP, if a
loader would be costly to create, that loader can be designed to defer
the cost until later.

Loaders
-------

Loaders will have a new method, ``exec_module(module)``. Its only job
is to "exec" the module and consequently populate the module's
namespace. It is not responsible for creating or preparing the module
object, nor for any cleanup afterward. It has no return value.

The ``load_module()`` of loaders will still work and be an active part
of the loader API. It is still useful for cases where the default
module creation/preparation/cleanup is not appropriate for the loader.

For example, the C API for extension modules only supports the full
control of ``load_module()``. As such, ``ExtensionFileLoader`` will not
implement ``exec_module()``. In the future it may be appropriate to
produce a second C API that would support an ``exec_module()``
implementation for ``ExtensionFileLoader``. Such a change is outside
the scope of this PEP.

A loader must have at least one of ``exec_module()`` and
``load_module()`` defined. If both exist on the loader,
``ModuleSpec.load()`` uses ``exec_module()`` and ignores
``load_module()``.

PEP 420 introduced the optional ``module_repr()`` loader method to limit
the amount of special-casing in the module type's ``__repr__()``.
Since
this method is part of ``ModuleSpec``, it will be deprecated on loaders.
However, if it exists on a loader it will be used exclusively.

The ``Loader.init_module_attrs()`` method, added prior to Python 3.4's
release, will be removed in favor of the same method on ``ModuleSpec``.

However, ``InspectLoader.is_package()`` will not be deprecated even
though the same information is found on ``ModuleSpec``. ``ModuleSpec``
can use it to populate its own ``is_package`` if that information is
not otherwise available. Still, it will be made optional.

The path-based loaders in ``importlib`` take arguments in their
``__init__()`` and have corresponding attributes. However, the need for
those values is eliminated. The only exception is
``FileLoader.get_filename()``, which uses ``self.path``. The signatures
for these loaders and the accompanying attributes will be deprecated.

In addition to executing a module during loading, loaders will still be
directly responsible for providing APIs concerning module-related data.

Other Changes
-------------

* The various finders and loaders provided by ``importlib`` will be
  updated to comply with this proposal.

* The spec for the ``__main__`` module will reflect how the interpreter
  was started. For instance, with ``-m`` the spec's name will be that of
  the run module, while ``__main__.__name__`` will still be "__main__".

* We add ``importlib.find_module()`` to mirror
  ``importlib.find_loader()`` (which becomes deprecated).

* Deprecations in ``importlib.util``: ``set_package()``,
  ``set_loader()``, and ``module_for_loader()``. ``module_to_load()``
  (introduced prior to Python 3.4's release) can be removed.

* ``importlib.reload()`` is changed to use ``ModuleSpec.load()``.

* ``ModuleSpec.load()`` and ``importlib.reload()`` will now make use of
  the per-module import lock, whereas ``Loader.load_module()`` did not.

Reference Implementation
------------------------

A reference implementation is available at .
References
==========

[1] http://mail.python.org/pipermail/import-sig/2013-August/000658.html


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From ncoghlan at gmail.com Sun Aug 11 15:03:00 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 11 Aug 2013 09:03:00 -0400
Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System"
In-Reply-To:
References:
Message-ID:

I think this is solid enough to be worth adding to the PEPs repo now.

On 9 August 2013 18:58, Eric Snow wrote:
> Here's an updated version of the PEP for ModuleSpec which addresses the
> feedback I've gotten. Thanks for the help. The big open question, to me,
> is whether or not to have a separate reload() method. I'll be looking into
> that when I get a chance. There's also the question of a path-based
> subclass, but I'm currently not convinced it's worth it.

One piece of feedback from me (triggered by the C extension modules
discussion on python-dev): we should consider proposing a new "exec"
hook for C extension modules that could be defined instead of or in
addition to the existing PEP 3121 init hook.

Extension modules that don't rely on mutable static variables or the
PEP 3121 per-interpreter state APIs could just define the new exec
hook and get a new module instance every time they're imported. Those
that do have per-interpreter state would still get an opportunity to
run additional code after all the magic attributes have been set.

Also, to handle the extension module case, we may need to let loaders
define an optional "create_module" method that accepts the ModuleSpec
object as an argument. The extension module loader would implement
this as handling the PyInit_ call.
(Setting the magic attributes according to the spec would happen automatically after the call, so each loader wouldn't need to implement that part) (Note: once I get back to Australia around the 22nd, I should have time to help out more directly with this) > ----------------------------------- > Firstly, any time the import system needs to save information about a > module we end up with more attributes on module objects that are > generally only meaningful to the import system and occoasionally to some Typo: occoasionally > people. It would be nice to have a per-module namespace to put future > import-related information. Secondly, there's an API void between > finders and loaders that causes undue complexity when encountered. > > Finders are strictly responsible for providing the loader which the "are currently responsible" (since the PEP is about changing the responsibiity of finders, this is a little unclear at present) > Specification > ============= > > The goal is to address the gap between finders and loaders while > changing as little of their semantics as possible. Though some > functionality and information is moved the new ``ModuleSpec`` type, "moved to the new" > their semantics should remain the same. However, for the sake of > clarity, those semantics will be explicitly identified. > > A High-Level View > ----------------- > > ... Not sure a high level view is needed, but you can fill this in if you want :) > > ModuleSpec > ---------- > > A new class which defines the import-related values to use when loading > the module. It closely corresponds to the import-related attributes of > module objects. ``ModuleSpec`` objects may also be used by finders and > loaders and other import-related APIs to hold extra import-related > state about the module. This greatly reduces the need to add any new > new import-related attributes to module objects, and loader ``__init__`` > methods won't need to accommodate such per-module state. 
To avoid conflicts as the spec attributes evolve in the future, would it be worth having a "custom" field which is just an arbitrary object reference used to pass info from the finder to the loader without troubling the rest of the import system? > Creating a ModuleSpec: > > ``ModuleSpec(name, loader, *, origin=None, filename=None, cached=None, > path=None)`` > > The parameters have the same meaning as the attributes described below. > However, not all ``ModuleSpec`` attributes are also parameters. > The > passed values are set as-is. For calculated values use the > ``from_loader()`` method. This paragraph isn't particularly clear. Perhaps: "Passed in parameter values are assigned directly to the corresponding attributes below. Other attributes not listed as parameters (such as ``package``) are read-only properties that are automatically derived from these values. The ``ModuleSpec.from_loader()`` class method allows a suitable ModuleSpec instance to be easily created from a PEP 302 loader object" > ModuleSpec Attributes > --------------------- > > Each of the following names is an attribute on ``ModuleSpec`` objects. > A value of ``None`` indicates "not set". This contrasts with module > objects where the attribute simply doesn't exist. > > While ``package`` and ``is_package`` are read-only properties, the > remaining attributes can be replaced after the module spec is created > and after import is complete. This allows for unusual cases where > modifying the spec is the best option. However, typical use should not > involve changing the state of a module's spec. I'm with Brett that "is_package" should go, to be replaced by "spec.path is not None" wherever it matters. is_package() would then fall through to the PEP 302 loader API via __getattr__. > ``package`` > > The name of the module's parent. This is a dynamic attribute with a > value derived from ``name`` and ``is_package``. For packages it is the > value of ``name``. 
Otherwise it is equivalent to > ``name.rpartition('.')[0]``. Consequently, a top-level module will have > give the empty string for ``package``. s/give// > ``is_package`` > > Whether or not the module is a package. This dynamic attribute is True > if ``path`` is set (even if empty), else it is false. As above (i.e. don't use it) > ``origin`` > > A string for the location from which the module originates. If > ``filename`` is set, ``origin`` should be set to the same value unless > some other value is more appropriate. ``origin`` is used in > ``module_repr()`` if it does not match the value of ``filename``. > > Using ``filename`` for this meaning would be inaccurate, since not all > modules have path-based locations. For instance, built-in modules do > not have ``__file__`` set. Yet it is useful to have a descriptive > string indicating that it originated from the interpreter as a built-in > module. So built-in modules will have ``origin`` set to ``"built-in"``. How about we *just* have origin, with a separate "set_fileattr" attribute to indicate "this is a discrete file, you should set __file__"? Also, we should explicitly note that we'll still set __file__ for zip imports, due to backwards compatibility concerns, even though it doesn't correspond to a valid filesystem path. (Random thought: spec.origin + spec.cached + a cache directory setting in zipimport would give a potentially clean way to do extension module imports from zip archives) > ``path`` > > The list of path entries in which to search for submodules if this > module is a package. Otherwise it is ``None``. Path entries don't have to correspond to filesystem locations - they just have to make sense to at least one path hook (e.g. a DB URI would be a valid path entry). > .. XXX add a path-based subclass? Nope :) > ModuleSpec Methods > ------------------ > > ``from_loader(name, loader, *, is_package=None, origin=None, filename=None, > cached=None, path=None)`` > > .. XXX use a different name? 
I'd disallow customisation on this one - if people want to customise, they should just query the PEP 302 APIs themselves and call the ModuleSpec constructor directly. The use case for this one should be to make it trivial to switch from "return loader" to "return ModuleSpec.from_loader(loader)" in a find_module implementation. > In contrast to ``ModuleSpec.__init__()``, which takes the arguments > as-is, ``from_loader()`` calculates missing values from the ones passed > in, as much as possible. This replaces the behavior that is currently > provided the several ``importlib.util`` functions as well as the > optional ``init_module_attrs()`` method of loaders. Just to be clear, > here is a more detailed description of those calculations:: > > If not passed in, ``filename`` is to the result of calling the > loader's ``get_filename()``, if available. Otherwise it stays > unset (``None``). > > If not passed in, ``path`` is set to an empty list if > ``is_package`` is true. Then the directory from ``filename`` is > appended to it, if possible. If ``is_package`` is false, ``path`` > stays unset. > > If ``cached`` is not passed in and ``filename`` is passed in, > ``cached`` is derived from it. For filenames with a source suffix, > it set to the result of calling > ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. > ``.pyc``), ``cached`` is set to the value of ``filename``. If > ``filename`` is not passed in or ``cache_from_source()`` raises > ``NotImplementedError``, ``cached`` stays unset. > > If not passed in, ``origin`` is set to ``filename``. Thus if > ``filename`` is unset, ``origin`` stays unset. Hmm, is there a reason this can't be the default constructor behaviour? What's the value of *not* having the sensible fallbacks, given they can always be overridden by passing in explicit values when you want something different? A separate "from_module(m)" constructor would probably make sense, though. 
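
For reference while discussing the defaults, the fallback calculations
quoted above can be sketched roughly like this. The helper
``derive_spec_values`` is a hypothetical name used only for
illustration, not part of the proposal. It assumes ``.py`` and ``.pyc``
as the source and bytecode suffixes, and it derives ``cached`` from the
filename whether that filename was passed in or obtained from the
loader.

```python
import importlib.util
import os.path


def derive_spec_values(name, loader, *, is_package=False, origin=None,
                       filename=None, cached=None, path=None):
    """Hypothetical helper; returns the derived values as a dict."""
    # filename: ask the loader if the caller did not supply one.
    if filename is None and hasattr(loader, "get_filename"):
        filename = loader.get_filename(name)
    # path: only packages get one; seed it with the file's directory.
    if path is None and is_package:
        path = []
        if filename is not None:
            path.append(os.path.dirname(filename))
    # cached: PEP 3147 location for source files, the file itself for
    # bytecode files, otherwise left unset.
    if cached is None and filename is not None:
        if filename.endswith(".py"):
            try:
                cached = importlib.util.cache_from_source(filename)
            except NotImplementedError:
                cached = None
        elif filename.endswith(".pyc"):
            cached = filename
    # origin: defaults to filename, so it stays unset if filename is.
    if origin is None:
        origin = filename
    return {"name": name, "loader": loader, "origin": origin,
            "filename": filename, "cached": cached, "path": path}


class DemoLoader:
    """Hypothetical loader exposing only get_filename()."""

    def get_filename(self, fullname):
        return "/tmp/pkg/__init__.py"


values = derive_spec_values("pkg", DemoLoader(), is_package=True)
```

Explicit arguments always win in this sketch, which is the same
override behaviour a constructor with sensible fallbacks would have.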
> ``module_repr()`` > > Returns a repr string for the module if ``origin`` is set and > ``filename`` is not set. The string refers to the value of ``origin``. > Otherwise ``module_repr()`` returns None. This indicates to the module > type's ``__repr__()`` that it should fall back to the default repr. > > We could also have ``module_repr()`` produce the repr for the case where > ``filename`` is set or where ``origin`` is not set, mirroring the repr > that the module type produces directly. However, the repr string is > derived from the import-related module attributes, which might be out of > sync with the spec. > > .. XXX Is using the spec close enough? Probably not. I think it makes sense to always return the expected repr based on the spec attributes, but allow a custom origin to be passed in to handle the case where the module __file__ attribute differs from __spec__.origin (keeping in mind I think __spec__.filename should be replaced with __spec__.set_fileattr) > The implementation of the module type's ``__repr__()`` will change to > accommodate this PEP. However, the current functionality will remain to > handle the case where a module does not have a ``__spec__`` attribute. Experience tells us that the import system should ensure the __spec__ attribute always exists (even if it has to be filled in from the module attributes after calling load_module) > ``load(module=None, *, is_reload=False)`` Yep, definitely needs to be a separate method. "is_reload" would almost always be set to a boolean, which means a separate API is likely to be better. However, I think the separate method should be "exec()" rather than "reload()" and require that the module always be passed in. We could also expose a "create" method that just creates and returns the new module object, and replace importlib.util.module_to_load with a context manager that accepted the module as a parameter. Say "add_to_sys", which fails if the module is already present in sys.modules. 
load() would then look something like:

    def load(self):
        m = self.create()
        with importlib.util.add_to_sys(m):
            self.exec(m)
        return sys.modules[self.name]

We could also provide reload() if we wanted to:

    def reload(self):
        self.exec(sys.modules[self.name])
        return sys.modules[self.name]

> Subclassing
> -----------
>
> Subclasses of ModuleSpec are allowed, but should not be necessary.
> Adding functionality to a custom finder or loader will likely be a
> better fit and should be tried first. However, as long as a subclass
> still fulfills the requirements of the import system, objects of that
> type are completely fine as the return value of ``find_module()``.

We may need to do subclasses for the ABC registration backwards
compatibility hack.

>
> Module Objects
> --------------
>
> Module objects will now have a ``__spec__`` attribute to which the
> module's spec will be bound. None of the other import-related module
> attributes will be changed or deprecated, though some of them could be;
> any such deprecation can wait until Python 4.
>
> ``ModuleSpec`` objects will not be kept in sync with the corresponding
> module object's import-related attributes. Though they may differ, in
> practice they will typically be the same.

Worth mentioning that __main__.__spec__.name will give the real name
of modules executed with -m here rather than delaying that until the
notes at the end.

> Finders
> -------
>
> Finders will now return ModuleSpec objects when ``find_module()`` is
> called rather than loaders. For backward compatility, ``Modulespec``
> objects proxy the attributes of their ``loader`` attribute.
>
> Adding another similar method to avoid backward-compatibility issues
> is undersireable if avoidable. The import APIs have suffered enough,
> especially considering ``PathEntryFinder.find_loader()`` was just
> added in Python 3.3. The approach taken by this PEP should be
> sufficient to address backward-compatibility issues for
> ``find_module()``.
> > The change to ``find_module()`` applies to both ``MetaPathFinder`` and > ``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be > deprecated and, for backward compatibility, implicitly special-cased if > the method exists on a finder. Actually, we don't currently have anything on ModuleSpec to indicate "this is complete, stop scanning for more path fragments" or how we will compose multiple module specs for the individual fragments into a combined spec for the namespace package. > Finders are still responsible for creating the loader. That loader will > now be stored in the module spec returned by ``find_module()`` rather > than returned directly. As is currently the case without the PEP, if a > loader would be costly to create, that loader can be designed to defer > the cost until later. > > Loaders > ------- > > Loaders will have a new method, ``exec_module(module)``. Its only job > is to "exec" the module and consequently populate the module's > namespace. It is not responsible for creating or preparing the module > object, nor for any cleanup afterward. It has no return value. > > The ``load_module()`` of loaders will still work and be an active part > of the loader API. It is still useful for cases where the default > module creation/prepartion/cleanup is not appropriate for the loader. > > For example, the C API for extension modules only supports the full > control of ``load_module()``. As such, ``ExtensionFileLoader`` will not > implement ``exec_module()``. In the future it may be appropriate to > produce a second C API that would support an ``exec_module()`` > implementation for ``ExtensionFileLoader``. Such a change is outside > the scope of this PEP. As above, I think it may worth tackling this. It shouldn't be *that* hard given the higher level changes and will solve some hard problems at the lower level. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From brett at python.org Sun Aug 11 22:08:26 2013 From: brett at python.org (Brett Cannon) Date: Sun, 11 Aug 2013 16:08:26 -0400 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References: Message-ID: On Fri, Aug 9, 2013 at 6:58 PM, Eric Snow wrote: > Here's an updated version of the PEP for ModuleSpec which addresses the > feedback I've gotten. Thanks for the help. The big open question, to me, > is whether or not to have a separate reload() method. I'll be looking into > that when I get a chance. There's also the question of a path-based > subclass, but I'm currently not convinced it's worth it. > > -eric > > ----------------------------------- > > PEP: 4XX > Title: A ModuleSpec Type for the Import System > Version: $Revision$ > Last-Modified: $Date$ > Author: Eric Snow > BDFL-Delegate: ??? > Discussions-To: import-sig at python.org > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 8-Aug-2013 > Python-Version: 3.4 > Post-History: 8-Aug-2013 > Resolution: > > > Abstract > ======== > > This PEP proposes to add a new class to ``importlib.machinery`` called > ``ModuleSpec``. It will contain all the import-related information > about a module without needing to load the module first. Finders will > now return a module's spec rather than a loader. The import system will > use the spec to load the module. > > > Motivation > ========== > > The import system has evolved over the lifetime of Python. In late 2002 > PEP 302 introduced standardized import hooks via ``finders`` and > ``loaders`` and ``sys.meta_path``. The ``importlib`` module, introduced > with Python 3.1, now exposes a pure Python implementation of the APIs > described by PEP 302, as well as of the full import system. It is now > much easier to understand and extend the import system. While a benefit > to the Python community, this greater accessibilty also presents a > challenge. 
> > As more developers come to understand and customize the import system, > any weaknesses in the finder and loader APIs will be more impactful. So > the sooner we can address any such weaknesses the import system, the > better...and there are a couple we can take care of with this proposal. > > Firstly, any time the import system needs to save information about a > module we end up with more attributes on module objects that are > generally only meaningful to the import system and occoasionally to some > people. It would be nice to have a per-module namespace to put future > import-related information. Secondly, there's an API void between > finders and loaders that causes undue complexity when encountered. > > Finders are strictly responsible for providing the loader which the > import system will use to load the module. The loader is then > responsible for doing some checks, creating the module object, setting > import-related attributes, "installing" the module to ``sys.modules``, > and loading the module, along with some cleanup. This all takes place > during the import system's call to ``Loader.load_module()``. Loaders > also provide some APIs for accessing data associated with a module. > > Loaders are not required to provide any of the functionality of > ``load_module()`` through other methods. Thus, though the import- > related information about a module is likely available without loading > the module, it is not otherwise exposed. > > Furthermore, the requirements assocated with ``load_module()`` are > common to all loaders and mostly are implemented in exactly the same > way. This means every loader has to duplicate the same boilerplate > code. ``importlib.util`` provides some tools that help with this, but > it would be more helpful if the import system simply took charge of > these responsibilities. The trouble is that this would limit the degree > of customization that ``load_module()`` facilitates. 
This is a gap > between finders and loaders which this proposal aims to fill. > > Finally, when the import system calls a finder's ``find_module()``, the > finder makes use of a variety of information about the module that is > useful outside the context of the method. Currently the options are > limited for persisting that per-module information past the method call, > since it only returns the loader. Popular options for this limitation > are to store the information in a module-to-info mapping somewhere on > the finder itself, or store it on the loader. > > Unfortunately, loaders are not required to be module-specific. On top > of that, some of the useful information finders could provide is > common to all finders, so ideally the import system could take care of > that. This is the same gap as before between finders and loaders. > > As an example of complexity attributable to this flaw, the > implementation of namespace packages in Python 3.3 (see PEP 420) added > ``FileFinder.find_loader()`` because there was no good way for > ``find_module()`` to provide the namespace path. > > The answer to this gap is a ``ModuleSpec`` object that contains the > per-module information and takes care of the boilerplate functionality > of loading the module. > > (The idea gained momentum during discussions related to another PEP.[1]) > > > Specification > ============= > > The goal is to address the gap between finders and loaders while > changing as little of their semantics as possible. Though some > functionality and information is moved the new ``ModuleSpec`` type, > their semantics should remain the same. However, for the sake of > clarity, those semantics will be explicitly identified. > > A High-Level View > ----------------- > > ... > > ModuleSpec > ---------- > > A new class which defines the import-related values to use when loading > the module. It closely corresponds to the import-related attributes of > module objects. 
``ModuleSpec`` objects may also be used by finders and > loaders and other import-related APIs to hold extra import-related > state about the module. This greatly reduces the need to add any new > new import-related attributes to module objects, and loader ``__init__`` > methods won't need to accommodate such per-module state. > > Creating a ModuleSpec: > > ``ModuleSpec(name, loader, *, origin=None, filename=None, cached=None, > path=None)`` > > The parameters have the same meaning as the attributes described below. > However, not all ``ModuleSpec`` attributes are also parameters. The > passed values are set as-is. For calculated values use the > ``from_loader()`` method. > > ModuleSpec Attributes > --------------------- > > Each of the following names is an attribute on ``ModuleSpec`` objects. > A value of ``None`` indicates "not set". This contrasts with module > objects where the attribute simply doesn't exist. > > While ``package`` and ``is_package`` are read-only properties, the > remaining attributes can be replaced after the module spec is created > and after import is complete. This allows for unusual cases where > modifying the spec is the best option. However, typical use should not > involve changing the state of a module's spec. > > Most of the attributes correspond to the import-related attributes of > modules. Here is the mapping, followed by a description of the > attributes. The reverse of this mapping is used by > ``init_module_attrs()``. > > ============= =========== > On ModuleSpec On Modules > ============= =========== > name __name__ > loader __loader__ > package __package__ > is_package - > origin - > filename __file__ > cached __cached__ > path __path__ > ============= =========== > > ``name`` > > The module's fully resolved and absolute name. It must be set. > > ``loader`` > > The loader to use during loading and for module data. These specific > functionalities do not change for loaders. 
Finders are still > responsible for creating the loader and this attribute is where it is > stored. The loader must be set. > > ``package`` > > The name of the module's parent. This is a dynamic attribute with a > value derived from ``name`` and ``is_package``. For packages it is the > value of ``name``. Otherwise it is equivalent to > ``name.rpartition('.')[0]``. Consequently, a top-level module will have > give the empty string for ``package``. > > > ``is_package`` > > Whether or not the module is a package. This dynamic attribute is True > if ``path`` is set (even if empty), else it is false. > "is True if ``path`` is not None (e.g. the empty list is a "true" value), else it is False". > > ``origin`` > > A string for the location from which the module originates. If > ``filename`` is set, ``origin`` should be set to the same value unless > some other value is more appropriate. ``origin`` is used in > ``module_repr()`` if it does not match the value of ``filename``. > > Using ``filename`` for this meaning would be inaccurate, since not all > modules have path-based locations. For instance, built-in modules do > not have ``__file__`` set. Yet it is useful to have a descriptive > string indicating that it originated from the interpreter as a built-in > module. So built-in modules will have ``origin`` set to ``"built-in"``. > I still don't know what you would put there for a zipfile-based loader. Would you still put __file__ or would you put the zipfile? I ask because I would want a way to pass along in a zipfile finder to the loader where the zipfile is located and then the internal location of the file. Otherwise you need to pass in the zip path separately from the internal path to the loader constructor instead of simply passing in a ModuleSpec (e.g. see _split_path in http://bugs.python.org/file30660/zip_importlib.diff). > > Path-based attributes: > > If any of these is set, it indicates that the module is path-based. 
For > reference, a path entry is a string for a location where the import > system will look for modules, e.g. the path entries in ``sys.path`` or a > package's ``__path__``). > > ``filename`` > > Like ``origin``, but limited to a path-based location. If ``filename`` > is set, ``origin`` should be set to the same string, unless origin is > explicitly set to something else. ``filename`` is not necessarily an > actual file name, but could be any location string based on a path > entry. Regarding the attribute name, while it is potentially > inaccurate, it is both consistent with the equivalent module attribute > and generally accurate. > > .. XXX Would a different name be better? ``path_location``? > > ``cached`` > > The path-based location where the compiled code for a module should be > stored. If ``filename`` is set to a source file, this should be set to > corresponding path that PEP 3147 specifies. The > ``importlib.util.source_to_cache()`` function facilitates getting the > correct value. > > ``path`` > > The list of path entries in which to search for submodules if this > module is a package. Otherwise it is ``None``. > > .. XXX add a path-based subclass? > You mean like namespace package's __path__ object? Or are you saying you want ModuleSpec vs. PackageSpec? > > ModuleSpec Methods > ------------------ > > ``from_loader(name, loader, *, is_package=None, origin=None, > filename=None, cached=None, path=None)`` > > .. XXX use a different name? > > A factory classmethod that returns a new ``ModuleSpec`` derived from the > arguments. ``is_package`` is used inside the method to indicate that > the module is a package. > Why is this parameter instead of the other than inferring from 'path' or loader.is_package() as you fall back on? What's the motivation? > If not explicitly passed in, it is set to > ``True`` if ``path`` is passed in. It falls back to using the result of > the loader's ``is_package()``, if available. Finally it defaults to > False. 
The remaining parameters have the same meaning as the > corresponding ``ModuleSpec`` attributes. > > In contrast to ``ModuleSpec.__init__()``, which takes the arguments > as-is, ``from_loader()`` calculates missing values from the ones passed > in, as much as possible. This replaces the behavior that is currently > provided the several ``importlib.util`` functions as well as the > "provided by several" > optional ``init_module_attrs()`` method of loaders. Just to be clear, > here is a more detailed description of those calculations:: > > If not passed in, ``filename`` is to the result of calling the > loader's ``get_filename()``, if available. Otherwise it stays > unset (``None``). > > If not passed in, ``path`` is set to an empty list if > ``is_package`` is true. Then the directory from ``filename`` is > appended to it, if possible. If ``is_package`` is false, ``path`` > stays unset. > > If ``cached`` is not passed in and ``filename`` is passed in, > ``cached`` is derived from it. For filenames with a source suffix, > it set to the result of calling > ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. > ``.pyc``), ``cached`` is set to the value of ``filename``. If > ``filename`` is not passed in or ``cache_from_source()`` raises > ``NotImplementedError``, ``cached`` stays unset. > > If not passed in, ``origin`` is set to ``filename``. Thus if > ``filename`` is unset, ``origin`` stays unset. > Why is this a static constructor instead of a method like infer_values() or an infer_values keyword-only argument to the constructor to do this if requested? > > ``module_repr()`` > > Returns a repr string for the module if ``origin`` is set and > ``filename`` is not set. The string refers to the value of ``origin``. > Otherwise ``module_repr()`` returns None. This indicates to the module > type's ``__repr__()`` that it should fall back to the default repr. > This makes me think that origin is an odd name if all it affects is module_repr(). 
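To make the ``module_repr()`` rule quoted above concrete, here is a minimal sketch. It assumes a free function rather than a method, and uses ``types.SimpleNamespace`` objects as hypothetical stand-ins for specs; neither is part of the draft. The repr format mirrors what the module type already prints for built-in modules.

```python
import types


def module_repr(spec):
    """Hypothetical free-function version of the draft's module_repr()."""
    # Only produce a repr when origin is set and filename is not.
    if spec.origin is not None and spec.filename is None:
        return '<module {!r} ({})>'.format(spec.name, spec.origin)
    # Returning None tells the module type to use its default repr.
    return None


# Stand-in specs; the attribute names follow the draft PEP.
builtin_spec = types.SimpleNamespace(name='sys', origin='built-in',
                                     filename=None)
file_spec = types.SimpleNamespace(name='spam', origin='/tmp/spam.py',
                                  filename='/tmp/spam.py')
```

Under this sketch only the built-in style spec gets a custom repr; the file-based one falls back to the default.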
> > We could also have ``module_repr()`` produce the repr for the case where > ``filename`` is set or where ``origin`` is not set, mirroring the repr > that the module type produces directly. However, the repr string is > derived from the import-related module attributes, which might be out of > sync with the spec. > [SNIP] > .. XXX add reload(module=None) and drop load()'s parameters entirely? > If you are going to make these semantics of making the module argument only good for reloading then I say yes, make it a separate method. > .. XXX add more of importlib.reload()'s boilerplate to load()/reload()? > > Backward Compatibility > ---------------------- > > Since ``Finder.find_module()`` methods would now return a module spec > instead of loader, specs must act like the loader that would have been > returned instead. This is relatively simple to solve since the loader > is available as an attribute of the spec. We will use ``__getattr__()`` > to do it. > > However, ``ModuleSpec.is_package`` (an attribute) conflicts with > ``InspectLoader.is_package()`` (a method). Working around this requires > a more complicated solution but is not a large obstacle. Simply making > ``ModuleSpec.is_package`` a method does not reflect that is a relatively > static piece of data. > Maybe, but depending on what your "more complicated solution" it it might be best to just give up the purity and go with the practicality. > ``module_repr()`` also conflicts with the same > method on loaders, but that workaround is not complicated since both are > methods. > > Unfortunately, the ability to proxy does not extend to ``id()`` > comparisons and ``isinstance()`` tests. In the case of the return value > of ``find_module()``, we accept that break in backward compatibility. > However, we will mitigate the problem with ``isinstance()`` somewhat by > registering ``ModuleSpec`` on the loaders in ``importlib.abc``. 
> Actually, ModuleSpec doesn't even need to register; __instancecheck__ and __subclasscheck__ can just be defined and delegate by calling issubclass/isinstance on the loader as appropriate. > [SNIP] > > Loaders > ------- > > Loaders will have a new method, ``exec_module(module)``. Its only job > is to "exec" the module and consequently populate the module's > namespace. It is not responsible for creating or preparing the module > object, nor for any cleanup afterward. It has no return value. > > The ``load_module()`` of loaders will still work and be an active part > of the loader API. It is still useful for cases where the default > module creation/preparation/cleanup is not appropriate for the loader. > > For example, the C API for extension modules only supports the full > control of ``load_module()``. As such, ``ExtensionFileLoader`` will not > implement ``exec_module()``. In the future it may be appropriate to > produce a second C API that would support an ``exec_module()`` > implementation for ``ExtensionFileLoader``. Such a change is outside > the scope of this PEP. > > A loader must have at least one of ``exec_module()`` and > ``load_module()`` defined. > "A loader must define either ``exec_module()`` or ``load_module()``." -Brett [SNIP]

From ericsnowcurrently at gmail.com Tue Aug 13 05:35:14 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 12 Aug 2013 21:35:14 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References: Message-ID: On Sun, Aug 11, 2013 at 7:03 AM, Nick Coghlan wrote: > I think this is solid enough to be worth adding to the PEPs repo now. > Sounds good. > > On 9 August 2013 18:58, Eric Snow wrote: > > Here's an updated version of the PEP for ModuleSpec which addresses the > > feedback I've gotten. Thanks for the help.
The big open question, to > me, > > is whether or not to have a separate reload() method. I'll be looking > into > > that when I get a chance. There's also the question of a path-based > > subclass, but I'm currently not convinced it's worth it. > > One piece of feedback from me (triggered by the C extension modules > discussion on python-dev): we should consider proposing a new "exec" > hook for C extension modules that could be defined instead of or in > addition to the existing PEP 3121 init hook. > Sounds good. I expect you mean as a separate proposal... > Also, to handle the extension module case, we may need to let loaders > define an optional "create_module" method that accepts the MethodSpec > object as an argument. I'd considered that here, whether on the loader or on ModuleSpec. My plan was to hold off on that to stay focused on the rest of the changes. However, I'm open to adding this to the PEP. > > A High-Level View > > ----------------- > > > > ... > > Not sure a high level view is needed, but you can fill this in if you want > :) > Forgot that was in there. :) > > > > ModuleSpec > > ---------- > > > > A new class which defines the import-related values to use when loading > > the module. It closely corresponds to the import-related attributes of > > module objects. ``ModuleSpec`` objects may also be used by finders and > > loaders and other import-related APIs to hold extra import-related > > state about the module. This greatly reduces the need to add any new > > new import-related attributes to module objects, and loader ``__init__`` > > methods won't need to accommodate such per-module state. > > To avoid conflicts as the spec attributes evolve in the future, would > it be worth having a "custom" field which is just an arbitrary object > reference used to pass info from the finder to the loader without > troubling the rest of the import system? > I see what you're saying, but am conflicted. 
For some reason providing a sub-namespace for that doesn't seem quite right. However, the alternative runs the risk of collisions later on. Maybe we could recommend the use of a leading "_" for custom attributes? I'll see if I can come up with something. > > The parameters have the same meaning as the attributes described below. > > However, not all ``ModuleSpec`` attributes are also parameters. > > The > > passed values are set as-is. For calculated values use the > > ``from_loader()`` method. > > This paragraph isn't particularly clear. Perhaps: > > "Passed in parameter values are assigned directly to the corresponding > attributes below. Other attributes not listed as parameters (such as > ``package``) are read-only properties that are automatically derived > from these values. > > The ``ModuleSpec.from_loader()`` class method allows a suitable > ModuleSpec instance to be easily created from a PEP 302 loader object" > That's much better. > > While ``package`` and ``is_package`` are read-only properties, the > > remaining attributes can be replaced after the module spec is created > > and after import is complete. This allows for unusual cases where > > modifying the spec is the best option. However, typical use should not > > involve changing the state of a module's spec. > > I'm with Brett that "is_package" should go, to be replaced by > "spec.path is not None" wherever it matters. is_package() would then > fall through to the PEP 302 loader API via __getattr__. > I'm considering the recommendation, but I still feel like `is_package` as an attribute is worth having. I see module.__spec__ as useful to more than the import system and its hackers, and `is_package` as valuable to the broader audience that may not have learned what __path__ means. It's certainly not obvious that __path__ implies a package. Then again, a person would have to be looking at __spec__ to see `is_package`, so maybe it loses enough utility to not be worth keeping.
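For reference, the two derived properties under discussion can be sketched roughly as follows. This only follows the rules stated in the draft (``is_package`` is true when ``path`` is not None; ``package`` is ``name`` for packages and ``name.rpartition('.')[0]`` otherwise). The class name ``SketchSpec`` is hypothetical and this is not the proposed implementation.

```python
class SketchSpec:
    """Hypothetical stand-in for the draft ModuleSpec; not the real class."""

    def __init__(self, name, loader, *, origin=None, filename=None,
                 cached=None, path=None):
        # Values are stored as-is, as the draft says of the constructor.
        self.name = name
        self.loader = loader
        self.origin = origin
        self.filename = filename
        self.cached = cached
        self.path = path

    @property
    def is_package(self):
        # An empty list still counts; only None means "not a package".
        return self.path is not None

    @property
    def package(self):
        # Packages are their own parent package; modules use the parent name.
        if self.is_package:
            return self.name
        return self.name.rpartition('.')[0]


pkg = SketchSpec('spam', loader=None, path=[])
mod = SketchSpec('spam.eggs', loader=None)
top = SketchSpec('ham', loader=None)
```

With this shape, dropping ``is_package`` in favor of ``spec.path is not None`` would only remove the first property; ``package`` could perform the same check inline.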
> ``origin`` > > > > A string for the location from which the module originates. If > > ``filename`` is set, ``origin`` should be set to the same value unless > > some other value is more appropriate. ``origin`` is used in > > ``module_repr()`` if it does not match the value of ``filename``. > > > > Using ``filename`` for this meaning would be inaccurate, since not all > > modules have path-based locations. For instance, built-in modules do > > not have ``__file__`` set. Yet it is useful to have a descriptive > > string indicating that it originated from the interpreter as a built-in > > module. So built-in modules will have ``origin`` set to ``"built-in"``. > > How about we *just* have origin, with a separate "set_fileattr" > attribute to indicate "this is a discrete file, you should set > __file__"? > I like that. I'll see how it works. There doesn't seem to be any reason why you would have two distinct strings for origin and filename. In fact, that's kind of smelly. However, I wonder if this is where a PathModuleSpec subclass would be meaningful. Then no flag would be necessary. > Also, we should explicitly note that we'll still set __file__ for zip > imports, due to backwards compatibility concerns, even though it > doesn't correspond to a valid filesystem path. > Hmm. So deprecate the use of __file__ for anything but actual file names? Interesting. I was planning on just leaving the current meaning of "location relative to a path entry". > > (Random thought: spec.origin + spec.cached + a cache directory setting > in zipimport would give a potentially clean way to do extension module > imports from zip archives) > That would be cool. > > ``path`` > > > > The list of path entries in which to search for submodules if this > > module is a package. Otherwise it is ``None``. > > Path entries don't have to correspond to filesystem locations - they > just have to make sense to at least one path hook > (e.g. a DB URI would be a valid path entry). > Right. 
I didn't mean to imply that they do. > > .. XXX add a path-based subclass? > > Nope :) > I keep vacillating on this. > > ModuleSpec Methods > > ------------------ > > > > ``from_loader(name, loader, *, is_package=None, origin=None, > filename=None, > > cached=None, path=None)`` > > > > .. XXX use a different name? > > I'd disallow customisation on this one - if people want to customise, > they should just query the PEP 302 APIs themselves and call the > ModuleSpec constructor directly. The use case for this one should be > to make it trivial to switch from "return loader" to "return > ModuleSpec.from_loader(loader)" in a find_module implementation. > What do you mean by disallow customization? Make it "private"? `from_loader()` is intended for exactly the use that you described. > > In contrast to ``ModuleSpec.__init__()``, which takes the arguments > > as-is, ``from_loader()`` calculates missing values from the ones passed > > in, as much as possible. This replaces the behavior that is currently > > provided the several ``importlib.util`` functions as well as the > > optional ``init_module_attrs()`` method of loaders. Just to be clear, > > here is a more detailed description of those calculations:: > > > > If not passed in, ``filename`` is to the result of calling the > > loader's ``get_filename()``, if available. Otherwise it stays > > unset (``None``). > > > > If not passed in, ``path`` is set to an empty list if > > ``is_package`` is true. Then the directory from ``filename`` is > > appended to it, if possible. If ``is_package`` is false, ``path`` > > stays unset. > > > > If ``cached`` is not passed in and ``filename`` is passed in, > > ``cached`` is derived from it. For filenames with a source suffix, > > it set to the result of calling > > ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. > > ``.pyc``), ``cached`` is set to the value of ``filename``. 
If > > ``filename`` is not passed in or ``cache_from_source()`` raises > > ``NotImplementedError``, ``cached`` stays unset. > > > > If not passed in, ``origin`` is set to ``filename``. Thus if > > ``filename`` is unset, ``origin`` stays unset. > > Hmm, is there a reason this can't be the default constructor > behaviour? What's the value of *not* having the sensible fallbacks, > given they can always be overridden by passing in explicit values when > you want something different? > I'll think about this. There was some value in it before, but with changes to other signatures, `from_loader()` is much less useful as a separate factory method. > > A separate "from_module(m)" constructor would probably make sense, though. > I have this for internal use in the implementation, but did not expose it since all modules should already have a spec. > ``module_repr()`` > > > > Returns a repr string for the module if ``origin`` is set and > > ``filename`` is not set. The string refers to the value of ``origin``. > > Otherwise ``module_repr()`` returns None. This indicates to the module > > type's ``__repr__()`` that it should fall back to the default repr. > > > > We could also have ``module_repr()`` produce the repr for the case where > > ``filename`` is set or where ``origin`` is not set, mirroring the repr > > that the module type produces directly. However, the repr string is > > derived from the import-related module attributes, which might be out of > > sync with the spec. > > > > .. XXX Is using the spec close enough? Probably not. > > I think it makes sense to always return the expected repr based on the > spec attributes, but allow a custom origin to be passed in to handle > the case where the module __file__ attribute differs from > __spec__.origin (keeping in mind I think __spec__.filename should be > replaced with __spec__.set_fileattr) > That's the approach that I took at first, but the module that is passed in is not guaranteed to be a spec. 
Furthermore, having the spec take precedence over the module's attrs for the repr seems like too big a backward-compatibility risk. > > > The implementation of the module type's ``__repr__()`` will change to > > accommodate this PEP. However, the current functionality will remain to > > handle the case where a module does not have a ``__spec__`` attribute. > > Experience tells us that the import system should ensure the __spec__ > attribute always exists (even if it has to be filled in from the > module attributes after calling load_module) > That's a good point. The only possible problem is for someone that creates their own module object and expects repr to work the same as it does currently. > ``load(module=None, *, is_reload=False)`` > > Yep, definitely needs to be a separate method. "is_reload" would > almost always be set to a boolean, which means a separate API is > likely to be better. > Agreed. > However, I think the separate method should be "exec()" rather than > "reload()" and require that the module always be passed in. > I'll see how that looks. It seems like a better fit than just plain `reload()`. We could also expose a "create" method that just creates and returns > the new module object, and replace importlib.util.module_to_load with > a context manager that accepted the module as a parameter. Say > "add_to_sys", which fails if the module is already present in > sys.modules. > One of the points of ModuleSpec is to remove the need for `module_to_load()`. I'm not convinced of the utility of a create method like you've described other than possibly as something internal to ModuleSpec. 
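As a rough illustration of the "add_to_sys" idea mentioned above, here is one way such a context manager might look. It is a hypothetical helper, not an existing ``importlib.util`` function. The rollback on failure is an assumption (the suggestion above only says it fails if the module is already present in ``sys.modules``).

```python
import contextlib
import sys
import types


@contextlib.contextmanager
def add_to_sys(module):
    """Hypothetical helper: put module in sys.modules for the with-block."""
    name = module.__name__
    if name in sys.modules:
        raise ImportError('{!r} is already in sys.modules'.format(name))
    sys.modules[name] = module
    try:
        yield module
    except BaseException:
        # Assumed behavior: a failed exec should not leave the module behind.
        sys.modules.pop(name, None)
        raise


# Usage: the module stays in sys.modules once the body succeeds.
demo = types.ModuleType('_add_to_sys_demo')
with add_to_sys(demo):
    exec('answer = 42', demo.__dict__)
```

Whether something like this belongs in a public API or stays internal to ModuleSpec is exactly the open question in this part of the thread.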
load() would then look something like:
>
>     def load(self):
>         m = self.create()
>         with importlib.util.add_to_sys(m):
>             self.exec(m)
>         return sys.modules[self.name]
>
> We could also provide reload() if we wanted to:
>
>     def reload(self):
>         self.exec(sys.modules[self.name])
>         return sys.modules[self.name]
>
> > Subclassing > > ----------- > > > > Subclasses of ModuleSpec are allowed, but should not be necessary. > > Adding functionality to a custom finder or loader will likely be a > > better fit and should be tried first. However, as long as a subclass > > still fulfills the requirements of the import system, objects of that > > type are completely fine as the return value of ``find_module()``. > > We may need to do subclasses for the ABC registration backwards > compatibility hack. > I was thinking of registering ModuleSpec in the setter of a `loader > > > > > Module Objects > > -------------- > > > > Module objects will now have a ``__spec__`` attribute to which the > > module's spec will be bound. None of the other import-related module > > attributes will be changed or deprecated, though some of them could be; > > any such deprecation can wait until Python 4. > > > > ``ModuleSpec`` objects will not be kept in sync with the corresponding > > module object's import-related attributes. Though they may differ, in > > practice they will typically be the same. > > Worth mentioning that __main__.__spec__.name will give the real name > of modules executed with -m here rather than delaying that until the > notes at the end. > > > Finders > > ------- > > > > Finders will now return ModuleSpec objects when ``find_module()`` is > > called rather than loaders. For backward compatibility, ``ModuleSpec`` > > objects proxy the attributes of their ``loader`` attribute. > > > > Adding another similar method to avoid backward-compatibility issues > > is undesirable if avoidable.
The import APIs have suffered enough, > > especially considering ``PathEntryFinder.find_loader()`` was just > > added in Python 3.3. The approach taken by this PEP should be > > sufficient to address backward-compatibility issues for > > ``find_module()``. > > > > The change to ``find_module()`` applies to both ``MetaPathFinder`` and > > ``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be > > deprecated and, for backward compatibility, implicitly special-cased if > > the method exists on a finder. > > Actually, we don't currently have anything on ModuleSpec to indicate > "this is complete, stop scanning for more path fragments" or how we > will compose multiple module specs for the individual fragments into a > combined spec for the namespace package. > > > Finders are still responsible for creating the loader. That loader will > > now be stored in the module spec returned by ``find_module()`` rather > > than returned directly. As is currently the case without the PEP, if a > > loader would be costly to create, that loader can be designed to defer > > the cost until later. > > > > Loaders > > ------- > > > > Loaders will have a new method, ``exec_module(module)``. Its only job > > is to "exec" the module and consequently populate the module's > > namespace. It is not responsible for creating or preparing the module > > object, nor for any cleanup afterward. It has no return value. > > > > The ``load_module()`` of loaders will still work and be an active part > > of the loader API. It is still useful for cases where the default > > module creation/prepartion/cleanup is not appropriate for the loader. > > > > For example, the C API for extension modules only supports the full > > control of ``load_module()``. As such, ``ExtensionFileLoader`` will not > > implement ``exec_module()``. In the future it may be appropriate to > > produce a second C API that would support an ``exec_module()`` > > implementation for ``ExtensionFileLoader``. 
Such a change is outside > > the scope of this PEP. > > As above, I think it may be worth tackling this. It shouldn't be *that* > hard given the higher level changes and will solve some hard problems > at the lower level. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com Tue Aug 13 05:47:27 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 12 Aug 2013 21:47:27 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References:

Message-ID: Accidently sent. :P Continuing... > On Sun, Aug 11, 2013 at 7:03 AM, Nick Coghlan wrote: > >> > Subclassing >> > > ----------- >> > >> > Subclasses of ModuleSpec are allowed, but should not be necessary. >> > Adding functionality to a custom finder or loader will likely be a >> > better fit and should be tried first. However, as long as a subclass >> > still fulfills the requirements of the import system, objects of that >> > type are completely fine as the return value of ``find_module()``. >> >> We may need to do subclasses for the ABC registration backwards >> compatibility hack. > > I was thinking of registering ModuleSpec in the setter of a `loader` property (as long as the loader's class has a `register()` method >> > >> > Module Objects >> > -------------- >> > >> > Module objects will now have a ``__spec__`` attribute to which the >> > module's spec will be bound. None of the other import-related module >> > attributes will be changed or deprecated, though some of them could be; >> > any such deprecation can wait until Python 4. >> > >> > ``ModuleSpec`` objects will not be kept in sync with the corresponding >> > module object's import-related attributes. Though they may differ, in >> > practice they will typically be the same. >> >> Worth mentioning that __main__.__spec__.name will give the real name >> of module's executed with -m here rather than delaying that until the >> notes at the end. >> > Fair enough. > >> > Finders >> > ------- >> > >> > Finders will now return ModuleSpec objects when ``find_module()`` is >> > called rather than loaders. For backward compatility, ``Modulespec`` >> > objects proxy the attributes of their ``loader`` attribute. >> > >> > Adding another similar method to avoid backward-compatibility issues >> > is undersireable if avoidable. The import APIs have suffered enough, >> > especially considering ``PathEntryFinder.find_loader()`` was just >> > added in Python 3.3. 
The approach taken by this PEP should be >> > sufficient to address backward-compatibility issues for >> > ``find_module()``. >> > >> > The change to ``find_module()`` applies to both ``MetaPathFinder`` and >> > ``PathEntryFinder``. ``PathEntryFinder.find_loader()`` will be >> > deprecated and, for backward compatibility, implicitly special-cased if >> > the method exists on a finder. >> >> Actually, we don't currently have anything on ModuleSpec to indicate >> "this is complete, stop scanning for more path fragments" or how we >> will compose multiple module specs for the individual fragments into a >> combined spec for the namespace package. >> > I was planning on just using the loader's type. If it's NamespaceLoader then path is where we'll get the fragments. I was going to say it's working in my implementation, but namespace packages are actually the one part that still have some failing tests. :P > >> > Finders are still responsible for creating the loader. That loader will >> > now be stored in the module spec returned by ``find_module()`` rather >> > than returned directly. As is currently the case without the PEP, if a >> > loader would be costly to create, that loader can be designed to defer >> > the cost until later. >> > >> > Loaders >> > ------- >> > >> > Loaders will have a new method, ``exec_module(module)``. Its only job >> > is to "exec" the module and consequently populate the module's >> > namespace. It is not responsible for creating or preparing the module >> > object, nor for any cleanup afterward. It has no return value. >> > >> > The ``load_module()`` of loaders will still work and be an active part >> > of the loader API. It is still useful for cases where the default >> > module creation/prepartion/cleanup is not appropriate for the loader. >> > >> > For example, the C API for extension modules only supports the full >> > control of ``load_module()``. As such, ``ExtensionFileLoader`` will not >> > implement ``exec_module()``. 
In the future it may be appropriate to >> > produce a second C API that would support an ``exec_module()`` >> > implementation for ``ExtensionFileLoader``. Such a change is outside >> > the scope of this PEP. >> >> As above, I think it may be worth tackling this. It shouldn't be *that* >> hard given the higher level changes and will solve some hard problems >> at the lower level. >> > For me that seems like a separate proposal. Certainly it's related, but in some ways it would feel tacked on. On top of that, I'd have to dive into the extension module API much more than I have and I'd rather get ModuleSpec and .ref files wrapped up sooner. At the same time, I haven't really done much API design in C so that would be interesting. In the end, I'd like to keep the extension module API additions out of this PEP. -eric

From ericsnowcurrently at gmail.com Tue Aug 13 06:17:58 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 12 Aug 2013 22:17:58 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References: Message-ID: On Sun, Aug 11, 2013 at 2:08 PM, Brett Cannon wrote: > On Fri, Aug 9, 2013 at 6:58 PM, Eric Snow wrote: > >> >> ``is_package`` >> >> Whether or not the module is a package. This dynamic attribute is True >> if ``path`` is set (even if empty), else it is false. >> > > "is True if ``path`` is not None (e.g. the empty list is a "true" value), > else it is False". > Thanks. That is clearer. > > >> >> ``origin`` >> >> A string for the location from which the module originates. If >> ``filename`` is set, ``origin`` should be set to the same value unless >> some other value is more appropriate. ``origin`` is used in >> ``module_repr()`` if it does not match the value of ``filename``. >> >> Using ``filename`` for this meaning would be inaccurate, since not all >> modules have path-based locations.
For instance, built-in modules do >> not have ``__file__`` set. Yet it is useful to have a descriptive >> string indicating that it originated from the interpreter as a built-in >> module. So built-in modules will have ``origin`` set to ``"built-in"``. >> > > I still don't know what you would put there for a zipfile-based loader. > Would you still put __file__ or would you put the zipfile? I ask because I > would want a way to pass along in a zipfile finder to the loader where the > zipfile is located and then the internal location of the file. Otherwise > you need to pass in the zip path separately from the internal path to the > loader constructor instead of simply passing in a ModuleSpec (e.g. see > _split_path in http://bugs.python.org/file30660/zip_importlib.diff). > For me origin makes the most sense as the "string for the location from which the module originates". I'd think it would be the same as gets put into __file__ right now. However, you're right that there's more useful info that could be stored on the spec. In this case I'd expect it to be added as an extra attribute on the spec rather than as part of the normal ModuleSpec attributes. However, as Nick pointed out, custom attributes currently don't have a good strategy for avoiding collisions with future normal ModuleSpec attributes. > ``path`` >> >> The list of path entries in which to search for submodules if this >> module is a package. Otherwise it is ``None``. >> >> .. XXX add a path-based subclass? >> > > You mean like namespace package's __path__ object? Or are you saying you > want ModuleSpec vs. PackageSpec? > More like ModuleSpec and PathModuleSpec. PathModuleSpec would have filename, cached, and path (an associated handling), while ModuleSpec would not. At the same time I like having a one-size-fits-all ModuleSpec if possible, since it should probably pretty closely follow the one-size-fits-all module type. 
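To picture the split described above, here is a small sketch of a base spec with no path-based attributes and a path-based subclass that adds ``filename``, ``cached``, and ``path``. Both classes are hypothetical illustrations of the idea, not the proposed API. Defaulting ``origin`` to ``filename`` in the subclass is taken from the draft's stated fallback.

```python
class ModuleSpec:
    """Hypothetical minimal base: no path-based attributes at all."""

    def __init__(self, name, loader, *, origin=None):
        self.name = name
        self.loader = loader
        self.origin = origin


class PathModuleSpec(ModuleSpec):
    """Hypothetical subclass carrying filename, cached, and path."""

    def __init__(self, name, loader, *, filename, cached=None, path=None,
                 origin=None):
        # origin falls back to filename, as the draft describes.
        super().__init__(name, loader,
                         origin=filename if origin is None else origin)
        self.filename = filename
        self.cached = cached
        self.path = path


builtin = ModuleSpec('sys', loader=None, origin='built-in')
package = PathModuleSpec('spam', loader=None,
                         filename='/tmp/spam/__init__.py',
                         path=['/tmp/spam'])
```

The cost is visible even in this toy: code that inspects a spec has to check which kind it received, which is the argument for keeping a single one-size-fits-all ModuleSpec.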
> > >> >> ModuleSpec Methods >> ------------------ >> >> ``from_loader(name, loader, *, is_package=None, origin=None, >> filename=None, cached=None, path=None)`` >> >> .. XXX use a different name? >> >> A factory classmethod that returns a new ``ModuleSpec`` derived from the >> arguments. ``is_package`` is used inside the method to indicate that >> the module is a package. >> > > Why is this parameter instead of the other than inferring from 'path' or > loader.is_package() as you fall back on? What's the motivation? > In part it's intended to lower the barrier to entry for people learning about the import system and getting their hands dirty. It's just more obvious as an explicit parameter. Of course, it means there are two parameters that basically accomplish the same thing, so perhaps it's not worth it. Furthermore, `from_loader()` may go the way of the dodo since the motivation for it has mostly gone away with other API changes. > Just to be clear, >> here is a more detailed description of those calculations:: >> >> If not passed in, ``filename`` is to the result of calling the >> loader's ``get_filename()``, if available. Otherwise it stays >> unset (``None``). >> >> If not passed in, ``path`` is set to an empty list if >> ``is_package`` is true. Then the directory from ``filename`` is >> appended to it, if possible. If ``is_package`` is false, ``path`` >> stays unset. >> >> If ``cached`` is not passed in and ``filename`` is passed in, >> ``cached`` is derived from it. For filenames with a source suffix, >> it set to the result of calling >> ``importlib.util.cache_from_source()``. For bytecode suffixes (e.g. >> ``.pyc``), ``cached`` is set to the value of ``filename``. If >> ``filename`` is not passed in or ``cache_from_source()`` raises >> ``NotImplementedError``, ``cached`` stays unset. >> >> If not passed in, ``origin`` is set to ``filename``. Thus if >> ``filename`` is unset, ``origin`` stays unset. 
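The calculations quoted above can be sketched roughly as follows. `infer_values()` is a hypothetical helper named after the suggestion later in this thread, not an API from the PEP. The sketch assumes `filename` is always passed in explicitly and leaves out the `loader.get_filename()` step; `cache_from_source()`, `SOURCE_SUFFIXES` and `BYTECODE_SUFFIXES` are the existing importlib names.

```python
import os.path
from importlib.machinery import BYTECODE_SUFFIXES, SOURCE_SUFFIXES
from importlib.util import cache_from_source


def infer_values(filename=None, *, is_package=False, cached=None,
                 path=None, origin=None):
    """Fill in unset values following the quoted description (sketch only)."""
    if path is None and is_package:
        # Packages start with an empty list, then gain the file's directory.
        path = []
        if filename is not None:
            path.append(os.path.dirname(filename))

    if cached is None and filename is not None:
        if filename.endswith(tuple(SOURCE_SUFFIXES)):
            try:
                cached = cache_from_source(filename)
            except NotImplementedError:
                cached = None          # no bytecode cache on this implementation
        elif filename.endswith(tuple(BYTECODE_SUFFIXES)):
            cached = filename          # the file already is the bytecode

    if origin is None:
        origin = filename              # stays None when filename is unset

    return {"filename": filename, "path": path,
            "cached": cached, "origin": origin}


print(infer_values("/tmp/pkg/__init__.py", is_package=True))
```

Returning a plain dict keeps the sketch independent of whatever shape the spec constructor ends up taking.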
>> > > Why is this a static constructor instead of a method like infer_values() > or an infer_values keyword-only argument to the constructor to do this if > requested? > Good point. I was already planning on yanking `from_loader()`. That kw-only argument would probably be a good fit. I'll try it out. > > >> >> ``module_repr()`` >> >> Returns a repr string for the module if ``origin`` is set and >> ``filename`` is not set. The string refers to the value of ``origin``. >> Otherwise ``module_repr()`` returns None. This indicates to the module >> type's ``__repr__()`` that it should fall back to the default repr. >> > > This makes me think that origin is an odd name if all it affects is > module_repr(). > It's also informational, of course. > > >> >> We could also have ``module_repr()`` produce the repr for the case where >> ``filename`` is set or where ``origin`` is not set, mirroring the repr >> that the module type produces directly. However, the repr string is >> derived from the import-related module attributes, which might be out of >> sync with the spec. >> > > > [SNIP] > > >> .. XXX add reload(module=None) and drop load()'s parameters entirely? >> > > If you are going to make these semantics of making the module argument > only good for reloading then I say yes, make it a separate method. > Yeah, I think it's settled. I like Nick's suggestion of calling it `exec()`. > >> .. XXX add more of importlib.reload()'s boilerplate to load()/reload()? >> >> Backward Compatibility >> ---------------------- >> >> Since ``Finder.find_module()`` methods would now return a module spec >> instead of loader, specs must act like the loader that would have been >> returned instead. This is relatively simple to solve since the loader >> is available as an attribute of the spec. We will use ``__getattr__()`` >> to do it. >> >> However, ``ModuleSpec.is_package`` (an attribute) conflicts with >> ``InspectLoader.is_package()`` (a method). 
Working around this requires
>> a more complicated solution but is not a large obstacle. Simply making
>> ``ModuleSpec.is_package`` a method does not reflect that it is a relatively
>> static piece of data.
>>
>
> Maybe, but depending on what your "more complicated solution" is, it might
> be best to just give up the purity and go with the practicality.
>

It's not that complicated, but not exactly pretty:

    class _TruthyFunction:

        def __init__(self, func, is_true):
            self.func = func
            self._is_true = bool(is_true)

        def __repr__(self):
            return repr(self._is_true)

        def __bool__(self):
            return self._is_true

        def __call__(self, *args, **kwargs):
            return self.func(*args, **kwargs)


    class ModuleSpec:

        ...

        @property
        def is_package(self):
            loader = self.loader
            is_package = False
            if self.path is not None:
                is_package = True
            elif hasattr(self.loader, 'is_package'):
                try:
                    is_package = loader.is_package(self.name)
                except ImportError:
                    pass

            # Since InspectLoader also has is_package(), we have to
            # accommodate the use of the return value as a function.
            def func(*args, **kwargs):
                # XXX Throw a DeprecationWarning here?
                return self.loader.is_package(*args, **kwargs)

            return _TruthyFunction(func, is_package)

>
>> ``module_repr()`` also conflicts with the same
>> method on loaders, but that workaround is not complicated since both are
>> methods.
>>
>> Unfortunately, the ability to proxy does not extend to ``id()``
>> comparisons and ``isinstance()`` tests. In the case of the return value
>> of ``find_module()``, we accept that break in backward compatibility.
>> However, we will mitigate the problem with ``isinstance()`` somewhat by
>> registering ``ModuleSpec`` on the loaders in ``importlib.abc``.
>>
>
> Actually, ModuleSpec doesn't even need to register; __instancecheck__ and
> __subclasscheck__ can just be defined and delegate by calling
> issubclass/isinstance on the loader as appropriate.
>

Do you mean add custom versions of those methods to importlib.abc.Loader?
That should work as well as the register approach. It won't work for all loaders but should be good enough. I was just planning on registering ModuleSpec on the loader in the setter for a `loader` property on ModuleSpec. -eric From brett at python.org Tue Aug 13 15:21:42 2013 From: brett at python.org (Brett Cannon) Date: Tue, 13 Aug 2013 09:21:42 -0400 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References:

Message-ID: On Tue, Aug 13, 2013 at 12:17 AM, Eric Snow wrote: > On Sun, Aug 11, 2013 at 2:08 PM, Brett Cannon wrote: > >> >> [SNIP] > >> >>> ``module_repr()`` also conflicts with the same >>> method on loaders, but that workaround is not complicated since both are >>> methods. >>> >>> Unfortunately, the ability to proxy does not extend to ``id()`` >>> comparisons and ``isinstance()`` tests. In the case of the return value >>> of ``find_module()``, we accept that break in backward compatibility. >>> However, we will mitigate the problem with ``isinstance()`` somewhat by >>> registering ``ModuleSpec`` on the loaders in ``importlib.abc``. >>> >> >> Actually, ModuleSpec doesn't even need to register; __instancecheck__ and >> __subclasscheck__ can just be defined and delegate by calling >> issubclass/isinstance on the loader as appropriate. >> > > Do you mean add custom versions of those methods to importlib.abc.Loader? > Nope, I meant ModuleSpec because every time I have a reason to override something it's on the object and not the class and so I forget the support is the other way around. Argh. > That should work as well as the register approach. It won't work for all > loaders but should be good enough. I was just planning on registering > ModuleSpec on the loader in the setter for a `loader` property on > ModuleSpec. > But the registration is at the class level so how would that work? From ericsnowcurrently at gmail.com Wed Aug 14 01:47:53 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 13 Aug 2013 17:47:53 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References:

Message-ID:

On Tue, Aug 13, 2013 at 7:21 AM, Brett Cannon wrote:

> On Tue, Aug 13, 2013 at 12:17 AM, Eric Snow wrote:
>
>> On Sun, Aug 11, 2013 at 2:08 PM, Brett Cannon wrote:
>>
> [SNIP]
>
>>>> ``module_repr()`` also conflicts with the same
>>>> method on loaders, but that workaround is not complicated since both are
>>>> methods.
>>>>
>>>> Unfortunately, the ability to proxy does not extend to ``id()``
>>>> comparisons and ``isinstance()`` tests. In the case of the return value
>>>> of ``find_module()``, we accept that break in backward compatibility.
>>>> However, we will mitigate the problem with ``isinstance()`` somewhat by
>>>> registering ``ModuleSpec`` on the loaders in ``importlib.abc``.
>>>>
>>>
>>> Actually, ModuleSpec doesn't even need to register; __instancecheck__
>>> and __subclasscheck__ can just be defined and delegate by calling
>>> issubclass/isinstance on the loader as appropriate.
>>>
>>
>> Do you mean add custom versions of those methods to importlib.abc.Loader?
>>
>
> Nope, I meant ModuleSpec because every time I have a reason to override
> something it's on the object and not the class and so I forget the support
> is the other way around. Argh.
>

Yeah, that would make things a lot easier.

>> That should work as well as the register approach. It won't work for all
>> loaders but should be good enough. I was just planning on registering
>> ModuleSpec on the loader in the setter for a `loader` property on
>> ModuleSpec.
>>
>
> But the registration is at the class level so how would that work?
>

    @property
    def loader(self):
        return self._loader

    @loader.setter
    def loader(self, loader):
        try:
            register = loader.__class__.register
        except AttributeError:
            pass
        else:
            register(self.__class__)
        self._loader = loader

It's not pretty and it won't work on non-ABCs, but it's better than nothing. The likelihood of someone doing an isinstance check on a loader seems pretty low though.
Of course, I'm planning on doing just that for handling of namespace packages, but that's a little different. -eric From ncoghlan at gmail.com Wed Aug 14 03:16:07 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Aug 2013 21:16:07 -0400 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References:

Message-ID: On 13 Aug 2013 18:48, "Eric Snow" wrote: > > On Tue, Aug 13, 2013 at 7:21 AM, Brett Cannon wrote: >> >> On Tue, Aug 13, 2013 at 12:17 AM, Eric Snow wrote: >>> >>> On Sun, Aug 11, 2013 at 2:08 PM, Brett Cannon wrote: >>>> >>>> >> >> [SNIP] >> >>>> >>>> >>>>> >>>>> ``module_repr()`` also conflicts with the same >>>>> method on loaders, but that workaround is not complicated since both are >>>>> methods. >>>>> >>>>> Unfortunately, the ability to proxy does not extend to ``id()`` >>>>> comparisons and ``isinstance()`` tests. In the case of the return value >>>>> of ``find_module()``, we accept that break in backward compatibility. >>>>> However, we will mitigate the problem with ``isinstance()`` somewhat by >>>>> registering ``ModuleSpec`` on the loaders in ``importlib.abc``. >>>> >>>> >>>> Actually, ModuleSpec doesn't even need to register; __instancecheck__ and __subclasscheck__ can just be defined and delegate by calling issubclass/isinstance on the loader as appropriate. >>> >>> >>> Do you mean add custom versions of those methods to importlib.abc.Loader? >> >> >> Nope, I meant ModuleSpec because every time I have a reason to override something it's on the object and not the class and so I forget the support is the other way around. Argh. > > > Yeah, that would make things a lot easier. > >>> >>> That should work as well as the register approach. It won't work for all loaders but should be good enough. I was just planning on registering ModuleSpec on the loader in the setter for a `loader` property on ModuleSpec. >> >> >> But the registration is at the class level so how would that work? > > > @property > def loader(self): > return self._loader > > @loader.setter > def loader(self, loader): > try: > register = loader.__class__.register > except AttributeError: > pass > else: > register(self.__class__) > self._loader = loader > > It's not pretty and it won't work on non-ABCs, but it's better than nothing. 
The likelihood of someone doing an isinstance check on a loader seems pretty low though. Of course, I'm planning on doing just that for handling of namespace packages, but that's a little different. That ends up registering ModuleSpec as an example of every loader ABC, so it doesn't work at all. Making the importlib ABC hooks ModuleSpec aware (so they knew to check the loader, not the spec) would be pretty easy, though. > > -eric > > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Wed Aug 14 05:18:35 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 13 Aug 2013 21:18:35 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References:

Message-ID: On Tue, Aug 13, 2013 at 7:16 PM, Nick Coghlan wrote: > On 13 Aug 2013 18:48, "Eric Snow" wrote: > > @property > > def loader(self): > > return self._loader > > > > @loader.setter > > def loader(self, loader): > > try: > > register = loader.__class__.register > > except AttributeError: > > pass > > else: > > register(self.__class__) > > self._loader = loader > > > > It's not pretty and it won't work on non-ABCs, but it's better than > nothing. The likelihood of someone doing an isinstance check on a loader > seems pretty low though. Of course, I'm planning on doing just that for > handling of namespace packages, but that's a little different. > > That ends up registering ModuleSpec as an example of every loader ABC, so > it doesn't work at all. > I guess it does amount to a cheap trick, allowing isinstance() checks to pass but not necessarily providing the appropriate APIs. > Making the importlib ABC hooks ModuleSpec aware (so they knew to check the > loader, not the spec) would be pretty easy, though. > That's what I thought Brett was recommending earlier. I was going to express hesitation at spreading backward-compatibility tendrils. However, your recommendation is probably a good idea on its own. Several of the collections ABCs do explicit API checks and they'd work well here too. I'll add this to the PEP. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Thu Aug 15 02:27:40 2013 From: pje at telecommunity.com (PJ Eby) Date: Wed, 14 Aug 2013 20:27:40 -0400 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References: Message-ID: On Fri, Aug 9, 2013 at 6:58 PM, Eric Snow wrote: > A High-Level View > ----------------- > > ... It would be really helpful if that high-level view were actually included, as I'm having a lot of trouble wrapping my head around the rest of the spec. 
For that matter, some introductory examples to contrast "before" and "after" for something that this changes would be really nice at about this point. > Path-based attributes: > > If any of these is set, it indicates that the module is path-based. For > reference, a path entry is a string for a location where the import > system will look for modules, e.g. the path entries in ``sys.path`` or a > package's ``__path__``). > What does "path-based" actually mean here? On the one hand, you're saying that a path entry is on sys.path or a __path__, but then we're using an attribute called "filename". Shouldn't it be called path_entry or subpath, or location or something, if it's not required to be a filename? The overlap between path = sys.path and path = filesystem path is way too confusing here. > .. XXX Would a different name be better? ``path_location``? Yeah, definitely something other than filename. ;-) It might also help to explain that some modules can be loaded by reference to a location, e.g. a filesystem path or a URL or something of the sort -- having the location lets you load the module, but in theory you could load that module under various names. In contrast, non-located modules can't be loaded in this fashion: modules created by a meta path loader (such as builtins), or modules dynamically created in code. For these, the name is the only way to access them, so they have an "origin" but not a "location". Also, bear in mind that it's not just exotic locations like URLs that aren't filenames. zipimport uses pseudo-filenames that pretend a zipfile is a directory, by prepending the zipfile's filename to a path that's within the zipfile. So, calling this "filename" is *really* a bad idea; it's not always a filename for even stdlib importers, let alone anything third-party! > ``path`` > > The list of path entries in which to search for submodules if this > module is a package. Otherwise it is ``None``. 
This should probably be called submodule_path or submodule_search_locations or something, to avoid even *more* overloading of the word "path". ;-) > .. XXX add a path-based subclass? Why? What good would it do? > ModuleSpec Methods > ------------------ > > ``from_loader(name, loader, *, is_package=None, origin=None, filename=None, > cached=None, path=None)`` > > .. XXX use a different name? Seems fine to me: it's consistent w/other stdlib factory method names. > If not passed in, ``path`` is set to an empty list if > ``is_package`` is true. Then the directory from ``filename`` is > appended to it, if possible. If ``is_package`` is false, ``path`` > stays unset. How does this interact with namespace packages? Does it? > Sets the module's import-related attributes to the corresponding values > in the module spec. If a path-based attribute is not set on the spec, Location-based? ;-) > ``load(module=None, *, is_reload=False)`` > > This method captures the current functionality of and requirements on > ``Loader.load_module()`` without any semantic changes, except one. > Reloading a module when ``exec_module()`` is available actually uses > ``module`` rather than ignoring it in favor of the one in > ``sys.modules``, as ``Loader.load_module()`` does. Interesting -- this could possibly be leveraged to implement multi-version imports. > ``module`` is only allowed when ``is_reload`` is true. ...or not. ;) > This means that > ``is_reload`` could be dropped as a parameter. However, doing so would > mean we could not use ``None`` to indicate that the module should be > pulled from ``sys.modules``. Wait, what? That doesn't seem true to me: why not just use the module or pull one according to whether it's None or not? What actual difference does is_reload really make here? > Regarding the first part of ``load()``, the following describes what > happens. I'm thinking maybe this should be parameterized to allow passing in a 'modules' dictionary other than sys.modules. 
This would make multi-version imports or other "isolated environment" imports more viable, and factor out another global element of the import system. That way, if you implement an isolated module system, you don't have to duplicate or subclass ModuleSpec to perform the same loading functionality. > Unfortunately, the ability to proxy does not extend to ``id()`` > comparisons and ``isinstance()`` tests. Who does id() tests on loaders? isinstance() fudging, OTOH, is quite doable. See the ProxyTypes library on PyPI for an example; it's 2.x-only but I believe somebody has done a proof-of-concept port (due to some __special__ methods being different or missing in 3.x) > Finders > ------- > > Finders will now return ModuleSpec objects when ``find_module()`` is > called rather than loaders. For backward compatility, ``Modulespec`` > objects proxy the attributes of their ``loader`` attribute. Has anybody looked at how this change affects pkgutil's (and setuptools') generic function-based extensions to PEP 302? Currently, you can register specific loader types with these guys, but that'll likely break if importlib is going to start wrapping loaders without those tools' knowledge. May I suggest adding a new finder method, find_module_spec() instead? Then, implement it for finders that don't support it by calling find_module() and wrapping the loader with a ModuleSpec. This approach would be less disruptive to code that already uses find_module and inspects loader types to add extension protocols. > Adding another similar method to avoid backward-compatibility issues > is undersireable if avoidable. The import APIs have suffered enough, > especially considering ``PathEntryFinder.find_loader()`` was just > added in Python 3.3. The approach taken by this PEP should be > sufficient to address backward-compatibility issues for > ``find_module()``. 
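The fallback suggested above (a new `find_module_spec()` method, with `find_module()` wrapped for finders that lack it) might look roughly like the sketch below. `find_spec_compat()`, `LegacyFinder` and `LegacyLoader` are hypothetical names for this example, and `SimpleNamespace` merely stands in for a real ModuleSpec; nothing here is taken from the PEP or from importlib.

```python
from types import SimpleNamespace


def find_spec_compat(finder, name, path=None):
    """Prefer the proposed find_module_spec(); otherwise wrap find_module()."""
    find_module_spec = getattr(finder, "find_module_spec", None)
    if find_module_spec is not None:
        return find_module_spec(name, path)
    loader = finder.find_module(name, path)   # PEP 302 finder protocol
    if loader is None:
        return None
    # SimpleNamespace stands in for a real ModuleSpec in this sketch.
    return SimpleNamespace(name=name, loader=loader)


class LegacyLoader:
    def load_module(self, fullname):
        raise ImportError(fullname)   # placeholder; never called here


class LegacyFinder:
    """A PEP 302-style finder that knows nothing about specs."""

    def find_module(self, fullname, path=None):
        return LegacyLoader() if fullname == "spam" else None


spec = find_spec_compat(LegacyFinder(), "spam")
print(spec.name, type(spec.loader).__name__)
```

With this shape, code that inspects the type of the object returned by `find_module()` keeps seeing a plain loader, which is the compatibility property the suggestion is after.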
I'm not sure I'm following here: are you saying that all PEP 302 finders implemented by anyone, anywhere, must be changed *in order to work at all*, when this lands in a *minor version change*? > Other Changes > ------------- This section doesn't address impact on pkgutil, which makes significant use of the PEP 302 API. From ericsnowcurrently at gmail.com Thu Aug 15 09:38:11 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 15 Aug 2013 01:38:11 -0600 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References: Message-ID: On Wed, Aug 14, 2013 at 6:27 PM, PJ Eby wrote: > On Fri, Aug 9, 2013 at 6:58 PM, Eric Snow > wrote: > > A High-Level View > > ----------------- > > > > ... > > It would be really helpful if that high-level view were actually > included, as I'm having a lot of trouble wrapping my head around the > rest of the spec. For that matter, some introductory examples to > contrast "before" and "after" for something that this changes would be > really nice at about this point. > Sounds good. As to examples, do you mean how you would replace an implementation of load_module() with one of exec_module()? > > Path-based attributes: > > > > If any of these is set, it indicates that the module is path-based. For > > reference, a path entry is a string for a location where the import > > system will look for modules, e.g. the path entries in ``sys.path`` or a > > package's ``__path__``). > > > > What does "path-based" actually mean here? On the one hand, you're > saying that a path entry is on sys.path or a __path__, but then we're > using an attribute called "filename". Shouldn't it be called > path_entry or subpath, or location or something, if it's not required > to be a filename? The overlap between path = sys.path and path = > filesystem path is way too confusing here. > This is a really good point. I'll clean it up. 
I've already changed "path" to "path_entries" and dropped "filename" in favor of "set_fileattr". Furthermore, "file location" is a good substitute for "path" when talking about files. > > .. XXX Would a different name be better? ``path_location``? > > Yeah, definitely something other than filename. ;-) > > It might also help to explain that some modules can be loaded by > reference to a location, e.g. a filesystem path or a URL or something > of the sort -- having the location lets you load the module, but in > theory you could load that module under various names. In contrast, > non-located modules can't be loaded in this fashion: modules created > by a meta path loader (such as builtins), or modules dynamically > created in code. For these, the name is the only way to access them, > so they have an "origin" but not a "location". > Right. That's the point of "origin". It will be up to the loader whether or not to use "origin" to determine a location, if any. Also, bear in mind that it's not just exotic locations like URLs that > aren't filenames. zipimport uses pseudo-filenames that pretend a > zipfile is a directory, by prepending the zipfile's filename to a path > that's within the zipfile. So, calling this "filename" is *really* a > bad idea; it's not always a filename for even stdlib importers, let > alone anything third-party! > Yeah, that has always bugged me about "__file__". The upcoming revision of the PEP uses the combo of "origin" and "set_fileattr" (a bool) instead of "filename". > > ``path`` > > > > The list of path entries in which to search for submodules if this > > module is a package. Otherwise it is ``None``. > > This should probably be called submodule_path or > submodule_search_locations or something, to avoid even *more* > overloading of the word "path". ;-) > I came to the same conclusion and was planning on using "path_entries". However perhaps something even more explicit, like "submodule_search_locations", would be better. 
:) > If not passed in, ``path`` is set to an empty list if > > ``is_package`` is true. Then the directory from ``filename`` is > > appended to it, if possible. If ``is_package`` is false, ``path`` > > stays unset. > > How does this interact with namespace packages? Does it? > Namespace packages won't use this method, so nothing will be populated dynamically. > > ``load(module=None, *, is_reload=False)`` > > > > This method captures the current functionality of and requirements on > > ``Loader.load_module()`` without any semantic changes, except one. > > Reloading a module when ``exec_module()`` is available actually uses > > ``module`` rather than ignoring it in favor of the one in > > ``sys.modules``, as ``Loader.load_module()`` does. > > Interesting -- this could possibly be leveraged to implement > multi-version imports. > I'm planning on splitting reload() out from load() so those semantics would go away. However, there may be room to still provide the same functionality. What would be needed for multi-version imports? (Is that question opening a can of worms? ) > > This means that > > ``is_reload`` could be dropped as a parameter. However, doing so would > > mean we could not use ``None`` to indicate that the module should be > > pulled from ``sys.modules``. > > Wait, what? That doesn't seem true to me: why not just use the module > or pull one according to whether it's None or not? What actual > difference does is_reload really make here? > With a separate reload() this point is moot. > > Regarding the first part of ``load()``, the following describes what > > happens. > > I'm thinking maybe this should be parameterized to allow passing in a > 'modules' dictionary other than sys.modules. This would make > multi-version imports or other "isolated environment" imports more > viable, and factor out another global element of the import system. 
> That way, if you implement an isolated module system, you don't have > to duplicate or subclass ModuleSpec to perform the same loading > functionality. > Cool idea, but couldn't this wait. I could totally see this as part of PEP 406 (import engine). > > Unfortunately, the ability to proxy does not extend to ``id()`` > > comparisons and ``isinstance()`` tests. > > Who does id() tests on loaders? Which is why I'm not going to worry about it too much. :) > isinstance() fudging, OTOH, is quite > doable. See the ProxyTypes library on PyPI for an example; it's > 2.x-only but I believe somebody has done a proof-of-concept port (due > to some __special__ methods being different or missing in 3.x) > The current plan is to simply implement __subclasshook__() on the various importlib ABCs, and perhaps other loaders, to check for methods. Some of the ABCs in collections.abc (like Iterator) do this. > Finders > > ------- > > > > Finders will now return ModuleSpec objects when ``find_module()`` is > > called rather than loaders. For backward compatility, ``Modulespec`` > > objects proxy the attributes of their ``loader`` attribute. > > Has anybody looked at how this change affects pkgutil's (and > setuptools') generic function-based extensions to PEP 302? Currently, > you can register specific loader types with these guys, but that'll > likely break if importlib is going to start wrapping loaders without > those tools' knowledge. > Good point. I'll look into this. > > May I suggest adding a new finder method, find_module_spec() instead? > Then, implement it for finders that don't support it by calling > find_module() and wrapping the loader with a ModuleSpec. This > approach would be less disruptive to code that already uses > find_module and inspects loader types to add extension protocols. > I consider this a last resort--i.e. if we can't find a way to make find_module() work for us in a simple enough way. 
I just cringe at the idea of bolting on another backward-compatibility-induced method, particularly when it's the OOTDI and the existing name is better fit for the new functionality than old and re-purposing find_module() is within reach. > Adding another similar method to avoid backward-compatibility issues > > is undersireable if avoidable. The import APIs have suffered enough, > > especially considering ``PathEntryFinder.find_loader()`` was just > > added in Python 3.3. The approach taken by this PEP should be > > sufficient to address backward-compatibility issues for > > ``find_module()``. > > I'm not sure I'm following here: are you saying that all PEP 302 > finders implemented by anyone, anywhere, must be changed *in order to > work at all*, when this lands in a *minor version change*? > Existing finders and loaders will continue working as-is. I've already got this working in a rough implementation, so it's not that big a stretch. > > Other Changes > > ------------- > > This section doesn't address impact on pkgutil, which makes > significant use of the PEP 302 API. > I'll add that in. Thanks for bringing it up. My draft implementation is passing all the pkgutil tests, but I wouldn't be surprised if I've missed something here. Anyway, thanks for the feedback. I'll post an update to the PEP in the next day or two. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Aug 15 17:23:45 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Aug 2013 10:23:45 -0500 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References: Message-ID: On 15 August 2013 02:38, Eric Snow wrote: > PJE wrote: >> I'm thinking maybe this should be parameterized to allow passing in a >> 'modules' dictionary other than sys.modules. 
This would make >> multi-version imports or other "isolated environment" imports more >> viable, and factor out another global element of the import system. >> That way, if you implement an isolated module system, you don't have >> to duplicate or subclass ModuleSpec to perform the same loading >> functionality. > > Cool idea, but couldn't this wait. I could totally see this as part of PEP > 406 (import engine). One of the conclusions I came to from Greg's import engine work is that the only practical way for us to get to isolated import subsystems is either with a Decimal style thread local context based solution, or with a split create/exec API where the loader doesn't do any global state manipulation at all and instead operates in a functional mode where it just returns values based on passed in parameters (that way the import system at least has the chance to override __import__ before running the module code). Anything else looks like it will be too fragile (and the latter approach doesn't necessarily work for C extensions that do imports). This is part of why I'm keen on having this PEP expose "create" and "exec" as separate operations on ModuleSpec, with "load" acting solely as a convenience function for combining them with the appropriate sys.modules manipulation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Aug 15 17:59:43 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 15 Aug 2013 10:59:43 -0500 Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System" In-Reply-To: References:

Message-ID:

On 12 Aug 2013 23:35, "Eric Snow" wrote:
>
> On Sun, Aug 11, 2013 at 7:03 AM, Nick Coghlan wrote:
>>
>> I think this is solid enough to be worth adding to the PEPs repo now.
>
> Sounds good.
>
>>
>> On 9 August 2013 18:58, Eric Snow wrote:
>> > Here's an updated version of the PEP for ModuleSpec which addresses the
>> > feedback I've gotten. Thanks for the help. The big open question, to me,
>> > is whether or not to have a separate reload() method. I'll be looking into
>> > that when I get a chance. There's also the question of a path-based
>> > subclass, but I'm currently not convinced it's worth it.
>>
>> One piece of feedback from me (triggered by the C extension modules
>> discussion on python-dev): we should consider proposing a new "exec"
>> hook for C extension modules that could be defined instead of or in
>> addition to the existing PEP 3121 init hook.
>
> Sounds good. I expect you mean as a separate proposal...

I actually meant in this proposal, but strictly speaking, I just need
the "create" part of the API at this level to tie into my ideas for C
extensions :)

Now that this has a PEP number I can reference, I'll try to get
something more fleshed out posted later this week.

>> > ModuleSpec
>> > ----------
>> >
>> > A new class which defines the import-related values to use when loading
>> > the module. It closely corresponds to the import-related attributes of
>> > module objects. ``ModuleSpec`` objects may also be used by finders and
>> > loaders and other import-related APIs to hold extra import-related
>> > state about the module. This greatly reduces the need to add any
>> > new import-related attributes to module objects, and loader ``__init__``
>> > methods won't need to accommodate such per-module state.
>>
>> To avoid conflicts as the spec attributes evolve in the future, would
>> it be worth having a "custom" field which is just an arbitrary object
>> reference used to pass info from the finder to the loader without
>> troubling the rest of the import system?
>
> I see what you're saying, but am conflicted. For some reason providing
> a sub-namespace for that doesn't seem quite right. However, the
> alternative runs the risk of collisions later on. Maybe we could
> recommend the use of a preceding "_" for custom attributes? I'll see
> if I can come up with something.

It wouldn't be a custom namespace, just a single attribute to pass data
to the loader. It could be a dict, namespace, string, custom object,
anything. By default, it would be None.

For example, zipimporter could use it to pass the zip archive name to
the loader directly, rather than needing to derive it from origin or
create a custom loader for each find operation.

>> > While ``package`` and ``is_package`` are read-only properties, the
>> > remaining attributes can be replaced after the module spec is created
>> > and after import is complete. This allows for unusual cases where
>> > modifying the spec is the best option. However, typical use should not
>> > involve changing the state of a module's spec.
>>
>> I'm with Brett that "is_package" should go, to be replaced by
>> "spec.path is not None" wherever it matters. is_package() would then
>> fall through to the PEP 302 loader API via __getattr__.
>
> I'm considering the recommendation, but I still feel like `is_package`
> as an attribute is worth having. I see module.__spec__ as useful to
> more than the import system and its hackers, and `is_package` as a
> value to the broader audience that may not have learned about what
> __path__ means. It's certainly not obvious that __path__ implies a
> package. Then again, a person would have to be looking at __spec__ to
> see `is_package`, so maybe it loses enough utility to be worth keeping.
I think we need to emphasise the fact that a package is just a module
with a search path attribute *more* rather than less. Don't try to hide
it, shout it from the rooftops :)

Say, something like "spec.submodule_search_path is not None" :)

>> How about we *just* have origin, with a separate "set_fileattr"
>> attribute to indicate "this is a discrete file, you should set
>> __file__"?
>
> I like that. I'll see how it works. There doesn't seem to be any
> reason why you would have two distinct strings for origin and
> filename. In fact, that's kind of smelly.
>
> However, I wonder if this is where a PathModuleSpec subclass would be
> meaningful. Then no flag would be necessary.

I realised we may not need a separate flag at all: how about we key this
off "hasattr(self.loader, 'get_data')"? And expose that as a
"is_location" read-only property? (I like PJE's suggestion of "location"
as a name for modules which may be used with a loader that supports the
get_data API)

(Tangent: at some point in the future, we could define an "open" method
on spec objects. This would do the path munging relative to origin
automatically, using the opener argument to the builtin open to back it
with BytesIO and the get_data API on the loader. If loaders defined an
"opener" method, then the spec could use that instead)

>> > ModuleSpec Methods
>> > ------------------
>> >
>> > ``from_loader(name, loader, *, is_package=None, origin=None, filename=None,
>> > cached=None, path=None)``
>> >
>> > .. XXX use a different name?
>>
>> I'd disallow customisation on this one - if people want to customise,
>> they should just query the PEP 302 APIs themselves and call the
>> ModuleSpec constructor directly. The use case for this one should be
>> to make it trivial to switch from "return loader" to "return
>> ModuleSpec.from_loader(loader)" in a find_module implementation.
>
> What do you mean by disallow customization? Make it "private"?
> `from_loader()` is intended for exactly the use that you described.
The keyword arguments. If from_loader stays, it shouldn't allow you to
override the values derived from the loader - if you want to do that,
just read the values you want to keep from the loader and pass them in
explicitly.

>> A separate "from_module(m)" constructor would probably make sense, though.
>
> I have this for internal use in the implementation, but did not expose
> it since all modules should already have a spec.

It's more for the benefit of adapting existing loaders - since they
already have the code to initialise the module, we should make it easy
for them to just initialise a throwaway module and convert it to a spec
object, rather than having to completely rewrite their initialisation
code to be spec based.

>> > ``module_repr()``
>> >
>> > Returns a repr string for the module if ``origin`` is set and
>> > ``filename`` is not set. The string refers to the value of ``origin``.
>> > Otherwise ``module_repr()`` returns None. This indicates to the module
>> > type's ``__repr__()`` that it should fall back to the default repr.
>> >
>> > We could also have ``module_repr()`` produce the repr for the case where
>> > ``filename`` is set or where ``origin`` is not set, mirroring the repr
>> > that the module type produces directly. However, the repr string is
>> > derived from the import-related module attributes, which might be out of
>> > sync with the spec.
>> >
>> > .. XXX Is using the spec close enough? Probably not.
>>
>> I think it makes sense to always return the expected repr based on the
>> spec attributes, but allow a custom origin to be passed in to handle
>> the case where the module __file__ attribute differs from
>> __spec__.origin (keeping in mind I think __spec__.filename should be
>> replaced with __spec__.set_fileattr)
>
> That's the approach that I took at first, but the module that is
> passed in is not guaranteed to be a spec.
> Furthermore, having the spec take precedence over the module's attrs
> for the repr seems like too big a backward-compatibility risk.

I don't understand your response. Simplifying the API a bit to allow a
module to be passed in directly, ModuleType.__repr__ would just call it
like this:

    self.__spec__.module_repr(self)

All the logic would be in one place (ModuleSpec), but modules could
still override the original values with the actual settings in the
module namespace.

>> > The implementation of the module type's ``__repr__()`` will change to
>> > accommodate this PEP. However, the current functionality will remain to
>> > handle the case where a module does not have a ``__spec__`` attribute.
>>
>> Experience tells us that the import system should ensure the __spec__
>> attribute always exists (even if it has to be filled in from the
>> module attributes after calling load_module)
>
> That's a good point. The only possible problem is for someone that
> creates their own module object and expects repr to work the same as
> it does currently.

Hmm, true - however, we can handle that by creating and throwing away a
dummy spec object rather than duplicating the logic.

>> We could also expose a "create" method that just creates and returns
>> the new module object, and replace importlib.util.module_to_load with
>> a context manager that accepted the module as a parameter. Say
>> "add_to_sys", which fails if the module is already present in
>> sys.modules.
>
> One of the points of ModuleSpec is to remove the need for
> `module_to_load()`. I'm not convinced of the utility of a create
> method like you've described other than possibly as something internal
> to ModuleSpec.

Splitting create and exec should eventually let me delete a bunch of
code from runpy :)

Cheers,
Nick.
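
The create/exec split argued for above can be sketched with the importlib
API that later shipped in Python 3 (`importlib.util.module_from_spec` and
`Loader.exec_module`). That API is not the ModuleSpec method set being
debated in this thread, so treat this as an illustration under that
assumption. `StringLoader`, `load`, and its `modules` parameter are
hypothetical names made up for the example.

```python
import importlib.abc
import importlib.util
import sys


class StringLoader(importlib.abc.Loader):
    """Hypothetical loader that executes source text held in memory."""

    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        # Returning None asks the import machinery for a default module object.
        return None

    def exec_module(self, module):
        # Run the source in the module's namespace; sys.modules is not touched.
        exec(self.source, module.__dict__)


def load(spec, modules=None):
    """Hypothetical convenience: create + exec plus modules-dict bookkeeping."""
    if modules is None:
        modules = sys.modules
    module = importlib.util.module_from_spec(spec)  # the "create" step
    modules[spec.name] = module
    try:
        spec.loader.exec_module(module)             # the "exec" step
    except BaseException:
        # Do not leave a half-initialised module behind.
        modules.pop(spec.name, None)
        raise
    return module


spec = importlib.util.spec_from_loader("demo_mod", StringLoader("answer = 6 * 7"))
isolated = {}  # stands in for a 'modules' dictionary other than sys.modules
mod = load(spec, modules=isolated)
print(mod.answer)  # prints 42
```

Because the two steps are separate, the caller decides which dictionary
receives the module, which is the kind of isolation PJE's parameterized
'modules' suggestion is after.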
From ericsnowcurrently at gmail.com Fri Aug 16 00:15:57 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 15 Aug 2013 16:15:57 -0600
Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System"
In-Reply-To:
References:
Message-ID:

On Thu, Aug 15, 2013 at 9:23 AM, Nick Coghlan wrote:

> On 15 August 2013 02:38, Eric Snow wrote:
> > PJE wrote:
> >> I'm thinking maybe this should be parameterized to allow passing in a
> >> 'modules' dictionary other than sys.modules. This would make
> >> multi-version imports or other "isolated environment" imports more
> >> viable, and factor out another global element of the import system.
> >> That way, if you implement an isolated module system, you don't have
> >> to duplicate or subclass ModuleSpec to perform the same loading
> >> functionality.
> >
> > Cool idea, but couldn't this wait? I could totally see this as part of PEP
> > 406 (import engine).
>
> One of the conclusions I came to from Greg's import engine work is
> that the only practical way for us to get to isolated import
> subsystems is either with a Decimal style thread local context based
> solution,

I was messing around with this a while back and the thread-local context
approach was pretty easy to do.

> or with a split create/exec API where the loader doesn't do
> any global state manipulation at all and instead operates in a
> functional mode where it just returns values based on passed in
> parameters (that way the import system at least has the chance to
> override __import__ before running the module code). Anything else
> looks like it will be too fragile (and the latter approach doesn't
> necessarily work for C extensions that do imports).
>
> This is part of why I'm keen on having this PEP expose "create" and
> "exec" as separate operations on ModuleSpec, with "load" acting solely
> as a convenience function for combining them with the appropriate
> sys.modules manipulation.
>

Ah. That helps clarify things.
I'll go stew on that a bit.

-eric

From ncoghlan at gmail.com Sat Aug 24 13:50:24 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 24 Aug 2013 21:50:24 +1000
Subject: [Import-SIG] Round 2 for "A ModuleSpec Type for the Import System"
In-Reply-To:
References: