From ncoghlan at gmail.com Mon Aug 1 01:39:39 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 Aug 2011 09:39:39 +1000
Subject: [Import-SIG] New PEP Draft: Import Engine
In-Reply-To: <20110731145241.AEC433A409B@sparrow.telecommunity.com>
References: <20110731145241.AEC433A409B@sparrow.telecommunity.com>
Message-ID:

On Mon, Aug 1, 2011 at 12:51 AM, P.J. Eby wrote:
> It would be much better if you can reframe your proposal in terms of
> *additions* to the PEP 302 protocol, rather than *changes*.

They really are additions, just in the form of an optional argument rather than new methods. The idea is that the builtin import would continue to call the PEP 302 methods according to their current signature, so existing implementations of importers and loaders would continue to work without modification.

Engine-based importers and loaders would use the idiom described in the PEP in order to support both styles of access - if the engine argument is missing, they take that as "ah, this is a process global import" and fall back on the global engine instance that uses property descriptors to access the existing process global state. The GlobalImportState subclass will similarly omit the engine argument when invoking the PEP 302 APIs.

Aside from that, the incompatibilities and changes in assumptions run too deep - old style importers would *only* be usable with the global import state (including via the GlobalImportState API). However, all of the importlib importers and loaders would be updated to support the engine argument.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Aug 1 02:17:40 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 Aug 2011 10:17:40 +1000
Subject: [Import-SIG] Windows registry imports
Message-ID:

Imports based on the Windows registry are one of the parts of the import system that I know the least about, so I'd appreciate a second opinion on this tracker issue: http://bugs.python.org/issue12648 (someone ran across the behaviour where installed copies of Python on Windows don't let you shadow stdlib modules).

However, I can't find any real reference to the intended operation of this functionality or its reason for existence - just Guido's initial checkin back in 1996 of the files he received from Mark Hammond (http://hg.python.org/cpython/annotate/740def697d8b/PC/import_nt.c).

http://docs.python.org/using/windows.html#finding-modules describes some additional sys.path entries retrieved from the registry, but says nothing about the actual "/Module" entries that completely preempt sys.path processing.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jergosh at gmail.com Mon Aug 15 22:15:56 2011
From: jergosh at gmail.com (Greg Slodkowicz)
Date: Mon, 15 Aug 2011 22:15:56 +0200
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To: <20110721163700.3daff988@resist.wooz.org>
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org>
Message-ID:

Dear all,

I've been working on a proof-of-concept implementation of PEP 402 as part of my import-related Google Summer of Code project (the repository is at https://bitbucket.org/jergosh/pep-402). I made the changes to my ImportEngine-based import but the logic should be the same as the original __import__ implemented in importlib. I haven't studied the CPython implementation but I'm assuming it's similar.

I've run into the following problem: prior to PEP 402, if we import Foo.Bar.Baz, first Foo and then Foo.Bar are imported by recursion and finally, if these succeed, Foo.Bar.Baz as well. Then __import__ returns the root module, in this case Foo. But suppose Foo and Bar are virtual packages, i.e. there is a file Foo/Bar/Baz.py somewhere on the path, lacking any __init__.py files. As far as I understand the proposal, there would then be no Foo to return.

One way to get around this would be to create an empty module on each import level (Foo and Foo.Bar in this case) and set an appropriate __path__ on each of them. To my mind, this would require changing the way get_subpath() works.

Currently, the requirement is that "Each importer is checked for a get_subpath() method, and if present, the method is called with the *full name* of the module/package the path is being constructed for. The return value is either a string representing a subdirectory for the requested package, or None if no such subdirectory exists."

If instead we allow passing a 'partial' name (Foo and Foo.Bar) of a virtual package to get_subpath() and indicate (by an extra parameter) that we are looking for a purely virtual module, we could create __path__ entries for each of these parent modules.

For that reason, in my implementation I introduced an extra parameter 'as_parent' to _gcd_import(), indicating that if a module or package with a given name is not found, a subdirectory (e.g. Foo/Bar) on the import path should be accepted as well, and an analogous extra parameter in get_subpath().

Hope this makes sense. I'd appreciate any feedback.

Best regards, Greg From ncoghlan at gmail.com Tue Aug 16 01:28:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 16 Aug 2011 09:28:58 +1000 Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning" In-Reply-To: References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> Message-ID: On Tue, Aug 16, 2011 at 6:15 AM, Greg Slodkowicz wrote: > I've ran into the following problem: prior to PEP-402, if we import > Foo.Bar.Baz, first Foo and then Foo.Bar are imported by recursion > finally, if these succeed, also Foo.Bar.Baz. Then __import__ returns > the root module, in this case Foo. But suppose Foo and Bar are virtual > packages, i. e. there is a file Foo/Bar/Baz.py somewhere on the path, > lacking any __init__.py files. As far as I understand the proposal, > then there would be no Foo to return. > > One way to get around this would be to create an empty module on each > import level (Foo and Foo.Bar in this case) and set appropriate > __path__ on each of them. To my mind, this would require changing the > way get_subpath() works. > > Currently, the requirement is that > "Each importer is checked for a get_subpath() method, and if present, > the method is called with the *full name* of the module/package the > path is being constructed for. The return value is either a string > representing a subdirectory for the requested package, or None if no > such subdirectory exists." However, if you go back up to the Specification section, you'll find this bit: "Note, by the way, that this change must be applied recursively: that is, if foo and foo.bar are pure virtual packages, then import foo.bar.baz must wait until foo.bar.baz is found before creating module objects for both foo and foo.bar, and then create both of them together, properly setting the foo module's .bar attribute to point to the foo.bar module. 
In this way, pure virtual packages are never directly importable: an import foo or import foo.bar by itself will fail, and the corresponding modules will not appear in sys.modules until they are needed to point to a successfully imported submodule or self-contained subpackage."

I'm actually not sure this is a viable approach as currently described in the PEP - most of the existing import machinery assumes that parent modules will exist in sys.modules before child modules are imported. In addition, it would create some confusing inconsistencies in cases where, for example 'foo' was pure virtual, but 'bar' was either self-contained or included a module alongside the subpackage directories (whether or not 'foo' remained around after a failed 'foo.bar.baz' lookup would depend on whether or not 'foo.bar' was pure virtual or not).

Given that, I think once a pure virtual package has been created while hunting for subpackages, there's no percentage in trying to remove it even if the subpackage search ultimately fails. This would be most consistent with current import behaviour:

>>> 'logging' in sys.modules
False
>>> 'logging.handlers' in sys.modules
False
>>> import logging.handlers.foo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named foo
>>> 'logging' in sys.modules
True
>>> 'logging.handlers' in sys.modules
True

> If instead we allow passing a 'partial' name (Foo and Foo.Bar) of a
> virtual package to get_subpath() and indicate (by an extra parameter)
> that we are looking for a purely virtual module, we could create
> __path__ entries for each of these parent modules.
>
> For that reason, in my implementation I introduced an extra parameter
> ' as_parent' to _gcd_import(), indicating that if a module or package
> with a given name is not found, a subdirectory (e. g. Foo/Bar) on the
> import path should be accepted as well and an analogical extra
> parameter in get_subpath().
>
> Hope this makes sense. I'd appreciate any feedback.

I don't think you need the extra argument to get_subpath() - virtual path construction should be the same regardless of whether you're creating it for a pure virtual module or not.

However, I suspect you're right about needing the flag argument to _gcd_import(). For the foo.bar.baz case, where there is no 'foo' instance in sys.modules, you want to end up doing something along the lines of the following (obviously deriving the names from the passed in dotted string rather than hard coding anything):

    # Assume a _std_import utility function is available
    path = sys.path
    for name in ('foo', 'foo.bar'):
        # Virtual packages may be created while hunting for subpackages
        try:
            pkg = _std_import(name)
        except ImportError:
            # Try the pure virtual case
            path = get_virtual_path(name, path) # May throw ImportError
            pkg = imp.new_module(name)
            pkg.__path__ = path
        else:
            try:
                path = pkg.__path__
            except AttributeError:
                # Try the initialised virtual case
                path = get_virtual_path(name, path) # May throw ImportError
                pkg.__path__ = path
    else:
        # Direct initial import of virtual packages is not allowed
        result = _std_import('foo.bar.baz')
    # Hook up imported modules in relevant namespaces
    foo.bar = sys.modules['foo.bar']
    foo.bar.baz = sys.modules['foo.bar.baz']

I haven't looked at the code at this point, but the body of that loop presumably corresponds to your 'as_parent' case in _gcd_import(), while the else clause on the loop would be the standard import case.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From pje at telecommunity.com Tue Aug 16 01:44:07 2011
From: pje at telecommunity.com (P.J.
Eby) Date: Mon, 15 Aug 2011 19:44:07 -0400 Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning" In-Reply-To: References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> Message-ID: <20110815234422.DAA243A4105@sparrow.telecommunity.com> At 10:15 PM 8/15/2011 +0200, Greg Slodkowicz wrote: >Dear all, >I've been working on a proof-of-concept implementation of PEP 402 as >part of my import-related Google Summer of Code project (the >repository is at https://bitbucket.org/jergosh/pep-402). I made the >changes to my ImportEngine-based import but the logic should be the >same as the original __import__ implemented in importlib. I haven't >studied the CPython implementation but I'm assuming it's similar. It's not. The C implementation is iterative rather than recursive, so it's actually more amenable to the approach described in the PEP. >I've ran into the following problem: prior to PEP-402, if we import >Foo.Bar.Baz, first Foo and then Foo.Bar are imported by recursion >finally, if these succeed, also Foo.Bar.Baz. Then __import__ returns >the root module, in this case Foo. But suppose Foo and Bar are virtual >packages, i. e. there is a file Foo/Bar/Baz.py somewhere on the path, >lacking any __init__.py files. As far as I understand the proposal, >then there would be no Foo to return. That's correct. It's understood (or at least I understood) that this meant importlib could not continue using a recursive approach. I do have a rough (i.e. utterly untested and probably bug-ridden) draft rewrite of _gcd_import() and associated functions, if you'd like to take a look at it: http://pastebin.com/6e29v8LR It even includes a VirtualPath proxy for automatic recalculation of virtual modules' __path__ attributes when sys.path (or any parent __path__ object) is changed. 
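
A minimal sketch of what an auto-recalculating __path__ proxy of this kind might look like. It is not the code from the pastebin draft: the class name VirtualPath is taken from the message, but its constructor arguments, the caching strategy, and the demo() helper are all assumptions made for illustration. It only notices changes to the parent path object, not directories created later under unchanged entries.

```python
import os
import tempfile


class VirtualPath:
    """Hypothetical list-like __path__ that follows changes to its parent path."""

    def __init__(self, leaf, parent_path):
        self._leaf = leaf                 # last component of the package name
        self._parent_path = parent_path   # sys.path or a parent's __path__
        self._seen = None                 # parent entries at the last computation
        self._entries = []

    def _current(self):
        # Recompute only when the parent path's contents have changed.
        parent_entries = tuple(self._parent_path)
        if parent_entries != self._seen:
            candidates = (os.path.join(p, self._leaf) for p in parent_entries)
            self._entries = [d for d in candidates if os.path.isdir(d)]
            self._seen = parent_entries
        return self._entries

    def __iter__(self):
        return iter(self._current())

    def __len__(self):
        return len(self._current())

    def __getitem__(self, index):
        return self._current()[index]


def demo():
    """Build two temporary roots that each contain a 'pkg' subdirectory."""
    first, second = tempfile.mkdtemp(), tempfile.mkdtemp()
    for base in (first, second):
        os.mkdir(os.path.join(base, "pkg"))
    search_path = [first]
    pkg_path = VirtualPath("pkg", search_path)
    before = list(pkg_path)
    search_path.append(second)   # simulates a later sys.path change
    after = list(pkg_path)
    return before, after, first, second
```

Because iteration goes through the proxy, a VirtualPath can itself serve as the parent path of a nested VirtualPath, which is one way the "any parent __path__ object" case could be covered.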

>One way to get around this would be to create an empty module on each
>import level (Foo and Foo.Bar in this case) and set appropriate
>__path__ on each of them. To my mind, this would require changing the
>way get_subpath() works.

If I understand your proposal correctly, this introduces the problem of having to get rid of these dummy modules if the import fails -- and with no guarantee that you won't wind up with dangling references somewhere.

From pje at telecommunity.com Tue Aug 16 01:57:53 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 15 Aug 2011 19:57:53 -0400
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To:
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org>
Message-ID: <20110815235759.934CD3A415F@sparrow.telecommunity.com>

At 09:28 AM 8/16/2011 +1000, Nick Coghlan wrote:
>I'm actually not sure this is a viable approach as currently described
>in the PEP - most of the existing import machinery assumes that parent
>modules will exist in sys.modules before child modules are imported.

That assumption isn't being violated as much as you might think. The PEP merely requires that one hold off on creating the parent module instance until you're just about to load the child. This is quite straightforward to implement if you process the import iteratively from left to right, as CPython (pre-importlib) does.

In effect, the algorithm is:

    path = sys.path

    for each part of the name:
        check for an already imported name up to this point
        if it's already imported:
            path = module.__path__
        else:
            try to find the module (using 'path')
            if the module is found:
                add any missing parent modules to sys.modules
                load the module
                path = module.__path__
            else:
                path = virtual path for the missing module

    if we have a module:
        return it
    else:
        raise ImportError

Very simple, really.
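
The algorithm above can be sketched as runnable Python under some loud simplifications. This is a toy, not importlib and not the pastebin draft: toy_import(), find_on_path(), virtual_path() and make_demo_tree() are hypothetical names, only plain .py files and __init__.py packages on a real directory tree are handled, a plain dict stands in for sys.modules, the leaf module is returned rather than the root, and child modules are not bound as attributes on their parents.

```python
import os
import tempfile
import types


def find_on_path(leaf, path):
    """Return (source_file, package_dirs_or_None) for `leaf`, or None."""
    for entry in path:
        init = os.path.join(entry, leaf, "__init__.py")
        if os.path.isfile(init):
            return init, [os.path.join(entry, leaf)]
        source = os.path.join(entry, leaf + ".py")
        if os.path.isfile(source):
            return source, None
    return None


def virtual_path(leaf, path):
    """Subdirectories named `leaf` under each entry of `path`."""
    candidates = (os.path.join(entry, leaf) for entry in path)
    return [d for d in candidates if os.path.isdir(d)]


def toy_import(fullname, search_path, modules):
    path = search_path
    pending = []   # virtual parents seen so far but not yet created
    module = None
    name = ""
    for leaf in fullname.split("."):
        name = leaf if not name else name + "." + leaf
        module = modules.get(name)
        if module is not None:              # already imported
            path = getattr(module, "__path__", [])
            continue
        found = find_on_path(leaf, path)
        if found is None:                   # maybe a pure virtual package
            path = virtual_path(leaf, path)
            if not path:
                raise ImportError("No module named " + name)
            pending.append((name, path))
            continue
        source, package_dirs = found
        # A real module was found: only now create the missing parents.
        for parent_name, parent_path in pending:
            parent = types.ModuleType(parent_name)
            parent.__path__ = parent_path
            modules[parent_name] = parent
        pending = []
        module = types.ModuleType(name)
        if package_dirs is not None:
            module.__path__ = package_dirs
        modules[name] = module
        with open(source) as handle:        # "load" the module
            exec(compile(handle.read(), source, "exec"), module.__dict__)
        path = getattr(module, "__path__", [])
    if module is None:                      # name ended on a virtual package
        raise ImportError("No module named " + fullname)
    return module


def make_demo_tree():
    """Create <root>/Foo/Bar/Baz.py with no __init__.py files; return root."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, "Foo", "Bar"))
    with open(os.path.join(root, "Foo", "Bar", "Baz.py"), "w") as handle:
        handle.write("VALUE = 42\n")
    return root
```

With the demo tree, toy_import("Foo.Bar.Baz", [root], {}) creates entries for Foo, Foo.Bar and Foo.Bar.Baz together, while a direct toy_import("Foo", [root], {}) or a lookup of a missing submodule raises ImportError and leaves the dict untouched, which is the lazy-creation behaviour described in this message.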
Granted, the "fill in any missing parent modules" is a wee bit tricky and reintroduces recursion into the mix. (See http://pastebin.com/6e29v8LR for the full details, including a draft implementation of an auto-updating proxy __path__ object.)

But the only place where the "parent modules are already in sys.modules" assumption can be broken here is in find_module() calls -- *not* in load_module() or any actual module code. And this assumption is only broken in scenarios where, in today's Python, the import would already have failed first.

From ncoghlan at gmail.com Tue Aug 16 02:06:19 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 16 Aug 2011 10:06:19 +1000
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To: <20110815235759.934CD3A415F@sparrow.telecommunity.com>
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815235759.934CD3A415F@sparrow.telecommunity.com>
Message-ID:

On Tue, Aug 16, 2011 at 9:57 AM, P.J. Eby wrote:
> At 09:28 AM 8/16/2011 +1000, Nick Coghlan wrote:
>>
>> I'm actually not sure this is a viable approach as currently described
>> in the PEP - most of the existing import machinery assumes that parent
>> modules will exist in sys.modules before child modules are imported.
>
> That assumption isn't being violated as much as you might think. The PEP
> merely requires that one hold off on creating the parent module instance
> until you're just about to load the child. This is quite straightforward to
> implement if you process the import iteratively from left to right, as
> CPython (pre-importlib) does.

Ah, good point. Yes, I had missed that.

> In effect, the algorithm is:
>
>     path = sys.path
>
>     for each part of the name:
>         check for an already imported name up to this point
>         if it's already imported:
>             path = module.__path__
>         else:
>             try to find the module (using 'path')
>             if the module is found:
>                 add any missing parent modules to sys.modules
>                 load the module
>                 path = module.__path__
>             else:
>                 path = virtual path for the missing module
>
>     if we have a module:
>         return it
>     else:
>         raise ImportError
>
> Very simple, really. Granted, the "fill in any missing parent modules" is a
> wee bit tricky and reintroduces recursion into the mix. (See
> http://pastebin.com/6e29v8LR for the full details, including a draft
> implementation of an auto-updating proxy __path__ object.)

The other slightly fiddly bit is coping with the "the module is in sys.modules but doesn't have a __path__ attribute" case, so the logic flow isn't *quite* as neat as shown above (the draft version of _gcd_import appears to deal with this case correctly, though).

> But the only place where the "parent modules are already in sys.modules"
> assumption can be broken here is in find_module() calls -- *not* in
> load_module() or any actual module code. And this assumption is only broken
> in scenarios where, in today's Python, the import would already have failed
> first.

Yeah, consider my objection withdrawn. You may want to elaborate on this a little in the PEP, though.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jergosh at gmail.com Tue Aug 16 21:13:05 2011
From: jergosh at gmail.com (Greg Slodkowicz)
Date: Tue, 16 Aug 2011 21:13:05 +0200
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To: <20110815234422.DAA243A4105@sparrow.telecommunity.com>
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com>
Message-ID:

> I do have a rough
> (i.e.
utterly untested and probably bug-ridden) draft rewrite of
> _gcd_import() and associated functions, if you'd like to take a look at it:
>
>     http://pastebin.com/6e29v8LR

I haven't run the code but I don't currently see why this approach shouldn't work. I've been thinking about reimplementing _gcd_import() like this but wasn't sure it wouldn't interfere with the way importlib is designed. I would be happy to work on this further, if that's the agreed way to go.

>> One way to get around this would be to create an empty module on each
>> import level (Foo and Foo.Bar in this case) and set appropriate
>> __path__ on each of them. To my mind, this would require changing the
>> way get_subpath() works.
>
> If I understand your proposal correctly, this introduces the problem of
> having to get rid of these dummy modules if the import fails -- and with no
> guarantee that you won't wind up with dangling references somewhere.

I followed the approach in 'standard' import, i.e. to leave the parent modules be even if importing the child fails but I guess it doesn't make much sense to have essentially empty packages hanging around. But then again, if some of the parents are virtual and some 'actual,' should we clear the former and leave the latter or just get rid of everything? I'm hoping there is not much legacy code that relies on parent modules being imported in this way ;)

Cheers,
Greg

From pje at telecommunity.com Tue Aug 16 23:31:09 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Tue, 16 Aug 2011 17:31:09 -0400
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To:
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com>
Message-ID: <20110816213116.8FE063A409E@sparrow.telecommunity.com>

At 09:13 PM 8/16/2011 +0200, Greg Slodkowicz wrote:
> > I do have a rough
> > (i.e.
utterly untested and probably bug-ridden) draft rewrite of
> > _gcd_import() and associated functions, if you'd like to take a look at it:
> >
> >     http://pastebin.com/6e29v8LR
>
>I haven't run the code but I don't currently see why this approach
>shouldn't work. I've been thinking about reimplementing _gcd_import()
>like this but wasn't sure it wouldn't interfere with the way importlib
>is designed.

It shouldn't. Since importlib is supposed to be as backward-compatible as possible with the C implementation, and the C implementation was iterative, it should be even *more* correct to implement it with iteration. ;-)

>I followed the approach in 'standard' import, i. e. to leave the
>parent modules be even if importing the child fails but I guess it
>doesn't make much sense to have essentially empty packages hanging
>around. But then again, if some of the parents are virtual and some
>'actual,' should we clear the former and leave the latter or just get
>rid of everything? I'm hoping there is not much legacy code that
>relies on parent modules being imported in this way ;)

If you look closely at the proposed code, I simply avoid creating the parents until the last possible moment. I don't think there's any need to *remove* the parents if the load fails; the idea is just that a virtual package's module doesn't exist until a submodule is *found* (whether or not it's successfully *loaded*).

I think this is a reasonable compromise, as it ensures that the module itself will see the parents, and that if somehow something keeps a reference it won't end up stale/invalid.

Anyway, I wrote the PEP with essentially this exact implementation approach in mind, though it might not be 100% clear from the PEP itself.

From ncoghlan at gmail.com Wed Aug 17 05:24:19 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 17 Aug 2011 13:24:19 +1000
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To: <20110816213116.8FE063A409E@sparrow.telecommunity.com>
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com>
Message-ID:

On Wed, Aug 17, 2011 at 7:31 AM, P.J. Eby wrote:
> If you look closely at the proposed code, I simply avoid creating the
> parents until the last possible moment. I don't think there's any need to
> *remove* the parents if the load fails; the idea is just that a virtual
> package's module doesn't exist until a submodule is *found* (whether or not
> it's successfully *loaded*).
>
> I think this is a reasonable compromise, as it ensures that the module
> itself will see the parents, and that if somehow something keeps a reference
> it won't end up stale/invalid.
>
> Anyway, I wrote the PEP with essentially this exact implementation approach
> in mind, though it might not be 100% clear from the PEP itself.

I've been pondering this aspect a bit, and I'm still not sure delaying the parent module creation until after the named module has been definitively located makes sense.

The current algorithm is basically to break the module path up into components, and then find and load each segment in turn. Errors in finding or loading the later segments leave the earlier, already loaded, segments alone.

With PEP 402, for the backwards compatibility reasons described in the PEP, we need to distinguish the 'parent imports' from direct imports, so we can ignore pure virtual packages in the latter case.
However, the further special casing to say 'only load the pure virtual packages if the child module is found' seems like a needless complication. We may end up with childless virtual packages in sys.modules *anyway* due to failures in the load stage, so what's the advantage of avoiding creating them just because the subsequent failure happened to occur in the find stage?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From pje at telecommunity.com Wed Aug 17 22:53:01 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Wed, 17 Aug 2011 16:53:01 -0400
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To:
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com>
Message-ID: <20110817205311.C52B23A4100@sparrow.telecommunity.com>

At 01:24 PM 8/17/2011 +1000, Nick Coghlan wrote:
>With PEP 402, for the backwards compatibility reasons described in the
>PEP, we need to distinguish the 'parent imports' from direct imports,
>so we can ignore pure virtual packages in the latter case. However,
>the further special casing to say 'only load the pure virtual packages
>if the child module is found' seems like a needless complication. We
>may end up with childless virtual packages in sys.modules *anyway* due
>to failures in the load stage, so what's the advantage of avoiding
>creating them just because the subsequent failure happened to occur in
>the find stage?

You might be right. My concern, though, is that whenever you try to import a non-existent module, you'll be adding its parents to sys.modules -- something that doesn't happen right now, because the import fails as soon as the parent isn't found.

IOW, right now if I import foo.bar.baz, and there's no "foo", nothing gets added to sys.modules.
Under PEP 402 without lazy module creation, you'd end up with an empty 'foo' pointing to 'bar', and an empty 'foo.bar' module, despite the possible non-existence of these packages.

Sure, if the module load fails, you'll still have the parents... but in that instance, you at least have an assurance that the parents are "real".

From ncoghlan at gmail.com Thu Aug 18 01:12:39 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 18 Aug 2011 09:12:39 +1000
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To: <20110817205311.C52B23A4100@sparrow.telecommunity.com>
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com> <20110817205311.C52B23A4100@sparrow.telecommunity.com>
Message-ID:

On Thu, Aug 18, 2011 at 6:53 AM, P.J. Eby wrote:
> You might be right. My concern, though, is that whenever you try to import
> a non-existent module, you'll be adding its parents to sys.modules --
> something that doesn't happen right now, because the import fails as soon as
> the parent isn't found.
>
> IOW, right now if I import foo.bar.baz, and there's no "foo", nothing gets
> added to sys.modules. Under PEP 402 without lazy module creation, you'd end
> up with an empty 'foo' pointing to 'bar', and an empty 'foo.bar' module,
> despite the possible non-existence of these packages.

Well, you'd only add the pure virtual packages if they had a non-empty __path__, so they'd exist in at least *some* sense (i.e. there is at least one subdirectory with an appropriate name available via sys.path or the relevant parent __path__ attribute).

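A minimal sketch of the "only create the virtual package if its __path__ would be non-empty" rule, assuming plain directories on disk. The name get_virtual_path() appears earlier in this thread, but its body here is a guess; create_virtual_parent() and demo() are hypothetical helpers, and a plain dict stands in for sys.modules.

```python
import os
import tempfile
import types


def get_virtual_path(name, parent_path):
    """Return the subdirectories matching the last component of `name`."""
    leaf = name.rpartition(".")[2]
    candidates = (os.path.join(entry, leaf) for entry in parent_path)
    return [d for d in candidates if os.path.isdir(d)]


def create_virtual_parent(name, parent_path, modules):
    """Register a pure virtual package, but only if its __path__ is non-empty."""
    path = get_virtual_path(name, parent_path)
    if not path:
        # No matching directory anywhere: the module table is left untouched.
        raise ImportError("No module named " + name)
    package = types.ModuleType(name)
    package.__path__ = path
    modules[name] = package
    return package


def demo():
    """Try one name that has a directory and one that does not."""
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "somedir"))
    modules = {}
    create_virtual_parent("somedir", [root], modules)
    try:
        create_virtual_parent("notadir", [root], modules)
    except ImportError:
        pass
    return sorted(modules)
```

In this eager variant the parent entry is created as soon as a matching directory exists, regardless of whether the child module is later found.
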
To use the example from the PEP, all of the following would create a pure virtual 'json' entry in sys.modules, even though the last one doesn't find the requested child module:

    import json.foo
    import json.bar
    import json.baz

On the other hand, something like the following wouldn't touch sys.modules, since it would fail to find any directories for the parent name and the generation of the pure virtual package itself would fail:

    import notadir.foo

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jergosh at gmail.com Sun Aug 21 01:09:20 2011
From: jergosh at gmail.com (Greg Slodkowicz)
Date: Sun, 21 Aug 2011 01:09:20 +0200
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To: <20110816213116.8FE063A409E@sparrow.telecommunity.com>
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com>
Message-ID:

> If you look closely at the proposed code, I simply avoid creating the
> parents until the last possible moment. I don't think there's any need to
> *remove* the parents if the load fails; the idea is just that a virtual
> package's module doesn't exist until a submodule is *found* (whether or not
> it's successfully *loaded*).
>
> I think this is a reasonable compromise, as it ensures that the module
> itself will see the parents, and that if somehow something keeps a reference
> it won't end up stale/invalid.

I replaced _gcd_import() with your code (https://bitbucket.org/jergosh/pep-402) and, after small changes, it passes all but one* unittests, including ones I devised for PEP402-style imports (the other 5 failures are due to an unrelated bug which was fixed after I made my fork of cpython).

* I'm getting a mysterious failure in importlib.test.import_.test_meta_path.test_no_path, no idea why. I'd appreciate any feedback, especially more test cases and possible reasons for the unittest failure. Cheers, Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun Aug 21 01:31:01 2011 From: brett at python.org (Brett Cannon) Date: Sat, 20 Aug 2011 16:31:01 -0700 Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning" In-Reply-To: References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com> Message-ID: On Sat, Aug 20, 2011 at 16:09, Greg Slodkowicz wrote: > If you look closely at the proposed code, I simply avoid creating the >> parents until the last possible moment. I don't think there's any need to >> *remove* the parents if the load fails; the idea is just that a virtual >> package's module doesn't exist until a submodule is *found* (whether or not >> it's successfully *loaded*). >> >> I think this is a reasonable compromise, as it ensures that the module >> itself will see the parents, and that if somehow something keeps a reference >> it won't end up stale/invalid. >> > > I replaced _gcd_import() with your code ( > https://bitbucket.org/jergosh/pep-402) and, after small changes, it passes > all but one* unittests, including ones I devised for PEP402-style imports > (the other 5 failures are due to an unrelated bug which was fixed after I > made my fork of cpython). > > * I'm getting a mysterious failure > in importlib.test.import_.test_meta_path.test_no_path, no idea why. > Can you be more specific about what the failure is? That specific test has multiple asserts. 
The point of the test is to simply make sure that meta_path finders get called with None when __path__ is not defined. -Brett > > I'd appreciate any feedback, especially more test cases and possible > reasons for the unittest failure. > > Cheers, > Greg > > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jergosh at gmail.com Sun Aug 21 01:47:53 2011 From: jergosh at gmail.com (Greg Slodkowicz) Date: Sun, 21 Aug 2011 01:47:53 +0200 Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning" In-Reply-To: References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com> Message-ID: > > Can you be more specific about what the failure is? That specific test has > multiple asserts. The point of the test is to simply make sure that > meta_path finders get called with None when __path__ is not defined. > Good point, sorry. It's the last one that fails, self.assertTrue(args[1] is None) -Greg -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun Aug 21 02:09:15 2011 From: brett at python.org (Brett Cannon) Date: Sat, 20 Aug 2011 17:09:15 -0700 Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning" In-Reply-To: References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com> Message-ID: On Sat, Aug 20, 2011 at 16:47, Greg Slodkowicz wrote: > Can you be more specific about what the failure is? 
>> That specific test has multiple asserts. The point of the test is to
>> simply make sure that meta_path finders get called with None when
>> __path__ is not defined.
>
> Good point, sorry. It's the last one that fails,
>
> self.assertTrue(args[1] is None)

So that means that a module that has no __path__ defined is causing
importlib to pass something other than None for a 'path' argument. Didn't
you add some argument to meta_path importers? That could be triggering the
failure.

From jergosh at gmail.com Wed Aug 24 15:05:02 2011
From: jergosh at gmail.com (Greg Slodkowicz)
Date: Wed, 24 Aug 2011 15:05:02 +0200
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To:
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com>
Message-ID:

> So that means that a module that has no __path__ defined is causing
> importlib to pass something other than None for a 'path' argument. Didn't
> you add some argument to meta_path importers? That could be triggering the
> failure.

Okay, this was in fact quite simple, _gcd_import() didn't need to
initialise the path passed to the importers with sys.path:
https://bitbucket.org/jergosh/pep-402/changeset/ff388e7aafd1.
Thanks for the tip, it helped me a lot. This means all unittests now pass
(minus the 5 mentioned earlier).

-Greg

From pje at telecommunity.com Wed Aug 24 15:18:48 2011
From: pje at telecommunity.com (P.J. Eby)
Date: Wed, 24 Aug 2011 09:18:48 -0400
Subject: [Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"
In-Reply-To:
References: <20110713171345.4E0673A4100@sparrow.telecommunity.com> <20110718121726.123e5b44@resist.wooz.org> <20110721163700.3daff988@resist.wooz.org> <20110815234422.DAA243A4105@sparrow.telecommunity.com> <20110816213116.8FE063A409E@sparrow.telecommunity.com>
Message-ID: <20110824131856.7D7AF3A4114@sparrow.telecommunity.com>

At 03:05 PM 8/24/2011 +0200, Greg Slodkowicz wrote:
>So that means that a module that has no __path__ defined is causing
>importlib to pass something other than None for a 'path' argument.
>Didn't you add some argument to meta_path importers? That could be
>triggering the failure.
>
>
>Okay, this was in fact quite simple, _gcd_import() didn't need to
>initialise the path passed to the importers with sys.path:
>https://bitbucket.org/jergosh/pep-402/changeset/ff388e7aafd1.
>Thanks for the tip, it helped me a lot. This means all unittests now
>pass (minus the 5 mentioned earlier).

By the way, do any of your tests make sys.path changes and check
whether virtual packages' __path__ attributes are updated accordingly?

From ncoghlan at gmail.com Fri Aug 26 01:44:24 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 26 Aug 2011 09:44:24 +1000
Subject: [Import-SIG] Suggested example/use case for PEP 402's extensible packages
Message-ID:

When Tarek asked for help with the packaging->distutils2 backport, the
question came up as to *why* the backport was being distributed under
a different name. The rationale put forward was that it allowed the
future 3.4 version to be backported to 3.3 without conflicting with
the standard library version. A similar convention is already in place
for backports like unittest -> unittest2 and it seems to work well in
practice, despite being somewhat ugly.

In a world with extensible (virtual) packages, as proposed by PEP 402,
it would be straightforward to instead adopt a namespace convention
for such standard lib backports, such as "backports.packaging" and
"backports.unittest" rather than having to mangle the name of the
package itself. With the standard import mechanism properly on
sys.meta_path (as it should be in 3.3), it would even be possible to
define a meta importer that checked for such backports automatically
if the ordinary import process failed.

Obviously, this is a far future kind of thing, only feasible when 3.3
is the oldest version a backport wants to support, rather than the
newest, but I like it as an example of what extensible package
namespaces allow.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org Mon Aug 29 16:19:22 2011
From: barry at python.org (Barry Warsaw)
Date: Mon, 29 Aug 2011 10:19:22 -0400
Subject: [Import-SIG] Suggested example/use case for PEP 402's extensible packages
In-Reply-To:
References:
Message-ID: <20110829101922.734b37e7@resist.wooz.org>

Brilliant insight, Nick. We might even want to reserve the `backports` name
in the stdlib exactly for this purpose.

-Barry

On Aug 26, 2011, at 09:44 AM, Nick Coghlan wrote:

>When Tarek asked for help with the packaging->distutils2 backport, the
>question came up as to *why* the backport was being distributed under
>a different name. The rationale put forward was that it allowed the
>future 3.4 version to be backported to 3.3 without conflicting with
>the standard library version. A similar convention is already in place
>for backports like unittest -> unittest2 and it seems to work well in
>practice, despite being somewhat ugly.
>
>In a world with extensible (virtual) packages, as proposed by PEP 402,
>it would be straightforward to instead adopt a namespace convention
>for such standard lib backports, such as "backports.packaging" and
>"backports.unittest" rather than having to mangle the name of the
>package itself. With the standard import mechanism properly on
>sys.meta_path (as it should be in 3.3), it would even be possible to
>define a meta importer that checked for such backports automatically
>if the ordinary import process failed.
>
>Obviously, this is a far future kind of thing, only feasible when 3.3
>is the oldest version a backport wants to support, rather than the
>newest, but I like it as an example of what extensible package
>namespaces allow.
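
The "meta importer that checks for backports if the ordinary import fails"
idea can be sketched roughly as follows. This is only an illustration under
stated assumptions, not anything proposed in the thread: the class names
BackportsFallbackFinder and _AliasLoader are hypothetical, the finder is
assumed to be appended to the *end* of sys.meta_path so it only runs after
the standard finders have given up, and it uses the find_spec() finder API
from Python 3.4+, which postdates this discussion.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class _AliasLoader(importlib.abc.Loader):
    """Hypothetical loader: makes `import X` yield the module `<namespace>.X`."""

    def __init__(self, target_name):
        self.target_name = target_name

    def create_module(self, spec):
        # Let the import system create an ordinary placeholder module.
        return None

    def exec_module(self, module):
        backport = importlib.import_module(self.target_name)
        # A module may replace its own sys.modules entry while it is being
        # executed; the import system then returns the replacement.
        sys.modules[module.__name__] = backport


class BackportsFallbackFinder(importlib.abc.MetaPathFinder):
    """Hypothetical last-resort finder: try `<namespace>.X` when `X` is missing."""

    def __init__(self, namespace="backports"):
        self.namespace = namespace

    def find_spec(self, fullname, path=None, target=None):
        # Only top-level names are redirected. Skipping the namespace itself
        # avoids infinite recursion when the namespace package is absent.
        if "." in fullname or fullname == self.namespace:
            return None
        candidate = "{}.{}".format(self.namespace, fullname)
        try:
            found = importlib.util.find_spec(candidate)
        except (ImportError, ValueError):
            # The namespace package could not be imported, or the cached
            # module has no usable spec: no backport to offer.
            return None
        if found is None:
            return None
        return importlib.util.spec_from_loader(fullname, _AliasLoader(candidate))
```

Installing it would be a matter of
sys.meta_path.append(BackportsFallbackFinder()), after which "import foo"
falls back to "backports.foo" only when nothing else on sys.meta_path can
find "foo". The module object returned is the backport itself, so its
__name__ remains "backports.foo".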