From eric at trueblade.com Wed May 2 00:00:28 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 01 May 2012 18:00:28 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4F90730D.1040808@trueblade.com>
References: <4F90730D.1040808@trueblade.com>
Message-ID: <4FA05CFC.6050609@trueblade.com>

I'm working on finishing up the PEP 420 work. I think the PEP itself is
complete. If you have any comments, please send them to me or this list.

The implementation at features/pep-420 has been merged with the recent
importlib changes to the 3.3 branch. I've implemented support in the
import machinery itself, as well as modified the filesystem finder
(FileFinder) and the zipimport finder.

About the only question I have is: Is everyone okay with the changes to
the finders, described in the PEP? Basically they now return a string in
addition to a loader or None. If they return a string, then the string
represents the path of a possible namespace package portion. The change
is backward compatible: unmodified finders will just be unable to
participate in a namespace package.

Barry Warsaw, Jason Coombs, and I are sprinting this Thursday. We'll
focus on adding tests, and maybe documentation if we have time. If
anyone has any concerns I'd like to hear them before then so that we can
work on addressing them.

The changes themselves are very small. I think the diff is a total of
maybe 40 lines of code. Yury Selivanov had mentioned backporting to 3.2
(which I assume would be an unsupported-by-python-dev effort). I
actually don't think it would be all that complicated.

Eric.

From brett at python.org Wed May 2 04:22:03 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 1 May 2012 22:22:03 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA05CFC.6050609@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com>
Message-ID:

On Tue, May 1, 2012 at 6:00 PM, Eric V.
Smith wrote:

> I'm working on finishing up the PEP 420 work. I think the PEP itself is
> complete. If you have any comments, please send them to me or this list.
>
> The implementation at features/pep-420 has been merged with the recent
> importlib changes to the 3.3 branch. I've implemented support in the
> import machinery itself, as well as modified the filesystem finder
> (FileFinder) and the zipimport finder.
>
> About the only question I have is: Is everyone okay with the changes to
> the finders, described in the PEP? Basically they now return a string in
> addition to a loader or None. If they return a string, then the string
> represents the path of a possible namespace package portion. The change
> is backward compatible: unmodified finders will just be unable to
> participate in a namespace package.
>

I'm obviously okay with the change. =) So this email is just a +1 in
support of this work and a thanks for coding it up and seeing this
through!

-Brett

>
> Barry Warsaw, Jason Coombs, and I are sprinting this Thursday. We'll
> focus on adding tests, and maybe documentation if we have time. If
> anyone has any concerns I'd like to hear them before then so that we can
> work on addressing them.
>
> The changes themselves are very small. I think the diff is a total of
> maybe 40 lines of code. Yury Selivanov had mentioned backporting to 3.2
> (which I assume would be an unsupported-by-python-dev effort). I
> actually don't think it would be all that complicated.
>

Ignoring that the classes he would need to access are technically
private, backporting should be no more than a subclass and an extra stat
call by FileFinder if None is returned.

-Brett

>
> Eric.

From martin at v.loewis.de Wed May 2 09:17:00 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 02 May 2012 09:17:00 +0200
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA05CFC.6050609@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com>
Message-ID: <4FA0DF6C.4090709@v.loewis.de>

> About the only question I have is: Is everyone okay with the changes to
> the finders, described in the PEP?

It looks good to me. It's a somewhat surprising change, but I can see no
flaw in it.

Regards,
Martin

From eric at trueblade.com Wed May 2 12:23:17 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 02 May 2012 06:23:17 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA0DF6C.4090709@v.loewis.de>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de>
Message-ID: <4FA10B15.1000302@trueblade.com>

On 5/2/2012 3:17 AM, "Martin v. Löwis" wrote:
>> About the only question I have is: Is everyone okay with the changes to
>> the finders, described in the PEP?
>
> It looks good to me. It's a somewhat surprising change, but I can see no
> flaw in it.

Surprising in that any change to find_module is needed, or surprising
that it now returns one of {None, loader, str}?

If it's the latter: yeah, it's a little strange. But find_module knows
something that the caller needs to be told. It seemed easiest to add
another possible return type. Any other suggestions?

Eric.

From pje at telecommunity.com Wed May 2 19:06:27 2012
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 2 May 2012 13:06:27 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA10B15.1000302@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com>
Message-ID:

On Wed, May 2, 2012 at 6:23 AM, Eric V.
Smith wrote:

> If it's the latter: yeah, it's a little strange. But find_module knows
> something that the caller needs to be told. It seemed easiest to add
> another possible return type. Any other suggestions?
>

It seems quite elegant to me.

I do see one point of concern with the spec, though. At one point it
says that finders must return a path without a trailing separator, but
at another it says the package __file__ will contain a separator.

This strikes me as inconsistent, and also incompatible with
non-filesystem-based finder implementations. The import machinery *must
not* assume that import path strings are filenames, so it is wrong for
the import machinery to add a path separator that the finder did not
include.

IOW, I don't think the spec can assume or guarantee anything about the
strings returned by finders: it MUST treat them as opaque strings. If
this means that there can't be any meaningful __file__ for a namespace
package, I think we will have to live with that.

The only alternative I see is to delegate the string manipulation back
to the finders, or to change the return value from a string to a (file,
path) tuple, wherein 'file' is the value to be used as __file__, and
'path' is the value to be used in __path__.

From eric at trueblade.com Wed May 2 19:24:21 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 02 May 2012 13:24:21 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com>
Message-ID: <4FA16DC5.1000204@trueblade.com>

On 05/02/2012 01:06 PM, PJ Eby wrote:

> I do see one point of concern with the spec, though. At one point it
> says that finders must return a path without a trailing separator, but
> at another it says the package __file__ will contain a separator.
>
> This strikes me as inconsistent, and also incompatible with
> non-filesystem-based finder implementations. The import machinery *must
> not* assume that import path strings are filenames, so it is wrong for
> the import machinery to add a path separator that the finder did not
> include.
>
> IOW, I don't think the spec can assume or guarantee anything about the
> strings returned by finders: it MUST treat them as opaque strings. If
> this means that there can't be any meaningful __file__ for a namespace
> package, I think we will have to live with that.

I've come to the same conclusion myself. I actually had a draft of the
PEP that removed the word "directory", at which point it becomes obvious
that you're adding a path separator to something that might not be a
path name.

> The only alternative I see is to delegate the string manipulation back
> to the finders, or to change the return value from a string to a (file,
> path) tuple, wherein 'file' is the value to be used as __file__, and
> 'path' is the value to be used in __path__.

I don't see the value of __file__ at all in the case of namespace
packages. If it's just a hint that it's a namespace package, I think it
would be better to set __file__ to None. That would noisily break some
code that isn't likely to work anyway.

Eric.

From brett at python.org Wed May 2 19:53:44 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 2 May 2012 13:53:44 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA16DC5.1000204@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com>
Message-ID:

On Wed, May 2, 2012 at 1:24 PM, Eric V. Smith wrote:

> On 05/02/2012 01:06 PM, PJ Eby wrote:
>
> > I do see one point of concern with the spec, though. At one point it
> > says that finders must return a path without a trailing separator, but
> > at another it says the package __file__ will contain a separator.
> >
> > This strikes me as inconsistent, and also incompatible with
> > non-filesystem-based finder implementations. The import machinery *must
> > not* assume that import path strings are filenames, so it is wrong for
> > the import machinery to add a path separator that the finder did not
> > include.
> >
> > IOW, I don't think the spec can assume or guarantee anything about the
> > strings returned by finders: it MUST treat them as opaque strings. If
> > this means that there can't be any meaningful __file__ for a namespace
> > package, I think we will have to live with that.
>
> I've come to the same conclusion myself. I actually had a draft of the
> PEP that removed the word "directory", at which point it becomes obvious
> that you're adding a path separator to something that might not be a
> path name.
>
> > The only alternative I see is to delegate the string manipulation back
> > to the finders, or to change the return value from a string to a (file,
> > path) tuple, wherein 'file' is the value to be used as __file__, and
> > 'path' is the value to be used in __path__.
>
> I don't see the value of __file__ at all in the case of namespace
> packages. If it's just a hint that it's a namespace package, I think it
> would be better to set __file__ to None. That would noisily break some
> code that isn't likely to work anyway.

Problem is that None for __file__ would be a unique use here. Frozen
modules, for instance, typically say "<frozen>" for __file__. Now part
of the reason (I suspect) this is done is that this was the only way to
tell how the module was created, but with __loader__ now on all modules
this is redundant. So perhaps this fake value for __file__ is just
outdated and not worth perpetuating?
I vote for using __file__ as None as suggested and having people infer
how the module was created from __loader__.

From martin at v.loewis.de Wed May 2 20:32:09 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Wed, 02 May 2012 20:32:09 +0200
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA10B15.1000302@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com>
Message-ID: <4FA17DA9.1070207@v.loewis.de>

On 02.05.2012 12:23, Eric V. Smith wrote:
> On 5/2/2012 3:17 AM, "Martin v. Löwis" wrote:
>>> About the only question I have is: Is everyone okay with the changes to
>>> the finders, described in the PEP?
>>
>> It looks good to me. It's a somewhat surprising change, but I can see no
>> flaw in it.
>
> Surprising in that any change to find_module is needed, or surprising
> that it now returns one of {None, loader, str}?
>

Both, actually. I had expected that new API (i.e. a new method of some
kind) would be necessary, so it has elegance that this is not required.
OTOH, explicit type checking is despised in the OO world, and varying
result types are disliked by Guido van Rossum (not sure whether this
reservation applies to this case as well, or only to cases where the
return type depends on the parameter types).

Regards,
Martin

From barry at python.org Wed May 2 20:50:05 2012
From: barry at python.org (Barry Warsaw)
Date: Wed, 2 May 2012 14:50:05 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA17DA9.1070207@v.loewis.de>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA17DA9.1070207@v.loewis.de>
Message-ID: <20120502145005.4d0633b4@resist.wooz.org>

On May 02, 2012, at 08:32 PM, Martin v.
Löwis wrote:

>Both, actually. I had expected that new API (i.e. a new method of some kind)
>would be necessary, so it has elegance that this is not required. OTOH,
>explicit type checking is despised in the OO world, and varying result types
>are disliked by Guido van Rossum (not sure whether this reservation applies
>to this case as well, or only to cases where the return type depends on the
>parameter types).

My understanding (and I'm sure Guido will correct me if I'm wrong) is
that it's the latter: return type should not depend on function argument
values.

-Barry

From brett at python.org Wed May 2 20:53:17 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 2 May 2012 14:53:17 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA17DA9.1070207@v.loewis.de>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA17DA9.1070207@v.loewis.de>
Message-ID:

On Wed, May 2, 2012 at 2:32 PM, "Martin v. Löwis" wrote:

> On 02.05.2012 12:23, Eric V. Smith wrote:
>
>> On 5/2/2012 3:17 AM, "Martin v. Löwis" wrote:
>>
>>>> About the only question I have is: Is everyone okay with the changes to
>>>> the finders, described in the PEP?
>>>
>>> It looks good to me. It's a somewhat surprising change, but I can see no
>>> flaw in it.
>>
>> Surprising in that any change to find_module is needed, or surprising
>> that it now returns one of {None, loader, str}?
>
> Both, actually. I had expected that new API (i.e. a new method of some
> kind) would be necessary, so it has elegance that this is not required.
> OTOH, explicit type checking is despised in the OO world, and varying
> result types are disliked by Guido van Rossum (not sure whether this
> reservation applies to this case as well, or only to cases where the
> return type depends on the parameter types).
>

You actually don't need to explicitly type-check and instead can rely on
duck typing::

  if loader is None: continue
  elif hasattr(loader, 'load_module'): return loader
  else:
      namespace.append(loader)
      continue

From eric at trueblade.com Wed May 2 21:28:42 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 02 May 2012 15:28:42 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA17DA9.1070207@v.loewis.de>
Message-ID: <4FA18AEA.9070406@trueblade.com>

On 05/02/2012 02:53 PM, Brett Cannon wrote:

> You actually don't need to explicitly type-check and instead can rely on
> duck typing::
>
>   if loader is None: continue
>   elif hasattr(loader, 'load_module'): return loader
>   else:
>       namespace.append(loader)
>       continue

While I agree that this accomplishes the job, I don't think it's any
more readable than the existing code:

    if isinstance(loader, str):
        namespace.append(loader)
    elif loader:
        return loader

(with the case of None causing the code to loop)

But I'm open to changing it.

As to the three return types: Given that find_module() has all of the
information, I don't think it makes sense to add another method. And for
backward compatibility, we need to keep the {None, loader} return types.
If you agree that adding another method is wasteful (it will have to do
most of the same work as find_module(), or cache its result), then I
think adding a str return type makes the most sense.

I can't foresee this ever causing an actual problem. No one is going to
subclass a loader from str (famous last words, I know!).

Eric.
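The two checks debated above can be compared side by side. The following is only a minimal sketch, not the code from the features/pep-420 branch: `DummyLoader`, `classify_isinstance`, `classify_duck` and `search` are hypothetical names introduced for illustration, and the sketch assumes, as described in the thread, that a finder result is one of None, a loader, or a string naming a possible namespace portion.

```python
class DummyLoader:
    """Hypothetical stand-in for a PEP 302 loader (illustration only)."""

    def load_module(self, fullname):
        raise NotImplementedError


def classify_isinstance(result):
    """Classify a finder result with an explicit type check on str."""
    if result is None:
        return "not found"
    if isinstance(result, str):
        return "namespace portion"
    return "loader"


def classify_duck(result):
    """Classify a finder result by looking for a load_module attribute."""
    if result is None:
        return "not found"
    if hasattr(result, "load_module"):
        return "loader"
    return "namespace portion"


def search(finder_results):
    """Walk finder results in path order.

    The first loader wins and is returned immediately; strings are
    collected as candidate namespace portions and only matter if no
    loader turns up.
    """
    portions = []
    for result in finder_results:
        kind = classify_isinstance(result)
        if kind == "loader":
            return result, []
        if kind == "namespace portion":
            portions.append(result)
    return None, portions
```

For the three result kinds discussed here both classifiers agree; they would differ only for an object that is neither a str nor has a `load_module` attribute, which the thread does not consider.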
From brett at python.org Wed May 2 21:39:47 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 2 May 2012 15:39:47 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA18AEA.9070406@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA17DA9.1070207@v.loewis.de> <4FA18AEA.9070406@trueblade.com>
Message-ID:

On Wed, May 2, 2012 at 3:28 PM, Eric V. Smith wrote:

> On 05/02/2012 02:53 PM, Brett Cannon wrote:
>
> > You actually don't need to explicitly type-check and instead can rely on
> > duck typing::
> >
> >   if loader is None: continue
> >   elif hasattr(loader, 'load_module'): return loader
> >   else:
> >       namespace.append(loader)
> >       continue
>
> While I agree that this accomplishes the job, I don't think it's any
> more readable than the existing code:
>
>     if isinstance(loader, str):
>         namespace.append(loader)
>     elif loader:
>         return loader
>
> (with the case of None causing the code to loop)
>
> But I'm open to changing it.
>

I honestly don't care. I just wanted to point out to Martin that if he
wanted more of an interface check than a type check, it's totally
doable.

> As to the three return types: Given that find_module() has all of the
> information, I don't think it makes sense to add another method. And for
> backward compatibility, we need to keep the {None, loader} return types.
> If you agree that adding another method is wasteful (it will have to do
> most of the same work as find_module(), or cache its result), then I
> think adding a str return type makes the most sense.
>
> I can't foresee this ever causing an actual problem. No one is going to
> subclass a loader from str (famous last words, I know!).

Just as I know PJE is going to point out that your loader test won't
work if a loader happens to be false and thus you should do an explicit
``is not None`` check.
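The falsy-loader concern can be made concrete with a small sketch. Everything named here is an assumption for illustration: `EmptyContainerLoader` is a hypothetical loader that happens to evaluate as false because it defines `__len__`, and `pick_truthy` / `pick_not_none` are stand-ins for the truthiness test and the explicit ``is not None`` test respectively, not the actual import machinery code.

```python
class EmptyContainerLoader:
    """Hypothetical loader that is falsy because its __len__ returns 0."""

    def __len__(self):
        return 0

    def load_module(self, fullname):
        raise NotImplementedError


def pick_truthy(result, namespace):
    """Truthiness test: a falsy loader is silently treated as 'not found'."""
    if isinstance(result, str):
        namespace.append(result)
    elif result:
        return result
    return None


def pick_not_none(result, namespace):
    """Explicit None test: any non-None, non-str result counts as a loader."""
    if result is None:
        return None
    if isinstance(result, str):
        namespace.append(result)
        return None
    return result
```

With the truthiness test the empty-container loader is skipped as if the finder had returned None, while the explicit None comparison returns it unchanged.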
From brett at python.org Wed May 2 21:40:41 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 2 May 2012 15:40:41 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <20120502145005.4d0633b4@resist.wooz.org>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA17DA9.1070207@v.loewis.de> <20120502145005.4d0633b4@resist.wooz.org>
Message-ID:

On Wed, May 2, 2012 at 2:50 PM, Barry Warsaw wrote:

> On May 02, 2012, at 08:32 PM, Martin v. Löwis wrote:
>
> >Both, actually. I had expected that new API (i.e. a new method of some kind)
> >would be necessary, so it has elegance that this is not required. OTOH,
> >explicit type checking is despised in the OO world, and varying result types
> >are disliked by Guido van Rossum (not sure whether this reservation applies
> >to this case as well, or only to cases where the return type depends on the
> >parameter types).
>
> My understanding (and I'm sure Guido will correct me if I'm wrong) is that
> it's the latter: return type should not depend on function argument values.

This is how I interpreted Guido's preference (e.g. return bytes or str
based on whether an argument(s) is bytes or str).

From eric at trueblade.com Wed May 2 21:47:37 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 02 May 2012 15:47:37 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA17DA9.1070207@v.loewis.de> <4FA18AEA.9070406@trueblade.com>
Message-ID: <4FA18F59.5070701@trueblade.com>

On 05/02/2012 03:39 PM, Brett Cannon wrote:

> I can't foresee this ever causing an actual problem. No one is going to
> subclass a loader from str (famous last words, I know!).
>
>
> Just as I know PJE is going to point out that your loader test won't
> work if a loader happens to be false and thus you should do an explicit
> ``is not None`` check.

Good one! I'll make that change.

From pje at telecommunity.com Wed May 2 23:05:51 2012
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 2 May 2012 17:05:51 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA16DC5.1000204@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com>
Message-ID:

On Wed, May 2, 2012 at 1:24 PM, Eric V. Smith wrote:

> On 05/02/2012 01:06 PM, PJ Eby wrote:
>
> > I do see one point of concern with the spec, though. At one point it
> > says that finders must return a path without a trailing separator, but
> > at another it says the package __file__ will contain a separator.
> >
> > This strikes me as inconsistent, and also incompatible with
> > non-filesystem-based finder implementations. The import machinery *must
> > not* assume that import path strings are filenames, so it is wrong for
> > the import machinery to add a path separator that the finder did not
> > include.
> >
> > IOW, I don't think the spec can assume or guarantee anything about the
> > strings returned by finders: it MUST treat them as opaque strings. If
> > this means that there can't be any meaningful __file__ for a namespace
> > package, I think we will have to live with that.
>
> I've come to the same conclusion myself. I actually had a draft of the
> PEP that removed the word "directory", at which point it becomes obvious
> that you're adding a path separator to something that might not be a
> path name.
>
> > The only alternative I see is to delegate the string manipulation back
> > to the finders, or to change the return value from a string to a (file,
> > path) tuple, wherein 'file' is the value to be used as __file__, and
> > 'path' is the value to be used in __path__.
>
> I don't see the value of __file__ at all in the case of namespace
> packages. If it's just a hint that it's a namespace package, I think it
> would be better to set __file__ to None. That would noisily break some
> code that isn't likely to work anyway.
>

Either None or a missing attribute is fine with me. (One advantage to
the missing attribute is that it fails at the exact point where the
inspecting code needs fixing, whereas the None will get passed on to
some other code before the error manifests itself.)

By the way, I finished reading the rest of the PEP, and with regard to
auto-updating paths, I want to mention that it wasn't me who originally
brought up issues about auto-update, it was someone on Python-Dev, and
the use cases were discussed there. Also, I would challenge the
argument about it being a major block to implementation, since the
implementation is straightforward (and TONS simpler than setuptools'
approach to the problem).

More to the point, though, supporting auto-updates *later* is not really
an option, since we'd be changing the rules on people, and invalidating
whatever workarounds people come up with for manually updating the path.
If namespace package __path__ objects start out as some other type than
lists, then there's no change to trip anyone up later.

I guess my point is that if we're not going to do auto-updates from the
start, it's kind of going to rule it out in the long term as well, so if
that's the intention it should be explicitly addressed. I don't want to
see it just get ruled out by default due to not being done now, and then
not being able to be done later.
That's why my earlier question was about whether it had been discussed
or not -- there was previous discussion on it in the 402 context, and it
was left as an open issue pending BDFL comment on the basic idea of 402.
Since then, the basic idea of treating init-less directories as
namespace packages has been blessed, so now it's time to get the
auto-updates yea-or-nay question ruled on as well.

The implementation is pretty trivial; see PEP 402 version of it here:

http://mail.python.org/pipermail/import-sig/2012-April/000473.html

...and the PEP 420 version is even simpler, since instead of looking for
a 'get_subpath()' method on the finders, it should just call
find_module() and check for a string return.

From eric at trueblade.com Thu May 3 02:58:27 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 02 May 2012 20:58:27 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com>
Message-ID: <4FA1D833.20208@trueblade.com>

On 5/2/2012 5:05 PM, PJ Eby wrote:

> I don't see the value of __file__ at all in the case of namespace
> packages. If it's just a hint that it's a namespace package, I think it
> would be better to set __file__ to None. That would noisily break some
> code that isn't likely to work anyway.
>
>
> Either None or a missing attribute is fine with me. (One advantage to
> the missing attribute is that it fails at the exact point where the
> inspecting code needs fixing, whereas the None will get passed on to
> some other code before the error manifests itself.)

I can go either way on this, but would lean toward __file__ not being
set. Brett: what's your opinion?
> By the way, I finished reading the rest of the PEP, and with regard to
> auto-updating paths, I want to mention that it wasn't me who originally
> brought up issues about auto-update, it was someone on Python-Dev, and
> the use cases were discussed there. Also, I would challenge the
> argument about it being a major block to implementation, since the
> implementation is straightforward (and TONS simpler than setuptools'
> approach to the problem).
>
> I guess my point is that if we're not going to do auto-updates from the
> start, it's kind of going to rule it out in the long term as well, so if
> that's the intention it should be explicitly addressed. I don't want to
> see it just get ruled out by default due to not being done now, and then
> not being able to be done later.

Okay. I'll take a look at it tomorrow to see what's involved and if
we're backing ourselves into a corner or not.

Thanks.

Eric.

From barry at python.org Thu May 3 03:23:55 2012
From: barry at python.org (Barry Warsaw)
Date: Wed, 2 May 2012 21:23:55 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <4FA1D833.20208@trueblade.com>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com>
Message-ID: <20120502212355.6bda4cd4@resist.wooz.org>

On May 02, 2012, at 08:58 PM, Eric V. Smith wrote:

>On 5/2/2012 5:05 PM, PJ Eby wrote:
>
>> I don't see the value of __file__ at all in the case of namespace
>> packages. If it's just a hint that it's a namespace package, I think it
>> would be better to set __file__ to None. That would noisily break some
>> code that isn't likely to work anyway.
>>
>>
>> Either None or a missing attribute is fine with me. (One advantage to
>> the missing attribute is that it fails at the exact point where the
>> inspecting code needs fixing, whereas the None will get passed on to
>> some other code before the error manifests itself.)
>
>I can go either way on this, but would lean toward __file__ not being
>set. Brett: what's your opinion?

I rather like __file__ not existing, although I haven't really thought
about the practical effects. PJE makes a good argument though.

-Barry

From pje at telecommunity.com Thu May 3 06:37:25 2012
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 3 May 2012 00:37:25 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <20120502212355.6bda4cd4@resist.wooz.org>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org>
Message-ID:

On Wed, May 2, 2012 at 9:23 PM, Barry Warsaw wrote:

> On May 02, 2012, at 08:58 PM, Eric V. Smith wrote:
>
> >On 5/2/2012 5:05 PM, PJ Eby wrote:
> >
> >> I don't see the value of __file__ at all in the case of namespace
> >> packages. If it's just a hint that it's a namespace package, I think it
> >> would be better to set __file__ to None. That would noisily break some
> >> code that isn't likely to work anyway.
> >>
> >>
> >> Either None or a missing attribute is fine with me. (One advantage to
> >> the missing attribute is that it fails at the exact point where the
> >> inspecting code needs fixing, whereas the None will get passed on to
> >> some other code before the error manifests itself.)
> >
> >I can go either way on this, but would lean toward __file__ not being
> >set. Brett: what's your opinion?
>
> I rather like __file__ not existing, although I haven't really thought about
> the practical effects. PJE makes a good argument though.
>

There's a counterargument that I realized later: PEP 302 currently
requires that __file__ be set, AND that it be a string. "The privilege
of not having a __file__ attribute at all is reserved for built-in
modules." (Of course, that argues equally against __file__ being None,
so I'm not sure it helps any to point that out!)

Still, code that expects to do something with a package's __file__ is
*going* to break somehow with a namespace package, so it's probably
better for it to break sooner rather than later.

From ncoghlan at gmail.com Thu May 3 08:23:34 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 3 May 2012 16:23:34 +1000
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org>
Message-ID:

On Thu, May 3, 2012 at 2:37 PM, PJ Eby wrote:
> Still, code that expects to do something with a package's __file__ is
> *going* to break somehow with a namespace package, so it's probably better
> for it to break sooner rather than later.

My own preference is for markers like "", "" and "". They're
significantly nicer to deal with when dumping module state for
diagnostic purposes. If I get a KeyError on __file__, or an
AttributeError on NoneType when all I'm trying to do is display data,
it's annoying.

Standardising on a pattern also opens up the possibility of doing
something meaningful with it in get_data() later. One of the guarantees
of PEP 302 is that you should be able to do this:

    data_ref = os.path.join(__file__, relative_ref)
    data = __loader__.get_data(data_ref)

That should really only blow up in get_data(), *not* on the os.path.join
step.
Ideally, you should also be able to do this: data_ref = os.path.join(mod.__file__, relative_ref) data = mod.__loader__.get_data(data_ref) I see it as being similar to the mandatory file attribute on code objects - placeholders like "" and "" are a lot more informative when errors occur than just using None, even though neither of them is a valid filesystem path. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Thu May 3 10:37:02 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 03 May 2012 10:37:02 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <4FA1D833.20208@trueblade.com> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> Message-ID: <20120503103702.Horde.pBsTdLuWis5PokOuiL1VKAA@webmail.df.eu> > I can go either way on this, but would lean toward __file__ not being > set. Brett: what's your opinion? I'd like to recall that we were explicitly discussing this question at PyCon, and (IIRC) I proposed that it be None, and Guido pronounced that it shall be the path to the first portion. So if you now want to change it, you should check with him again. Regards, Martin From eric at trueblade.com Thu May 3 14:28:03 2012 From: eric at trueblade.com (Eric V.
Smith) Date: Thu, 03 May 2012 08:28:03 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120503103702.Horde.pBsTdLuWis5PokOuiL1VKAA@webmail.df.eu> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120503103702.Horde.pBsTdLuWis5PokOuiL1VKAA@webmail.df.eu> Message-ID: <4FA279D3.6090701@trueblade.com> On 5/3/2012 4:37 AM, martin at v.loewis.de wrote: >> I can go either way on this, but would lean toward __file__ not being >> set. Brett: what's your opinion? > > I'd like to recall that we were explicitly discussion this question at > PyCon, and (IIRC) I proposed that it be None, and Guido pronounced that > it shall be the path to the first portion. So if you now want to change it, > you should check with him again. I recall that, and I also recall advocating None. I see the process as: - come to a consensus here - update the PEP, documenting this discussion - update the implementation - get Guido to rule on the PEP Eric. From brett at python.org Thu May 3 16:48:43 2012 From: brett at python.org (Brett Cannon) Date: Thu, 3 May 2012 10:48:43 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: On Thu, May 3, 2012 at 2:23 AM, Nick Coghlan wrote: > On Thu, May 3, 2012 at 2:37 PM, PJ Eby wrote: > > Still, code that expects to do something with a package's __file__ is > > *going* to break somehow with a namespace package, so it's probably > better > > for it to break sooner rather than later. > I'm going to roll my replies all into this email to keep things simple. 
So, to the people not wanting to set __file__, that (probably) won't fly because it has been documented for years that built-in modules are the only things that don't define __file__. Or we at least need to explain to people how to tell the difference in a backwards-compatible fashion (e.g. ``module.__name__ in sys.builtin_module_names``). > > My own preference is for markers like "", "" and > "". > So I would have said that had experience with the stdlib not big me on this. In my situation, the trace module was checking file, and if __file__ didn't contain "" or "" and then error out if it couldn't open the file. Now I updated it to startswith('<') and endswith('>'), but I wonder how many people made a similar whitelist approach. And while having __file__ set to None or non-existent will take about the same amount of time to fix, it is less prone to silly whitelisting like what the trace module had. > > They're significantly nicer to deal with when dumping module state for > diagnostic purposes. If I get a KeyError on __file__, or an > AttributeError on NoneType when all I'm trying to do is display data, > it's annoying. > > Standardising on a pattern also opens up the possibility of doing > something meaningful with it in get_data() later. One of the > guarantees of PEP 302 is that you should be able to do this: > > data_ref = os.path.join(__file__, relative_ref) > data = __loader__.get_data(data_ref) > > That should really only blow up in get_data(), *not* on the > os.path.join step. Ideally, you should also be able to do this: > > data_ref = os.path.join(mod.__file__, relative_ref) > data = mod.__loader__.get_data(data_ref) > > I see it as being similar to the mandatory file attribute on code > objects - placeholders like "" and "" are a lot more > informative when errors occur than just using None, even though > neither of them is a valid filesystem path. > But that's because there are no other introspection options to tell where the module originated, unlike modules which have __loader__. > > Cheers, > Nick.
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Thu May 3 17:00:26 2012 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 03 May 2012 11:00:26 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: <4FA29D8A.7020103@trueblade.com> On 5/3/2012 2:23 AM, Nick Coghlan wrote: > On Thu, May 3, 2012 at 2:37 PM, PJ Eby wrote: >> Still, code that expects to do something with a package's __file__ is >> *going* to break somehow with a namespace package, so it's probably better >> for it to break sooner rather than later. > > My own preference is for markers like "", "" and "". It looks like "" is indeed used, but built in modules do not set __file__. So I don't really see that as a precedent for setting it to something, but I do agree with most of your points below. > They're significantly nicer to deal with when dumping module state for > diagnostic purposes. If I get a KeyError on __file__, or an > AttributeError on NoneType when all I'm trying to do is display data, > it's annoying. > > Standardising on a pattern also opens up the possibility of doing > something meaningful with it in get_data() later. One of the > guarantees of PEP 302 if that you should be able to do this: > > data_ref = os.path.join(__file__, relative_ref) > data = __loader__.get_data(data_ref) > > That should really only blow up in get_data(), *not* on the > os.path.join step. 
Ideally, you should also be able to do this: > > data_ref = os.path.join(mod.__file__, relative_ref) > data = mod.__loader__.get_data(data_ref) While I embrace the pattern, I don't see how it could ever work for a namespace package. The defining quality is that the namespace package itself doesn't contain any files. And NamespaceLoader doesn't define get_data for this reason. > I see it as being similar to the mandatory file attribute on code > objects - placeholders like "" and "" are a lot more > informative when errors occur than just using None, even though > neither of them is a valid filesystem path. So the 4 options on the table are: 1. Add a (possibly meaningless) trailing slash character. 2. Use None. 3. Do not set it. 4. Set it to "". We'll discuss it today at our sprint. From brett at python.org Thu May 3 17:09:10 2012 From: brett at python.org (Brett Cannon) Date: Thu, 3 May 2012 11:09:10 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: On Thu, May 3, 2012 at 10:48 AM, Brett Cannon wrote: > > > On Thu, May 3, 2012 at 2:23 AM, Nick Coghlan wrote: > >> On Thu, May 3, 2012 at 2:37 PM, PJ Eby wrote: >> > Still, code that expects to do something with a package's __file__ is >> > *going* to break somehow with a namespace package, so it's probably >> better >> > for it to break sooner rather than later. >> > > I'm going to roll my replies all into this email to keep things simple. > > So, to the people not wanting to set __file__, that (probably) won't fly > because it has been documented for years that built-in modules are the only > things that don't define __file__. Or we at least need to explain to people > how to tell the difference in a backwards-compatible fashion (e.g. 
> ``module.__name__ in sys.builtin_module_names``). > > >> >> My own preference is for markers like "", "" and >> "". >> > > So I would have said that had experience with the stdlib not big me on > this. > That should say "So I would have agreed with that had my experience with the stdlib in bootstrapping importlib not caused me to disagree." Don't try to multi-task at work while in the middle of writing an email is the lesson there. =) -Brett In my situation, the trace module was checking file, and if __file__ didn't > contain "" or " then error out if it couldn't open the file. Now I updated it to > startswith('<') and endswith('>'), but I wonder how many people made a > similar whitelist approach. And while having __file__ to None or > non-existent will take about the same amount of time to fix, it is less > prone to silly whitelisting like what the trace module had. > > >> >> They're significantly nicer to deal with when dumping module state for >> diagnostic purposes. If I get a KeyError on __file__, or an >> AttributeError on NoneType when all I'm trying to do is display data, >> it's annoying. >> >> Standardising on a pattern also opens up the possibility of doing >> something meaningful with it in get_data() later. One of the >> guarantees of PEP 302 if that you should be able to do this: >> >> data_ref = os.path.join(__file__, relative_ref) >> data = __loader__.get_data(data_ref) >> >> That should really only blow up in get_data(), *not* on the >> os.path.join step. Ideally, you should also be able to do this: >> >> data_ref = os.path.join(mod.__file__, relative_ref) >> data = mod.__loader__.get_data(data_ref) >> >> I see it as being similar to the mandatory file attribute on code >> objects - placeholders like "" and "" are a lot more >> informative when errors occur than just using None, even though >> neither of them is a valid filesystem path. 
>> > > But that's because there are no other introspection options to tell where > the module originated, unlike modules which have __loader__. > > >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Import-SIG mailing list >> Import-SIG at python.org >> http://mail.python.org/mailman/listinfo/import-sig >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Thu May 3 18:11:00 2012 From: pje at telecommunity.com (PJ Eby) Date: Thu, 3 May 2012 12:11:00 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: On Thu, May 3, 2012 at 2:23 AM, Nick Coghlan wrote: > Standardising on a pattern also opens up the possibility of doing > something meaningful with it in get_data() later. One of the > guarantees of PEP 302 if that you should be able to do this: > > data_ref = os.path.join(__file__, relative_ref) > data = __loader__.get_data(data_ref) > Um, namespace package modules shouldn't have a __loader__ either, should they? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From barry at python.org Thu May 3 18:15:41 2012 From: barry at python.org (Barry Warsaw) Date: Thu, 3 May 2012 12:15:41 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: <20120503121541.6b5ff385@resist.wooz.org> On May 03, 2012, at 10:48 AM, Brett Cannon wrote: >So, to the people not wanting to set __file__, that (probably) won't fly >because it has been documented for years that built-in modules are the only >things that don't define __file__. Okay, but *why* is this the rule, other than that PEP 302 says it? IOW, PEP 302 doesn't give much of a rationale for the rule, and I suspect it just reflected the reality back in 2002. >Or we at least need to explain to people how to tell the difference in a >backwards-compatible fashion. Definitely, and I think that would be fine to include in PEP 420. >So I would have said that had experience with the stdlib not big me on >this. In my situation, the trace module was checking file, and if __file__ >didn't contain "" or "and then error out if it couldn't open the file. Now I updated it to >startswith('<') and endswith('>'), but I wonder how many people made a >similar whitelist approach. And while having __file__ to None or >non-existent will take about the same amount of time to fix, it is less >prone to silly whitelisting like what the trace module had. See what I mean about arbitrary and underdocumented? :) Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From brett at python.org Thu May 3 18:47:39 2012 From: brett at python.org (Brett Cannon) Date: Thu, 3 May 2012 12:47:39 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: On Thu, May 3, 2012 at 12:11 PM, PJ Eby wrote: > On Thu, May 3, 2012 at 2:23 AM, Nick Coghlan wrote: > >> Standardising on a pattern also opens up the possibility of doing >> something meaningful with it in get_data() later. One of the >> guarantees of PEP 302 if that you should be able to do this: >> >> data_ref = os.path.join(__file__, relative_ref) >> data = __loader__.get_data(data_ref) >> > > Um, namespace package modules shouldn't have a __loader__ either, should > they? > No, they should (and PEP 302 now requires that). Namespace modules are loaded by a loader, and thus should have it defined. It's all the other optional interfaces that they don't need to have (e.g. NamespaceLoader should have importlib.abc.Loader and probably none of the other ABCs). > > > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Thu May 3 18:49:23 2012 From: brett at python.org (Brett Cannon) Date: Thu, 3 May 2012 12:49:23 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120503121541.6b5ff385@resist.wooz.org> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120503121541.6b5ff385@resist.wooz.org> Message-ID: On Thu, May 3, 2012 at 12:15 PM, Barry Warsaw wrote: > On May 03, 2012, at 10:48 AM, Brett Cannon wrote: > > >So, to the people not wanting to set __file__, that (probably) won't fly > >because it has been documented for years that built-in modules are the only > >things that don't define __file__. > > Okay, but *why* is this the rule, other than that PEP 302 says it? IOW, PEP > 302 doesn't give much of a rationale for the rule, and I suspect it just > reflected the reality back in 2002. > Exactly. I am willing to bet that historically it's just because that was the only way you could tell what was or was not a built-in module. > > >Or we at least need to explain to people how to tell the difference in a > >backwards-compatible fashion. > > Definitely, and I think that would be fine to include in PEP 420. > > >So I would have said that had experience with the stdlib not big me on > >this. In my situation, the trace module was checking file, and if __file__ > >didn't contain "" or " >and then error out if it couldn't open the file. Now I updated it to > >startswith('<') and endswith('>'), but I wonder how many people made a > >similar whitelist approach. And while having __file__ to None or > >non-existent will take about the same amount of time to fix, it is less > >prone to silly whitelisting like what the trace module had. > > See what I mean about arbitrary and underdocumented?
:) > Don't remind me about "arbitrary and underdocumented" when it comes to the import system. =P -Brett > > Cheers, > -Barry From ncoghlan at gmail.com Fri May 4 00:20:16 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 4 May 2012 08:20:16 +1000 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: I'd still prefer to just officially bless the existing "" convention for non-filesystem imports over encouraging type checks on __loader__ or defining a new introspection interface for loaders. If we say "this is the stdlib convention" people are going to start using the same check as is now used in traceback.py. The precedent is there with code objects, and I think it's a good example to follow. Cheers, Nick. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Fri May 4 00:43:40 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 3 May 2012 15:43:40 -0700 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: +1 On Thu, May 3, 2012 at 3:20 PM, Nick Coghlan wrote: > I'd still prefer to just officially bless the existing "" > convention for non-filesystem imports over encouraging type checks on > __loader__ or defining a new introspection interface for loaders. > > If we say "this is the stdlib convention" people are going to start using > the same check as is now used in traceback.py > > The precedent is there with code objects, and I think it's a good example to > follow. > > Cheers, > Nick. > > -- > Sent from my phone, thus the relative brevity :) > > > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig > -- --Guido van Rossum (python.org/~guido) From pje at telecommunity.com Fri May 4 02:05:15 2012 From: pje at telecommunity.com (PJ Eby) Date: Thu, 3 May 2012 20:05:15 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: On Thu, May 3, 2012 at 6:20 PM, Nick Coghlan wrote: > I'd still prefer to just officially bless the existing "" > convention for non-filesystem imports over encouraging type checks on > __loader__ or defining a new introspection interface for loaders. 
> > If we say "this is the stdlib convention" people are going to start using > the same check as is now used in traceback.py > > The precedent is there with code objects, and I think it's a good example > to follow. > Note that this messes with the idea of using the first directory as filename -- anybody who joins with os.path.dirname(__file__) is going to get a mess (on regular filesystem paths), which is (I'm guessing) why the trailing separator idea was proposed in the first place. Which kind of brings us full circle on that point. I suppose we could just say screw it, anybody implementing VFS importers had darn well better understand os.path.join and friends, since PEP 302 requires it for get_data anyway. Still seems like a wart, but oh well. OTOH, maybe it's better for people munging __file__ to get a weird error all the time with namespace packages, instead of something that works some of the time, and fails later? -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Fri May 4 02:11:02 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Fri, 04 May 2012 02:11:02 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120503121541.6b5ff385@resist.wooz.org> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120503121541.6b5ff385@resist.wooz.org> Message-ID: <20120504021102.Horde.4iA2c9jz9kRPox6WHie3KUA@webmail.df.eu> Zitat von Barry Warsaw : > On May 03, 2012, at 10:48 AM, Brett Cannon wrote: > >> So, to the people not wanting to set __file__, that (probably) won't fly >> because it has been documented for years that built-in modules are the only >> things that don't define __file__. > > Okay, but *why* is this the rule, other than that PEP 302 says it? 
I think it predates PEP 302 by a decade or so. You might also ask why the keyword is "def", and not "define" (other than that the Grammar says so). It's a natural thing, also: If the module comes from the file system, it has an __file__ attribute, else it's built-in. Regards, Martin From ncoghlan at gmail.com Fri May 4 03:05:16 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 4 May 2012 11:05:16 +1000 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: On Fri, May 4, 2012 at 10:05 AM, PJ Eby wrote: > On Thu, May 3, 2012 at 6:20 PM, Nick Coghlan wrote: >> >> I'd still prefer to just officially bless the existing "" >> convention for non-filesystem imports over encouraging type checks on >> __loader__ or defining a new introspection interface for loaders. >> >> If we say "this is the stdlib convention" people are going to start using >> the same check as is now used in traceback.py >> >> The precedent is there with code objects, and I think it's a good example >> to follow. > > Note that this messes with the idea of using the first directory as filename > -- anybody who joins with os.path.dirname(__file__) is going to get a mess > (on regular filesystem paths), which is (I'm guessing) why the trailing > separator idea was proposed in the first place. > > Which kind of brings us full circle on that point. I suppose we could just > say screw it, anybody implementing VFS importers had darn well better > understand os.path.join and friends, since PEP 302 requires it for get_data > anyway. Yep. It also means VFS importers are officially free to put all the metadata they want inside the angle brackets, secure in the knowledge that everyone else should be treating it as an opaque blob.
It then becomes a way for them to pass necessary info to get_data() *without* having to create distinct loader instances for every module. Arguably, we should also be adding the angle brackets in zipimporter (since those aren't real filesystem paths). > Still seems like a wart, but oh well. OTOH, maybe it's better for people > munging __file__ to get a weird error all the time with namespace packages, > instead of something that works some of the time, and fails later? Right. Otherwise we'd get layout-dependent behaviour where dubious cross-portion references worked if all portions were installed to the same path segment, but then failed if they were split across multiple segments. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Fri May 4 03:21:44 2012 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 03 May 2012 21:21:44 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: <4FA32F28.5040200@trueblade.com> On 05/03/2012 09:05 PM, Nick Coghlan wrote: > On Fri, May 4, 2012 at 10:05 AM, PJ Eby wrote: >> Still seems like a wart, but oh well. OTOH, maybe it's better for people >> munging __file__ to get a weird error all the time with namespace packages, >> instead of something that works some of the time, and fails later? > > Right. Otherwise we'd get layout dependent behaviour where dubious > cross-portion references worked if all portions were installed to the > same path segment, but then failed if they were split across multiple > segments. Under no circumstances should anyone be looking at __file__ for a namespace package in order to find a related file. We should do something that causes this to always break. Eric.
From barry at python.org Fri May 4 16:34:50 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 4 May 2012 10:34:50 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: <20120504103450.58286b0c@limelight.wooz.org> On May 04, 2012, at 08:20 AM, Nick Coghlan wrote: >I'd still prefer to just officially bless the existing "" >convention for non-filesystem imports over encouraging type checks on >__loader__ or defining a new introspection interface for loaders. The thing is, that convention is at best meaningless and at worst misleading. I also don't think it gives you all the diagnosis support you really want. The PEP 302 rule (reservation of no __file__ only for built-ins) is a historical relic for which no good rationale exists. Forgetting that for a moment, it simply makes no sense for a module that wasn't loaded from a file system path to have an __file__ attribute. It's also not true even today. At our PEP 420 sprint we noticed importlib does something like this to create new modules: >>> type(sys)('foo') That module isn't a built-in and doesn't have an __file__. It also doesn't have an __loader__, but oh well. (BTW, Brett, that's pretty clever. :) It seemed to us that the only reasonable semantics for such modules is that __file__ is None or __file__ is missing. Not setting __file__ is better though because you get appropriate exceptions at the place where you make the initial mistake (i.e. assuming every module has an __file__). If you set __file__ to None, you may instead get cryptic messages in os.path.join() for example. So, what about the "diagnostics" use case? Certainly a very important use case is the repr of module objects. 
In the case of modules loaded from the file system, I definitely want to know where the file lives, and the repr is a great way to see that. For other modules, you do want to know something about how that module was created, and having a repr that gives a good indication of that is very useful. But you can easily do that without a contrived __file__ (more on that below). What about other introspection use cases? Relying on __file__ programmatically might be a convenient shorthand, but knowing the loader (via __loader__ if available) is more helpful, because that tells you more about how that module actually came into existence. The value of __file__ is really under the purview of the loader anyway. Consider a hypothetical database loader (or even many different third-party database loaders). Of what use is an __file__ that says ''? That way leads to uncertainty, and namespace collisions, for example if both a SQLite loader and a PostgreSQL loader wanted to use the '' value. In either case, maybe you'd prefer to know what the database URL is, or maybe the query that produced the module, or some combination thereof. Overloading all that into a contrived __file__ seems wrong. I would prefer if the requirement were relaxed, and we simply allowed the loaders to set __file__ to whatever they think is appropriate, which would include allowing them not to set __file__ at all. It's actually easy to give modules a reasonable repr even without __file__. I have a branch in the PEP 420 feature repo which implements the following rules for module object reprs:
* Use mod.__file__ if it exists
* Otherwise, get the module's __loader__
* If the module has no loader, then just return the module's name. E.g. >>> type(sys)('foo')
* Define a new optional method on loaders, called module_repr() that takes the module as an argument. Use whatever this returns as the module's repr.
* As a last fallback, just use the repr of the loader as part of the module's repr.
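The fallback order listed above can be sketched roughly as follows. This is not the code from the feature branch: the helper name `sketch_module_repr` and the exact output strings are assumptions made for illustration, and only the order of the checks is taken from the list.

```python
import types


def sketch_module_repr(module):
    """Hypothetical helper that follows the fallback order listed above."""
    name = getattr(module, "__name__", "?")

    # Rule 1: use __file__ when the module has one.
    # Assumption: a __file__ of None is treated the same as a missing one.
    filename = getattr(module, "__file__", None)
    if filename is not None:
        return "<module {!r} from {!r}>".format(name, filename)

    # Rules 2 and 3: look for a loader; with no loader, show just the name.
    loader = getattr(module, "__loader__", None)
    if loader is None:
        return "<module {!r}>".format(name)

    # Rule 4: let the loader describe the module through the optional hook.
    hook = getattr(loader, "module_repr", None)
    if hook is not None:
        return hook(module)

    # Rule 5: last fallback, embed the loader's own repr.
    return "<module {!r} ({!r})>".format(name, loader)


if __name__ == "__main__":
    print(sketch_module_repr(types.ModuleType("foo")))
```

The point of the ordering is that a loader only has to implement `module_repr()` when it has something better to say than a file path; every other module keeps a usable repr without a contrived `__file__`.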
I'm not particularly married to this implementation, but it seems reasonably backward compatible, and flexible enough to support useful alternatives. For example, the BuiltinImporter could define its module_repr() like so: @classmethod def module_repr(cls, module): return ''.format(module.__name__) Specifically, my proposed elaboration on PEP 420 is this: * Explicitly leave the assignment of __file__ to the loader. * Allow loaders to not set __file__ * Add an optional API to loaders, module_repr() as defined above. Cheers, -Barry From barry at python.org Fri May 4 16:51:49 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 4 May 2012 10:51:49 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120504021102.Horde.4iA2c9jz9kRPox6WHie3KUA@webmail.df.eu> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120503121541.6b5ff385@resist.wooz.org> <20120504021102.Horde.4iA2c9jz9kRPox6WHie3KUA@webmail.df.eu> Message-ID: <20120504105149.472a2f61@limelight.wooz.org> On May 04, 2012, at 02:11 AM, martin at v.loewis.de wrote: >I think it predates PEP 302 by a decade or so. You might also ask why >the keyword is "def", and not "define" (other than that the Grammar says >so). It's a natural thing, also: If the module comes from the file system, >it has an __file__ attribute, else it's built-in. Sure, that makes sense in a 2002 world where we didn't have importlib and all the modernization of the import system. Today, it's not only antiquated, it's also not necessarily true. We're already significantly overhauling the import machinery, so I think it's entirely reasonable to relax this constraint. See my previous post for a proposal. 
-Barry From barry at python.org Fri May 4 16:56:56 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 4 May 2012 10:56:56 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> Message-ID: <20120504105656.11fca0e9@limelight.wooz.org> On May 04, 2012, at 11:05 AM, Nick Coghlan wrote: >Yep. It also means VFS importers are officially free to put all the >metadata they want inside the angle brackets, secure in the knowledge >that everyone else should be treating it as an opaque blob. It then >becomes a way for them to pass necessary info to get_data() *without* >having to create distinct loader instances for every module. Ooh! I can't wait for the __file__ set to a pickle to steganographically communicate secret messages to get_data(). :) -Barry From pje at telecommunity.com Fri May 4 16:56:56 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 4 May 2012 10:56:56 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120504103450.58286b0c@limelight.wooz.org> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> Message-ID: On May 4, 2012 10:34 AM, "Barry Warsaw" wrote: > Specifically, my proposed elaboration on PEP 420 is this: > > * Explicitly leave the assignment of __file__ to the loader. > * Allow loaders to not set __file__ > * Add an optional API to loaders, module_repr() as defined above. +1 on all the above, plus getting rid of __file__ for namespace packages. 
Seems like an elegant solution to the problems involved, and allows DB or other importers to make their own attributes like __dsn__ or __url__, but still have a decent repr.

From eric at trueblade.com Fri May 4 17:13:48 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Fri, 04 May 2012 11:13:48 -0400
Subject: [Import-SIG] PEP 420 sprint report
Message-ID: <4FA3F22C.9090003@trueblade.com>

Yesterday Jason Coombs, Barry Warsaw, and I met for about 6 hours of sprinting on PEP 420. We added a test framework and added tests for namespace packages using the filesystem loader and zipimport loader. We flushed out a bug in zipimport's namespace finder support as part of this.

We identified the following issues which need to get resolved before the PEP is ruled on:

1. What about __file__? Barry is currently discussing this in the other thread.

2. Parent path modification detection. I'm still thinking this one over. I'm going to look into whipping up a sample implementation.

I think these can all be resolved this weekend, so we'll ask that a ruling be made on the PEP next week. Please let me know if you have other PEP (not implementation) concerns.

There are also these quality of implementation issues that I don't think need to get addressed before PEP 420 is ruled on:

1. Documentation.

2. More tests. We need to test namespace packages as sub-packages, not just top level.

3. The zipimport finder currently looks for "path/" to detect if a 'directory' exists and could be a namespace portion. However, this is a valid zip file:

Archive:  namespace_pkgs/missing_directory.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2012-05-04 04:45   bar/
       35  2012-05-04 04:45   bar/two.py
       26  2012-05-04 04:45   foo/one.py
---------                     -------
       61                     3 files

The current code will treat "bar" as a possible portion, but not "foo".
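
One conceivable way to cope with an archive like this is to derive the directory names from the member names instead of relying on explicit "dir/" entries. The following is a minimal sketch under that assumption; it is not zipimport's code, and the helper name implied_directories is invented for the illustration.

```python
import io
import zipfile


def implied_directories(names):
    """Return every directory implied by a list of archive member names."""
    dirs = set()
    for name in names:
        parts = name.rstrip("/").split("/")
        # "bar/" names a directory itself; "foo/one.py" only implies "foo".
        if not name.endswith("/"):
            parts = parts[:-1]
        # Record each ancestor, so "a/b/c.py" implies both "a" and "a/b".
        for depth in range(1, len(parts) + 1):
            dirs.add("/".join(parts[:depth]))
    return dirs


# Rebuild the listing above in memory: "bar/" has an entry, "foo/" does not.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("bar/", "")
    archive.writestr("bar/two.py", "# two\n")
    archive.writestr("foo/one.py", "# one\n")

with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()

print(sorted(implied_directories(names)))   # ['bar', 'foo']
```

With a set like this built once per archive, both "bar" and "foo" would be visible as candidate portions, at the cost of one extra pass over the member names.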
We discussed a number of ways to address this, but I'm unconvinced they're worth the hassle and runtime expense. But in any event, it's an issue for another day and doesn't affect the PEP's acceptance one way or the other.

All of the code is checked in to features/pep-420.

Eric.

From ncoghlan at gmail.com Fri May 4 17:14:13 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 5 May 2012 01:14:13 +1000
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <20120504103450.58286b0c@limelight.wooz.org>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org>
Message-ID:

On Sat, May 5, 2012 at 12:34 AM, Barry Warsaw wrote:
> * Explicitly leave the assignment of __file__ to the loader.
> * Allow loaders to not set __file__
> * Add an optional API to loaders, module_repr() as defined above.

I can accept that approach on one condition: the PEP 420 implementation comes with the long-overdue migration of the definition of the import system semantics into the language reference.

The main sticking point preventing that in the past has been that nobody wanted to document all the caveats and special cases needed to accurately describe CPython's behaviour. For 3.3+, no such caveats are necessary, since Brett's importlib efforts mean that even the default import system follows the rules.

The proposed update will require changes to the description of the import semantics, anyway, so rather than making those changes directly in PEP 302, it would be better to document them in the language reference and update PEP 302 with a note to say that, for 3.3+, it is no longer the authoritative source.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From p.f.moore at gmail.com Fri May 4 17:16:14 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 4 May 2012 16:16:14 +0100
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <20120504105149.472a2f61@limelight.wooz.org>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120503121541.6b5ff385@resist.wooz.org> <20120504021102.Horde.4iA2c9jz9kRPox6WHie3KUA@webmail.df.eu> <20120504105149.472a2f61@limelight.wooz.org>
Message-ID:

On 4 May 2012 15:51, Barry Warsaw wrote:
> On May 04, 2012, at 02:11 AM, martin at v.loewis.de wrote:
>
>>I think it predates PEP 302 by a decade or so. You might also ask why
>>the keyword is "def", and not "define" (other than that the Grammar says
>>so). It's a natural thing, also: If the module comes from the file system,
>>it has an __file__ attribute, else it's built-in.
>
> Sure, that makes sense in a 2002 world where we didn't have importlib and all
> the modernization of the import system. Today, it's not only antiquated, it's
> also not necessarily true. We're already significantly overhauling the import
> machinery, so I think it's entirely reasonable to relax this constraint.

When we wrote PEP 302, so much code assumed that modules lived in the filesystem that we had very little room for manoeuvre. One of the goals of PEP 302 (in my mind, at least) was to disrupt the mindset that assumed this. Now, Brett's implementation of importlib has made that a reality - code that assumes modules live in a filesystem should have a really good justification for doing so (and document the limitation, ideally).
I suspect you'll still break a reasonable amount of code like this, but that's probably OK, as it's less of a breakage, and more of a case of the existing code not anticipating cases that never existed before. > See my previous post for a proposal. +1 and I'd also explicitly allow for loaders to assign other "private" metadata as well as __file__, if only to avoid the spectre of __file__ being a base64-encoded pickled object :-) I wonder whether treating repr specially is the best way, though - maybe have a loader method "code_location" which is defined as being a human-readable, but otherwise unspecified string. The key use case is for repr, but it might be useful elsewhere (IDE tooltips or some such usage spring to mind). Paul. From eric at trueblade.com Fri May 4 17:17:13 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 04 May 2012 11:17:13 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> Message-ID: <4FA3F2F9.8020001@trueblade.com> On 05/04/2012 11:14 AM, Nick Coghlan wrote: > On Sat, May 5, 2012 at 12:34 AM, Barry Warsaw wrote: >> * Explicitly leave the assignment of __file__ to the loader. >> * Allow loaders to not set __file__ >> * Add an optional API to loaders, module_repr() as defined above. > > I can accept that approach on one condition: the PEP 420 > implementation comes with the long-overdue migration of the definition > of the import system semantics into the language reference. > > The main sticking point preventing that in the past has been that > nobody wanted to document all the caveats and special cases needed to > accurately describe CPython's behaviour. 
For 3.3+, no such caveats are
> necessary, since Brett's importlib efforts mean that even the default
> import system follows the rules.
>
> The proposed update will require changes to the description of the
> import semantics, anyway, so rather than making those changes directly
> in PEP 302, it would be better to document them in the language
> reference and update PEP 302 with a note to say that, for 3.3+, it is
> no longer the authoritative source.

We did discuss this yesterday at the sprint. I'm all for it, and I think the others were, too.

I'm not keen on tying all of this to PEP 420 acceptance or rejection, but it's not the end of the world.

Eric.

From p.f.moore at gmail.com Fri May 4 17:23:37 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 4 May 2012 16:23:37 +0100
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org>
Message-ID:

On 4 May 2012 16:14, Nick Coghlan wrote:
> On Sat, May 5, 2012 at 12:34 AM, Barry Warsaw wrote:
>> * Explicitly leave the assignment of __file__ to the loader.
>> * Allow loaders to not set __file__
>> * Add an optional API to loaders, module_repr() as defined above.
>
> I can accept that approach on one condition: the PEP 420
> implementation comes with the long-overdue migration of the definition
> of the import system semantics into the language reference.

That would be a *very* good idea. Whether PEP 420 should be held hostage to this, I don't know, but I think it should be targeted as a key item for 3.3. Just having a reference to what the language actually guarantees would be immensely useful.
I did actually try to do this once, but my head exploded :-) (I'd be willing to help out with it, but I don't know where it would fit in the docs - could anyone suggest a basic location and structure, and I could try to write some words to go into it?)

On a somewhat related note, does anyone know how well oddities like Jython's ability to import Java classes (and IronPython for .Net classes) fit any such rules?

Paul.

From fwierzbicki at gmail.com Fri May 4 18:00:52 2012
From: fwierzbicki at gmail.com (fwierzbicki at gmail.com)
Date: Fri, 4 May 2012 09:00:52 -0700
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To: <20120504103450.58286b0c@limelight.wooz.org>
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org>
Message-ID:

On Fri, May 4, 2012 at 7:34 AM, Barry Warsaw wrote:
> It's also not true even today. At our PEP 420 sprint we noticed importlib
> does something like this to create new modules:
>
>     >>> type(sys)('foo')
>
> That module isn't a built-in and doesn't have an __file__. It also
> doesn't have an __loader__, but oh well.
>
> (BTW, Brett, that's pretty clever. :)

Too clever for Jython at the moment :) -- which leads me to ask: Should I consider this a feature of the sys module? It doesn't look too hard to do, and I really want importlib to work when Jython starts on Jython3 (I'm hoping to seriously start that this summer - Jython 2.7 is progressing well).
-Frank

From brett at python.org Fri May 4 18:21:36 2012
From: brett at python.org (Brett Cannon)
Date: Fri, 4 May 2012 12:21:36 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org>
Message-ID:

On Fri, May 4, 2012 at 12:00 PM, fwierzbicki at gmail.com <fwierzbicki at gmail.com> wrote:

> On Fri, May 4, 2012 at 7:34 AM, Barry Warsaw wrote:
> > It's also not true even today. At our PEP 420 sprint we noticed importlib
> > does something like this to create new modules:
> >
> >     >>> type(sys)('foo')
> >
> > That module isn't a built-in and doesn't have an __file__. It also
> > doesn't have an __loader__, but oh well.
> >
> > (BTW, Brett, that's pretty clever. :)
> Too clever for Jython at the moment :) -- which leads me to ask:
> Should I consider this a feature of the sys module?

No, this is an ability of types.ModuleType (which I don't have access to in importlib, so I just inlined the call). This works for any module in CPython.

> It doesn't look
> too hard to do, and I really want importlib to work when Jython starts
> on Jython3 (I'm hoping to seriously start that this summer - Jython
> 2.7 is progressing well).
>

I've actually been meaning to email the various VMs to have them look over importlib to see if there are any sticking points that are obvious so we can fix them now instead of waiting until a point release when the first VM other than CPython tries to use importlib.

From fwierzbicki at gmail.com Fri May 4 18:28:55 2012
From: fwierzbicki at gmail.com (fwierzbicki at gmail.com)
Date: Fri, 4 May 2012 09:28:55 -0700
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org>
Message-ID:

On Fri, May 4, 2012 at 9:21 AM, Brett Cannon wrote:
>> Too clever for Jython at the moment :) -- which leads me to ask:
>> Should I consider this a feature of the sys module?
>
>
> No, this is an ability of types.ModuleType (which I don't have access to in
> importlib, so I just inlined the call). This works for any module in
> CPython.

Ah of course, and our ModuleType works just fine for this. The Jython sys module is fake, sadly. Perhaps 3.x will be the time to finally make it a real module... it's been a fake module with a comment at the top to make it a real module for longer than I've been involved.

BTW any real module works for us, for example:

    >>> type(os)('foo')

-Frank

From pje at telecommunity.com Fri May 4 18:50:11 2012
From: pje at telecommunity.com (PJ Eby)
Date: Fri, 4 May 2012 12:50:11 -0400
Subject: [Import-SIG] PEP 420 sprint report
In-Reply-To: <4FA3F22C.9090003@trueblade.com>
References: <4FA3F22C.9090003@trueblade.com>
Message-ID:

On Fri, May 4, 2012 at 11:13 AM, Eric V. Smith wrote:

> 3. The zipimport finder currently looks for "path/" to detect if a
> 'directory' exists and could be a namespace portion. However, this is a
However, this is a > valid zip file: > Archive: namespace_pkgs/missing_directory.zip > Length Date Time Name > --------- ---------- ----- ---- > 0 2012-05-04 04:45 bar/ > 35 2012-05-04 04:45 bar/two.py > 26 2012-05-04 04:45 foo/one.py > --------- ------- > 61 3 files > The current code will treat "bar" as a possible portion, but not "foo". > We discussed a number of ways to address this, but I'm unconvinced > they're worth the hassle and runtime expense. But in any event, it's an > issue for another day and doesn't affect the PEP's acceptance one way or > the other. > FYI, the zip files produced by distutils do not include the empty directory. Actually, I'm not sure when/where I've ever seen an empty directory listed in a zipfile. IMO, the no-explicit-directory case should be handled, if for no other reason than that it shouldn't randomly break depending on which archiving tool you used to create the zipfile with. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Fri May 4 18:57:28 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 04 May 2012 12:57:28 -0400 Subject: [Import-SIG] PEP 420 sprint report In-Reply-To: References: <4FA3F22C.9090003@trueblade.com> Message-ID: <4FA40A78.2020806@trueblade.com> On 05/04/2012 12:50 PM, PJ Eby wrote: > On Fri, May 4, 2012 at 11:13 AM, Eric V. Smith > wrote: > > 3. The zipimport finder currently looks for "path/" to detect if a > 'directory' exists and could be a namespace portion. However, this is a > valid zip file: > Archive: namespace_pkgs/missing_directory.zip > Length Date Time Name > --------- ---------- ----- ---- > 0 2012-05-04 04:45 bar/ > 35 2012-05-04 04:45 bar/two.py > 26 2012-05-04 04:45 foo/one.py > --------- ------- > 61 3 files > The current code will treat "bar" as a possible portion, but not "foo". > We discussed a number of ways to address this, but I'm unconvinced > they're worth the hassle and runtime expense. 
But in any event, it's an > issue for another day and doesn't affect the PEP's acceptance one way or > the other. > > > FYI, the zip files produced by distutils do not include the empty > directory. Actually, I'm not sure when/where I've ever seen an empty > directory listed in a zipfile. Interesting, thanks for the info. They are created if you use "zip -r" from a Linux box and it recurses into the directory. But it's definitely possible to create them without the empty directory if you explicitly list the files, or of course you can just delete them after the fact (which is what I did here). > IMO, the no-explicit-directory case should be handled, if for no other > reason than that it shouldn't randomly break depending on which > archiving tool you used to create the zipfile with. I agree. It's just that I'm not likely to get to it in the next few weeks. Hopefully I'll delay long enough that someone smarter than me will rewrite zipimport in Python (http://bugs.python.org/issue14678?@ok_message=issue 14678). I started with Python so I wouldn't have to write any more C! Eric. From martin at v.loewis.de Fri May 4 19:00:01 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Fri, 04 May 2012 19:00:01 +0200 Subject: [Import-SIG] PEP 420 sprint report In-Reply-To: References: <4FA3F22C.9090003@trueblade.com> Message-ID: <20120504190001.Horde.G7ZKSML8999PpAsRP9YVMTA@webmail.df.eu> > IMO, the no-explicit-directory case should be handled, if for no other > reason than that it shouldn't randomly break depending on which archiving > tool you used to create the zipfile with. I agree. IIRC, the zip importer creates a cached list/dictionary of the zip directory, anyway; while doing so, it could easily synthesize the directory names. Regards, Martin From eric at trueblade.com Fri May 4 19:07:42 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 04 May 2012 13:07:42 -0400 Subject: [Import-SIG] PEP 420 sprint report In-Reply-To: <20120504190001.Horde.G7ZKSML8999PpAsRP9YVMTA@webmail.df.eu> References: <4FA3F22C.9090003@trueblade.com> <20120504190001.Horde.G7ZKSML8999PpAsRP9YVMTA@webmail.df.eu> Message-ID: <4FA40CDE.3080207@trueblade.com> On 05/04/2012 01:00 PM, martin at v.loewis.de wrote: >> IMO, the no-explicit-directory case should be handled, if for no other >> reason than that it shouldn't randomly break depending on which archiving >> tool you used to create the zipfile with. > > I agree. IIRC, the zip importer creates a cached list/dictionary of the > zip directory, anyway; while doing so, it could easily synthesize the > directory names. Correct. It builds a dictionary. It could create another dictionary (or set is all I really need) with all directories, found or synthesized. Eric. From brett at python.org Fri May 4 19:32:53 2012 From: brett at python.org (Brett Cannon) Date: Fri, 4 May 2012 13:32:53 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> Message-ID: On Fri, May 4, 2012 at 12:28 PM, fwierzbicki at gmail.com < fwierzbicki at gmail.com> wrote: > On Fri, May 4, 2012 at 9:21 AM, Brett Cannon wrote: > >> Too clever for Jython at them moment :) -- which leads me to ask: > >> Should I consider this a a feature of the sys module? > > > > > > No, this is an ability of types.ModuleType (which I don't have access to > in > > importlib, so I just inlined the call). This works for any module in > > CPython. > Ah of course, and our ModuleType works just fine for this. The Jython > sys module is fake sadly. 
Perhaps 3.x will be the time to finally make
> it a real module... it's been a fake module with a comment at the top
> to make it a real module for longer than I've been involved.
>
> BTW any real module works for us, for example:
>
> >>> type(os)('foo')
>
>

OK, so of the CPython built-in modules that importlib uses (sys, _imp, _warnings, _io, marshal, builtins, posix/nt), which are actual modules in Jython?

From barry at python.org Fri May 4 21:07:37 2012
From: barry at python.org (Barry Warsaw)
Date: Fri, 4 May 2012 15:07:37 -0400
Subject: [Import-SIG] PEP 420: Implicit Namespace Packages
In-Reply-To:
References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org>
Message-ID: <20120504150737.7a5131ab@resist.wooz.org>

On May 05, 2012, at 01:14 AM, Nick Coghlan wrote:

>On Sat, May 5, 2012 at 12:34 AM, Barry Warsaw wrote:
>> * Explicitly leave the assignment of __file__ to the loader.
>> * Allow loaders to not set __file__
>> * Add an optional API to loaders, module_repr() as defined above.
>
>I can accept that approach on one condition: the PEP 420
>implementation comes with the long-overdue migration of the definition
>of the import system semantics into the language reference.

I think you were listening in on our sprint, Nick! :)

One of the downsides of the PEP process is that sometimes the PEP will end up being the definitive documentation for a new feature. This sucks for many reasons, including that PEPs don't live in the source tree and they end up getting pretty out-of-date as time goes by. PEP 302 suffers quite a bit from historical rot, but also from lots of superfluous text that doesn't make it easy to understand exactly what is going on.
At our sprint, we all agreed that it would be much better for there to be documentation about the import system's semantics in the language reference guide. I think "Import System" is important enough to warrant a top-level chapter, probably either before or after "Execution Model". Section 6.11 describes the import statement, but I'd probably refactor large bits of that into the "Import System" chapter, and leave $6.11 to describe the import statement specifically. I mentioned at the sprint that I'd be willing to work on such a document. It's likely more than a one-person-operation, but I'd be happy to take a crack at a first draft once PEP 420 gets accepted. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Fri May 4 21:11:05 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 4 May 2012 15:11:05 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> Message-ID: <20120504151105.3d953080@resist.wooz.org> On May 04, 2012, at 10:56 AM, PJ Eby wrote: >On May 4, 2012 10:34 AM, "Barry Warsaw" wrote: >> Specifically, my proposed elaboration on PEP 420 is this: >> >> * Explicitly leave the assignment of __file__ to the loader. >> * Allow loaders to not set __file__ >> * Add an optional API to loaders, module_repr() as defined above. > >+1 on all the above, plus getting rid of __file__ for namespace packages. >Seems like an elegant solution to the problems involved, and allows DB or >other importers to make their own attributes like __dsn__ or __url__, but >still have a decent repr. Yes, exactly. 
It seems like there's general consensus about the basic proposal; I'll update the PEP so Guido has specific language to pronounce on. I want to make one change to what I posted. If m.__loader__.module_repr() exists, I want to give it a first crack at producing the repr. This means that __file__ is used as a fallback, not as the first step. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From fwierzbicki at gmail.com Fri May 4 21:44:29 2012 From: fwierzbicki at gmail.com (fwierzbicki at gmail.com) Date: Fri, 4 May 2012 12:44:29 -0700 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> Message-ID: Sorry for the dup Brett - I still mess up on the new gmail interface sometimes :( On Fri, May 4, 2012 at 10:32 AM, Brett Cannon wrote: > OK, so of the CPython built-in modules that importlib uses (sys, _imp, > _warnings, _io, marshal, builtins, posix/nt), which are an actual module in > Jython? I'll start with the bad: builtins would be hard to turn into a module - however __builtin__ is a module and works well. nt is not likely to get implemented, we pretend nt is a posix with missing bits. The ok: posix is not a currently a true module, but can probably be turned into one without too much trouble -- I will need to investigate. _imp is not exposed as a module, but I think this will be a necessary and acceptable step to integrate with importlib (and I don't think it should be too hard given the benefits). The good: marshal and _io are already true modules. 
_warnings will be when I get around to implementing it - probably next week :) -- if I run out of time it may end up just being the same as the python version (but that will still make it a true module). -Frank From barry at python.org Fri May 4 21:52:58 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 4 May 2012 15:52:58 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120503121541.6b5ff385@resist.wooz.org> <20120504021102.Horde.4iA2c9jz9kRPox6WHie3KUA@webmail.df.eu> <20120504105149.472a2f61@limelight.wooz.org> Message-ID: <20120504155258.45ea89aa@resist.wooz.org> On May 04, 2012, at 04:16 PM, Paul Moore wrote: >+1 and I'd also explicitly allow for loaders to assign other "private" >metadata as well as __file__, if only to avoid the spectre of __file__ >being a base64-encoded pickled object :-) That's in PEP 420 now too. >I wonder whether treating repr specially is the best way, though - >maybe have a loader method "code_location" which is defined as being a >human-readable, but otherwise unspecified string. The key use case is >for repr, but it might be useful elsewhere (IDE tooltips or some such >usage spring to mind). Maybe, but I think this is the simplest thing possible, which solves an existing use case. :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Fri May 4 21:56:51 2012 From: barry at python.org (Barry Warsaw) Date: Fri, 4 May 2012 15:56:51 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <4FA3F2F9.8020001@trueblade.com> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> <4FA3F2F9.8020001@trueblade.com> Message-ID: <20120504155651.1f661364@resist.wooz.org> On May 04, 2012, at 11:17 AM, Eric V. Smith wrote: >I'm not keen on tying all of this to PEP 420 acceptance or rejection, >but it's not the end of the world. I think the PEP should be pronounced on before the documentation is written. If Guido wants to make changes to the spec, it's better not to waste effort. Are there any more open issues? Are we ready to ask Guido to pronounce? I think the feature branch is in pretty good shape, but we can delay merging it to the main trunk (assuming the PEP gets accepted) until we have more tests and a first draft of the import semantics documentation. I don't mind working in the feature branch for a little while longer. 
Cheers, -Barry From pje at telecommunity.com Fri May 4 23:02:16 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 4 May 2012 17:02:16 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120504155651.1f661364@resist.wooz.org> References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> <4FA3F2F9.8020001@trueblade.com> <20120504155651.1f661364@resist.wooz.org> Message-ID: On Fri, May 4, 2012 at 3:56 PM, Barry Warsaw wrote: > Are there any more open issues? Maybe not on this particular subproposal, but IIUC, Eric was still looking at the feasibility of doing auto-updates when parent paths change. (Unless I'm mistaken, my sketch for PEP 402 should only need a bit of hacking to allow setting the initial calculated path, so that there's not an extra scan when a namespace package is initialized, and a change to make it use find_module() instead of PEP 402's get_subpath(). Well, that, and renaming "virtual packages" back to "namespace packages" in the error messages and such.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Fri May 4 23:13:47 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 04 May 2012 17:13:47 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <4FA05CFC.6050609@trueblade.com> <4FA0DF6C.4090709@v.loewis.de> <4FA10B15.1000302@trueblade.com> <4FA16DC5.1000204@trueblade.com> <4FA1D833.20208@trueblade.com> <20120502212355.6bda4cd4@resist.wooz.org> <20120504103450.58286b0c@limelight.wooz.org> <4FA3F2F9.8020001@trueblade.com> <20120504155651.1f661364@resist.wooz.org> Message-ID: <4FA4468B.7040105@trueblade.com> On 5/4/2012 5:02 PM, PJ Eby wrote: > On Fri, May 4, 2012 at 3:56 PM, Barry Warsaw > wrote: > > Are there any more open issues? > > > Maybe not on this particular subproposal, but IIUC, Eric was still > looking at the feasibility of doing auto-updates when parent paths change. > > (Unless I'm mistaken, my sketch for PEP 402 should only need a bit of > hacking to allow setting the initial calculated path, so that there's > not an extra scan when a namespace package is initialized, and a change > to make it use find_module() instead of PEP 402's get_subpath(). Well, > that, and renaming "virtual packages" back to "namespace packages" in > the error messages and such.) I'm looking at it and have it mostly implemented for PEP 420. I still need to refactor out some code so I can re-use the path-building code that's currently in PathFinder.find_module. It looks simple enough. Eric. From solipsis at pitrou.net Sat May 5 00:47:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 00:47:11 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages References: <4F90730D.1040808@trueblade.com> Message-ID: <20120505004711.2140afbf@pitrou.net> Hello, On Thu, 19 Apr 2012 16:18:21 -0400 "Eric V. Smith" wrote: > This reflects (I hope!) the discussions at PyCon. My plan is to produce > an implementation based on the importlib code, and then flush out pieces > of the PEP. I don't understand why PEP 382 was rejected. 
There doesn't seem to be any obvious argument against it. The mechanism is simple, explicit and unambiguous. As PEP 382 points out: "At the discussion at PyCon DE 2011, people remarked that having an explicit declaration of a directory as contributing to a package is a desirable property, rather than an obstacle. In particular, Jython developers noticed that Jython could easily mistake a directory that is a Java package as being a Python package, if there is no need to declare Python packages." The "directory.pyp" scheme is highly unlikely to conflict with unrelated uses of a ".pyp" directory extension. It's also easy to use, and avoids oddities in the lookup algorithm such as "if the scan completes without returning a module or package, and at least one directory was recorded, then a namespace package is created". On the other hand, PEP 420 provides potential for confusion (for example, if the standard "test" package is not installed, trying to import it could end up importing some other arbitrary "test" directory on the path as a namespace package), without seeming to have any obvious advantage over PEP 382. Unless there are clear advantages over PEP 382, I'm -1 on this PEP, and would like to see PEP 382 revived. Regards Antoine. From ncoghlan at gmail.com Sat May 5 08:27:26 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 5 May 2012 16:27:26 +1000 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505004711.2140afbf@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> Message-ID: On Sat, May 5, 2012 at 8:47 AM, Antoine Pitrou wrote: > Unless there are clear advantages over PEP 382, I'm -1 on this PEP, and > would like to see PEP 382 revived. I raised this question as well, and the PEP as written doesn't do a great job of summarising the thread that addressed it. There were two counterpoints raised that I found compelling: A. Guido simply doesn't like directory extensions.
I have to agree with him that using them to handle packaging would be a weird and unusual approach, and, well, he *does* get to play the BDFL card in cases like this. B. Current version control systems are still pretty abysmal when it comes to coping with directory renames, and we want to avoid unnecessary stumbling blocks on the migration path from the current pkgutil.extend_path() based namespace packages to the new native system. With PEP 382, the migration path is: 1. delete all __init__.py files from namespace package portions 2. rename the directories for all namespace package portions to append the ".pyp" extension With PEP 420, the migration path is: 1. delete all __init__.py files from namespace package portions 2. there is no step 2 The extra step required by the PEP 382 approach is exactly the kind of pointless revision history noise that PEP 414's reintroduction of explicit Unicode literals is designed to eliminate from Python 2 to Python 3 migrations. Between "Guido doesn't like directory suffixes" and "version control systems are still fairly bad at handling directory renames", I changed my own opinion on PEP 420 from -1 to +0. If we'd been starting from a clean slate with no language history or migration of existing projects to account for, then my opinion would be different, but given where we are today, I find the pragmatic argument in favour of simply losing the explicit markers compelling. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Sat May 5 09:18:13 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Sat, 05 May 2012 09:18:13 +0200 Subject: [Import-SIG] PEP 420 issue: standard namespace packages Message-ID: <20120505091813.Horde.angpY8L8999PpNQ1Vz8hCnA@webmail.df.eu> I'd like the PEP to rule that the standard library may designate some of its packages as namespace packages, and also specifically declare the encodings package as a namespace package.
This would allow to install additional encodings just by mere installation, without the need of having a search function registered at startup. Not sure what other packages would be candidates for namespace packages. Regards, Martin From solipsis at pitrou.net Sat May 5 12:33:03 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 12:33:03 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> Message-ID: <20120505123303.29c3f4bb@pitrou.net> On Sat, 5 May 2012 16:27:26 +1000 Nick Coghlan wrote: > On Sat, May 5, 2012 at 8:47 AM, Antoine Pitrou wrote: > > Unless there are clear advantages over PEP 382, I'm -1 on this PEP, and > > would like to see PEP 382 revived. > > I raised this question as well, and the PEP as written doesn't do a > great job of summarising the thread that addressed it. > > There were two counterpoints raised that I found compelling: > > A. Guido simply doesn't like directory extensions. I have to agree > with him that using them to handle packaging would be a weird and > unusual approach, and, well, he *does* get to play the BDFL card in > cases like this. Well, I agree that "foo.pyp" isn't very pretty, but that's a pretty minor argument. At least it's explicit. (of course, another marker could have been chosen: for example an empty "foo/__namespace__.py", or whatever else floats our boat of aesthetics) > B. Current version control systems are still pretty abysmal when it > comes to coping with directory renames, and we want to avoid > unnecessary stumbling blocks on the migration path from the current > pkgutil.extend_path() based namespace packages to the new native > system. Isn't that baseless? AFAIU all modern DVCS should cope correctly with a directory rename. Even SVN may be ok. If anything, I'd like to see data points about these "current version control systems" being "pretty abysmal [!] 
when it comes to coping with directory renames". (preferably something else than a 2007 rant by Mark Shuttleworth in order to justify bzr's existence :-)) > The extra step required by the PEP 382 approach is exactly the kind of > pointless revision history noise that PEP 414's reintroduction of > explicit Unicode literals is designed to eliminate from Python 2 to > Python 3 migrations. Except that no one *has* to migrate to namespace packages. These are fairly rare and only useful for a couple of big projects. (I've only heard about Zope using them; Twisted AFAICT doesn't) Even then, renaming a directory is hardly comparable to the hurdle of migrating unicode literals from Python 2 to Python 3. The analogy sounds melodramatic. > Between "Guido doesn't like directory suffixes" and "version control > systems are still fairly bad at handling directory renames", I changed > my own opinion on PEP 420 from -1 to +0. This doesn't address PEP 420's issues, which will still come to bite us in 10 years: the potential for confusion, the weirdness of the lookup algorithm. > If we'd been starting from a > clean slate with no language history or migration of existing projects > to account for, then my opinion would be different, but given where we > are today, I find the pragmatic argument in favour of simply losing > the explicit markers compelling. The real pragmatic argument would be to avoid creating maintenance and support issues for the future, IMO. Regards Antoine.
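For readers following the migration paths quoted in this thread, the PEP 420 behaviour can be tried out with a small script. This is only a sketch, assuming Python 3.3 or later; the names demo_ns, site_a and site_b are made up for illustration and do not come from the PEP. Two directories on sys.path each hold a "demo_ns" directory with no __init__.py, and the import system merges them into one package.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_portion(root: Path, package: str, module: str, body: str) -> None:
    """Create root/package/module.py, deliberately without an __init__.py."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text(body)


def import_from_two_portions() -> dict:
    """Import one namespace package whose modules live in two sys.path entries."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical install locations, one per separately shipped portion.
        site_a, site_b = Path(tmp, "site_a"), Path(tmp, "site_b")
        make_portion(site_a, "demo_ns", "alpha", "NAME = 'alpha'\n")
        make_portion(site_b, "demo_ns", "beta", "NAME = 'beta'\n")
        added = [str(site_a), str(site_b)]
        sys.path[:0] = added
        importlib.invalidate_caches()
        try:
            alpha = importlib.import_module("demo_ns.alpha")
            beta = importlib.import_module("demo_ns.beta")
            package = sys.modules["demo_ns"]
            return {
                "names": (alpha.NAME, beta.NAME),
                # One __path__ entry per directory that contributed a portion.
                "portions": len(list(package.__path__)),
                # A namespace package is not backed by any single file.
                "has_file": getattr(package, "__file__", None) is not None,
            }
        finally:
            # Undo the sys.path and sys.modules changes made for the demo.
            for entry in added:
                sys.path.remove(entry)
            for name in ("demo_ns.alpha", "demo_ns.beta", "demo_ns"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(import_from_two_portions())
```

The same script also shows the concern raised above: any directory named "demo_ns" on the path is picked up, whether or not its author meant it to be a package.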
From ncoghlan at gmail.com Sat May 5 14:12:51 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 5 May 2012 22:12:51 +1000 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505123303.29c3f4bb@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> Message-ID: On Sat, May 5, 2012 at 8:33 PM, Antoine Pitrou wrote: > If anything, I'd like to see data points about these "current version > control systems" being "pretty abysmal [!] when it comes to coping > with directory renames". It's really irrelevant. The real deciding factor is that Guido didn't like the scheme proposed in PEP 382, so he rejected it. However, my personal experience with both git and hg is that renaming files still generates an awful lot of diff noise - none of them have formal rename, they still fake it with "remove and add" the same way subversion does. "abysmal" is really too strong a word (they're much better than CVS), but it's still a far cry from formal rename tracking. Since we *want* people to eventually drop their custom namespace package systems in favour of the standard one, it makes sense to make the migration path as smooth as possible. Requiring people to do a mass rename of files makes it unnecessarily difficult for them to make that transition. > This doesn't address PEP 420's issues, which will still come to bite us > in 10 years: the potential for confusion, the weirdness of the lookup > algorithm. Believe me, I sympathise - PEP 420 getting accepted is going to mean I have to make some fairly major changes to PEP 395 before I can propose it for 3.4. 
However, the proposed mechanism in PEP 420 basically just brings Python's import system into line with the way that C, Java, Perl, etc all already work, so I predict the "maintenance and support issues for the future" as a result of this change aren't going to be severe (particularly once I revise PEP 395 to be primarily a proposal for better error reporting in various error cases relating to __main__). I've also started Tools/scripts/import_diagnostics.py - initially just to help me while trying to eliminate the _frozen_importlib vs importlib._bootstrap duplication, but longer term I hope to see some more sophisticated commands get added so that people can easily get better info if their imports start doing strange things. After the last discussion, I now believe that accepting *either* PEP 382 or 420 will lead to an acceptable long term outcome. While my own preferences still favour the explicit approach in PEP 382, I can also acknowledge that PEP 420 has its own attractive features, most notably that it: - is more consistent with the module systems of other languages - has a greater chance of completely displacing existing namespace package mechanisms in the long term - is significantly more intuitive than PEP 382, since almost nothing else uses directory extensions, so any scheme relying on them is going to feel awkward and unintuitive to beginners and veterans alike (and we can't use a shared marker file, since getting rid of __init__.py is the entire point of these PEPs, and using a *set* of marker files with a common extension clutters the filesystem and means we have to do pattern matching on directory listings during import instead of being able to use simple stat calls and exact string matches). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From solipsis at pitrou.net Sat May 5 14:32:26 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 14:32:26 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> Message-ID: <20120505143226.4c438951@pitrou.net> On Sat, 5 May 2012 22:12:51 +1000 Nick Coghlan wrote: > > It's really irrelevant. The real deciding factor is that Guido didn't > like the scheme proposed in PEP 382, so he rejected it. > > However, my personal experience with both git and hg is that renaming > files still generates an awful lot of diff noise - none of them have > formal rename, they still fake it with "remove and add" the same way > subversion does. That doesn't seem to make any difference in practice: $ cat a a\nb\n $ hg mv a b $ hg di diff --git a/a b/b rename from a rename to b (there's no "awful lot of diff noise" above) > "abysmal" is really too strong a word (they're much > better than CVS), but it's still a far cry from formal rename > tracking. Well, you should come up with well-defined situations where this is a problem, or you are making a purity argument. (I'm still baffled that FUD about VCS capabilities has a weight in the discussion of a Python PEP; yes, they're much better than CVS :-)) > Requiring people to do a > mass rename of files makes it unnecessarily difficult for them to make > that transition. Renaming a directory should not be "unnecessarily difficult" by any stretch of the word, especially for a developer of something as large as a project requiring namespace packages. Any x.y -> x.y+1 transition is harder than renaming a directory for any such large Python project. 
> However, the proposed mechanism in PEP 420 basically just > brings Python's import system into line with the way that C, Java, > Perl, etc all already work, so I predict the "maintenance and support > issues for the future" as a result of this change aren't going to be > severe Python's import system is different from these languages', so the implications are not the same either. The very fact that PEP 420 has to propose a deferred detection of namespace packages compared to other kinds of importable objects (modules, packages) proves it. > - is significantly more intuitive than PEP 382, since almost nothing > else uses directory extensions, so any scheme relying on them is going > to feel awkward and unintuitive to beginners and veterans alike (and > we can't use a shared marker file, since getting rid of __init__.py is > the entire point of these PEPs, and using a *set* of marker files with > a common extension clutters the filesystem and means we have to do > pattern matching on directory listings during import instead of being > able to use simple stat calls and exact string matches). "clutters the filesystem"? We're talking about a little-used feature here. As for "simple stat calls" instead of "directory listings", I suggest you take a look at current importlib, because it uses directory listings in order to avoid stat calls :-) Regards Antoine. From eric at trueblade.com Sat May 5 14:38:13 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 05 May 2012 08:38:13 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> Message-ID: <4FA51F35.80807@trueblade.com> On 5/5/2012 8:12 AM, Nick Coghlan wrote: > On Sat, May 5, 2012 at 8:33 PM, Antoine Pitrou wrote: >> If anything, I'd like to see data points about these "current version >> control systems" being "pretty abysmal [!] 
when it comes to coping >> with directory renames". > > It's really irrelevant. The real deciding factor is that Guido didn't > like the scheme proposed in PEP 382, so he rejected it. Right. I think arguing about VCS capabilities is pointless. You'll need to convince Guido, instead. Eric. From ncoghlan at gmail.com Sat May 5 15:18:20 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 5 May 2012 23:18:20 +1000 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505143226.4c438951@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> Message-ID: On Sat, May 5, 2012 at 10:32 PM, Antoine Pitrou wrote: > On Sat, 5 May 2012 22:12:51 +1000 > Nick Coghlan wrote: >> >> It's really irrelevant. The real deciding factor is that Guido didn't >> like the scheme proposed in PEP 382, so he rejected it. >> >> However, my personal experience with both git and hg is that renaming >> files still generates an awful lot of diff noise - none of them have >> formal rename, they still fake it with "remove and add" the same way >> subversion does. > > That doesn't seem to make any difference in practice: > > $ cat a > a\nb\n > $ hg mv a b > $ hg di > diff --git a/a b/b > rename from a > rename to b > > (there's no "awful lot of diff noise" above) Now rename zope/ to zope.pyp/ in a full Zope checkout and see how much noise you get. Besides, I have yet to have any VCS (git and hg included) get a rename right. My bad experiences with renames is one element that has helped me to come to terms with the fact that PEP 382 is dead and PEP 420 is going to replace it. If your experiences differ, then fine, that's not going to help you accept the decision. But it doesn't matter *how* you come to terms with it, only that you do. That's really the only option here: Guido has flat out rejected PEP 382 because he doesn't like the idea of directory extensions. 
It's not coming back. However, the PEP 420 authors should probably take note that the two most interested people that weren't in the room at the language summit still don't find that the PEP text explains the situation all that well :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat May 5 15:37:12 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 15:37:12 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> Message-ID: <20120505153712.70571117@pitrou.net> On Sat, 5 May 2012 23:18:20 +1000 Nick Coghlan wrote: > > If your experiences differ, then fine, that's not going to help you > accept the decision. But it doesn't matter *how* you come to terms > with it, only that you do. That's really the only option here: Guido > has flat out rejected PEP 382 because he doesn't like the idea of > directory extensions. It's not coming back. Then perhaps PEP 420 should be rejected too, because of the complication it introduces. I've done a Google code search and there doesn't seem to be much more than a dozen projects using namespace packages (Zope, pygraph, a couple of others): http://code.google.com/codesearch#search&q=lang:python+declare_namespace http://code.google.com/codesearch#search&q=lang:python+extend_path The current idiom is not extremely pretty but it works, and it doesn't seem to cause much trouble. The fact that setuptools proposes a different idiom from pkgutil's is not due to the idiom itself, but probably historical reasons: both idioms require a single import and a single function call, so they are similarly expressive.
The lack-of-prettiness argument is quite underwhelming when there are so few projects using namespace packages; and this is not something you see when you only *use* the package, rather than develop it. Regards Antoine. From solipsis at pitrou.net Sat May 5 15:42:34 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 15:42:34 +0200 Subject: [Import-SIG] PEP 420 issue: standard namespace packages References: <20120505091813.Horde.angpY8L8999PpNQ1Vz8hCnA@webmail.df.eu> Message-ID: <20120505154234.75e8c40b@pitrou.net> On Sat, 05 May 2012 09:18:13 +0200 martin at v.loewis.de wrote: > I'd like the PEP to rule that the standard library may > designate some of its packages as namespace packages, > and also specifically declare the encodings package as > a namespace package. This would allow to install additional > encodings just by mere installation, without the need of > having a search function registered at startup. What would be the impact on startup time when encodings get imported? The PEP 420 algorithm makes performance of namespace packages potentially much lower than regular packages (if sys.path is long or parts of it reside on a slow filesystem). Regards Antoine. From eric at trueblade.com Sat May 5 15:45:50 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 05 May 2012 09:45:50 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505153712.70571117@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <20120505153712.70571117@pitrou.net> Message-ID: <4FA52F0E.6040704@trueblade.com> On 5/5/2012 9:37 AM, Antoine Pitrou wrote: > On Sat, 5 May 2012 23:18:20 +1000 > I've done a Google code search and there doesn't seem to be much more > than a dozen projects using namespace packages (Zope, pygraph, a couple > of others): From my experience, they're used extensively inside companies.
Three unrelated companies I've worked at use "company_name" as their top-level namespace package. Eric. From eric at trueblade.com Sat May 5 15:51:51 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 05 May 2012 09:51:51 -0400 Subject: [Import-SIG] PEP 420 issue: standard namespace packages In-Reply-To: <20120505154234.75e8c40b@pitrou.net> References: <20120505091813.Horde.angpY8L8999PpNQ1Vz8hCnA@webmail.df.eu> <20120505154234.75e8c40b@pitrou.net> Message-ID: <4FA53077.5020305@trueblade.com> On 5/5/2012 9:42 AM, Antoine Pitrou wrote: > On Sat, 05 May 2012 09:18:13 +0200 > martin at v.loewis.de wrote: >> I'd like the PEP to rule that the standard library may >> designate some of its packages as namespace packages, >> and also specifically declare the encodings package as >> a namespace package. This would allow to install additional >> encodings just by mere installation, without the need of >> having a search function registered at startup. > > What would be the impact on startup time when encodings get imported? > > The PEP 420 algorithm makes performance of namespace packages > potentially much lower than regular packages (if sys.path is long or > parts of it reside on a slow filesystem). I don't see how this issue is related to PEP 420 specifically. All of the alternatives also involve scanning the path to find parts of the namespace package. I'd rather the "parts of the std lib be namespace packages" be a separate issue, once a namespace package PEP is accepted. Although I agree the possibility should be mentioned in the PEP. Eric. 
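The path scan being discussed here can be sketched in a few lines. This is a simplified illustration of the lookup order described in PEP 420, not the importlib implementation: it ignores compiled and extension modules, zip files and finder caching, and the function name classify_import is hypothetical. It shows why a regular package stops the scan at the first hit while a namespace package needs the whole path to be visited.

```python
import os
import tempfile


def classify_import(name, search_path):
    """Walk search_path the way PEP 420 describes, in simplified form."""
    portions = []
    for entry in search_path:
        candidate = os.path.join(entry, name)
        if os.path.isfile(os.path.join(candidate, "__init__.py")):
            # A regular package wins immediately; the scan stops here.
            return "regular package", candidate
        if os.path.isfile(candidate + ".py"):
            # So does a plain module.
            return "module", candidate + ".py"
        if os.path.isdir(candidate):
            # A bare directory is only recorded; the scan keeps going.
            portions.append(candidate)
    if portions:
        # Nothing else matched anywhere, so the recorded directories
        # become the portions of a namespace package.
        return "namespace package", portions
    return "not found", None


def demo():
    """Build three throwaway path entries and classify three names."""
    with tempfile.TemporaryDirectory() as tmp:
        first, second, third = (
            os.path.join(tmp, d) for d in ("first", "second", "third")
        )
        os.makedirs(os.path.join(first, "ns"))
        os.makedirs(os.path.join(second, "ns"))
        os.makedirs(os.path.join(third, "regular"))
        open(os.path.join(third, "regular", "__init__.py"), "w").close()
        path = [first, second, third]
        kind_ns, portions = classify_import("ns", path)
        kind_regular, _ = classify_import("regular", path)
        kind_missing, _ = classify_import("absent", path)
        return kind_ns, len(portions), kind_regular, kind_missing


if __name__ == "__main__":
    print(demo())
```

In this sketch the cost Antoine asks about is the loop running to the end of search_path for every namespace package import, which is also what the existing extend_path idiom has to do.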
From ncoghlan at gmail.com Sat May 5 15:55:23 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 5 May 2012 23:55:23 +1000 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505153712.70571117@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <20120505153712.70571117@pitrou.net> Message-ID: On Sat, May 5, 2012 at 11:37 PM, Antoine Pitrou wrote: > The lack-of-prettiness argument is quite underwhelming when there are > so few projects using namespace packages; and this is not something you > see when you only *use* the package, rather than develop it. No, it's a chicken-and-egg problem. Yes, namespace packages *are* possible now, but they're a PITA to coordinate (everybody has to play by the rules and put the right magic incantation in their __init__.py files). So, people avoid them because they're a pain, not because they're necessarily a bad idea (when used appropriately). However, the problem isn't with the concept of namespace packages, it's with the current awkward *implementation*. Both PEP 382 and 420 fix the ugliness problem and bring namespace packages up to a standard where I'd be happy seeing them used in the standard library (MvL has proposed that "encodings" would be a good candidate for that, and I'm inclined to agree). A clean collaborative namespace system also helps with the evolution of informal taxonomies on PyPI. You can see this on CPAN, where file related modules are all in File::, email related ones are in Email::, etc. At the moment, pretty much everything ends up being a top-level module on PyPI, *because* namespace packages are so awkward and unintuitive. 
If those of us that do stdlib backports like contextlib2, unittest2 and distutils2 could just as easily publish backports.contextlib, backports.unittest and backports.packaging, that would make it *much* clearer to the world what is going on. None of us are willing to do that at the moment, because we'd have to coordinate the installation of backports.__init__, instead of being able to just include an additional directory in our path names. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat May 5 16:23:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 16:23:11 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <20120505153712.70571117@pitrou.net> Message-ID: <20120505162311.391df655@pitrou.net> On Sat, 5 May 2012 23:55:23 +1000 Nick Coghlan wrote: > > A clean collaborative namespace system also helps with the evolution > of informal taxonomies on PyPI. You can see this on CPAN, where file > related modules are all in File::, email related ones are in Email::, > etc. At the moment, pretty much everything ends up being a top-level > module on PyPI, *because* namespace packages are so awkward and > unintuitive. "Flat is better than nested" would indicate this is a virtue. The stdlib's experiments with nested namespaces (e.g. urllib.request) have turned out quite unpractical and clumsy IMHO. Also, namespace packages have an authority problem: what happens if two namespace packages both define e.g. "foo/bar.py"? It works when you have a central body, such as Zope or Eric's companies, but otherwise?
> If those of us that do stdlib backports like contextlib2, unittest2 > and distutils2 could just as easily publish backports.contextlib, > backports.unittest and backports.packaging, that would make it *much* > clearer to the world what is going on. Would it? Why would it go into the "backports" package? Why favour this category over another (e.g. "testing.unittest")? You're soon gonna re-discover the limitations of hierarchical classification :-) > None of us are willing to do > that at the moment, because we'd have to coordinate the installation > of backports.__init__, instead of being able to just include an > additional directory in our path names. Would you? You could just settle on the standard pkgutil boilerplate in __init__.py. Regards Antoine. From eric at trueblade.com Sat May 5 16:37:39 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 05 May 2012 10:37:39 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505162311.391df655@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <20120505153712.70571117@pitrou.net> <20120505162311.391df655@pitrou.net> Message-ID: <4FA53B33.706@trueblade.com> On 5/5/2012 10:23 AM, Antoine Pitrou wrote: > On Sat, 5 May 2012 23:55:23 +1000 > Nick Coghlan wrote: >> None of us are willing to do >> that at the moment, because we'd have to coordinate the installation >> of backports.__init__, instead of being able to just include an >> additional directory in our path names. > > Would you? You could just settle on the standard pkgutil boilerplate in > __init__.py. The typical problem here is for system packagers (RPM, DEB, ...). The shared __init__.py has to be removed from each individual package and placed in a standalone package that all of the other packages have to depend on. That's a lot of hassle, and one more roadblock to using namespace packages. 
setuptools is some help here, but many people object to using it. All of the namespace PEPs address this problem by having no file that's shared among all of the portions (to use the 382 and 420 term). I think there's wide agreement that the import machinery should understand namespace packages. You (Antoine) seem to be arguing against it, but it's pretty well settled, if the PyCon discussions are representative (which they may not be). Eric. From yselivanov.ml at gmail.com Sat May 5 18:06:48 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Sat, 5 May 2012 12:06:48 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> Message-ID: <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> On 2012-05-05, at 9:18 AM, Nick Coghlan wrote: > Now rename zope/ to zope.pyp/ in a full Zope checkout and see how much > noise you get. Why can't we modify whatever PEP to simply mark namespace package with '__init__.pyp' or some other special file? Why rename directories, introduce ugly suffixes, deal with all the weirdness of importing just plain directories and guessing that they are namespace packages, ignoring content in __init__.py etc, instead of plain simple file marker? In terms of steps (as Nick illustrated): With PEP 382, the migration path is: 1. delete all __init__.py files from namespace package portions 2. rename the directories for all namespace package portions to append the ".pyp" extension With PEP 420, the migration path is: 1. delete all __init__.py files from namespace package portions 2. there is no step 2 With a marker: 1. $ mv __init__.py __init__.pyp 2. there is no step 2 The first step can be even replaced with '$ rm __init__.py && touch __init__.pyp', as current __init__.py files of namespace packages contain only '__path__ = extend_path(__path__ ...)' crap. 
- Yury From eric at trueblade.com Sat May 5 18:20:24 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 05 May 2012 12:20:24 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> Message-ID: <4FA55348.3060808@trueblade.com> On 5/5/2012 12:06 PM, Yury Selivanov wrote: > On 2012-05-05, at 9:18 AM, Nick Coghlan wrote: >> Now rename zope/ to zope.pyp/ in a full Zope checkout and see how much >> noise you get. > > Why can't we modify whatever PEP to simply mark namespace package > with '__init__.pyp' or some other special file? Why rename directories, > introduce ugly suffixes, deal with all the weirdness of importing > just plain directories and guessing that they are namespace packages, > ignoring content in __init__.py etc, instead of plain simple file > marker? > Because it doesn't solve the problem of wanting to distribute namespace packages in pieces, using platform package managers, and installing them all into the same directory. If you do this, your __init__.pyp would need to be shipped with each portion's .rpm or .deb file. Platform package managers don't typically like a single file being included with multiple packages. You can factor it out into yet another package, but then you need to have every namespace package portion depend on it. This is described in PEP 420, and I think also 382. Eric. 
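The shared-file problem described above can be made concrete with the existing pkgutil idiom. The sketch below is illustrative only: the directory names rpm_a and rpm_b stand in for two separately packaged portions and are assumptions, as is the package name demo_legacy. Each portion has to carry the same __init__.py, so the set of files that both packages would install into a common directory is not empty.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The boilerplate every portion must ship under the current pkgutil idiom.
BOILERPLATE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_legacy_portion(root: Path, package: str, module: str) -> None:
    """Create a portion that carries its own copy of the shared __init__.py."""
    pkg_dir = root / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(BOILERPLATE)
    (pkg_dir / f"{module}.py").write_text(f"NAME = {module!r}\n")


def shipped_files(root: Path) -> set:
    """Relative paths of the source files one portion would install."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*.py")}


def legacy_demo():
    """Return the files both portions own, plus proof that the import works."""
    with tempfile.TemporaryDirectory() as tmp:
        rpm_a, rpm_b = Path(tmp, "rpm_a"), Path(tmp, "rpm_b")
        make_legacy_portion(rpm_a, "demo_legacy", "alpha")
        make_legacy_portion(rpm_b, "demo_legacy", "beta")
        # Files that two system packages would both claim if they were
        # installed into the same site directory.
        conflicts = sorted(shipped_files(rpm_a) & shipped_files(rpm_b))
        added = [str(rpm_a), str(rpm_b)]
        sys.path[:0] = added
        importlib.invalidate_caches()
        try:
            # extend_path in rpm_a's __init__.py pulls in rpm_b's directory.
            beta = importlib.import_module("demo_legacy.beta")
            return conflicts, beta.NAME
        finally:
            for entry in added:
                sys.path.remove(entry)
            for name in ("demo_legacy.beta", "demo_legacy"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(legacy_demo())
```

Renaming the marker to __init__.pyp would leave the overlap in place; only a scheme with no per-package shared file makes the intersection empty.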
From pje at telecommunity.com Sat May 5 18:31:09 2012 From: pje at telecommunity.com (PJ Eby) Date: Sat, 5 May 2012 12:31:09 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> Message-ID: I just want to chime in at this point that PEP 402 actually provides rationale to answer a lot of the questions that are coming up in this thread, and which are still valid in a PEP 420 world. While some folks have complained about PEP 402's length, they're mostly people who were already present for all those discussions and hashing out of rationales. ;-) (I actually wrote 402 with the intent of answering as many as possible of these objections in advance, hence the length.) (On a more serious note, it might help to crib some bits of 402's rationale arguments into 420, so that we don't have to keep answering already-dead proposals that keep coming up, like, "why can't you just add a special file named xyz to fix this".) On Sat, May 5, 2012 at 12:06 PM, Yury Selivanov wrote: > On 2012-05-05, at 9:18 AM, Nick Coghlan wrote: > > Now rename zope/ to zope.pyp/ in a full Zope checkout and see how much > > noise you get. > > Why can't we modify whatever PEP to simply mark namespace package > with '__init__.pyp' or some other special file? Why rename directories, > introduce ugly suffixes, deal with all the weirdness of importing > just plain directories and guessing that they are namespace packages, > ignoring content in __init__.py etc, instead of plain simple file > marker? > > In terms of steps (as Nick illustrated): > > With PEP 382, the migration path is: > 1. delete all __init__.py files from namespace package portions > 2. 
rename the directories for all namespace package portions to append > the ".pyp" extension > > With PEP 420, the migration path is: > 1. delete all __init__.py files from namespace package portions > 2. there is no step 2 > > With a marker: > 1. $ mv __init__.py __init__.pyp > 2. there is no step 2 > > The first step can be even replaced with > '$ rm __init__.py && touch __init__.pyp', as current __init__.py files > of namespace packages contain only '__path__ = extend_path(__path__ ...)' > crap. > > - > Yury From solipsis at pitrou.net Sat May 5 20:09:16 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 5 May 2012 20:09:16 +0200 Subject: [Import-SIG] A finer-grained import lock Message-ID: <20120505200916.5abc57cf@pitrou.net> Hello, This patch fixes the long-standing issue of deadlocks with a combination of starting threads and importing modules. It also makes PyImport_ImportModuleNoBlock() basically useless. The idea is to have a separate lock for each module being imported, and only use the global import lock around a couple of operations (such as creating the module locks themselves). http://bugs.python.org/issue9260 Regards Antoine.
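The locking idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch from issue 9260: the names get_module_lock and load_once are hypothetical, a plain dict stands in for the module cache, and the deadlock detection that a real implementation would need for circular imports across threads is left out.

```python
import threading

# Stands in for the global import lock; it is held only briefly,
# while the per-module lock table is consulted or extended.
_registry_lock = threading.Lock()
_module_locks = {}


def get_module_lock(name):
    """Return the lock dedicated to module `name`, creating it if needed."""
    with _registry_lock:
        lock = _module_locks.get(name)
        if lock is None:
            lock = _module_locks[name] = threading.RLock()
        return lock


def load_once(name, cache, loader):
    """Load `name` at most once; threads loading other names are not blocked."""
    with get_module_lock(name):
        if name not in cache:
            # Hypothetical loader callable; runs under the per-module lock only.
            cache[name] = loader(name)
        return cache[name]
```

Two threads importing different modules take different locks and never wait on each other, while two threads importing the same module are serialized on that module's lock.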
From martin at v.loewis.de Sat May 5 20:57:42 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Sat, 05 May 2012 20:57:42 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> Message-ID: <20120505205742.Horde.6M5vStjz9kRPpXgmlbZg7hA@webmail.df.eu> > Why can't we modify whatever PEP to simply mark namespace package > with '__init__.pyp' or some other special file? That file name would not work, as then portions of the namespace would all install the same file, which causes conflicts in platform packaging tools (if the portions get installed into the same sys.path entry). > Why rename directories, > introduce ugly suffixes, deal with all the weirdness of importing > just plain directories and guessing that they are namespace packages, > ignoring content in __init__.py etc, instead of plain simple file > marker? Hence the current PEP doesn't propose to rename directories, and does not introduce ugly suffixes. As for the weirdness of importing just plain directories: yes, it does that. > With PEP 382, the migration path is: > 1. delete all __init__.py files from namespace package portions > 2. rename the directories for all namespace package portions to append > the ".pyp" extension Please understand that an earlier version of the PEP did indeed propose to use marker files instead of directories. You are, of course, free to reiterate four years of discussion in a single week, but please do familiarize yourself with the matter first. After that, you likely have to write a PEP if you want your idea to be seriously considered. 
Regards, Martin From barry at python.org Sat May 5 21:32:25 2012 From: barry at python.org (Barry Warsaw) Date: Sat, 5 May 2012 15:32:25 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505205742.Horde.6M5vStjz9kRPpXgmlbZg7hA@webmail.df.eu> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <20120505123303.29c3f4bb@pitrou.net> <20120505143226.4c438951@pitrou.net> <3C8BD127-390C-4CFE-9172-C83D35401BE5@gmail.com> <20120505205742.Horde.6M5vStjz9kRPpXgmlbZg7hA@webmail.df.eu> Message-ID: <20120505153225.2f7bae0a@limelight.wooz.org> >Hence the current PEP doesn't propose to rename directories, and >does not introduce ugly suffixes. As for the weirdness of importing >just plain directories: yes, it does that. Of course, the parents of directories have to be on sys.path, so it's not *that* weird. ;) -Barry From martin at v.loewis.de Mon May 7 10:01:47 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 07 May 2012 10:01:47 +0200 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <20120505004711.2140afbf@pitrou.net> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> Message-ID: <4FA7816B.9070804@v.loewis.de> > Unless there are clear advantages over PEP 382, I'm -1 on this PEP, and > would like to see PEP 382 revived. When I started this project four years ago, I didn't know how involved it would get. At first, there was little interest in it, but the more details were discussed, the more opinions appeared. It eventually led to PEPs, counter-PEPs, superseded PEPs. I had PEP czars signed up who then resigned in the face of having to make a difficult decision. At this point, I'm happy to have the PEP process. Guido will pronounce on a PEP, and will (as usual) take both community feedback and his own intuition into account.
So there will be a decision, and then the community will have to accept it (or else fork Python :-). So while it is fine that people vote in favor of or against individual PEPs or selected features, they also need to realize that this may not affect the outcome. Even writing yet another PEP likely will not affect the outcome. While I'm honored with the support, I personally have accepted that Guido has made up his mind on this specific detail. Looking back, I also highly appreciate PJE's pioneering of all this in setuptools (despite still disagreeing on many other aspects of setuptools). Regards, Martin From martin at v.loewis.de Mon May 7 10:38:15 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 07 May 2012 10:38:15 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path Message-ID: <4FA789F7.7080609@v.loewis.de> I'd like to propose that pkgutil.extend_path is specified to also consider portions according to the PEP. Currently, it will only consider portions having an __init__.py. If a namespace package gets a portion installed that has an __init__.py, then all existing portions become ignored under the current PEP. With that change, if a portion has an __init__.py that uses extend_path, the other portions would still be considered. With the current PEP, all contributors to a package need to simultaneously agree to drop their __init__.py for 3.3. Initially, this could cause confusion, and hinder adoption of the PEP. The same would also apply to pkg_resources.declare_namespace. Unfortunately, this is out of the scope of the PEP, but I'm sure Tarek would accept a patch to distribute to bring it into conformance with pkgutil. Interestingly, it appears that pkg_resources will break under PEP 420, anyway, as it currently does (in _handle_ns) loader = importer.find_module(packageName) if loader is None: return None ...
loader.load_module(packageName); module.__path__ = path Now, if loader suddenly becomes a string, then the load_module call will raise an attribute error (untested). Regards, Martin From ncoghlan at gmail.com Mon May 7 11:14:51 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 7 May 2012 19:14:51 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA789F7.7080609@v.loewis.de> References: <4FA789F7.7080609@v.loewis.de> Message-ID: On Mon, May 7, 2012 at 6:38 PM, "Martin v. Löwis" wrote: > I'd like to propose that pkgutil.extend_path is specified to also consider > portions according to the PEP. Currently, it will only > consider portions having an __init__.py > > If a namespace package gets a portion installed that has an __init__.py, > then all existing portions become ignored under > the current PEP. With that change, if a portion has an __init__.py > that uses extend_path, the other portions would still be considered. +1 (both on the change and on PEP 420 stating it explicitly) > Interestingly, it appears that pkg_resources will break under PEP 420, anyway, as > it currently does (in _handle_ns) > > loader = importer.find_module(packageName) > if loader is None: > return None > ... > loader.load_module(packageName); module.__path__ = path > > Now, if loader suddenly becomes a string, then the load_module > call will raise an attribute error (untested). Yes, the PEP 420 implementation should definitely add some new pkgutil tests to make sure the various utility functions either work, or at least fail with an intelligible error message, when handed a namespace package. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From solipsis at pitrou.net Mon May 7 12:53:46 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 12:53:46 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path References: <4FA789F7.7080609@v.loewis.de> Message-ID: <20120507125346.2ad355ed@pitrou.net> On Mon, 07 May 2012 10:38:15 +0200 "Martin v. Löwis" wrote: > > Interestingly, it appears that pkg_resources will break under PEP 420, > anyway, as it currently does (in _handle_ns) > > loader = importer.find_module(packageName) > if loader is None: > return None > ... > loader.load_module(packageName); module.__path__ = path > > Now, if loader suddenly becomes a string, then the load_module > call will raise an attribute error (untested). I think find_module() returning a string is a kludge. It would be better IMO if it returned a dedicated object clearly pointing out that a namespace package was potentially found (and also allowing to record other potential metadata). Regards Antoine. From eric at trueblade.com Mon May 7 15:01:07 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 07 May 2012 09:01:07 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120507125346.2ad355ed@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> Message-ID: <4FA7C793.4010501@trueblade.com> On 05/07/2012 06:53 AM, Antoine Pitrou wrote: > On Mon, 07 May 2012 10:38:15 +0200 > "Martin v. Löwis" wrote: >> >> Interestingly, it appears that pkg_resources will break under PEP 420, >> anyway, as it currently does (in _handle_ns) >> >> loader = importer.find_module(packageName) >> if loader is None: >> return None >> ... >> loader.load_module(packageName); module.__path__ = path >> >> Now, if loader suddenly becomes a string, then the load_module >> call will raise an attribute error (untested). > > I think find_module() returning a string is a kludge.
It would be > better IMO if it returned a dedicated object clearly pointing out that > a namespace package was potentially found (and also allowing to record > other potential metadata). Well, the original goal was to allow existing finders to be called without modification. Are you saying we always return a dedicated object (thus breaking existing finders)? Or that finders should return a loader or an object (instead of the current PEP 420 behavior of a loader or a string)? I could get behind returning either a loader or another "this could be part of a namespace package" object. But note that in either case we still have to check the type of the returned object. From pje at telecommunity.com Mon May 7 15:42:05 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 7 May 2012 09:42:05 -0400 Subject: [Import-SIG] PEP 420: Implicit Namespace Packages In-Reply-To: <4FA7816B.9070804@v.loewis.de> References: <4F90730D.1040808@trueblade.com> <20120505004711.2140afbf@pitrou.net> <4FA7816B.9070804@v.loewis.de> Message-ID: On May 7, 2012 4:01 AM, Martin v. Löwis wrote: > > Looking back, I also highly appreciate PJE's pioneering of all this > in setuptools Jim Fulton and Guido are the actual pioneers here... or maybe they're the explorers and I'm the pioneer. ;-) Jim coined the concept, and he and Guido created pkgutil. I just recognized a good idea when I saw it, and made it practical (or nearly so) to install and use them. (I very much appreciate the thought, though.) From pje at telecommunity.com Mon May 7 15:48:31 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 7 May 2012 09:48:31 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120507125346.2ad355ed@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> Message-ID: On May 7, 2012 6:53 AM, "Antoine Pitrou" wrote: > I think find_module() returning a string is a kludge.
It would be > better IMO if it returned a dedicated object clearly pointing out that > a namespace package was potentially found (and also allowing to record > other potential metadata). There isn't any other metadata to record, since a namespace package is simply the sum of its component parts. This makes the string an elegant solution to the requirements, and not a kludge at all. From solipsis at pitrou.net Mon May 7 17:00:21 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 17:00:21 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> Message-ID: <20120507170021.76f1594f@pitrou.net> On Mon, 07 May 2012 09:01:07 -0400 "Eric V. Smith" wrote: > On 05/07/2012 06:53 AM, Antoine Pitrou wrote: > > On Mon, 07 May 2012 10:38:15 +0200 > > "Martin v. Löwis" wrote: > >> > >> Interestingly, it appears that pkg_resources will break under PEP 420, > >> anyway, as it currently does (in _handle_ns) > >> > >> loader = importer.find_module(packageName) > >> if loader is None: > >> return None > >> ... > >> loader.load_module(packageName); module.__path__ = path > >> > >> Now, if loader suddenly becomes a string, then the load_module > >> call will raise an attribute error (untested). > > > > I think find_module() returning a string is a kludge. It would be > > better IMO if it returned a dedicated object clearly pointing out that > > a namespace package was potentially found (and also allowing to record > > other potential metadata). > > Well the original goal was to allow existing finders to be called > without modification. Are you saying we always return a dedicated object > (thus breaking existing finders)? Why would it break existing finders? find_module() would either return a loader, or a dedicated object (or None).
Returning a string is completely non-obvious to the caller (who may not know about namespace packages or their precise implementation in PEP 420). Regards Antoine. From eric at trueblade.com Mon May 7 18:00:52 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 07 May 2012 12:00:52 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120507170021.76f1594f@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> Message-ID: <4FA7F1B4.7070405@trueblade.com> On 05/07/2012 11:00 AM, Antoine Pitrou wrote: > On Mon, 07 May 2012 09:01:07 -0400 > "Eric V. Smith" wrote: >> On 05/07/2012 06:53 AM, Antoine Pitrou wrote: >>> On Mon, 07 May 2012 10:38:15 +0200 >>> "Martin v. Löwis" wrote: >>>> >>>> Interestingly, it appears that pkg_resources will break under PEP 420, >>>> anyway, as it currently does (in _handle_ns) >>>> >>>> loader = importer.find_module(packageName) >>>> if loader is None: >>>> return None >>>> ... >>>> loader.load_module(packageName); module.__path__ = path >>>> >>>> Now, if loader suddenly becomes a string, then the load_module >>>> call will raise an attribute error (untested). >>> >>> I think find_module() returning a string is a kludge. It would be >>> better IMO if it returned a dedicated object clearly pointing out that >>> a namespace package was potentially found (and also allowing to record >>> other potential metadata). >> >> Well the original goal was to allow existing finders to be called >> without modification. Are you saying we always return a dedicated object >> (thus breaking existing finders)? > > Why would it break existing finders? find_module() would either return > a loader, or a dedicated object (or None). If we introduce a new type that all find_module() functions must return in all cases, it would break existing finders.
This object would have a flag (or some other value) that says either "I returned a loader" or "I returned a namespace package path". That's how I originally read your message, but I guess that's not what you're saying. If we return this new object instead of what PEP 420 currently defines as a string, then I agree there won't be any impact on existing finders. Just as there won't be an impact if we define it as a string instead of some new object. > Returning a string is completely non-obvious to the caller (who may not > know about namespace packages or their precise implementation in PEP > 420). Well, if you don't know about namespace packages you won't be returning a string. So I don't see a problem there. But I agree there's some little part of me that says "why should namespace packages get to grab string as a return type for find_module(), when maybe there will be some other use for strings in the future". For me, it comes down to future-proofing the API versus the hassle of defining some new class just to support a use case that may never happen. Eric. From solipsis at pitrou.net Mon May 7 18:19:40 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 7 May 2012 18:19:40 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> Message-ID: <20120507181940.3c4e6be8@pitrou.net> On Mon, 07 May 2012 12:00:52 -0400 "Eric V. Smith" wrote: > > > Returning a string is completely non-obvious to the caller (who may not > > know about namespace packages or their precise implementation in PEP > > 420). > > Well, if you don't know about namespace packages you won't be returning > a string. So I don't see a problem there. I'm talking about calling an arbitrary finder, not one you wrote yourself. 
str is opaque: the str will look like a file path but it's not obvious that it signals the possibility of a namespace package. Also, if we later want find_module() to cater for another strange algorithm, another return type will be needed. Let's be explicit from the start. > For me, it comes down to future-proofing the API versus the hassle of > defining some new class just to support a use case that may never happen. I say we should future-proof it. This is a public API, we don't want to paint ourselves into a corner another time. In any case, do note that returning something else than either a loader or None already breaks the API, AFAICT. Existing code calling find_module() will have to be adapted... Which is perhaps worse than the perceived migration problem in PEP 382. Regards Antoine. From barry at python.org Mon May 7 19:05:01 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 7 May 2012 13:05:01 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA7F1B4.7070405@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> Message-ID: <20120507130501.4883ed4e@rivendell> On May 07, 2012, at 12:00 PM, Eric V. Smith wrote: >But I agree there's some little part of me that says "why should >namespace packages get to grab string as a return type for >find_module(), when maybe there will be some other use for strings in >the future". > >For me, it comes down to future-proofing the API versus the hassle of >defining some new class just to support a use case that may never happen. I'm of the same opinion. Returning strings seems useful and convenient, so I don't have a problem with it. I'd like a slightly more compelling argument than *maybe* future-proofing to return something more complicated, but I suppose if there's overwhelming support to return an object, that would be okay too.
-Barry From eric at trueblade.com Mon May 7 20:03:13 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 07 May 2012 14:03:13 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120507181940.3c4e6be8@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> Message-ID: <4FA80E61.1080108@trueblade.com> On 05/07/2012 12:19 PM, Antoine Pitrou wrote: > In any case, do note that returning something else than either a loader > or None already breaks the API, AFAICT. Existing code calling > find_module() will have to be adapted... Which is perhaps worse than > the perceived migration problem in PEP 382. Right. I'm still looking at Martin's message from this morning. I had assumed that all callers of find_module() were in importlib, but obviously that's not correct. Are there any callers of find_module() outside of the standard library? That would be reason to invent a new API, instead of trying to reuse find_module(). Eric. From martin at v.loewis.de Mon May 7 22:33:01 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 07 May 2012 22:33:01 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120507170021.76f1594f@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> Message-ID: <4FA8317D.8050704@v.loewis.de> > Why would it break existing finders? finder_module() would either return > a loader, or a dedicated object (or None). That wouldn't fix the issue at hand: callers of find_module that either expect None or a loader would still break when they get the dedicated object. 
> Returning a string is completely non-obvious to the caller (who may not > know about namespace packages or their precise implementation in PEP > 420). For the issue at hand, it makes no difference whether it's a string or a dedicated object. Regards, Martin From martin at v.loewis.de Mon May 7 22:35:28 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 07 May 2012 22:35:28 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA80E61.1080108@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <4FA80E61.1080108@trueblade.com> Message-ID: <4FA83210.2080105@v.loewis.de> > Are there any callers of find_module() outside of the standard library? > That would be reason to invent a new API, instead of trying to reuse > find_module(). Yes: the one I mentioned actually is in pkg_resources, which is part of distribute:

    def _handle_ns(packageName, path_item):
        """Ensure that named package includes a subpath of path_item (if needed)"""
        importer = get_importer(path_item)
        if importer is None:
            return None
        loader = importer.find_module(packageName)
        if loader is None:
            return None
        module = sys.modules.get(packageName)
        if module is None:
            module = sys.modules[packageName] = types.ModuleType(packageName)
            module.__path__ = []; _set_parent_ns(packageName)
        elif not hasattr(module,'__path__'):
            raise TypeError("Not a package:", packageName)
        handler = _find_adapter(_namespace_handlers, importer)
        subpath = handler(importer,path_item,packageName,module)
        if subpath is not None:
            path = module.__path__; path.append(subpath)
            loader.load_module(packageName); module.__path__ = path
        return subpath

Regards, Martin From ncoghlan at gmail.com Tue May 8 00:56:26 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 8 May 2012 08:56:26 +1000 Subject: [Import-SIG] PEP
420 issue: extend_path In-Reply-To: <20120507181940.3c4e6be8@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> Message-ID: On May 8, 2012 2:21 AM, "Antoine Pitrou" wrote: > > In any case, do note that returning something else than either a loader > or None already breaks the API, AFAICT. Existing code calling > find_module() will have to be adapted... Which is perhaps worse than > the perceived migration problem in PEP 382. 382 would have had the same problem. Given that sys.path already holds strings, as do __path__ attributes, I don't see any value in adding a separate PathEntry type just for signalling purposes. However, I do think we need to give more thought to allowing old finders to gracefully degrade by reporting "not found" instead of throwing an error. Will write more on that when I get to a real computer :) -- Sent from my phone, thus the relative brevity :) From solipsis at pitrou.net Tue May 8 02:27:56 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 8 May 2012 02:27:56 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> Message-ID: <20120508022756.18e83c18@pitrou.net> On Tue, 8 May 2012 08:56:26 +1000 Nick Coghlan wrote: > > Given that sys.path already holds strings, as do __path__ attributes, I > don't see any value in adding a separate PathEntry type just for signalling > purposes. How should I know, if load_module() returns a string, that it's supposed to denote a possible namespace package?
sys.path and __path__ don't have anything to do with that. Regards Antoine. From eric at trueblade.com Tue May 8 02:50:48 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 07 May 2012 20:50:48 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120508022756.18e83c18@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> Message-ID: <4FA86DE8.6000607@trueblade.com> On 5/7/2012 8:27 PM, Antoine Pitrou wrote: > On Tue, 8 May 2012 08:56:26 +1000 > Nick Coghlan wrote: >> >> Given that sys.path already holds strings, as do __path__ attributes, I >> don't see any value in adding a separate PathEntry type just for signalling >> purposes. > > How should I know, if load_module() returns a string, that it's supposed > to denote a possible namespace package? Because it will be documented. From ncoghlan at gmail.com Tue May 8 03:50:12 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 8 May 2012 11:50:12 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120508022756.18e83c18@pitrou.net> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> Message-ID: On Tue, May 8, 2012 at 10:27 AM, Antoine Pitrou wrote: > On Tue, 8 May 2012 08:56:26 +1000 > Nick Coghlan wrote: >> >> Given that sys.path already holds strings, as do __path__ attributes, I >> don't see any value in adding a separate PathEntry type just for signalling >> purposes. > > How should I know, if load_module() returns a string, that it's supposed > to denote a possible namespace package? sys.path and __path__ don't > have anything to do with that. 
Because it would be documented that any string that find_module() returns is an entry for the namespace package's __path__. It's an extension of the existing meaning of strings in the module search algorithm, rather than anything radically new. However, I think you're right that suddenly returning a new type (*any* new type) from an interface that was previously documented as solely returning either a loader or None is too large a breach of backwards compatibility to be acceptable. So, here's my proposal: we instead build the PEP 420 namespace package construction algorithm *into a loader*. What my scheme would involve is this: - in find_module, when the *first* namespace portion is found, a new PackageLoader is created and is initialised with a copy of the trailing portions of the path being searched, as well as the "packages" suffix list from the FileFinder instance. We *do not* check for an __init__.py at this point - we only check for the existence of the directory (this directory existence check already exists in FileFinder [1]). - PackageLoader.load_module() would then be responsible for scanning the relevant sections of the path for additional portions. If it finds an __init__.py in *any* portion, then it immediately stops the scan and returns an ordinary self-contained package by creating a new loader of the appropriate type and using *that* to load the package. Otherwise it creates a namespace package directly. The rest of importlib should then remain largely untouched - all that should be necessary is the definition of PackageLoader and the update to FileFinder to return it when appropriate. Since PackageLoader would need its own variant of FileFinder in this case, I suggest refactoring a bit so that there are two classes: FileFinder and PortionFinder (with the latter being a subclass of the former). No magic return values, no backwards compatibility problem. Regards, Nick. 
[1] http://hg.python.org/cpython/file/default/Lib/importlib/_bootstrap.py#l833 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Tue May 8 04:24:14 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 07 May 2012 22:24:14 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> Message-ID: <4FA883CE.80705@trueblade.com> On 5/7/2012 9:50 PM, Nick Coghlan wrote: > However, I think you're right that suddenly returning a new type > (*any* new type) from an interface that was previously documented as > solely returning either a loader or None is too large a breach of > backwards compatibility to be acceptable. I'm now of this belief, too. > So, here's my proposal: we instead build the PEP 420 namespace package > construction algorithm *into a loader*. > > What my scheme would involve is this: > > - in find_module, when the *first* namespace portion is found, a new > PackageLoader is created and is initialised with a copy of the > trailing portions of the path being searched, as well as the > "packages" suffix list from the FileFinder instance. We *do not* check > for an __init__.py at this point - we only check for the existence of > the directory (this directory existence check already exists in > FileFinder [1]). > > - PackageLoader.load_module() would then be responsible for scanning > the relevant sections of the path for additional portions. If it finds > an __init__.py in *any* portion, then it immediately stops the scan > and returns an ordinary self-contained package by creating a new > loader of the appropriate type and using *that* to load the package. > Otherwise it creates a namespace package directly.
I don't think you can do this, at least without losing the ability to create a namespace package where portions exist in different path_hook loaders. Currently (in the pep-420 branch) you can have a portion in the filesystem (FilePath loader), and a portion in a zip file (zipimport loader). See the SeparatedNestedZipNamespacePackages test in test_namespace_pkgs.py.

I believe what you're suggesting requires the logic be moved from PathFinder (which is a meta path hook) into FileFinder (which is a path hook). That's why it would break the cross-finder use case.

Note that the meta path hook PathFinder doesn't know anything about directories or filesystems. That's why it currently (in the pep-420 branch) delegates everything to the path hook finders.

I think the better solution is to create a new finder method, called something like find_module_or_namespace_portion (but obviously with a better name). If this exists, then it would be called and allowed to return a loader, string, or None. If it doesn't exist, find_module would be called. It could not participate in namespace packages and could only return a loader or None.

I think the use case of being able to have namespace package portions returned by different path hooks is important. Imagine a case where the namespace "encodings" takes off. Who's to say some portions don't ship as zip files, some as regular files, and maybe some with a hypothetical http loader?

Eric.
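
The fallback proposed in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: LegacyFinder, PortionAwareFinder and query_finder are hypothetical names invented for the sketch, and the method name find_module_or_namespace_portion is the placeholder from the message, not an API that exists in importlib.

```python
class LegacyFinder:
    """Hypothetical unmodified finder: it only implements find_module."""

    def find_module(self, fullname):
        # Old contract: return a loader or None, never a string.
        return None


class PortionAwareFinder(LegacyFinder):
    """Hypothetical finder that also implements the proposed new method."""

    def __init__(self, portions):
        # Maps a module name to the directory holding a namespace portion.
        self._portions = portions

    def find_module_or_namespace_portion(self, fullname):
        # Proposed contract: return a loader, a string (portion path) or None.
        return self._portions.get(fullname)


def query_finder(finder, fullname):
    """Call the new method when the finder has it, else use find_module."""
    try:
        find = finder.find_module_or_namespace_portion
    except AttributeError:
        # Unmodified finder: it cannot contribute namespace portions.
        find = finder.find_module
    return find(fullname)
```

With this shape, an unmodified finder keeps working but never contributes a portion, while a finder that adds the new method can hand back a path string without changing what find_module returns.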
From ncoghlan at gmail.com Tue May 8 05:10:41 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 8 May 2012 13:10:41 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA883CE.80705@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> Message-ID: On Tue, May 8, 2012 at 12:24 PM, Eric V. Smith wrote: > I don't think you can do this, at least without losing the ability to > create a namespace package where portions exist in different path_hook > loaders. Currently (in the pep-420 branch) you can have a portion in the > filesystem (FilePath loader), and a portion in a zip file (zipimport > loader). See the SeparatedNestedZipNamespacePackages test in > test_namespace_pkgs.py. > > I believe what you're suggesting requires the logic be moved from > PathFinder (which is a meta path hook) in to FileFinder (which is a path > hook). That's why it would break the cross-finder use case. > > Note that the meta path hook PathFinder doesn't know anything about > directories or filesystems. That's why it currently (in the pep-420 > branch) delegates everything to the path hook finders. No, it just means that PackageLoader needs to be based on PathFinder rather than FileFinder. That way the new logic can be fully isolated from the higher level finder implementation. > I think the better solution is to create a new finder method, called > something like find_module_or_namespace_portion (but obviously with a > better name). If this exists, then it would be called and allowed to > return a loader, string, or None. If it doesn't exist, find_module would > be called. It could not participate in namespace packages and could only > return a loader or None. 
I'd suggest the simpler hook "find_package" that has the new semantics and is *only* called by a new PackageLoader class.

The algorithm would then be:

- the main PathFinder loop scans sys.path or the relevant __path__ attribute until it finds a loader. Full stop, end of story.
- PackageLoader.load_module() handles scanning the *rest* of the path in order to populate namespace packages, roughly as follows:

    package_paths = []
    for entry in path_to_scan:
        importer = _get_importer(entry) # Check path_importer_cache, etc
        try:
            find_loader = importer.find_package
        except AttributeError:
            find_loader = importer.find_module
        loader = find_loader(fullname)
        try:
            load_module = loader.load_module
        except AttributeError:
            pass
        else:
            return load_module(fullname)
        if loader is not None:
            package_paths.append(loader)
    return make_namespace_package(package_paths)

The find_package vs find_module distinction also lets us resolve the potential for infinite recursion in FileFinder without needing an additional subclass. For find_module, FileFinder would return the new PackageLoader instances, while find_package would return either strings (for namespace package portions) or the appropriate loader for __init__.py (for self-contained packages).

> I think the use case of being able to have namespace package portions
> returned by different path hooks is important. Imagine a case where the
> namespace "encodings" takes off. Who's to say some portions don't ship
> as zip files, some as regular files, and maybe some with a hypothetical
> http loader?

Agreed, but I still want to get this out of the main import path, so that it only happens if a namespace portion gets encountered during the scan. For backwards compatibility with existing import reimplementations, the expected top level semantics should remain "when you find a loader, stop scanning and call the load_module() method".

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia From eric at trueblade.com Tue May 8 05:57:45 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 07 May 2012 23:57:45 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> Message-ID: <4FA899B9.9080904@trueblade.com> On 5/7/2012 11:10 PM, Nick Coghlan wrote: > On Tue, May 8, 2012 at 12:24 PM, Eric V. Smith wrote: >> I believe what you're suggesting requires the logic be moved from >> PathFinder (which is a meta path hook) in to FileFinder (which is a path >> hook). That's why it would break the cross-finder use case. >> >> Note that the meta path hook PathFinder doesn't know anything about >> directories or filesystems. That's why it currently (in the pep-420 >> branch) delegates everything to the path hook finders. > > No, it just means that PackageLoader needs to be based on PathFinder > rather than FileFinder. That way the new logic can be fully isolated > from the higher level finder implementation. Right. That's what the pep-420 branch [1] currently does. >> I think the better solution is to create a new finder method, called >> something like find_module_or_namespace_portion (but obviously with a >> better name). If this exists, then it would be called and allowed to >> return a loader, string, or None. If it doesn't exist, find_module would >> be called. It could not participate in namespace packages and could only >> return a loader or None. > > I'd suggest the simpler hook "find_package" that has the new semantics > and is *only* called by a new PackageLoader class. > > The algorithm would then be: > > - the main PathFinder loops scans the sys.path or the relevant > __path__ attribute until it finds a loader. 
Full stop, end of story. > - PackageLoader.load_module() handles scanning the *rest* of the path > in order to populate namespace packages, roughly as follows: > > package_paths = [] > for entry in path_to_scan: > importer = _get_importer(entry) # Check path_importer_cache, etc > try: > find_loader = importer.find_package > except AttributeError: > find_loader = importer.find_module > loader = find_loader(fullname) > try: > load_module = loader.load_module > except AttributeError: > pass > else: > return load_module(fullname) > if loader is not None: > package_paths.append(loader) > return make_namespace_package(package_paths) This is exactly what the pep-420 branch does (in PathFinder.find_module), with the addition of find_package (my find_module_or_namespace_portion, from above). For each entry in the path, get the finder. If it can load this path return the loader. If not, remember the path component. If no loaders are found, return a NamespaceLoader. > The find_package vs find_module distinction also lets us resolve the > potential for infinite recursion in FileFinder without needing an > additional subclass. For find_module, FileFinder would return the new > PackageLoader instances, while find_package would return either > strings (for namespace package portions) or the appropriate loader for > __init__.py (for self-contained packages) I don't see where FileFinder can infinitely recurse. I'm not sure find_package is a great name for something that can return a loader or a string, but surely it's better than the more descriptive find_module_or_namespace_portion! Have you looked at the pep-420 branch? > Agreed, but I still want to get this out of the main import path, so > that it only happens if a namespace portion gets encountered during > the scan. For backwards compatibility with existing import > reimplementations, the expected top level semantics should remain > "when you find a loader, stop scanning and call the load_module() > method". 
Again, that's what the PEP describes and is implemented in the pep-420 branch. I think the only difference between what you're describing and what the PEP currently specifies is the find_package() method. The PEP says that's the functionality of the modified find_module(), but I agree that find_module() should be unmodified and we need a new finder method. I think I'll modify the code in the pep-420 branch with find_package(), keeping find_module() unmodified from the version in the 3.3 branch. Assuming that works out, I'll modify the PEP and point it to this discussion. Eric. [1]: http://hg.python.org/features/pep-420/ From eric at trueblade.com Tue May 8 06:52:05 2012 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 08 May 2012 00:52:05 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA789F7.7080609@v.loewis.de> References: <4FA789F7.7080609@v.loewis.de> Message-ID: <4FA8A675.9010105@trueblade.com> On 05/07/2012 04:38 AM, "Martin v. L?wis" wrote: > I'd like to propose that pkgutil.extend_path is specified to also > consider portions according to the PEP. Currently, it will only > consider portions having an __init__.py > > If a namespace package gets a portion installed that has an __init__.py, > then all existing portions become ignored under > the current PEP. With that change, if a portion has an __init__.py > that uses extend_path, the other portions would still be considered. > > With the current PEP, all contributors to a package need to > simultaneously agree to drop their __init__.py for 3.3. Initially, > this could cause confusion, and hinder adoption of the PEP. > > The same would also apply to pkg_resources.declare_namespace. > Unfortunately, this is out of the scope of the PEP, but I'm sure Tarek > would accept a patch to distribute to bring it into conformance to > pkgutil. I agree this is an important consideration. I haven't had time to think it through, yet. 
> Interestingly, it appears that pkg_util will break under PEP 420, > anyway, as it currently does (in _handle_ns) > > loader = importer.find_module(packageName) > if loader is None: > return None > ... > loader.load_module(packageName); module.__path__ = path > > Now, if loader suddenly becomes a string, than the load_module > call will raise an attribute error (untested). I've become convinced that we need a new finder method, and leave find_module with its current (3.2) semantics. Eric. From eric at trueblade.com Tue May 8 07:09:53 2012 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 08 May 2012 01:09:53 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA899B9.9080904@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> Message-ID: <4FA8AAA1.6010202@trueblade.com> On 05/07/2012 11:57 PM, Eric V. Smith wrote: > I think I'll modify the code in the pep-420 branch with find_package(), > keeping find_module() unmodified from the version in the 3.3 branch. > Assuming that works out, I'll modify the PEP and point it to this > discussion. I've checked the find_package() version in to the pep-420 branch. I'm still not wild about the name, but I think leaving find_module() unmodified is an improvement. After some discussion and a possible name change, I'll update PEP 420. Eric. 
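
The breakage described in the quoted text above (marked there as untested) can be reproduced in isolation. The sketch below is an assumption-laden stand-in: handle_ns_like is a hypothetical function that only mirrors the shape of the quoted caller, and the path string is made up. It relies on nothing beyond the fact that a str has no load_module attribute.

```python
def handle_ns_like(finder_result, package_name):
    """Hypothetical caller that assumes find_module gave a loader or None."""
    loader = finder_result
    if loader is None:
        return None
    # Fails if the finder handed back a path string instead of a loader.
    return loader.load_module(package_name)


try:
    handle_ns_like("/some/dir/pkg", "pkg")  # a string, as a modified finder might return
except AttributeError:
    outcome = "AttributeError"
else:
    outcome = "ok"
```

Keeping find_module at its 3.2 semantics means existing callers of this shape never see a string in the first place.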
From martin at v.loewis.de Tue May 8 08:40:57 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 08 May 2012 08:40:57 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> Message-ID: <20120508084057.Horde.6FMVQdjz9kRPqL-5f43jlHA@webmail.df.eu> >> In any case, do note that returning something else than either a loader >> or None already breaks the API, AFAICT. Existing code calling >> find_module() will have to be adapted... Which is perhaps worse than >> the perceived migration problem in PEP 382. > > 382 would have had the same problem. No, it wouldn't. In PEP 382, find_module is left unmodified. Instead, find_package_portion can optionally be implemented by finders. So existing callers continue to work fine - they just won't find package portions. Existing finders continue to work fine as long as they properly give an AttributeError when someone tries to access the find_package_portion method. Regards, Martin From ncoghlan at gmail.com Tue May 8 10:04:40 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 8 May 2012 18:04:40 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA8AAA1.6010202@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> Message-ID: On Tue, May 8, 2012 at 3:09 PM, Eric V. Smith wrote: > On 05/07/2012 11:57 PM, Eric V. 
Smith wrote: > >> I think I'll modify the code in the pep-420 branch with find_package(), >> keeping find_module() unmodified from the version in the 3.3 branch. >> Assuming that works out, I'll modify the PEP and point it to this >> discussion. > > I've checked the find_package() version in to the pep-420 branch. I'm > still not wild about the name, but I think leaving find_module() > unmodified is an improvement. After some discussion and a possible name > change, I'll update PEP 420. Yes, I'd forgotten that we need to check the rest of the path for ordinary modules as well as self-contained packages after finding the initial directory, so my proposed adjustment doesn't actually save any complexity and my proposed method name is simply wrong. What we're really talking about now is a wholesale replacement for find_module with the new semantics (i.e. allowing a string to be returned for namespace package paths), rather than PEP 382's extra processing step (where find_module would still be called, even if find_package_portion was defined). So, let's call the replacement finder method "find_loader", since that's really what it's for (find_module was always a bit of a misnomer, since the method returns a loader object, not a module). This is where I start to have sympathy for Antoine's point of view: we're overloading the meaning of returning a string from find_loader() to say two things: 1. Here's my sys.path entry 2. 
Please continue scanning sys.path for additional entries

Consider the following possible signature that avoids that overloading by passing in a separate callback for the "found a directory" aspect:

    def find_loader(fullname, dir_found=None):
        # dir_found is a callback that gets invoked if a matching directory is found
        if dir_found is None:
            def dir_found(dirpath):
                msg = "Not importing directory {}: missing __init__"
                _warnings.warn(msg.format(dirpath), ImportWarning)
        # As per existing find_module, but calls dir_found(base_path) instead
        # of emitting ImportWarning directly

This would then be used roughly as follows:

    package_path = []
    for importer in _iter_importers(path):
        try:
            find_loader = importer.find_loader
        except AttributeError:
            # Backwards compatibility with the original PEP 302 finder API
            loader = importer.find_module(fullname)
        else:
            loader = importer.find_loader(fullname, package_path.append)
        if loader is not None:
            return loader
    if package_path:
        return NamespaceLoader(package_path)

And the delegation from find_module would look like:

    def find_module(self, fullname):
        """Try to find a loader for the specified module."""
        return self.find_loader(fullname)

Advantages of this approach:
- cleanly separates "here's my sys.path entry" (dir_found callback) and "please continue scanning sys.path" (None return value)
- obvious and accurate name for the new method (i.e. "find_loader")
- provides additional flexibility to import hook consumers
- trivial to reproduce old API behaviour with the new API if desired

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eric at trueblade.com Tue May 8 12:40:18 2012
From: eric at trueblade.com (Eric V.
Smith) Date: Tue, 08 May 2012 06:40:18 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> Message-ID: <4FA8F812.9000308@trueblade.com> On 05/08/2012 04:04 AM, Nick Coghlan wrote: > What we're really talking about now is a wholesale replacement for > find_module with the new semantics (i.e. allowing a string to be > returned for namespace package paths), rather than PEP 382's extra > processing step (where find_module would still be called, even if > find_package_portion was defined). > > So, let's call the replacement finder method "find_loader", since > that's really what it's for (find_module was always a bit of a > misnomer, since the method returns a loader object, not a module). Right. I'll call find_loader if it exists, else fall back to find_module. Legacy finders will still work with PathFinder, they'll just be unable to provide namespace package portions. And just as important, modified finders will be able to be used with legacy code that calls find_module. > This is where I start to have sympathy for Antoine's point of view: > we're overloading the meaning of returning a string from find_loader() > to say two things: > 1. Here's my sys.path entry > 2. Please continue scanning sys.path for additional entries That's correct on the two values to return. I'm just not very sympathetic for a more complex signature when returning a string will do, instead. 
>     if dir_found is None:
>         def dir_found(dirpath):
>             msg = "Not importing directory {}: missing __init__"
>             _warnings.warn(msg.format(dirpath), ImportWarning)

In this case (supporting the now-legacy find_module API), you need to make sure that find_loader returns None. So what you'd really want to do is not define dir_found, and later in find_loader, where you've found an __init__-less directory, say:

    if dir_found is None:
        msg = "Not importing directory {}: missing __init__"
        _warnings.warn(msg.format(dirpath), ImportWarning)
        return None
    dir_found(dirpath)
    return None

> Advantages of this approach:
> - cleanly separates "here's my sys.path entry" (dir_found callback)
> and "please continue scanning sys.path" (None return value)
> - obvious and accurate name for the new method (i.e. "find_loader")
> - provides additional flexibility to import hook consumers
> - trivial to reproduce old API behaviour with the new API if desired

I'm okay with the name. The callback just seems like a hassle, especially if the finder is written in C. I don't have a problem specifying the new API as returning a loader, a string, or None.

The only concern I have with returning a string is that it might be needed in the future for some other purpose. And I don't see any future-proofing benefit to the callback that isn't provided by just returning a string.

I don't see the extra flexibility you mention. In my scheme, find_module would become:

    def find_module(self, fullname):
        result = self.find_loader(fullname)
        if isinstance(result, str):
            msg = "Not importing directory {}: missing __init__"
            _warnings.warn(msg.format(result), ImportWarning)
            result = None
        return result

So I'm unconvinced the callback buys anything.

I'm going to change the PEP to specify find_loader(name) as returning a loader, a string, or None. And since this only makes sense for a path loader (not a meta-path loader), it won't have the optional path argument.
Sure, the isinstance call is slightly ugly, but I don't see any downside outside of aesthetics.

Eric.

From ncoghlan at gmail.com Tue May 8 14:36:43 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 8 May 2012 22:36:43 +1000
Subject: [Import-SIG] PEP 420 issue: extend_path
In-Reply-To: <4FA8F812.9000308@trueblade.com>
References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com>
Message-ID:

On Tue, May 8, 2012 at 8:40 PM, Eric V. Smith wrote:
> In this case (supporting the now-legacy find_module API), you need to
> make sure that find_loader returns None. So what you'd really want to do
> is not define dir_found, and later in find_loader, where you've found an
> __init__-less directory, say:
>
>     if dir_found is None:
>         msg = "Not importing directory {}: missing __init__"
>         _warnings.warn(msg.format(dirpath), ImportWarning)
>         return None
>     dir_found(dirpath)
>     return None

No, the idea is to make the two activities (identifying package portions and deciding whether or not to continue scanning sys.path) *independent*.

I think "dir_found" is actually the wrong name for the proposed callback. I would instead call it "portion_found". Also, since this is an API for third parties to implement, I think the default behaviour if no callback is specified should be to silently ignore discovered portions - it would be up to FileFinder.find_module to pass in the callback with the current ImportWarning behaviour.

Suppose I want to implement a loader where the main path entry is actually just a reference to a separately configured path definition (e.g.
to an application configuration file with an extra set of paths to check for Python modules). With a callback API, I can implement that directly, since I would be able to just pass the received "portion_found" callback down while scanning the subpath with the usual sys.path_hooks entries. It doesn't matter if that callback is called zero, one or many times - it will still do the right thing.

Even if the subscan finds several portions before discovering a loader, it will *still* do the right thing - the fact we end up returning a loader instead of None would override the fact that we previously called "portion_found".

With the current implementation, there's no option to return *multiple* path segments - loaders are restricted to returning at most one portion to add to the namespace package.

I think Antoine's right - having to introspect the return type from the method call is a major code smell, and I think it's a sign we're asking one return value to serve too many different purposes.

> Sure, the isinstance call is slightly ugly, but I don't see any downside
> outside of aesthetics.

How does a loader request that *multiple* entries be added to the namespace package path? With the callback API, it can just invoke the callback multiple times. Introspection on the return type would instead need another special case.

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From eric at trueblade.com Tue May 8 16:07:32 2012
From: eric at trueblade.com (Eric V.
Smith) Date: Tue, 08 May 2012 10:07:32 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> Message-ID: <4FA928A4.4090408@trueblade.com> On 05/08/2012 08:36 AM, Nick Coghlan wrote: > No, the idea is to make the two activities (identifying package > portions and deciding whether or not to continue scanning sys.path) > *independent*. I'm all for this. > Suppose I want to implement a loader where the main path entry is > actually just a reference to a separately configured path definition > (e.g. to an application configuration file with an extra set of paths > to check for Python modules). With a callback API, I can implement > that directly, since I would be able to just pass the received > "portion_found" callback down while scanning the subpath with the > usual sys.path_hooks entries. It doesn't matter if that callback is > called zero, one or many times - it will still do the right thing. > > Even if the subscan finds several portions before discovering a > loader, it will *still* do the right thing - the fact we end up > returning return a loader instead of None would override the fact that > we previously called "portion_found". > > With the current implementation, there's no option to return > *multiple* path segments - loaders are restricted to returning at most > one portion to add to the namespace package. So have it return a list of strings instead of a single string. 
> I think Antoine's right - having to introspect the return type from > the method call is a major code smell, and I think it's a sign we're > asking one return value to serve too many different purposes. I don't disagree with this. But we've got a function that we're asking to return one of 2 things, as you say. How is this normally handled? I would not use a callback. I'd return a tuple with the two things: (loader, list_of_portions). That seems way more straightforward. Eric. From pje at telecommunity.com Tue May 8 18:09:07 2012 From: pje at telecommunity.com (PJ Eby) Date: Tue, 8 May 2012 12:09:07 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA928A4.4090408@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> Message-ID: On Tue, May 8, 2012 at 10:07 AM, Eric V. Smith wrote: > I don't disagree with this. But we've got a function that we're asking > to return one of 2 things, as you say. How is this normally handled? I > would not use a callback. I'd return a tuple with the two things: > (loader, list_of_portions). That seems way more straightforward. > +1. It's also easy to implement. I'm not sure why we *need* a list of portions, but if we do, simple return values seem like the way to go. But the 2-element tuple wins even in the single path portion case, and the tuple-return protoocol is extensible if we need more data returned in future anyway. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From barry at python.org Tue May 8 18:18:02 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 8 May 2012 09:18:02 -0700 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA928A4.4090408@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> Message-ID: <20120508091802.7fbcb71b@resist.wooz.org> On May 08, 2012, at 10:07 AM, Eric V. Smith wrote: >I don't disagree with this. But we've got a function that we're asking >to return one of 2 things, as you say. How is this normally handled? I >would not use a callback. I'd return a tuple with the two things: >(loader, list_of_portions). That seems way more straightforward. +1 -Barry From eric at trueblade.com Tue May 8 20:08:39 2012 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 08 May 2012 14:08:39 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> Message-ID: <4FA96127.1080103@trueblade.com> On 5/8/2012 12:09 PM, PJ Eby wrote: > On Tue, May 8, 2012 at 10:07 AM, Eric V. Smith > wrote: > > I don't disagree with this. But we've got a function that we're asking > to return one of 2 things, as you say. How is this normally handled? I > would not use a callback. 
I'd return a tuple with the two things: > (loader, list_of_portions). That seems way more straightforward. > > > +1. It's also easy to implement. > > I'm not sure why we *need* a list of portions, but if we do, simple > return values seem like the way to go. But the 2-element tuple wins > even in the single path portion case, and the tuple-return protoocol is > extensible if we need more data returned in future anyway. Nick laid out a use case in a previous email. It makes sense to me. For example, a zip file could contain multiple portions from the same namespace package. You'd need a new path hook or mods to zipimport, but it's conceivable. Eric. From eric at trueblade.com Wed May 9 02:16:22 2012 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 08 May 2012 20:16:22 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA928A4.4090408@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> Message-ID: <4FA9B756.1080202@trueblade.com> On 05/08/2012 10:07 AM, Eric V. Smith wrote: > I don't disagree with this. But we've got a function that we're asking > to return one of 2 things, as you say. How is this normally handled? I > would not use a callback. I'd return a tuple with the two things: > (loader, list_of_portions). That seems way more straightforward. In the pep-420 branch I've checked in code where find_loader() returns (loader, list_of_portions). I've implemented it for the FileFinder and zipimport. loader can be None or a loader object. list_of_portions can be an empty list, or a list of strings. To indicate "no loader or portions found", return (None, []). 
If loader is not None, list_of_portions is ignored. I'm pretty happy with this API. Comments welcome. Eric. From barry at python.org Wed May 9 02:23:14 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 8 May 2012 17:23:14 -0700 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA9B756.1080202@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA9B756.1080202@trueblade.com> Message-ID: <20120508172314.0a8006a0@resist> On May 08, 2012, at 08:16 PM, Eric V. Smith wrote: >In the pep-420 branch I've checked in code where find_loader() returns >(loader, list_of_portions). I've implemented it for the FileFinder and >zipimport. > >loader can be None or a loader object. >list_of_portions can be an empty list, or a list of strings. > >To indicate "no loader or portions found", return (None, []). > >If loader is not None, list_of_portions is ignored. > >I'm pretty happy with this API. Comments welcome. Me too. Really nicely done. 
-Barry From ncoghlan at gmail.com Wed May 9 03:01:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 May 2012 11:01:30 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA9B756.1080202@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA9B756.1080202@trueblade.com> Message-ID: On Wed, May 9, 2012 at 10:16 AM, Eric V. Smith wrote: > In the pep-420 branch I've checked in code where find_loader() returns > (loader, list_of_portions). I've implemented it for the FileFinder and > zipimport. > > loader can be None or a loader object. > list_of_portions can be an empty list, or a list of strings. > > To indicate "no loader or portions found", return (None, []). > > If loader is not None, list_of_portions is ignored. > > I'm pretty happy with this API. Comments welcome. Yeah, I think that works well and covers all the even vaguely reasonable cases. Besides, any desire for truly exotic import behaviour can always be handled via sys.meta_path. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From martin at v.loewis.de Wed May 9 08:48:24 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 09 May 2012 08:48:24 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FA96127.1080103@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> Message-ID: <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> >> I'm not sure why we *need* a list of portions, but if we do, simple >> return values seem like the way to go. But the 2-element tuple wins >> even in the single path portion case, and the tuple-return protoocol is >> extensible if we need more data returned in future anyway. > > Nick laid out a use case in a previous email. It makes sense to me. For > example, a zip file could contain multiple portions from the same > namespace package. You'd need a new path hook or mods to zipimport, but > it's conceivable. I must have missed Nick's message where he explained it, so I still need to ask again: how exactly would such a zip file be structured? I fail to see the need to ever report both a loader and a portion, as well as the need to report multiple portions, for a single sys.path item. That sounds like an unnecessary complication. 
Regards, Martin From ncoghlan at gmail.com Wed May 9 10:19:20 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 May 2012 18:19:20 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> Message-ID: On Wed, May 9, 2012 at 4:48 PM, wrote: >>> I'm not sure why we *need* a list of portions, but if we do, simple >>> return values seem like the way to go. ?But the 2-element tuple wins >>> even in the single path portion case, and the tuple-return protoocol is >>> extensible if we need more data returned in future anyway. >> >> >> Nick laid out a use case in a previous email. It makes sense to me. For >> example, a zip file could contain multiple portions from the same >> namespace package. You'd need a new path hook or mods to zipimport, but >> it's conceivable. > > > I must have missed Nick's message where he explained it, so I still need > to ask again: how exactly would such a zip file be structured? > > I fail to see the need to ever report both a loader and a portion, > as well as the need to report multiple portions, for a single sys.path > item. That sounds like an unnecessary complication. My actual objection is the same as Antoine's: that needing to introspect the result of find_loader() to handle the PEP 420 use case is a code smell that suggests the API design is flawed. The problem I had with it was that find_loader() needs to report on 3 different scenarios: 1. 
I am providing a loader to fully load this module, stop scanning the path hooks 2. I am contributing to a potential namespace package, keep scanning the path hooks 3. I have nothing to provide for that name, keep scanning the path hooks. Using the type of the return value (or whether or not it has a "load_module" attribute) to decide between scenario 1 and 2 just feels wrong. My proposed alternative was to treat the "portion_found" event as a callback rather than as something to be handled via the return value. Then loaders would be free to report as many portions as they wished, with the final "continue scanning or not" decision handled via the existing "loader or None" semantics. The example I happened to use to illustrate the difference was one where a loader actually internally implements its *own* path scan of multiple locations. I wasn't specifically thinking of zipfiles, but you could certainly use it that way. The core concept was that a single entry on the main path would be handed off to a finder that actually knew about *multiple* code locations, and hence may want to report multiple path portions. The 3 scenarios above would then correspond to: 1. Loader was returned (doesn't matter if callback was invoked) 2. None was returned, callback was invoked one or more times 3. None was returned, callback was never invoked Eric's counter-proposal is to handle the 3 scenarios as: 1. (<loader>, <ignored>) 2. (None, [<path entries>]) 3. (None, []) Yet another option would be to pass a namespace_path list object directly into the find_loader() call, instead of passing namespace_path.append as a callback. Then the loader would append any portions it finds directly to the list, with the return value again left as the simple choice between a loader or None. One final option would be to add an optional "extend_namespace" method to *loader* objects.
Then the logic would become, instead of type introspection, more like the following:

    loader = find_loader(fullpath)
    try:
        extend_namespace = loader.extend_namespace
    except AttributeError:
        pass
    else:
        if extend_namespace(namespace_path):
            # The loader contributed to the namespace package rather
            # than loading the full module
            continue
    if loader is not None:
        return loader

It's definitely the switch-statement feel of the proposed type checks that rubs me the wrong way, though. Supporting multiple portions from a single loader was just the most straightforward example I could think of for a limitation imposed by that mechanism. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Wed May 9 14:58:23 2012 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 09 May 2012 08:58:23 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> Message-ID: <4FAA69EF.6040708@trueblade.com> On 05/09/2012 02:48 AM, martin at v.loewis.de wrote: > I must have missed Nick's message where he explained it, so I still need > to ask again: how exactly would such a zip file be structured? I'll work on such an example. But I think Nick's example of a config file that has the configuration for multiple portions of a single namespace package is more compelling. I don't see where returning a list for the common case of a single portion is a large burden.
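As a rough illustration of the config-file use case mentioned above, a single finder might report several portions for one namespace package through the (loader, list_of_portions) return value. This is only a sketch: ConfigFinder, its config mapping, and the /opt paths are hypothetical and are not taken from the pep-420 branch.

```python
class ConfigFinder:
    """Hypothetical finder driven by a mapping of package name -> portion dirs."""

    def __init__(self, config):
        # config is assumed to look like {'ns': ['/opt/a/ns', '/opt/b/ns']}
        self.config = config

    def find_loader(self, fullname):
        # This finder never loads a module itself, so the loader is always
        # None; it only contributes zero or more namespace portions.
        return None, list(self.config.get(fullname, []))


finder = ConfigFinder({'ns': ['/opt/a/ns', '/opt/b/ns']})
loader, portions = finder.find_loader('ns')
```

With a string-or-loader return value the same finder could report at most one of those two directories per call, which is the limitation the list is meant to remove.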
> I fail to see the need to ever report both a loader and a portion, > as well as the need to report multiple portions, for a single sys.path > item. That sounds like an unnecessary complication. As Nick said, you'd return a loader, a list of portions, or neither. it would be an error to return both. I'm mildly sympathetic to not wanting to inspect either the type or attributes of the returned value to figure out which is being returned. A callback to specify the portions seem needlessly complex and a hassle for a C implementation. My compromise is to return a tuple. I don't think a tuple is much of a burden. It's not like writing finders which support namespace portions will be a common activity. From brett at python.org Wed May 9 16:33:32 2012 From: brett at python.org (Brett Cannon) Date: Wed, 9 May 2012 10:33:32 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> Message-ID: On Wed, May 9, 2012 at 4:19 AM, Nick Coghlan wrote: > On Wed, May 9, 2012 at 4:48 PM, wrote: > >>> I'm not sure why we *need* a list of portions, but if we do, simple > >>> return values seem like the way to go. But the 2-element tuple wins > >>> even in the single path portion case, and the tuple-return protoocol is > >>> extensible if we need more data returned in future anyway. > >> > >> > >> Nick laid out a use case in a previous email. It makes sense to me. For > >> example, a zip file could contain multiple portions from the same > >> namespace package. 
You'd need a new path hook or mods to zipimport, but > >> it's conceivable. > > > > > > I must have missed Nick's message where he explained it, so I still need > > to ask again: how exactly would such a zip file be structured? > > > > I fail to see the need to ever report both a loader and a portion, > > as well as the need to report multiple portions, for a single sys.path > > item. That sounds like an unnecessary complication. > > My actual objection is the same as Antoine's: that needing to > introspect the result of find_loader() to handle the PEP 420 use case > is a code smell that suggests the API design is flawed. The problem I > had with it was that find_loader() needs to report on 3 different > scenarios: > > 1. I am providing a loader to fully load this module, stop scanning > the path hooks > 2. I am contributing to a potential namespace package, keep scanning > the path hooks > 3. I have nothing to provide for that name, keep scanning the path hooks. > > Using the type of the return value (or whether or not it has a > "load_module" attribute) to decide between scenario 1 and 2 just feels > wrong. > > My proposed alternative was to treat the "portion_found" event as a > callback rather than as something to be handled via the return value. > Then loaders would be free to report as many portions as they wished, > with the final "continue scanning or not" decision handled via the > existing "loader or None" semantics. > > The example I happened to use to illustrate the difference was one > where a loader actually internally implements its *own* path scan of > multiple locations. I wasn't specifically thinking of zipfiles, but > you could certainly use it that way. The core concept was that a > single entry on the main path would be handed off to a finder that > actually knew about *multiple* code locations, and hence may want to > report multiple path portions. > > The 3 scenarios above would then correspond to: > > 1. 
Loader was returned (doesn't matter if callback was invoked) > 2. None was returned, callback was invoked one or more times > 2. None was returned, callback was never invoked > > Eric's counter-proposal is to handle the 3 scenarios as: > > 1. (, ) > 2. (None, []) > 3. (None, []) > > Yet another option would be to pass a namespace_path list object > directly into the find_loader() call, instead of passing > namespace_path.append as a callback. Then the loader would append any > portions it finds directly to the list, with the return value again > left as the simple choice between a loader or None. > IOW a path accumulator where a loader can say "I can't find anything, but these came *damn* close to working". Seems reasonable to me. > > One final option would be add an optional "extend_namespace" method to > *loader* objects. Then the logic would become, instead of type > introspection, more like the following: > > loader = find_loader(fullpath) > try: > extend_namespace = loader.extend_namespace > except AttributeError: > pass > else: > if extend_namespace(namespace_path): > # The loader contributed to the namespace package rather > than loading the full module > continue > if loader is not None: > return loader > How does this avoid an unnecessary stat call on the directory (or in this case namespace_path)? One of the reasons to have the detection of a namespace directory in the finder was to avoid doing an extra stat call for something the finder already noticed. You can obviously cache, but even then you will need to do a stat call to verify the cache is not out of date (unless we explicitly state that the extend_namespace() call is *only* made immediately after find_loader/find_module and so the possible race condition is small enough to ignore). -Brett > > It's definitely the switch-statement feel of the proposed type checks > that rubs me the wrong way, though. 
Supporting multiple portions from > a single loader was just the most straightforward example I could > think of a limitation imposed by that mechanism. > > Regards, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Wed May 9 18:14:29 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 9 May 2012 09:14:29 -0700 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> Message-ID: <20120509091429.776ae7bd@resist> On May 09, 2012, at 06:19 PM, Nick Coghlan wrote: >Eric's counter-proposal is to handle the 3 scenarios as: > >1. (, ) >2. (None, []) >3. (None, []) This seems quite Pythonic to me, and even convenient. With this API, you know that you're getting a 2-tuple, so you don't have to check the length of the return type to know how to unpack it. Then, you only need to check the first element to know whether you've got a loader or not. This also seems like a straightforward elaboration of the older find_module() API (i.e. the first element's return values are exactly like the single return value of find_module()). Take a look at the PathFinder.find_module() implementation to see how clear and concise this API is.
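To make the three cases concrete, here is a minimal sketch of a directory-based finder that follows the 2-tuple convention discussed in this thread. ToyFinder, ToyLoader, and demo() are hypothetical names used for illustration; this is not the FileFinder code from the pep-420 branch, and it ignores details such as extension modules and bytecode files.

```python
import os
import tempfile


class ToyLoader:
    """Hypothetical stand-in for a real loader object."""

    def __init__(self, path):
        self.path = path


class ToyFinder:
    """Hypothetical finder for one directory, returning (loader, portions)."""

    def __init__(self, directory):
        self.directory = directory

    def find_loader(self, fullname):
        tail = fullname.rpartition('.')[2]
        candidate = os.path.join(self.directory, tail)
        init_py = os.path.join(candidate, '__init__.py')
        if os.path.isfile(init_py):
            # Case 1: a regular package, so a loader is returned and the
            # second element is ignored by the caller.
            return ToyLoader(init_py), []
        module_py = candidate + '.py'
        if os.path.isfile(module_py):
            # Case 1 again: a plain module.
            return ToyLoader(module_py), []
        if os.path.isdir(candidate):
            # Case 2: a directory without __init__.py is a possible
            # namespace package portion.
            return None, [candidate]
        # Case 3: nothing by that name under this path entry.
        return None, []


def demo():
    """Build a throwaway tree and summarise what the finder reports."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, 'ns'))
        os.mkdir(os.path.join(root, 'pkg'))
        open(os.path.join(root, 'pkg', '__init__.py'), 'w').close()
        finder = ToyFinder(root)
        results = {}
        for name in ('pkg', 'ns', 'missing'):
            # Always a 2-tuple, so unpacking never needs a length check.
            loader, portions = finder.find_loader(name)
            results[name] = (loader is not None, len(portions))
        return results
```

The caller only has to test the first element against None to decide whether to stop scanning; the second element matters only when no loader was found.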
I'd like to relax the formal specification just a bit though, so that the second element is a sequence, not necessarily a concrete list. Cheers, -Barry From martin at v.loewis.de Wed May 9 18:42:49 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Wed, 09 May 2012 18:42:49 +0200 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAA69EF.6040708@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507125346.2ad355ed@pitrou.net> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAA69EF.6040708@trueblade.com> Message-ID: <20120509184249.Horde.FbBmBdjz9kRPqp6JLNYnMjA@webmail.df.eu> >> I must have missed Nick's message where he explained it, so I still need >> to ask again: how exactly would such a zip file be structured? > > I'll work on such an example. But I think Nick's example of a config > file that has the configuration for multiple portions of a single > namespace package is more compelling. I don't see where returning a list > for the common case of a single portion is a large burden. Unfortunately, I still can't locate the message where he explains the example. This sounds like a bit like .pth files, where the file contains the list of locations to be added to the path. It was supported in an early version of PEP 382, when the community declared YAGNI. > As Nick said, you'd return a loader, a list of portions, or neither. it > would be an error to return both. Ah, ok. This is fine, then (except that I still think that returning multiple portions needs to be supported). 
Regards, Martin From eric at trueblade.com Wed May 9 19:36:31 2012 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 09 May 2012 13:36:31 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509184249.Horde.FbBmBdjz9kRPqp6JLNYnMjA@webmail.df.eu> References: <4FA789F7.7080609@v.loewis.de> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAA69EF.6040708@trueblade.com> <20120509184249.Horde.FbBmBdjz9kRPqp6JLNYnMjA@webmail.df.eu> Message-ID: <4FAAAB1F.6020507@trueblade.com> On 5/9/2012 12:42 PM, martin at v.loewis.de wrote: >>> I must have missed Nick's message where he explained it, so I still need >>> to ask again: how exactly would such a zip file be structured? >> >> I'll work on such an example. But I think Nick's example of a config >> file that has the configuration for multiple portions of a single >> namespace package is more compelling. I don't see where returning a list >> for the common case of a single portion is a large burden. > > Unfortunately, I still can't locate the message where he explains the > example. This sounds like a bit like .pth files, where the file contains > the list of locations to be added to the path. It was supported in an > early version of PEP 382, when the community declared YAGNI. I think that's this message: http://mail.python.org/pipermail/import-sig/2012-May/000585.html >> As Nick said, you'd return a loader, a list of portions, or neither. it >> would be an error to return both. > > Ah, ok. This is fine, then (except that I still think that returning > multiple > portions needs to be supported). 
I thought you were arguing that multiple portions per finder call didn't need to be supported. Maybe I misunderstand. Eric. From eric at trueblade.com Thu May 10 01:46:28 2012 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 09 May 2012 19:46:28 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509091429.776ae7bd@resist> References: <4FA789F7.7080609@v.loewis.de> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <20120509091429.776ae7bd@resis t> Message-ID: <4FAB01D4.6020303@trueblade.com> On 5/9/2012 12:14 PM, Barry Warsaw wrote: > On May 09, 2012, at 06:19 PM, Nick Coghlan wrote: > >> Eric's counter-proposal is to handle the 3 scenarios as: >> >> 1. (, ) >> 2. (None, []) >> 3. (None, []) > ... > I'd like to relax the formal specification just a bit though, so that the > second element is a sequence, not necessarily a concrete list. I was going to say just use whatever list.extend() is documented to accept, but I notice that's "list" [1]. I would assume it can really be any iterable. But since len() is called on it (in case 3, above), I guess "sequence of strings" is the best description. But if case 3 were changed to (None, None), then I wouldn't need to call len(), and it could be any iterable returning strings in case 2. What are your thoughts on making case 3 (None, None)? I sort of like it. For case 1, I'm currently returning "" part as an empty string. Should I document it as that, or as really ""? I don't have an opinion on this. Eric. 
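The difference between "sequence" and "any iterable" can be shown with a few lines of Python. This is a small sketch for illustration; the generator and the /site-a and /site-b paths are made up, not taken from the branch.

```python
def portions():
    """Hypothetical finder result: a generator of path strings."""
    yield '/site-a/ns'
    yield '/site-b/ns'


namespace_path = []
# list.extend() accepts any iterable, including a generator.
namespace_path.extend(portions())

try:
    # A generator has no length, so a len() check on the returned value
    # would force finders to return a real sequence.
    len(portions())
    has_len = True
except TypeError:
    has_len = False
```

So if the import machinery only ever extends its own list and never calls len() on what the finder returned, the requirement can be loosened from "sequence of strings" to "iterable of strings".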
[1]: http://docs.python.org/tutorial/datastructures.html#more-on-lists From barry at python.org Thu May 10 02:23:11 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 9 May 2012 17:23:11 -0700 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAB01D4.6020303@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <20120509091429.776ae7bd@resis t> <4FAB01D4.6020303@trueblade.com> Message-ID: <20120509172311.53e50515@rivendell> On May 09, 2012, at 07:46 PM, Eric V. Smith wrote: >On 5/9/2012 12:14 PM, Barry Warsaw wrote: >> On May 09, 2012, at 06:19 PM, Nick Coghlan wrote: >> >>> Eric's counter-proposal is to handle the 3 scenarios as: >>> >>> 1. (, ) >>> 2. (None, []) >>> 3. (None, []) >> > >... > >> I'd like to relax the formal specification just a bit though, so that the >> second element is a sequence, not necessarily a concrete list. > >I was going to say just use whatever list.extend() is documented to >accept, but I notice that's "list" [1]. I would assume it can really be >any iterable. But since len() is called on it (in case 3, above), I >guess "sequence of strings" is the best description. The help is a little better: >>> help([].extend) Help on built-in function extend: extend(...) L.extend(iterable) -- extend list by appending elements from the iterable >But if case 3 were changed to (None, None), then I wouldn't need to call >len(), and it could be any iterable returning strings in case 2. What >are your thoughts on making case 3 (None, None)? I sort of like it. +1. So the formal spec would be that the second item can be "any iterable returning strings, or None". 
I suspect implementations that want to be maximally Postel would accept any false-ish value instead of exactly None. >For case 1, I'm currently returning "" part as an empty >string. Should I document it as that, or as really ""? I >don't have an opinion on this. I think it should be documented as 'ignored' when the first argument is not None. Cheers, -Barry From eric at trueblade.com Thu May 10 02:26:39 2012 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 09 May 2012 20:26:39 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509172311.53e50515@rivendell> References: <4FA789F7.7080609@v.loewis.de> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <20120509091429.776ae7bd@resis t> <4FAB01D4.6020303@trueblade.com> <20120509172311.53e50515@rivendel l> Message-ID: <4FAB0B3F.7020302@trueblade.com> On 05/09/2012 08:23 PM, Barry Warsaw wrote: >> But if case 3 were changed to (None, None), then I wouldn't need to call >> len(), and it could be any iterable returning strings in case 2. What >> are your thoughts on making case 3 (None, None)? I sort of like it. > > +1. So the formal spec would be that the second item can be "any iterable > returning strings, or None". I suspect implementations that want to be > maximally Postel would accept any false-ish value instead of exactly None. Well, see the python-ideas discussion of false datetime.time versus None! >> For case 1, I'm currently returning "" part as an empty >> string. Oops, I meant "empty list". > I think it should be documented as 'ignored' when the first argument is not > None. Okay. Eric. From eric at trueblade.com Thu May 10 02:19:42 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Wed, 09 May 2012 20:19:42 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAB01D4.6020303@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <20120509091429.776ae7bd@resis t> <4FAB01D4.6020303@trueblade.c om> Message-ID: <4FAB099E.90100@trueblade.com> On 05/09/2012 07:46 PM, Eric V. Smith wrote: > On 5/9/2012 12:14 PM, Barry Warsaw wrote: >> On May 09, 2012, at 06:19 PM, Nick Coghlan wrote: >> >>> Eric's counter-proposal is to handle the 3 scenarios as: >>> >>> 1. (, ) >>> 2. (None, []) >>> 3. (None, []) I modified the PEP to specify these as: 1. (loader, None) 2. (None, ) 3. (None, None) I'm willing to change this, but this is the direction I'm leaning. The code still reflects what Nick wrote above. If the PEP version sticks, I'll update the code. Eric. From barry at python.org Thu May 10 02:41:57 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 9 May 2012 17:41:57 -0700 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAB099E.90100@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <20120509091429.776ae7bd@resis t> <4FAB01D4.6020303@trueblade.c om> <4FAB099E.90100@trueblade.com> Message-ID: <20120509174157.62cd6f2e@rivendell> On May 09, 2012, at 08:19 PM, Eric V. 
Smith wrote: >I modified the PEP to specify these as: > >1. (loader, None) >2. (None, ) >3. (None, None) wfm. -Barry From ncoghlan at gmail.com Thu May 10 02:52:39 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 May 2012 10:52:39 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAB01D4.6020303@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <4FA7C793.4010501@trueblade.com> <20120507170021.76f1594f@pitrou.net> <4FA7F1B4.7070405@trueblade.com> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <20120509091429.776ae7bd@resist> <4FAB01D4.6020303@trueblade.com> Message-ID: On Thu, May 10, 2012 at 9:46 AM, Eric V. Smith wrote: > On 5/9/2012 12:14 PM, Barry Warsaw wrote: >> On May 09, 2012, at 06:19 PM, Nick Coghlan wrote: >> >>> Eric's counter-proposal is to handle the 3 scenarios as: >>> >>> 1. (, ) >>> 2. (None, []) >>> 3. (None, []) >> > > ... > >> I'd like to relax the formal specification just a bit though, so that the >> second element is a sequence, not necessarily a concrete list. > > I was going to say just use whatever list.extend() is documented to > accept, but I notice that's "list" [1]. I would assume it can really be > any iterable. But since len() is called on it (in case 3, above), I > guess "sequence of strings" is the best description. Why do we call len() specifically on the return value? Can't we just extend the namespace path unconditionally, and then call len() on *that* at the end to check if we found any namespace portions while iterating over the path? > But if case 3 were changed to (None, None), then I wouldn't need to call > len(), and it could be any iterable returning strings in case 2. 
What > are your thoughts on making case 3 (None, None)? I sort of like it. I'd personally prefer to make the requirement an "iterable of path entries", and adjust the outer algorithm so it doesn't call len() directly on the return value. PathFinder.find_module would become simply:

    @classmethod
    def find_module(cls, fullname, path=None):
        """Find the module on sys.path or 'path' based on sys.path_hooks and
        sys.path_importer_cache."""
        if path is None:
            path = sys.path
        # If this ends up being a namespace package, this is the
        # list of paths that will become its __path__
        namespace_path = []
        for entry in path:
            finder = cls._path_importer_cache(entry)
            if finder is not None:
                if hasattr(finder, 'find_loader'):
                    loader, portions = finder.find_loader(fullname)
                else:
                    loader = finder.find_module(fullname)
                    portions = []
                # As soon as we find a loader, we're done
                if loader is not None:
                    return loader
                # Otherwise, record the package portions (if any) and
                # continue scanning the path
                namespace_path.extend(portions)
        # Made it through the entire path without finding a loader
        if namespace_path:
            # We found at least one namespace directory. Return a loader
            # which can create the namespace package.
            return NamespaceLoader(namespace_path)
        # We got nuthin'
        return None

> For case 1, I'm currently returning "" part as an empty > string. Should I document it as that, or as really ""? I > don't have an opinion on this. If the first field is not None, we shouldn't even look at the second field. However, the return value should still follow whatever conventions are established for the second field (i.e. it should be either an empty iterable, or an iterable of path entries). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Thu May 10 02:55:57 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 May 2012 10:55:57 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509174157.62cd6f2e@rivendell> References: <4FA789F7.7080609@v.loewis.de> <20120507181940.3c4e6be8@pitrou.net> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> Message-ID: On Thu, May 10, 2012 at 10:41 AM, Barry Warsaw wrote: > On May 09, 2012, at 08:19 PM, Eric V. Smith wrote: > >>I modified the PEP to specify these as: >> >>1. (loader, None) >>2. (None, ) >>3. (None, None) > > wfm. I'd prefer to keep a consistent constraint of "iterable of path entries" for the second value, and allow people to return a non-empty iterable when they're returning a loader (since it will be ignored anyway). See my proposed revision to PathFinder.find_module in my other reply. I'm definitely *much* happier with the 2-tuple return format over the introspection based API, though :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From barry at python.org Thu May 10 04:22:04 2012 From: barry at python.org (Barry Warsaw) Date: Wed, 9 May 2012 19:22:04 -0700 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> Message-ID: <20120509192204.6445a32d@rivendell> On May 10, 2012, at 10:55 AM, Nick Coghlan wrote: >On Thu, May 10, 2012 at 10:41 AM, Barry Warsaw wrote: >> On May 09, 2012, at 08:19 PM, Eric V. Smith wrote: >> >>>I modified the PEP to specify these as: >>> >>>1. (loader, None) >>>2. (None, ) >>>3. (None, None) >> >> wfm. > >I'd prefer to keep a consistent constraint of "iterable of path >entries" for the second value, and allow people to return a non-empty >iterable when they're returning a loader (since it will be ignored >anyway). See my proposed revision to PathFinder.find_module in my >other reply. I don't think the implementation should constrain the specification. Rather, what makes the most sense to someone reading the PEP, or the future language reference? In that respect, I think it's better to define the second item as "ignored" or None when not-None is returned as the first element. Requiring the return of an empty sequence when the value is semantically ignored makes no sense. There's also a semantic difference between returning None and returning an empty sequence as the second element when the first element is None. 
In the matrix of return states, "(None, ())" means "I found some namespace portions, and the number of portions I found is zero" which is clearly nonsensical, and subtly different than "I found neither a normal package nor portions of a namespace package." So I still prefer the current wording of the PEP. >I'm definitely *much* happier with the 2-tuple return format over the >introspection based API, though :) Yay! :) -Barry From ncoghlan at gmail.com Thu May 10 05:31:15 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 May 2012 13:31:15 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509192204.6445a32d@rivendell> References: <4FA789F7.7080609@v.loewis.de> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> Message-ID: On Thu, May 10, 2012 at 12:22 PM, Barry Warsaw wrote: > On May 10, 2012, at 10:55 AM, Nick Coghlan wrote: >>I'd prefer to keep a consistent constraint of "iterable of path >>entries" for the second value, and allow people to return a non-empty >>iterable when they're returning a loader (since it will be ignored >>anyway). See my proposed revision to PathFinder.find_module in my >>other reply. > > I don't think the implementation should constrain the specification. Rather, > what makes the most sense to someone reading the PEP, or the future language > reference? 
I agree completely, but it's the decision to call len() directly on the returned value in PathFinder.find_module and thus unnecessarily constrain the return type where I see the implementation as driving the specification. That call is completely unnecessary. Remove it and call namespace_path.extend() unconditionally and the simple "iterable of path entries" definition works. By keeping it, the implementation is forcing the specification to tighten the requirement from "iterable of strings" to "sequence of strings". The specification should also take into account what's *easiest* for the API consumer. Forcing API users to check for None in the second argument is just obnoxious when the spec could instead say to return an empty iterable in this case. > In that respect, I think it's better to define the second item as "ignored" or > None when not-None is returned as the first element. Requiring the return of > an empty sequence when the value is semantically ignored makes no sense. We can still provide advice on what a well-behaved loader *should* do, even when it's not technically a requirement. However, I also really dislike conditional constraints on values - I believe it leads to much cleaner designs overall if the constraints on different elements are orthogonal. (You can't always achieve that, but when it's both possible and easy, as in this case, it's worth doing). > There's also a semantic difference between returning None and returning an > empty sequence as the second element when the first element is None. In the > matrix of return states, "(None, ())" means "I found some namespace portions, > and the number of portions I found is zero" which is clearly nonsensical, and > subtly different than "I found neither a normal package nor portions of a > namespace package." I think you're making up a distinction that doesn't exist. 
Both "None" and "()" (or any other empty container) would mean "no portions found" in practice, but the former requires an explicit check on the part of the API consumer, while the latter will be naturally ignored by ordinary iterable processing. Consider the old API, where the only return options were a loader, a string or None. Why introduce an arbitrary distinction between (None, ()) and (None, None), when we can simply declare the latter invalid behaviour on the finder's part? My proposal means there would only be two valid possible returns from find_loader: 1. (loader, ) 2. (None, ) In the first case, the iterable of path entries (which may be empty) is ignored. In the latter case, the iterable of path entries (which may be empty) is added to the prospective namespace package path. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Fri May 11 01:55:46 2012 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 10 May 2012 19:55:46 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <20120509192204.6445a32d@rivendell> References: <4FA789F7.7080609@v.loewis.de> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> Message-ID: <4FAC5582.3000103@trueblade.com> On 05/09/2012 10:22 PM, Barry Warsaw wrote: > On May 10, 2012, at 10:55 AM, Nick Coghlan wrote: > >> On Thu, May 10, 2012 at 10:41 AM, Barry Warsaw wrote: >>> On May 09, 2012, at 08:19 PM, Eric V. Smith wrote: >>> >>>> I modified the PEP to specify these as: >>>> >>>> 1. (loader, None) >>>> 2. (None, ) >>>> 3. (None, None) >>> >>> wfm. 
>> >> I'd prefer to keep a consistent constraint of "iterable of path >> entries" for the second value, and allow people to return a non-empty >> iterable when they're returning a loader (since it will be ignored >> anyway). See my proposed revision to PathFinder.find_module in my >> other reply. > > I don't think the implementation should constrain the specification. Rather, > what makes the most sense to someone reading the PEP, or the future language > reference? > > In that respect, I think it's better to define the second item as "ignored" or > None when not-None is returned as the first element. Requiring the return of > an empty sequence when the value is semantically ignored makes no sense. > > There's also a semantic difference between returning None and returning an > empty sequence as the second element when the first element is None. In the > matrix of return states, "(None, ())" means "I found some namespace portions, > and the number of portions I found is zero" which is clearly nonsensical, and > subtly different than "I found neither a normal package nor portions of a > namespace package." > > So I still prefer the current wording of the PEP. I think trying to explain Nick's version is more complex than what's in the PEP. Sure, it's a generalization. But you're going to have to explain that if you discover you don't provide any portions, return a zero-length iterator. Both of the find_loader() methods I've written don't have an iterator around for the paths. It's "yes I have one, and I can compute the path and from that a list", or "no, I'm not part of a namespace". I think the cases where you'd have a zero-or-more iterator are extremely infrequent. Eric. 
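The either/or shape Eric describes for find_loader() can be sketched roughly as follows. This is only an illustration: SketchFinder and its placeholder loader strings are invented here and are not the FileFinder or zipimport code from the pep-420 branch. It also assumes the iterable-only convention Nick argues for, in which "no portion" is an empty list rather than None.

```python
import os


class SketchFinder:
    """Hypothetical finder for a single path entry (illustration only)."""

    def __init__(self, path_entry):
        self.path_entry = path_entry

    def find_loader(self, fullname):
        # Only the last component of the dotted name is looked up here.
        tail = fullname.rpartition(".")[2]
        base = os.path.join(self.path_entry, tail)
        init = os.path.join(base, "__init__.py")
        if os.path.isfile(init):
            # Regular package: a loader is returned, portions stay empty.
            # The string stands in for a real loader object.
            return "<placeholder loader for %s>" % init, []
        if os.path.isfile(base + ".py"):
            # Plain module: same shape, still no portions.
            return "<placeholder loader for %s.py>" % base, []
        if os.path.isdir(base):
            # Directory without __init__.py: a possible namespace portion.
            return None, [base]
        # Nothing found: no loader and an empty iterable of portions.
        return None, []
```

With this shape the finder never needs an accumulator; it builds at most a one-element list, and the caller can pass the second item straight to list.extend() without a None check.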
From ncoghlan at gmail.com Fri May 11 02:09:41 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 11 May 2012 10:09:41 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAC5582.3000103@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <20120508022756.18e83c18@pitrou.net> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> Message-ID: How is starting an accumulator and returning it unconditionally *more* complicated than "if you don't find any portions return None". "Why should I do that, what happens if I just return my empty container instead?" "Oh, that's the same as returning None." It makes no sense. We *have* to handle the empty iterator case regardless. Allowing None *as well* is just plain redundant. Regards, Nick. -- Sent from my phone, thus the relative brevity :) From eric at trueblade.com Fri May 11 02:31:12 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Thu, 10 May 2012 20:31:12 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> Message-ID: <4FAC5DD0.5040702@trueblade.com> On 05/10/2012 08:09 PM, Nick Coghlan wrote: > How is starting an accumulator and returning it unconditionally *more* > complicated than "if you don't find any portions return None". There's no doubt in my mind that it would make zipimport more complex to use an accumulator. > It makes no sense. We *have* to handle the empty iterator case > regardless. Allowing None *as well* is just plain redundant. Okay, I find that compelling. Eric. From ncoghlan at gmail.com Fri May 11 02:48:55 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 11 May 2012 10:48:55 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAC5DD0.5040702@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> Message-ID: On Fri, May 11, 2012 at 10:31 AM, Eric V. Smith wrote: > On 05/10/2012 08:09 PM, Nick Coghlan wrote: >> How is starting an accumulator and returning it unconditionally *more* >> complicated than "if you don't find any portions return None". 
>
> There's no doubt in my mind that it would make zipimport more complex to
> use an accumulator.

Yeah, I sent my reply before I had fully processed your second paragraph. For those either/or cases, I don't see much difference between:

    loader = portion = None
    # Algorithm that may set either loader or portion
    return loader, portion

And:

    loader = None
    portion = ()
    # Algorithm that may set either loader or portion
    return loader, portion

It's just a matter of using "()" for the portion component wherever you would have otherwise written "None".

>> It makes no sense. We *have* to handle the empty iterator case
>> regardless. Allowing None *as well* is just plain redundant.
>
> Okay, I find that compelling.

It took me a while to figure out exactly what was bugging me about the idea of allowing a None return, I think I finally got there with that paragraph :)

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From eric at trueblade.com Fri May 11 03:02:10 2012 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 10 May 2012 21:02:10 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAC5DD0.5040702@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> Message-ID: <4FAC6512.3070103@trueblade.com>

On 05/10/2012 08:31 PM, Eric V. Smith wrote:
> On 05/10/2012 08:09 PM, Nick Coghlan wrote:
>> It makes no sense. We *have* to handle the empty iterator case
>> regardless. Allowing None *as well* is just plain redundant.
>
> Okay, I find that compelling.

I've checked in the updated PEP.
The code in the pep-420 branch reflects the same logic. Feel free to wordsmith it. Eric. From ncoghlan at gmail.com Fri May 11 03:56:40 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 11 May 2012 11:56:40 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: <4FAC6512.3070103@trueblade.com> References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> <4FAC6512.3070103@trueblade.com> Message-ID: On Fri, May 11, 2012 at 11:02 AM, Eric V. Smith wrote: > Feel free to wordsmith it. Only minor quibbles: "load a regular package" -> "load a module or self-contained package" (when we get a loader, we don't know if it's for a module or a package, and I also prefer "self-contained package" over "regular package", since it's more future proof terminology) "will be a list containing only" -> "will contain only" Independent of PEP 420, something we should probably consider for the official documentation of the import system is making a clearer distinction between importers (on sys.meta_path) and finders (returned by sys.path_hooks entries) than was the case in PEP 302 (which calls them all finders). 

Importers:
- installed directly on sys.meta_path
- called via importer.find_module(fullname, path) (where path=None for a sys.path based import)

Finders:
- created by the path importer for individual path entries by traversing sys.path_hooks
- stored in sys.path_importer_cache
- called via finder.find_loader(fullname) (if defined), otherwise via finder.find_module(fullname)

Loaders:
- returned from finder.find_loader() (or finder.find_module())
- called via loader.load_module(fullname)

importlib is *mostly* consistent with the above scheme (with FrozenImporter, BuiltinImporter, FileFinder and the various *Loader classes), but PathFinder doesn't fit (it's a meta_path importer, but uses a "*Finder" name). We could always just change the name to PathImporter, keeping PathFinder around as a backwards compatibility alias.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com Fri May 11 04:44:00 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 10 May 2012 20:44:00 -0600 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> <4FAC6512.3070103@trueblade.com> Message-ID:

On Thu, May 10, 2012 at 7:56 PM, Nick Coghlan wrote:
> "will be a list containing only" -> "will contain only"
>
> Independent of PEP 420, something we should probably consider for the
> official documentation of the import system is making a clearer
> distinction between importers (on sys.meta_path) and finders (returned
> by sys.path_hooks entries) than was
the case in PEP 302 (which calls > them all finders). Yeah, it's a mess. From my perspective I'd label them in reverse of that: "finder" for sys.meta_path and "path importer" for sys.path_hooks. Finders are pretty well defined already and "importer" is unfortunately pretty overloaded already. In PEP 302, "importer" is the name of the finder/loader protocol and an object implementing *either* is an importer. The PEP also names the callables on sys.path_hooks as "importer factories" (they return finders). Another less desirable option is "path hook" (finder factory) and "meta path hook" (finder), but that implies the opposite precedence and adds yet another synonym for "finder". Regardless, +1 on better demarcating the distinction. -eric From brett at python.org Fri May 11 05:13:43 2012 From: brett at python.org (Brett Cannon) Date: Thu, 10 May 2012 23:13:43 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> <4FAC6512.3070103@trueblade.com> Message-ID: On Thu, May 10, 2012 at 9:56 PM, Nick Coghlan wrote: > On Fri, May 11, 2012 at 11:02 AM, Eric V. Smith > wrote: > > Feel free to wordsmith it. 
> > Only minor quibbles: > > "load a regular package" -> "load a module or self-contained package" > (when we get a loader, we don't know if it's for a module or a > package, and I also prefer "self-contained package" over "regular > package", since it's more future proof terminology) > > "will be a list containing only" -> "will contain only" > > Independent of PEP 420, something we should probably consider for the > official documentation of the import system is making a clearer > distinction between importers (on sys.meta_path) and finders (returned > by sys.path_hooks entries) than was the case in PEP 302 (which calls > them all finders). > > Importers: > - installed directly on sys.meta_path > - called via importer.find_module(fullname, path) (where path=None for > a sys.path based import) > > Finders: > - created by the path importer for individual path entries by > traversing sys.path_hooks > - stored in sys.path_importer_cache > - called via finder.find_loader(fullname) (if defined), otherwise via > finder.find_module(fullname) > > Loaders: > - returned from finder.find_loader() (or finder.find_module()) > - called via loader.load_module(fullname) > > importlib is *mostly* consistent with the above scheme (with > FrozenImporter, BuiltinImporter, FileFinder and the various *Loader > classes), but PathFinder doesn't fit (it's a meta_path importer, but > uses a "*Finder" name). We could always just change the name to > PathImporter, keeping PathFinder around as a backwards compatibility > alias. But PathFinder is not an importer as it is not a finder *and* a loader: http://docs.python.org/dev/glossary.html#term-importer . 
From ncoghlan at gmail.com Fri May 11 06:50:28 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 11 May 2012 14:50:28 +1000 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA883CE.80705@trueblade.com> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> <4FAC6512.3070103@trueblade.com> Message-ID: On Fri, May 11, 2012 at 1:13 PM, Brett Cannon wrote: > But PathFinder is not an importer as it is not a finder *and* a loader: > http://docs.python.org/dev/glossary.html#term-importer . In that case, I guess longer phrases like "meta path finder" and "path hook finder" will be needed in order to distinguish between the two kinds of finder. While they were always different (the latter didn't need to accept a path argument to find_module()), that difference becomes even more pronounced with PEP 420 (where only the latter can provide the new find_loader() method - the former must still provide find_module()). Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eric at trueblade.com Fri May 11 10:38:15 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 11 May 2012 04:38:15 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> <4FAC6512.3070103@trueblade.com> Message-ID: <4FACCFF7.7090305@trueblade.com> On 5/10/2012 9:56 PM, Nick Coghlan wrote: > On Fri, May 11, 2012 at 11:02 AM, Eric V. Smith wrote: >> Feel free to wordsmith it. > > Only minor quibbles: > > "load a regular package" -> "load a module or self-contained package" > (when we get a loader, we don't know if it's for a module or a > package, and I also prefer "self-contained package" over "regular > package", since it's more future proof terminology) The PEP in general always says packages, even when it means modules or packages. It also is really only talking about the sys.path aware meta-finder (aka importer). I'll change this instance of it, but it's really pervasive. Some of this is inherited from 382, but some of it is just me trying to be less verbose. > "will be a list containing only" -> "will contain only" Thanks. > Independent of PEP 420, something we should probably consider for the > official documentation of the import system is making a clearer > distinction between importers (on sys.meta_path) and finders (returned > by sys.path_hooks entries) than was the case in PEP 302 (which calls > them all finders). I'm hoping this all gets cleared up in the importer documentation. Eric. From eric at trueblade.com Fri May 11 14:54:41 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 11 May 2012 08:54:41 -0400 Subject: [Import-SIG] Dynamic path calculation in PEP 420 Message-ID: <4FAD0C11.3070107@trueblade.com> I think something along the lines of PJE's approach described in http://mail.python.org/pipermail/import-sig/2012-April/000473.html is doable. I think it's somewhat more complex because you need to call finders to do the path computation, plus there's a cache involved, etc. I'd be willing to consider include this in the PEP, but I've run out of time to prove it's feasible by implementing it in the features/pep-420 branch. If anyone else has time, that would be great. I'm hoping to ask that the PEP be approved (or not) by the end of the weekend. Eric. From eric at trueblade.com Fri May 11 14:58:19 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 11 May 2012 08:58:19 -0400 Subject: [Import-SIG] Dynamic path calculation in PEP 420 In-Reply-To: <4FAD0C11.3070107@trueblade.com> References: <4FAD0C11.3070107@trueblade.com> Message-ID: <4FAD0CEB.2010106@trueblade.com> > I'd be willing to consider include this in the PEP, but I've run out of ^ including I should proofread before, not after, I hit "send". Eric. From eric at trueblade.com Fri May 11 18:19:03 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 11 May 2012 12:19:03 -0400 Subject: [Import-SIG] Dynamic path calculation in PEP 420 In-Reply-To: <4FAD0C11.3070107@trueblade.com> References: <4FAD0C11.3070107@trueblade.com> Message-ID: <4FAD3BF7.8020101@trueblade.com> On 05/11/2012 08:54 AM, Eric V. Smith wrote: > I think something along the lines of PJE's approach described in > http://mail.python.org/pipermail/import-sig/2012-April/000473.html is > doable. I think it's somewhat more complex because you need to call > finders to do the path computation, plus there's a cache involved, etc. > > I'd be willing to consider include this in the PEP, but I've run out of > time to prove it's feasible by implementing it in the features/pep-420 > branch. 
If anyone else has time, that would be great. Never mind. I created some spare time and got it working. I'll update the PEP and the pep-420 branch sometime today. Eric. From eric at trueblade.com Fri May 11 18:48:07 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 11 May 2012 12:48:07 -0400 Subject: [Import-SIG] PEP 420 issue: extend_path In-Reply-To: References: <4FA789F7.7080609@v.loewis.de> <4FA899B9.9080904@trueblade.com> <4FA8AAA1.6010202@trueblade.com> <4FA8F812.9000308@trueblade.com> <4FA928A4.4090408@trueblade.com> <4FA96127.1080103@trueblade.com> <20120509084824.Horde.1XOmbML8999PqhM4pBazbuA@webmail.df.eu> <4FAB099E.90100@trueblade.com> <20120509174157.62cd6f2e@rivendell> <20120509192204.6445a32d@rivendell> <4FAC5582.3000103@trueblade.com> <4FAC5DD0.5040702@trueblade.com> <4FAC6512.3070103@trueblade.com> Message-ID: <4FAD42C7.7020605@trueblade.com> On 05/10/2012 09:56 PM, Nick Coghlan wrote: > On Fri, May 11, 2012 at 11:02 AM, Eric V. Smith wrote: >> Feel free to wordsmith it. > > Only minor quibbles: > > "load a regular package" -> "load a module or self-contained package" > (when we get a loader, we don't know if it's for a module or a > package, and I also prefer "self-contained package" over "regular > package", since it's more future proof terminology) For this particular edit, I stuck with "regular package". It's defined in the PEP, so I don't think it's confusing (or at least no more confusing than other instances of the term). I might consider a wholesale replacement in the PEP at a later date. > "will be a list containing only" -> "will contain only" Fixed. Thanks! Eric. From eric at trueblade.com Fri May 11 18:58:45 2012 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 11 May 2012 12:58:45 -0400 Subject: [Import-SIG] Dynamic path calculation in PEP 420 In-Reply-To: <4FAD3BF7.8020101@trueblade.com> References: <4FAD0C11.3070107@trueblade.com> <4FAD3BF7.8020101@trueblade.com> Message-ID: <4FAD4545.70304@trueblade.com> On 05/11/2012 12:19 PM, Eric V. Smith wrote: > On 05/11/2012 08:54 AM, Eric V. Smith wrote: >> I think something along the lines of PJE's approach described in >> http://mail.python.org/pipermail/import-sig/2012-April/000473.html is >> doable. I think it's somewhat more complex because you need to call >> finders to do the path computation, plus there's a cache involved, etc. >> >> I'd be willing to consider include this in the PEP, but I've run out of >> time to prove it's feasible by implementing it in the features/pep-420 >> branch. If anyone else has time, that would be great. > > Nevermind. I created some spare time and got it working. I'll update the > PEP and the pep-420 branch sometime today. I've updated the PEP and the pep-420 branch to support dynamic path computation. Eric. From eric at trueblade.com Fri May 11 20:20:18 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 11 May 2012 14:20:18 -0400 Subject: [Import-SIG] PEP 420 outstanding issues Message-ID: <4FAD5862.2060500@trueblade.com> The only thing I'm aware of is that I need to look at Martin's issue of how do we gradually migrate to PEP 420 namespace packages from the existing pkgutil and pkg_resources versions of namespace packages. I'll do that this weekend. Does anyone know of any other PEP issues? I know there are some outstanding implementation and testing issues, but I'm not so concerned about those before getting the PEP ruled on. Eric. 
From pje at telecommunity.com Fri May 11 21:20:37 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 11 May 2012 15:20:37 -0400 Subject: [Import-SIG] Dynamic path calculation in PEP 420 In-Reply-To: <4FAD4545.70304@trueblade.com> References: <4FAD0C11.3070107@trueblade.com> <4FAD3BF7.8020101@trueblade.com> <4FAD4545.70304@trueblade.com> Message-ID: On Fri, May 11, 2012 at 12:58 PM, Eric V. Smith wrote: > On 05/11/2012 12:19 PM, Eric V. Smith wrote: > > On 05/11/2012 08:54 AM, Eric V. Smith wrote: > >> I think something along the lines of PJE's approach described in > >> http://mail.python.org/pipermail/import-sig/2012-April/000473.html is > >> doable. I think it's somewhat more complex because you need to call > >> finders to do the path computation, plus there's a cache involved, etc. > >> > >> I'd be willing to consider include this in the PEP, but I've run out of > >> time to prove it's feasible by implementing it in the features/pep-420 > >> branch. If anyone else has time, that would be great. > > > > Nevermind. I created some spare time and got it working. I'll update the > > PEP and the pep-420 branch sometime today. > > I've updated the PEP and the pep-420 branch to support dynamic path > computation. > Awesome! This now seems a more-than-worthy replacement for PEP 402. One nit: the PEP says a namespace package "Has a __path__ attribute set to the list of directories that were found and recorded during the scan." but this should be changed to say it's set to an iterable or sequence of the strings returned by the finders. That is, it should explicitly point out that __path__ is NOT a list for namespace packages, and instead is an automatically-updated sequence. The actual section you have on that is fine, it's just this other bit that needs to be clear. Maybe a section called "Differences Between Namespace Packages And Regular Packages" summarizing all the differences (no __file__, read-only updated __path__, no __init__.py, etc.) would be a good idea? 
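The differences PJE lists can be observed directly on a current Python 3, where namespace packages as specified by PEP 420 are part of the standard import system. The sketch below is only a demonstration under assumptions: the package name nsdemo_pkg and the helper make_portion are made up, and the behaviour shown is that of released Python versions, not necessarily of the pep-420 branch as it stood during this thread.

```python
import importlib
import os
import sys
import tempfile


def make_portion(pkg_name, module_name):
    """Create a fresh sys.path entry holding one portion of pkg_name."""
    root = tempfile.mkdtemp()
    portion = os.path.join(root, pkg_name)
    os.mkdir(portion)  # deliberately no __init__.py
    with open(os.path.join(portion, module_name + ".py"), "w") as f:
        f.write("VALUE = %r\n" % module_name)
    return root


# Two portions of the same (made-up) package on two path entries.
sys.path.append(make_portion("nsdemo_pkg", "one"))
sys.path.append(make_portion("nsdemo_pkg", "two"))
importlib.invalidate_caches()

pkg = importlib.import_module("nsdemo_pkg")
path_is_list = isinstance(pkg.__path__, list)  # __path__ is a custom sequence
file_attr = getattr(pkg, "__file__", None)     # no file backs the package
portions_before = len(pkg.__path__)

# A portion that appears on sys.path after the import is picked up
# the next time __path__ is consulted.
sys.path.append(make_portion("nsdemo_pkg", "three"))
portions_after = len(pkg.__path__)
```

Comparing portions_before with portions_after shows the "automatically-updated" behaviour: the parent path (here sys.path) is re-examined whenever __path__ is used, so no explicit refresh call is needed.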
Also a suggestion: given the number of rationale/use-case questions that seem to keep coming up, how about adding a note somewhere telling people who want to know more about rationale, use cases, alternatives considered/rejected, etc. to check out PEP 402? Yes, I know it's already mentioned, but it's mentioned as rejected, when in fact the main difference between 402 and 420 is that we've nailed down a few things that were open-ended before, and threw out a couple of minor features/limitations (i.e. the parent .py feature and the only-importing-child packages limitation), so virtually all of the rationale and alternatives discussion applies the same to 420. (To be clear, this isn't about credit - I'd be just as happy (maybe more so) if you simply copy-pasted relevant bits to 420 and omitted any reference to 402. I just hate to see the same stuff getting rehashed over and over again... and when this is more widely publicized, it almost certainly WILL all come up again.) From pje at telecommunity.com Fri May 11 22:11:24 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 11 May 2012 16:11:24 -0400 Subject: [Import-SIG] PEP 420 outstanding issues In-Reply-To: <4FAD5862.2060500@trueblade.com> References: <4FAD5862.2060500@trueblade.com> Message-ID: On Fri, May 11, 2012 at 2:20 PM, Eric V. Smith wrote: > The only thing I'm aware of is that I need to look at Martin's issue of > how do we gradually migrate to PEP 420 namespace packages from the > existing pkgutil and pkg_resources versions of namespace packages. I'll > do that this weekend. > FWIW, setuptools and distribute (and maybe pip) already do the right thing when installing in "single version" mode, which is usually what the distros use. They would simply need to stop also creating a .pth file (or dummy __init__.py files) when running under 3.3. They'd also need to be updated to support new-style namespace packages in their source trees.
The implementation of declare_namespace() for 3.3+ would then just be to create a module object (if not present) and set its __path__ to a virtual path object based on the parent. (Sadly, this can't be backported since older Pythons require an actual list object... hmm... or do they? Maybe a list *subclass* would work.... must check into that. If it works back to 2.3 I could backport... darn, it uses PyList_Size() and PyList_GetItem(), so I'd also need a meta_path hook to trigger updates. Hm. Got to think about that some more.) The only corner case I can think of is mixing __init__ portions with non-__init__ portions. If you have both, it's not sufficient to create a simple PEP 420 virtual path, because it won't include the __init__ portions. A backport or "transition support" version needs a way to force __init__ portions to be included in the resulting virtual path. This can't be done with the current find_loader() protocol, because the finder doesn't distinguish between package and non-package cases. If find_loader() always returned a path for a package (even non-namespace packages), then this would allow virtual paths to be made either inclusive or exclusive of __init__ segments. That is, it would let there be a transition period where you could explicitly declare a namespace to get a mixed namespace, but by default the paths would be exclusive. I'm not sure if anything I just said is clear without an example, so I'll throw one in. Let's say somebody's writing code that spans multiple Python versions, and they want their __init__-based namespace packages to work, but be forward compatible with new subpackages using PEP 420 portions. Basically, they write some code that calls declare_namespace(), which then sets the module's __path__ to be an "inclusive virtual" path. This path object is similar to the current virtual path object, except that it *always* uses the second find_loader() return value, even if the first value returned is not None. Poof!
Instant "transitional" namespace package, backward-compatible with older Python versions, and forward-compatible with PEP 420. Okay, technically that was more of a rationale than an example, but I hope it's a bit clearer anyway. ;-) For purposes of the PEP, all I'd request changing is that find_loader() always return the path of an existing directory in the second return value, even if it's also returning a loader. More precisely, if it returns a loader for a package, it should also return the package directory in the second return value. importlib can still ignore this second value, but a transitional version of declare_namespace() can use it to implement "mixed mode" namespace packages. (Which facilitates backporting the mechanism to older setuptools as well - I'll change the nspkg.pth files to do something like 'import pep420; pep420.declare_namespace("foo")' and my pep420 module will include its own mixed-mode virtual path support, and emulate find_loader() for the builtin importers in older Pythons.) From ncoghlan at gmail.com Sat May 12 02:14:33 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 12 May 2012 10:14:33 +1000 Subject: [Import-SIG] PEP 420 outstanding issues In-Reply-To: References: <4FAD5862.2060500@trueblade.com> Message-ID: PJE's proposal that self-contained package loaders *also* report their prospective __path__ entries in the second half of the tuple sounds reasonable to me. It provides a way to cleanly distinguish all 4 significant cases (standalone module, regular package, package portion, not found). The standard import system will treat the first two cases the same way, but making the distinction official means custom import systems can decide to do something different. Cheers, Nick. -- Sent from my phone, thus the relative brevity :)
From eric at trueblade.com Sat May 12 02:46:44 2012 From: eric at trueblade.com (Eric V. Smith) Date: Fri, 11 May 2012 20:46:44 -0400 Subject: [Import-SIG] PEP 420 outstanding issues In-Reply-To: References: <4FAD5862.2060500@trueblade.com> Message-ID: <4FADB2F4.1000702@trueblade.com> On 5/11/2012 4:11 PM, PJ Eby wrote: > If find_loader() always returned a path for a package (even > non-namespace packages), then this would allow virtual paths to be made > either inclusive or exclusive of __init__ segments. That is, it would > let there be a transition period where you could explicitly declare a > namespace to get a mixed namespace, but by default the paths would be > exclusive. > > I'm not sure if anything I just said is clear without an example, so > I'll throw one in. Let's say somebody's writing code that spans > multiple Python versions, and they want their __init__-based namespace > packages to work, but be forward compatible with new subpackages using > PEP 420 portions. Basically, they write some code that calls > declare_namespace(), which then sets the module's __path__ to be an > "inclusive virtual" path. This path object is similar to the current > virtual path object, except that it *always* uses the second > find_loader() return value, even if the first value returned is not > None. Poof! Instant "transitional" namespace package, > backward-compatible with older Python versions, and forward-compatible > with PEP 420. But the second value (the paths) won't include anything on the parent path that occurs after __init__.py is found. Or am I missing something? From pje at telecommunity.com Sat May 12 04:41:10 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 11 May 2012 22:41:10 -0400 Subject: [Import-SIG] PEP 420 outstanding issues In-Reply-To: <4FADB2F4.1000702@trueblade.com> References: <4FAD5862.2060500@trueblade.com> <4FADB2F4.1000702@trueblade.com> Message-ID: On Fri, May 11, 2012 at 8:46 PM, Eric V.
Smith wrote: > On 5/11/2012 4:11 PM, PJ Eby wrote: > > > If find_loader() always returned a path for a package (even > > non-namespace packages), then this would allow virtual paths to be made > > either inclusive or exclusive of __init__ segments. That is, it would > > let there be a transition period where you could explicitly declare a > > namespace to get a mixed namespace, but by default the paths would be > > exclusive. > > > > I'm not sure if anything I just said is clear without an example, so > > I'll throw one in. Let's say somebody's writing code that spans > > multiple Python versions, and they want their __init__-based namespace > > packages to work, but be forward compatible with new subpackages using > > PEP 420 portions. Basically, they write some code that calls > > declare_namespace(), which then sets the module's __path__ to be an > > "inclusive virtual" path. This path object is similar to the current > > virtual path object, except that it *always* uses the second > > find_loader() return value, even if the first value returned is not > > None. Poof! Instant "transitional" namespace package, > > backward-compatible with older Python versions, and forward-compatible > > with PEP 420. > > But the second value (the paths) won't include anything on the parent > path that occurs after __init__.py is found. Or am I missing something? > What this is for is for __init__.py files that call declare_namespace() or some other API. When they call it, the API will replace the package's __path__ with a "mixed mode" virtual path object. This object would take the parent package __path__, and walk it to find *all* the subpaths, and support auto-updates if the parent __path__ or sys.path is modified. The reason for changing the protocol is that this alternate implementation wouldn't be able to add sections with __init__.py's if the finders didn't return the paths for non-namespace packages. 
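The four cases a finder can report (standalone module, regular package, package portion, not found) can be sketched roughly as follows. This is an illustration under assumptions, not the protocol as discussed in the thread: it uses FileFinder.find_spec() from later Python releases (3.4 and up) rather than the find_loader() under discussion here, and the file and directory names are invented for the example. The point it shows is that a finder reporting a directory for regular packages as well as for portions gives the caller enough information to tell all four cases apart:

```python
import os
import tempfile
from importlib.machinery import FileFinder, SourceFileLoader, SOURCE_SUFFIXES

# Hypothetical layout, created only for this illustration.
root = tempfile.mkdtemp()
with open(os.path.join(root, "mod.py"), "w") as f:
    f.write("")                                   # standalone module
os.makedirs(os.path.join(root, "regpkg"))
with open(os.path.join(root, "regpkg", "__init__.py"), "w") as f:
    f.write("")                                   # regular package
os.makedirs(os.path.join(root, "nsportion"))      # directory without __init__.py

finder = FileFinder(root, (SourceFileLoader, SOURCE_SUFFIXES))


def classify(name):
    """Map the finder's answer for `name` onto the four cases."""
    spec = finder.find_spec(name)
    if spec is None:
        return "not found"
    if spec.loader is None:
        # No loader, only a directory: a possible namespace package portion.
        return "package portion"
    if spec.submodule_search_locations is not None:
        # A loader *and* a directory: a self-contained package.
        return "regular package"
    return "standalone module"


results = {name: classify(name)
           for name in ("mod", "regpkg", "nsportion", "missing")}
```

A "mixed mode" path object of the kind described above would collect the directory in both the "regular package" and the "package portion" cases, whereas the default namespace path only collects it in the latter.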
In other words, this isn't about changing the PEP's normal import algorithm, it's just for tools that want to provide a compatibility upgrade path, so that existing __init__.py modules can play in the new, post-PEP 420 world. From brett at python.org Sat May 12 05:58:11 2012 From: brett at python.org (Brett Cannon) Date: Fri, 11 May 2012 23:58:11 -0400 Subject: [Import-SIG] PEP 420 outstanding issues In-Reply-To: References: <4FAD5862.2060500@trueblade.com> Message-ID: On Fri, May 11, 2012 at 8:14 PM, Nick Coghlan wrote: > PJE's proposal that self-contained package loaders *also* report their > prospective __path__ entries in the second half of the tuple sounds > reasonable to me. It provides a way to cleanly distinguish all 4 > significant cases (standalone module, regular package, package portion, not > found). > > The standard import system will treat the first two cases the same way, > but making the distinction official means custom import systems can decide > to do something different. > If we are going as far as to have finders return the value of __path__, should we add an equivalent extension to the loader API to get the sequence back? I have always found it extremely regrettable that is_package() was defined to return a boolean instead of the list for __path__. Otherwise I would at least want to change the __init__ signature for the various loaders that FileFinder uses to take this new path argument so it doesn't need to be recalculated and then expose it somehow as an attribute (although right now FileLoader already sets a 'path' attribute; might need to rename that filepath/file_path).
From benoit at marmelune.net Sun May 13 15:18:17 2012 From: benoit at marmelune.net (Benoît Bryon) Date: Sun, 13 May 2012 15:18:17 +0200 Subject: [Import-SIG] Fwd: Re: Proposal and questions about PEP 420 Message-ID: -------- Original Message -------- Subject: Re: Proposal and questions about PEP 420 Date: Sun, 13 May 2012 08:10:53 -0400 From: "Eric V. Smith" To: Benoît Bryon Hi, Benoit. This hasn't been specifically discussed. Part of the rejection of PEP 382 was due to dots in the names of directories, but as directory extensions (foo.pyp). I'm not sure if this concern would also apply to your proposal. Also, I'm not sure how common nested namespace packages are. I know I've never run across them. That said, I think you should post this idea to the import-sig mailing list, and see what others think. Eric. On 5/13/2012 7:31 AM, Benoît Bryon wrote: > Hi, > > I just read PEP 420 about namespace packages, and I wonder if the > following points have been considered... > > .. note:: > > I'm asking you because PEP 1 says: > > > When in doubt about where to send your changes, please check first > > with the PEP author and/or PEP editor. > > > Has the following proposal been considered or refused? > > As an example, to implement foo.bar and foo.baz namespaces: > > :: > > somewhere-in-sys.path > ├── foo.bar > │   └── __init__.py > └── foo.baz >     └── __init__.py > > I mean: > > * since "namespace" directories have to be empty, we can get rid of them. > * a namespace package would be "some directory with at least a dot in > the name". > > > As import machinery > =================== > > * It's easy to guess that "foo.bar.baz.other" is a package in > "foo.bar.baz" namespace. > * It's flat. > * It's quick to scan: no loops within "foo/" folder(s), only one disk > access to read a unique __init__.py. > > > As a developer > ============== > > As a developer, when I edit namespace packages, I have to deal with > nested empty directories.
> As an example, here is a repository layout to edit some foo.bar.baz > package with PEP-420: > > :: > > path-to-python-foobarbaz/ > ├── setup.py > ├── README > └── foo >     └── bar >         └── baz >             └── __init__.py > > .. note:: > > With current namespace packages implementation, foo/ and foo/bar/ > contain an __init__.py > > * We never edit foo/ and foo/bar/. They are empty. They contain nothing > valuable. > * When one sees package root folder, he cannot guess that foo/ is a > namespace package. > * To new Python users, it's not clear that foo/ must be empty (or must > contain only a "constant" __init__.py). > * It seems that many developers actually don't use namespace packages > because its implementation is not flat. > As an example, I personally know some Django users who argue that > "Flat is better than nested" wins over "Namespaces are one honking > great idea -- let's do more of those!". > Since "more namespaces" currently means "more nested directories", we > can't convince these users to implement namespace packages. > > With "directories with dots in their names" proposal: > > :: > > path-to-python-foobarbaz/ > ├── setup.py > ├── README > └── foo.bar.baz >     └── __init__.py > > * it's flat. > * it's easy to guess that "foo.bar.baz" is a namespace package. > * impossible to write code in foo/ folder or alter foo/__init__.py: > "foo" is not a classic module, it is a namespace. > * respects both "Flat is better than nested" **and** "Namespaces are one > honking great idea -- let's do more of those!". So it may attract > developers and maybe more namespaces would be created. > > > Regards, > > -- > Benoit Bryon > From guido at python.org Sun May 13 17:33:02 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 13 May 2012 08:33:02 -0700 Subject: [Import-SIG] Fwd: Re: Proposal and questions about PEP 420 In-Reply-To: References: Message-ID: -1. It would create a very busy toplevel directory. There's a reason people nest directories...
-- --Guido van Rossum (python.org/~guido) From pje at telecommunity.com Sun May 13 19:25:58 2012 From: pje at telecommunity.com (PJ Eby) Date: Sun, 13 May 2012 13:25:58 -0400 Subject: [Import-SIG] Fwd: Re: Proposal and questions about PEP 420 In-Reply-To: References: Message-ID: On Sun, May 13, 2012 at 9:18 AM, Benoît Bryon wrote: > "Eric V. Smith" wrote: > > > Hi, Benoit. > > > This hasn't been specifically discussed. Part of the rejection of PEP > > 382 was due to dots in the names of directories, but as directory > > extensions (foo.pyp). I'm not sure if this concern would also apply to > > your proposal. > > > Also, I'm not sure how common nested namespace packages are. I know I've > > never run across them. > zope.app and peak.util are two I know of. But deep nesting isn't common in the Python world, despite the popularity of things like org.apache.someproject.gizmos.GizmoFactory in the Java world. The proposal itself is intriguing, but it's not only less backward compatible and directory-cluttering, it has some potential for ambiguity in the spec and doesn't seem like a reasonable departure from other languages' conventions in this area. Regarding the nesting issue and persuading Django developers to use namespaces, I would note that there isn't any reason for namespaces to be deeply nested in the first place. By convention, top-level namespace packages should be the name of a project or its sponsoring organization, which means there is rarely a need for deep nesting. Even cases like zope.app and peak.util are rare: usually a project or organization will have only one such "miscellaneous" namespace with lots of separately-distributed components. (After all, the main reason to *have* a namespace package is to have separately-distributed subpackages.
So, self-contained packages don't need to have namespaces of their own, almost by definition.) Anyway, what I've noticed is that when people want to deeply nest namespaces, it's usually because they're trying to share a namespace across organizations, like making a shared 'net.*' namespace. The idea of namespaces isn't for that kind of categorization, though, it's for *ownership*. If two developers are fighting over where to put something in a category hierarchy, it's a sign that they need to be working in different namespaces, with each developer staking a claim to a *top-level* package -- like OSAF's osaf.*, Zope Corporation's zc.* (vs. the community project's zope.*), and so on. When developers use namespaces for project/ownership distinction, the resulting package hierarchies can be pretty much as flat as you like. From martin at v.loewis.de Sun May 13 20:56:03 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 13 May 2012 20:56:03 +0200 Subject: [Import-SIG] Fwd: Re: Proposal and questions about PEP 420 In-Reply-To: References: Message-ID: <4FB003C3.1070405@v.loewis.de> > This hasn't been specifically discussed. Part of the rejection of PEP > 382 was due to dots in the names of directories, but as directory > extensions (foo.pyp). I'm not sure if this concern would also apply to > your proposal. Indeed, this hasn't been proposed (actually, when I presented namespace packages to the Berlin Python user group last week, someone proposed that). > Also, I'm not sure how common nested namespace packages are. I know I've > never run across them. Not sure you understood the proposal. With namespace packages, you always have nested packages, so zope.interfaces would live in a directory "zope.interfaces". Apparently, Java also supports that for packages these days.
> That said, I think you should post this idea to the import-sig mailing > list, and see what others think. I would refuse that as the primary mechanism for namespace packages. However, I see this as a plausible extension of PEP 420: Lookup of a package foo.bar would look in foo.__path__ for bar/, but would also look in sys.path for foo.bar/. In Java, people apparently want that because they get these deeply nested directory hierarchies (org/apache/commons/betwixt/expression). It's apparently possible to condense this into org.apache.commons.betwixt/expression (which isn't a shorter string, but fewer cd commands / explorer clicks / .svn folders). I predict that people will start using PEP 420 in the reversed-domain fashion also, so we eventually might end up wanting something like this for Python. However, as it can safely be added later, there is no hurry. Regards, Martin From martin at v.loewis.de Sun May 13 20:58:41 2012 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 13 May 2012 20:58:41 +0200 Subject: [Import-SIG] Fwd: Re: Proposal and questions about PEP 420 In-Reply-To: References: Message-ID: <4FB00461.4060402@v.loewis.de> On 13.05.2012 17:33, Guido van Rossum wrote: > -1. It would create a very busy toplevel directory. There's a reason > people nest directories... Alas, thanks to egg files, we already have busy toplevel directories, which extend into long sys.path lists. With this proposal, we would still have busy toplevel directories, but shorter search path - while maintaining the "remove this directory to uninstall" feature that people apparently like about eggs. Regards, Martin From solipsis at pitrou.net Sun May 13 21:15:30 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 13 May 2012 21:15:30 +0200 Subject: [Import-SIG] Proposal and questions about PEP 420 References: <4FB00461.4060402@v.loewis.de> Message-ID: <20120513211530.6d2a4338@pitrou.net> On Sun, 13 May 2012 20:58:41 +0200 "Martin v.
Löwis" wrote: > On 13.05.2012 17:33, Guido van Rossum wrote: > > -1. It would create a very busy toplevel directory. There's a reason > > people nest directories... > > Alas, thanks to egg files, we already have busy toplevel directories, > which extend into long sys.path lists. I don't think it's egg files themselves, but the habit setuptools/distribute has of extending sys.path as part of its pth files. Regards Antoine. From benoit at marmelune.net Sun May 13 22:10:05 2012 From: benoit at marmelune.net (Benoît Bryon) Date: Sun, 13 May 2012 22:10:05 +0200 Subject: [Import-SIG] Fwd: Re: Proposal and questions about PEP 420 In-Reply-To: References: Message-ID: <4FB0151D.8010208@marmelune.net> Eric V. Smith wrote: > Also, I'm not sure how common nested namespace packages are. They seem common at least in the Plone community: * http://pypi.python.org/pypi?%3Aaction=search&term=plone&submit=search * http://pypi.python.org/pypi?%3Aaction=search&term=collective&submit=search Maybe a Plone user (I'm not a Plone user) could tell us more about it... PJ Eby wrote: > The proposal itself is intriguing, but it's not only less backward > compatible and directory-cluttering, it has some potential for > ambiguity in the spec and doesn't seem like a reasonable departure > from other languages' conventions in this area. I have poor knowledge of other languages. I just asked, because I felt surprised we are about to create (potentially nested) empty directories in order to implement namespaces, then wondered if we could do it with only one directory. That said, my motivation isn't to block or change PEP 420. I was wondering why such a solution wasn't at least mentioned in the PEP as part of the discussions. PJ Eby wrote: > there is rarely a need for deep nesting +1 I currently don't know a Python package with more than 3 levels (like zc.recipe.egg), and I guess that more than 3 levels would be too much. Martin v.
Löwis wrote: > In Java, people apparently want that because they get these deeply > nested directory hierarchies (org/apache/commons/betwixt/expression). > It's apparently possible to condense this into > org.apache.commons.betwixt/expression (which isn't a shorter string, > but fewer cd commands / explorer clicks / .svn folders). With a maximum of 3 levels, it's not such a big issue. A bit annoying, but low priority. Martin v. Löwis wrote: > On 13.05.2012 17:33, Guido van Rossum wrote: >> -1. It would create a very busy toplevel directory. There's a reason >> people nest directories... > > Alas, thanks to egg files, we already have busy toplevel directories, > which extend into long sys.path lists. http://pypi.python.org/pypi/collective.recipe.omelette/0.15 seems a fair solution for this issue. Regards, Benoit From eric at trueblade.com Tue May 15 19:26:31 2012 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 15 May 2012 13:26:31 -0400 Subject: [Import-SIG] pkgutil.extend_path Message-ID: <4FB291C7.2020700@trueblade.com> I'm looking at fixing pkgutil.extend_path in order to support namespace packages where some portions use PEP-420 and some use extend_path. The first thing I notice is that there are no tests for pkgutil.extend_path :( extend_path currently just examines the filesystem directly, which means it doesn't support portions in zip files or other finder/loaders. But if I understand PJE and others correctly, the idea is to modify extend_path so it calls the path_hook finders instead of looking at the filesystem (in order to find the other __path__ entries). This looks like a change in functionality: previously only real filesystem packages would be found. With this change it would include other finders (like zip files). While it's a change, it would align well with importlib. Personally I'm okay with this change. There's the added issue of how to deal with .pkg files. Is only supporting them from the filesystem okay?
Or is it worth the hassle of creating some finder API to access them? Any thoughts? Eric. From pje at telecommunity.com Wed May 16 00:04:06 2012 From: pje at telecommunity.com (PJ Eby) Date: Tue, 15 May 2012 18:04:06 -0400 Subject: [Import-SIG] pkgutil.extend_path In-Reply-To: <4FB291C7.2020700@trueblade.com> References: <4FB291C7.2020700@trueblade.com> Message-ID: On Tue, May 15, 2012 at 1:26 PM, Eric V. Smith wrote: > I'm looking at fixing pkgutil.extend_path in order to support namespace > packages where some portions use PEP-420 and some use extend_path. > > The first thing I notice is that there are no tests for > pkgutil.extend_path :( > > extend_path currently just examines the filesystem directly, which means > it doesn't support portions in zip files or other finder/loaders. > > But if I understand PJE and others correctly, the idea is to modify > extend_path so it calls the path_hook finders instead of looking at the > filesystem (in order to find the other __path__ entries). Well, I actually wasn't trying to support that at all; I've never used extend_path() myself, so I was only talking about implementing transitional support for the PEP in declare_namespace(). That being said, it sounds like adding support in extend_path() might also be worthwhile. Presumably, it would *not* auto-update, since extend_path() doesn't currently do that. (Just as declare_namespace() *should* auto-update, because it currently does.) > This looks > like a change in functionality: previously only real filesystem packages > would be found. With this change it would include other finders (like > zip files). While it's a change, it would align well with importlib. > Personally I'm okay with this change. > Makes sense to me. There's the added issue of how to deal with .pkg files. Is only > supporting them from the filesystem okay? Or is it worth the hassle of > creating some finder API to access them? 
> I don't know who actually uses them, but then I don't know who uses extend_path(), period. The only examples I was able to find with Google and Nullege are of the form "try: declare_namespace() except: extend_path()" -- that is, code that first tries to use declare_namespace() instead. From eric at trueblade.com Wed May 16 03:40:43 2012 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 15 May 2012 21:40:43 -0400 Subject: [Import-SIG] pkgutil.extend_path In-Reply-To: References: <4FB291C7.2020700@trueblade.com> Message-ID: <4FB3059B.7010809@trueblade.com> On 05/15/2012 06:04 PM, PJ Eby wrote: > On Tue, May 15, 2012 at 1:26 PM, Eric V. Smith > wrote: > > I'm looking at fixing pkgutil.extend_path in order to support namespace > packages where some portions use PEP-420 and some use extend_path. > > The first thing I notice is that there are no tests for > pkgutil.extend_path :( > > extend_path currently just examines the filesystem directly, which means > it doesn't support portions in zip files or other finder/loaders. > > But if I understand PJE and others correctly, the idea is to modify > extend_path so it calls the path_hook finders instead of looking at the > filesystem (in order to find the other __path__ entries). > > > Well, I actually wasn't trying to support that at all; I've never used > extend_path() myself, so I was only talking about implementing > transitional support for the PEP in declare_namespace(). I'm mostly interested in extend_path() because: a) it's in the standard library b) if it can be made to work, I assume declare_namespace() can, too > That being said, it sounds like adding support in extend_path() might > also be worthwhile. Presumably, it would *not* auto-update, since > extend_path() doesn't currently do that. (Just as declare_namespace() > *should* auto-update, because it currently does.) Right.
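A rough sketch of the mixed case under discussion, assuming a Python release whose pkgutil.extend_path() asks the path-entry finders for portions (the behaviour this thread is working toward). The package name demo_extend_path_pkg and the old_style/new_style directories are invented for the illustration. One portion has an __init__.py that calls extend_path(); the other has no __init__.py at all:

```python
import importlib
import os
import sys
import tempfile

PKG = "demo_extend_path_pkg"  # hypothetical package name, used only here
root = tempfile.mkdtemp()
old_style = os.path.join(root, "old_style")
new_style = os.path.join(root, "new_style")

# Portion 1: classic __init__.py that extends its own __path__.
os.makedirs(os.path.join(old_style, PKG))
with open(os.path.join(old_style, PKG, "__init__.py"), "w") as f:
    f.write("from pkgutil import extend_path\n"
            "__path__ = extend_path(__path__, __name__)\n")
with open(os.path.join(old_style, PKG, "a.py"), "w") as f:
    f.write("WHERE = 'old-style portion'\n")

# Portion 2: PEP 420 style, no __init__.py.
os.makedirs(os.path.join(new_style, PKG))
with open(os.path.join(new_style, PKG, "b.py"), "w") as f:
    f.write("WHERE = 'PEP 420 portion'\n")

# The __init__.py portion must be found first so that extend_path() runs.
sys.path[:0] = [old_style, new_style]
importlib.invalidate_caches()

pkg = importlib.import_module(PKG)
mod_a = importlib.import_module(PKG + ".a")
mod_b = importlib.import_module(PKG + ".b")

# extend_path() returns an ordinary list computed once, at import time,
# so unlike a pure namespace package this __path__ does not auto-update.
path_is_plain_list = type(pkg.__path__) is list
portion_count = len(pkg.__path__)
```

Both submodules import even though only one of the two directories has an __init__.py, and the resulting __path__ is a static snapshot.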
I have extend_path() working with mixed namespace packages: those that have
an __init__.py with extend_path, and those with no __init__.py. In this
case, it does not support auto-updating (although it easily could, if we
expose that from _bootstrap).

I'll update the PEP to mention returning the portions iterable even if a
loader is returned, clean up the code, and check it in.

I'll leave declare_namespace() for someone else. I might have time to look
at it if/when the PEP is accepted, but no promises.

> > There's the added issue of how to deal with .pkg files. Is only
> > supporting them from the filesystem okay? Or is it worth the hassle of
> > creating some finder API to access them?
>
> I don't know who actually uses them, but then I don't know who uses
> extend_path(), period. The only examples I was able to find with Google
> and Nullege are of the form "try: declare_namespace() except:
> extend_path()" -- that is, code that first tries to use
> declare_namespace() instead.

I haven't touched the .pkg code. It doesn't work terribly well if a
zipimporter is used, but then it never did. At least I'm not making it
worse.

Once this is checked in, I think the work on the PEP and sample
implementation is done.

Eric.

From eric at trueblade.com Wed May 16 04:44:54 2012
From: eric at trueblade.com (Eric V. Smith)
Date: Tue, 15 May 2012 22:44:54 -0400
Subject: [Import-SIG] PEP 420 status
Message-ID: <4FB314A6.9080000@trueblade.com>

I think I've addressed all outstanding issues. So if no one disagrees,
I'll ask Guido to rule on it.

The implementation in features/pep-420 is basically complete. I'll
address the remaining issues once it's accepted (if it is).

These issues include:
- more complete tests. The existing tests are pretty good, but there are
  some corner cases that need exercising.
- zipimport needs to work even if there's no "directory" entry for
  a portion.

Eric.
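
For reference, the mixed case described in the pkgutil.extend_path thread
above (one portion whose __init__.py calls extend_path, one portion with no
__init__.py at all) can be sketched roughly as follows. This is only an
illustration, not the code from features/pep-420. The package name demo_ns,
the directories site_a and site_b, and the modules alpha and beta are all
hypothetical, and the sketch assumes Python 3.3 or later, where
extend_path asks the path finders for portions instead of only looking for
__init__.py files on disk.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout, created in a throwaway temporary directory:
#   site_a/demo_ns/__init__.py   legacy portion, calls extend_path
#   site_a/demo_ns/alpha.py
#   site_b/demo_ns/beta.py       PEP 420 portion, no __init__.py
root = tempfile.mkdtemp()
site_a = os.path.join(root, "site_a")
site_b = os.path.join(root, "site_b")
os.makedirs(os.path.join(site_a, "demo_ns"))
os.makedirs(os.path.join(site_b, "demo_ns"))


def write(path, text):
    """Create a small source file with the given contents."""
    with open(path, "w") as f:
        f.write(text)


# The legacy portion uses the documented extend_path idiom.
write(os.path.join(site_a, "demo_ns", "__init__.py"),
      "from pkgutil import extend_path\n"
      "__path__ = extend_path(__path__, __name__)\n")
write(os.path.join(site_a, "demo_ns", "alpha.py"), "NAME = 'alpha'\n")
write(os.path.join(site_b, "demo_ns", "beta.py"), "NAME = 'beta'\n")

# Make both portions visible to the import system.
sys.path[:0] = [site_a, site_b]
importlib.invalidate_caches()

# Importing demo_ns runs site_a's __init__.py; extend_path then appends
# the __init__.py-less directory from site_b to demo_ns.__path__.
import demo_ns.alpha
import demo_ns.beta

print(demo_ns.alpha.NAME, demo_ns.beta.NAME)
print(list(demo_ns.__path__))
```

As noted in the thread, the resulting __path__ is computed once, when
__init__.py runs, so a portion added to sys.path later is not picked up
automatically in this mixed setup.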
From guido at python.org Thu May 17 00:47:43 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 16 May 2012 15:47:43 -0700
Subject: [Import-SIG] PEP 420 status
In-Reply-To: <4FB314A6.9080000@trueblade.com>
References: <4FB314A6.9080000@trueblade.com>
Message-ID:

Eric did ask me and I'm busy studying the PEP. Thanks all who
participated, and thanks to Eric for managing the project so well!

--Guido

On Tue, May 15, 2012 at 7:44 PM, Eric V. Smith wrote:
> I think I've addressed all outstanding issues. So if no one disagrees,
> I'll ask Guido to rule on it.
>
> The implementation in features/pep-420 is basically complete. I'll
> address the remaining issues once it's accepted (if it is).
>
> These issues include:
> - more complete tests. The existing tests are pretty good, but there are
>   some corner cases that need exercising.
> - zipimport needs to work even if there's no "directory" entry for
>   a portion.
>
> Eric.

--
--Guido van Rossum (python.org/~guido)