From guido at python.org Mon Mar 12 00:02:07 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 11 Mar 2012 16:02:07 -0700 Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? Message-ID: Martin has asked me to decide on PEP 382 vs. PEP 402 (namespace packages) in time for inclusion of the decision in Python 3.3. As people who attended the language-sig know, I am leaning towards PEP 402 but I admit that at this point I don't have enough information. If I have questions, should I be asking them on the import-sig or on python-dev? Is it tolerable if I ask questions even if the answer is somewhere in the archives? (I spent a lot of time reviewing the "pitchfork thread", http://mail.python.org/pipermail/python-dev/2006-April/064400.html, but that wasn't particularly fruitful, so I'm worried I'd just waste my time browsing the archives -- if the PEP authors did their jobs well the PEPs should include summaries of the discussion anyways.) -- --Guido van Rossum (python.org/~guido) From eric at trueblade.com Mon Mar 12 00:05:48 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 11 Mar 2012 16:05:48 -0700 Subject: [Import-SIG] =?utf-8?q?Where_to_discuss_PEP_382_vs=2E_PEP_402_=28?= =?utf-8?q?namespace=09packages=29=3F?= In-Reply-To: References: Message-ID: I think restarting the discussion anew here on distutils-sig is appropriate. -- Eric. Guido van Rossum wrote: Martin has asked me to decide on PEP 382 vs. PEP 402 (namespace packages) in time for inclusion of the decision in Python 3.3. As people who attended the language-sig know, I am leaning towards PEP 402 but I admit that at this point I don't have enough information. If I have questions, should I be asking them on the import-sig or on python-dev? Is it tolerable if I ask questions even if the answer is somewhere in the archives? (I spent a lot of time reviewing the "pitchfork thread", http://mail.python.org/pipermail/python-dev/2006-April/064400.html, but that wasn't particularly fruitful, so I'm worried I'd just waste my time browsing the archives -- if the PEP authors did their jobs well the PEPs should include summaries of the discussion anyways.) -- --Guido van Rossum (python.org/~guido) _____________________________________________ Import-SIG mailing list Import-SIG at python.org http://mail.python.org/mailman/listinfo/import-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Mon Mar 12 00:06:54 2012 From: eric at trueblade.com (Eric V. Smith) Date: Sun, 11 Mar 2012 16:06:54 -0700 Subject: [Import-SIG] =?utf-8?q?Where_to_discuss_PEP_382_vs=2E_PEP_402_=28?= =?utf-8?q?namespace=09packages=29=3F?= In-Reply-To: References: Message-ID: <5b62ac97-c71b-4a10-a713-4c1aa43da783@email.android.com> And of course I meant import-sig. -- Eric. "Eric V. Smith" wrote: I think restarting the discussion anew here on distutils-sig is appropriate. -- Eric. Guido van Rossum wrote: Martin has asked me to decide on PEP 382 vs. PEP 402 (namespace packages) in time for inclusion of the decision in Python 3.3. As people who attended the language-sig know, I am leaning towards PEP 402 but I admit that at this point I don't have enough information. If I have questions, should I be asking them on the import-sig or on python-dev? Is it tolerable if I ask questions even if the answer is somewhere in the archives? 
(I spent a lot of time reviewing the "pitchfork thread", http://mail.python.org/pipermail/python-dev/2006-April/064400.html, but that wasn't particularly fruitful, so I'm worried I'd just waste my time browsing the archives -- if the PEP authors did their jobs well the PEPs should include summaries of the discussion anyways.) -- --Guido van Rossum (python.org/~guido) _____________________________________________ Import-SIG mailing list Import-SIG at python.org http://mail.python.org/mailman/listinfo/import-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Mon Mar 12 00:24:52 2012 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 11 Mar 2012 16:24:52 -0700 Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? In-Reply-To: References: Message-ID: <4F5D3444.8060402@v.loewis.de> Am 11.03.12 16:02, schrieb Guido van Rossum: > Martin has asked me to decide on PEP 382 vs. PEP 402 (namespace > packages) in time for inclusion of the decision in Python 3.3. As > people who attended the language-sig know, I am leaning towards PEP > 402 but I admit that at this point I don't have enough information. If > I have questions, should I be asking them on the import-sig or on > python-dev? import-sig would be best. > Is it tolerable if I ask questions even if the answer is > somewhere in the archives? Sure! Martin From ncoghlan at gmail.com Mon Mar 12 01:58:01 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Mar 2012 10:58:01 +1000 Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 9:02 AM, Guido van Rossum wrote: > Martin has asked me to decide on PEP 382 vs. PEP 402 (namespace > packages) in time for inclusion of the decision in Python 3.3. ?As > people who attended the language-sig know, I am leaning towards PEP > 402 but I admit that at this point I don't have enough information. If > I have questions, should I be asking them on the import-sig or on > python-dev? I agree with the others that import-sig is the right place. > Is it tolerable if I ask questions even if the answer is > somewhere in the archives? (I spent a lot of time reviewing the > "pitchfork thread", > http://mail.python.org/pipermail/python-dev/2006-April/064400.html, > but that wasn't particularly fruitful, so I'm worried I'd just waste > my time browsing the archives -- if the PEP authors did their jobs > well the PEPs should include summaries of the discussion anyways.) When PJE first proposed PEP 402 I was a fan, but I changed my mind once I realised it was fundamentally incompatible with PEP 395 (where I want to fix the way we initialise sys.path[0] so that direct execution of modules inside packages isn't broken). With the status quo or PEP 382's explicit namespace packages, the interpreter can look at the *filesystem* to figure out where the root directory of the package lives, as the explicit package markers (i.e. __init__.py files or *.pyp extensions) mean there is an unambiguous 1:1 mapping from a filesystem path to a (sys.path entry, module reference) pair. With PEP 402, however, the filesystem layout becomes ambigous - you *can't* derive the appropriate sys.path entry from the filesystem any more, because the meaning of the filesystem layout *depends on* what you put in sys.path. 
To give a simple example:

Status quo:

    myproject/
        __init__.py
        __main__.py
        module.py
        tests/
            __init__.py
            test_module.py

Clearly, "myproject" is a Python package with "myproject.module" and "myproject.tests.test_module" as the contents. Only the parent directory of "myproject" should be placed on sys.path. The package can be executed as a script via "python -m myproject".

PEP 382:

    myproject.pyp/
        __main__.py
        module.py
        tests.pyp/
            test_module.py

No real change from the status quo - the "__init__.py" marker files are simply replaced by the "*.pyp" marker extension.

PEP 402:

    myproject/
        __main__.py
        module.py
        tests/
            test_module.py

Uh-oh, now we have a problem. This is either an executable directory (designed to be run as "python myproject" and providing a top-level "module" and "tests.test_module") or an executable package (designed to be run as "python -m myproject" and providing "myproject.module" and "myproject.tests.test_module"). Because the filesystem layout is ambiguous, there's no way for the interpreter to figure out what the developer meant and we'd be stuck with the current broken* sys.path[0] initialisation forever.

I've described the implications of the two namespace PEPs briefly in PEP 395, which also has links to the relevant import-sig threads:

http://www.python.org/dev/peps/pep-0395/#compatibility-with-pep-382
http://www.python.org/dev/peps/pep-0395/#incompatibility-with-pep-402

*PEP 395 also explains my rationale for calling our current sys.path[0] initialisation mechanism broken:
http://www.python.org/dev/peps/pep-0395/#why-are-my-imports-broken

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org Mon Mar 12 02:43:52 2012
From: barry at python.org (Barry Warsaw)
Date: Sun, 11 Mar 2012 18:43:52 -0700
Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)?
In-Reply-To: 
References: 
Message-ID: <20120311184352.02a50c9c@resist>

On Mar 12, 2012, at 10:58 AM, Nick Coghlan wrote:

>With the status quo or PEP 382's explicit namespace packages, the
>interpreter can look at the *filesystem* to figure out where the root
>directory of the package lives, as the explicit package markers (i.e.
>__init__.py files or *.pyp extensions) mean there is an unambiguous
>1:1 mapping from a filesystem path to a (sys.path entry, module
>reference) pair.

Just a quick note about PEP 382. As it's currently written (i.e. with the .pyp extension on directories), I worry about whether it will be possible to support both Python 3.2 and Python >= 3.3 with a single vendor package. This may only be an interim problem until we never care about Python 3.2 anymore 5 years from now, or it might be solvable by backporting the PEP into the vendor's version of Python 3.2, or some other solution such as symlinks or what not (though I would definitely not want to reintroduce a huge symlink farm again). Or maybe marker files such as described in the original version of the PEP are more appropriate after all.

Or maybe PEP 402 is the right approach after all, and Nick's concerns about its interactions with PEP 395 can be addressed in other ways.

What I'd really like Guido to do is to provide his opinion about which PEP in general he favors. Then I think we can hash out the details and corner cases here in this mailing list before a final pronouncement is made.
Cheers, -Barry From ncoghlan at gmail.com Mon Mar 12 03:28:14 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Mar 2012 12:28:14 +1000 Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? In-Reply-To: <20120311184352.02a50c9c@resist> References: <20120311184352.02a50c9c@resist> Message-ID: On Mon, Mar 12, 2012 at 11:43 AM, Barry Warsaw wrote: > Just a quick note about PEP 382. ?As it's currently written (i.e. with the > .pyp extension on directories), I worry about whether it will be possible to > support both Python 3.2 and Python >= 3.3 with a single vendor package. ?This > may only be an interim problem until we never care about Python 3.2 anymore 5 > years from now, or it might be solvable by backporting the PEP into the > vendor's version of Python 3.2, or some other solution such as symlinks or > what not (though I would definitely not want to reintroduce a huge symlink > farm again). ?Or maybe marker files such as described in the original version > of the PEP are more appropriate after all. Why is this a problem? PEP 382 still supports ordinary __init__.py based package directories, so why can't the vendor packages just use that and ignore PEP 382 altogether until they drop 3.2 support? None of the existing namespace package mechanisms will break, the PEP just introduces a new simpler alternative for code that doesn't need to care about older versions of Python. Delaying adoption of a new feature due to the need to maintain compatibility with older versions of Python is hardly a novel situation for a Linux distribution. > Or maybe PEP 402 is the right approach after all, and Nick's concerns about > its interactions with PEP 395 can be addressed in other ways. I'd be interested in how - under PEP 402, it's completely impossible to tell *from the filesystem alone* where a package hierarchy starts. The information simply isn't there. Without some form of explicit filesystem marker, there's always going to be a fundamental ambiguity between "this is a sys.path directory with some top level modules" and "this is a top-level package directory". Since fixing main module imports requires the ability to unambiguously translate a filesystem path into a sys.path entry and a module reference, the lack of information is fatal to the primary goal of PEP 395. > What I'd really like Guido to do is to provide his opinion about which PEP in > general he favors. ?Then I think we can hash out the details and corner cases > here in this mailing list before a final pronouncement is made. I thought he did that in the initial email (i.e. stating a general preference for PEP 402). I just want to make it clear that accepting PEP 402 means rejecting key parts of PEP 395 as a natural consequence, as the PEP 402 approach to namespace packages eliminates currently available information that PEP 395 will need in order to do its job. After exploring the consequence of PEP 402 and the path entry vs package directory ambiguity it introduces, I now believe package layouts still fall under the general guideline that explicit is better than implicit. If anyone can demonstrate how to make all the main module invocation methods that would *just work* under PEP 395 still work in a PEP 402 world, then I'll be happy to withdraw my objection, but I simply don't see how it can be done. Under PEP 402, the precise meaning of a given filesystem layout depends on the contents of sys.path. 
That means you can't go back the other way and derive the "one true sys.path entry" for that layout, because there will always be at least two valid answers where there is currently only one. I'm just repeating myself now, though, so I'll stop posting about it until someone proposes a specific implementation strategy that would resolve the incompatibility between the two PEPs. In the absence of such a strategy, my position on the namespace package PEPs will remain: "I want to implement PEP 395 to fix direct execution of modules inside packages, which requires a 1:1 mapping between the filesystem layout and the Python module hierarchy created by the default import mechanism. I am therefore opposed to the implicit packages in PEP 402 and in favour of the explicit packages in PEP 382". Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Mon Mar 12 03:39:20 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 11 Mar 2012 19:39:20 -0700 Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? In-Reply-To: References: <20120311184352.02a50c9c@resist> Message-ID: I'm leaning towards PEP 402 or some variant. Let's have a pow-wow at the sprint tomorrow (I'll arrive in Santa Clara between 10 and 10:30). I do want to understand Nick's argument better; I haven't studied PEP 395 yet. --Guido On Sun, Mar 11, 2012 at 7:28 PM, Nick Coghlan wrote: > On Mon, Mar 12, 2012 at 11:43 AM, Barry Warsaw wrote: >> Just a quick note about PEP 382. ?As it's currently written (i.e. with the >> .pyp extension on directories), I worry about whether it will be possible to >> support both Python 3.2 and Python >= 3.3 with a single vendor package. ?This >> may only be an interim problem until we never care about Python 3.2 anymore 5 >> years from now, or it might be solvable by backporting the PEP into the >> vendor's version of Python 3.2, or some other solution such as symlinks or >> what not (though I would definitely not want to reintroduce a huge symlink >> farm again). ?Or maybe marker files such as described in the original version >> of the PEP are more appropriate after all. > > Why is this a problem? PEP 382 still supports ordinary __init__.py > based package directories, so why can't the vendor packages just use > that and ignore PEP 382 altogether until they drop 3.2 support? None > of the existing namespace package mechanisms will break, the PEP just > introduces a new simpler alternative for code that doesn't need to > care about older versions of Python. > > Delaying adoption of a new feature due to the need to maintain > compatibility with older versions of Python is hardly a novel > situation for a Linux distribution. > >> Or maybe PEP 402 is the right approach after all, and Nick's concerns about >> its interactions with PEP 395 can be addressed in other ways. > > I'd be interested in how - under PEP 402, it's completely impossible > to tell *from the filesystem alone* where a package hierarchy starts. > The information simply isn't there. Without some form of explicit > filesystem marker, there's always going to be a fundamental ambiguity > between "this is a sys.path directory with some top level modules" and > "this is a top-level package directory". Since fixing main module > imports requires the ability to unambiguously translate a filesystem > path into a sys.path entry and a module reference, the lack of > information is fatal to the primary goal of PEP 395. 
> >> What I'd really like Guido to do is to provide his opinion about which PEP in >> general he favors. ?Then I think we can hash out the details and corner cases >> here in this mailing list before a final pronouncement is made. > > I thought he did that in the initial email (i.e. stating a general > preference for PEP 402). I just want to make it clear that accepting > PEP 402 means rejecting key parts of PEP 395 as a natural consequence, > as the PEP 402 approach to namespace packages eliminates currently > available information that PEP 395 will need in order to do its job. > After exploring the consequence of PEP 402 and the path entry vs > package directory ambiguity it introduces, I now believe package > layouts still fall under the general guideline that explicit is better > than implicit. > > If anyone can demonstrate how to make all the main module invocation > methods that would *just work* under PEP 395 still work in a PEP 402 > world, then I'll be happy to withdraw my objection, but I simply don't > see how it can be done. Under PEP 402, the precise meaning of a given > filesystem layout depends on the contents of sys.path. That means you > can't go back the other way and derive the "one true sys.path entry" > for that layout, because there will always be at least two valid > answers where there is currently only one. > > I'm just repeating myself now, though, so I'll stop posting about it > until someone proposes a specific implementation strategy that would > resolve the incompatibility between the two PEPs. In the absence of > such a strategy, my position on the namespace package PEPs will > remain: "I want to implement PEP 395 to fix direct execution of > modules inside packages, which requires a 1:1 mapping between the > filesystem layout and the Python module hierarchy created by the > default import mechanism. I am therefore opposed to the implicit > packages in PEP 402 and in favour of the explicit packages in PEP > 382". > > Cheers, > Nick. > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Mon Mar 12 05:23:10 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 12 Mar 2012 14:23:10 +1000 Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)? In-Reply-To: References: <20120311184352.02a50c9c@resist> Message-ID: On Mon, Mar 12, 2012 at 12:39 PM, Guido van Rossum wrote: > I'm leaning towards PEP 402 or some variant. Let's have a pow-wow at > the sprint tomorrow (I'll arrive in Santa Clara between 10 and 10:30). > I do want to understand Nick's argument better; I haven't studied PEP > 395 yet. To save you some reading, the affected part of PEP 395 is the bit where I want to make it so that direct execution of modules inside packages just *works*. Currently, most ways of running code inside packages will do the wrong thing because sys.path[0] is set to a directory *inside* the package, thus all the module naming and referencing gets thrown out of whack. Absolute imports fail because the top-level package isn't on sys.path, while explicit relative imports also fail, because the interpreter thinks __main__ is a top level module. 
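To make that failure mode concrete, here is a minimal hypothetical layout and session (illustrative only; the exact error text varies between Python versions):

    project/
        __init__.py
        code.py
        tests/
            __init__.py
            test_code.py    # contains "from .. import code"

    $ cd project/tests
    $ python test_code.py
    # sys.path[0] is set to .../project/tests (a directory *inside* the package), so:
    #   "import project.code"  fails with  ImportError: No module named project
    #   "from .. import code"  fails with an error along the lines of
    #                          "Attempted relative import in non-package"
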
My experience answering questions on Stack Overflow is that this is a major source of confusion that leads directly to the perception that Python packages are hard to use (far more so than the requirement to explicitly mark package directories with __init__.py files).

The only current way to get the paths to work out properly is to use "-m" to invoke the code with the current working directory set to the directory that contains the top-level package. That way, sys.path[0] is set to a value that permits absolute imports of the top-level package and "-m" ensures __main__.__package__ is set correctly so that explicit relative imports work.

What I realised when I wrote PEP 395 is that, with explicitly marked package directories, there's enough information already in the filesystem for the interpreter to figure out what's going on and do the right thing *automatically*. So we actually have the power to set sys.path[0] and __main__.__package__ appropriately, even when the main module is specified by file name rather than module name, or the current working directory isn't the one that contains the top-level package directory.

PEP 402 effectively proposes to eliminate one minor source of beginner confusion (the need to explicitly mark package directories) at the expense of entrenching a bigger one (the fact that most means of invoking a submodule as a script don't actually work because they initialise the import system incorrectly). Once I realised that, my preference switched firmly back to PEP 382 (since adjusting PEP 395 to handle PEP 382 style layouts is a trivial tweak to the package directory detection).

Regards,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From barry at python.org Mon Mar 12 06:07:03 2012
From: barry at python.org (Barry Warsaw)
Date: Sun, 11 Mar 2012 22:07:03 -0700
Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)?
In-Reply-To: 
References: <20120311184352.02a50c9c@resist>
Message-ID: <20120311220703.663dd43b@rivendell>

On Mar 12, 2012, at 12:28 PM, Nick Coghlan wrote:

>Why is this a problem?

% mkdir parent
% mkdir parent/zope.pyp
% touch parent/zope.pyp/__init__.py
% PYTHONPATH=parent python3.2
Python 3.2.3rc1 (default, Mar  9 2012, 23:02:43)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named zope
>>> import zope.pyp
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named zope.pyp
>>>
% mv parent/zope{.pyp,}/
% PYTHONPATH=parent python3.2
Python 3.2.3rc1 (default, Mar  9 2012, 23:02:43)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope
>>>

>Delaying adoption of a new feature due to the need to maintain compatibility
>with older versions of Python is hardly a novel situation for a Linux
>distribution.

This PEP was motivated by a problem encountered by vendors, so all else being equal, we shouldn't trade one vendor problem for another.

More after sleep.
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From ncoghlan at gmail.com Mon Mar 12 06:17:55 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 12 Mar 2012 15:17:55 +1000
Subject: [Import-SIG] Where to discuss PEP 382 vs. PEP 402 (namespace packages)?
In-Reply-To: <20120311220703.663dd43b@rivendell>
References: <20120311184352.02a50c9c@resist> <20120311220703.663dd43b@rivendell>
Message-ID: 

On Mon, Mar 12, 2012 at 3:07 PM, Barry Warsaw wrote:
> On Mar 12, 2012, at 12:28 PM, Nick Coghlan wrote:
>
>>Why is this a problem?
>
> % mkdir parent
> % mkdir parent/zope.pyp
> % touch parent/zope.pyp/__init__.py
> % PYTHONPATH=parent python3.2
> Python 3.2.3rc1 (default, Mar  9 2012, 23:02:43)
> [GCC 4.6.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import zope
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ImportError: No module named zope

But that's the moral equivalent of using "yield from" and expecting your code to still be importable under Python 3.2 instead of triggering SyntaxError. This is a new-in-Python-3.3 feature, *of course* it isn't going to work with Python 3.2. If you have to support 3.2, then you have to write 3.2-compatible code, *including* your package layouts. Ergo:

% mkdir parent
% mkdir parent/zope
% touch parent/zope/__init__.py
% PYTHONPATH=parent python3.2
Python 3.2.3rc1 (default, Mar  9 2012, 23:02:43)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope
>>>

And the legacy package layout will still work in Python 3.3. This whole objection just doesn't make any sense to me - yes, the new feature won't work in old versions of Python. Code that wants to remain compatible with old versions of Python can't rely on the new feature. How is this different from having to avoid using any other new feature like "yield from"?

Regards,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From martin at v.loewis.de Mon Mar 12 22:13:39 2012
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 12 Mar 2012 14:13:39 -0700
Subject: [Import-SIG] Namespace Packages resolution
Message-ID: <4F5E6703.2080503@v.loewis.de>

We were just discussing the namespaces PEPs here at PyCon. We agreed to accept many, but not all, principles of PEP 402; PEP 382 was essentially rejected. As a consequence, Eric Smith volunteered to write a new PEP covering the consensus.

Here are the basic principles:

- there will be two kinds of packages, "regular packages" and "namespace packages" (exact terminology subject to bikeshedding)
- "regular" packages have an __init__.py, and live in a single directory
- namespace packages can span multiple directories, and cannot have code of their own
- there will be no explicit marker for namespace packages; any directory on the path (sys.path or package __path__) can constitute a package. Package names equal directory names
- Importing a module/package keeps iterating over the parent path as before, and keeps the current precedence:
  * if foo/__init__.py is found, a regular package is imported
  * if not, but foo.{py,pyc,so,pyd} is found, a module is imported
  * if not, but foo is found as a directory, it is recorded

  When the search completes without importing a module, but it did find directories, then a namespace package is created. That namespace package
  * has an __name__ of the first directory that was found
  * has an __path__ which is the full list of directories that were collected

Of PEP 402, the following features were rejected:

- there is no support for code in a namespace package (i.e.
you cannot use both foo/ and foo.py, but one will take precedence - depending on which one occurs first on the path) - the PEP 402 terminology calling things "virtual" was rejected - in the consensus spec, "import foo" alone will already trigger the path search, i.e. it is not deferred until a sub-level import occurs A number of aspects are still undecided - the API to dynamically update the __path__ was not immediately considered necessary. It may or may not be part of the PEP - the specific API of PEP 302 finders and loaders for this feature was not specified; it will draw on the research done for PEP 382 and PEP 402. The implementation will likely be provided once the import mechanism in CPython is based on importlib. Regards, Martin From eric at trueblade.com Mon Mar 12 23:25:20 2012 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 12 Mar 2012 15:25:20 -0700 Subject: [Import-SIG] Namespace Packages resolution In-Reply-To: <4F5E6703.2080503@v.loewis.de> References: <4F5E6703.2080503@v.loewis.de> Message-ID: <4F5E77D0.4000806@trueblade.com> On 03/12/2012 02:13 PM, "Martin v. L?wis" wrote: > We were just discussing the namespaces PEPs here at PyCon. > We agreed to accept many, but not all principles of PEP 402; > PEP 382 was essentially rejected. As a consequence, Eric Smith > volunteered to write a new PEP covering the consensus. > > Here are the basic principles: Thanks for writing this up, Martin. Can someone at the sprints talk to the packaging folks and see if they have any concerns with this approach? I wouldn't think so, but I'd rather find out soon if they do. Eric. From ncoghlan at gmail.com Tue Mar 13 01:21:17 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 10:21:17 +1000 Subject: [Import-SIG] My objections to implicit package directories Message-ID: (originally sent to python-ideas by mistake - redirecting here) It seems the consensus at the PyCon US sprints is that implicit package directories are a wonderful idea and we should have more of those. I still disagree (emphatically), but am prepared to go along with it so long as my documented objections are clearly and explicitly addressed in the new combined PEP, and the benefits ascribed to implicit package directories in the new PEP are more compelling than "other languages do it that way, so we should too". To save people having to trawl around various mailing list threads and read through PEP 395, I'm providing those objections in a consolidated form here. If reading these objections in one place causes people to have second thoughts about the wisdom of implicit package directories, even better. 1. Implicit package directories go against the Zen of Python Getting this one out of the way first. As I see it, implicit package directories violate at least 4 of the design principles in the Zen: - Explicit is better than implicit (my calling them implicit package directories is a deliberate rhetorical ploy to harp on this point, although it's also an accurate name) - If the implementation is hard to explain, it's a bad idea (see the section about backwards compatibility challenges) - Readability counts (see the section about introducing ambiguity into filesystem layouts) - Errors should never pass silently (see the section about implicit relative imports from main) 2. 
Implicit package directories pose awkward backwards compatibility challenges

It concerns me gravely that the consensus proposal MvL posted is *backwards incompatible with Python 3.2*, as it deliberately omits one of the PEP 402 features that provided that backwards compatibility. Specifically, under the consensus, a subdirectory "foo" of a directory on sys.path will shadow a "foo.py" or "foo/__init__.py" that appears later on sys.path. As Python 3.2 would have found that latter module/package correctly, this is an unacceptable breach of the backwards compatibility requirements. PEP 402 at least got this right by always executing the first "foo.py" or "foo/__init__.py" it found, even if another "foo" directory was found earlier in sys.path.

We can't just wave that additional complexity away if an implicit package directory proposal is going to remain backwards compatible with current layouts (e.g. if an application's starting directory included a "json" subfolder containing json files rather than Python code, the consensus approach as posted by MvL would render the standard library's json module inaccessible).

3. Implicit package directories introduce ambiguity into filesystem layouts

With the current Python package design, there is a clear 1:1 mapping between the filesystem layout and the module hierarchy. For example:

    parent/  # This directory goes on sys.path
        project/  # The "project" package
            __init__.py  # Explicit package marker
            code.py  # The "project.code" module
            tests/  # The "project.tests" package
                __init__.py  # Explicit package marker
                test_code.py  # The "project.tests.test_code" module

Any explicit package directory approach will preserve this 1:1 mapping. For example, under PEP 382:

    parent/  # This directory goes on sys.path
        project.pyp/  # The "project" package
            code.py  # The "project.code" module
            tests.pyp/  # The "project.tests" package
                test_code.py  # The "project.tests.test_code" module

With implicit package directories, you can no longer tell purely from the code structure which directory is meant to be added to sys.path, as there are at least two valid mappings to the Python module hierarchy:

    parent/  # This directory goes on sys.path
        project/  # The "project" package
            code.py  # The "project.code" module
            tests/  # The "project.tests" package
                test_code.py  # The "project.tests.test_code" module

    parent/
        project/  # This directory goes on sys.path
            code.py  # The "code" module
            tests/  # The "tests" package
                test_code.py  # The "tests.test_code" module

What are implicit package directories buying us in exchange for this inevitable ambiguity? What can we do with them that can't be done with explicit package directories? And no, "Java does it that way" is not a valid argument.

4.
Implicit package directories will permanently entrench current newbie-hostile behaviour in __main__ It's a fact of life that Python beginners learn that they can do a quick sanity check on modules they're writing by including an "if __name__ == '__main__':" section at the end and doing one of 3 things: - run "python mymodule.py" - hit F5 (or the relevant hot key) in their IDE - double click the module in their filesystem browser - start the Python REPL and do "import mymodule" However, there are some serious caveats to that as soon as you move the module inside a package: - if you use explicit relative imports, you can import it, but not run it directly using any of the above methods - if you rely on implicit relative imports, the above direct execution methods should work most of the time, but you won't be able to import it - if you use absolute imports for your own package, nothing will work (unless the parent directory for your package is already on sys.path) - if you only use absolute imports for *other* packages, everything should be fine The errors you get in these cases are *horrible*. The interpreter doesn't really know what is going on, so it gives the user bad error messages. In large part, the "Why are my imports broken?" section in PEP 395 exists because I sat down to try to document what does and doesn't work when you attempt to directly execute a module from inside a package directory. In building the list of what would work properly ("python -m" from the parent directory of the package) and what would sometimes break (everything else), I realised that instead of documenting the entire hairy mess, the 1:1 mapping from the filesystem layout to the Python module hierarchy meant we could *just fix it* to not do the wrong thing by default. If implicit package directories are blessed for inclusion in Python 3.3, that opportunity is lost forever - with the loss of the unambiguous 1:1 mapping from the filesystem layout to the module hierarchy, it's no longer possible for the interpreter to figure out the right thing to do without guessing. PJE proposed that newbies be instructed to add the following boilerplate to their modules if they want to use "if __name__ == '__main__':" for sanity checking: import pkgutil pkgutil.script_module(__name__, 'project.code.test_code') This completely defeats the purpose of having explicit relative imports in the language, as it embeds the absolute name of the module inside the module itself. If a package subtree is ever moved or renamed, you will have to manually fix every script_module() invocation in that subtree. Double-keying data like this is just plain bad design. The package structure should be recorded explicitly in exactly one place: the filesystem. PJE has other objections to the PEP 395 proposal, specifically relating to its behaviour on package layouts where the directories added to sys.path contain __init__.py files, such that the developer's intent is not accurately reflected in their filesystem layout. Such layouts are *broken*, and the misbehaviour under PEP 395 won't be any worse than the misbehaviour with the status quo (sys.path[0] is set incorrectly in either case, it will just be fixable under PEP 395 by removing the extraneous __init__.py files). A similar argument applies to cases where a parent package __init__ plays games with sys.path (although the PEP 395 algorithm could likely be refined to better handle that situation). 
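For reference, the package directory detection that PEP 395 depends on amounts to a short filesystem walk over those explicit markers; a rough sketch (illustrative only, not the PEP's reference implementation, and assuming a plain __init__.py layout):

    import os

    def split_path_module(script_path):
        # Walk up from the script while the parent directories are explicitly
        # marked as packages; the first unmarked ancestor belongs on sys.path,
        # and the marked directories give the module's qualified name.
        module_dir = os.path.dirname(os.path.abspath(script_path))
        parts = [os.path.splitext(os.path.basename(script_path))[0]]
        while os.path.isfile(os.path.join(module_dir, '__init__.py')):
            module_dir, pkg_name = os.path.split(module_dir)
            parts.insert(0, pkg_name)
        return module_dir, '.'.join(parts)

    # e.g. (hypothetical layout matching the trees above):
    # split_path_module('/src/parent/project/tests/test_code.py')
    # -> ('/src/parent', 'project.tests.test_code')

Without an explicit marker to stop on, a walk like this has no well-defined termination point.
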
Regardless, if implicit package directories are accepted into Python 3.3 in any form, I *will* be immediately marking PEP 395 as Rejected due to incompatibility with an accepted PEP. I'll then (eventually, once I'm less annoyed about the need to do so) write a new PEP to address a subset of the issues previously covered by PEP 395 that omits any proposals that rely on explicit package directories. Also, I consider it a requirement that any implicit packages PEP include an update to the tutorial to explain to beginners what will and won't work when they attempt to directly execute a module from inside a Python package. After all, such a PEP is closing off any possibility of ever fixing the problem: it should have to deal with the consequences. Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Tue Mar 13 01:47:09 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 10:47:09 +1000 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 10:21 AM, Nick Coghlan wrote: > However, there are some serious caveats to that as soon as you move > the module inside a package: > - if you use explicit relative imports, you can import it, but not run > it directly using any of the above methods > - if you rely on implicit relative imports, the above direct execution > methods should work most of the time, but you won't be able to import > it Sorry, those two caveats are not completely accurate (like I said, the current behaviour is a hairy mess): - if you use explicit relative imports, you can import it under it's full name if you started the REPL from the parent directory of the package, but cannot import it from the current directory or run it directly - if you use implicit relative imports (which are supposed to be completely gone in Python 3), you can run it directly or import it under its module name if you started the REPL from the current directory, but cannot import it under its full name Is it any wonder beginners get confused by packages and imports? We tell third party developers "never ever put a package directory directly on sys.path", and then go ahead it and do it ourselves. Adopting implicit package directories means forever giving up the possibility of even providing better error messages for this case (because __main__ will now have no way to tell whether it's in a package or not). Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Tue Mar 13 01:48:23 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 10:48:23 +1000 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 10:47 AM, Nick Coghlan wrote: > Sorry, those two caveats are not completely accurate (like I said, the > current behaviour is a hairy mess): > - if you use explicit relative imports, you can import it under it's > full name if you started the REPL from the parent directory of the > package, but cannot import it from the current directory or run it > directly > - if you use implicit relative imports (which are supposed to be > completely gone in Python 3), you can run it directly or import it > under its module name if you started the REPL from the current > directory, but cannot import it under its full name s/current directory/directory containing the module/ -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ericsnowcurrently at gmail.com Tue Mar 13 03:50:56 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 12 Mar 2012 19:50:56 -0700 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 7:43 PM, Eric Snow wrote: > On Mon, Mar 12, 2012 at 5:03 PM, Nick Coghlan wrote: >> It seems the consensus at the PyCon US sprints is that implicit >> package directories are a wonderful idea and we should have more of >> those. I still disagree (emphatically), but am prepared to go along >> with it so long as my documented objections are clearly and explicitly >> addressed in the new combined PEP, and the benefits ascribed to >> implicit package directories in the new PEP are more compelling than >> "other languages do it that way, so we should too". >> >> To save people having to trawl around various mailing list threads and >> reading through PEP 395, I'm providing those objections in a >> consolidated form here. If reading these objections in one place >> causes people to have second thoughts about the wisdom of implicit >> package directories, even better. >> >> 1. Implicit package directories go against the Zen of Python >> >> Getting this one out of the way first. As I see it, implicit package >> directories violate at least 4 of the design principles in the Zen: >> - Explicit is better than implicit (my calling them implicit package >> directories is a deliberate rhetorical ploy to harp on this point, >> although it's also an accurate name) >> - If the implementation is hard to explain, it's a bad idea (see the >> section about backwards compatibility challenges) >> - Readability counts (see the section about introducing ambiguity into >> filesystem layouts) >> - Errors should never pass silently (see the section about implicit >> relative imports from main) >> >> 2. Implicit package directories pose awkward backwards compatibility challenges >> >> It concerns me gravely that the consensus proposal MvL posted is >> *backwards incompatible with Python 3.2*, as it deliberately omits one >> of the PEP 402 features that provided that backwards compatibility. >> Specifically, under the consensus, a subdirectory "foo" of a directory >> on sys.path will shadow a "foo.py" or "foo/__init__.py" that appears >> later on sys.path. As Python 3.2 would have found that latter >> module/package correctly, this is an unacceptable breach of the >> backwards compatibility requirements. PEP 402 at least got this right >> by always executing the first "foo.py" or "foo/__init__.py" it found, >> even if >> another "foo" directory was found earlier in sys.path. >> >> We can't just wave that additional complexity away if an implicit >> package directory proposal is going to remain backwards compatible >> with current layouts (e.g. if an application's starting directory >> included a "json" subfolder containing json files rather than Python >> code, the consensus approach as posted by MvL would render the >> standard library's json module inaccessible) >> >> 3. Implicit package directories introduce ambiguity into filesystem layouts >> >> With the current Python package design, there is a clear 1:1 mapping >> between the filesystem layout and the module hierarchy. For example: >> >> ? ?parent/ ?# This directory goes on sys.path >> ? ? ? ?project/ ?# The "project" package >> ? ? ? ? ? ?__init__.py ?# Explicit package marker >> ? ? ? ? ? ?code.py ?# The "project.code" module >> ? ? ? ? ? 
?tests/ ?# The "project.tests" package >> ? ? ? ? ? ? ? ?__init__.py ?# Explicit package marker >> ? ? ? ? ? ? ? ?test_code.py ?# The "projects.tests.test_code" module >> >> Any explicit package directory approach will preserve this 1:1 >> mapping. For example, under PEP 382: >> >> ? ?parent/ ?# This directory goes on sys.path >> ? ? ? ?project.pyp/ ?# The "project" package >> ? ? ? ? ? ?code.py ?# The "project.code" module >> ? ? ? ? ? ?tests.pyp/ ?# The "project.tests" package >> ? ? ? ? ? ? ? ?test_code.py ?# The "projects.tests.test_code" module >> >> With implicit package directories, you can no longer tell purely from >> the code structure which directory is meant to be added to sys.path, >> as there are at least two valid mappings to the Python module >> hierarchy: >> >> ? ?parent/ ?# This directory goes on sys.path >> ? ? ? ?project/ ?# The "project" package >> ? ? ? ? ? ?code.py ?# The "project.code" module >> ? ? ? ? ? ?tests/ ?# The "project.tests" package >> ? ? ? ? ? ? ? ?test_code.py ?# The "projects.tests.test_code" module >> >> ? ?parent/ >> ? ? ? ?project/ ?# This directory goes on sys.path >> ? ? ? ? ? ?code.py ?# The "code" module >> ? ? ? ? ? ?tests/ ?# The "tests" package >> ? ? ? ? ? ? ? ?test_code.py ?# The "tests.test_code" module >> >> What are implicit package directories buying us in exchange for this >> inevitable ambiguity? What can we do with them that can't be done with >> explicit package directories? And no, "Java does it that way" is not a >> valid argument. >> >> 4. Implicit package directories will permanently entrench current >> newbie-hostile behaviour in __main__ >> >> It's a fact of life that Python beginners learn that they can do a >> quick sanity check on modules they're writing by including an "if >> __name__ == '__main__':" section at the end and doing one of 3 things: >> - run "python mymodule.py" >> - hit F5 (or the relevant hot key) in their IDE >> - double click the module in their filesystem browser >> - start the Python REPL and do "import mymodule" >> >> However, there are some serious caveats to that as soon as you move >> the module inside a package: >> - if you use explicit relative imports, you can import it, but not run >> it directly using any of the above methods >> - if you rely on implicit relative imports, the above direct execution >> methods should work most of the time, but you won't be able to import >> it >> - if you use absolute imports for your own package, nothing will work >> (unless the parent directory for your package is already on sys.path) >> - if you only use absolute imports for *other* packages, everything >> should be fine >> >> The errors you get in these cases are *horrible*. The interpreter >> doesn't really know what is going on, so it gives the user bad error >> messages. >> >> In large part, the "Why are my imports broken?" section in PEP 395 >> exists because I sat down to try to document what does and doesn't >> work when you attempt to directly execute a module from inside a >> package directory. In building the list of what would work properly >> ("python -m" from the parent directory of the package) and what would >> sometimes break (everything else), I realised that instead of >> documenting the entire hairy mess, the 1:1 mapping from the filesystem >> layout to the Python module hierarchy meant we could *just fix it* to >> not do the wrong thing by default. 
If implicit package directories are >> blessed for inclusion in Python 3.3, that opportunity is lost forever >> - with the loss of the unambiguous 1:1 mapping from the filesystem >> layout to the module hierarchy, it's no longer possible for the >> interpreter to figure out the right thing to do without guessing. >> >> PJE proposed that newbies be instructed to add the following >> boilerplate to their modules if they want to use "if __name__ == >> '__main__':" for sanity checking: >> >> ? ?import pkgutil >> ? ?pkgutil.script_module(__name__, 'project.code.test_code') >> >> This completely defeats the purpose of having explicit relative >> imports in the language, as it embeds the absolute name of the module >> inside the module itself. If a package subtree is ever moved or >> renamed, you will have to manually fix every script_module() >> invocation in that subtree. Double-keying data like this is just plain >> bad design. The package structure should be recorded explicitly in >> exactly one place: the filesystem. >> >> PJE has other objections to the PEP 395 proposal, specifically >> relating to its behaviour on package layouts where the directories >> added to sys.path contain __init__.py files, such that the developer's >> intent is not accurately reflected in their filesystem layout. Such >> layouts are *broken*, and the misbehaviour under PEP 395 won't be any >> worse than the misbehaviour with the status quo (sys.path[0] is set >> incorrectly in either case, it will just be fixable under PEP 395 by >> removing the extraneous __init__.py files). A similar argument applies >> to cases where a parent package __init__ plays games with sys.path >> (although the PEP 395 algorithm could likely be refined to better >> handle that situation). Regardless, if implicit package directories >> are accepted into Python 3.3 in any form, I *will* be immediately >> marking PEP 395 as Rejected due to incompatibility with an accepted >> PEP. I'll then (eventually, once I'm less annoyed about the need to do >> so) write a new PEP to address a subset of the issues previously >> covered by PEP 395 that omits any proposals that rely on explicit >> package directories. >> >> Also, I consider it a requirement that any implicit packages PEP >> include an update to the tutorial to explain to beginners what will >> and won't work when they attempt to directly execute a module from >> inside a Python package. After all, such a PEP is closing off any >> possibility of ever fixing the problem: it should have to deal with >> the consequences. > > Hi Nick, > > The write-up was a little unclear on a main point and I think that's > contributed to some confusion here. ?The path search will continue to > work in exactly the same way as it does now, with one difference. > Instead of the current ImportError when nothing matches, the mechanism > for namespace packages would be used. > > The mechanism would create a namespace package with a __path__ > matching the paths corresponding to all namespace package "portions". > The likely implementation will simply track the namespace package > __path__ during the initial (normal) path search and use it only when > there are no matching modules nor regular packages. > > Packages without __init__.py would only be allowed for namespace > packages. ?So effectively namespace packages would be problematic for > PEP 395, but not normal packages. > > Ultimately this is a form of PEP 402 without so much complexity. ?The > trade-off is it requires a new kind of package. 
As far as I > understand them, most of your concerns are based on the idea that > namespace packages would be included in the initial traversal of > sys.path, which is not the case. It sounds like there are a couple > points you made that may still need attention, but hopefully this at > least helps clarify what we talked about. > > -eric

sorry (reply all failed me here :)) reposting to import-sig

-eric

From carl at oddbird.net Tue Mar 13 03:30:14 2012
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 12 Mar 2012 19:30:14 -0700
Subject: [Import-SIG] Namespace Packages resolution
In-Reply-To: <4F5E6703.2080503@v.loewis.de>
References: <4F5E6703.2080503@v.loewis.de>
Message-ID: <4F5EB136.6080101@oddbird.net>

Thanks Martin for this summary. I do have one question; I'd thought I'd just wait and see if the upcoming new PEP clarified, but since my question also shows up as a major item in Nick's list of objections, it may be worth asking for clarity now:

On 03/12/2012 02:13 PM, "Martin v. Löwis" wrote:
[snip]
> - Importing a module/package keeps iterating over the parent
>   path as before, and keeps the current precedence:
>   * if foo/__init__.py is found, a regular package is imported
>   * if not, but foo.{py,pyc,so,pyd} is found, a module is imported
>   * if not, but foo is found as a directory, it is recorded
>
>   When the search completes without importing a module, but it did
>   find directories, then a namespace package is created. That
>   namespace package
>   * has an __name__ of the first directory that was found
>   * has an __path__ which is the full list of directories
>     that were collected
>
> Of PEP 402, the following features were rejected:
> - there is no support for code in a namespace package
>   (i.e. you cannot use both foo/ and foo.py, but one
>   will take precedence - depending on which one
>   occurs first on the path)

If there is a "foo/" earlier on sys.path and a "foo.py" later, which wins? This last parenthetical seems to say that the earlier "foo/" would win, but the more thorough precedence description above indicates that the later "foo.py" would (because the creation of a namespace package only kicks in if no other module is importable anywhere on sys.path).

Here is a more precise pseudocode description of how I understand the proposed algorithm, assuming that someone tries to import "foo". Hopefully someone involved in the conversation can correct this if it is wrong:

    namespace_paths = []
    for entry in sys.path:
        if isfile("{entry}/foo/__init__.py"):
            # import and return the old-style package
        elif isfile("{entry}/foo.py"):
            # import and return the module
        elif isdir("{entry}/foo/"):
            namespace_paths.append("{entry}/foo/")
    # no regular module/package was importable anywhere on sys.path
    # return a namespace module with __path__ = namespace_paths

If this is indeed the proposed algorithm, then I think the "backwards compatibility" item on Nick's list of objections is unfounded. The new behavior will only occur in situations that would previously have resulted in an ImportError.

Carl

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: OpenPGP digital signature URL: From ncoghlan at gmail.com Tue Mar 13 04:17:59 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 13:17:59 +1000 Subject: [Import-SIG] Namespace Packages resolution In-Reply-To: <4F5EB136.6080101@oddbird.net> References: <4F5E6703.2080503@v.loewis.de> <4F5EB136.6080101@oddbird.net> Message-ID: On Tue, Mar 13, 2012 at 12:30 PM, Carl Meyer wrote: > If this is indeed the proposed algorithm, then I think the "backwards > compatibility" item on Nick's list of objections is unfounded. The new > behavior will only occur in situations that would previously have > resulted in an ImportError. The complaint that it's a complicated hack that is only necessary to cope with the lack of explicit package markers still stands, though. (But yes, I was definitely going off the parenthetical comment rather than the earlier description that contradicted that part). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Tue Mar 13 04:35:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 13:35:13 +1000 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 12:50 PM, Eric Snow wrote: >> Ultimately this is a form of PEP 402 without so much complexity. ?The >> trade-off is it requires a new kind of package. ?As far as I >> understand them, most of your concerns are based on the idea that >> namespace packages would be included in the initial traversal of >> sys.path, which is not the case. ?It sounds like there are a couple >> points you made that may still need attention, but hopefully this at >> least helps clarify what we talked about. No, the backwards compatibility issue was only one minor concern. I'm pleased to hear it was a misreading of MvL's message on my part, but dealing with implicit package directories in a backwards compatible way is still a lot more complicated than "hey, instead of including '__init__.py' as a marker file, you can just append '.pyp' to the directory name. If Python encounters a 'foo.pyp' directory when trying to import 'foo', it will create a new 'foo' package, then keep scanning sys.path and append any additional 'foo.pyp' directories it finds to foo.__path__." The deal breaker for me is the loss of the 1:1 mapping between filesystem layouts and the module heirarchy. Losing that sucks, but people appear to be willing to throw it away for no payoff beyond "Java and Perl users don't have to explicitly mark their package directories, why do Python users?". Python chooses explicit rather than implicit self, why should package directories be any different? Anything implicit should require an extraordinary pay-off in expressiveness before it gets added to Python, and I haven't seen anything remotely like that in this case. Instead, it looks like we're going to get a solution that is both implicit and *less* expressive than the status quo. I could actually fix the interaction between multiprocessing and -m *today* in 2.7 and 3.2, but what's the point if 3.3 is just going to add a feature that makes that fix invalid? (To elaborate on that last point: I recently realised that the algorithm described in PEP 395 could be applied *today* in the part of multiprocessing that figures out how to launch the child process on Windows. No PEP needed or public API change needed) Regards, Nick. -- Nick Coghlan?? |?? 
ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Tue Mar 13 04:49:12 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Mar 2012 20:49:12 -0700 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 5:21 PM, Nick Coghlan wrote: > It seems the consensus at the PyCon US sprints is that implicit > package directories are a wonderful idea and we should have more of > those. I still disagree (emphatically), but am prepared to go along > with it so long as my documented objections are clearly and explicitly > addressed in the new combined PEP, and the benefits ascribed to > implicit package directories in the new PEP are more compelling than > "other languages do it that way, so we should too". > > To save people having to trawl around various mailing list threads and > read through PEP 395, I'm providing those objections in a > consolidated form here. (Thanks for that.) > If reading these objections in one place > causes people to have second thoughts about the wisdom of implicit > package directories, even better. > > 1. Implicit package directories go against the Zen of Python > > Getting this one out of the way first. As I see it, implicit package > directories violate at least 4 of the design principles in the Zen: > - Explicit is better than implicit (my calling them implicit package > directories is a deliberate rhetorical ploy to harp on this point, > although it's also an accurate name) > - If the implementation is hard to explain, it's a bad idea (see the > section about backwards compatibility challenges) > - Readability counts (see the section about introducing ambiguity into > filesystem layouts) > - Errors should never pass silently (see the section about implicit > relative imports from main) Whatever. There's "practicality beats purity" though, and unmarked directories are quite intuitive and logical to newcomers. In fact, the original package implementation (ni.py, checked in originally with rev 2887:ec0b42889243), __init__.py was optional. At the time we hadn't thought of the use case of "namespace packages" like zope.interfaces and zope.components, where there are multiple distributable "bundles" that install different *portions* of the package. During the meeting it also came up that there are two styles in use for this purpose: multiple distro bundles that install into the *same* directory, or multiple distro bundles installing into different directories (whose parents are all added to sys.path separately). We also came up with the encodings package as a potential namespace package, although it currently doesn't have an empty __init__.py. But hold that thought, there's more that I'll address later. > 2. Implicit package directories pose awkward backwards compatibility challenges > > It concerns me gravely that the consensus proposal MvL posted is > *backwards incompatible with Python 3.2*, as it deliberately omits one > of the PEP 402 features that provided that backwards compatibility. > Specifically, under the consensus, a subdirectory "foo" of a directory > on sys.path will shadow a "foo.py" or "foo/__init__.py" that appears > later on sys.path. As Python 3.2 would have found that latter > module/package correctly, this is an unacceptable breach of the > backwards compatibility requirements. PEP 402 at least got this right > by always executing the first "foo.py" or "foo/__init__.py" it found, > even if > another "foo" directory was found earlier in sys.path. 
> > We can't just wave that additional complexity away if an implicit > package directory proposal is going to remain backwards compatible > with current layouts (e.g. if an application's starting directory > included a "json" subfolder containing json files rather than Python > code, the consensus approach as posted by MvL would render the > standard library's json module inaccessible) We must have explained this badly, because (just like PEP 402, AFAIK) this is *not* how it works. It works as follows: *If* there is a foo.py or a foo/__init__.py anywhere along sys.path, the *current* rules apply. That is, if one of these occurs on an earlier sys.path entry, it wins; if both of these occur together on the same sys.path entry, foo/__init__.py wins. (We discovered that the latter disambiguation must prefer the directory, not just for backwards compatibility, but also to make relative imports in subpackages work right. This is probably the biggest deviation from PEP 402.) And in this case all those foo/ directories *without* a __init__.py in them are completely ignored, even if they come before either foo.py or foo/__init__.py on sys.path. (If __init__.py wants to manipulate its own __path__, that's fine.) *Only* if *neither* foo.py *nor* foo/__init__.py is found *anywhere* along sys.path do we take all directories foo/ along sys.path together and combine them into a namespace package. If there are no foo/ directories at all, the import fails. If there is exactly one foo/, it acts like a classic package with an empty __init__.py. We avoid having to do two scans of sys.path by collecting info about __init__.py-less foo/ directories during the same scan where we look for foo.py and foo/__init__.py; but we collect it in a separate variable. (It occurs to me that this may not be trivial when PEP-302-style finders are involved. That's a detail that will have to be figured out later.) So the only backwards incompatibility is that "import foo" may succeed where it previously failed if there is a directory foo/ somewhere on sys.path but no foo.py and no foo/__init__.py anywhere. I don't think this is a big deal. (Note: where I write foo.py, I should really write foo.py/foo.pyc/foo.pyo/foo.so/foo.pyd. But that's such a mouthful...) > 3. Implicit package directories introduce ambiguity into filesystem layouts > > With the current Python package design, there is a clear 1:1 mapping > between the filesystem layout and the module hierarchy. For example: > > parent/ # This directory goes on sys.path > project/ # The "project" package > __init__.py # Explicit package marker > code.py # The "project.code" module > tests/ # The "project.tests" package > __init__.py # Explicit package marker > test_code.py # The "projects.tests.test_code" module > > Any explicit package directory approach will preserve this 1:1 > mapping. For example, under PEP 382: > > parent/ # This directory goes on sys.path > project.pyp/ # The "project" package > code.py # The "project.code" module > tests.pyp/ # The "project.tests" package > test_code.py # The "projects.tests.test_code" module > > With implicit package directories, you can no longer tell purely from > the code structure which directory is meant to be added to sys.path, > as there are at least two valid mappings to the Python module > hierarchy: > > parent/ # This directory goes on sys.path > project/ # The "project" package >
code.py ?# The "project.code" module > ? ? ? ? ? tests/ ?# The "project.tests" package > ? ? ? ? ? ? ? test_code.py ?# The "projects.tests.test_code" module > > ? parent/ > ? ? ? project/ ?# This directory goes on sys.path > ? ? ? ? ? code.py ?# The "code" module > ? ? ? ? ? tests/ ?# The "tests" package > ? ? ? ? ? ? ? test_code.py ?# The "tests.test_code" module I know this bothers you greatly, because you wrote at great length about it in PEP 395. But personally I think that being able to guess the highest package directory given the name of a .py file nested deep inside it is a pretty esoteric use case and I can live with this continuing to be broken (since it is already broken) for the sake of a simpler package structure (no __init__.py files!). > What are implicit package directories buying us in exchange for this > inevitable ambiguity? What can we do with them that can't be done with > explicit package directories? And no, "Java does it that way" is not a > valid argument. Apart from the pitchfork incident referenced in PEP 402, I have had many other complaints about the ubiquitous empty __init__.py files. They may be empty, but they sure take up space in e.g. directory listings or zipfiles. For example, there are 409 empty __init__.py files in the Django 1.4c1 distro, plus 25 more that contain either an empty comment or a blank line. I've also seen __init__.py files with a single rude comment in them, and in my G+ stream I've seen comments on random Python topics making a snide reference to empty __init__.py files. (There are also coding guidelines in some places that prohibit having real code in __init__.py files.) Quite separately, it also gives us an easy way to have namespace packages spread across multiple directories. This is clearly a popular feature, given that there are at least *two* different convenience APIs to make this easy (one in pkgutil.py, another in setuptools). I did a quick search for "import pkgutil" on koders.com and the first 25 hits (of 792) are all declaring namespace packages, many using an awkward idiom using a try/except to import either pkg_resources or pkgutil. This awkwardness really bugs me and being able to eventually drop it is a big draw for me. > 4. Implicit package directories will permanently entrench current > newbie-hostile behaviour in __main__ > > It's a fact of life that Python beginners learn that they can do a > quick sanity check on modules they're writing by including an "if > __name__ == '__main__':" section at the end and doing one of 3 things: > - run "python mymodule.py" > - hit F5 (or the relevant hot key) in their IDE > - double click the module in their filesystem browser > - start the Python REPL and do "import mymodule" [...] Our *four*...no... *Amongst* our weapons.... Amongst our weaponry...are such elements as fear, surprise.... I'll come in again. > However, there are some serious caveats to that as soon as you move > the module inside a package: > - if you use explicit relative imports, you can import it, but not run > it directly using any of the above methods > - if you rely on implicit relative imports, the above direct execution > methods should work most of the time, but you won't be able to import > it > - if you use absolute imports for your own package, nothing will work > (unless the parent directory for your package is already on sys.path) > - if you only use absolute imports for *other* packages, everything > should be fine > > The errors you get in these cases are *horrible*. 
The interpreter > doesn't really know what is going on, so it gives the user bad error > messages. > > In large part, the "Why are my imports broken?" section in PEP 395 > exists because I sat down to try to document what does and doesn't > work when you attempt to directly execute a module from inside a > package directory. In building the list of what would work properly > ("python -m" from the parent directory of the package) and what would > sometimes break (everything else), I realised that instead of > documenting the entire hairy mess, the 1:1 mapping from the filesystem > layout to the Python module hierarchy meant we could *just fix it* to > not do the wrong thing by default. If implicit package directories are > blessed for inclusion in Python 3.3, that opportunity is lost forever > - with the loss of the unambiguous 1:1 mapping from the filesystem > layout to the module hierarchy, it's no longer possible for the > interpreter to figure out the right thing to do without guessing. I understand your frustration at just having analyzed this mess and come up with a solution, only to see it permanently sabotaged before you could even implement it. But it's an existing mess, and if I really have to choose between solving this mess or solving the empty-init mess, I vote for solving the latter. But I would hope that the most common cases are still that the package in fact already exists on sys.path, possibly because it is rooted in the current directory, or because the package has been properly installed. In this case you should have no problem computing the toplevel package implied. The other common case is where the current directory is *inside* the package. I agree this is a bad mess. But does this happen with a typical IDE? It seems more common when using the shell. Anyway, maybe we just have to document more aggressively that this is a bad idea and explain to people how to avoid it. (One of the ways to avoid it would be "add an empty __init__.py to your package directories", since that will in fact still avoid it.) There's also a nasty habit that Django has around packages and parent directories. The Django developers announced at PyCon that they're breaking this habit in Django 1.4. (And they also announced that Django 1.5 will be compatible with Python 3.3!) > PJE proposed that newbies be instructed to add the following > boilerplate to their modules if they want to use "if __name__ == > '__main__':" for sanity checking: > > ? import pkgutil > ? pkgutil.script_module(__name__, 'project.code.test_code') > > This completely defeats the purpose of having explicit relative > imports in the language, as it embeds the absolute name of the module > inside the module itself. If a package subtree is ever moved or > renamed, you will have to manually fix every script_module() > invocation in that subtree. Double-keying data like this is just plain > bad design. The package structure should be recorded explicitly in > exactly one place: the filesystem. I agree that telling newbies to do *anything* with pkgutil is backwards. > PJE has other objections to the PEP 395 proposal, specifically > relating to its behaviour on package layouts where the directories > added to sys.path contain __init__.py files, such that the developer's > intent is not accurately reflected in their filesystem layout. 
Such > layouts are *broken*, and the misbehaviour under PEP 395 won't be any > worse than the misbehaviour with the status quo (sys.path[0] is set > incorrectly in either case, it will just be fixable under PEP 395 by > removing the extraneous __init__.py files). A similar argument applies > to cases where a parent package __init__ plays games with sys.path > (although the PEP 395 algorithm could likely be refined to better > handle that situation). Regardless, if implicit package directories > are accepted into Python 3.3 in any form, I *will* be immediately > marking PEP 395 as Rejected due to incompatibility with an accepted > PEP. I'll then (eventually, once I'm less annoyed about the need to do > so) write a new PEP to address a subset of the issues previously > covered by PEP 395 that omits any proposals that rely on explicit > package directories. Please reconsider -- there was at least one important detail in the proposal that you misunderstood. > Also, I consider it a requirement that any implicit packages PEP > include an update to the tutorial to explain to beginners what will > and won't work when they attempt to directly execute a module from > inside a Python package. That's fine. > After all, such a PEP is closing off any > possibility of ever fixing the problem: it should have to deal with > the consequences. Not so gloomy, Nick! There are still quite a few cases that can be detected properly. I think the rule "don't cd into a package" covers most cases. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Tue Mar 13 04:51:05 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 13:51:05 +1000 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 1:35 PM, Nick Coghlan wrote: > (To elaborate on that last point: I recently realised that the > algorithm described in PEP 395 could be applied *today* in the part of > multiprocessing that figures out how to launch the child process on > Windows. No PEP needed or public API change needed) On further reflection, I'll retract that comment. I suspect the bad interaction between multiprocessing and -m may be fixable just be relying on the fact that -m already sets "__package__" and sys.argv[0] appropriately and passing those two values through to the subprocess. So, while potentially interesting in its own right, ultimately irrelevant to the question of whether or not package directories should require an explicit marker or if its OK for every directory encountered to be implicitly considered a package. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ironfroggy at gmail.com Tue Mar 13 04:51:30 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Mon, 12 Mar 2012 23:51:30 -0400 Subject: [Import-SIG] Namespace Packages resolution In-Reply-To: References: <4F5E6703.2080503@v.loewis.de> <4F5EB136.6080101@oddbird.net> Message-ID: On Mar 12, 2012 11:18 PM, "Nick Coghlan" wrote: > > On Tue, Mar 13, 2012 at 12:30 PM, Carl Meyer wrote: > > If this is indeed the proposed algorithm, then I think the "backwards > > compatibility" item on Nick's list of objections is unfounded. The new > > behavior will only occur in situations that would previously have > > resulted in an ImportError. > > The complaint that it's a complicated hack that is only necessary to > cope with the lack of explicit package markers still stands, though. 
> (But yes, I was definitely going off the parenthetical comment rather > than the earlier description that contradicted that part). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > I hate to have left before the sprints and not be around to chat about this! I maintain a not-unpopular library, straight.plugin, which uses namespace packages to collect and load various types of plugins as or from the modules in that namespace. So, I'm interested in the impact this will have on my little project. At the first posting of these objections, it seemed to indicate these implicit packages would break my system for plugins. With the clarifications since made, it seems it would make this and similar plugin loaders easier to use, lowering the barrier for less experienced developers who do not understand the package system. Thus, FWIW, if the changes proposed would only alter the namespace package writing by simplifying the process and excluding the boilerplate __init__.py, this humble plugin library author gives two thumbs up. _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Mar 13 08:04:04 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 13 Mar 2012 00:04:04 -0700 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: <20120313000404.379ab09c@resist> On Mar 12, 2012, at 08:49 PM, Guido van Rossum wrote: >*If* there is a foo.py or a foo/__init__.py anywhere along sys.path, >the *current* rules apply. That is, if one of these occurs on an >earlier sys.path entry, it wins; if both of these occur together on >the same sys.path entry, foo/__init__.py wins. (We discovered that the >latter disambiguation must prefer the directory, not just for >backwards compatibility, but also to make relative imports in >subpackages work right. This is probably the biggest deviation from >PEP 402.) And in this case all those foo/ directories *without* a >__init__.py in them are completely ignored, even if they come before >either foo.py or foo/__init__.py on sys.path. (If __init__.py wants to >manipulate its own __path__, that's fine.) > >*Only* if *neither* foo.py *nor* foo/__init__.py is found *anywhere* >along sys.path do we take all directories foo/ along sys.path together >and combine them into a namespace package. If there are no foo/ >directories at all, the import fails. If there is exactly one foo/, it >acts like a classic package with an empty __init__.py. We avoid having >to do two scans of sys.path by collecting info about __init__.py-less >foo/ directories during the same scan where we look for foo.py and >foo/__init__.py; but we collect it in a separate variable. (It occurs >to me that this may not be trivial when PEP-302-style finders are >involved. That's a detail that will have to be figured out later.) > >So the only backwards incompatibility is that "import foo" may succeed >where it previously failed if there is a directory foo/ somewhere on >sys.path but no foo.py and no foo/__init__.py anywhere. I don't think >this is a big deal. In my sleep-deprived thinking about this, I notice that, as with my previous objection to .pyp directory names, this will make it difficult for vendors like Debian to support namespace packages when providing both Python 3.2 and 3.3.
However, unlike the current PEP 382, I see an easy way out during the transition until we can ignore anything < 3.3. While we still support 3.2 and 3.3, our package installation helpers will simply continue to write __init__.py files into namespace package directories, just as it does now. Since we (generally, assuming nothing insane like extra non-__path__ hacking code is in the upstream namespace __init__.py file) generally substitute a consistently written __init__.py file for namespace packages, containing only the standard __path__ hackery, this will continue to work just as it does today even after this new PEP lands. IOW, backward compatibility FTW! Once we can drop 3.2, our helper simply removes those namespace __init__.py files, and all is happy in a 3.3-and-beyond world. Cheers, -Barry From ncoghlan at gmail.com Tue Mar 13 08:23:39 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 13 Mar 2012 17:23:39 +1000 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: <20120313000404.379ab09c@resist> References: <20120313000404.379ab09c@resist> Message-ID: On Tue, Mar 13, 2012 at 5:04 PM, Barry Warsaw wrote: > While we still support 3.2 and 3.3, our package installation helpers will > simply continue to write __init__.py files into namespace package directories, > just as it does now. ?Since we (generally, assuming nothing insane like extra > non-__path__ hacking code is in the upstream namespace __init__.py file) > generally substitute a consistently written __init__.py file for namespace > packages, containing only the standard __path__ hackery, this will continue to > work just as it does today even after this new PEP lands. ?IOW, backward > compatibility FTW! > > Once we can drop 3.2, our helper simply removes those namespace __init__.py > files, and all is happy in a 3.3-and-beyond world. Right, that's why I was confused by your earlier objections - existing namespace package implementations should continue to work just as they do now regardless of whether the native namespace pacakge solution in Python 3.3 uses implicit or explicit package directories (although, in the latter case, pkgutils should likely be updated to consider both "name" and "name.pyp" directories when asked to explicitly extending a namespace package path in 3.3). If that ever isn't the case, then the new proposal has a backwards compatibility problem that needs to be fixed. (And yes, I still have some hope of persuading Guido that the incremental step from a marker file to a marker suffix is a better option than making the irreversible leap straight to implicit package directories). Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From yselivanov.ml at gmail.com Tue Mar 13 16:05:23 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Tue, 13 Mar 2012 11:05:23 -0400 Subject: [Import-SIG] Namespace Packages resolution Message-ID: First of all, I'm sorry for breaking the thread, I've just subscribed for it and have no email to reply to. In PEP 382 it was proposed to have explicit namespace packages markers '.pyp'. And with that approach it was relatively easy to implement nested namespace packages, for instance: package1/com.pyp/acme.pyp/package1/__init__.py and package2/com.pyp/acme.pyp/package2/__init__.py So that we can later import >>> from com.acme import package1, package2 Is this possible with the new approach? 
Thanks, - Yury From guido at python.org Tue Mar 13 16:48:27 2012 From: guido at python.org (Guido van Rossum) Date: Tue, 13 Mar 2012 08:48:27 -0700 Subject: [Import-SIG] Namespace Packages resolution In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 8:05 AM, Yury Selivanov wrote: > First of all, I'm sorry for breaking the thread, I've just subscribed > for it and have no email to reply to. > > In PEP 382 it was proposed to have explicit namespace packages markers > '.pyp'. ?And with that approach it was relatively easy to implement > nested namespace packages, for instance: > > ?package1/com.pyp/acme.pyp/package1/__init__.py > > and > > ?package2/com.pyp/acme.pyp/package2/__init__.py > > So that we can later import > > ?>>> from com.acme import package1, package2 > > Is this possible with the new approach? Yes, that is possible with the new approach -- the same approach (look for foo/__init__.py and foo.py first, then fall back to foo/ and collapse all foo/ you find) is meant to work at every level: toplevel, in a package, in a subpackage, etc... -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Mar 13 17:26:55 2012 From: guido at python.org (Guido van Rossum) Date: Tue, 13 Mar 2012 09:26:55 -0700 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: Oh, shit. Nick posted a bunch of messages to python-ideas instead of import-sig, and I followed up there. Instead of reposting, I'm just going to suggest that people interested in this discussion will, unfortunately, have to follow both lists. -- --Guido van Rossum (python.org/~guido) From ericsnowcurrently at gmail.com Tue Mar 13 21:25:04 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 13 Mar 2012 13:25:04 -0700 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 5:03 PM, Nick Coghlan wrote: > In large part, the "Why are my imports broken?" section in PEP 395 > exists because I sat down to try to document what does and doesn't > work when you attempt to directly execute a module from inside a > package directory. In building the list of what would work properly > ("python -m" from the parent directory of the package) and what would > sometimes break (everything else), I realised that instead of > documenting the entire hairy mess, the 1:1 mapping from the filesystem > layout to the Python module hierarchy meant we could *just fix it* to > not do the wrong thing by default. If implicit package directories are > blessed for inclusion in Python 3.3, that opportunity is lost forever > - with the loss of the unambiguous 1:1 mapping from the filesystem > layout to the module hierarchy, it's no longer possible for the > interpreter to figure out the right thing to do without guessing. Here's one idea to address the PEP 395 concern. Traverse up the directory tree until you hit one of the following three markers: * there is no __init__.py in the current directory (in the case where there was one adjacent to the original module) * current directory is on sys.path * setup.py is in the current directory All these indicate that you have left the package. If you make it to the FS root, the module would not be considered to exist in a package. The third option is the new idea. As a bonus, using setup.py as a marker would also nudge people toward packaging. 
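A rough sketch of the walk (untested; find_package_root is just an illustrative name, and it leans on the old-style __init__.py markers still being present):

    import os
    import sys

    def find_package_root(module_path):
        """Walk up from a module's file until one of the three markers says
        we have left its package; the directory we stop in is the one that
        belongs on sys.path."""
        current = os.path.dirname(os.path.abspath(module_path))
        while True:
            if not os.path.isfile(os.path.join(current, "__init__.py")):
                return current    # no package marker here: we are above the package
            if current in sys.path:
                return current    # this directory is already importable from
            if os.path.isfile(os.path.join(current, "setup.py")):
                return current    # project root: stop here
            parent = os.path.dirname(current)
            if parent == current:
                return current    # hit the filesystem root
            current = parent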
-eric From ncoghlan at gmail.com Tue Mar 13 22:20:20 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 14 Mar 2012 07:20:20 +1000 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: On Mar 14, 2012 6:25 AM, "Eric Snow" wrote: > > Here's one idea to address the PEP 395 concern. > > Traverse up the directory tree until you hit one of the following three markers: > > * there is no __init__.py in the current directory (in the case where > there was one adjacent to the original module) > * current directory is on sys.path > * setup.py is in the current directory > > All these indicate that you have left the package. If you make it to > the FS root, the module would not be considered to exist in a package. > > The third option is the new idea. As a bonus, using setup.py as a > marker would also nudge people toward packaging. > > -eric Alas, that doesn't work - to avoid slowing down normal startup too much, there needs to be a *fast* check that tells the interpreter immediately whether or not it is inside a package and needs to search the filesystem for the parent of the package directory. That's only possible when each package directory has an explicit marker. Instead, I'll have to put the search in importlib so the import error message can tell the user which directory to switch to and what "python -m" command to use to run their module. I do like the idea of using setup.py/cfgas an extra marker, though. I'll be moving house over the next few days and only have mobile internet at home for a while after that, so I'll probably revise PEP 395 some time after the next 3.3 alpha. Cheers, Nick. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Mar 13 22:23:12 2012 From: guido at python.org (Guido van Rossum) Date: Tue, 13 Mar 2012 14:23:12 -0700 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 1:25 PM, Eric Snow wrote: > Here's one idea to address the PEP 395 concern. > > Traverse up the directory tree until you hit one of the following three markers: > > * there is no __init__.py in the current directory (in the case where > there was one adjacent to the original module) > * current directory is on sys.path > * setup.py is in the current directory > > All these indicate that you have left the package. ?If you make it to > the FS root, the module would not be considered to exist in a package. > > The third option is the new idea. ?As a bonus, using setup.py as a > marker would also nudge people toward packaging. I like it! FWIW, Thomas Wouters disagrees with the goal of PEP 395 -- he would much rather issue an error message when you're trying to execute a file living inside a package (without using -m) instead of Nick's proposal to fix the situation. But that's a second-order issue; Eric's algorithm (or some variant of it) might work for either approach. Anyway, I'd like to stay out of *that* particular discussion -- either way it doesn't affect my position on namespace packages. 
-- --Guido van Rossum (python.org/~guido) From p.f.moore at gmail.com Tue Mar 13 22:40:46 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 13 Mar 2012 21:40:46 +0000 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: On 13 March 2012 20:25, Eric Snow wrote: > * setup.py is in the current directory Given that packaging uses setup.cfg, should that be a marker as well? Paul From ncoghlan at gmail.com Tue Mar 13 23:37:29 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 14 Mar 2012 08:37:29 +1000 Subject: [Import-SIG] Revising PEP 395 to handle implicit package directories Message-ID: On Wed, Mar 14, 2012 at 7:23 AM, Guido van Rossum wrote: > FWIW, Thomas Wouters disagrees with the goal of PEP 395 -- he would > much rather issue an error message when you're trying to execute a > file living inside a package (without using -m) instead of Nick's > proposal to fix the situation. But that's a second-order issue; Eric's > algorithm (or some variant of it) might work for either approach. > Anyway, I'd like to stay out of *that* particular discussion -- either > way it doesn't affect my position on namespace packages. Your position on namespace packages definitely affects this part of PEP 395, though :) My current thought is that I'll revise the affected portion of the PEP to add an "importlib._package_hint()" (or similar) that gets invoked to display a message on stderr when either __main__ or a command at the interactive prompt raises ImportError. This is still just the glimmering of an idea and there are a lot of practical details still to figure out, but it should be feasible to produce errors like those shown below. Interactive prompt: The current working directory appears to be inside a Python package. This can cause unexpected Import Errors and other strange behaviour. Consider using "os.chdir('<parent dir>')" to move the working directory to the parent directory of the package and accessing modules under their full name. Module/package execution via -m: The current working directory appears to be inside a Python package. This can cause unexpected Import Errors and other strange behaviour. Consider changing the working directory to the parent directory of the package ("<parent dir>") and running the command as "python -m <module>". Direct execution: The script directory appears to be inside a Python package. This can cause unexpected Import Errors and other strange behaviour. Consider changing the working directory to the parent directory of the package ("<parent dir>") and running the command as "python -m <module>". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p.f.moore at gmail.com Wed Mar 14 00:40:37 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 13 Mar 2012 23:40:37 +0000 Subject: [Import-SIG] Revising PEP 395 to handle implicit package directories In-Reply-To: References: Message-ID: On 13 March 2012 22:37, Nick Coghlan wrote: > Interactive prompt: > > The current working directory appears to be inside a Python > package. This can cause unexpected Import Errors and other strange > behaviour. Consider using "os.chdir('<parent dir>')" to move the working > directory to the parent directory of the package and accessing modules > under their full name. [...] This remains the crux of my problem with this issue. Changing the current directory affects global state. I can *consider* doing so, but I may have reasons not to want to. The recommendation offers me no option then.
An alternative option might be to set PYTHONPATH, but that's also global state. (Unix users can do "PYTHONPATH=xxx python" but that's not an option on Windows). Actually, explaining the problem and issues isn't hard: - Apart from sys.path[0], sys.path is static. You know what's on it (basically the stdlib and installed packages, except if you get fancy). - sys.path[0] is set to the directory containing the executed file, or the current directory for interactive use or -m. The consequences of this may be unexpected, and they may not be what the user would like, but they aren't obscure, surely? (Nick, feel free to tell me I'm wrong, and there's a lot more complexity here - but I hope there isn't) As Nick says, it's possible to detect when the user has a mistaken view of what's happening and issue a warning, but I don't think the warning needs to be quite so scary ("strange behaviour"). Something simpler, like the following, would be better in my view: "You appear to be trying to import the package containing XXX[1]. But the Python module path does not include the directory containing that package. You can add that directory to the module path either by setting PYTHONPATH, or by changing the current directory (and using python -m if you want to execute a module in the package)." [1] either "the current directory" or "the script you are running" as appropriate. This has the benefit of explaining the actual problem and its cause, and giving the user some alternative options to fix it. (The wording could possibly be improved a bit). To alleviate the problem of having to alter global state, would it be worth having an option (like Java's -classpath option) to add sys.path entries for one invocation of Python (Unix users can do "PYTHONPATH=xxx python", but that's a bit verbose and not available to Windows users). Something like python -P dir1;dir2;dir3 rest_of_args which works like temporarily adding dir1;dir2;dir3 to the start of PYTHONPATH. If this was added, the error message could offer using -P as another option. Paul From pje at telecommunity.com Thu Mar 15 04:24:07 2012 From: pje at telecommunity.com (PJ Eby) Date: Wed, 14 Mar 2012 23:24:07 -0400 Subject: [Import-SIG] [Python-ideas] My objections to implicit package directories In-Reply-To: References: Message-ID: On Tue, Mar 13, 2012 at 5:20 PM, Nick Coghlan wrote: > Alas, that doesn't work - to avoid slowing down normal startup too much, > there needs to be a *fast* check that tells the interpreter immediately > whether or not it is inside a package and needs to search the filesystem > for the parent of the package directory. That's only possible when each > package directory has an explicit marker. > I'm repeating myself now, but if we are preferring explicit to implict, why can't we just put the marker in the script itself? This is actually the tradeoff made in Perl and Java; you don't have to mark the directory as a package, but you *do* have to declare the package in the class/module file. What's more, that explicit declaration could be in the form of a library call that implements the actual sys.path manipulation -- neatly solving the performance question and avoiding the need to build the logic into the interpreter, all in one fell swoop. (If somebody has already raised a killer objection to this proposal, sorry; I was not at PyCon.) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pje at telecommunity.com Thu Mar 15 04:33:29 2012 From: pje at telecommunity.com (PJ Eby) Date: Wed, 14 Mar 2012 23:33:29 -0400 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Mon, Mar 12, 2012 at 11:49 PM, Guido van Rossum wrote: > So the only backwards incompatibility is that "import foo" may succeed > where it previously failed if there is a directory foo/ somewhere on > sys.path but no foo.py and no foo/__init__.py anywhere. I don't think > this is a big deal. > Actually, it *is* a big deal. An early draft of PEP 402 was like the newly-proposed approach, and it was shot down by an *actual code sample from a real package* that somebody posted to show the problem. The motivating example is actually presented in the PEP: trying to "import json" with a json/ directory present -- and the "json" package *not* present -- and then trying to use its contents. The PEP works around this by taking the namespace concept to its logical extreme and saying you can only import the *contents* of the namespace, because the namespace itself doesn't exist. (i.e., it's virtual.) tl;dr version: the killer problem with allowing "import foo" to succeed is that code which does try/except ImportError to test for a package's presence will get false positives. That's the motivating reason for only allowing sub-namespace package imports to succeed: it prevents the immediate breakage of real code, the moment the feature is introduced. :-( -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Mar 15 04:56:26 2012 From: guido at python.org (Guido van Rossum) Date: Wed, 14 Mar 2012 20:56:26 -0700 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Wed, Mar 14, 2012 at 8:33 PM, PJ Eby wrote: > On Mon, Mar 12, 2012 at 11:49 PM, Guido van Rossum wrote: >> >> So the only backwards incompatibility is that "import foo" may succeed >> where it previously failed if there is a directory foo/ somewhere on >> sys.path but no foo.py and no foo/__init__.py anywhere. I don't think >> this is a big deal. > > > Actually, it *is* a big deal. ?An early draft of PEP 402 was like the > newly-proposed approach, and it was shot down by an *actual code sample from > a real package* that somebody posted to show the problem. ?The motivating > example is actually presented in the PEP: trying to "import json" with a > json/ directory present -- and the "json" package *not* present -- and then > trying to use its contents. ?The PEP works around this by taking the > namespace concept to its logical extreme and saying you can only import the > *contents* of the namespace, because the namespace itself doesn't exist. > ?(i.e., it's virtual.) > > tl;dr version: the killer problem with allowing "import foo" to succeed is > that code which does try/except ImportError to test for a package's presence > will get false positives. ?That's the motivating reason for only allowing > sub-namespace package imports to succeed: it prevents the immediate breakage > of real code, the moment the feature is introduced. ?:-( Well, too bad. It's too much of a wart. Such try/except clauses make me sad anyway -- and we're trying to get rid of them by e.g. encouraging the pattern where heapq.py tries to import _heapq rather than having user code try first _heapq and then heapq. (And the idiom used in heapq.py doesn't mind if _heapq exists as an empty package.) 
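For reference, the idiom in question is roughly this, at the bottom of the pure-Python heapq.py:

    # Prefer the C accelerator when it is available; an empty or missing
    # _heapq simply leaves the pure-Python definitions in place.
    try:
        from _heapq import *
    except ImportError:
        pass

so user code only ever imports heapq and never needs a fallback dance of its own.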
-- --Guido van Rossum (python.org/~guido) From ironfroggy at gmail.com Thu Mar 15 12:26:55 2012 From: ironfroggy at gmail.com (Calvin Spealman) Date: Thu, 15 Mar 2012 07:26:55 -0400 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Mar 14, 2012 11:56 PM, "Guido van Rossum" wrote: > > On Wed, Mar 14, 2012 at 8:33 PM, PJ Eby wrote: > > On Mon, Mar 12, 2012 at 11:49 PM, Guido van Rossum wrote: > >> > >> So the only backwards incompatibility is that "import foo" may succeed > >> where it previously failed if there is a directory foo/ somewhere on > >> sys.path but no foo.py and no foo/__init__.py anywhere. I don't think > >> this is a big deal. > > > > > > Actually, it *is* a big deal. An early draft of PEP 402 was like the > > newly-proposed approach, and it was shot down by an *actual code sample from > > a real package* that somebody posted to show the problem. The motivating > > example is actually presented in the PEP: trying to "import json" with a > > json/ directory present -- and the "json" package *not* present -- and then > > trying to use its contents. The PEP works around this by taking the > > namespace concept to its logical extreme and saying you can only import the > > *contents* of the namespace, because the namespace itself doesn't exist. > > (i.e., it's virtual.) > > > > tl;dr version: the killer problem with allowing "import foo" to succeed is > > that code which does try/except ImportError to test for a package's presence > > will get false positives. That's the motivating reason for only allowing > > sub-namespace package imports to succeed: it prevents the immediate breakage > > of real code, the moment the feature is introduced. :-( > > Well, too bad. It's too much of a wart. Such try/except clauses make > me sad anyway -- and we're trying to get rid of them by e.g. > encouraging the pattern where heapq.py tries to import _heapq rather > than having user code try first _heapq and then heapq. (And the idiom > used in heapq.py doesn't mind if _heapq exists as an empty package.) While I agree it is a wart, the heapq.py example might not be the best. A more real-world example might be the situation is PEP 417, where we'll be seeing people write try: import unittest.mock except ImportError: import mock And owning neither the two modules, the _ trick is not available. I like the idea that if a package is only an implied namespace package, with no contents of it's own, it cannot itself be directly imported. This still allows the simple check, but without seriously modifying the new feature itself. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Import-SIG mailing list > Import-SIG at python.org > http://mail.python.org/mailman/listinfo/import-sig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Thu Mar 15 18:50:50 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 15 Mar 2012 10:50:50 -0700 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Thu, Mar 15, 2012 at 4:26 AM, Calvin Spealman wrote: > > On Mar 14, 2012 11:56 PM, "Guido van Rossum" wrote: >> >> On Wed, Mar 14, 2012 at 8:33 PM, PJ Eby wrote: >> > On Mon, Mar 12, 2012 at 11:49 PM, Guido van Rossum >> > wrote: >> >> >> >> So the only backwards incompatibility is that "import foo" may succeed >> >> where it previously failed if there is a directory foo/ somewhere on >> >> sys.path but no foo.py and no foo/__init__.py anywhere. I don't think >> >> this is a big deal. >> > >> > >> > Actually, it *is* a big deal. ?An early draft of PEP 402 was like the >> > newly-proposed approach, and it was shot down by an *actual code sample >> > from >> > a real package* that somebody posted to show the problem. ?The >> > motivating >> > example is actually presented in the PEP: trying to "import json" with a >> > json/ directory present -- and the "json" package *not* present -- and >> > then >> > trying to use its contents. ?The PEP works around this by taking the >> > namespace concept to its logical extreme and saying you can only import >> > the >> > *contents* of the namespace, because the namespace itself doesn't exist. >> > ?(i.e., it's virtual.) >> > >> > tl;dr version: the killer problem with allowing "import foo" to succeed >> > is >> > that code which does try/except ImportError to test for a package's >> > presence >> > will get false positives. ?That's the motivating reason for only >> > allowing >> > sub-namespace package imports to succeed: it prevents the immediate >> > breakage >> > of real code, the moment the feature is introduced. ?:-( >> >> Well, too bad. It's too much of a wart. Such try/except clauses make >> me sad anyway -- and we're trying to get rid of them by e.g. >> encouraging the pattern where heapq.py tries to import _heapq rather >> than having user code try first _heapq and then heapq. (And the idiom >> used in heapq.py doesn't mind if _heapq exists as an empty package.) > > While I agree it is a wart, the heapq.py example might not be the best. A > more real-world example might be the situation is PEP 417, where we'll be > seeing people write > > try: > ??? import unittest.mock > except ImportError: > ??? import mock > > And owning neither the two modules, the _ trick is not available. > > I like the idea that if a package is only an implied namespace package, with > no contents of it's own, it cannot itself be directly imported. This still > allows the simple check, but without seriously modifying the new feature > itself. How would you implement that anyway? The import logic always tries to import the parent module before importing the child module. So the import attempt for "foo" has no idea whether it is imported as *part* of "import foo.bar", or as plain "import foo", or perhaps as part of "from foo import bar". It would also be odd to find that import foo import foo.bar would fail, whereas import foo.bar import foo would succeed, because as a side effect of "import foo.bar", a module object for foo is created and inserted as sys.modules['foo']. I really don't want to have to maintain and explain such complexity -- nor the complexity that follows from attempts to "fix" this objection. Finally, in your example, why on earth would unittest/mock/ exist as an empty directory??? 
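(The side effect I'm referring to is easy to see with any existing package, e.g.:

    import sys
    import encodings.ascii              # importing a submodule...
    print('encodings' in sys.modules)   # ...True: the parent came along for the ride
    import encodings                    # so this is now just a lookup of the cached parent

which is why a rule where a bare "import foo" works or fails depending on what happened to be imported first is more complexity than I want to maintain and explain.)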
-- --Guido van Rossum (python.org/~guido) From pje at telecommunity.com Thu Mar 15 21:17:26 2012 From: pje at telecommunity.com (PJ Eby) Date: Thu, 15 Mar 2012 16:17:26 -0400 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Thu, Mar 15, 2012 at 1:50 PM, Guido van Rossum wrote: > How would you implement that anyway? > >From the PEP: """If the parent package does not exist, or exists but lacks a __path__ attribute, an attempt is first made to create a "virtual path" for the parent package (following the algorithm described in the section on virtual paths, below).""" This is actually a pretty straightforward change to the import process; I drafted a patch for importlib at one point, and somebody else created another. (The main difference from the new proposal is that you do have to go back over the path list a second time in the event the parent package isn't found; but there's no reason why the protocols in the PEP wouldn't allow you to build and cache a virtual path while doing the first search, if you're worried about the performance.) > The import logic always tries to > import the parent module before importing the child module. So the > import attempt for "foo" has no idea whether it is imported as *part* > of "import foo.bar", or as plain "import foo", or perhaps as part of > "from foo import bar". > Actually, this isn't entirely true. __import__ is called with 'foo.bar' when you import foo.bar. In importlib, it recursively invokes __import__ with parent portions, and in import.c, it loops left to right for the parents. Either way, it knows the difference throughout the process, and it's fairly straightforward to backtrack and create the parent modules when the submodule import succeeds. It would also be odd to find that > > import foo > import foo.bar > > would fail, whereas > > import foo.bar > import foo > > would succeed, because as a side effect of "import foo.bar", a module > object for foo is created and inserted as sys.modules['foo']. > Assuming we know that the foo subdirectories actually exist, the ImportError would simply say, "Can't import namespace package 'foo' before one of its modules or subpackages are imported". Granted, that does seem a bit crufty. I erred this direction in order to avoid pitchforks coming from the backward-compatibility direction, on account of the ease with which something can get messed up at a distance without this condition, and in a way that may be hard to identify, if a piece of code is using package presence to control optional features. IOW, it's not like either proposal results in a perfect clean result for everybody. It's a choice of which group to upset, where one group is developers fiddling with their import order (and getting an error message that says how to fix it), and the other group is people whose code suddenly crashes or behaves differently because somebody created a directory somewhere they shouldn't have (and which they might not be able to delete or remove from sys.path for one reason or another), and which was there and worked okay before until they installed a new version of the application that's built on a new version of Python. That is, the backward compatibility problem can break an app in the field that worked perfectly in the developer's testing, and only fails when it gets to the end user who has no way of even knowing it could be a problem. 
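To make that concrete (all the names here are hypothetical; "simplejson" is just a stand-in for any optional dependency an app probes for):

    # app/main.py -- the usual way of probing for an optional dependency
    try:
        import simplejson                # not installed on the end user's machine
        HAVE_SIMPLEJSON = True
    except ImportError:
        HAVE_SIMPLEJSON = False

    def encode(obj):
        if HAVE_SIMPLEJSON:
            return simplejson.dumps(obj)   # AttributeError if simplejson is an
                                           # empty namespace package
        import json
        return json.dumps(obj)

    # Layout on the end user's machine:
    #
    #   app/
    #       main.py
    #       simplejson/     # stray data directory: no __init__.py, no .py files
    #
    # Today the bare "import simplejson" raises ImportError and the json
    # fallback kicks in; if the bare import succeeds as an empty namespace
    # package instead, encode() blows up at runtime on the user's machine
    # rather than failing cleanly during the developer's testing.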
It's up to you decide which of those groups' pitchforks to face; I just want to be clear about why the tradeoff was proposed the way it was. It's not that the backward compatibilty problem harms a lot of people, so much as that when it harms them, it can harm them a lot (e.g. crashing), and at *runtime*, compared to tweaking your import sequence during *development* and getting a clear and immediate "don't do that." Why crashing? Because "try: import json" will succeed, and then the app does json.foobar() and boom, an unexpected AttributeError. Far fetched? Perhaps, but the worst runtime import ordering problem I can think of is if you have a bad import that's working due to a global import ordering that's determined at runtime because of plugin loading. But if you have that problem, you correct the bad import in the plugin and it can never happen again. Granted, directory naming conflicts can *also* be fixed by changing your imports; you can (and should) "try: from json import foobar" instead. But there isn't any way for us to give the user or developer an error message that *tells* them that, or even clues them in as to why the json module on that user's machine seems to be borked whenever they run the app from a certain directory... > Finally, in your example, why on earth would unittest/mock/ exist as > an empty directory??? > It's definitely true that the impact is limited in scope; the things most likely to be affected are generically-named top-level packages like json, email, text, xml, html, etc., that could collide with other directories lying around, AND it's a package name you try/import to test for the presence of. As I said though, it's just that when it happens, it can happen to an *end user*, whereas import order crankiness can essentially only happen during actual coding. Also, nobody's come up with examples of breakage caused by trying to import the namespace, on account of there aren't many use cases for importing an empty namespace, vs use cases for having a 'json' directory or some such. ;-) All this being said, if you're happy with the tradeoff, I'm happy with the tradeoff. I'm not the one they're gonna come after with the pitchforks. ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Mar 15 22:44:59 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 15 Mar 2012 14:44:59 -0700 Subject: [Import-SIG] My objections to implicit package directories In-Reply-To: References: Message-ID: On Thu, Mar 15, 2012 at 1:17 PM, PJ Eby wrote: > On Thu, Mar 15, 2012 at 1:50 PM, Guido van Rossum wrote: >> >> How would you implement that anyway? > > > From the PEP: > > """If the parent package does not exist, or exists but lacks > a?__path__?attribute, an attempt is first made to create a "virtual path" > for the parent package (following the algorithm described in the section > on?virtual paths, below).""" > > This is actually a pretty straightforward change to the import process; I > drafted a patch for importlib at one point, and somebody else created > another. > > (The main difference from the new proposal is that you do have to go back > over the path list a second time in the event the parent package isn't > found; but there's no reason why the protocols in the PEP wouldn't allow you > to build and cache a virtual path while doing the first search, if you're > worried about the performance.) > >> >> The import logic always tries to >> import the parent module before importing the child module. 
So the >> import attempt for "foo" has no idea whether it is imported as *part* >> of "import foo.bar", or as plain "import foo", or perhaps as part of >> "from foo import bar". > > > Actually, this isn't entirely true. ? __import__ is called with 'foo.bar' > when you import foo.bar. ?In importlib, it recursively invokes __import__ > with parent portions, and in import.c, it loops left to right for the > parents. ?Either way, it knows the difference throughout the process, and > it's fairly straightforward to backtrack and create the parent modules when > the submodule import succeeds. > > >> It would also be odd to find that >> >> ?import foo >> ?import foo.bar >> >> would fail, whereas >> >> ?import foo.bar >> ?import foo >> >> would succeed, because as a side effect of "import foo.bar", a module >> object for foo is created and inserted as sys.modules['foo']. > > > Assuming we know that the foo subdirectories actually exist, the ImportError > would simply say, "Can't import namespace package 'foo' before one of its > modules or subpackages are imported". > > Granted, that does seem a bit crufty. ?I erred this direction in order to > avoid pitchforks coming from the backward-compatibility direction, on > account of the ease with which something can get messed up at a distance > without this condition, and in a way that may be hard to identify, if a > piece of code is using package presence to control optional features. > > IOW, it's not like either proposal results in a perfect clean result for > everybody. ?It's a choice of which group to upset,?where one group is > developers fiddling with their import order (and getting an error message > that says how to fix it), and the other group is people whose code suddenly > crashes or behaves differently because somebody created a directory > somewhere they shouldn't have (and which they might not be able to delete or > remove from sys.path for one reason or another), and which was there and > worked okay before until they installed a new version of the application > that's built on a new version of Python. > > That is, the backward compatibility problem can break an app in the field > that worked perfectly in the developer's testing, and only fails when it > gets to the end user who has no way of even knowing it could be a problem. > > It's up to you decide which of those groups' pitchforks to face; I just want > to be clear about why the tradeoff was proposed the way it was. ?It's not > that the backward compatibilty problem harms a lot of people, so much as > that when it harms them, it can harm them a lot (e.g. crashing), and at > *runtime*, compared to tweaking your import sequence during *development* > and getting a clear and immediate "don't do that." > > Why crashing? ?Because "try: import json" will succeed, and then the app > does json.foobar() and boom, an unexpected AttributeError. ?Far fetched? > ?Perhaps, but the worst runtime import ordering problem I can think of is if > you have a bad import that's working due to a global import ordering that's > determined at runtime because of plugin loading. ?But if you have that > problem, you correct the bad import in the plugin and it can never happen > again. > > Granted, directory naming conflicts can *also* be fixed by changing your > imports; you can (and should) "try: from json import foobar" instead. 
> there isn't any way for us to give the user or developer an error message
> that *tells* them that, or even clues them in as to why the json module on
> that user's machine seems to be borked whenever they run the app from a
> certain directory...
>
>
>> >> Finally, in your example, why on earth would unittest/mock/ exist as
>> an empty directory???
>
>
> It's definitely true that the impact is limited in scope; the things most
> likely to be affected are generically-named top-level packages like json,
> email, text, xml, html, etc., that could collide with other directories
> lying around, AND it's a package name you try/import to test for the
> presence of.
>
> As I said though, it's just that when it happens, it can happen to an *end
> user*, whereas import order crankiness can essentially only happen during
> actual coding. Also, nobody's come up with examples of breakage caused by
> trying to import the namespace, on account of there aren't many use cases
> for importing an empty namespace, vs use cases for having a 'json' directory
> or some such. ;-)
>
> All this being said, if you're happy with the tradeoff, I'm happy with the
> tradeoff. I'm not the one they're gonna come after with the pitchforks.
> ;-)

Yeah, I'm still happy with the tradeoff, even though it's a case of picking your poison. In this case I much prefer having simpler rules going forward than bending over *too* far for backward compatibility -- even if we still have two types of packages, those with an __init__.py and those without. Also, it's not like there aren't already 50 ways to break things with odd paths or modules -- I don't know if it's more likely that a user would create an unrelated directory named json or an unrelated module named json.py. At least we're now also *removing* a way to break things: forgetting an empty __init__.py.

Gentlemen, sharpen your pitchforks! :)

-- 
--Guido van Rossum (python.org/~guido)
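
To make the false positive under discussion concrete, here is a minimal sketch of the feature-check idiom; the stray directory, the simplejson fallback, and the dumps() call are illustrative assumptions rather than code from the thread, and the premise is the PEP's example (the real json package absent, an unrelated empty json/ directory on sys.path):

    # Feature detection via try/except ImportError -- the pattern at risk.
    # With implicit namespace packages, the bare json/ data directory makes
    # the import succeed and bind an empty namespace package.
    try:
        import json                  # no longer raises ImportError
    except ImportError:
        import simplejson as json    # the intended fallback is never reached

    json.dumps({"spam": 1})          # AttributeError at runtime, far away
                                     # from the import that actually went wrong

The failure is deferred: instead of a clean ImportError at startup on the developer's machine, the end user gets an AttributeError wherever the module is first used.
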
From ironfroggy at gmail.com Fri Mar 16 02:50:39 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Thu, 15 Mar 2012 21:50:39 -0400
Subject: [Import-SIG] My objections to implicit package directories
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 15, 2012 at 1:50 PM, Guido van Rossum wrote:
> On Thu, Mar 15, 2012 at 4:26 AM, Calvin Spealman wrote:
>>
>> On Mar 14, 2012 11:56 PM, "Guido van Rossum" wrote:
>>>
>>> On Wed, Mar 14, 2012 at 8:33 PM, PJ Eby wrote:
>>> > On Mon, Mar 12, 2012 at 11:49 PM, Guido van Rossum
>>> > wrote:
>>> >>
>>> >> So the only backwards incompatibility is that "import foo" may succeed
>>> >> where it previously failed if there is a directory foo/ somewhere on
>>> >> sys.path but no foo.py and no foo/__init__.py anywhere. I don't think
>>> >> this is a big deal.
>>> >
>>> >
>>> > Actually, it *is* a big deal. An early draft of PEP 402 was like the
>>> > newly-proposed approach, and it was shot down by an *actual code sample
>>> > from a real package* that somebody posted to show the problem. The
>>> > motivating example is actually presented in the PEP: trying to "import
>>> > json" with a json/ directory present -- and the "json" package *not*
>>> > present -- and then trying to use its contents. The PEP works around
>>> > this by taking the namespace concept to its logical extreme and saying
>>> > you can only import the *contents* of the namespace, because the
>>> > namespace itself doesn't exist. (i.e., it's virtual.)
>>> >
>>> > tl;dr version: the killer problem with allowing "import foo" to succeed
>>> > is that code which does try/except ImportError to test for a package's
>>> > presence will get false positives. That's the motivating reason for
>>> > only allowing sub-namespace package imports to succeed: it prevents the
>>> > immediate breakage of real code, the moment the feature is introduced. :-(
>>>
>>> Well, too bad. It's too much of a wart. Such try/except clauses make
>>> me sad anyway -- and we're trying to get rid of them by e.g.
>>> encouraging the pattern where heapq.py tries to import _heapq rather
>>> than having user code try first _heapq and then heapq. (And the idiom
>>> used in heapq.py doesn't mind if _heapq exists as an empty package.)
>>
>> While I agree it is a wart, the heapq.py example might not be the best. A
>> more real-world example might be the situation in PEP 417, where we'll be
>> seeing people write
>>
>> try:
>>     import unittest.mock
>> except ImportError:
>>     import mock
>>
>> And owning neither of the two modules, the _ trick is not available.
>>
>> I like the idea that if a package is only an implied namespace package, with
>> no contents of its own, it cannot itself be directly imported. This still
>> allows the simple check, but without seriously modifying the new feature
>> itself.
>
> How would you implement that anyway? The import logic always tries to
> import the parent module before importing the child module. So the
> import attempt for "foo" has no idea whether it is imported as *part*
> of "import foo.bar", or as plain "import foo", or perhaps as part of
> "from foo import bar".
>
> It would also be odd to find that
>
>   import foo
>   import foo.bar
>
> would fail, whereas
>
>   import foo.bar
>   import foo
>

I would prefer that neither `import foo` succeed, in fact, if foo is only an implied namespace package. The import machinery could treat it the same, up to the point that it would bind the name, where it would check if it was such an implied package and raise an appropriate exception. Maybe some ImportError subclass? So, this would not be an error in the loading portion of the import, but in adding it to the namespace where the import statement was executed.

> would succeed, because as a side effect of "import foo.bar", a module
> object for foo is created and inserted as sys.modules['foo'].
>
> I really don't want to have to maintain and explain such complexity --
> nor the complexity that follows from attempts to "fix" this objection.
>
> Finally, in your example, why on earth would unittest/mock/ exist as
> an empty directory???

It was only meant as an example where the try/except for import fallbacks is natural and wise. It seemed your previous statements were calling all such things a wart.

> --
> --Guido van Rossum (python.org/~guido)

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing:
http://www.twitter.com/ironfroggy

From guido at python.org Fri Mar 16 03:20:59 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 15 Mar 2012 19:20:59 -0700
Subject: [Import-SIG] My objections to implicit package directories
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 15, 2012 at 6:50 PM, Calvin Spealman wrote:
> On Thu, Mar 15, 2012 at 1:50 PM, Guido van Rossum wrote:
[...]
>> It would also be odd to find that
>>
>>   import foo
>>   import foo.bar
>>
>> would fail, whereas
>>
>>   import foo.bar
>>   import foo
>>
>
> I would prefer that neither `import foo` succeed, in fact, if foo is only
> an implied namespace package. The import machinery could treat it the
> same, up to the point that it would bind the name, where it would
> check if it was such an implied package and raise an appropriate
> exception. Maybe some ImportError subclass? So, this would not be
> an error in the loading portion of the import, but in adding it to the
> namespace where the import statement was executed.

I'm sure someone can also construct a counterexample where this will break existing code -- something that inspects a package hierarchy might not expect that foo.bar is importable but yet foo is not. (And if it is not, does that mean that it isn't in sys.modules? That violates a whole 'nother set of expectations.)

>> would succeed, because as a side effect of "import foo.bar", a module
>> object for foo is created and inserted as sys.modules['foo'].
>>
>> I really don't want to have to maintain and explain such complexity --
>> nor the complexity that follows from attempts to "fix" this objection.
>>
>> Finally, in your example, why on earth would unittest/mock/ exist as
>> an empty directory???
>
> It was only meant as an example where the try/except for import
> fallbacks is natural and wise. It seemed your previous statements were
> calling all such things a wart.

Heh. I still find try / import / except / other import a warty pattern. I agree it happens. And I agree it can be broken if an empty directory creates a dummy for the first attempted import. But then, there are plenty of other ways to break that (e.g. an unrelated .py file). To debug this you just print the __file__ of the suspected module, and then you'll know. And I find the modifications to the import rules proposed to avoid the wart, wartier.

(PS. We agreed that the __file__ of a namespace package would be equal to the first directory name plus a trailing os.sep -- os.path.split() correctly breaks this into the directory name and an empty string, and os.path.dirname() matches that behavior.)

-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com Mon Mar 19 09:48:20 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 19 Mar 2012 18:48:20 +1000
Subject: [Import-SIG] My objections to implicit package directories
In-Reply-To: 
References: 
Message-ID: 

For the record, I just thought of an easy workaround for the "erroneous successful import" case: drop an __init__.py file containing "raise ImportError" into the offending data directory.

Cheers, Nick.

-- 
Sent from my phone, thus the relative brevity :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
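
A sketch of that workaround, assuming a hypothetical layout in which an application's data directory happens to share its name with a package the code probes for (the paths and the error message are illustrative only):

    # Assumed layout:
    #
    #   app/                 <- on sys.path
    #       json/            <- data files only, never meant to be a package
    #           __init__.py  <- the guard file suggested above
    #
    # Contents of the guard app/json/__init__.py:
    raise ImportError("the 'json' directory here is application data, not a package")

Because the guard turns the directory back into a regular package whose import fails immediately, a try/except ImportError feature check falls back to its alternative the way it did before implicit namespace packages.
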
From pje at telecommunity.com Mon Mar 19 19:52:05 2012
From: pje at telecommunity.com (PJ Eby)
Date: Mon, 19 Mar 2012 14:52:05 -0400
Subject: [Import-SIG] My objections to implicit package directories
In-Reply-To: 
References: 
Message-ID: 

On Mar 19, 2012 4:48 AM, "Nick Coghlan" wrote:
>
> For the record, I just thought of an easy workaround for the "erroneous successful import" case: drop an __init__.py file containing "raise ImportError" into the offending data directory.

It only works if you don't mind shadowing a later installation of the same-named package, further along on sys.path. More to the point, it doesn't solve the real problem, which is noticing in the first place that this is happening.

The right way to fix it is to change your try: "import foo" to try: "from foo import something".

(Unfortunately, even that fix isn't purely correct, since if there's a foo/something subdirectory, you still get a false positive. It just further reduces the odds of a collision.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
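
A sketch of that adjustment, with "foo" and "something" as stand-in names for a package and an attribute it is known to export:

    # Before: a bare presence check, which a stray foo/ directory satisfies
    # once implicit namespace packages make "import foo" succeed.
    try:
        import foo
    except ImportError:
        foo = None

    # After: probe for something the real package actually provides; an
    # accidental, empty namespace package has no "something", so this still
    # raises ImportError and the fallback path is taken.
    try:
        from foo import something
    except ImportError:
        something = None

    # Per the caveat above, a foo/something/ subdirectory would still make
    # the from-import succeed, so the change narrows the window for false
    # positives rather than closing it.
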