From ncoghlan at gmail.com Tue Mar 28 06:25:06 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 28 Mar 2017 20:25:06 +1000
Subject: [Import-SIG] Eliminating implicit __main__ relative imports
Message-ID:

Hi folks,

http://bugs.python.org/issue29929 covers an idea I had today that may
help us finally resolve the name shadowing problem, where folks
inadvertently import their __main__ script as a module because it
happens to shadow the name of a standard library or third party module
that they or one of their dependencies is trying to import.

The gist of the idea is to ask what if, instead of doing:

    sys.path.insert(0, )

we instead did this little dance to add an anonymous top-level
main-relative namespace package:

    mod = types.ModuleType("")
    mod.__path__ = []
    sys.modules[""] = mod
    __main__.__package__ = ""

That's already enough to allow explicit relative imports via "mod =
importlib.import_module('name', package='')", but a couple of sanity
checks elsewhere in importlib guarding against empty module names
would need to be relaxed to allow the "from . import name" syntax.

sys.path would then only be modified when:

- you used the -m switch (current directory inserted as sys.path[0])
- you executed a sys.path entry (the entry inserted as sys.path[0])
- you modified it explicitly

For the other forms of __main__ invocation:

- the implicit main-relative namespace would be created when a
  top-level module was executed with "-m" (including when running
  __main__ from a sys.path entry)
- it would *not* be created when a package or submodule was executed
  with "-m" (as in that case, there's already a real package to anchor
  any explicitly relative imports)

Thanks to PEP 366, cross-version compatible main-relative imports
would then look like:

    if __package__ is not None:
        from . import relative_module_name
    else:
        import relative_module_name

As a transition plan, a deprecation warning could be emitted for the
latter form by:

1. In 3.7, populating a private sys._main_path_entry in the sys module
   in addition to including it in both __main__.__path__ and sys.path
2. In 3.7, emit a warning when an import is satisfied from the
   sys._main_path_entry directory and the fully qualified module name
   *doesn't* start with "." (i.e. an empty parent package)
3. In 3.8, stop populating sys.path[0] with the script directory, stop
   setting sys._main_path_entry, and stop emitting the deprecation
   warning

I'm posting the idea here first to give folks a chance to poke
technical holes in it before I start trying to formulate it into a PEP
for python-ideas and python-dev.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From p.f.moore at gmail.com Tue Mar 28 06:38:27 2017
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 28 Mar 2017 11:38:27 +0100
Subject: [Import-SIG] Eliminating implicit __main__ relative imports
In-Reply-To:
References:
Message-ID:

On 28 March 2017 at 11:25, Nick Coghlan wrote:
> Thanks to PEP 366, cross-version compatible main-relative imports
> would then look like:
>
>     if __package__ is not None:
>         from . import relative_module_name
>     else:
>         import relative_module_name

I'd be a little concerned that certain types of program (adhoc scripts
and the like that got too big to be a single file) would be broken by
this. Even if they *don't* care about cross-version compatibility,
they'd need to change from "import foo" to "from . import foo". I
honestly don't know how common this problem would be - I've tried to
think what programs I have that would be affected, and honestly, my
feeling is that it's either very few, or it's a lot but I use
main-relative imports so automatically that I can't remember any ;-)
(Although TBH, I think the former, based on the fact that most of my
adhoc scripts are very small).

The particular concern is that if it *is* an issue, it'd mostly hit
the sort of code that doesn't get published on the web, so it's
probably pretty hard to get a sense of the scale of the impact.

In isolation, though, the idea sounds good.

Paul

From ncoghlan at gmail.com Tue Mar 28 07:42:44 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 28 Mar 2017 21:42:44 +1000
Subject: [Import-SIG] Eliminating implicit __main__ relative imports
In-Reply-To:
References:
Message-ID:

On 28 March 2017 at 20:38, Paul Moore wrote:
> On 28 March 2017 at 11:25, Nick Coghlan wrote:
>> Thanks to PEP 366, cross-version compatible main-relative imports
>> would then look like:
>>
>>     if __package__ is not None:
>>         from . import relative_module_name
>>     else:
>>         import relative_module_name
>
> I'd be a little concerned that certain types of program (adhoc scripts
> and the like that got too big to be a single file) would be broken by
> this. Even if they *don't* care about cross-version compatibility,
> they'd need to change from "import foo" to "from . import foo". I
> honestly don't know how common this problem would be - I've tried to
> think what programs I have that would be affected, and honestly, my
> feeling is that it's either very few, or it's a lot but I use
> main-relative imports so automatically that I can't remember any ;-)
> (Although TBH, I think the former, based on the fact that most of my
> adhoc scripts are very small).
>
> The particular concern is that if it *is* an issue, it'd mostly hit
> the sort of code that doesn't get published on the web, so it's
> probably pretty hard to get a sense of the scale of the impact.

Yeah, that's why I considered coming up with a plausible transition
plan (where the explicit relative imports work, and the implicit
relative imports emit a deprecation warning) to be an essential aspect
of being able to even consider the idea.

Rather than deprecation and eventual removal, another option would be
to make the implicit main-relative imports instead emit a
RuntimeWarning, since those are intended to cover "dubious runtime
features" without implying any kind of timeline for getting rid of
them entirely.

If we did that, then "sys.main_path_entry" could become a public
attribute, allowing the use of "sys.path.remove(sys.main_path_entry)"
to explicitly disable implicit main-relative imports.

That way folks potentially affected by name shadowing would get an
explicit warning in addition to the resulting cryptic misbehaviour,
while folks deliberately doing main-relative imports could choose
between living with the warning, suppressing it through the warning
system's options, or switching to explicit relative imports.

Cheers,
Nick.

P.S. As an interesting side benefit of the proposed approach, folks
would also be able to do "from . import __path__ as
main_relative_path" to get access to the list of directories that are
searched for top-level explicit relative imports, which could prove to
be less error prone than manipulating sys.path and potentially
affecting *all* imports, rather than just main-relative ones. You'd
also be able to do "main_relative_path.clear()" to disable even
explicit main-relative imports.
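
[Editorial sketch, not part of the original message.] The two opt-outs
mentioned above (dropping the main path entry from sys.path, or
filtering the warning) might look roughly like the following. Note the
assumptions: "sys.main_path_entry" is only a proposed attribute and
does not exist in any released Python, so the sketch takes the entry
as a plain argument and works on a copy of the path list; the warning
message prefix is likewise made up for illustration.

```python
import warnings


def drop_main_path_entry(path, main_path_entry):
    """Return a copy of `path` with the first `main_path_entry` removed.

    Mirrors "sys.path.remove(sys.main_path_entry)" on a copy, so the
    real sys.path is left untouched in this sketch.
    """
    trimmed = list(path)
    if main_path_entry in trimmed:
        trimmed.remove(main_path_entry)
    return trimmed


def filtered_warning_messages():
    """Show that a message-based filter hides only the targeted warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Hypothetical message prefix; the thread never fixes the wording.
        warnings.filterwarnings(
            "ignore",
            message="Implicit main-relative import",
            category=RuntimeWarning,
        )
        warnings.warn("Implicit main-relative import of 'foo'", RuntimeWarning)
        warnings.warn("unrelated runtime warning", RuntimeWarning)
    return [str(w.message) for w in caught]
```

Filtering on the message text rather than on RuntimeWarning as a whole
keeps other runtime warnings visible, which seems closer to the intent
of "living with" only this particular warning being optional.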

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Wed Mar 29 04:00:20 2017
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 29 Mar 2017 18:00:20 +1000
Subject: [Import-SIG] Eliminating implicit __main__ relative imports
In-Reply-To:
References:
Message-ID:

On 28 March 2017 at 20:25, Nick Coghlan wrote:
> The gist of the idea is to ask what if, instead of doing:
>
>     sys.path.insert(0, )
>
> we instead did this little dance to add an anonymous top-level
> main-relative namespace package:
>
>     mod = types.ModuleType("")
>     mod.__path__ = []
>     sys.modules[""] = mod
>     __main__.__package__ = ""
>
> That's already enough to allow explicit relative imports via "mod =
> importlib.import_module('name', package='')", but a couple of sanity
> checks elsewhere in importlib guarding against empty module names
> would need to be relaxed to allow the "from . import name" syntax.

While this approach is cute & clever, I'm now leaning back towards the
approach I wrote up in http://bugs.python.org/issue29929#msg290689,
which would be to make *__main__ itself* a runtime pseudo-package by
setting `__main__.__path__` appropriately.

The big advantage of that approach is that it already "just works",
even at the syntactic level:

    $ echo "print(__name__)" > relative.py
    $ python3
    Python 3.5.3 (default, Mar 21 2017, 17:21:33)
    [GCC 6.3.1 20161221 (Red Hat 6.3.1-1)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> __path__ = [os.getcwd()]
    >>> from . import relative
    __main__.relative
    >>>

It's also going to be more compatible with other tools that work with
sys.modules and other interfaces, and expect packages to always have a
non-empty name.

The backwards-and-forwards compatible check is then to look for
"__path__" in the module globals:

    if "__path__" in globals():
        from . import relative_module_name
    else:
        import relative_module_name

This is also robust against dunder-variables being set in the builtins
module, as I discovered today that that has "__package__" explicitly
set to the empty string.

That then leaves the question of what to do about the double-import
trap
(http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#the-double-import-trap)
as any change along these lines will make all main relative imports
accessible under two names:

* their implicitly relative top-level import name
* their explicit relative name under the "__main__." namespace

As a first step, I think actually following through on this idea would
require also accepting PEP 499 so that modules executed via the -m
switch were added to sys.modules under both "__name__" and
"__spec__.name": https://www.python.org/dev/peps/pep-0499/

It would also mean that in cases where `__main__` is a pseudo-package,
we would bind it in sys.modules as both `__main__` and `__main__.` (so
"import __main__" and "from . import " gave you the same answer).

Beyond that, rather than actively preventing double-import errors,
we'd just emit RuntimeWarning messages to make them easier to debug
when they happened:

* for `sys.main_path_entry`, we'd warn for any import satisfied from
  that directory that *didn't* start with the `__main__.` prefix (e.g.
  "Implicit relative import {name} from main path entry {dirpath}").
* we'd add a check to the import system that when populating __path__
  for a package based on __spec__.submodule_search_locations, we'd
  warn for any entries that also appeared in sys.path (e.g. "{name}
  package directory {dirpath} also appears in sys.path").

The latter check wouldn't be completely free, but it should be fast
relative to actual module imports (and we could build a dynamic cache
as a set if the linear search through sys.path turned out to be overly
slow).
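
[Editorial sketch, not part of the original message.] The second check
described above, comparing a package's search locations against
sys.path, could be approximated as below. This is only an illustration
under stated assumptions: the function name and signature are
hypothetical, it runs outside the import system rather than inside it,
and it builds the set-based lookup up front instead of caching it
dynamically.

```python
import os
import warnings


def find_path_overlaps(name, search_locations, sys_path):
    """Warn for each package directory that is also a sys.path entry.

    `name` is the package name, `search_locations` stands in for
    __spec__.submodule_search_locations, and `sys_path` for sys.path.
    Returns the overlapping directories as they were passed in.
    """
    def normalise(path):
        return os.path.normcase(os.path.abspath(path))

    # A set makes each membership test O(1) instead of a linear scan.
    known = {normalise(entry) for entry in sys_path}
    overlaps = []
    for dirpath in search_locations:
        if normalise(dirpath) in known:
            warnings.warn(
                f"{name} package directory {dirpath} also appears in sys.path",
                RuntimeWarning,
                stacklevel=2,
            )
            overlaps.append(dirpath)
    return overlaps
```

Calling find_path_overlaps("pkg", ["/a/pkg"], ["/a/pkg", "/b"]) returns
["/a/pkg"] and emits one RuntimeWarning; with no shared directories it
returns an empty list and stays silent.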

The transition plan would then be to emit those warnings as
DeprecationWarning in 3.7, upgrade to a visible-by-default
RuntimeWarning in 3.8, and then *maybe* consider omitting
sys.main_path_entry from sys.path by default in 3.9.

I'm not going to pursue this idea further until after PEP 538 is
squared away, but it's currently seeming feasible to me.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
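
[Editorial sketch, not part of the original thread.] The interactive
session in the last message, where giving a module a `__path__` lets
explicit relative imports resolve against it, can also be reproduced
non-interactively. To avoid touching the real `__main__`, this sketch
assumes a stand-in module registered under the hypothetical name
"pseudo_main"; the helper name and the temporary "relative.py" file
are likewise illustrative.

```python
import importlib
import os
import sys
import tempfile
import types


def import_relative_to_pseudo_package(package_name="pseudo_main"):
    """Import ".relative" against a module that only has __path__ set."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # The sibling module simply records the name it was imported as.
        with open(os.path.join(tmpdir, "relative.py"), "w") as f:
            f.write("NAME = __name__\n")

        # A plain module becomes a pseudo-package once it has __path__.
        pseudo = types.ModuleType(package_name)
        pseudo.__path__ = [tmpdir]
        sys.modules[package_name] = pseudo
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(".relative", package=package_name)
            return module.NAME
        finally:
            # Leave sys.modules as we found it.
            sys.modules.pop(package_name, None)
            sys.modules.pop(package_name + ".relative", None)
```

The returned name is qualified by the pseudo-package
("pseudo_main.relative" with the default argument), matching the
"__main__.relative" output shown in the session.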