[Python-Dev] Draft PEP: "Simplified Package Layout and Partitioning"

Eric Snow ericsnowcurrently at gmail.com
Wed Jul 20 21:35:46 CEST 2011


On Wed, Jul 20, 2011 at 11:04 AM, P.J. Eby <pje at telecommunity.com> wrote:
> Hm.  Here's another variant that might be easier to implement (even in C),
> and could offer some other advantages as well.
>
> Suppose we replace the sys.virtual_packages set() with a sys.virtual_paths
> dict(): a dictionary that maps from module names to __path__ lists, and
> that's populated by the __path__ creation algorithm described in the PEP.
>  (An empty list would mean that __path__ creation failed for that
> module/package name.)
>
> Now, if a module doesn't have a __path__ (or doesn't exist), we look in
> sys.virtual_paths for the module name.  If the retrieved list is empty, we
> fail the import.  If it's not, we proceed...  but *don't* create a module or
> set the existing module's __path__.
>
> Then, at the point where an import succeeds, and we're going to set an
> attribute on the parent module, we recursively construct parent modules and
> set their __path__ attributes from sys.virtual_paths, if a module doesn't
> exist in sys.path, or its __path__ isn't set.

(I'm guessing you meant sys.modules in that last sentence.)

This is a really nice solution.  So a virtual package is not imported
until a submodule of the virtual package is successfully imported
(except for direct import of pure virtual packages).  It seems like
sys.virtual_packages should be populated even during a failed
submodule import.  Is that right?

Also, it makes sense that the above applies to all virtual packages,
not just pure ones.
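
A rough sketch of the flow described above, as a toy model rather than
real import machinery. Everything here is an assumption made for
illustration: `modules` and `virtual_paths` stand in for sys.modules and
the proposed sys.virtual_paths, `tree` replaces filesystem scanning, and
only one level of nesting ('pkg.mod') is handled.

```python
import types

modules = {}        # stand-in for sys.modules (hypothetical)
virtual_paths = {}  # stand-in for the proposed sys.virtual_paths

def virtual_path_for(pkg, tree):
    """Compute and cache a __path__ list; an empty list records a failure."""
    if pkg not in virtual_paths:
        virtual_paths[pkg] = [d + "/" + pkg for d in tree if pkg in tree[d]]
    return virtual_paths[pkg]

def import_submodule(fullname, tree):
    """Import 'pkg.mod'; `tree` maps search dir -> {pkg: {mod, ...}}."""
    pkg, _, leaf = fullname.partition(".")
    parent = modules.get(pkg)
    # Use a real __path__ if the parent has one, else the cached virtual path.
    path = getattr(parent, "__path__", None) or virtual_path_for(pkg, tree)
    found = any(leaf in tree[p.rsplit("/", 1)[0]][pkg] for p in path)
    if not found:
        # Failure: the cache is populated, but `modules` is left untouched.
        raise ImportError(fullname)
    # Success: only now create the parent and/or give it a __path__.
    if parent is None:
        parent = modules[pkg] = types.ModuleType(pkg)
    if not hasattr(parent, "__path__"):
        parent.__path__ = list(path)
    child = modules[fullname] = types.ModuleType(fullname)
    setattr(parent, leaf, child)
    return child

tree = {"site-a": {"json": {"foo"}}, "site-b": {"json": {"bar"}}}
try:
    import_submodule("json.missing", tree)   # fails, creates no 'json'
except ImportError:
    pass
import_submodule("json.bar", tree)           # succeeds, creates 'json'
```

In this model the failed import still leaves an entry in the cache,
which is the behavior asked about above.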

>
> Voila.  Now there are fewer introspection problems as well: trying to
> 'import json.foo' when there's no 'foo.py' in any json/ directory will *not*
> create an empty 'json' package in sys.modules as a side-effect.  And it
> won't add a __path__ to the 'json' module if there were a json.py found,
> either.
>
> What's more, since importing a pure virtual package now fails unless you've
> successfully imported something from it before, it makes more sense for it
> to not have a __file__, or a __file__ of None.
>
> Actually, it's too bad that we have to have parent packages in sys.modules,
> or I'd suggest we just make pure virtual packages unimportable, period.

It wouldn't be that hard to disallow their direct import entirely, but
still allow the indirect import when successfully importing a
submodule.  However, a failed direct import would effectively imply
that importing submodules of the virtual package will also fail.  In
other words, it may be a source of confusion if a package can't be
imported but its submodule can.

There is one remaining difference between the two types of virtual
packages that's derived from allowing direct import of pure virtual
packages.

When a pure virtual package is directly imported, a new [empty] module
is created and its __path__ is set to the matching value in
sys.virtual_packages.  However, an "impure" virtual package is not
created upon direct import, and its __path__ is not updated until a
submodule import is attempted.  Even the sys.virtual_packages entry is
not generated until the submodule attempt, since the virtual package
mechanism doesn't kick in until the point that an ImportError is
currently raised.

This isn't that big a deal, but it would be the one behavioral
difference between the two kinds of virtual packages.  So either leave
that one difference, disallow direct import of pure virtual packages,
or attempt to make virtual packages for all non-package imports.  That
last one would impose the virtual package overhead on many more
imports, so it is probably impractical.  I'm fine with leaving the
one difference.
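
To make the difference concrete, here is a toy model of direct import
only. It is a sketch under stated assumptions, not the PEP's algorithm:
`modules` and `virtual_paths` are stand-ins for interpreter state, and
`plain_modules` / `virtual_dirs` are hypothetical inputs that replace
the real search for name.py files and name/ directories.

```python
import types

modules = {}        # stand-in for sys.modules (hypothetical)
virtual_paths = {}  # stand-in for the virtual package registry

def direct_import(name, plain_modules, virtual_dirs):
    """Model `import name` for a top-level name.

    plain_modules: names for which an ordinary name.py exists
    virtual_dirs:  name -> list of name/ directories found on the path
    """
    if name in modules:
        return modules[name]
    if name in plain_modules:
        # "Impure" case: the ordinary module loads, no ImportError occurs,
        # so the virtual package fallback never runs.
        mod = types.ModuleType(name)
    else:
        # "Pure" case: the fallback runs where ImportError would be raised.
        path = virtual_paths.setdefault(name, list(virtual_dirs.get(name, [])))
        if not path:
            raise ImportError(name)
        mod = types.ModuleType(name)
        mod.__path__ = list(path)
    modules[name] = mod
    return mod

dirs = {"json": ["site-a/json"], "zope": ["site-a/zope"]}
impure = direct_import("json", {"json"}, dirs)  # no __path__, no registry entry
pure = direct_import("zope", {"json"}, dirs)    # empty module with a __path__
```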

>
> Technically, we *could* always create dummy parent modules for virtual
> packages and *not* put them in sys.modules, but I'm not sure if that's a
> good idea.  It would be more consistent in some ways with the idea that
> virtual packages are not directly importable, but an interesting side effect
> would be that if module A does:
>
>  import foo.bar
>
> and module B does:
>
>  import foo.baz
>
> Then module A's version of 'foo' has *only* a 'bar' attribute and B's
> version has *only* a 'baz' attribute.  This could be considered a good
> thing, a bad thing, or a weird thing, depending on how you look at it.  ;-)
>
> Probably, we should stick with the current shared 'foo' instance, even for
> pure virtual packages.  It's just that 'foo' should not exist in
> sys.packages until one of the above imports succeeds.

(Guessing you meant sys.virtual_packages.)

Agreed.
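
For illustration, the two options from the quoted text can be compared
with plain module objects. This is only a sketch: `registry` stands in
for sys.modules, and `import_shared` / `import_private` are hypothetical
helpers, not part of any real import API.

```python
import types

registry = {}  # stand-in for sys.modules (hypothetical)

def import_shared(fullname):
    """Bind the submodule onto the single parent kept in the registry."""
    pkg, _, leaf = fullname.partition(".")
    parent = registry.setdefault(pkg, types.ModuleType(pkg))
    setattr(parent, leaf, types.ModuleType(fullname))
    return parent

def import_private(fullname):
    """Bind the submodule onto a fresh dummy parent that is never registered."""
    pkg, _, leaf = fullname.partition(".")
    parent = types.ModuleType(pkg)
    setattr(parent, leaf, types.ModuleType(fullname))
    return parent

# Shared parent: modules A and B see the same 'foo' with both attributes.
a_foo = import_shared("foo.bar")
b_foo = import_shared("foo.baz")

# Private parents: A's 'foo' has only 'bar', B's 'foo' has only 'baz'.
a_priv = import_private("foo.bar")
b_priv = import_private("foo.baz")
```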

FYI, last night I started on an importlib-based implementation for the
PEP and the above solution would be really easy to incorporate.

-eric

>
> Anyway, thanks for bringing this issue up, because now we can fix the hole
> *entirely*.  If pure virtual packages can never be imported directly, then
> they can *never* create false positive imports -- and the "Backward
> Compatibility" part of the PEP gets shorter.  ;-)
>
> Hurray!  (I'm tempted to run off and tweak the PEP for this right now, but I
> want to see if any of the folks who'd be doing the actual 3.x implementation
> of this want to weigh in on the details first.)
>