[Import-SIG] New draft revision for PEP 382

Eric Snow ericsnowcurrently at gmail.com
Sun Jul 10 00:30:28 CEST 2011


On Sat, Jul 9, 2011 at 3:49 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 02:42 AM 7/9/2011 -0600, Eric Snow wrote:
>>
>> And if there were a "zope_part1" and a "zope_part2" directory, both
>> with a zope.ns file in them, that namespace_subpath("zope") call would
>> return ["/usr/lib/site-packages/zope_part1",
>> "/usr/lib/site-packages/zope_part2"], right?  And if both also had a
>> foo.ns file in them, the same would be returned for
>> namespace_subpath("foo").
>
> No; the directory is always named for the package, just like now.  We're
> just saying that we replace looking for __init__.py with looking for *.ns.
>
>
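
For what it's worth, here is how I now read that rule for a plain
filesystem path entry.  This is only a minimal sketch, not the PEP's
implementation: the name namespace_subpath_sketch and the temporary
directory layout are made up for illustration, and only the rule itself
(the directory is named for the package and contains a *.ns file) comes
from the explanation above.

```python
import glob
import os
import tempfile


def namespace_subpath_sketch(path_entry, fullname):
    """Return the package directory under path_entry if it holds a *.ns file.

    Hypothetical helper: the directory must be named for the package itself,
    so only <path_entry>/<last name component> is ever examined.
    """
    leaf = fullname.rpartition(".")[2]
    candidate = os.path.join(path_entry, leaf)
    markers = glob.glob(os.path.join(glob.escape(candidate), "*.ns"))
    return candidate if markers else None


# Illustrative layout: "zope" counts as a portion; "zope_part1" does not,
# even though it also contains a zope.ns file.
site = tempfile.mkdtemp()
for dirname in ("zope", "zope_part1"):
    os.mkdir(os.path.join(site, dirname))
    open(os.path.join(site, dirname, "zope.ns"), "w").close()

found = namespace_subpath_sketch(site, "zope")    # the "zope" directory
missing = namespace_subpath_sketch(site, "foo")   # no "foo" directory
```

So a "zope_part1" directory never contributes to "zope", whatever *.ns
files it contains.
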
>> > Finally, the ``__init__`` module code for the package (if it exists)
>> > is located and executed in the new module's namespace.
>> >
>> > Each importer that returns a ``namespace_subpath()`` for the package
>> > is asked to perform a standard ``find_module()`` for the package.
>> > Since by the normal import rules, a directory containing an
>> > ``__init__`` module is a package, this call should succeed if the
>> > namespace package portion contains an ``__init__`` module, and the
>> > importing can proceed normally from that point.
>> >
>>
>> Is this last paragraph part of the finally?
>
> Yes.
>
>>  If so, what does calling
>> find_module at this point accomplish?  Do you mean load_module is also
>> called for each that is found?
>
> I'm adding this sentence to the end of that paragraph for clarification:
>
> """(That is, with a ``load_module()`` call to execute the first ``__init__``
> module found on the package's ``__path__``.)"""
>
> Does that make it clearer?
>

Yeah, that's great.

>
>> Will it be too easy (or conversely
>> very likely) to have __init__.py collisions?
>
> __init__ collisions (and the recommendation to not use __init__ modules at
> all) are addressed later below in the implementation notes.
>
>
>
>> > There is one caveat, however.  The importers currently distributed
>> > with Python expect that *they* will be the ones to initialize the
>> > ``__path__`` attribute, which means that they must be changed to
>> > either recognize that ``__path__`` has already been set and not
>> > change it, or to handle namespace packages specially (e.g., via an
>> > internal flag or checking ``sys.namespace_packages``).
>> >
>> > Similarly, any third-party importers wishing to support namespace
>> > packages must make similar changes.
>> >
>>
>> Seems like the caveat is dependent on the above algorithm.  If the
>> module's __path__ were set with the namespace_subpath() results after
>> the namespace package's import was all over, would it still be an
>> issue?
>
> No, but then we couldn't support __init__ modules executing with the correct
> __path__ value; notably, this would prevent __init__ modules from
> manipulating their own __path__.
>

Good point.
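
To make sure I follow the ordering, here is a rough sketch of "set
__path__ first, then run __init__".  The function name and the inline
init_source string are hypothetical stand-ins (a real loader would find
the __init__ module on the package's __path__); the point is only that
the __init__ code sees, and can modify, a __path__ that already exists.

```python
import types


def load_namespace_package_sketch(name, subpaths, init_source=None):
    """Create a package module whose __path__ is set before __init__ runs."""
    module = types.ModuleType(name)
    # Step 1: the import machinery (not the loader) fills in __path__.
    module.__path__ = list(subpaths)
    # Step 2: only then is the __init__ code executed in the module namespace,
    # so it can read or manipulate its own __path__.
    if init_source is not None:
        code = compile(init_source, "<%s.__init__>" % name, "exec")
        exec(code, module.__dict__)
    return module


pkg = load_namespace_package_sketch(
    "zope",
    ["/a/zope", "/b/zope"],
    init_source="__path__.append('/extra/zope')",
)
```

If __path__ were assigned only after the import finished, the append in
the __init__ code would fail with a NameError instead.
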

> Honestly, throwing out __init__ support entirely would make a LOT of things
> easier and simpler here, especially in the 2.x version.  But there was a
> vocal contingent of support for them in the original Python-Dev discussion.
>
>
>> > Specifically the proposed changes and additions are:
>>
>> Maybe, "Specifically the proposed changes and additions to pkgutil
>> are:", to clarify the context?
>
> Ok.
>
>> >
>> > * A new ``namespace_subpath(importer, fullname)`` generic, allowing
>> >  implementations to be registered for existing importers.
>> >
>> > * A new ``extend_namespaces(path_entry)`` function, to extend existing
>> >  and already-imported namespace packages' ``__path__`` attributes to
>> >  include any portions found in a new ``sys.path`` entry.  This
>> >  function should be called by applications extending ``sys.path``
>> >  at runtime, e.g. to include a plugin directory or add an egg to the
>> >  path.
>> >
>> >  The implementation of this function does a simple breadth-first walk
>> >  of ``sys.namespace_packages``, and performs any necessary
>> >  ``namespace_subpath()`` calls to identify what path entries need to
>> >  be added to each package's ``__path__``, given that `path_entry`
>> >  has been added to ``sys.path``.
>> >
>>
>> Does the same apply to namespace sub-packages where their parent
>> package has an updated __path__?  So a recursion would take place in
>> some cases.
>
> Yes, that's what "breadth-first" meant here; i.e., first top-level
> namespaces, then second-level namespaces, and so on.  In actuality, though,
> I erred by saying breadth-first; what I actually meant is technically
> "pre-order traversal", i.e., parent nodes are touched before their children.
> I'll tweak that to "top-down traversal" instead of "breadth-first walk",
> and add:
>
> """(Or, in the case of sub-packages, adding a derived subpath entry, based
> on their parent namespace's ``__path__``.)"""
>

That helps a lot.
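
Here is how I picture the top-down traversal, as a sketch under my own
assumptions rather than the proposed pkgutil code.  Since
sys.namespace_packages is only proposed, the sketch takes a plain dict
mapping package names to their __path__ lists, and has_portion is a
hypothetical callback standing in for the namespace_subpath() check.

```python
import os


def top_down(namespace_names):
    """Order dotted package names so every parent precedes its children."""
    return sorted(namespace_names, key=lambda name: name.split("."))


def extend_namespaces_sketch(path_entry, namespace_paths, has_portion):
    """Extend each package's __path__ list for a newly added path entry.

    namespace_paths: dict of package name -> list used as its __path__.
    has_portion: callable(directory) -> bool (hypothetical portion check).
    Returns a dict of package name -> entries added for that package.
    """
    added = {}
    for name in top_down(namespace_paths):
        parent, _, leaf = name.rpartition(".")
        if parent:
            # Sub-package: derive candidates from what the parent just gained.
            bases = added.get(parent, [])
        else:
            # Top-level package: look directly under the new path entry.
            bases = [path_entry]
        candidates = [os.path.join(base, leaf) for base in bases]
        new = [d for d in candidates
               if has_portion(d) and d not in namespace_paths[name]]
        namespace_paths[name].extend(new)
        added[name] = new
    return added


paths = {"zope.app": [], "zope": []}
result = extend_namespaces_sketch("/plugins", paths, lambda d: True)
```

Because "zope" is handled before "zope.app", the child picks up a subpath
derived from the entry its parent gained in the same pass.
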

>
>> The iter_modules() method isn't part of PEP 302, is it?  Where can I
>> find out more about it?
>
> See pkgutil; it's something I added in Python 2.5 to help tools like pydoc
> better support zipfiles and other exotic importers.
>

Yeah, I see iter_modules() in pkgutil, but was unaware of it on
importer objects.  However, my ignorance is irrelevant to the PEP, as
I certainly agree with the bullet in the case that importer objects
have iter_modules().  :)

>
>> > * "Meta" importers (i.e., importers placed on ``sys.meta_path``) do
>> >  not need to implement ``namespace_subpath()``, because the method
>> >  is only called on importers corresponding to ``sys.path`` entries.
>>
>> And parent.__path__ for namespace submodules?
>
> Yes.  Fixed.
>
>
>> >  If a meta importer wishes to support namespace packages, it must
>> >  do so entirely within its ``find_module()`` implementation.
>> >
>> >  Unfortunately, it is unlikely that any such implementation will be
>> >  able to merge its namespace portions with those of other meta
>> >  importers or ``sys.path`` importers, so the meaning of "supporting
>> >  namespace packages" for a meta importer is currently undefined.
>> >
>>
>> While I'm not sure meta importers need to be left out, I suppose it
>> isn't critical since the work-around isn't that hard, nor widely
>> needed.  Thus the message here is that this PEP only applies to the
>> use of sys.path_hooks and sys.path_importer_cache.  It would be nice
>> for that to be clear up front.
>
> Ok, I added this:
>
> """(Note: the import machinery will NOT invoke this method for importers
> on ``sys.meta_path``, because there is no path string associated with
> such importers, and so the idea of a "subpath" makes no sense in that
> case.)"""
>
> just after this bit:
>
> """The Python import machinery will call this method on each importer
> corresponding to a path entry in ``sys.path`` (for top-level imports)
> or in a parent package ``__path__`` (for subpackage imports)."""
>
> in the PEP 302 protocol description.
>

Nice!  Would it be worth pointing out that the focus is on
sys.path_hooks and sys.path_importer_cache?  Something like "(Note: ...
Instead, this PEP is focused on the import machinery surrounding
sys.path_hooks.)"  I only bring this up because the specificity of what
the focus **is** helped me grasp what the implementation involves.

>
>
>> >  Further, there is an immense body of existing code (including the
>> >  distutils and many other packaging tools) that expects a package
>> >  directory's name to be the same as the package name.
>>
>> Correct me if I'm wrong, but I have understood that for namespace
>> packages in the PEP, the directory name does not have to be the
>> package name.
>
> Consider yourself corrected.  ;-)
>
>
>> Back to namespace subpackages, it's unclear how they should work.
>> Either a namespace package is at the top level of a sys.path entry, or
>> it's a module of a parent package, namespace or otherwise.  The
>> top-level case is pretty clear.  However, the subpackage case is not.
>> I don't see namespace subpackages as being too practical with
>> non-namespace parent packages, but I'm probably missing something.
>
> They aren't practical at all, no.  ;-)  I'll add an implementation note
> explaining that even though the spec doesn't require a namespace package's
> parent to also be a namespace, there isn't any practical use in doing
> otherwise, as the child __path__ is a collection of subpaths derived from the
> parent __path__, and thus it wouldn't combine with any other contributions
> that weren't installed to the same location.
>
> Here's the text:
>
> * In general, a namespace subpackage (e.g. ``peak.util``, ``zope.app``,
>  etc.) must be a child of another namespace package (e.g. ``peak``,
>  ``zope``, etc.).  This is not required by the spec or enforced by
>  the implementation, but in practice, it is useless to put a
>  namespace package inside a non-namespace package, as the child
>  package's ``__path__`` will be a subset of the parent's.
>
>  In other words, it will only work correctly if all the contributions
>  to that namespace package are installed to the same physical
>  location.  So, if you intend to use a namespace subpackage, you
>  should always make its parent package a namespace as well.
>
>
>

Sounds great.  Much appreciated.

-eric

