[Import-SIG] New PEP draft: "Simplified Package Layout and Partitioning"

Thu Jul 14 00:27:01 CEST 2011

On Wed, Jul 13, 2011 at 11:11 AM, P.J. Eby <pje at telecommunity.com> wrote:
> I'd appreciate any questions, problems, clarifications, concerns, etc. so we
> can clean this up before we run it past Python-Dev.  There are also a couple
> of "XXX" comments down in the "Implementation Notes" section, with open
> questions we need to nail down.  Mostly, though, this is looking...  pretty
> doable, actually.
>

This is cool stuff.  And you have presented it really well.  I have
some (probably too much) feedback inline.

> Thanks!
>
>
> PEP: XXX
> Title: Simplified Package Layout and Partitioning
> Version: $Revision$
> Last-Modified: $Date$
> Author: P.J. Eby
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 12-Jul-2011
> Python-Version: 3.3
> Post-History:
> Replaces: 382
>
> Abstract
> ========
>
> This PEP proposes an enhancement to Python's package importing
> to:
>
> * Surprise users of other languages less,
> * Make it easier to convert a module into a package, and
> * Support dividing packages into separately installed components
>  (ala "namespace packages", as described in PEP 382)
>
> The proposed enhancements do not change the semantics of any
> currently-importable directory layouts, but make it possible for
> packages to use a simplified directory layout (that is not importable
> currently).
>
> However, the proposed changes do NOT add any performance overhead to
> the importing of existing modules or packages, and performance for the
> new directory layout should be about the same as that of previous
> "namespace package" solutions (such as ``pkgutil.extend_path()``).
>
>
> The Problem
> ===========
>
> .. epigraph::
>
>    "Most packages are like modules.  Their contents are highly
>    interdependent and can't be pulled apart.  [However,] some
>    packages exist to provide a separate namespace. ...  It should
>    be possible to distribute sub-packages or submodules of these
>    [namespace packages] independently."
>
>    -- Jim Fulton, shortly before the release of Python 2.3 [1]_
>
>
> When new users come to Python from other languages, they are often
> confused by Python's packaging semantics.  At Google, for example,
> Guido received complaints from "a large crowd with pitchforks" [2]_
> that the requirement for packages to contain an ``__init__`` module
> was a "misfeature", and should be dropped.
>
> In addition, users coming from languages like Java or Perl are
> sometimes confused by a difference in Python's import path searching.
>
> In most other languages that have a path mechanism to Python's

... mechanism similar to Python's

> ``sys.path``, a package is merely a namespace that contains modules
> or classes, and can thus be spread across multiple directories in
> the language's path.  In Perl, for instance, a ``Foo::Bar`` module
> will be searched for in ``Foo/`` subdirectories all along the module
> include path, not just in the first such subdirectory found.
>
> Worse, this is not just a problem for new users: it prevents *anyone*
> from easily splitting a package into separately-installable
> components.  In Perl terms, it would be as if every possible ``Net::``
> module on CPAN had to be bundled up and shipped in a single tarball!
>
> For that reason, various workarounds for this latter limitation exist,
> circulated under the term "namespace packages".  The Python standard
> library has provided one such workaround since Python 2.3 (via the
> ``pkgutil.extend_path()`` function), and the "setuptools" package
> provides another (via ``pkg_resources.declare_namespace()``).
>
> The workarounds themselves, however, fall prey to a *third* issue with
> Python's way of laying out packages in the filesystem.
>
> Because a package *must* contain an ``__init__`` module, any attempt
> to distribute modules for that package must necessarily include that
> ``__init__`` module, if those modules are to be importable.
>
> However, the very fact that each distribution of modules for a package
> must contain this (duplicated) ``__init__`` module, means that OS
> vendors who package up these module distributions must somehow handle
> the conflict caused by several distributions installing that
> ``__init__`` module to the same location in the filesystem.
>
> This led to the proposing of PEP 382 ("Namespace Packages") - a way
> to signal to Python's import machinery that a directory was
> importable, using unique filenames per module distribution.
>
> However, there was more than one downside to this approach.
> Performance for all import operations would be affected, and the
> process of designating a package became even more complex.  New
> terminology had to be invented to explain the solution, and so on.
>
> As terminology discussions continued on the Import-SIG, it soon became
> apparent that the main reason it was so difficult to explain the
> concepts related to "namespace packages" was because Python's
> current way of handling packages is somewhat underpowered, when
> compared to other languages.
>
> That is, in other popular languages with package systems, no special
> term is needed to describe "namespace packages", because *all*
> packages generally behave in the desired fashion.
>
> Rather than being an isolated single directory with a special marker
> module (as in Python), packages in other languages are typically just
> a *union of appropriately-named directories* across the *entire*
> import or inclusion path.
>
> In Perl, for example, the module ``Foo`` is always found in a
> ``Foo.pm`` file, and a module ``Foo::Bar`` is always found in a
> ``Foo/Bar.pm`` file.  (In other words, there is One Obvious Way to
> find the location of a particular module.)
>
> This is because Perl considers a module to be *different* from a
> package: the package is purely a *namespace* in which other modules
> may reside, and is only *coincidentally* the name of a module as well.
>
> In current versions of Python, however, the module and the package are
> more tightly bound together.  ``Foo`` is always a module -- whether it
> is found in ``Foo.py`` or ``Foo/__init__.py`` -- and it is tightly
> linked to its submodules (if any), which *must* reside in the exact
> same directory where the ``__init__.py`` was found.
>
> On the positive side, this design choice means that a package is quite
> self-contained, and can be installed, copied, etc. as a unit just by
> performing an operation on the package's root directory.
>
> On the negative side, however, it is non-intuitive for beginners, and
> requires a more complex step to turn a module into a package.  If
> ``Foo`` begins its life as ``Foo.py``, then it must be moved and
> renamed to ``Foo/__init__.py``.
>
> Conversely, if you intend to create a ``Foo.Bar`` module from the
> start, but have no particular module contents to put in ``Foo``
> itself, then you have to create an empty and seemingly-irrelevant
> ``Foo/__init__.py`` file, just so that ``Foo.Bar`` can be imported.
>
> (And these issues don't just confuse newcomers to the language,
> either: they annoy many experienced developers as well.)
>
> So, after some discussion on the Import-SIG, this PEP was created
> as an alternative to PEP \382, in an attempt to solve *all* of the
> above problems, not just the "namespace package" use cases.
>
> And, as a delightful side effect, the solution proposed in this PEP
> does not affect the import performance of ordinary modules or
> self-contained (i.e. ``__init__``-based) packages.
>
>
> The Solution
> ============
>
> In the past, various proposals have been made to allow more intuitive
> approaches to package directory layout.  However, most of them failed
> because of an apparent backward-compatibility problem.
>
> That is, if the requirement for an ``__init__`` module were simply
> dropped, it would open up the possibility for a directory named, say,
> ``string`` on ``sys.path``, to block importing of the standard library
> ``string`` module.
>
> Paradoxically, however, the failure of this approach does *not* arise
> from the elimination of the ``__init__`` requirement!
>
> Rather, the failure arises because the underlying approach takes for
> granted that a package is just ONE thing, instead of two.
>
> In truth, a package comprises two separate, but related entities: a
> module (with its own, optional contents), and a *namespace* where
> *other* modules or packages can be found.
>
> In current versions of Python, however, the module part (found in
> ``__init__``) and the namespace for submodule imports (represented
> by the ``__path__`` attribute) are both initialized at the same time,
> when the package is first imported.
>
> And, if you assume this is the *only* way to initialize these two
> things, then there is no way to drop the need for an ``__init__``
> module, while still being backwards-compatible with existing directory
> layouts.
>
> After all, as soon as you encounter a directory on ``sys.path``
> matching the desired name, that means you've "found" the package, and
> must stop searching, right?
>
> Well, not quite.
>
>
> A Thought Experiment
> --------------------
>
> Let's hop into the time machine for a moment, and pretend we're back
> in the early 1990s, shortly before Python packages and ``__init__.py``
> have been invented.  But, imagine that we *are* familiar with
> Perl-like package imports, and we want to implement a similar system
> in Python.
>
> We'd still have Python's *module* imports to build on, so we could
> certainly conceive of having ``Foo.py`` as a parent ``Foo`` module
> for a ``Foo`` package.  But how would we implement submodule and
> subpackage imports?
>
> Well, if we didn't have the idea of ``__path__`` attributes yet,
> we'd probably just search ``sys.path`` looking for ``Foo/Bar.py``.
>
> But we'd *only* do it when someone actually tried to *import*
> ``Foo.Bar``.
>
> NOT when they imported ``Foo``.
>
> And *that* lets us get rid of the backwards-compatibility problem
> of dropping the ``__init__`` requirement, back here in 2011.
>
> How?
>
> Well, when we ``import Foo``, we're not even *looking* for ``Foo/``
> directories on ``sys.path``, because we don't *care* yet.  The only
> point at which we care, is the point when somebody tries to actually
> import a submodule or subpackage of ``Foo``.
>
> That means that if ``Foo`` is a standard library module (for example),
> and I happen to have a ``Foo`` directory on ``sys.path`` (without
> an ``__init__.py``, of course), then *nothing breaks*.  The ``Foo``
> module is still just a module, and it's still imported normally.
>
>
> Self-Contained vs. "Virtual" Packages
> -------------------------------------
>
> Of course, in today's Python, trying to ``import Foo.Bar`` will
> fail if ``Foo`` is just a ``Foo.py`` module (and thus lacks a
> ``__path__`` attribute).
>
> So, this PEP proposes to *dynamically* create a ``__path__``, in the
> case where one is missing.
>
> That is, if I try to ``import Foo.Bar`` the proposed change to the
> import machinery will notice that the ``Foo`` module lacks a
> ``__path__``, and will therefore try to *build* one before proceeding.
>
> And it will do this by making a list of all the existing ``Foo/``
> subdirectories of the directories listed in ``sys.path``.
>
> If the list is empty, the import will fail with ``ImportError``, just
> like today.  But if the list is *not* empty, then it is saved in
> a new ``Foo.__path__`` attribute, making the module a "virtual
> package".
>
> That is, because it now has a valid ``__path__``, we can proceed
> to import submodules or subpackages in the normal way.
>
> Now, notice that this change does not affect "classic", self-contained
> packages that have an ``__init__`` module in them.  Such packages
> already *have* a ``__path__`` attribute (initialized at import time)
> so the import machinery won't try to create another one later.
>
> This means that (for example) the standard library ``email`` package
> will not be affected in any way by you having a bunch of unrelated
> directories named ``email`` on ``sys.path``.
>
> But it *does* mean that if you want to turn your ``Foo`` module into
> a ``Foo`` package, all you have to do is add a ``Foo/`` directory
> somewhere on ``sys.path``, and start adding modules to it.
>
> But what if you only want a "namespace package"?  That is, a package
> that is *only* a namespace for various separately-distributed
> submodules and subpackages?
>
> For example, if you're Zope Corporation, distributing dozens of
> separate tools like ``zc.buildout``, each in packages under the ``zc``
> namespace, you don't want to have to make and include an empty
> ``zc.py`` in every tool you ship.  (And, if you're a Linux or other
> OS vendor, you don't want to deal with the package conflicts created
> by trying to install ten copies of ``zc.py`` to the same location!)
>
> No problem.  All we have to do is make one more minor tweak to the
> import process: if the "classic" import process fails to find a
> self-contained module or package (e.g., if ``import zc`` fails to find
> a ``zc.py`` or ``zc/__init__.py``), then we once more try to build a
> ``__path__`` by searching for all the ``zc/`` directories on
> ``sys.path``, and putting them in a list.
>
> If this list is empty, we raise ``ImportError``.  But if it's
> non-empty, we create an empty ``zc`` module, and put the list in
> ``zc.__path__``.  Congratulations: ``zc`` is now a namespace-only,
> "pure virtual" package!  It has no module contents, but you can still
> import submodules and subpackages from it, regardless of where they're
> located on ``sys.path``.
>
> (By the way, both of these additions to the import protocol (i.e. the
> dynamically-added ``__path__``, and dynamically-created modules)
> apply recursively to child packages, using the parent package's
> ``__path__`` in place of ``sys.path`` as a basis for generating a
> child ``__path__``.  This means that self-contained and virtual
> packages can contain each other without limitation, with the caveat
> that if you put a virtual package inside a self-contained one, it's
> gonna have a really short ``__path__``!)

Nice.
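
To check my own understanding, here is a rough sketch of the "pure virtual" case for plain directories only.  The names ``build_virtual_path`` and ``make_pure_virtual_package`` are made up for illustration.  The sketch ignores PEP 302 importers and ``sys.modules`` entirely, so it is not the proposed implementation.

```python
import os
import tempfile
import types


def build_virtual_path(name, search_path):
    """Collect every existing <entry>/<name>/ directory, in search order."""
    return [
        os.path.join(entry, name)
        for entry in search_path
        if os.path.isdir(os.path.join(entry, name))
    ]


def make_pure_virtual_package(name, search_path):
    """Return an empty module with a computed __path__, or raise ImportError."""
    path = build_virtual_path(name, search_path)
    if not path:
        # Empty list: behave like a failed import today.
        raise ImportError("No module named " + name)
    module = types.ModuleType(name)
    module.__path__ = path
    return module


# Two separate install locations, each shipping a portion of "zc".
with tempfile.TemporaryDirectory() as root:
    site_a = os.path.join(root, "site_a")
    site_b = os.path.join(root, "site_b")
    os.makedirs(os.path.join(site_a, "zc"))
    os.makedirs(os.path.join(site_b, "zc"))

    zc = make_pure_virtual_package("zc", [site_a, site_b])
    zc_portions = [os.path.relpath(p, root) for p in zc.__path__]
```

Both ``zc/`` directories end up on ``zc.__path__`` in search order, and neither location has to ship a ``zc.py`` or ``zc/__init__.py``.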

>
>
> Backwards Compatibility and Performance
> ---------------------------------------
>
> Notice that these two changes *only* affect import operations that
> today would result in ``ImportError``.  As a result, the performance
> of imports that do not involve virtual packages is unaffected, and
> potential backward compatibility issues are very restricted.
>
> Today, if you try to import submodules or subpackages from a module
> with no ``__path__``, it's an immediate error.  And of course, if you
> don't have a ``zc.py`` or ``zc/__init__.py`` somewhere on ``sys.path``
> today, ``import zc`` would likewise fail.
>
> Thus, the only potential backwards-compatibility issues are:
>
> 1. Tools that expect package directories to have an ``__init__``
>   module, that expect directories without an ``__init__`` module
>   to be unimportable, or that expect ``__path__`` attributes to be
>   static, will not recognize virtual packages as packages.
>

Should there be a way to indicate that you do not want a directory to
be considered for a package (an opt-out)?  Currently I can move the
__init__.py out of the way and it gets ignored by import.

>   (In practice, this just means that tools will need updating to
>   support virtual packages, e.g. by using ``pkgutil.walk_modules()``
>   instead of using hardcoded filesystem searches.)
>
> 2. Code that *expects* certain imports to fail may now do something
>   unexpected.  This should be fairly rare in practice, as most sane,
>   non-test code does not import things that are expected not to
>   exist!
>
> The biggest likely exception to the above would be when a piece of
> code tries to check whether some package is installed by importing
> it.  If this is done *only* by importing a top-level module (i.e., not
> checking for a ``__version__`` or some other attribute), *and* there
> is a directory of the same name as the sought-for package on
> ``sys.path`` somewhere, *and* the package is not actually installed,
> then such code could perhaps be fooled into thinking a package is
> installed that really isn't.
>
> However, even in this case, the failure is more likely to be annoying
> than damaging; in most cases, the code will simply fail a little later
> on, when it actually tries to DO something with the imported (but
> empty) module.  (And code that checks for a ``__version__`` attribute
> or the presence of some desired function, class, or module
> in the package will not see such a false positive result in the
> first place.)

Good point.
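
For what it's worth, the safer kind of check described there can be sketched like this.  ``is_really_installed`` is a made-up helper name, and which attribute to look for is up to the caller.

```python
import importlib


def is_really_installed(name, required_attr):
    """True only if `name` imports AND exposes `required_attr`."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False
    # An empty virtual package would import fine but fail this test.
    return hasattr(module, required_attr)


has_json = is_really_installed("json", "dumps")
has_bogus_attr = is_really_installed("json", "no_such_attribute_xyz")
has_bogus_module = is_really_installed("surely_not_a_real_module_xyz", "x")
```

A bare ``import`` check would report success for an empty virtual package, while the attribute test does not.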

>
> Meanwhile, tools that expect to locate packages and modules by
> walking a directory tree can be updated to use the existing
> ``pkgutil.walk_modules()`` API, and tools that need to inspect
> packages in memory should use the other APIs described in the
> `Standard Library Changes/Additions`_ section below.
>
>
> Specification
> =============
>
> Two changes are made to the existing import process.
>
> First, the built-in ``__import__`` function must not raise an
> ``ImportError`` when importing a submodule of a module with no
> ``__path__``.  Instead, it must attempt to *create* a ``__path__``
> attribute for the parent module, as described in `__path__ creation`_
> below.
>
> Second, if searching ``sys.meta_path`` and ``sys.path`` (or a parent
> package ``__path__``) fails to find a module, the import process must
> also attempt to create a ``__path__`` attribute for the non-existent
> module.  If the attempt succeeds, an empty module is created and its
> ``__path__`` is set.  Otherwise, importing fails.
>

Nice summary.

> In both of the above cases, if a non-empty ``__path__`` is created,
> the name of the module whose ``__path__`` was created is added to
> ``sys.virtual_packages`` -- an initially-empty set of package names.

<warning>
I am looking at this PEP from the perspective that it may be useful,
and not terribly difficult, to factor in meta importers.  So if that
viewpoint is invalid a good chunk of my remaining comments may be
irrelevant.  Also, I have been knee deep in importlib in the last few
weeks, which will be painfully obvious in my feedback.  I apologize in
advance.  <wink>
</warning>

Perhaps it should be a mapping from the module name to the meta
importer which generated the __path__ entry for the module.  If meta
importers are factored in, the matching importer would be the one to
determine how __path__ should change (like in the situation described
for extend_virtual_paths() below).
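
A minimal sketch of the mapping idea, assuming a plain dict stands in for ``sys.virtual_packages``.  The helper names and ``DemoMetaImporter`` are hypothetical and nothing here is part of the PEP.

```python
# Hypothetical stand-in for sys.virtual_packages, as a dict instead of a set.
virtual_packages = {}


def record_virtual_package(name, meta_importer=None):
    """Remember which meta importer (if any) built the __path__ for `name`.

    None means the default sys.path machinery built it.
    """
    virtual_packages[name] = meta_importer


def importer_responsible_for(name):
    """Return the recorded meta importer, or None for the default machinery."""
    return virtual_packages[name]


class DemoMetaImporter:
    """Hypothetical meta importer, used only as a dictionary value here."""


demo = DemoMetaImporter()
record_virtual_package("zc")
record_virtual_package("plugins", demo)
```

Membership tests (``"zc" in virtual_packages``) and iteration over names still work as they would with a set, so code that only needs the names would not have to change.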

>
> Conversely, if an empty ``__path__`` results, an ``ImportError``
> is immediately raised, and the module is not created or changed, nor
> is its name added to ``sys.virtual_packages``.
>
> (This way, code that extends ``sys.path`` at runtime can find out
> what virtual packages are currently imported, and thereby add any
> new subdirectories to those packages' ``__path__`` attributes.  See
> `Standard Library Changes/Additions`_ below for more details.)

Clear and straightforward.

>
>
> ``__path__`` Creation
> ---------------------
>
> A virtual ``__path__`` is created by obtaining a PEP 302 "importer"
> object for each of the path entries found in ``sys.path`` (for a
> top-level module) or the parent ``__path__`` (for a submodule).
>
> (Note: because ``sys.meta_path`` importers are not associated with
> ``sys.path`` or ``__path__`` entry strings, such importers do *not*
> participate in this process.)
>

Nice.  The context for this note here makes more sense than in the
other versions (of the other PEP).

Could the importers on sys.meta_path be given the opportunity to take
control of the process, just as they get tried first when "finding"
modules?  Otherwise we'd be missing the means of customizing the
__path__ creation process, if that is important.  I don't think it
would add much complexity to the implementation and would parallel the
"finding" part of the import process.

In importlib, the _DefaultPathFinder class handles the search across
sys.path, corresponding to the default import behavior for files.  It
is implicitly added to the end of sys.meta_path for
importlib.__import__, along with the builtin and frozen importers.
For virtual __path__ creation, it would perform the process described
in this section.

Thus, _DefaultPathFinder would return the list of __path__ entry
strings resulting when no other meta importer matches the fullname.
However, if another (on sys.meta_path) matched, wouldn't the __path__
coming from _DefaultPathFinder be potentially wrong?  If so, it would
pay to ask each importer on sys.meta_path for the virtual __path__ and
stop on the first hit.
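
Roughly what I have in mind, as a sketch only.  The ``get_virtual_path()`` method on meta importers is a hypothetical hook (the PEP defines no such thing), and ``default_finder`` is a stand-in for the ``sys.path`` scan described in this section.

```python
def find_virtual_path(fullname, meta_importers, default_finder):
    """Ask each meta importer for a virtual __path__; stop at the first hit."""
    for importer in meta_importers:
        hook = getattr(importer, "get_virtual_path", None)  # hypothetical hook
        if hook is None:
            continue  # this importer knows nothing about virtual packages
        path = hook(fullname)
        if path:
            return list(path)
    # Nobody claimed the name: fall back to the default sys.path behavior.
    return list(default_finder(fullname))


class DatabaseImporter:
    """Hypothetical meta importer that claims only the 'dbmods' namespace."""

    def get_virtual_path(self, fullname):
        if fullname == "dbmods":
            return ["db://modules/dbmods"]
        return None


class LegacyImporter:
    """Hypothetical meta importer with no virtual-path support at all."""


def default_finder(fullname):
    # Stand-in for the sys.path scan; the result is illustrative only.
    return ["/site-packages/" + fullname]


importers = [LegacyImporter(), DatabaseImporter()]
claimed = find_virtual_path("dbmods", importers, default_finder)
fallback = find_virtual_path("zc", importers, default_finder)
```

This mirrors the order used when finding modules: meta importers get the first chance, and the default path-based behavior only applies when none of them claims the name.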

> Each importer is checked for a ``get_subpath()`` method, and if
> present, the method is called with the full name of the module the
> ``__path__`` is being constructed for.  The return value is either
> a string representing a package subdirectory, or ``None`` if no such
> subdirectory exists.

Should it return a list of strings rather than a single string?  Your
use of "strings" in the next sentence implies that it would.  If
get_subpath() is called at the meta_path level it would need to return a
list of strings.  I am guessing that importers on sys.path_hooks could
too.
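
If both return types were allowed, the collecting loop would only need a small normalization step.  A sketch, with made-up importer classes standing in for real PEP 302 importers:

```python
def collect_subpaths(importers, fullname):
    """Build a __path__ from importers whose get_subpath() may return
    None, a single string, or a list of strings."""
    path = []
    for importer in importers:
        get_subpath = getattr(importer, "get_subpath", None)
        if get_subpath is None:
            continue  # importer without virtual package support
        result = get_subpath(fullname)
        if result is None:
            continue  # no such subdirectory for this importer
        if isinstance(result, str):
            result = [result]  # normalize the single-string form
        path.extend(result)
    return path


class SingleDirImporter:
    """Hypothetical importer returning one string, as the draft specifies."""

    def __init__(self, base):
        self.base = base

    def get_subpath(self, fullname):
        return self.base + "/" + fullname.rpartition(".")[2]


class MultiDirImporter:
    """Hypothetical importer returning several entries at once."""

    def get_subpath(self, fullname):
        return ["/x/" + fullname, "/y/" + fullname]


class NoSupportImporter:
    """Hypothetical importer with no get_subpath() method."""


result = collect_subpaths(
    [SingleDirImporter("/a"), NoSupportImporter(), MultiDirImporter()], "zc"
)
```

Entries stay in importer order either way, so the single-string form in the draft is just the one-element case.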

>
> The strings returned by each importer are added to the ``__path__``
> being built, in the same order as they are found.  (``None`` values
> and missing ``get_subpath()`` methods are simply skipped.)
>
> In Python code, the algorithm would look something like this::
>
>    def get_virtual_path(modulename, parent_path=None):
>
>        if parent_path is None:
>            parent_path = sys.path

sys.path is used here instead of as the default arg so that it gets
evaluated each time?
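
If so, that matches how default arguments behave: they are evaluated once, when the function is defined.  A small illustration, using a module-level ``search_path`` list as a stand-in for ``sys.path``:

```python
search_path = ["/first"]  # stand-in for sys.path


def lookup_early(path=search_path):
    # The default was bound to the original list when the function was defined.
    return list(path)


def lookup_late(path=None):
    # The name is looked up on every call, so rebinding is noticed.
    if path is None:
        path = search_path
    return list(path)


# Rebind the name to a brand-new list (not an in-place change).
search_path = ["/second"]

early_result = lookup_early()
late_result = lookup_late()
```

In-place changes such as ``append()`` would be visible either way; only code that assigns a new list to ``sys.path`` would be missed by the default-argument form.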

>
>        path = []
>
>        for entry in parent_path:
>            # Obtain a PEP 302 importer object - see pkgutil module
>            importer = pkgutil.get_importer(entry)
>
>            if hasattr(importer, 'get_subpath'):
>                subpath = importer.get_subpath(modulename)
>                if subpath is not None:
>                    path.append(subpath)
>
>        return path
>
> And a function like this one should be exposed in the standard
> library as ``imp.get_virtual_path()``, so that people creating

Or in importlib...

> ``__import__`` replacements or ``sys.meta_path`` hooks can reuse it.
>
>
> Standard Library Changes/Additions
> ----------------------------------
>
> The ``pkgutil`` module should be updated to handle this
> specification appropriately, including any necessary changes to
> ``extend_path()``, ``iter_modules()``, etc.  A new generic API for
> calling ``get_subpath()`` on importers should be added as well.
>
> Specifically the proposed changes and additions to ``pkgutil`` are:
>
> * A new ``get_subpath(importer, fullname)`` generic function, allowing
>  implementations to be registered for existing importers.

Not that it necessarily impacts this PEP, but I'm not sure what you
mean by "registered for existing importers".  I am guessing that
pkgutil is used to facilitate behaviors in packaging libraries, like
setuptools, and that this registration is one of those behaviors.
Then again I am a little dense sometimes <wink>.

Don't sweat responding with an explanation.  I just wanted to point
out that the context of some of the pkgutil related stuff may not be
obvious; and that the documentation for pkgutil doesn't help a ton to
clarify that context.  This may not matter for the PEP and its
expected audience.

>
> * A new ``extend_virtual_paths(path_entry)`` function, to extend
>  existing, already-imported virtual packages' ``__path__`` attributes
>  to include any portions found in a new ``sys.path`` entry.  This
>  function should be called by applications extending ``sys.path``
>  at runtime, e.g. when adding a plugin directory or an egg to the
>  path.
>
>  The implementation of this function does a simple top-down traversal
>  of ``sys.virtual_packages``, and performs any necessary
>  ``get_subpath()`` calls to identify what path entries need to
>  be added to each package's ``__path__``, given that `path_entry`
>  has been added to ``sys.path``.  (Or, in the case of sub-packages,
>  adding a derived subpath entry, based on their parent namespace's
>  ``__path__``.)
>

As I already noted, this is pretty specific to the default file import
mechanism rather than the more general meta import process.  Maybe
that's all that is needed?  My sense of extending virtual paths is
pretty fuzzy.
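
To make it less fuzzy for myself, here is how I read the filesystem-only case.  This is a sketch under assumptions: ``virtual_packages`` and ``modules`` are passed in explicitly as stand-ins for ``sys.virtual_packages`` and ``sys.modules``, and subpaths are derived with plain ``os.path`` calls rather than ``get_subpath()`` calls on importers.

```python
import os
import tempfile
import types


def extend_virtual_paths(path_entry, virtual_packages, modules):
    """Append portions found under `path_entry` to imported virtual packages."""
    # Parents sort before children, which gives a top-down traversal.
    for name in sorted(virtual_packages, key=lambda n: n.count(".")):
        module = modules.get(name)
        if module is None:
            continue  # listed as virtual but not (or no longer) imported
        candidate = os.path.join(path_entry, *name.split("."))
        if os.path.isdir(candidate) and candidate not in module.__path__:
            module.__path__.append(candidate)


with tempfile.TemporaryDirectory() as root:
    old_site = os.path.join(root, "old_site")
    plugins = os.path.join(root, "plugins")
    os.makedirs(os.path.join(old_site, "zope", "app"))
    os.makedirs(os.path.join(plugins, "zope", "app"))

    # Already-imported virtual packages that only know about old_site.
    zope = types.ModuleType("zope")
    zope.__path__ = [os.path.join(old_site, "zope")]
    zope_app = types.ModuleType("zope.app")
    zope_app.__path__ = [os.path.join(old_site, "zope", "app")]
    modules = {"zope": zope, "zope.app": zope_app}
    virtual = {"zope", "zope.app", "zope.unimported"}

    # The application adds a plugin directory at runtime.
    extend_virtual_paths(plugins, virtual, modules)
    zope_path = [os.path.relpath(p, root) for p in zope.__path__]
    zope_app_path = [os.path.relpath(p, root) for p in zope_app.__path__]
```

Where a meta importer fits into this loop is exactly the part that remains unclear to me.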

> * A new ``iter_virtual_packages(parent='')`` function to allow
>  top-down traversal of virtual packages in ``sys.virtual_packages``,
>  by yielding the child virtual packages of `parent`.  For example,
>  calling ``iter_virtual_packages("zope")`` might yield ``zope.app``
>  and ``zope.products`` (if they are imported virtual packages listed
>  in ``sys.virtual_packages``), but **not** ``zope.foo.bar``.
>  (This function is needed to implement ``extend_virtual_paths()``,
>  but is also potentially useful for other code that needs to inspect
>  imported virtual packages.)
>
> * ``ImpImporter.iter_modules()`` should be changed to also detect and
>  yield the names of modules found in virtual packages.
>
> In addition to the above changes, the ``zipimport`` importer should
> have its ``iter_modules()`` implementation similarly changed.  (Note:
> current versions of Python implement this via a shim in ``pkgutil``,
> so technically this is also a change to ``pkgutil``.)
>
> Last, but not least, the ``imp`` module should expose the algorithm
> described in the `__path__ creation`_ section above, as a
> ``get_virtual_path(modulename, parent_path=None)`` function, so that
> creators of ``__import__`` replacements can use it.

Or this could go in importlib?  I guess it depends on where the
implementation happens.

>
>
> Implementation Notes
> --------------------
>
> For users, developers, and distributors of virtual packages:
>
> * ``sys.virtual_packages`` is allowed to contain non-existent or
>  not-yet-imported package names; code that uses its contents should

If it were a dict the module name could point to None, rather than to
the responsible meta importer.

>  not assume that every name in this set is also present in
>  ``sys.modules`` or that importing the name will necessarily succeed.

Good point.

>
> * If you are changing a currently self-contained package into a
>  virtual one, it's important to note that you can no longer use its
>  ``__file__`` attribute to locate data files stored in a package
>  directory.  Instead, you must search ``__path__`` or use the
>  ``__file__`` of a submodule adjacent to the desired files, or
>  of a self-contained subpackage that contains the desired files.

Nice catch.

The "optional extensions" section of PEP 302 has a bit about a
get_data() method for importers.  Using get_data() instead of __file__
or __path__ seems like a safer operation, much as you recommended
using pkgutil.walk_modules() above.

In the case of importlib (yes, it's on my mind), get_data() is already
implemented for the finders surrounding _DefaultPathFinder.  I am not
familiar with the importers that are currently used on
sys.path_importer_cache, but maybe they provide get_data() too?  (a
cursory look makes me think so)
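
As an illustration of the ``get_data()`` route, the existing ``pkgutil.get_data()`` function wraps the loader's ``get_data()`` method.  The sketch below builds a throwaway self-contained package (``demo_data_pkg`` is a made-up name) and reads a data file from it.  How this should behave for a pure virtual package is not something I am sure of, so that case is not shown.

```python
import importlib
import os
import pkgutil
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny self-contained package with one data file in it.
    package_dir = os.path.join(root, "demo_data_pkg")
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as f:
        f.write("")
    with open(os.path.join(package_dir, "greeting.txt"), "wb") as f:
        f.write(b"hello")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        # Ask the package's loader for the bytes instead of using __file__.
        data = pkgutil.get_data("demo_data_pkg", "greeting.txt")
    finally:
        # Leave the interpreter state as it was found.
        sys.path.remove(root)
        sys.modules.pop("demo_data_pkg", None)
```

The calling code never touches ``__file__`` directly, which is the property that matters once a package may no longer live in a single directory.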

>
> * XXX what is the __file__ of a "pure virtual" package?  ``None``?
>  Some arbitrary string?  The path of the first directory with a
>  trailing separator?  No matter what we put, *some* code is
>  going to break, but the last choice might allow some code to
>  accidentally work.  Is that good or bad?
>
>
> For those implementing PEP \302 importer objects:
>
> * Importers that support the ``iter_modules()`` method (used by
>  ``pkgutil`` to locate importable modules and pacakges) and want to

s/pacakges/packages/

>  add virtual package support should modify their ``iter_modules()``
>  method so that it discovers and lists virtual packages as well as
>  standard modules and packages.  To do this, the importer should
>  simply list all immediate subdirectory names in its jurisdiction
>  that are valid Python identifiers.
>
>  XXX This might list a lot of not-really-packages.  Should we
>  require importable contents to exist?  If so, how deep do we
>  search, and how do we prevent e.g. link loops, or traversing onto
>  different filesystems, etc.?  Ick.
>
> * "Meta" importers (i.e., importers placed on ``sys.meta_path``) do
>  not need to implement ``get_subpath()``, because the method
>  is only called on importers corresponding to ``sys.path`` entries
>  and ``__path__`` entries.  If a meta importer wishes to support
>  virtual packages, it must do so entirely within its own
>  ``find_module()`` implementation.

Certainly that is a simpler approach, but it seems like each
find_module() implementation would end up doing it pretty much the
same way, following the pattern used by the sys.path handler.
However, you are probably right that handling just the sys.path stuff
is good enough.

>
>  Unfortunately, it is unlikely that any such implementation will be
>  able to merge its package subpaths with those of other meta
>  importers or ``sys.path`` importers, so the meaning of "supporting
>  virtual packages" for a meta importer is currently undefined!
>
>  (However, since the intended use case for meta importers is to
>  replace Python's normal import process entirely for some subset of
>  modules, and the number of such importers currently implemented is
>  quite small, this seems unlikely to be a big issue in practice.)

And that is why I wonder if all my blathering is relevant.  Still, I'm
just not sure that it would be difficult for an implementation of this
PEP to handle meta importers intelligently.  I would hate to discount
them unnecessarily.  If I'm just a vocal minority on this point I'll
let it go.  :)

Meta importers could always be addressed in a later addition, if
needed.  Only a couple of things would impact that later effort:

* sys.virtual_packages being a set vs. a dictionary
* get_subpath() returning a string vs. a list

And only one thing seems ambiguous when meta importers are left for
later.  If a module is loaded through a meta importer, which importer
handles a get_subpath() call?  When extend_virtual_paths() is called, how
are meta-imported modules addressed?

>
>
> References
> ==========
>
> .. [1] "namespace" vs "module" packages (mailing list thread)
>   (http://mail.zope.org/pipermail/zope3-dev/2002-December/004251.html)
>
> .. [2] "Dropping __init__.py requirement for subpackages"
>   (http://mail.python.org/pipermail/python-dev/2006-April/064400.html)
>
>
> Copyright
> =========
>
> This document has been placed in the public domain.
>
>
> ..
>   Local Variables:
>   mode: indented-text
>   indent-tabs-mode: nil
>   sentence-end-double-space: t
>   fill-column: 70
>   coding: utf-8
>   End:
>

One last point:  This PEP results in two ways to provide a module for
a package (<NAME>.py in addition to <NAME>/__init__.py).  However, you
do offer a good distinction; __init__.py is for "self-contained"
packages.  Is it clear when to use which?  Will __init__.py go away
after a while?  Will we have to start looking in two places for a
package's code?

Again, this is much clearer to me than the PEP 382 proposals were.
And your extensive experience with packaging really shows.  Sorry if
any of my feedback displays my ignorance in that area too painfully.
I most wholeheartedly defer to you and the rest on this list regarding
most of the stuff I have said.  :)

Thanks for working on this.

-eric

p.s. if you hurry maybe you can pick up PEP 402.  It's funny how those
PEP numbers line up sometimes.
