[Python-Dev] Dropping __init__.py requirement for subpackages

Phillip J. Eby pje at telecommunity.com
Wed Apr 26 20:07:30 CEST 2006


At 10:16 AM 4/26/2006 -0700, Guido van Rossum wrote:
>So I have a very simple proposal: keep the __init__.py requirement for
>top-level packages, but drop it for subpackages.

Note that many tools exist which have grown to rely on the presence of 
__init__ modules.  Also, although your proposal would allow imports to work 
reasonably well, tools that are actively looking for packages would need to 
have some way to distinguish package directories from others.

My counter-proposal: to be considered a package, a directory must contain 
at least one module (which of course can be __init__).  This allows the "is 
it a package?" question to be answered with only one directory read, as is 
the case now.  Think of it also as a nudge in favor of "flat is better than 
nested".
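
As a rough illustration of that rule, here is a minimal sketch of the check. It assumes "module" means a name ending in .py (real import machinery also recognizes extension modules and bytecode files), and the helper name looks_like_package is hypothetical:

```python
import os


def looks_like_package(dirpath):
    """Return True if dirpath directly contains at least one .py name.

    Hypothetical helper: a single os.listdir() call answers the
    question, and __init__.py counts like any other module.
    """
    try:
        names = os.listdir(dirpath)
    except OSError:
        # Missing, unreadable, or not a directory at all.
        return False
    # Only names are inspected, so no further filesystem access is needed.
    return any(name.endswith(".py") for name in names)
```

Under this rule an empty 'time' directory would simply not count as a package, so it could not shadow anything.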

This tweak would also make it usable for top-level directories, since the 
mere presence of a 'time' directory wouldn't get in the way of anything.

The thing more likely to have potential for problems is that many Python 
projects have a "test" directory that isn't intended to be a package, and 
thus may interfere with imports from the stdlib 'test' package.  Whether 
this is really a problem or not, I don't know.

But, we could treat packages without __init__ as namespace packages.  That 
is, set their __path__ to encompass similarly-named directories already on 
sys.path, so that the init-less package doesn't interfere with other 
packages that have the same name.
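
To make the __path__ idea concrete, here is a minimal sketch that collects every directory with a given name along a search path. The helper name namespace_path is hypothetical, and this only shows the path computation, not how the import machinery would be changed to use it:

```python
import os
import sys


def namespace_path(name, search_path=None):
    """Collect every directory called `name` found along search_path.

    Hypothetical helper: the returned list is what an init-less
    package's __path__ would be set to under this idea.
    """
    if search_path is None:
        search_path = sys.path
    found = []
    for entry in search_path:
        # An empty sys.path entry means the current directory.
        candidate = os.path.join(entry or os.curdir, name)
        if os.path.isdir(candidate) and candidate not in found:
            found.append(candidate)
    return found
```

Because every matching directory ends up on __path__, a stray init-less directory earlier on sys.path would not hide a same-named package later on it.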

This would require a bit of expansion to PEP 302, but probably not 
much.  Most of the rest is existing technology, and we've already begun 
migrating stdlib modules away from doing their own hunting for __init__ and 
other files, towards using the pkgutil API.

By the way, one small precedent for packages without __init__: setuptools 
generates such packages using .pth files when a package is split between 
different distributions that are being installed by a system packaging 
tool.  In such cases, the parts of the package can't *both* include an 
__init__, because the packaging tool (e.g. RPM) is going to complain that 
the shared file is a conflict.  So setuptools generates a .pth file that 
creates a module object with the right name and initializes its __path__ to 
point to the __init__-less directory.
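
The effect of such a .pth file can be sketched roughly as follows. This is an illustration of the idea only, not the code setuptools actually writes; the helper name register_initless_package is hypothetical:

```python
import sys
import types


def register_initless_package(name, directory):
    """Create (or reuse) a module object for `name` and add `directory`
    to its __path__, so submodules import without an __init__.py.

    Hypothetical helper illustrating the idea; not the code that
    setuptools generates.
    """
    module = sys.modules.setdefault(name, types.ModuleType(name))
    path = module.__dict__.setdefault("__path__", [])
    if directory not in path:
        path.append(directory)
    return module
```

Once the module object is in sys.modules with a __path__, an ordinary "import name.submodule" searches the listed directory for the submodule, and no __init__ file is ever consulted.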


>This should be a small change.

Famous last words.  :)  There's a bunch of tools that it's not going to 
work properly with, and not just in today's stdlib.  (Think documentation 
tools, distutils extensions, IDEs...)

Are you sure you wouldn't rather just write a GoogleImporter class to fix 
this problem?  Append it to sys.path_hooks, clear sys.path_importer_cache, 
and you're all set.  For that matter, if you have only one top-level 
package, put the class and the installation code in that top-level 
__init__, and you're set to go.

And that approach will work with Python back to version 2.3; no waiting for 
an upgrade (unless Google is still using 2.2, of course).

Let's see, the code would look something like:

    # Python 2-era sketch: relies on the 'imp' and 'new' modules.
    import imp
    import new
    import os
    import sys

    class GoogleImporter:
        def __init__(self, path):
            # Path hooks signal "not my kind of path" with ImportError.
            if not os.path.isdir(path):
                raise ImportError("Not for me")
            self.path = os.path.realpath(path)

        def find_module(self, fullname, path=None):
            # Note: we ignore the 'path' argument, since it is only
            # used via meta_path.
            subname = fullname.split(".")[-1]
            # Any subdirectory counts as a package; we load it ourselves.
            if os.path.isdir(os.path.join(self.path, subname)):
                return self
            path = [self.path]
            try:
                file, filename, etc = imp.find_module(subname, path)
            except ImportError:
                return None
            return ImpLoader(fullname, file, filename, etc)

        def load_module(self, fullname):
            # Create (or reuse) the package module and extend its __path__.
            subname = fullname.split(".")[-1]
            path = os.path.join(self.path, subname)
            module = sys.modules.setdefault(fullname, new.module(fullname))
            module.__dict__.setdefault('__path__', []).append(path)
            return module

    class ImpLoader:
        def __init__(self, fullname, file, filename, etc):
            self.file = file
            self.filename = filename
            self.fullname = fullname
            self.etc = etc

        def load_module(self, fullname):
            try:
                mod = imp.load_module(fullname, self.file, self.filename,
                                      self.etc)
            finally:
                # imp.find_module leaves the file open; always close it.
                if self.file:
                    self.file.close()
            return mod

    sys.path_hooks.append(GoogleImporter)
    sys.path_importer_cache.clear()


