[Python-Dev] Dropping __init__.py requirement for subpackages

Guido van Rossum guido at python.org
Wed Apr 26 20:50:15 CEST 2006


On 4/26/06, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:16 AM 4/26/2006 -0700, Guido van Rossum wrote:
> >So I have a very simple proposal: keep the __init__.py requirement for
> >top-level packages, but drop it for subpackages.
>
> Note that many tools exist which have grown to rely on the presence of
> __init__ modules.  Also, although your proposal would allow imports to work
> reasonably well, tools that are actively looking for packages would need to
> have some way to distinguish package directories from others.
>
> My counter-proposal: to be considered a package, a directory must contain
> at least one module (which of course can be __init__).  This allows the "is
> it a package?" question to be answered with only one directory read, as is
> the case now.  Think of it also as a nudge in favor of "flat is better than
> nested".

I'm not sure what you mean by "one directory read". You'd have to list
the entire directory, which may require reading more than one block if
the directory is large.

But I'd be happy to define it like this from the POV of tools that
want to know about sub-packages; my users complain because they have
put .py files in a directory that they consider a sub-package so it
would work fine for them. Python itself might attempt to consider the
directory as a package and raise ImportError because the requested
sub-module isn't found; the creation of a dummy entry in sys.modules
in that case doesn't bother me.
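
A minimal sketch of the tool-side check described above, assuming that "module" means a plain .py file (extension modules and compiled files are ignored here). The function name looks_like_package is made up for illustration and is not part of any existing API:

```python
import os


def looks_like_package(dirpath):
    """Return True if dirpath contains at least one .py file.

    Hypothetical helper: only plain .py source files count as modules.
    """
    with os.scandir(dirpath) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(".py"):
                # Stop at the first module found. In a large directory
                # this may still take more than one read of the listing.
                return True
    return False
```

Note that the scan can stop early only when a module is present; a directory with no modules has to be listed in full before the answer is known.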

> This tweak would also make it usable for top-level directories, since the
> mere presence of a 'time' directory wouldn't get in the way of anything.

Actually, no; the case I remember was a directory full of Python code
(all experiments by the user related to a particular topic -- I
believe it was "string").

> The thing more likely to have potential for problems is that many Python
> projects have a "test" directory that isn't intended to be a package, and
> thus may interfere with imports from the stdlib 'test' package.  Whether
> this is really a problem or not, I don't know.

"test" is a top-level package. I'm not proposing to change the rules
for top-level packages. Now you have the reason why. (And the new
"absolute import" feature in 2.5 will prevent aliasing problems
between subdirectories and top-level modules.)

> But, we could treat packages without __init__ as namespace packages.  That
> is, set their __path__ to encompass similarly-named directories already on
> sys.path, so that the init-less package doesn't interfere with other
> packages that have the same name.

Let's stick to the one feature I'm actually proposing please.

> This would require a bit of expansion to PEP 302, but probably not
> much.  Most of the rest is existing technology, and we've already begun
> migrating stdlib modules away from doing their own hunting for __init__ and
> other files, towards using the pkgutil API.
>
> By the way, one small precedent for packages without __init__: setuptools
> generates such packages using .pth files when a package is split between
> different distributions but are being installed by a system packaging
> tool.  In such cases, *both* parts of the package can't include an
> __init__, because the packaging tool (e.g. RPM) is going to complain that
> the shared file is a conflict.  So setuptools generates a .pth file that
> creates a module object with the right name and initializes its __path__ to
> point to the __init__-less directory.
>
>
> >This should be a small change.
>
> Famous last words.  :)  There's a bunch of tools that it's not going to
> work properly with, and not just in today's stdlib.  (Think documentation
> tools, distutils extensions, IDEs...)

Are you worried about the tools not finding directories that are now
subpackages? Then fix the tools. Or are you worried about flagging
subdirectories as (empty) packages since they exist, have a valid name
(no hyphens, dots etc.) and contain no modules? I'm not sure I would
call that failing. I can't see how a tool would crash or produce
incorrect results with this change, *unless* you consider it incorrect
to list a data directory as an empty package. To me, that's an
advantage.
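
To illustrate the second case, here is a rough sketch of how a tool might enumerate candidate subpackages under the proposed rule, assuming a "valid name" simply means a Python identifier and that only .py files count as modules. The name list_subpackages is hypothetical:

```python
import os


def list_subpackages(pkg_dir):
    """Map each candidate subpackage name to whether it holds a .py file.

    Hypothetical helper: any subdirectory whose name is a valid
    identifier is treated as a subpackage, with or without __init__.py.
    """
    found = {}
    for name in sorted(os.listdir(pkg_dir)):
        path = os.path.join(pkg_dir, name)
        # Skip plain files and names with hyphens, dots, etc.
        if not os.path.isdir(path) or not name.isidentifier():
            continue
        # False marks an "empty" package, e.g. a data directory.
        found[name] = any(f.endswith(".py") for f in os.listdir(path))
    return found
```

A data directory with a valid name shows up with the value False, so a tool can report it as an empty package or filter it out, whichever it prefers.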

> Are you sure you wouldn't rather just write a GoogleImporter class to fix
> this problem?

No, because that would require more setup code with a requirement to
properly enable it, etc., etc., more failure modes, etc., etc.

>  Append it to sys.path_hooks, clear sys.path_importer_cache,
> and you're all set.  For that matter, if you have only one top-level
> package, put the class and the installation code in that top-level
> __init__, and you're set to go.

I wish it were that easy. If there was such an easy solution, there
wouldn't be pitchforks involved. I can't go into the details, but that
just wouldn't work; and the problem happens most frequently to people
who are already overloaded with learning new stuff. This is just one
more bit of insanity they have to deal with.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

