Python packages - problems, pitfalls.

Prabhu Ramachandran prabhu at aero.iitm.ernet.in
Tue Nov 6 07:52:04 EST 2001


>>>>> "PB" == Paul Boddie <paul at boddie.net> writes:

    >> pkg_root/
    >>     __init__.py
    >>     app.py   # app is the application that is not part of the package

    PB> There's something about this which intrigues me, too, but more
    PB> on that below.

    PB> I've been reorganising some package structures recently, and I
    PB> found a number of issues which may be of interest to
    PB> you. Firstly, that remark I made above: let's say we run
    PB> app.py from outside pkg_root...

    PB>   python pkg_root/app.py

    PB> What would you expect to happen if app.py contained the line
    PB> given below?

    PB>   import pkg_root.common_module

    PB> What seems to happen is that unless the current directory (the
    PB> parent of pkg_root) resides on your PYTHONPATH, this will not
    PB> work - it's as if app.py is part of the package and should be
    PB> using...

    PB>   import common_module

Indeed, that is correct.
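To make the behaviour concrete, here is a small self-contained sketch
(the file contents are invented for illustration): it builds the
pkg_root layout in a temporary directory and runs app.py from the
parent of pkg_root, showing that the package-qualified import fails
because sys.path[0] is the script's own directory, not the current
directory.

```python
import os, subprocess, sys, tempfile

tmp = tempfile.mkdtemp()
pkg = os.path.join(tmp, "pkg_root")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "common_module.py"), "w") as f:
    f.write("VALUE = 42\n")
with open(os.path.join(pkg, "app.py"), "w") as f:
    f.write("import pkg_root.common_module\n")

# Run app.py from the parent of pkg_root.  sys.path[0] is set to the
# script's directory (pkg_root/), not the working directory, so the
# package-qualified import cannot be resolved.
env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
result = subprocess.run([sys.executable, os.path.join(pkg, "app.py")],
                        cwd=tmp, env=env, capture_output=True, text=True)
print(result.returncode != 0)   # True: the import failed
```

Putting the parent of pkg_root on PYTHONPATH (or installing the
package) is what makes that import succeed.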

    PB> I found this slightly confusing, but it is related to what you
    PB> want to do in your example, because I agree that it can make
    PB> some sense to include the test, demonstration or main
    PB> application program within the package directory, even though
    PB> it isn't strictly part of the package - it's a user of the

True, but let's say I want to freeze the package without installing it.
Or I have a user who doesn't want to, or cannot, install it (for
instance because (s)he is not root on the machine) but still wants to
run the package.

    PB> package. Of course, in Java one uses an explicit package
    PB> keyword to exclude or include "modules" from/in packages,
    PB> whereas in Python it's implicit.

    PB> This is really an aside, though, because your main issue is
    PB> with the way importing of "super-package" members is
    PB> done. What happens if there's a package called common
    PB> somewhere else on your system outside (and unconnected with)
    PB> this package, and then in a.py you do use the following
    PB> statement?

    PB>   import common.foo

True, but then even sibling packages won't work.  If you have name
clashes you are in trouble anyway, since Python makes all global
modules available without a std.<package> prefix.

If you have a module, one important rule is that you should not give it
the same name as a standard module (or another globally available
module).  For instance, one is asking for trouble with a module named
string.py!  There is nothing you can do about this.  Name clashes are
a problem, and my approach does not address them; neither does it make
matters worse, because the current scheme is no better anyway.

  pkg/
      sub/
          b.py
          subsub/
b.py:

import subsub.foo

is valid and has the same problem you mention (i.e. if subsub is
another package's name, you are in trouble).
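Recreating that layout shows the ambiguity directly.  Note that the
sketch below runs under a modern Python 3, where implicit relative
imports were later removed (PEP 328) for exactly this reason, so the
import now fails outright; under the Python of this thread it would
silently resolve to the sibling subsub:

```python
import os, subprocess, sys, tempfile

tmp = tempfile.mkdtemp()
subsub = os.path.join(tmp, "pkg", "sub", "subsub")
os.makedirs(subsub)
for d in (os.path.join(tmp, "pkg"),
          os.path.join(tmp, "pkg", "sub"), subsub):
    open(os.path.join(d, "__init__.py"), "w").close()
open(os.path.join(subsub, "foo.py"), "w").close()
with open(os.path.join(tmp, "pkg", "sub", "b.py"), "w") as f:
    f.write("import subsub.foo\n")   # the implicit relative form above

# Import pkg.sub.b from the parent of pkg/.
r = subprocess.run([sys.executable, "-c", "import pkg.sub.b"],
                   cwd=tmp, capture_output=True, text=True)
print(r.returncode != 0)   # True under Python 3: the implicit form is gone
```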

    PB> In the scheme you propose, access to the external common
    PB> module is likely to be restricted in some way by the presence
    PB> of the common subpackage of the new package. With the existing
    PB> mechanisms, however, this ambiguity is avoided entirely in
    PB> this case - you have to do the following:

    PB>   import pkg_root.common.foo

Not really.  See above.  Also, if you *truly* want to avoid ambiguity,
we'd all have to explicitly name *everything*, which is a bigger pain.

Actually, one good way to allow access to global names would be to
introduce a std or main name that is always global (or at least not
to be used (officially) by anyone).  Then, to resolve names in case of
ambiguity, one would do:

  import std.string

To get the local string module, you'd do:

  import string
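A rough sketch of how such a std name could be wired up with today's
importlib machinery (which did not exist at the time; _StdFinder and
_StdLoader are invented names): std.<name> is redirected to the real
top-level module, so a local string.py can no longer hide the standard
one.

```python
import importlib, importlib.abc, importlib.util, sys

class _StdLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None                      # use the default empty module
    def exec_module(self, module):
        if module.__name__ == "std":
            return                       # the bare namespace itself
        # copy the real top-level module's public names into std.<name>
        real = importlib.import_module(module.__name__.partition(".")[2])
        for attr in dir(real):
            if not attr.startswith("__"):
                setattr(module, attr, getattr(real, attr))

class _StdFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path=None, target=None):
        if fullname == "std" or fullname.startswith("std."):
            return importlib.util.spec_from_loader(
                fullname, _StdLoader(), is_package=(fullname == "std"))
        return None

sys.meta_path.insert(0, _StdFinder())

import std.string
print(std.string.ascii_lowercase)      # the real stdlib module's names
```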

    PB> I do see your point, though. In one of my works, I use
    PB> packages to maintain a sort-of logical structure as follows:

    PB>   XMLForms/
    PB>       __init__.py
    PB>       Accessor.py
    PB>       ...
    PB>       DOM/
    PB>           __init__.py
    PB>           DOMAccessor.py  # Yes, I could have called this Accessor,
    PB>                           # but I did say that I just reorganised
    PB>                           # the package!
    PB>       ...

[snip]

    PB> Now, DOMElementTypes is related to ElementTypes through their
    PB> "nature" - they perform similar activities. However, I have
    PB> chosen to group define subpackages according to implementation
    PB> technology. As a result, I would have to do the following kind
    PB> of import from within XMLForms.DOM.DOMElementTypes.BaseTypes:

    PB>   import XMLForms.ElementTypes.BaseTypes

    PB> You might want to know why I can't do this:

    PB>   import ElementTypes.BaseTypes

    PB> Well, I don't really see the need to do that unless I consider
    PB> ElementTypes to be independent of XMLForms. Moreover, if I
    PB> decided to give in and rename DOMElementTypes to ElementTypes,

True, I'll admit that ElementTypes isn't really independent of
XMLForms, but all this is a matter of convenience.  If you really
wanted to force explicit package imports, then sibling packages should
not work.  And coding would be painful, especially if you wanted to
avoid things like

from XMLForms.ElementTypes.BaseType import *

or

import XMLForms.ElementTypes.BaseType as BaseType

which are both not recommended and are not reload safe.  In that case,
to access any single function you'd have to type far too much.
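The staleness is easy to demonstrate.  In this sketch (mod_demo is an
invented name), a name copied out with "from ... import" keeps
pointing at the old value after a reload, while attribute access
through the module sees the new one:

```python
import importlib, os, sys, tempfile

sys.dont_write_bytecode = True        # avoid a stale .pyc confusing reload
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "mod_demo.py"), "w") as f:
    f.write("VALUE = 1\n")
sys.path.insert(0, tmp)

import mod_demo
from mod_demo import VALUE            # copies the binding out of the module

with open(os.path.join(tmp, "mod_demo.py"), "w") as f:
    f.write("VALUE = 2\n")
importlib.reload(mod_demo)

print(mod_demo.VALUE)                 # 2: the module object was updated
print(VALUE)                          # 1: the copied name went stale
```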

I think a well defined package import structure is necessary and once
defined people can work around it depending on their needs.

    PB> because it lives in a separate namespace to the other
    PB> ElementTypes, then I would still be compelled to do a "full"
    PB> import under any import scheme which isn't "really clever".

    PB>   import DOM.ElementTypes.BaseTypes
    PB>   import ElementTypes.BaseTypes # What does this import?

I'd expect that this imports the closest match, depending on where you
do it.  If you do it inside the DOM directory, it would refer to
XMLForms/DOM/ElementTypes/BaseTypes.py.

If you explicitly wanted something else, you'd have to ask for it.
But the default would be:

  (1) first check the current directory; (2) if there is no match, go
  up to the parent; (3) if (2) keeps failing until you are out of the
  package, look for global matches.
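As a sketch only (resolve and its parameters are invented, not a real
Python API), that lookup rule could look like this at the filesystem
level:

```python
import os

def resolve(name, current_dir, package_root, global_dirs):
    """Hypothetical lookup for 'import <name>' following the rule above.

    Returns the directory whose copy of <name> should win, or None.
    """
    head = name.split(".", 1)[0]

    def found_in(d):
        return (os.path.isdir(os.path.join(d, head)) or
                os.path.isfile(os.path.join(d, head + ".py")))

    d = current_dir
    while True:
        if found_in(d):        # (1) current dir, then (2) each parent
            return d
        if os.path.samefile(d, package_root):
            break              # the package root was just checked
        d = os.path.dirname(d)

    for d in global_dirs:      # (3) fall back to the global search path
        if found_in(d):
            return d
    return None
```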


    PB> One could insist that the first statement always be used to
    PB> access the DOM version, and that the second statement always
    PB> accesses the higher-level version, but how is the import
    PB> mechanism supposed to know what you mean?

It depends on the implementation.  Once defined and documented, it can
be used.  Without a specification it's hard to tell what the import
should do.

    PB> So, to summarise, perhaps there's no simple way of introducing
    PB> what you want without introducing some of the consequences
    PB> outlined here.  Explicit package specifiers might help in
    PB> certain cases (the app.py or test.py cases), but they could
    PB> promote a package structure which isn't obvious from browsing
    PB> the filesystem. Unions of packages (which could be done using
    PB> either explicit specifiers or import magic) could potentially
    PB> introduce side-effects as modules start to co-exist with
    PB> modules they were never meant to co-exist with.

There are always consequences to what is done.  But some approaches
seem to make more sense than others.  I think the current approach is
broken, and the more I think about it, the more convinced I am of this.

BTW, is it possible to do what I specified using something like knee
or imputil?

thanks,
prabhu



