How to organize Python files in a (relatively) big project

Giovanni Bajo noway at sorry.com
Wed Oct 19 04:57:25 EDT 2005


TokiDoki wrote:

> At first, I had all of my files in one single directory, but now, with
> the increasing number of files, it is becoming hard to browse my
> directory. So, I would want to be able to divide the files between 8
> directories, according to their purpose. The problem is that it breaks
> the 'import's between my files. And besides, AFAIK, there is no
> easy way to import a file that is not in a subdirectory of the
> current file.

Remember that the directory containing the toplevel script is automatically
placed at the front of sys.path (it does not matter which directory you launch
it from). This means that you can have your structure like this:

main.py
   |
   | - - pkg1
   | - - pkg2
   | - - pkg3
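
To make this concrete, here is a minimal sketch that builds a throwaway copy
of that layout (only main.py and pkg1, with a made-up NAME attribute) in a
temporary directory and runs main.py from a different working directory. The
helper name run_layout_demo is hypothetical, used only for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_layout_demo():
    """Create main.py next to pkg1/ and run it; return what it prints."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # pkg1 is a package sitting in the same directory as main.py.
        (root / "pkg1").mkdir()
        (root / "pkg1" / "__init__.py").write_text("NAME = 'pkg1'\n")
        (root / "main.py").write_text("import pkg1\nprint(pkg1.NAME)\n")
        # Run from another working directory: the import still resolves
        # because the script's own directory is put on sys.path.
        result = subprocess.run(
            [sys.executable, str(root / "main.py")],
            capture_output=True,
            text=True,
            check=True,
            cwd=tempfile.gettempdir(),
        )
        return result.stdout.strip()


print(run_layout_demo())  # prints "pkg1"
```

The same holds for pkg2 and pkg3: as long as they sit next to main.py, any
module in the project can "import pkgN" without touching sys.path by hand.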

Files in any package can import other packages. The usual way is to do "import
pkgN" and then access what you need as attributes ("pkgN.name"). Within each
package, you will have an __init__.py which defines the package API (that is,
the symbols that you can access from outside the package).

Typically, you only want to import *packages*, not submodules. In other words,
try not to do stuff like "from pkg1.submodule3 import Foo", because this breaks
encapsulation (if you reorganize the structure of pkg1, that code will break).
So you'd do "import pkg1" and later "pkg1.Foo", assuming that pkg1/__init__.py
does something like "from .submodule3 import Foo" (the leading dot makes the
import explicitly relative, which Python 3 requires; the bare form "from
submodule3 import Foo" only works in Python 2). My preferred way is to have
__init__.py just do "from .submoduleN import *" for each submodule, and then
each submodule defines __all__ to specify which are its public symbols. This
allows for more encapsulation (each submodule is able to change what it exports
in a self-contained way, and you don't need to modify __init__.py as well).
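
As a rough illustration of the __all__ approach, the sketch below writes a
tiny pkg1 to a temporary directory and imports it. The class Internal and
the helper build_and_import_pkg1 are made up for the example; Foo and
submodule3 are the names used above.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_and_import_pkg1(root):
    """Write a minimal pkg1 under root and return the imported package."""
    pkg = Path(root) / "pkg1"
    pkg.mkdir()
    # The submodule decides what is public by listing it in __all__.
    (pkg / "submodule3.py").write_text(
        "__all__ = ['Foo']\n"
        "class Foo:\n"
        "    pass\n"
        "class Internal:\n"
        "    pass\n"
    )
    # __init__.py re-exports whatever the submodule declares public.
    (pkg / "__init__.py").write_text("from .submodule3 import *\n")
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("pkg1")
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    pkg1 = build_and_import_pkg1(tmp)

print(hasattr(pkg1, "Foo"))       # True: listed in __all__
print(hasattr(pkg1, "Internal"))  # False: not re-exported by the package
```

Callers only ever write "import pkg1" and "pkg1.Foo", so Foo can later move
to another submodule without breaking them.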

Moreover, the dependency graph between packages shouldn't have cycles. For
instance, if pkg3 uses pkg1, pkg1 shouldn't use pkg3. It makes sense to think
of pkg1 as the moral equivalent of a library, which pkg3 uses.
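
If you want to check this rule mechanically, one possible sketch is to write
down which package imports which and let the standard library's graphlib
module (Python 3.9 or later) look for a cycle. The function name
has_import_cycle and the two example graphs are assumptions for illustration;
nothing here inspects real source files.

```python
from graphlib import CycleError, TopologicalSorter


def has_import_cycle(deps):
    """deps maps each package name to the set of packages it imports."""
    try:
        # Consuming static_order() forces the sorter to examine the whole
        # graph; it raises CycleError if the packages cannot be ordered.
        list(TopologicalSorter(deps).static_order())
    except CycleError:
        return True
    return False


# pkg3 uses pkg1, and pkg1 uses nothing: fine.
layered = {"main": {"pkg3"}, "pkg3": {"pkg1"}, "pkg1": set()}
# pkg1 also uses pkg3: this is the loop to avoid.
looped = {"main": {"pkg3"}, "pkg3": {"pkg1"}, "pkg1": {"pkg3"}}

print(has_import_cycle(layered))  # False
print(has_import_cycle(looped))   # True
```
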
-- 
Giovanni Bajo




