organizing your scripts, with plenty of re-use

Sat Oct 10 04:57:08 EDT 2009

On Fri, 09 Oct 2009 16:37:28 -0700, Buck wrote:

> Here's a scenario. A user does a cvs checkout into some arbitrary
> directory and sees this:
> 
> project/
> +-- python/
>     +-- animals.py
>     +-- mammals/
>         +-- horse.py
>         +-- otter.py
>     +-- reptiles/
>         +-- gator.py
>         +-- newt.py
>     +-- misc/
>         +-- lungs.py
>         +-- swimming.py
> 
> These are all runnable scripts that "just work" with no extra effort or
> knowledge, both in the testing scenario above, and for normal users who
> run them from some central location (maybe
> "/tools/mycompany/bin/mammals").
> 
> The frustrating thing, for me, is that all these requirements are met if
> you leave the scripts jumbled into a flat directory.

I bet that's not true. I bet that they Just Work only if the user cd's 
into the directory first. In other words, if you have all your scripts in 
the directory /tools/mycompany/bin/scripts, this will work:

$ cd /tools/mycompany/bin/scripts
$ ./animals.py

but this won't:

$ cd /home/username
$ /tools/mycompany/bin/scripts/animals.py

In the first case, it works because all the modules you need are in a 
directory on sys.path. If the second fails, it will be because they aren't 
in any directory on sys.path. Bear in mind that Python puts the directory 
containing the script, not the current working directory, at the front of 
sys.path, so what matters is how the script is launched.

That's my prediction.
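
A prediction like this is easy to check rather than argue about. Here is a 
minimal sketch, assuming Python 3.7 or later. The two files it writes 
(animals.py and a helper lungs.py with a made-up breathe() function) are 
throwaway stand-ins, not the real scripts. It builds a flat directory, runs 
the script by its full path from a different working directory, and 
returns the exit status, the output and the directory used:

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def run_from_elsewhere():
    """Build a flat script directory, then run a script from another cwd."""
    with tempfile.TemporaryDirectory() as scripts, \
            tempfile.TemporaryDirectory() as elsewhere:
        # lungs.py is a hypothetical helper module sitting next to the script.
        with open(os.path.join(scripts, "lungs.py"), "w") as f:
            f.write("def breathe():\n    return 'breathing'\n")
        # animals.py is the script under test; it imports its sibling module
        # and reports the first entry of its module search path.
        with open(os.path.join(scripts, "animals.py"), "w") as f:
            f.write(textwrap.dedent("""\
                import sys
                import lungs
                print(lungs.breathe())
                print(sys.path[0])
            """))
        # Run the script by full path while the cwd is some other directory.
        result = subprocess.run(
            [sys.executable, os.path.join(scripts, "animals.py")],
            cwd=elsewhere, capture_output=True, text=True,
        )
        return result.returncode, result.stdout, os.path.realpath(scripts)


if __name__ == "__main__":
    returncode, output, scripts_dir = run_from_elsewhere()
    print("exit status:", returncode)
    print("script directory:", scripts_dir)
    print(output, end="")
```

An exit status of 0 means the sibling import succeeded. Comparing the 
printed sys.path[0] with the script directory shows where Python looked.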

> As soon as you
> start organizing things, you need a good amount of boilerplate in each
> script to make things work anything like they did with the flat
> directory.

You shouldn't need that much boilerplate. A little, perhaps, but not that 
much.

Although I have defended the practice of making modules executable, I do 
recognise that for complex packages this becomes difficult quickly. It 
sounds like you would benefit greatly from separating the interface from 
the backend. You should arrange matters so that the users see something 
like this:

project/
+-- animal
+-- mammal
+-- reptile
+-- backend/
    +-- __init__.py
    +-- animals.py
    +-- mammals/
        +-- __init__.py
        +-- horse.py
        +-- otter.py
    +-- reptiles/
        +-- __init__.py
        +-- gator.py
        +-- newt.py
    +-- misc/
        +-- __init__.py
        +-- lungs.py
        +-- swimming.py

where the front end is made up of three scripts "animal", "mammal" and 
"reptile", and the entire backend is in a package. Each front end script 
manages a small amount of boilerplate, something like this:

#!/usr/bin/python
import os, sys

if __name__ == '__main__':
    # find the directory holding this script, following any symlink
    location = os.path.dirname(os.path.realpath(__file__))
    # the backend package sits next to the script; add it to the path
    backend = os.path.join(location, 'backend')
    if backend not in sys.path:
        sys.path.append(backend)

    import animals
    animals.main()

That's not a lot of boilerplate for a script.

The backend modules rely on the path being set up correctly, with the 
backend directory on sys.path. For example, animals.py might do:

import mammals.horse
mammals.horse.ride('like the wind')

Calling the backend modules directly is not supported.
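
To see the whole arrangement working end to end, here is a rough sketch 
that builds a cut-down copy of the layout in a temporary directory and runs 
the "animal" front end from somewhere else. It assumes Python 3.7 or later. 
The front end shown is a variant of the boilerplate that adds the backend 
directory itself to sys.path, and the bodies of animals.py and horse.py 
(including what ride() returns) are invented purely for illustration:

```python
import os
import subprocess
import sys
import tempfile

# Front end: put the backend directory on sys.path, then hand over.
FRONT_END = """\
import os, sys

if __name__ == '__main__':
    location = os.path.dirname(os.path.realpath(__file__))
    backend = os.path.join(location, 'backend')
    if backend not in sys.path:
        sys.path.append(backend)

    import animals
    animals.main()
"""

# Hypothetical file contents, just enough to exercise the imports.
FILES = {
    "animal": FRONT_END,
    "backend/__init__.py": "",
    "backend/animals.py": (
        "import mammals.horse\n"
        "def main():\n"
        "    print(mammals.horse.ride('like the wind'))\n"
    ),
    "backend/mammals/__init__.py": "",
    "backend/mammals/horse.py": (
        "def ride(how):\n"
        "    return 'riding ' + how\n"
    ),
}


def build_and_run():
    """Write the layout to a temp dir and run the front end from another cwd."""
    with tempfile.TemporaryDirectory() as project, \
            tempfile.TemporaryDirectory() as elsewhere:
        for relpath, source in FILES.items():
            path = os.path.join(project, *relpath.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as f:
                f.write(source)
        result = subprocess.run(
            [sys.executable, os.path.join(project, "animal")],
            cwd=elsewhere, capture_output=True, text=True,
        )
        return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run())
```

The front end is the only file that touches sys.path. The backend modules 
just import each other by name, which is why they can't be run on their 
own.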

-- 
Steven