File handling: The easy and the hard way

Hans-Joachim Widmaier hjwidmaier at web.de
Thu Sep 30 09:56:53 EDT 2004


Hi all.

Handling files is an extremely frequent task in programming, so most
programming languages have an abstraction of the basic files offered by
the underlying operating system. This is indeed also true for our language
of choice, Python. Its file type allows some extraordinary convenient
access like:

    for line in open("blah"):
        handle_line(line)

While this is very handy for interactive usage or throw-away scripts, I'd
consider it a serious bug in a "production quality" software. Tracebacks
are fine for programmers, but end users really never should see any.
Especially not when the error is not in the program itself, but rather
just a mistyped filename. (Most of my helper scripts that we use to
develop software handle files this way. And even my co-workers don't
recognize 'file or directory not found' for what it is.) End users are
entitled to error messages they can easily understand like "I could not
open 'blaah' because there is no such file". Graceful error handling is
even more important when a program isn't just run on a command line but
with a GUI.

Which means? Which means that all this convenient file handling that
Python offers really should not be used in programs you give away. When I
asked for a canonical file access pattern some months ago, this was the
result:
http://groups.google.com/groups?hl=de&lr=&ie=UTF-8&threadm=pan.2003.12.30.21.32.37.195763%40web.de&rnum=1&prev=/groups%3Fhl%3Dde%26lr%3D%26ie%3DUTF-8%26q%3Dfile%2Bpattern%2Bcanonical%26btnG%3DSuche%26meta%3D

Now I have some programs that read and write several files at once. And
the reading and writing is done all over the place. If I really wanted to
do it "right", my once clear and readily understandable code turns into a
nightmare. This doesn't look like the language I love for its clarity and
expressivness any more. Python, being a very high level language, needs a
higher level file type, IMHO. This is, of course, much easier said than
done. And renown dimwits like me aren't expected to come up with solutions.
I've thought about subclassing file, but to me it looks like it wouldn't
help much. With all this try/except framing you need to insert a call
level anyway (wondering if this new decorator stuff might help?). The best
I've come up so far is a vague idea for an error callback (if there isn't
one, the well known exceptions might be raised) that gets called for
whatever error occured, like:

class File:
    ...
    def write(self, data):
        while True:
            try:
                self._write(data)
            except IOError, e:
                if self.errorcallback:
                    ret, dat = self.errorcallback(self, F_WRITE, e, data)
                    if ret == F_RETURN:
                        return dat
                else:
                    raise

The callback could then write a nice error message, abort the program,
maybe retry the operation (that's what the 'while True'-loop is for) or
return whatever value to the original caller. Although the callback
function will usually be more than a few lines, it can be reused. It can
even be packed into your own file-error-handling module, something the
original usage pattern can't.

If you still bear with me, you might as well sacrifice a few more seconds
and tell me what you think about my rant. Is everything just fine as it is
now? Or do I have a point? I always felt it most important to handle all
errors a program may encounter gracefully, and the easier this is to do,
the less likely it is a programmer will just sneak around the issue and
let the interpreter/run time system/operating system handle it. (And yes,
I'm guilty of not obeying it myself, as it can double or triple the time
needed to write the whole program; just because its so cumbersome.)

Waiting-for-you-to-jump-on-me'ly yours,
Hans-Joachim



More information about the Python-list mailing list