File handling: The easy and the hard way

Jeremy Jones zanesdad at bellsouth.net
Thu Sep 30 11:15:32 EDT 2004


Hans-Joachim Widmaier wrote:

>Hi all.
>
>Handling files is an extremely frequent task in programming, so most
>programming languages have an abstraction of the basic files offered by
>the underlying operating system. This is indeed also true for our language
>of choice, Python. Its file type allows some extraordinary convenient
>access like:
>
>    for line in open("blah"):
>        handle_line(line)
>
>While this is very handy for interactive usage or throw-away scripts, I'd
>consider it a serious bug in a "production quality" software. 
>
I disagree.  If you get an exception, you get it for a reason.  I'll try 
to elaborate more below.

>Tracebacks
>are fine for programmers, but end users really never should see any.
>  
>
Bushwa.  End users may or may not be able to make sense of them, but it 
doesn't mean they _should_ never see them.

>Especially not when the error is not in the program itself, but rather
>just a mistyped filename. (Most of my helper scripts that we use to
>develop software handle files this way. And even my co-workers don't
>recognize 'file or directory not found' for what it is.) End users are
>entitled to error messages they can easily understand like "I could not
>open 'blaah' because there is no such file". 
>
So, you're saying that dumping a raw traceback like:
IOError: [Errno 2] No such file or directory: '/foo/bar/bam'
to a logfile is a no-no?  Instead, it should say:

I'm sorry sir, but an error occurred when trying to write to file 
/foo/bar/bam because it wasn't there.

I think the traceback is perfectly understandable.  I think that even an 
end-user would be able to comprehend that type of message.  Or, if you 
get an IOError, is this not sufficient:
IOError: [Errno 28] No space left on device
?

Chances are, end users aren't going to be particulary concerned with 
exceptions you log - unless they've got a problem that they can't figure 
out.  And if they've got a problem they can't figure out, you'd be 
better off giving them as much information as you can give them, or 
they'll come to you for help.  And when they do come to you for help, 
you'd better make sure you've given yourself the most informatino you 
can to solve the problem.  So logging a traceback is a great idea IMHO.  
Now, in areas where you're dead sure that an exception is nothing to be 
concerned with, don't bother.  So, a good approach may be: handle the 
specific exceptions that you know may occur, let other unexpected (or 
expected in worst case scenarios) exceptions filter up to a higher 
level, log them there, and if need be, terminate program execution.

>Graceful error handling is
>even more important when a program isn't just run on a command line but
>with a GUI.
>  
>
Maybe so.  But if you hit an "Oh, crap, what do I do now?" exception, 
you may want to throw up a dialog box with a traceback or something and 
when the user clicks OK on it, terminate program execution.  That gives 
them a chance to (unlikely) figure out what they can do to remedy the 
situation, otherwise call for help.

>Which means? Which means that all this convenient file handling that
>Python offers really should not be used in programs you give away. When I
>asked for a canonical file access pattern some months ago, this was the
>result:
>http://groups.google.com/groups?hl=de&lr=&ie=UTF-8&threadm=pan.2003.12.30.21.32.37.195763%40web.de&rnum=1&prev=/groups%3Fhl%3Dde%26lr%3D%26ie%3DUTF-8%26q%3Dfile%2Bpattern%2Bcanonical%26btnG%3DSuche%26meta%3D
>
>Now I have some programs that read and write several files at once. And
>the reading and writing is done all over the place. If I really wanted to
>do it "right", my once clear and readily understandable code turns into a
>nightmare. This doesn't look like the language I love for its clarity and
>expressivness any more. Python, being a very high level language, needs a
>higher level file type, IMHO. This is, of course, much easier said than
>done. And renown dimwits like me aren't expected to come up with solutions.
>I've thought about subclassing file, but to me it looks like it wouldn't
>help much. With all this try/except framing you need to insert a call
>level anyway (wondering if this new decorator stuff might help?). The best
>I've come up so far is a vague idea for an error callback (if there isn't
>one, the well known exceptions might be raised) that gets called for
>whatever error occured, like:
>
>class File:
>    ...
>    def write(self, data):
>        while True:
>            try:
>                self._write(data)
>            except IOError, e:
>                if self.errorcallback:
>                    ret, dat = self.errorcallback(self, F_WRITE, e, data)
>                    if ret == F_RETURN:
>                        return dat
>                else:
>                    raise
>
>The callback could then write a nice error message, abort the program,
>maybe retry the operation (that's what the 'while True'-loop is for) or
>return whatever value to the original caller. Although the callback
>function will usually be more than a few lines, it can be reused. It can
>even be packed into your own file-error-handling module, something the
>original usage pattern can't.
>  
>
Hmmm....interesting.  Shouldn't you put a break after your 
self._write(data)?  This is probably not a bad way of going about 
things, but what types of files are we talking about here?  Log files?  
I think you're probably better off using the builtin logging and just 
dump raw tracebacks in there.  Data files?  Then you've probaby got that 
wrapped in code to write formatted data to the data file anyway in which 
case, this type of specialized class is probably not a bad thing.  If 
you're trying to write data to a data file, you don't want litter it 
with error messages.  You want to log it and, maybe even unlink the data 
file and do something special.

>If you still bear with me, you might as well sacrifice a few more seconds
>and tell me what you think about my rant. Is everything just fine as it is
>now? Or do I have a point? I always felt it most important to handle all
>errors a program may encounter gracefully, and the easier this is to do,
>the less likely it is a programmer will just sneak around the issue and
>let the interpreter/run time system/operating system handle it. (And yes,
>I'm guilty of not obeying it myself, as it can double or triple the time
>needed to write the whole program; just because its so cumbersome.)
>
>  
>
I dunno - something just doesn't feel right here.  I kinda feel like 
you're wanting to create an over-generalized solution.  Your File class 
is interesting and may be a good start for a lot of general solutions 
and having a callback mechanism helps specialize it, but....something 
just doesn't sit totally right here with me.  This may work totally 
perfectly and may be an excellent piece of code to handle all of your 
file writing activities.  I dunno....

You're not going to be able to catch every exception - not meaningfully, 
anyway.  You could do something like:

if __name__ == "__main__":
    try:
        main()
    except Exception, e:
        log(e)

But that isn't handling all errors....

Production quality code doesn't necessarily mean never terminating 
because of an exception.  You want to reduce the frequency of program 
termination due to exceptions.  I can appreciate your desire to make 
sure you've got good solid software, and not encumber the end user with 
ever little exception you hit, but sometimes it's OK to log/show 
exceptions.  Like I said earlier, when you hit an exception, you hit it 
for a reason.  Do your best to try to figure out what that reason is, 
deal with it, figure out the most reasonable thing to do with _that_ 
exception, and move on.  Sometimes that'll mean throwing a traceback to 
a log file, sometimes it will mean handling it gracefully and "prettying 
up" the message for logging or display to the end user, sometimes it 
will mean totally ignoring it, other times you may need to just stop the 
program.  All of these resolutions can be part of a production quality 
piece of software.  The discerning programmer has to decide which 
solution is appropriate for which situation.  Like Steve Holden 
mentioned, it's really good that you're concerned with such things, but 
make sure you apply common sense to each scenario.

>Waiting-for-you-to-jump-on-me'ly yours,
>Hans-Joachim
>  
>
Hope I didn't jump too hard.

Jeremy Jones



More information about the Python-list mailing list