popen hangs sporadically

kj socyl at 987jk.com.invalid
Fri Aug 1 15:10:36 EDT 2008






I have a script that calls the function write_tmpfile, which looks
something like this:

from os import popen  # assumed import; the snippet calls popen unqualified

def write_tmpfile(f, tmpfile):
    # set-up code omitted
    in_f = popen("""grep -v '^\\[eof\\]$' %s |\
                    grep '[^[:space:]]' |\
                    sort -u""" % f)
    out_f = open(tmpfile, 'w')
    try:
        while 1:
            line = in_f.readline()
            if not line: break
            # (code that munges line omitted)
            out_f.write(line)
    finally:
        in_f.close()
        out_f.close()

The script calls this function several thousand times.  (The average
size of the input file f is 70K lines (0.5MB); the maximum size is
about 35M lines, or 200MB.) This function works perfectly most of
the time, but it deadlocks sporadically.  (And it's a deadlock!
The script can be stuck for hours, until I kill it.)

I can't say for sure where the deadlock is happening (and I'd
appreciate suggestions on how to pinpoint this), but I *think* it
is at the in_f.readline() statement.  So maybe the problem is with
the pipe.  But FWIW, I've used exactly the same pipe in another
script that processes the same set of files without writing a
temporary file, and that script terminates without any problem.
I.e., the input files are not too large for the pipe.
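
One possible way to check the readline() guess, sketched here under the
assumption of Python 3.3 or later (where the standard faulthandler module
is available): arm a watchdog before the suspect call and have it dump
every thread's traceback if the call has not returned in time.  The names
watch_for_hang and suspect_call are hypothetical, and the sleep is only a
stand-in for the real write_tmpfile call.

```python
import faulthandler
import tempfile
import time


def watch_for_hang(timeout_s, log_file):
    """Arm a watchdog: unless cancelled within timeout_s seconds, dump the
    traceback of every thread to log_file.  The process keeps running."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=log_file)


def suspect_call():
    # Hypothetical stand-in for the write_tmpfile(f, tmpfile) call.
    time.sleep(0.5)


# faulthandler needs a real file descriptor, so use a temporary file.
with tempfile.TemporaryFile(mode="w+") as log:
    watch_for_hang(0.1, log)
    suspect_call()
    faulthandler.cancel_dump_traceback_later()  # disarm once the call returns
    log.seek(0)
    report = log.read()  # names the function and line that was executing
```

In the real script the watchdog timeout would be far longer than any
legitimate run, and the dump would show whether the stuck frame is the
readline() call or something else.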

I suppose that I could use some timeout mechanism to unwedge the
script when it deadlocks and then repeat the call to write_tmpfile,
but I'd prefer to avoid the deadlock in the first place.
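
For concreteness, the timeout-and-retry fallback could look roughly like
the sketch below.  It assumes Python 3.7 or later on a POSIX system and
swaps popen for the subprocess module; run_with_timeout and its parameters
are hypothetical names, not part of the script above.  The pipeline is
started in its own process group so that a timeout can kill the grep and
sort processes along with the shell.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s, retries=2):
    """Run shell pipeline cmd and return its stdout as text.

    If the pipeline does not finish within timeout_s seconds, kill its
    whole process group and try again, up to `retries` extra attempts.
    """
    for _attempt in range(retries + 1):
        proc = subprocess.Popen(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
            text=True,
            start_new_session=True,  # new session, so pgid == proc.pid
        )
        try:
            out, _ = proc.communicate(timeout=timeout_s)
            return out
        except subprocess.TimeoutExpired:
            # Kill the shell and every process it spawned, then reap it.
            os.killpg(proc.pid, signal.SIGKILL)
            proc.communicate()
    raise RuntimeError("command timed out on every attempt: %r" % (cmd,))
```

Note that this version collects the whole output in memory, which may be
unwelcome for the 200MB inputs; it only illustrates the unwedge-and-repeat
idea and does nothing to explain the hang itself.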

I'd appreciate suggestions on how to troubleshoot and debug this
thing.

TIA!

Kynn

-- 
NOTE: In my address everything before the first period is backwards;
and the last period, and everything after it, should be discarded.


