problems with threaded socket app

Anthony McDonald tonym1972/at/club-internet/in/fr
Mon Sep 15 03:58:50 EDT 2003


"Gordon Messmer" <gordon at dragonsdawn.net> wrote in message
news:pan.2003.09.14.23.17.52.205571 at dragonsdawn.net...
> I've been working on a threaded daemon application to filter email.  The
> source for the program is here:
>
> http://phantom.dragonsdawn.net/~gordon/courier-patches/courier-pythonfilter/
>
> The daemon loads individual filters as modules and hands the names of the
> message and control files to each module in turn for processing.  One of
> the modules (filters/dialback.py) checks the address of the sender,
> connects to the MX servers for the sender's domain, and validates that the
> sender address is valid. In order to implement a timeout on the dialback,
> each message is processed by two threads.  The first thread creates an
> SMTP object and then starts a second thread to do the lookup using that
> SMTP object.  If the lookup takes too long, the first thread closes the
> SMTP object's socket and collects the failure from the second thread.
>
> During testing, that all works fine.  However, in real world use, the
> program eventually deadlocks.  When it does so, there are several dialback
> threads in process, and the first of each pair seems to be reading from
> the status pipe.  I cannot connect a debugger to the second of the pair to
> see what state it's in.
>
> I'm running this application on python2-2.2.2-11.7.3 under Red Hat Linux
> 7.3.
>
> Does anyone have any suggestions for where I can start looking for the
> problem?
>

        if rpipe not in ready_pipes[0]:
            # Time to cancel this SMTP conversation
            smtpi.close()
            # The dialback thread will now write a failure message to
            # its status pipe, and we'll need to clear that out.
            os.read( rpipe, 1024 )
            continue

The code creates a "race" condition. To work correctly it requires the
worker thread to raise and handle an exception, and to write that result
onto the pipe BEFORE your main thread attempts to read the pipe.

If the worker thread loses the race, the next MX result you process will
receive the previous MX's 400 error code, and you're left with one thread
at the end of the sequence that can't terminate, because it stays active
until what it has written to the pipe is read from the pipe.

It's simple enough to fix: add a select call on the status pipe between
closing the SMTP connection and reading the expected 400 error response.
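
A rough, self-contained sketch of that idea follows. It is not taken from
the courier-pythonfilter source: drain_status, demo, and the five second
timeout are hypothetical names and values chosen for illustration, and the
worker here is only a stand-in for the dialback thread. It assumes a Unix
system, where select works on pipe descriptors.

```python
import os
import select
import threading
import time


def drain_status(rpipe, timeout=5.0):
    """Wait until the status pipe is readable, then read from it.

    Returns the bytes read, or None if nothing arrived within `timeout`.
    (Hypothetical helper; the timeout value is an assumption.)
    """
    ready, _, _ = select.select([rpipe], [], [], timeout)
    if rpipe not in ready:
        return None
    return os.read(rpipe, 1024)


def demo():
    rpipe, wpipe = os.pipe()

    def worker():
        # Stand-in for the dialback thread: it notices the closed
        # connection a little late, then reports the failure.
        time.sleep(0.2)
        os.write(wpipe, b"400 connection closed")

    t = threading.Thread(target=worker)
    t.start()
    # In the real code, smtpi.close() would be called at this point.
    status = drain_status(rpipe)
    t.join()
    os.close(rpipe)
    os.close(wpipe)
    return status
```

In the loop quoted above, a call like drain_status(rpipe) would take the
place of the bare os.read after smtpi.close(). A return value of None
would mean the worker never reported back within the timeout, and the
caller would need to decide how to handle that case.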

Anthony McDonald





