[Mailman-Developers] Problems if shunting fails

Mon Feb 26 22:40:51 CET 2007

Barry Warsaw wrote:
>
>I'm sorry I haven't been able to respond to this thread before now,  
>but I've been traveling and PyCon'ing and either haven't had time or  
>have had spotty access to email.

That's OK. I almost went to PyCon myself this year, but didn't make it.
Maybe next time.

>I'm pretty sure we don't want to leave the .psv file in the original  
>queue.  I agree that over time, we'll just end up filling up the  
>queue directory with files we'll never process, wasting time listing  
>them, or worse.  A major reason for having the shunt queue in the  
>first place was to not clutter up our processing queues with messages  
>we couldn't do anything about.  Of course, culling our shunt queue is  
>another issue. :/

I agree that it is not a good thing to leave these in the original
queue. We should somehow move them elsewhere. The only reason for even
thinking about leaving them in the original queue is that a rename
that changes only the extension is almost guaranteed not to fail,
whereas a rename that changes the path could run into a permissions
problem or a nonexistent directory. I don't think these are
significant, though, as the nonexistent directory can be created, and
permissions aren't likely to be an issue if Mailman is actually running.
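
To illustrate the "create the directory, then rename" idea, here is a
rough sketch. It is not Mailman code: the function name move_aside and
its arguments are made up for illustration, and it assumes both
directories are on the same filesystem, since os.rename() can fail
across filesystems.

```python
import os


def move_aside(src_path, dest_dir):
    """Move src_path into dest_dir, creating dest_dir first if needed.

    Hypothetical helper for illustration only; not part of Mailman.
    Returns the new path of the file.
    """
    # Creating the directory up front removes the "nonexistent
    # directory" failure mode; exist_ok avoids an error if it exists.
    os.makedirs(dest_dir, exist_ok=True)
    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    # A permissions problem would still surface here as OSError.
    os.rename(src_path, dest_path)
    return dest_path
```

A caller would still want to catch OSError around this and log it, so
that a failed move does not take the runner down.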

>OTOH, if we're going to move the offending message to the shunt  
>queue, then there's not much point in keeping the .psv extension.  We  
>pretty much always know that if the file is in shunt, it's bad, so  
>maybe we just rename the offender to shunt/blah.txt (assuming it's an  
>unparsed text file).

I'm not so sure about this one. The file may not be unparsed text, and
we may not have the metadata to find out. In any case, we don't want
to put a .bak file in the shunt queue because unshunt will 'recover'
it. I don't think we want that.

>The other thing to consider is adding a configuration variable that  
>lets us limit the size of the files we'll handle.  I can't imagine  
>any scenario under which we'd want to (let alone be able to) handle a  
>message of a half terabyte.  Heck, you have to have a pretty  
>misconfigured mail system to even allow such a message to get to  
>Mailman, IMO.  I think Postfix for example has a 10MB default size  
>limit, and even cranking that up by a factor of 10 should allow most  
>legitimate mail to go through.
>
>So, if we added a size limit configuration variable, we'd have to  
>stat the file (os.path.getsize()) and just os.rename() the file to  
>shunt (with a log message) when we see a file over that size limit.

I think the size limit is a good idea. I think it would have avoided
the MemoryError that started me off on this in the first place.
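
Something along these lines, perhaps. This is only a sketch of the
getsize/rename/log sequence described above. The names
MAX_QUEUE_FILE_SIZE and shunt_if_oversized, the logger name, and the
100MB default are all assumptions for illustration; no such
configuration variable exists in Mailman as far as this thread goes.

```python
import logging
import os

log = logging.getLogger("qrunner.sketch")  # hypothetical logger name

# Hypothetical limit: 100MB, i.e. ten times a 10MB MTA default.
MAX_QUEUE_FILE_SIZE = 100 * 1024 * 1024


def shunt_if_oversized(path, shunt_dir, max_size=MAX_QUEUE_FILE_SIZE):
    """Move path into shunt_dir if it is larger than max_size bytes.

    Returns True if the file was moved, False if it was left alone.
    Illustration only; not Mailman's actual code.
    """
    size = os.path.getsize(path)  # stats the file without reading it
    if size <= max_size:
        return False
    os.makedirs(shunt_dir, exist_ok=True)
    dest = os.path.join(shunt_dir, os.path.basename(path))
    os.rename(path, dest)
    log.error("Shunted oversized file %s (%d bytes > limit %d)",
              dest, size, max_size)
    return True
```

The point of checking the size first is that the file is never read
into memory, so a huge file can't trigger the MemoryError at all.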

I do think we also need to consider the possible exceptions from
dequeue(). Now that we have a .bak file in the case of a
MessageParseError exception, I think it is good to try to put it
somewhere where it can be examined if desired. Also, we had a case of
ValueError being thrown by message_from_string. Of course, the
ValueError was the result of an email bug that's been fixed, but it
may still be possible for message_from_string to throw exceptions
other than MessageParseError, and I think we need to protect against
that.
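
As a rough illustration of what I mean by protecting against other
exceptions, consider the sketch below. It is not the real dequeue():
it assumes, for simplicity, that the .bak file holds plain message
text (which, as noted above, may not be the case), and the function
name parse_or_preserve, the holding directory, and the choice of a
.psv extension for the preserved copy are all assumptions.

```python
import email
import os


def parse_or_preserve(bak_path, holding_dir,
                      parse=email.message_from_string):
    """Parse the text in bak_path; on any failure, move the file aside.

    Returns the parsed message, or None if parsing failed and the file
    was preserved in holding_dir with a .psv extension.
    Hypothetical helper for illustration; not Mailman's dequeue().
    """
    try:
        with open(bak_path, encoding="utf-8", errors="replace") as fp:
            return parse(fp.read())
    except Exception:
        # Deliberately broad: the parser may raise something other
        # than a parse error (ValueError, MemoryError, ...).
        os.makedirs(holding_dir, exist_ok=True)
        base, _ext = os.path.splitext(os.path.basename(bak_path))
        os.rename(bak_path, os.path.join(holding_dir, base + ".psv"))
        return None
```

The preserved file deliberately does not keep the .bak extension, so
that nothing later tries to 'recover' it automatically.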

-- 
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan