[Mailman-Users] 2.1.1 mbox archive doesn't handle lines starting with "From " correctly?

Eric D. Christensen edc at proadmin.com
Wed Apr 16 02:51:43 CEST 2003


I'm seeing problems with the mbox archives after upgrading mailman from
2.0 to 2.1.1. I started to report these as bugs, but then thought that
I'd better check in here first just in case I'm missing something silly.
Apologies for cross posting to both users and developers lists.... 

First, a little background...

I use both pipermail and mbox format archives for all of our lists. I
use the mbox format mainly as a backup so we can regenerate the
pipermail archives (via 'arch --wipe list'). Since some of our lists
have over 5 years of mailman archives now, having the mbox archives
around has saved my butt several times. Too bad servers don't last as
long as the mailing lists running on them! :-)

I'll admit up front that I'm NOT a python programmer.... perl, C, java,
PHP, but not python. So I'm only slightly familiar with the syntax and
totally clueless beyond that. I'm hoping to NOT have to use this problem
as a reason to learn python (though I'd like to someday when I'm not
quite so busy). 

Anyway, here are the two problems I'm seeing with mbox archives:

After upgrading mailman from 2.0 to 2.1.1 (and python itself to 2.2.2) I
regenerated the pipermail archives from the mbox archives and found that
suddenly I had a bunch of messages with "[no subject]", all together
starting just about the time I switched the lists over to 2.1.1. Upon
further investigation I found two issues that make the mbox file invalid
(or at least suspect):

1. No newline before "From " lines in the mbox with 2.1.1.
        Since the 2.1.1 update it appears that the mbox archiver is no
        longer inserting a newline before starting a new message. This
        results in the "From_" line being directly below the last line
        of the previous message. This confuses the mbox parser something
        awful.... it also confuses elm, mutt, and mh if I try to read
        the mbox files with them. 
        
        I found the code in Mailbox.py (in AppendMessage @ line 46) that
        gets called from Archiver/Archiver.py to handle inserting the
        newline if the last thing in the mbox file isn't already a
        newline before appending the message, but it doesn't seem to be
        working correctly. 
        
        Am I missing something or is AppendMessage broken in this
        respect?
        
2. Lines beginning with "From " inside of a message body are not
handled.
        If a line inside a message body starts with the string "From ",
        it is being mis-interpreted as the beginning of a new message
        (i.e. it's being treated as an envelope "From " line).
        
        I'm not quite sure who to fault on this one... I believe that
        it's common practice to somehow quote this case, in which case
        it's Generator that's not doing the right thing. I think this is
        supported by the fact that other mail agents (elm, mutt, etc...)
        are confused by this unquoted "From " in the message body. It's
        probably a bit much to ask utilities like arch to try to discern
        body from envelope on the fly while reading in the mbox file. 
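
To make the two expectations above concrete, here is a rough Python
sketch of the behavior described: pad the file so a blank line precedes
each new envelope "From " line, and quote body lines that begin with
"From " as ">From " (the simple "mboxo" convention). This is NOT
Mailman's code. The names append_to_mbox and quote_from_lines are
hypothetical, and it assumes the headers and body arrive as plain
strings with "\n" line endings.

```python
import os
import re

# Body lines that a reader would mistake for an envelope line.
_FROM_LINE = re.compile(r"^From ", re.MULTILINE)


def quote_from_lines(body):
    """Quote body lines starting with 'From ' as '>From ' (mboxo style)."""
    return _FROM_LINE.sub(">From ", body)


def _tail(path, n=2):
    """Return the last n bytes of the file, or b'' if it does not exist."""
    try:
        with open(path, "rb") as fp:
            fp.seek(0, os.SEEK_END)
            size = fp.tell()
            fp.seek(max(size - n, 0))
            return fp.read(n)
    except FileNotFoundError:
        return b""


def append_to_mbox(path, envelope_line, headers, body):
    """Append one message (hypothetical helper, not a Mailman API).

    Ensures the new envelope line is preceded by a blank line unless
    the file is empty, and quotes 'From ' lines inside the body.
    """
    tail = _tail(path)
    if not tail or tail == b"\n\n":
        pad = ""        # empty file, or already ends with a blank line
    elif tail.endswith(b"\n"):
        pad = "\n"      # last line is terminated; add the blank line
    else:
        pad = "\n\n"    # last line is unterminated; finish it first
    text = pad + envelope_line + "\n" + headers + "\n\n" + quote_from_lines(body)
    if not text.endswith("\n"):
        text += "\n"
    # newline="" keeps "\n" as written, with no platform translation.
    with open(path, "a", encoding="utf-8", newline="") as fp:
        fp.write(text)
```

With something along these lines, a body line such as "From here on it
breaks" would be stored as ">From here on it breaks", so a reader that
splits on "From " at the start of a line would no longer see a bogus
message boundary. Note that this sketch does not handle lines that
already begin with ">From ", which the stricter "mboxrd" convention
also quotes.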

Any insight, pointers or ideas on these before I report them as bugs?

-- 
Eric D. Christensen <edc at proadmin.com>
Proadmin, Inc. - http://www.proadmin.com



