[Mailman-Users] mmarch mbox-splitting weird...

Mark Sapiro mark at msapiro.net
Mon Mar 16 18:12:47 CET 2009


IEM - network operating center (IOhannes m zmoelnig) wrote:
>
>i fixed my hacks, but in order to get the archives right i ran
>"mmarch --wipe ...", and found myself surprised that this did not produce 
>the desired results: the new archives seemed to contain some more emails 
>than the original ones, all of them having "No subject" and appearing in 
>the current archive directory.
>
>it turned out that these new emails were parts of old emails.
>
>the problem seems to be within the parsing of the mbox file: at some (to 
>me) arbitrary points, Mailbox.py would decide that the mail has finished 
>and start a new one; since the new one had no proper header, it ended up 
>as "No subject" (and no author information).


This is a well-known issue. If a message body contains a line beginning
with "From ", bin/arch takes that line as a message separator. Very old
Mailman versions didn't escape the "From " lines in the body.

You need to first clean your .mbox files with Mailman's bin/cleanarch
or some other process to escape the "From " lines that aren't message
separators.
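
To illustrate what "escaping" means here, the following is a minimal
sketch in Python. It is not the bin/cleanarch code. The function name
escape_from_lines and the SEPARATOR pattern are made up for this
example, and the pattern assumes separators look like
"From addr Www Mmm dd hh:mm:ss yyyy"; a real mbox may need a looser or
stricter test.

```python
import re

# Assumed shape of a genuine separator line (hypothetical pattern):
#   From alice@example.com Mon Mar 16 18:12:47 2009
SEPARATOR = re.compile(
    r"^From \S+\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4}\s*$"
)


def escape_from_lines(lines):
    """Prefix '>' to lines that start with 'From ' but are not separators."""
    cleaned = []
    for line in lines:
        if line.startswith("From ") and not SEPARATOR.match(line):
            # A body line that would otherwise be read as a new message.
            cleaned.append(">" + line)
        else:
            cleaned.append(line)
    return cleaned
```

Fed the lines of an .mbox file, this leaves real separators alone and
turns a body line such as "From now on, ..." into ">From now on, ...",
so the archiver no longer starts a new message there.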

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


