[Mailman-Users] mbox files

Barry Finkel b19141 at britaine.ctd.anl.gov
Mon Dec 4 18:35:38 CET 2006


Quoting Mark Sapiro (msapiro at value.net):
>> Paul Tomblin wrote:
>> >slowed my computer down to a crawl.  I gave up and used the mbox splitter
>> >awk program I found in the list archives and I'm now building the archives
>> >500 messages at a time.  Hope that works.
>> 
>> 
>> It should.
>> 
>> Also, you can effectively do the same thing without breaking up the
>> mbox by using the --start= and --end= options on bin/arch. See
>> 
>>  bin/arch --help

Paul Tomblin <ptomblin at xcski.com> replied:

>Is there any way to make arch smarter about "^From " lines?  First pass
>through the archive, I ended up with a bazillion messages in the
>archive for today, all with "No subject" because it was treating any
>line like "^From " as the start of a message.  It would be nice if it
>recognized the difference between real mbox start-of-message "^From "
>and just random lines from some list member.

I was under the impression that ANY line in an mbox file that begins
with "^From " is the start of a new message.  That is why mailers
change a mail body line "^From " to "^>From ".  Is there an mbox
standard?  What you could do is write a script to process the
"corrupted" mbox file.  It would write non-"^From " lines intact, but
it would parse the "^From " lines to determine whether they are the
start of new messages or just plain mail body lines.  Body lines would
be re-written with an initial ">".
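
A minimal sketch of such a script, in Python, might look like the
following.  It is a heuristic, not a standard: it assumes a genuine
separator has the form "From <sender> <asctime-style date>" and is
preceded by a blank line (or is the first line of the file).  The
names SEPARATOR, escape_body_from_lines and fix_mbox are made up for
this illustration.  Mboxes whose separator lines carry a time zone or
some other date layout would need a different pattern.

```python
import re

# Assumed shape of a genuine separator line, for example:
#   From alice@example.com Mon Dec  4 18:35:38 2006
SEPARATOR = re.compile(
    r"^From \S+\s+"
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"[ \d]\d \d\d:\d\d:\d\d \d{4}\s*$"
)


def escape_body_from_lines(lines):
    """Yield lines, rewriting body lines that start with "From " as ">From "."""
    previous_blank = True  # the start of the file counts as a boundary
    for line in lines:
        if line.startswith("From "):
            # Keep the line only if it looks like a real separator
            # and follows a blank line; otherwise treat it as body text.
            if not (previous_blank and SEPARATOR.match(line)):
                line = ">" + line
        previous_blank = line.strip() == ""
        yield line


def fix_mbox(src_path, dst_path):
    """Copy src_path to dst_path, escaping stray "From " body lines."""
    # latin-1 maps every byte to one character, so arbitrary message
    # bytes survive the round trip; newline="" keeps line endings as-is.
    with open(src_path, encoding="latin-1", newline="") as src, \
         open(dst_path, "w", encoding="latin-1", newline="") as dst:
        dst.writelines(escape_body_from_lines(src))
```

Running fix_mbox("list.mbox", "list.fixed.mbox") (both file names are
placeholders) writes a cleaned copy and leaves the original untouched,
so the result can be checked before bin/arch is run against it.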
----------------------------------------------------------------------
Barry S. Finkel
Computing and Information Systems Division
Argonne National Laboratory          Phone:    +1 (630) 252-7277
9700 South Cass Avenue               Facsimile:+1 (630) 252-4601
Building 222, Room D209              Internet: BSFinkel at anl.gov
Argonne, IL   60439-4828             IBMMAIL:  I1004994


