[Mailman-Users] "No subject" messages in archives

Ivan Van Laningham ivanlan at pauahtun.org
Sun May 20 13:41:14 CEST 2007


Hi All--

Mark Sapiro wrote:
> Ivan Van Laningham wrote:
>> But I have one list for which I used archives from two previous 
>> incarnations of the list, plus the current archive mbox, as input to 
>> arch.  I made sure that the previous archives were in mbox format and 
>> that they contained only one "From " line per message.
> 
> 
> Are you sure? Did you run bin/cleanarch against the .mbox file to check
> it?
> 

I ran cleanarch, yes, but all it did was to escape every single "From " 
line, which would make arch think there was only one message.


> 
> This usually results from a message containing an embedded "From "
> somewhere in the message body. The message is archived properly under
> its correct date and subject, but that entry is truncated at the line
> that begins with "From ". Then the rest of the message is archived as
> a separate message. Since it has no From:, Subject: or Date: headers,
> it is archived with the current date and no subject. Also, text
> following the "From " up to the first totally empty (not just blank)
> line is considered part of the header and is not archived with this
> 'second' message.
> 
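
To make the mechanism described above concrete, here is a minimal Python
sketch of a naive splitter. It is not Mailman's or pipermail's actual code;
the function name split_mbox and the sample message are made up for
illustration. It only assumes what the explanation says: every line starting
with "From " opens a new message, and the header block runs up to the first
totally empty line.

```python
def split_mbox(text):
    """Split mbox text into (headers, body_lines) pairs.

    Illustrative only: every line starting with "From " opens a new
    message, whether or not it is a real separator.
    """
    chunks, current = [], None
    for line in text.splitlines():
        if line.startswith("From "):
            current = []
            chunks.append(current)
        elif current is not None:
            current.append(line)

    messages = []
    for lines in chunks:
        # The header block ends at the first totally empty line;
        # a line holding only spaces does not count.
        try:
            cut = lines.index("")
        except ValueError:
            cut = len(lines)
        headers = {}
        for header_line in lines[:cut]:
            name, sep, value = header_line.partition(":")
            if sep:
                headers[name.strip().lower()] = value.strip()
        messages.append((headers, lines[cut + 1:]))
    return messages


# Hypothetical message whose body contains an unescaped "From " line.
sample = "\n".join([
    "From alice@example.com Sun May 20 13:41:14 2007",
    "From: alice@example.com",
    "Subject: Hello",
    "Date: Sun, 20 May 2007 13:41:14 +0200",
    "",
    "First line of the body.",
    "From what I can tell, this line splits the message.",
    "this text is swallowed as header",
    "",
    "Only this part shows up in the second entry.",
])

messages = split_mbox(sample)
for headers, body in messages:
    print(headers.get("subject", "No subject"), body)
```

The first entry keeps its subject but loses everything from the embedded
"From " line onward; the second entry has no Subject: or Date: header, and
the text before its first empty line is treated as header and dropped.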

That would describe what I'm seeing, except that--
>  
> If there is any message body text in the 'No subject' archived entry,
> you should be able to find that in the .mbox.
> 

Right, but there are 5,000 entries with "No subject" and no body, not a 
hint of a body.

> 
>> The _only_ thing I can see, in the current mbox, 
>> is that the end of the last message from the old archives ends on one 
>> line and the "From " line for the next message begins on the very next 
>> line, with no blank lines between,
> 
> 
> That shouldn't cause this.
> 

Good to know.

> 
>> and everywhere else there are either 
>> one or more blank lines or one of those message separator lines from 
>> AOL:
>> "----------MB_8C9379FAFA8ECEC_DAC_6C2A_WEBMAIL-MC05.sysops.aol.com--"
>> These bogus entries aren't really hurting anything, I suppose, but they 
>> are annoying and it is irritating to have to scroll down 5000 lines to 
>> get to the next real message.
> 
> 
> Actually they are hurting something, because they represent missing
> pieces of other messages.
> 

How to track them down?

> 
>> What is causing this?  And is there anything I can do to get rid of the 
>> problem?  I am willing to live with it if I have to, but I would prefer 
>> having a fix.
> 
> 
> I think you have unescaped "From " lines in the bodies of messages. Run
> bin/cleanarch (with the -n/--dry-run option) to check.
> 
> Another possibility is you have real looking but extraneous
> (duplicate?) "From " lines not followed by a real message with
> Subject: and Date: headers prior to the next "From ".
> 

Do lines beginning with whitespace before a From count?  There are about 
a hundred of those in the input mbox.
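
One rough way to check both cases is a small Python scan of the mbox. This is
only a sketch under assumptions: the function name scan_from_lines and the
sample lines are invented, and it assumes that only a "From " at the very
start of a line acts as a separator, so indented ones are listed separately
instead of being treated as splits. It reports the line numbers of "From "
lines that are not followed by both Subject: and Date: headers before the
next empty line or the next "From ".

```python
def scan_from_lines(lines):
    """Return (suspect, indented) lists of 1-based line numbers.

    suspect:  lines starting with "From " whose following header block
              lacks a Subject: or a Date: header.
    indented: lines where "From " appears only after leading whitespace.
    """
    lines = [line.rstrip("\r\n") for line in lines]
    suspect, indented = [], []
    for i, line in enumerate(lines):
        if line.startswith("From "):
            seen = set()
            j = i + 1
            # Collect header names up to the first totally empty line
            # or the next "From " line, whichever comes first.
            while (j < len(lines) and lines[j] != ""
                   and not lines[j].startswith("From ")):
                seen.add(lines[j].partition(":")[0].lower())
                j += 1
            if not {"subject", "date"} <= seen:
                suspect.append(i + 1)
        elif line.lstrip().startswith("From "):
            indented.append(i + 1)
    return suspect, indented


# Hypothetical input; for a real archive, pass an open file object instead.
sample = [
    "From alice@example.com Sun May 20 13:41:14 2007",
    "Subject: Hello",
    "Date: Sun, 20 May 2007 13:41:14 +0200",
    "",
    "Body text.",
    "From here on the message is cut off.",
    "",
    "  From the indented line nothing is split.",
]

suspect, indented = scan_from_lines(sample)
print("suspect separators at lines:", suspect)
print("indented From lines at:", indented)
```

The suspect line numbers can then be looked up in the .mbox by hand to decide
whether each one needs escaping or removing.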

Metta,
Ivan
-- 
Ivan Van Laningham
God N Locomotive Works
http://www.pauahtun.org/
http://www.python.org/workshops/1998-11/proceedings/papers/laningham/laningham.html
Army Signal Corps:  Cu Chi, Class of '70
Author:  Teach Yourself Python in 24 Hours

