[Mailman-Users] Moved list, archiving

G. Armour Van Horn vanhorn at whidbey.com
Wed Jan 31 01:16:02 CET 2007


Paul Tomblin wrote:

>Quoting G. Armour Van Horn (vanhorn at whidbey.com):
>
>>After moving the list to its new home and running the script to update 
>>the archive, I ended up with a raft of messages in the January 2007 
>>archive that are probably ancient. They show no subject, and all of them 
>>are dated this afternoon, probably at the time that I ran the script. Is 
>>there any safe way to clear those out?
>
>That happened to me when I moved my archives because I had old messages
>that had an "unescaped" "From " line in the body.  I guess there was a
>time when pipermail didn't put a ">" in front of the word "From " in the
>body of a message, and so when I ran "arch" on that mbox I got a lot of
>gibberish messages dated today.  The user contributed program "cleanarch"
>can help fix up some (but not all) of those and I had to use sed to fix
>the rest.  Another problem I ran into was some messages that came around
>1 Jan 2000 that had a date of 1 Jan 100.  I also discovered some very old
>messages that had a header line of
>Content-Type: TEXT/PLAIN; charset=".chrsc"
>which confused arch as well.  It wasn't until I fixed all
>of these problems that I was able to finally run arch in a way that built
>good archives.
>
>  
>
I spoke too soon. I got a lot of this:

#Unix-From line changed: 175609
 From the wire service copy:
#######Unix-From line changed: 176324
 From the MM press release:
##########################Unix-From line changed: 178901
 From a designers view I think FW is the most powerful tool. I designed
######Unix-From line changed: 179571
 From my web site:
Unix-From line changed: 179573
 From my experience, there is no specific palette grouping that causes Pal to

(I had used the "-s 100" option to output a # every hundred lines.) 
Every case cleanarch came upon was a valid bit of text inside a message. 
Then I went and looked at the actual output, and saw that cleanarch had 
prepended a ">" to the lines that were part of running text, so I 
renamed files so the output from cleanarch was the live file and ran 
arch again.
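
For anyone following along, the escaping step can be sketched roughly as below. This is not cleanarch's actual code, just a minimal illustration of the idea in Python 3 (the Mailman scripts of this era run on Python 2). The UNIX_FROM pattern and the escape_body_from_lines name are my own assumptions about what a real separator line looks like, not anything taken from Mailman.

```python
import re

# Assumed shape of a genuine mbox separator, e.g.
#   "From vanhorn@whidbey.com Wed Jan 31 01:16:02 2007"
# This pattern is an illustration, not the one cleanarch uses.
UNIX_FROM = re.compile(
    r"^From \S+\s+\w{3}\s+\w{3}\s+\d{1,2}\s+"
    r"\d{1,2}:\d{2}(:\d{2})?(\s+\S+)?\s+\d{4}\s*$"
)


def escape_body_from_lines(lines):
    """Prefix '>' to 'From ' lines that do not look like separators.

    Returns the fixed lines and the 1-based numbers of changed lines.
    """
    fixed, changed = [], []
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("From ") and not UNIX_FROM.match(line):
            fixed.append(">" + line)  # running text, so escape it
            changed.append(lineno)
        else:
            fixed.append(line)  # separator or ordinary line, keep as-is
    return fixed, changed
```

Fed the lines of an mbox, this leaves real separators alone and turns a body line such as "From my web site:" into ">From my web site:".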

I think it may have made things worse. It looks like the same messages 
that were there before still ended up in the January archive. They still 
have date tags based on the time of running arch for the first time on 
the new machine yesterday afternoon. These dates are not found in the 
mbox file.

Looking at the messages in the January archive, it looks like there are 
only about 25 messages, not really a huge task to go back and repair 
manually. The question then becomes, what do I need to do to the mbox 
file so that arch will know where to actually break things, and do I 
need to do anything special to make sure that the messed up archive 
elements are no longer present?
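
One way to narrow down the candidates might be to list the messages in the mbox that have no usable Date header. This is only a sketch, on the assumption (mine, not confirmed) that arch falls back to the time of the run when it cannot find a date it can parse. It is Python 3 using the standard mailbox and email modules, and the function names are hypothetical.

```python
import email
import mailbox
from email.utils import parsedate_tz


def has_usable_date(msg):
    """True if the message has a Date header that parses."""
    value = msg.get("Date")
    return value is not None and parsedate_tz(str(value)) is not None


def find_undated(messages):
    """Return (index, subject) for each message lacking a usable Date."""
    return [
        (i, msg.get("Subject", "(no subject)"))
        for i, msg in enumerate(messages)
        if not has_usable_date(msg)
    ]


def find_undated_in_mbox(path):
    """Same check over an mbox file; 'path' is whatever your list uses."""
    return find_undated(mailbox.mbox(path, create=False))
```

The indexes it reports count messages from zero in the order they appear in the file, which should make the broken spots easier to find in an editor.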

Van



-- 
----------------------------------------------------------
Sign up now for Quotes of the Day, a handful of quotations
on a theme delivered every morning.
Enlightenment! Daily, for free! 
mailto:twisted at whidbey.com?subject=Subscribe_QOTD

For photography, web design, hosting, and maintenance, 
visit Van's home page: http://www.domainvanhorn.com/van/
-----------------------------------------------------------