[Mailman-Users] Text "Reappears" when MBox Archive Rebuilt

Dave Arndt dave at 3rdValve.net
Mon Nov 30 17:37:15 EST 2015


In this case, there are NO existing HTML files.  It's being rebuilt on a
brand new installation.

I also just heard from the person managing the site that they did use the
--wipe option (so I guess that's all moot).

The mystery is this:  How is it possible to edit out text from an mbox
file, verify that it is NOT there with grep, then see it reappear in the
resulting html file when bin/arch is run?

It's almost as if editing the file with VI left the original text and only
hid it with escape sequences or something.

Whatever it is... the mbox file that got uploaded to the new site HAD to
have had the original text, even though that text does not show up
when running grep against that same mbox file (and is also not visible when
editing the same file with VI)...

Strange.
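
One possibility, not confirmed anywhere in this thread: the text may still be present in a MIME part that is stored encoded (base64, or quoted-printable with soft line breaks splitting the phrase), where grep for the literal string and a visual check in an editor would both miss it, while the archiver decodes the part before writing HTML. A minimal Python sketch to test that assumption searches the decoded body of every part in the mbox. The function name `find_text_in_mbox` is hypothetical; only the standard `mailbox` module is used.

```python
import mailbox


def find_text_in_mbox(mbox_path, needle):
    """Return (message index, content type, transfer encoding) for every
    non-multipart MIME part whose *decoded* body contains needle."""
    hits = []
    box = mailbox.mbox(mbox_path)
    try:
        for index, message in enumerate(box):
            for part in message.walk():
                if part.is_multipart():
                    continue  # containers have no body of their own
                # decode=True undoes base64 / quoted-printable encoding
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    text = payload.decode(charset, errors="replace")
                except LookupError:
                    # unknown charset label: fall back to UTF-8
                    text = payload.decode("utf-8", errors="replace")
                if needle in text:
                    encoding = str(
                        part.get("Content-Transfer-Encoding", "7bit")
                    ).lower()
                    hits.append((index, part.get_content_type(), encoding))
    finally:
        box.close()
    return hits
```

If this reports a hit in a part whose encoding is `base64` or `quoted-printable`, that part would have to be edited in its decoded form (or removed) for the text to stay out of the rebuilt archive.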

On Mon, Nov 30, 2015 at 5:03 PM, Mark Sapiro <mark at msapiro.net> wrote:

> On 11/30/2015 01:22 PM, Dave Arndt wrote:
> >
> > I did not do the rebuild myself - but I would assume they just ran
> > bin/arch
> >
> > How would the text re-appear if it was removed, as per step #1?
> >
> > In other words, how would the text still be in the file after removing
> > it, when it doesn't appear with grep?
>
>
> If you have an existing HTML archive and the corresponding .mbox, and
> you run bin/arch without --wipe, every message in the mbox will be added
> to the HTML archive, but they won't be indexed because the Message-IDs
> are duplicates.
>
> For example if there are a total of 10 messages in the archive, the HTML
> messages will have names like 000000.html, 000001.html, ...,
> 000009.html. If you then run bin/arch without --wipe, you will add files
> 000010.html, 000011.html, ..., 000019.html which may or may not be a bit
> different if you modified the mbox. Now, when the archiver adds, say,
> 000010.html, its Message-ID is the same as that of 000000.html, so it
> won't be indexed and the index will still point to 000000.html.
>
> The answer is: if you want to rebuild an archive and not just add to it,
> you have to use --wipe to remove the existing HTML archive before adding.
>
> --
> Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
> San Francisco Bay Area, California    better use your sense - B. Dylan
>
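
The numbering and indexing behaviour Mark describes above can be sketched as a toy model in Python. This is only an illustration of his description, not the archiver's actual code; the function `archive_messages` and the sample Message-IDs are hypothetical.

```python
def archive_messages(message_ids, html_files=None, index=None):
    """Toy model: write one sequentially numbered HTML file per message,
    but index a Message-ID only the first time it is seen."""
    html_files = [] if html_files is None else html_files
    index = {} if index is None else index
    for message_id in message_ids:
        filename = "%06d.html" % len(html_files)
        html_files.append(filename)
        # A duplicate Message-ID keeps its original index entry.
        index.setdefault(message_id, filename)
    return html_files, index


# Ten messages archived once, then archived again without wiping first.
ids = ["<msg%d@example.com>" % i for i in range(10)]
files, index = archive_messages(ids)
files, index = archive_messages(ids, files, index)

print(len(files))     # 20 HTML files now exist
print(files[10])      # 000010.html, the second copy of the first message
print(index[ids[0]])  # 000000.html, the index still points at the old file
```

Starting the second run from empty `html_files` and `index` corresponds to running with --wipe: only the ten files built from the current mbox would exist.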

