[Mailman-Users] UnicodeDecodeError with Mailman 2.1 and Python 2.6

Mark Sapiro mark at msapiro.net
Wed Sep 2 03:02:19 CEST 2015


On 09/01/2015 12:16 PM, David Magda wrote:
> On Tue, September 1, 2015 14:35, Stephen J. Turnbull wrote:
>> Mark Sapiro writes:
>>
>>  > I don't know what you are grepping, but if it's the mbox, you shouldn't
>>  > be looking for "\xea", you should be looking for "ê".
>>
>> At least on recent BSD-based systems "\xea" is a well-defined escape
>> sequence, interpreted as the hexadecimal representation of a byte.
>> Dunno about GNU or proprietary systems.  (POSIX.2)
> 
> This is GNU grep under Debian.


In my testing with GNU grep on Ubuntu 15.04, 'grep "\xea"' interprets \x
as a literal x and therefore looks for the string "xea", not for the
character whose hex value is EA.
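
Because escape handling differs between grep implementations, one way to sidestep it is to search for the raw byte directly. The following is a minimal sketch in Python 3, not anything from Mailman itself. The function name lines_with_byte and the sample data are made up for illustration, and it assumes the mbox stores the character as the single byte 0xEA (as in Latin-1); a UTF-8 encoded "ê" would instead be the two bytes 0xC3 0xAA.

```python
def lines_with_byte(data, byte_value=0xEA):
    """Return (line_number, line) pairs for lines containing the given byte.

    `data` is the raw content of a file read in binary mode, so no
    decoding (and no shell or regex escape handling) is involved.
    """
    needle = bytes([byte_value])
    hits = []
    for number, line in enumerate(data.splitlines(), start=1):
        if needle in line:
            hits.append((number, line))
    return hits


# Hypothetical sample: line 2 has the raw byte 0xEA, line 3 only has
# the literal text "xea", which must not match.
sample = b"Subject: plain\nSubject: f\xeate\nSubject: xea\n"
print(lines_with_byte(sample))
```

For a real archive you would pass in the bytes from open(path, "rb").read(), where path is wherever your list's mbox lives.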


> We are running 2.1.13 from tarballs and so the Mailman code did not change
> when the archive web page generation stopped. The only thing that changed
> was the version of Python (2.5 -> 2.6?) under the OS.
> 
> Doing a "arch --wipe mylist" seems to have solved the issue, though now
> I'm curious to know why \xea was a problem before but suddenly isn't after
> the wipe.


Here's what I suspect was going on.

Your first run of bin/arch encountered some non-ascii in a header and
threw the exception, but not before writing bad data to the pipermail
database for that month.

You then "fixed" the non-ascii in the input mbox, but subsequent runs of
bin/arch still encountered the bad data in the database when they got to
that month.

Finally, you added the --wipe option, which removed everything and
rebuilt from scratch, and as there was no non-ascii in the mbox headers,
it worked.
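
To check whether an mbox still has non-ascii bytes in its headers before rebuilding, something along these lines could be used. This is only a rough sketch in Python 3 and is not how bin/arch parses messages. The function name non_ascii_header_lines is made up, and it assumes the simple mbox layout where each message starts with a "From " separator line, the headers end at the first blank line, and "From " lines inside bodies have been escaped.

```python
def non_ascii_header_lines(mbox_bytes):
    """Return (line_number, line) pairs for header lines with bytes > 0x7F.

    Assumes each message begins with a "From " separator line and that
    its headers run until the first blank line.
    """
    hits = []
    in_headers = False
    for number, line in enumerate(mbox_bytes.split(b"\n"), start=1):
        if line.startswith(b"From "):
            # Separator line: the lines that follow are headers.
            in_headers = True
            continue
        if in_headers and line.strip() == b"":
            # Blank line ends the header block; the body follows.
            in_headers = False
            continue
        if in_headers and any(byte > 0x7F for byte in line):
            hits.append((number, line))
    return hits


# Hypothetical two-message mbox: only the first Subject header is
# reported; the 0xEA byte in the first message's body is ignored.
sample = (
    b"From a@example.com Tue Sep  1 00:00:00 2015\n"
    b"Subject: f\xeate\n"
    b"\n"
    b"body with \xea\n"
    b"\n"
    b"From c@example.com Tue Sep  1 00:00:00 2015\n"
    b"Subject: ok\n"
    b"\n"
    b"hi\n"
)
print(non_ascii_header_lines(sample))
```

An empty result for the mbox would be consistent with the clean rebuild described above, though it says nothing about what is already stored in the pipermail database.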

As to why this didn't happen before, see my next reply.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

