[Mailman-Users] Mailman archive creation problem

Mark Sapiro mark at msapiro.net
Wed Sep 21 02:30:54 EDT 2016


On 09/20/2016 08:36 PM, Limperis Antonis wrote:

> Traceback (most recent call last):
>   File "/var/mailman/bin/arch", line 201, in <module>
>     main()
>   File "/var/mailman/bin/arch", line 189, in main
>     archiver.processUnixMailbox(fp, start, end)
>   File "/var/mailman/Mailman/Archiver/pipermail.py", line 596, in processUnixMailbox
>     self.add_article(a)
>   File "/var/mailman/Mailman/Archiver/pipermail.py", line 640, in add_article
>     author = fixAuthor(article.decoded['author'])
>   File "/var/mailman/Mailman/Archiver/pipermail.py", line 63, in fixAuthor
>     while i>0 and (L[i-1][0] in lowercase or
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xdc in position 26: ordinal not in range(128)


Pipermail is trying to canonicalize the display name in the From: header
of a message into "Last, First" form, and to do that it checks whether
the initial character of each "word" in the name is in the locale's
string of lowercase characters. At this point the name is a unicode
object, so Python has to decode the "lowercase" byte string to unicode
for the comparison. For some reason, the "lowercase" string appears to
be encoded as iso-8859-7 (byte 0xdc is a Greek lowercase letter in that
charset), but the decoding is being done as if it were ascii.
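
A minimal sketch of that failure, written for Python 3, where the decode
has to be spelled out (Python 2 did it implicitly when a unicode object
was compared with a byte string). The name lowercase_bytes is a
hypothetical stand-in for the locale's "lowercase" string, not its real
contents; only the 0xdc byte at position 26 is taken from the traceback.

```python
import string

# Assumed stand-in for the locale-dependent "lowercase" string:
# 26 ASCII letters followed by one ISO-8859-7 byte, 0xdc.
lowercase_bytes = string.ascii_lowercase.encode("ascii") + b"\xdc"


def implicit_decode(raw, codec="ascii"):
    """Mimic the byte-string-to-unicode coercion done for the comparison."""
    return raw.decode(codec)


# Decoding as ascii fails on the first byte outside the 0-127 range.
try:
    implicit_decode(lowercase_bytes)
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding with the charset the bytes are actually in succeeds.
greek = implicit_decode(lowercase_bytes, "iso-8859-7")

print(error_message)
print(repr(greek[-1]))
```

The position reported in the error is consistent with 26 ASCII letters
preceding the first Greek one.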


> I found that the reconstruction works properly if I set the locale to el_GR.utf8 with "export LC_ALL=el_GR.utf8".
> The original system locale environment was:
> 
> LANG=el_GR.ISO8859-7
> LC_CTYPE=el_GR.ISO8859-7
> LC_NUMERIC=el_GR.ISO8859-7
> LC_TIME=el_GR.ISO8859-7
> LC_COLLATE=el_GR.ISO8859-7
> LC_MONETARY=el_GR.ISO8859-7
> LC_MESSAGES=C
> LC_PAPER="el_GR.ISO8859-7"
> LC_NAME="el_GR.ISO8859-7"
> LC_ADDRESS="el_GR.ISO8859-7"
> LC_TELEPHONE="el_GR.ISO8859-7"
> LC_MEASUREMENT="el_GR.ISO8859-7"
> LC_IDENTIFICATION="el_GR.ISO8859-7"
> LC_ALL=


I'm guessing that Python is confused because most of the locale settings
are "el_GR.ISO8859-7", but LC_ALL is empty. In any case, it appears you
have solved the problem with "export LC_ALL=el_GR.utf8".
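
For reference, here is a rough sketch of the POSIX lookup order for a
locale category: LC_ALL first, then the category's own variable, then
LANG, with an empty value treated as unset. The helper effective_locale
is hypothetical and only models the environment variables. It is not
what Python or Mailman actually call.

```python
def effective_locale(env, category="LC_CTYPE"):
    """Return the locale name POSIX precedence would pick for a category."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # an empty string counts as unset
            return value
    return "C"  # POSIX default when nothing is set


# A subset of the environment quoted above.
original = {
    "LANG": "el_GR.ISO8859-7",
    "LC_CTYPE": "el_GR.ISO8859-7",
    "LC_MESSAGES": "C",
    "LC_ALL": "",
}
# The same environment after "export LC_ALL=el_GR.utf8".
fixed = dict(original, LC_ALL="el_GR.utf8")

before = effective_locale(original)
after = effective_locale(fixed)
messages_after = effective_locale(fixed, "LC_MESSAGES")

print(before, after, messages_after)
```

Under that order, setting LC_ALL overrides every category at once,
including LC_MESSAGES.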

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

