[Mailman-Users] Uncaught runner exception
Mark Sapiro
mark at msapiro.net
Sat Mar 2 00:00:55 EST 2019
On 3/1/19 9:15 AM, Mark Sapiro wrote:
> On 2/28/19 5:26 AM, Lothar Schilling wrote:
>> Hi everybody,
>>
>> a few weeks ago I upgraded from 2.1.16 (as far as I can remember...) to
>> 2.1.29. Everything seemed to work fine at first. But then I found out
>> that a lot of posts - actually far more than half of them - aren't
>> archived any longer. What logging the errors tells me is this:
>>
>> Feb 28 12:29:02 2019 (3123) Uncaught runner exception: 'ascii' codec
>> can't decode byte 0xb5 in position 26: ordinal not in range(128)
>> Feb 28 12:29:02 2019 (3123) Traceback (most recent call last):
>> File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 119, in _oneloop
>> self._onefile(msg, msgdata)
>> File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 190, in _onefile
>> keepqueued = self._dispose(mlist, msg, msgdata)
>> File "/usr/lib/mailman/Mailman/Queue/ArchRunner.py", line 77, in _dispose
>> mlist.ArchiveMail(msg)
>> File "/usr/lib/mailman/Mailman/Archiver/Archiver.py", line 216, in
>> ArchiveMail
>> h.processUnixMailbox(f)
>> File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 596, in
>> processUnixMailbox
>> self.add_article(a)
>> File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 640, in
>> add_article
>> author = fixAuthor(article.decoded['author'])
>> File "/usr/lib/mailman/Mailman/Archiver/pipermail.py", line 63, in
>> fixAuthor
>> while i>0 and (L[i-1][0] in lowercase or [error message stops right
>> here]
>>
>> As I read in a previous thread the reason for this may be non-ascii
>> compliant characters in the post, especially the "from:"-line. But why
>> would Python or Mailman now all of a sudden use ASCII instead of UTF-8
>> in the first place? And if so: How can I change that behaviour?
>
>
> Yes, this is due to non-ascii in the display name portion of the From:
> header. I'm investigating a fix, but I'm not sure if this is an RFC 2047
> encoded header or if the raw header contains non-ascii. If the latter,
> the message is non-compliant - RFC 5322 and predecessors require all raw
> headers to contain only ascii characters.
>
>
> As far as "all of a sudden use ASCII" is concerned. Mailman's character
> set for English has always been ascii, and for German, iso-8859-1.
>
I am unable to duplicate this with 3 tests: a message with non-ascii
utf-8 characters in the display name in a raw From:, a message with
non-ascii iso-8859-1 characters in the display name in a raw From: and a
message with non-ascii iso-8859-1 characters in an RFC 2047 encoded
display name in From:.
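
For anyone wanting to build similar test cases, the three kinds of From: header can be sketched roughly as follows. This is only an illustration in Python 3 using the standard email.header module, not the exact messages used in the tests; the display name, the address and the is_ascii helper are made up for the example.

```python
from email.header import Header, decode_header

# Hypothetical display name and address, used only for illustration.
display_name = "J\u00f6rg M\u00fcller"      # "Jörg Müller"
address = b" <user@example.com>"

# Case 1: raw utf-8 bytes in the header (non-compliant).
raw_utf8 = display_name.encode("utf-8") + address

# Case 2: raw iso-8859-1 bytes in the header (non-compliant).
raw_latin1 = display_name.encode("iso-8859-1") + address

# Case 3: RFC 2047 encoded word; the header on the wire is pure ascii.
encoded_word = Header(display_name, "iso-8859-1").encode()
rfc2047 = encoded_word.encode("ascii") + address


def is_ascii(data):
    """Return True if the byte string contains only ascii bytes."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


for label, value in (("raw utf-8", raw_utf8),
                     ("raw iso-8859-1", raw_latin1),
                     ("RFC 2047", rfc2047)):
    print(label, is_ascii(value), value)

# The encoded word decodes back to the original iso-8859-1 bytes.
decoded = decode_header(encoded_word)
```

Only the third form is ascii on the wire; the first two are the "raw header contains non-ascii" cases.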

All 3 messages were archived on an English language list. The display
name in the archive for the first case was garbled, i.e. the separate
bytes of the utf-8 encoding were shown rather than the character they
represented. Other than that, there were no issues with the archive.
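
The garbling in the first case is the usual effect of showing each byte of a multi-byte utf-8 sequence as its own character. A small Python 3 sketch, assuming the page ends up rendered in a single-byte character set such as iso-8859-1 (the name is hypothetical and the exact rendering depends on the browser and the page's charset):

```python
# Hypothetical display name containing one non-ascii character.
name = "M\u00fcller"                         # "Müller"

# utf-8 encodes the u-umlaut as two bytes: 0xc3 0xbc.
utf8_bytes = name.encode("utf-8")

# Interpreting those bytes one at a time in a single-byte charset
# yields two characters where there should be one.
garbled = utf8_bytes.decode("iso-8859-1")

print(repr(utf8_bytes))
print(garbled)
print(len(name), len(garbled))
```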

Further, I examined the diff of all the archiver modules between 2.1.16
and 2.1.29 and also between 2.1.15 and 2.1.29, and I see nothing that
seems relevant to this exception.

To diagnose this further, you could apply this patch:

=== modified file 'Mailman/Archiver/pipermail.py'
--- Mailman/Archiver/pipermail.py 2018-05-03 21:23:47 +0000
+++ Mailman/Archiver/pipermail.py 2019-03-02 04:51:23 +0000
@@ -60,9 +60,12 @@
     else:
         # Mixed case; assume that small parts of the last name will be
         # in lowercase, and check them against the list.
-        while i>0 and (L[i-1][0] in lowercase or
-                       L[i-1].lower() in smallNameParts):
-            i = i - 1
+        try:
+            while i>0 and (L[i-1][0] in lowercase or
+                           L[i-1].lower() in smallNameParts):
+                i = i - 1
+        except:
+            syslog('error', 'Exception in fixAuthor: %s', author)
     author = SPACE.join(L[-1:] + L[i:-1]) + ', ' + SPACE.join(L[:i])
     return author

and see what gets logged in Mailman's error log and what the archived
message looks like.
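
When comparing the logged author string with the original exception, it may help to remember how the message is read: the position is a byte offset into whichever byte string was being decoded as ascii, which is not necessarily the author string. A minimal Python 3 sketch with a made-up byte string (the real string involved is not known from the traceback):

```python
# Hypothetical byte string: 26 ascii bytes followed by the byte 0xb5.
data = b"x" * 26 + b"\xb5"

try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    # Keep the details; exc itself is not available after this block.
    message = str(exc)
    position = exc.start
    offending = data[exc.start:exc.end]

print(message)
print(position, repr(offending))
```

Python 3 will not do this decode implicitly the way Python 2 does when str and unicode are mixed, so the decode is explicit here.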
--
Mark Sapiro <mark at msapiro.net> The highway is for gamblers,
San Francisco Bay Area, California better use your sense - B. Dylan