mailbox misbehavior with non-ASCII

Peter Pearson pkpearson at nowhere.invalid
Fri Jul 29 19:24:57 EDT 2022


The following code produces a nonsense result for one variant of
the input described below:

import mailbox

# Open an existing Maildir holding a single test message.
box = mailbox.Maildir("/home/peter/Temp/temp", create=False)
x = box.values()[0]
h = x.get("X-DSPAM-Factors")
print(type(h))
# Nonsense result: <class 'email.header.Header'>

The output is the desired "str" when the message file contains this:

To: recipient@example.com
Message-ID: <123>
Date: Sun, 24 Jul 2022 15:31:19 +0000
Subject: Blah blah
From: from@from.com
X-DSPAM-Factors: a'b

xxx

... but if the apostrophe in "a'b" is replaced with a
RIGHT SINGLE QUOTATION MARK, the returned h is of type 
"email.header.Header", and seems to contain inscrutable garbage.
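
For reference, here is a minimal sketch that reproduces the type difference without a Maildir on disk. It assumes that mailbox parses message files with the email package's default compat32 policy, and that the non-ASCII character arrives as raw UTF-8 bytes in the header. The helper name factors_header and the trimmed-down message template are made up for illustration. The last call tries the newer email.policy.default, under which header values come back as str subclasses instead.

```python
import email
import email.header
import email.policy

# Trimmed-down version of the test message; {quote} is the character under test.
TEMPLATE = (
    "Message-ID: <123>\n"
    "Subject: Blah blah\n"
    "X-DSPAM-Factors: a{quote}b\n"
    "\n"
    "xxx\n"
)


def factors_header(quote, policy=email.policy.compat32):
    """Parse the template as raw bytes and return the X-DSPAM-Factors value.

    Hypothetical helper, for illustration only.
    """
    raw = TEMPLATE.format(quote=quote).encode("utf-8")
    msg = email.message_from_bytes(raw, policy=policy)
    return msg.get("X-DSPAM-Factors")


# Plain ASCII apostrophe, legacy policy.
ascii_value = factors_header("'")
# RIGHT SINGLE QUOTATION MARK as raw UTF-8 bytes, legacy policy.
raw8bit_value = factors_header("\u2019")
# Same bytes, parsed with the newer policy.
modern_value = factors_header("\u2019", policy=email.policy.default)

print(type(ascii_value))
print(type(raw8bit_value))
print(isinstance(modern_value, str))
```

If the newer policy turns out to behave better, mailbox.Maildir accepts a factory argument, so something like factory=lambda f: email.message_from_binary_file(f, policy=email.policy.default) might be one way to apply it to a whole mailbox. I have not verified that against this particular maildir.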

I realize that one should not put non-ASCII characters in
message headers, but of course I didn't put it there; it
just showed up, pretty much beyond my control.  And I realize
that when software is given input that breaks the rules, one
cannot expect optimal results, but I'd think raising an
exception would be the right answer.

Is this worth a bug report?

-- 
To email me, substitute nowhere->runbox, invalid->com.

