mailbox misbehavior with non-ASCII

Peter J. Holzer hjp-python at hjp.at
Sat Jul 30 16:19:18 EDT 2022


On 2022-07-29 23:24:57 +0000, Peter Pearson wrote:
> The following code produces a nonsense result with the input 
> described below:
> 
> import mailbox
> box = mailbox.Maildir("/home/peter/Temp/temp",create=False)
> x = box.values()[0]
> h = x.get("X-DSPAM-Factors")
> print(type(h))
> # <class 'email.header.Header'>
> 
> The output is the desired "str" when the message file contains this:
> 
> To: recipient at example.com
> Message-ID: <123>
> Date: Sun, 24 Jul 2022 15:31:19 +0000
> Subject: Blah blah
> From: from at from.com
> X-DSPAM-Factors: a'b
> 
> xxx
> 
> ... but if the apostrophe in "a'b" is replaced with a
> RIGHT SINGLE QUOTATION MARK, the returned h is of type 
> "email.header.Header", and seems to contain inscrutable garbage.

It's not inscrutable to me, but then I remember when RFC 1522 was the
relevant RFC.

Calling h.encode() returns

=?unknown-8bit?b?YeKAmWI=?=

which is about the best result you can get. The character set is unknown
and the content (when decoded) is the bytes

61 e2 80 99 62

which is what your file contained (assuming you used UTF-8).
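
To check this by hand, the payload of the encoded word can be
base64-decoded with the standard library. This is a minimal sketch; the
variable names are only for illustration, and the final decode assumes
(as above) that the file was written in UTF-8, which the header itself
does not tell us.

```python
import base64

# Encoded word as returned by h.encode(): =?charset?encoding?payload?=
encoded_word = "=?unknown-8bit?b?YeKAmWI=?="

# Splitting on "?" separates the charset label, the encoding flag
# and the payload from the leading and trailing "=".
_, charset, encoding, payload, _ = encoded_word.split("?")

raw = base64.b64decode(payload)  # the "b" flag means base64
print(charset, encoding)
print(" ".join(f"{b:02x}" for b in raw))  # 61 e2 80 99 62

# Assumption: the bytes are UTF-8. "unknown-8bit" means nobody declared it.
text = raw.decode("utf-8")
print(ascii(text))
```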

What would be nice is if you could get at that content directly. There
doesn't seem to be a documented method to do that. You can use h._chunks,
but as the _ in the name implies, that's an implementation detail which
might change in future versions (and it's not quite straightforward
either, although consistent with other parts of Python, I think).
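
One possible workaround that avoids h._chunks is to feed the encoded
form back through email.header.decode_header(), which returns the
undecoded bytes together with the charset label. The following is a
sketch, not a recommendation from the library: header_bytes() is a
hypothetical helper, the message is a cut-down version of the one in the
report, and simply joining the chunks is only reasonable for short
headers like this one.

```python
import email
from email.header import Header, decode_header

# Cut-down test message; U+2019 is stored as raw UTF-8 bytes (e2 80 99).
raw_message = (
    b"Subject: Blah blah\n"
    b"X-DSPAM-Factors: a\xe2\x80\x99b\n"
    b"\n"
    b"xxx\n"
)


def header_bytes(value):
    """Return the raw bytes of a header value (hypothetical helper)."""
    if isinstance(value, Header):
        # Turn the Header object into its RFC 2047 encoded-word form.
        value = value.encode()
    parts = []
    for chunk, _charset in decode_header(value):
        # decode_header yields str for plain ASCII headers, bytes otherwise.
        if isinstance(chunk, str):
            chunk = chunk.encode("ascii", "surrogateescape")
        parts.append(chunk)
    return b"".join(parts)


msg = email.message_from_bytes(raw_message)
h = msg["X-DSPAM-Factors"]
print(type(h))

raw = header_bytes(h)
print(raw)

# Assumption: the sender used UTF-8; fall back to replacement characters if not.
print(ascii(raw.decode("utf-8", errors="replace")))
```

The helper also accepts an ordinary str, so it can be applied to every
header without first checking which type was returned.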

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp at hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"