Another 2 to 3 mail encoding problem

Chris Green cl at isbd.net
Thu Aug 27 09:48:48 EDT 2020


Richard Damon <Richard at damon-family.org> wrote:
> On 8/27/20 4:31 AM, Chris Green wrote:
> > While an E-Mail body possibly *shouldn't* have non-ASCII characters in
> > it one must be able to handle them without errors.  In fact haven't
> > the RFCs changed such that the message body should be 8-bit clean?
> > Anyway I think the Python 3 mail handling libraries need to be able to
> > pass extended characters through without errors.
> 
> Email messages are fully allowed to use non-ASCII characters in them as
> long as the headers indicate this. They can be encoded either as raw 8
> bit bytes on systems that are 8-bit clean, or for systems that are not,
> they will need to be encoded either as base-64 or using quoted-printable
> encoding. These characters are to be interpreted in the character set
> defined (or presumed) in the header, or even as some other binary object
> like an image or executable if the content type isn't text.
> 
> Because of this, the Python 3 str type is not suitable to store an email
> message, since it insists on the string being Unicode encoded, but the
> Python 2 str class could hold it.
> 
Which sounds like the core of my problem[s]! :-)

As I said my system (ignoring the Python issues) is all UTF8 and all
seems to work well so I think it's pretty much correctly configured.
When I send mail that has accented and other extended characters in it
the E-Mail headers have:-
    Content-Type: text/plain; charset=utf-8

If I save a message like the above sent to myself, it's stored using
the UTF8 characters directly.  I can open it with my text editor (which
is also UTF8 aware) and see the characters as I entered them.  There's
no transfer encoding because my system is 8-bit clean and I'm talking
to myself, as it were.
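
To make the difference between "stored directly" and "encoded" concrete,
here is a small sketch comparing the raw UTF8 bytes of one accented
character with its quoted-printable and base-64 forms.  The sample
character is an assumption chosen for illustration; it is not taken from
any actual message.

```python
import base64
import quopri

# Hypothetical sample: a single accented character (e with acute accent).
text = "\u00e9"

# What an 8-bit clean system stores: the UTF-8 bytes themselves.
raw_8bit = text.encode("utf-8")

# What a non-8-bit-clean path would carry instead.
qp = quopri.encodestring(raw_8bit)   # quoted-printable form
b64 = base64.b64encode(raw_8bit)     # base-64 form

print(raw_8bit, qp, b64)

# Both transfer encodings decode back to the same UTF-8 bytes.
assert quopri.decodestring(qp) == raw_8bit
assert base64.b64decode(b64) == raw_8bit
```

In the 8-bit case the file on disk holds the two bytes as they are, which
is why a UTF8-aware editor shows the character with no further decoding.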

The above is using Python 2 to handle and filter my incoming mail
which, as you say, works fine.  However when I try switching to Python
3 I get the errors I've been asking about, even though this is
'talking to myself' and the E-Mail message is just UTF8.
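
For reference, a minimal Python 3 sketch of handling such a message as
bytes rather than as a str.  The message text and addresses below are
made up for illustration, and this is only one possible approach, not
necessarily a fix for the specific errors mentioned above.  It uses the
standard library's email.message_from_bytes() with email.policy.default,
and also shows that treating the same bytes as ASCII fails, which is one
common source of decode errors in Python 3.

```python
import email
from email import policy

# Hypothetical raw message as it might sit on disk: UTF-8 body, 8bit transfer.
raw = (
    "From: me@example.com\r\n"
    "To: me@example.com\r\n"
    "Subject: test\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=utf-8\r\n"
    "Content-Transfer-Encoding: 8bit\r\n"
    "\r\n"
    "Caf\u00e9 d\u00e9j\u00e0 vu\r\n"
).encode("utf-8")

# Decoding the raw bytes as ASCII raises UnicodeDecodeError.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Parsing from bytes lets the library apply the declared charset itself.
msg = email.message_from_bytes(raw, policy=policy.default)
body = msg.get_content()   # str, decoded using charset=utf-8

print(ascii_ok)
print(msg.get_content_charset())
print(body)
```

When reading from a file, the same idea would mean opening it in binary
mode ('rb') and passing the bytes to the parser, rather than opening it
in text mode with a default encoding.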


-- 
Chris Green