Another 2 to 3 mail encoding problem

Chris Angelico rosuav at gmail.com
Thu Aug 27 09:13:22 EDT 2020


On Thu, Aug 27, 2020 at 11:10 PM Karsten Hilbert
<Karsten.Hilbert at gmx.net> wrote:
>
> > Because of this, the Python 3 str type is not suitable to store an email
> > message, since it insists on the string being Unicode encoded,
>
> I should greatly appreciate to be enlightened as to what
> a "string being Unicode encoded" is intended to say ?
>

A Python 3 "str" or a Python 2 "unicode" is an abstract sequence of
Unicode codepoints. As such, it's not suitable for transparently
round-tripping an email, as it would lose information about the way
that things were encoded. However, it is excellent for building and
processing emails - you deal with character encodings at the same
point where you deal with the RFC 822 header format. In the abstract,
your headers might be stored in a dict; you then serialise them to a
flat sequence of bytes by writing "Header: value" lines, wrapping them
correctly, and encoding the text into bytes in the same step.
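
As a rough sketch of that split (not a definitive recipe), assuming the
standard library's email package with its default policy, and with
made-up addresses and text: the headers live in a dict of str, and bytes
only appear when the message is serialised.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# The same text under two encodings: different bytes, identical str.
# Once decoded, the str cannot tell you which bytes arrived.
raw_latin1 = "Grüße".encode("latin-1")
raw_utf8 = "Grüße".encode("utf-8")
same_text = raw_latin1.decode("latin-1") == raw_utf8.decode("utf-8")

# Headers held abstractly as str in a dict (illustrative values only).
headers = {
    "From": "sender@example.com",
    "To": "recipient@example.com",
    "Subject": "Grüße aus Köln",
}

msg = EmailMessage()
for name, value in headers.items():
    msg[name] = value
msg.set_content("Schöne Grüße!\n")

# Serialising produces the flat byte sequence: "Header: value" lines,
# folded and encoded as needed, then a blank line, then the body.
wire = msg.as_bytes()

# Parsing the bytes back gives str again for processing.
parsed = message_from_bytes(wire, policy=policy.default)
subject = parsed["Subject"]
body = parsed.get_content()
```

Here wire is what goes over the network, while subject and body are
plain str again. To keep a received message byte-for-byte intact, store
the original bytes rather than anything re-serialised from str.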

ChrisA

