[issue25545] email parsing docs need to be clear that only ASCII strings are supported

Christian Tanzer report at bugs.python.org
Thu Nov 5 04:58:20 EST 2015


Christian Tanzer added the comment:

> Yes, the port from python2 to python3 of the email package
> was...suboptimal.
> ...
> The whole concept of using unicode as a 7bit data channel only is
> just...weird.

+100 to both.

> But, we are now stuck with maintaining that API for backward
> compatibility reasons.

That's a weird definition of backward compatibility, though. The API
breaks backward compatibility with Python 2. No Python 3 user should
use the broken API anyway, IMHO.

> To fix it, I rewrote significant parts of the email package, which
> is the new API.

Which unfortunately doesn't help if one needs to stay compatible with
2.7.

> It also is...fraught with the danger of bugs...to talk about
> serializing an email message as a string, transforming it, and then
> trying to re-parse it as an email message.  If your transformations
> are simple, it will probably work, but anything at all complex runs
> the risk of breaking the message.

One of Python's mottos used to be:

   We are all consenting adults here.

But there are other uses for converting a message instance to a
unicode string. Display, printing, and grepping come to mind.
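
For illustration, a minimal sketch of such a lossy, display-only
conversion. The helper name `text_for_display` and the sample message
are made up here and are not part of the email package; undecodable
bytes are replaced, so the result is only good for reading or
searching, not for re-parsing.

```python
import email


def text_for_display(msg, fallback_charset="ascii"):
    """Return the text parts of msg as one unicode string (lossy)."""
    chunks = []
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable and yields bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or fallback_charset
        try:
            chunks.append(payload.decode(charset, "replace"))
        except LookupError:
            # Unknown charset label: fall back instead of failing.
            chunks.append(payload.decode(fallback_charset, "replace"))
    return u"".join(chunks)


# Hypothetical sample message; pure ASCII on the wire (quoted-printable).
raw = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Gr=C3=BC=C3=9Fe\n"
)
text = text_for_display(email.message_from_string(raw))
```

Because the sample is ASCII-only on the wire, `message_from_string`
is safe to use here under both 2.7 and 3.x.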

> And having non-ascii bodies counts as non-trivial.

For anybody living in a non-ascii country that statement sounds
**very strange**.

To start with, I have many friends with names that contain non-ascii
characters.

> You do have to conditionalize your 2/3 code to use the bytes parser
> and generator if you are dealing with 8-bit messages. There's just no
> way around that.

I did that yesterday. There are problems with that though:

* Recognizing the problem for what it is.

  Trying to run Python 2.7 code that *should* run under 3.5 but breaks
  with weird errors wastes a lot of time.

  Multiply by the number of Python programmers who want to migrate
  and you get a problem.

  If `message_from_string` and `as_string` just weren't there in 3.x,
  it would be much less of a problem (clear documentation would also
  help, but not as much).

* Lots of ugly workarounds for the same problem.

  Most of them (mine certainly included) are done quick and ad-hoc and
  probably break in many ways.

  The question then arises: why should one use the email package at
  all? But of course that way lies madness.
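
For illustration, a minimal sketch of what such a conditionalized
2/3 workaround might look like. The names `parse_message` and
`serialize_message` and the sample message are assumptions made up
for this example; the Python 3 branch relies on
`email.message_from_bytes` and `Message.as_bytes` (the latter needs
3.4 or later).

```python
import email
import sys

if sys.version_info[0] >= 3:
    # Python 3: 8-bit input must go through the bytes parser/generator.
    parse_message = email.message_from_bytes

    def serialize_message(msg):
        return msg.as_bytes()
else:
    # Python 2: str already is a byte string.
    parse_message = email.message_from_string

    def serialize_message(msg):
        return msg.as_string()


# Hypothetical sample message with an 8-bit UTF-8 body.
raw = (
    u"From: sender@example.com\n"
    u"Subject: test\n"
    u"MIME-Version: 1.0\n"
    u"Content-Type: text/plain; charset=utf-8\n"
    u"Content-Transfer-Encoding: 8bit\n"
    u"\n"
    u"Gr\u00fc\u00dfe aus Wien\n"
).encode("utf-8")

msg = parse_message(raw)

# The payload comes back as bytes; decode it with the declared charset.
charset = msg.get_content_charset() or "ascii"
body = msg.get_payload(decode=True).decode(charset)

# Serializing again stays in bytes, so the 8-bit body is not mangled.
roundtrip = serialize_message(msg)
```

The point of the sketch is that the message stays in bytes from
parsing to serialization; only the decoded body is ever handled as
unicode.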

Just more roadblocks for the move to Python 3.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25545>
_______________________________________

