From rdmurray at bitdance.com  Fri Jun  4 18:39:50 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 04 Jun 2010 12:39:50 -0400
Subject: [Email-SIG] email package status in 3.X
Message-ID: <20100604163950.5AFE1217FD6@kimball.webabinitio.net>

On Mon May 10 20:02:46 CEST 2010 Mark Lutz wrote:
> I'm probably going to have to go ahead and finish the book
> with the email package as it is now, and include a lot of 
> caveats about the problems that a new version may fix in the 
> future.  I can also post updated example code if/when possible.
> 
> I realize everybody on this list probably knows this already,
> but email in 3.X not only doesn't support the Unicode/bytes 
> dichotomy, it was also broken by it.  Beyond the pre-parse 
> decode issue, its mail text generation really only works for 
> all-text mails.  Generating text of an email with any sort of
> binary part doesn't work at all now, because the base64 text 
> is still bytes, and the Generator expects str.  I've coded a 
> custom encoder to pass to MIMEImage that works around this
> by decoding to ASCII, but it's not a great story to have to 
> tell the tens of thousands of readers of this book, many of
> whom will be evaluating 3.X in general.

This bug should now be fixed in both the py3k branch and the 3.1
maint branch.  This means the fix will be in 3.1.3, as well as 3.2a1.
Hopefully that will be in time for your book, since 3.2a1 is due June
27th and I'm guessing the 3.1.3 release will be some time not too far
off that time frame as well.  FYI I also fixed a related bug that made
using utf-8 as a charset problematic.  Unfortunately I suspect there
may be some other charset issues waiting to be discovered.

If you have come across any other bugs that don't already have
issues in the tracker please file bug reports.  Anything that
can be fixed in the current package I will endeavor to fix
before the next release.  Feel free also to indicate bugs which
should be given priority.

--
R. David Murray                                      www.bitdance.com

From lutz at rmi.net  Thu Jun 10 15:21:52 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Thu, 10 Jun 2010 09:21:52 -0400 (GMT-04:00)
Subject: [Email-SIG] email package status in 3.X
Message-ID: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>

Thanks, David; that's great news.  I'll update the book draft 
accordingly.

For the record, despite the issues, I was able to complete a fairly
full-featured email client GUI with the email package as it currently
is.  This includes parsing and generating arbitrary attachments, as
well as encoding on sends and decoding on fetches for both text payloads 
and I18N mail headers. The package is still quite powerful as is.  It
does take a bit of digging to figure out how to use its many tools,
but the book will probably help on this front, especially the 
upcoming edition's more complete application.

In other words, some of my concern may have been a bit premature.  
I hope that in the future we'll either strive for compatibility 
or keep the current version around; it's a lot of very useful code.

In fact, I recommend that any new email package be named distinctly, 
and that the current package be retained for a number of releases to
come.  After all the breakages that 3.X introduced in general, doing
the same to any email-based code seems a bit too much, especially 
given that the current package is largely functional as is.  To me,
after having just used it extensively, fixing its few issues seems 
a better approach than starting from scratch.

As far as other issues, the things I found are described below my
signature.  I don't know what the utf-8 issue is that you refer 
to; I'm able to parse and send with this encoding as is without 
problems (both payloads and headers), but I'm probably not using the
interfaces you fixed, and this may be the same as one of the items listed.

Another thought: it might be useful to use the book's email client 
as a sort of test case for the package; it's much more rigorous in 
the new edition because it now has to be, given 3.X's Unicode model 
(it's about 4,900 lines of code, though not all is email-related).
I'd be happy to donate the code as soon as I find out what the 
copyright will be this time around; it will be at O'Reilly's site
this Fall in any event.

Thanks,
--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


Major issues I found...
------------------------------------------------------------------
1) Str required for parsing, but bytes returned from poplib

The initial decode from bytes to str of full mail text; in 
retrospect, probably not a major issue, since original email 
standards called for ASCII.  An 8-bit encoding like Latin-1 is
probably sufficient for most conforming mails.  For the book,
I try a set of different encodings, beginning with an optional
configuration module setting, then ascii, latin-1, and utf-8;
this is probably overkill, but a GUI has to be defensive.
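
The fallback order described in item 1 might be sketched roughly as follows.
This is an illustration only, not the book's code: the function name
decode_mail_bytes and its preferred parameter are hypothetical stand-ins for
the configuration module setting.

```python
def decode_mail_bytes(raw, preferred=None):
    """Return (text, encoding_used) for raw mail bytes, e.g. from poplib."""
    candidates = ['ascii', 'latin-1', 'utf-8']
    if preferred:                        # hypothetical user/config setting
        candidates.insert(0, preferred)
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            continue                     # wrong guess or unknown codec name
    # latin-1 maps every byte value, so with the order above this line is
    # not normally reached; it is kept so the function never returns None.
    raise ValueError('no candidate encoding could decode the message')
```

Because latin-1 accepts any byte sequence, the utf-8 attempt in this order
only matters when it is supplied as the preferred encoding.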

----------------------------------------------------------------

2) Binary attachments encoding

The binary attachments byte-to-str issue that you've just
fixed.  As I mentioned, I worked around this by passing in a 
custom encoder that calls the original and runs an extra decode
step.  Here's what my fix looked like in the book; your patch 
may do better, and I will minimally add a note about the 3.1.3
and 3.2 fix for this:

def fix_encode_base64(msgobj):
    from email.encoders import encode_base64
    encode_base64(msgobj)                # what email does normally: leaves bytes
    data = msgobj.get_payload()          # bytes fails in email pkg on text gen
    text = data.decode('ascii')          # decode to unicode str so text gen works
    # ...plus line splitting logic omitted...
    msgobj.set_payload('\n'.join(lines))

>>> from email.mime.image import MIMEImage 
>>> from mailtools.mailSender import fix_encode_base64      # use custom workaround
>>> bytes = open('monkeys.jpg', 'rb').read()
>>> m = MIMEImage(bytes, _encoder=fix_encode_base64)        # convert to ascii str
>>> print(m.as_string()[:500])
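
On a Python version that includes the fix discussed in this thread (3.1.3,
3.2, or later), the stock encoder should be sufficient.  A minimal sketch,
assuming made-up bytes in place of monkeys.jpg and an explicit _subtype so
that no image-type detection is needed:

```python
from email.mime.image import MIMEImage

data = bytes(range(256)) * 4                 # stand-in for real image bytes
msg = MIMEImage(data, _subtype='jpeg')       # default _encoder is encode_base64
text = msg.as_string()                       # text generation with a binary part
roundtrip = msg.get_payload(decode=True)     # base64-decode back to bytes
```

Here the payload is held as str after encoding, so text generation no longer
needs the custom encoder.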

-------------------------------------------------------------------

3) Type-dependent text part encoding

There's a str/bytes confusion issue related to Unicode encodings
in text payload generation: some encodings require the payload to
be str, but others expect bytes.  Unfortunately, this means that 
clients need to know how the package will react to the encoding 
that is used, and special-case based upon that.  

For example, I needed to pass in str for ASCII and Latin-1 (the 
former is unencoded and the latter gets QP MIME treatment), but 
must pass a bytes for UTF-8 (which triggers Base64).  That's less
than ideal for a client trying to attach arbitrary text parts 
generically from filenames.  Here's the obscure workaround I came
up with; the bodytext is str when fetched from an edit window, 
but may also be loaded from an attachment file.  This may or may
not have been reported, and it's entirely possible that there's
a better solution that I've missed.

def fix_text_required(encodingname):
    """
    4E: workaround for str/bytes combination errors in email package;  MIMEText 
    requires different types for different Unicode encodings in Python 3.1, due
    to the different ways it MIME-encodes some types of text;  see Chapter 13;
    the only other alternative is using generic Message and repeating much code; 
    """ 
    from email.charset import Charset, BASE64, QP
    charset = Charset(encodingname)   # how email knows what to do for encoding
    bodyenc = charset.body_encoding   # utf8, others require bytes input data
    return bodyenc in (None, QP)      # ascii, latin1, others require str

# on mail sends...
# email needs either str xor bytes specifically; 
if fix_text_required(bodytextEncoding): 
    if not isinstance(bodytext, str):
        bodytext = bodytext.decode(bodytextEncoding)
else:
    if not isinstance(bodytext, bytes):
        bodytext = bodytext.encode(bodytextEncoding)

# later
msg.set_payload(bodytext, charset=bodytextEncoding)
...or...
msg = MIMEText(bodytext, _charset=bodytextEncoding)
mainmsg.attach(msg)

# attachments
# build sub-Message of appropriate kind
maintype, subtype = contype.split('/', 1)
if maintype == 'text':                       # 4E: text needs encoding
    if fix_text_required(fileencode):        # requires str or bytes
        data = open(filename, 'r', encoding=fileencode)
    else:
        data = open(filename, 'rb')
    msg = MIMEText(data.read(), _subtype=subtype, _charset=fileencode)
    data.close()
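
The workaround keys off Charset.body_encoding.  A small sketch of how that
attribute differs across the three encodings mentioned above (the names
dictionary is only a display helper invented for this example):

```python
from email.charset import Charset, BASE64, QP

# Map the module's constants to readable labels (illustrative helper only).
names = {None: 'none', QP: 'quoted-printable', BASE64: 'base64'}

for charset_name in ('us-ascii', 'latin-1', 'utf-8'):
    body_encoding = Charset(charset_name).body_encoding
    print(charset_name, '->', names[body_encoding])
```

This is the distinction fix_text_required() tests: no body encoding or QP
on one side, Base64 on the other.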

-------------------------------------------------------------------

There are some additional cases that now require decoding per mail 
headers today due to the str/bytes split, but these are just a 
normal artifact of supporting Unicode character sets in general,
and seem like issues for package clients to resolve (e.g., the bytes 
returned for decoded payloads in 3.X didn't play well with existing 
str-based text processing code written for 2.X).

-------------------------------------------------------------------




From rdmurray at bitdance.com  Thu Jun 10 16:18:48 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 10 Jun 2010 10:18:48 -0400
Subject: [Email-SIG] email package status in 3.X
In-Reply-To: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>
References: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>
Message-ID: <20100610141848.E84181FCC52@kimball.webabinitio.net>

On Thu, 10 Jun 2010 09:21:52 -0400, lutz at rmi.net wrote:
> In other words, some of my concern may have been a bit premature.  
> I hope that in the future we'll either strive for compatibility 
> or keep the current version around; it's a lot of very useful code.

The plan is to have a compatibility layer that will accept calls based
on the old API and forward appropriately to the new API.  So far I'm
thinking I can succeed in doing this in a fairly straightforward manner,
but I won't know for sure until I get some more pieces in place.

> In fact, I recommend that any new email package be named distinctly, 

I'm going to avoid that if I can (though the PyPI package will be
named email6 when we publish it for public testing).  If, however,
it turns out that I can't correctly support both the old and the
new API, then I'll have to do that.

> and that the current package be retained for a number of releases to
> come.  After all the breakages that 3.X introduced in general, doing
> the same to any email-based code seems a bit too much, especially 
> given that the current package is largely functional as is.  To me,
> after having just used it extensively, fixing its few issues seems 
> a better approach than starting from scratch.

Well, the thing is, as you found, existing 2.x code needs to be fixed to
correctly handle the distinction between strings and bytes no matter what.
The goal is to make it easier to write correct programs, while providing
the compatibility layer to make porting smoother.  But I doubt that any
non-trivial 2.x email program will port without significant changes,
even if the compatibility layer is close to 100% compatible with the
current Python3 email package, simply because the previous conflation
of text and bytes must be untangled in order to work correctly in
Python3, and email involves lots of transitions between text and bytes.

As for "starting from scratch", it is true that the current plan involves
considerable changes in the recommended API (in the direction of greater
flexibility and power), but I'm hoping that significant portions of the
code will carry forward with minor changes, and that this will make it
easier to support the old API.

> As far as other issues, the things I found are described below my
> signature.  I don't know what the utf-8 issue is that you refer 
> to; I'm able to parse and send with this encoding as is without 
> problems (both payloads and headers), but I'm probably not using the
> interfaces you fixed, and this may be the same as one of the items listed.

It is, see below.

> Another thought: it might be useful to use the book's email client 
> as a sort of test case for the package; it's much more rigorous in 
> the new edition because it now has to be, given 3.X's Unicode model 
> (it's about 4,900 lines of code, though not all is email-related).
> I'd be happy to donate the code as soon as I find out what the 
> copyright will be this time around; it will be at O'Reilly's site
> this Fall in any event.

That would be great.  I am planning to write my own sample app to
demonstrate the new API, but if I can use yours to test the compatibility
layer that will help a lot, since I otherwise have no Python3 email
application to test against unless I port something from Python2.

> Major issues I found...
> ------------------------------------------------------------------
> 1) Str required for parsing, but bytes returned from poplib
> 
> The initial decode from bytes to str of full mail text; in 
> retrospect, probably not a major issue, since original email 
> standards called for ASCII.  An 8-bit encoding like Latin-1 is
> probably sufficient for most conforming mails.  For the book,
> I try a set of different encodings, beginning with an optional
> configuration module setting, then ascii, latin-1, and utf-8;
> this is probably overkill, but a GUI has to be defensive.

This works (mostly) for conforming email, but some important Python email
applications need to deal with non-conforming email.  That's where the
inability to parse bytes directly really causes problems.
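
For readers on Python 3.2 or later, where email.message_from_bytes() is
available, parsing the raw bytes directly might look like the sketch below.
The sample message is invented for illustration.

```python
import email

# A tiny made-up message with an 8-bit body in latin-1.
raw = (b'Subject: test\r\n'
       b'Content-Type: text/plain; charset="latin-1"\r\n'
       b'Content-Transfer-Encoding: 8bit\r\n'
       b'\r\n'
       b'caf\xe9\r\n')

msg = email.message_from_bytes(raw)          # no up-front decode to str
charset = msg.get_content_charset()          # charset declared by the part
body = msg.get_payload(decode=True).decode(charset)
```

The decode to str happens per part, using the charset the part itself
declares, rather than once for the whole message.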

> 2) Binary attachments encoding
> 
> The binary attachments byte-to-str issue that you've just
> fixed.  As I mentioned, I worked around this by passing in a 
> custom encoder that calls the original and runs an extra decode
> step.  Here's what my fix looked like in the book; your patch 
> may do better, and I will minimally add a note about the 3.1.3
> and 3.2 fix for this:

Yeah, our patch was a lot simpler since we could fix the encoding inside
the loop producing the encoded lines :)

> 3) Type-dependent text part encoding
> 
> There's a str/bytes confusion issue related to Unicode encodings
> in text payload generation: some encodings require the payload to
> be str, but others expect bytes.  Unfortunately, this means that 
> clients need to know how the package will react to the encoding 
> that is used, and special-case based upon that.  

This was the UTF-8 bug I fixed.  I shouldn't have called it "the UTF-8
bug", because it applies equally to the other charsets that use base64,
as you note.  I called it that because UTF-8 was where the problem was
noticed and is mentioned in the title of the bug report.

I had a suspicion that the quoted-printable encoding wasn't being done
correctly either, so to hear that it is working for you is good news.
There may still be bugs to find there, though.

So, in the next releases of Python all MIMEText input should be string,
and it will fail if you pass bytes.  I consider this as email previously
not living up to its published API, but do you think I should hack
in a way for it to accept bytes too, for backward compatibility in the
3 line?
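
A minimal sketch of the str-only usage described above, assuming a Python
version that includes the fix; the sample body text is made up:

```python
from email.mime.text import MIMEText

body = 'caf\xe9 au lait\n'                   # str for every charset
encodings = {}
decoded = {}
for charset in ('latin-1', 'utf-8'):
    part = MIMEText(body, _charset=charset)
    # Record which transfer encoding the package chose for this charset.
    encodings[charset] = part['Content-Transfer-Encoding']
    # Undo the transfer encoding, then decode the bytes back to str.
    decoded[charset] = part.get_payload(decode=True).decode(charset)
```

The caller passes the same str in both cases and the package picks
quoted-printable or Base64 itself.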

> There are some additional cases that now require decoding per mail 
> headers today due to the str/bytes split, but these are just a 
> normal artifact of supporting Unicode character sets in general,
> and seem like issues for package clients to resolve (e.g., the bytes 
> returned for decoded payloads in 3.X didn't play well with existing 
> str-based text processing code written for 2.X).

I'm not following you here.  Can you give me some more specific
examples?  Even if these "normal artifacts" must remain with
the current API, I'd like to make things as easy as practical when
using the new API.

Thanks for all your feedback!

--David

From barry at python.org  Thu Jun 10 16:42:14 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 10 Jun 2010 10:42:14 -0400
Subject: [Email-SIG] email package status in 3.X
In-Reply-To: <20100610141848.E84181FCC52@kimball.webabinitio.net>
References: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>
	<20100610141848.E84181FCC52@kimball.webabinitio.net>
Message-ID: <20100610104214.2bdd8f48@heresy>

On Jun 10, 2010, at 10:18 AM, R. David Murray wrote:

>That would be great.  I am planning to write my own sample app to
>demonstrate the new API, but if I can use yours to test the compatibility
>layer that will help a lot, since I otherwise have no Python3 email
>application to test against unless I port something from Python2.

I would support/help with a port of Mailman 3 to Python 3.  It's
non-trivial, but would make a good test case.  The dependency stack may make
that difficult.

-Barry

From rdmurray at bitdance.com  Thu Jun 10 17:35:07 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 10 Jun 2010 11:35:07 -0400
Subject: [Email-SIG] email package status in 3.X
In-Reply-To: <20100610104214.2bdd8f48@heresy>
References: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>
	<20100610141848.E84181FCC52@kimball.webabinitio.net>
	<20100610104214.2bdd8f48@heresy>
Message-ID: <20100610153507.A62C31FCB5A@kimball.webabinitio.net>

On Thu, 10 Jun 2010 10:42:14 -0400, Barry Warsaw <barry at python.org> wrote:
> On Jun 10, 2010, at 10:18 AM, R. David Murray wrote:
> 
> >That would be great.  I am planning to write my own sample app to
> >demonstrate the new API, but if I can use yours to test the compatibility
> >layer that will help a lot, since I otherwise have no Python3 email
> >application to test against unless I port something from Python2.
> 
> I would support/help with a port of Mailman 3 to Python 3.  It's
> non-trivial, but would make a good test case.  The dependency stack may make
> that difficult.

I realized after I sent that email that I should have said "until",
since that's one of the testing goals (seeing how applications
port both to the compatibility layer and to the new API).

Mailman is at the top of the list of test ports, but as you say
dependencies may have to be dealt with first.  I'm certainly glad
you are willing to help, since that will doubtless make it go
faster :)

--
R. David Murray                                      www.bitdance.com

From lutz at rmi.net  Sat Jun 12 18:52:32 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Sat, 12 Jun 2010 16:52:32 -0000
Subject: [Email-SIG] email package status in 3.X
Message-ID: <ug4oeclx5b5ezgx512062010125222@SMTP>

Hi David,

All sounds good, and thanks again for all your work on this.

I appreciate the difficulties of moving this package to 3.X
in a backward-compatible way.  My suggestions stem from the fact 
that it does work as is today, albeit in a less than ideal way.

That, and I'm seeing that Python 3.X in general is still having
a great deal of trouble gaining traction in the "real world" 
almost 2 years after its release, and I'd hate to see further 
disincentives for people to migrate.  This is a bigger issue
than both the email package and this thread, of course.

> > 3) Type-dependent text part encoding
> > 
> ...
> So, in the next releases of Python all MIMEText input should be string,
> and it will fail if you pass bytes.  I consider this as email previously
> not living up to its published API, but do you think I should hack
> in a way for it to accept bytes too, for backward compatibility in the
> 3 line?

Decoding can probably be safely delegated to package clients.
Typical email clients will probably have str for display of the
main text.  They may wish to read attachments in binary mode, but
can always read in text mode instead or decode manually, because 
they need a known encoding to send the part correctly (my client 
has to ask or use configurations in some cases).

Backward compatibility probably isn't a concern; I suspect that my 
temporary workaround will still work with your patch anyhow, 
and this code didn't work at all for some encodings before.

> > There are some additional cases that now require decoding per mail 
> > headers today due to the str/bytes split, but these are just a 
> > normal artifact of supporting Unicode character sets in general,
> > and seem like issues for package clients to resolve (e.g., the bytes 
> > returned for decoded payloads in 3.X didn't play well with existing 
> > str-based text processing code written for 2.X).
> 
> I'm not following you here.  Can you give me some more specific
> examples?  Even if these "normal artifacts" must remain with
> the current API, I'd like to make things as easy as practical when
> using the new API.

This was just a general statement about things in my own code that
didn't jibe with the 3.X string model.  For instance, line wrapping 
logic assumed str; tkinter text widgets do much better rendering str 
than the bytes fetched for decoded payloads; and my Pyedit text editor
component had to be overhauled to handle display/edit/save of payloads 
of arbitrary encodings.  If I remember any more specific issues with 
the email package itself, I'll forward your way.

I'll watch for an opportunity to get the book's new PyMailGUI 
client code to you as a candidate test case, but please ping 
me about it later if I haven't acted on this.  It works well,
but largely because of all the work that went into the email 
package underlying it.

Thanks,
--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)




From lutz at rmi.net  Sun Jun 13 17:30:06 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Sun, 13 Jun 2010 15:30:06 -0000
Subject: [Email-SIG] email package status in 3.X
Message-ID: <kit1ggirgzwbarhm13062010113004@SMTP>

Come to think of it, here was another oddness I just recalled: this 
may have been reported already, but header decoding returns mixed types
depending upon the structure of the header.  Converting to a str for 
display isn't too difficult to handle, but this seems a bit inconsistent
and contrary to Python's type neutrality:

>>> from email.header import decode_header
>>> S1 = 'Man where did you get that assistant?'
>>> S2 = '=?utf-8?q?Man_where_did_you_get_that_assistant=3F?='
>>> S3 = 'Man where did you get that =?UTF-8?Q?assistant=3F?='

# str: don't decode()
>>> decode_header(S1)
[('Man where did you get that assistant?', None)]

# bytes: do decode()
>>> decode_header(S2)
[(b'Man where did you get that assistant?', 'utf-8')]

# bytes: do decode(), using raw-unicode-escape applied in package
>>> decode_header(S3)
[(b'Man where did you get that', None), (b'assistant?', 'utf-8')]

I can work around this with the following code, but it 
feels a bit too tightly coupled to the package's internal details
(further evidence that email.* can be made to work as is today, 
even if it may be seen as less than ideal aesthetically):

import email.header

def decode_header_to_str(rawheader):               # function name is illustrative
    parts = email.header.decode_header(rawheader)
    decoded = []
    for (part, enc) in parts:                      # for all substrings
        if enc is None:                            # part unencoded?
            if not isinstance(part, bytes):        # str: full hdr unencoded
                decoded.append(part)
            else:                                  # bytes: do unicode decode
                decoded.append(part.decode('raw-unicode-escape'))
        else:                                      # bytes: decode per its charset
            decoded.append(part.decode(enc))
    return ' '.join(decoded)
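
A possibly simpler alternative is to let the Header class do the joining. This is only a sketch: it assumes a Python version in which email.header.make_header accepts the mixed str/bytes pairs that decode_header returns (this holds in current 3.x releases; whether it holds in 3.1 is not established in this thread). The helper name header_to_str is hypothetical.

```python
from email.header import decode_header, make_header

def header_to_str(rawheader):
    """Return the header as one str, whatever mix of str/bytes parts it has."""
    # make_header() builds a Header from the (part, charset) pairs;
    # str() on that Header decodes each part and joins the results.
    return str(make_header(decode_header(rawheader)))

S1 = 'Man where did you get that assistant?'
S2 = '=?utf-8?q?Man_where_did_you_get_that_assistant=3F?='
S3 = 'Man where did you get that =?UTF-8?Q?assistant=3F?='

for raw in (S1, S2, S3):
    print(header_to_str(raw))
```

The caller never has to test for bytes versus str, so the client code is less tied to the package's internals than the explicit loop.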

Thanks,
--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


> -----Original Message-----
> From: lutz at rmi.net
> To: "R. David Murray" <rdmurray at bitdance.com>
> Subject: Re: email package status in 3.X
> Date: Sat, 12 Jun 2010 16:52:32 -0000
> 
> Hi David,
> 
> All sounds good, and thanks again for all your work on this.
> 
> I appreciate the difficulties of moving this package to 3.X
> in a backward-compatible way.  My suggestions stem from the fact 
> that it does work as is today, albeit in a less than ideal way.
> 
> That, and I'm seeing that Python 3.X in general is still having
> a great deal of trouble gaining traction in the "real world" 
> almost 2 years after its release, and I'd hate to see further 
> disincentives for people to migrate.  This is a bigger issue
> than both the email package and this thread, of course.
> 
> > > 3) Type-dependent text part encoding
> > > 
> > ...
> > So, in the next releases of Python all MIMEText input should be string,
> > and it will fail if you pass bytes.  I consider this as email previously
> > not living up to its published API, but do you think I should hack
> > in a way for it to accept bytes too, for backward compatibility in the
> > 3 line?
> 
> Decoding can probably be safely delegated to package clients.
> Typical email clients will probably have str for display of the
> main text.  They may wish to read attachments in binary mode, but
> can always read in text mode instead or decode manually, because 
> they need a known encoding to send the part correctly (my client 
> has to ask or use configurations in some cases).
> 
> Backward compatibility probably isn't a concern; I suspect that my 
> temporary workaround will still work with your patch anyhow, 
> and this code didn't work at all for some encodings before.
> 
> > > There are some additional cases that now require decoding per mail 
> > > headers today due to the str/bytes split, but these are just a 
> > > normal artifact of supporting Unicode character sets in general,
> > > and seem like issues for package clients to resolve (e.g., the bytes 
> > > returned for decoded payloads in 3.X didn't play well with existing 
> > > str-based text processing code written for 2.X).
> > 
> > I'm not following you here.  Can you give me some more specific
> > examples?  Even if these "normal artifacts" must remain with
> > the current API, I'd like to make things as easy as practical when
> > using the new API.
> 
> This was just a general statement about things in my own code that
> didn't jibe with the 3.X string model.  For instance, line wrapping 
> logic assumed str; tkinter text widgets do much better rendering str 
> than the bytes fetched for decoded payloads; and my Pyedit text editor
> component had to be overhauled to handle display/edit/save of payloads 
> of arbitrary encodings.  If I remember any more specific issues with 
> the email package itself, I'll forward your way.
> 
> I'll watch for an opportunity to get the book's new PyMailGUI 
> client code to you as a candidate test case, but please ping 
> me about it later if I haven't acted on this.  It works well,
> but largely because of all the work that went into the email 
> package underlying it.
> 
> Thanks,
> --Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)
> 
> 
> > -----Original Message-----
> > From: "R. David Murray" <rdmurray at bitdance.com>
> > To: lutz at rmi.net
> > Subject: Re: email package status in 3.X
> > Date: Thu, 10 Jun 2010 10:18:48 -0400
> > 
> > On Thu, 10 Jun 2010 09:21:52 -0400, lutz at rmi.net wrote:
> > > In other words, some of my concern may have been a bit premature.  
> > > I hope that in the future we'll either strive for compatibility 
> > > or keep the current version around; it's a lot of very useful code.
> > 
> > The plan is to have a compatibility layer that will accept calls based
> > on the old API and forward appropriately to the new API.  So far I'm
> > thinking I can succeed in doing this in a fairly straightforward manner,
> > but I won't know for sure until I get some more pieces in place.
> > 
> > > In fact, I recommend that any new email package be named distinctly, 
> > 
> > I'm going to avoid that if I can (though the PyPI package will be
> > named email6 when we publish it for public testing).  If, however,
> > it turns out that I can't correctly support both the old and the
> > new API, then I'll have to do that.
> > 
> > > and that the current package be retained for a number of releases to
> > > come.  After all the breakages that 3.X introduced in general, doing
> > > the same to any email-based code seems a bit too much, especially 
> > > given that the current package is largely functional as is.  To me,
> > > after having just used it extensively, fixing its few issues seems 
> > > a better approach than starting from scratch.
> > 
> > Well, the thing is, as you found, existing 2.x code needs to be fixed to
> > correctly handle the distinction between strings and bytes no matter what.
> > The goal is to make it easier to write correct programs, while providing
> > the compatibility layer to make porting smoother.  But I doubt that any
> > non-trivial 2.x email program will port without significant changes,
> > even if the compatibility layer is close to 100% compatible with the
> > current Python3 email package, simply because the previous conflation
> > of text and bytes must be untangled in order to work correctly in
> > Python3, and email involves lots of transitions between text and bytes.
> > 
> > As for "starting from scratch", it is true that the current plan involves
> > considerable changes in the recommended API (in the direction of greater
> > flexibility and power), but I'm hoping that significant portions of the
> > code will carry forward with minor changes, and that this will make it
> > easier to support the old API.
> > 
> > > As far as other issues, the things I found are described below my
> > > signature.  I don't know what the utf-8 issue is that you refer 
> > > to; I'm able to parse and send with this encoding as is without 
> > > problems (both payloads and headers), but I'm probably not using the
> > > interfaces you fixed, and this may be the same as one of the items listed.
> > 
> > It is, see below.
> > 
> > > Another thought: it might be useful to use the book's email client 
> > > as a sort of test case for the package; it's much more rigorous in 
> > > the new edition because it now has to be given 3.X's Unicode model 
> > > (it's about 4,900 lines of code, though not all is email-related).
> > > I'd be happy to donate the code as soon as I find out what the 
> > > copyright will be this time around; it will be at O'Reilly's site
> > > this Fall in any event.
> > 
> > That would be great.  I am planning to write my own sample app to
> > demonstrate the new API, but if I can use yours to test the compatibility
> > layer that will help a lot, since I otherwise have no Python3 email
> > application to test against unless I port something from Python2.
> > 
> > > Major issues I found...
> > > ------------------------------------------------------------------
> > > 1) Str required for parsing, but bytes returned from poplib
> > > 
> > > The initial decode from bytes to str of full mail text; in 
> > > retrospect, probably not a major issue, since original email 
> > > standards called for ASCII.  An 8-bit encoding like Latin-1 is
> > > probably sufficient for most conforming mails.  For the book,
> > > I try a set of different encodings, beginning with an optional
> > > configuration module setting, then ascii, latin-1, and utf-8;
> > > this is probably overkill, but a GUI has to be defensive.
> > 
> > This works (mostly) for conforming email, but some important Python email
> > applications need to deal with non-conforming email.  That's where the
> > inability to parse bytes directly really causes problems.
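
The fallback decoding described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the book's code: the function name decode_full_text and its configured parameter are hypothetical, and the list of bytes stands in for what a POP server would return.

```python
import email

def decode_full_text(raw_bytes, configured=None):
    """Decode raw message bytes to str by trying encodings in order."""
    candidates = [configured] if configured else []
    candidates += ['ascii', 'latin-1', 'utf-8']
    for encoding in candidates:
        try:
            return raw_bytes.decode(encoding)
        except (UnicodeError, LookupError):   # bad bytes or unknown codec name
            continue
    # Not reachable with the list above (latin-1 accepts any byte); kept as a guard.
    return raw_bytes.decode('ascii', errors='replace')

# Stand-in for the list of byte lines fetched from a mail server.
lines = [b'From: someone@example.com', b'Subject: caf\xe9', b'', b'Body text']
text = decode_full_text(b'\n'.join(lines))
msg = email.message_from_string(text)
print(msg['Subject'])
```

Because latin-1 maps every byte value, the utf-8 entry is only useful when it is supplied as the configured first choice. A wrong guess still yields a str, just with the wrong characters, which is why non-conforming mail is a problem for this approach. Releases from 3.2 onward also provide email.message_from_bytes, which parses bytes without the pre-parse decode.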
> > 
> > > 2) Binary attachments encoding
> > > 
> > > The binary attachments byte-to-str issue that you've just
> > > fixed.  As I mentioned, I worked around this by passing in a 
> > > custom encoder that calls the original and runs an extra decode
> > > step.  Here's what my fix looked like in the book; your patch 
> > > may do better, and I will minimally add a note about the 3.1.3
> > > and 3.2 fix for this:
> > 
> > Yeah, our patch was a lot simpler since we could fix the encoding inside
> > the loop producing the encoded lines :)
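
The book's actual workaround is not reproduced in the quote above. The following is only a sketch of the general approach it describes (call the stock encoder, then decode the result to ASCII text), using the hypothetical name encode_base64_to_str and placeholder bytes in place of a real image.

```python
import base64
from email import encoders
from email.mime.image import MIMEImage

def encode_base64_to_str(msg):
    """Run the stock base64 encoder, then make sure the payload is str."""
    encoders.encode_base64(msg)        # sets payload and Content-Transfer-Encoding
    payload = msg.get_payload()
    if isinstance(payload, bytes):     # the situation reported for 3.1
        msg.set_payload(payload.decode('ascii'))

# Placeholder bytes standing in for real image data.
data = bytes(range(256))
part = MIMEImage(data, _subtype='png', _encoder=encode_base64_to_str)

text = part.as_string()                # text generation needs a str payload
restored = base64.b64decode(part.get_payload())
```

On releases that include the fix, the isinstance branch is simply never taken, so the wrapper should be harmless to leave in place.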
> > 
> > > 3) Type-dependent text part encoding
> > > 
> > > There's a str/bytes confusion issue related to Unicode encodings
> > > in text payload generation: some encodings require the payload to
> > > be str, but others expect bytes.  Unfortunately, this means that 
> > > clients need to know how the package will react to the encoding 
> > > that is used, and special-case based upon that.  
> > 
> > This was the UTF-8 bug I fixed.  I shouldn't have called it "the UTF-8
> > bug", because it applies equally to the other charsets that use base64,
> > as you note.  I called it that because UTF-8 was where the problem was
> > noticed and is mentioned in the title of the bug report.
> > 
> > I had a suspicion that the quoted-printable encoding wasn't being done
> > correctly either, so to hear that it is working for you is good news.
> > There may still be bugs to find there, though.
> > 
> > So, in the next releases of Python all MIMEText input should be string,
> > and it will fail if you pass bytes.  I consider this as email previously
> > not living up to its published API, but do you think I should hack
> > in a way for it to accept bytes too, for backward compatibility in the
> > 3 line?
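
A small sketch of the str-only usage described here, assuming a release that includes the fix: the text goes in as str together with a charset name, and comes back out as bytes that the client decodes itself. The sample text is arbitrary.

```python
from email.mime.text import MIMEText

body = 'caf\u00e9 \u4f60\u597d'             # str input, not bytes
msg = MIMEText(body, _subtype='plain', _charset='utf-8')

wire_text = msg.as_string()                 # headers plus encoded body, all str
raw = msg.get_payload(decode=True)          # bytes, transfer encoding undone
restored = raw.decode(msg.get_content_charset())
```

The client never needs to know which transfer encoding the package picked for the charset; it only needs the charset name to turn the decoded payload back into str.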
> > 
> > > There are some additional cases that now require decoding per mail 
> > > headers today due to the str/bytes split, but these are just a 
> > > normal artifact of supporting Unicode character sets in general,
> > > and seem like issues for package clients to resolve (e.g., the bytes 
> > > returned for decoded payloads in 3.X didn't play well with existing 
> > > str-based text processing code written for 2.X).
> > 
> > I'm not following you here.  Can you give me some more specific
> > examples?  Even if these "normal artifacts" must remain with
> > the current API, I'd like to make things as easy as practical when
> > using the new API.
> > 
> > Thanks for all your feedback!
> > 
> > --David
> > 
> 
> 
> 
> 


From lutz at rmi.net  Wed Jun 16 22:48:49 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Wed, 16 Jun 2010 20:48:49 -0000
Subject: [Email-SIG] email package status in 3.X
Message-ID: <6wwifklfk7n7tup216062010044853@SMTP>

[copied to pydev from email-sig because of the broader scope]

Well, it looks like I've stumbled onto the "other shoe" on this
issue--that the email package's problems are also apparently 
behind the fact that CGI binary file uploads don't work in 3.1
(http://bugs.python.org/issue4953).  Yikes.

I trust that people realize this is a show-stopper for broader
Python 3.X adoption.  Why 3.0 was rolled out anyhow is beyond 
me; it seems that it would have been better if Python developers
had gotten their own code to work with 3.X, before expecting the 
world at large to do so.

FWIW, after rewriting Programming Python for 3.1, 3.x still feels
a lot like a beta to me, almost 2 years after its release.  How
did this happen?  Maybe nobody is using 3.X enough to care, but 
I have a feeling that issues like this are part of the reason why.

No offense to people who obviously put in an incredible amount of
work on 3.X.  As someone who remembers 0.X, though, it's hard not
to find the current situation a bit disappointing.

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)




From ncoghlan at gmail.com  Wed Jun 16 23:47:27 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Jun 2010 07:47:27 +1000
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP>
References: <6wwifklfk7n7tup216062010044853@SMTP>
Message-ID: <AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>

On Thu, Jun 17, 2010 at 6:48 AM,  <lutz at rmi.net> wrote:
> I trust that people realize this is a show-stopper for broader
> Python 3.X adoption.  Why 3.0 was rolled out anyhow is beyond
> me; it seems that it would have been better if Python developers
> had gotten their own code to work with 3.X, before expecting the
> world at large to do so.
>
> FWIW, after rewriting Programming Python for 3.1, 3.x still feels
> a lot like a beta to me, almost 2 years after its release.  How
> did this happen?  Maybe nobody is using 3.X enough to care, but
> I have a feeling that issues like this are part of the reason why.
>
> No offense to people who obviously put in an incredible amount of
> work on 3.X.  As someone who remembers 0.X, though, it's hard not
> to find the current situation a bit disappointing.

Agreed, but the binary/text distinction in 2.x (or rather, the lack
thereof) makes the unicode handling situation so hopelessly confused
that there is a lot of 2.x code (including in the standard library)
that silently mixes the two, often without really testing the
consequences (as clearly happened here).

3.x was rolled out anyway because the vast majority of it works.
Obviously people affected by the problems specific to the email
package and any other binary vs text parsing problems that are still
lingering are out of luck at the moment, but leaving 3.x sitting on a
shelf indefinitely would hardly have inspired anyone to clean it up.
My personal perspective is that a lot of that code was likely already
broken in hard to detect ways when dealing with mixed encodings -
releasing 3.x just made the associated errors significantly easier to
detect.

If we end up being able to add your email client code to the standard
library's unit test suite, that should help the situation immensely.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From barry at python.org  Thu Jun 17 17:43:29 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 17 Jun 2010 11:43:29 -0400
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP>
References: <6wwifklfk7n7tup216062010044853@SMTP>
Message-ID: <20100617114329.254db9ac@heresy>

On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote:

>Well, it looks like I've stumbled onto the "other shoe" on this
>issue--that the email package's problems are also apparently 
>behind the fact that CGI binary file uploads don't work in 3.1
>(http://bugs.python.org/issue4953).  Yikes.
>
>I trust that people realize this is a show-stopper for broader
>Python 3.X adoption.

We know it, we have extensively discussed how to fix it, we have IMO a good
design, and we even have someone willing and able to tackle the problem.  We
need to find a sufficient source of funding to enable him to do the work it
will take, and so far that's been the biggest stumbling block.  It will take a
focused and determined effort to see this through, and it's obvious that
volunteers cannot make it happen.  I include myself in the latter category, as
I've tried and failed at least twice to do it in my spare time.

-Barry

From brett at python.org  Thu Jun 17 21:24:54 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 17 Jun 2010 12:24:54 -0700
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <20100617114329.254db9ac@heresy>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy>
Message-ID: <AANLkTimiDfufv_7SaBFrsJb5HnFUur4wzLiqlA0Phk-X@mail.gmail.com>

On Thu, Jun 17, 2010 at 08:43, Barry Warsaw <barry at python.org> wrote:
> On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote:
>
>>Well, it looks like I've stumbled onto the "other shoe" on this
>>issue--that the email package's problems are also apparently
>>behind the fact that CGI binary file uploads don't work in 3.1
>>(http://bugs.python.org/issue4953).  Yikes.
>>
>>I trust that people realize this is a show-stopper for broader
>>Python 3.X adoption.
>
> We know it, we have extensively discussed how to fix it, we have IMO a good
> design, and we even have someone willing and able to tackle the problem.  We
> need to find a sufficient source of funding to enable him to do the work it
> will take, and so far that's been the biggest stumbling block.  It will take a
> focused and determined effort to see this through, and it's obvious that
> volunteers cannot make it happen.  I include myself in the latter category, as
> I've tried and failed at least twice to do it in my spare time.

And in general I think this is the reason some modules have not
transitioned as well as others: there are only so many of us. The
stdlib passes its test suite, but obviously some unit tests do not
cover enough of the code in the ways people need it covered.

As for using Python 3 for my code, I do and have since Python 3 became
more-or-less usable. I just happen to not work with internet-related
stuff in my day-to-day work.

Plus we have needed to maintain FOUR branches for a while. That is a
nasty time sink when you are having to port bug fixes and such. It
also means that python-dev has been focused on making sure Python 2.7
is a solid release instead of getting to focus on the stdlib in Python
3. This is a nasty chicken-and-egg issue; we could ignore Python 2 and
focus on Python 3, but then the community would complain about us not
supporting the transition from 2 to 3 better, but obviously focusing
on 2 has led to 3 not getting enough TLC.

Once Python 2.7 is done and out the door the entire situation for
Python 3 should start to improve as python-dev as a whole will have a
chance to begin to focus solely on Python 3.

From steve at holdenweb.com  Fri Jun 18 04:32:51 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 18 Jun 2010 11:32:51 +0900
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <20100617114329.254db9ac@heresy>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy>
Message-ID: <4C1ADAD3.9070808@holdenweb.com>

Barry Warsaw wrote:
> On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote:
> 
>> Well, it looks like I've stumbled onto the "other shoe" on this
>> issue--that the email package's problems are also apparently 
>> behind the fact that CGI binary file uploads don't work in 3.1
>> (http://bugs.python.org/issue4953).  Yikes.
>>
>> I trust that people realize this is a show-stopper for broader
>> Python 3.X adoption.
> 
> We know it, we have extensively discussed how to fix it, we have IMO a good
> design, and we even have someone willing and able to tackle the problem.  We
> need to find a sufficient source of funding to enable him to do the work it
> will take, and so far that's been the biggest stumbling block.  It will take a
> focused and determined effort to see this through, and it's obvious that
> volunteers cannot make it happen.  I include myself in the latter category, as
> I've tried and failed at least twice to do it in my spare time.
> 
> -Barry
> 
Lest the readership think that the PSF is unaware of this issue, allow
me to point out that we have already partially funded this effort, and
are still offering R. David Murray some further matching funds if he can
raise sponsorship to complete the effort (on which he has made a very
promising start).

We are also attempting to enable tax-deductible fund raising to increase
the likelihood of David's finding support. Perhaps we need to think
about a broader campaign to increase the quality of the python 3
libraries. I find it very annoying that the #python IRC group still has
"Don't use Python 3" in its topic.  They adamantly refuse to remove it
until there is better library support, and they are the guys who see the
issues day in day out so it is hard to argue with them (and I don't
think an autocratic decision-making process would be appropriate).

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From stephen at xemacs.org  Fri Jun 18 07:52:17 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 18 Jun 2010 14:52:17 +0900
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP>
References: <6wwifklfk7n7tup216062010044853@SMTP>
Message-ID: <87d3volwfi.fsf@uwakimon.sk.tsukuba.ac.jp>

lutz at rmi.net writes:

 > FWIW, after rewriting Programming Python for 3.1, 3.x still feels
 > a lot like a beta to me, almost 2 years after its release.

Email, of course, is a big wart.  But guess what?  Python 2's email
module doesn't actually work!  Sure, the program runs most of the
time, but every program that depends on email must acquire inches of
armorplate against all the things that can go wrong.  You simply can't
rely on it to DTRT except in a pre-MIME, pre-HTML, ASCII-only world.
Although they're often addressing general problems, these hacks are
*not* integrated back into the email module in most cases, but remain
app-specific voodoo.

If you live in Kansas, sure, you can concentrate on dodging tornados
and completely forget about Unicode and MIME and text/bogus content.
For the rest of the world, though, the problem is not Python 3.  It's
STD 11 (which still points at RFC 822, dated 1982!)  It's really
inappropriate to point at the email module, whose developers are
trying *not* to punt on conformance and robustness, when even the IETF
can only "run in circles, scream and shout"!

Maybe there are other problems with Python 3 that deserve to be
pointed at, but given the general scarcity of resources I think the
email module developers are working on the right things.  Unlike many
other modules, email really needs to be rewritten from the ground
(Python 3) up, because of the centrality of bytes/unicode confusion to
all email problems.  Python 3 completely changes the assumptions
there; a Python 2-style email module really can't work properly.
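
The change in assumptions can be seen in a few lines. This is only a
minimal sketch, not code from the email package; the header text and
variable names are invented for illustration. In Python 3, bytes read
off the wire and str text no longer mix implicitly, so the decode step
has to be explicit:

```python
# Hypothetical header line; in Python 3 this literal is Unicode text (str).
subject = "Subject: caf\u00e9"

# What travels over the network is bytes, produced by an explicit encode.
wire = subject.encode("utf-8")

# Python 2 would silently coerce when mixing the two types;
# Python 3 raises TypeError instead.
try:
    wire + " (draft)"
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Getting text back requires naming the charset explicitly.
decoded = wire.decode("utf-8")
```

A Python 2-style module that treats one string type as both text and
wire data has no natural place to put those encode and decode calls.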

Then on top of that, today we know a lot more about handling issues
like text/html content and MIME in general than when the Python 2
email module was designed.  New problems have arisen over the period
of Python 3 development, like "domain keys", which email doesn't
handle out of the box AFAIK, but email for Python 3 should IMHO.

Should Python 3 have been held back until email was fixed?  Dunno, but
I personally am very glad it was not; where I have a choice, I always
use Python 3 now, and have yet to run into a problem.  I expect that
to change if I can find the time to get involved in email and Mailman
3 development, of course.<wink>


From arcriley at gmail.com  Fri Jun 18 05:16:47 2010
From: arcriley at gmail.com (Arc Riley)
Date: Thu, 17 Jun 2010 23:16:47 -0400
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <4C1ADAD3.9070808@holdenweb.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy> <4C1ADAD3.9070808@holdenweb.com>
Message-ID: <AANLkTilSNPgK2GnutYbxKdIPWZY82v6LFus3fzat7Fzk@mail.gmail.com>

David and his Google Summer of Code student, Shashwat Anand.

You can read Shashwat's weekly progress updates at http://l0nwlf.in/ or
subscribe to http://twitter.com/l0nwlf for more micro updates.

We have more than 30 paid students working on Python 3 tasks this year, most
of them participating under the PSF umbrella but also a few with 3rd party
projects such as Mercurial porting those various packages to Py3.

Given all this "on the horizon" work, I think the Py3 package situation will
look a lot brighter by Python 3.2's release.


On Thu, Jun 17, 2010 at 10:32 PM, Steve Holden <steve at holdenweb.com> wrote:

>
> Lest the readership think that the PSF is unaware of this issue, allow
> me to point out that we have already partially funded this effort, and
> are still offering R. David Murray some further matching funds if he can
> raise sponsorship to complete the effort (on which he has made a very
> promising start).
>
> We are also attempting to enable tax-deductible fund raising to increase
> the likelihood of David's finding support.
>

From barry at python.org  Fri Jun 18 15:45:57 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 18 Jun 2010 09:45:57 -0400
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <4C1ADAD3.9070808@holdenweb.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy> <4C1ADAD3.9070808@holdenweb.com>
Message-ID: <20100618094557.77a07994@heresy>

On Jun 18, 2010, at 11:32 AM, Steve Holden wrote:

>Lest the readership think that the PSF is unaware of this issue, allow
>me to point out that we have already partially funded this effort, and
>are still offering R. David Murray some further matching funds if he can
>raise sponsorship to complete the effort (on which he has made a very
>promising start).

Right, sorry, I didn't mean to imply the PSF isn't doing anything.  More that
we need a coordinated effort among all the companies and organizations that
use Python to help fund Python 3 library development (and not just in the
stdlib).  I think the PSF is best suited to coordinating and managing those
efforts, and through its tax-exempt status, collecting and distributing
donations specifically targeted to Python 3 work.

-Barry

From lutz at rmi.net  Fri Jun 18 17:09:40 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Fri, 18 Jun 2010 15:09:40 -0000
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
Message-ID: <cvsjrr4t84x35d3418062010110947@SMTP>

Replying en masse to save bandwidth here...

Barry Warsaw <barry at python.org> writes:
> We know it, we have extensively discussed how to fix it, we have IMO a good
> design, and we even have someone willing and able to tackle the problem.  We
> need to find a sufficient source of funding to enable him to do the work it
> will take, and so far that's been the biggest stumbling block.  It will take a
> focused and determined effort to see this through, and it's obvious that
> volunteers cannot make it happen.  I include myself in the latter category, as
> I've tried and failed at least twice to do it in my spare time.

All understood, and again, not to disparage anyone here.  My 
comments are directed to the development community at large
to underscore the grave p/r problems 3.X faces.

I realize email parsing is a known issue; I also realize that
most people evaluating 3.X today won't care that it is.  Most
will care only that the new version of a language reportedly 
used by Google and YouTube still doesn't support CGI uploads 
a year and a half after its release.  As an author, that's a 
downright horrible story to have to tell the world.


"Stephen J. Turnbull" <stephen at xemacs.org> writes:
> Email, of course, is a big wart.  But guess what?  Python 2's email
> module doesn't actually work! 

Yes it does (see next point).

> If you live in Kansas, sure, you can concentrate on dodging tornados
> and completely forget about Unicode and MIME and text/bogus content.
> For the rest of the world, though, the problem is not Python 3

Yes it is, and Kansas is a lot bigger than you seem to think.

I want to reiterate that I was able to build a feature rich
email client with the email package as it exists in 3.1.  This
includes support on both the receiving and sending sides for HTML,
arbitrary attachments, and decoding and encoding of both text 
payloads and headers according to email, MIME, and Unicode/I18N
standards.  It's an amazingly useful package, and does work as is
in 3.X.  The two main issues I found have been recently fixed.  
It's unfortunate that this package is also the culprit behind CGI
breakage, but it's not clear why it became a critical path for so
much utility in the first place.
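
As a rough illustration of the sending and receiving steps described
above (not the book's code; the addresses, subject, and attachment
bytes are made up, and a current Python 3 in which the binary-part
generation bug has been fixed is assumed), a message with an encoded
header, a Unicode text part, and a binary attachment can be built and
parsed back like this:

```python
from email import message_from_string
from email.header import Header, decode_header
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Sending side: non-ASCII header, UTF-8 text part, arbitrary binary part.
msg = MIMEMultipart()
msg["Subject"] = Header("Gr\u00fc\u00dfe", "utf-8")   # hypothetical subject
msg["From"] = "sender@example.com"                    # placeholder address
msg["To"] = "rcpt@example.com"                        # placeholder address
msg.attach(MIMEText("Text body with non-ASCII: caf\u00e9", "plain", "utf-8"))
msg.attach(MIMEApplication(b"\x00\x01\xff binary", Name="blob.bin"))

# Generate the full message text (binary part is base64-encoded).
wire_text = msg.as_string()

# Receiving side: parse, then decode the header and both payloads.
parsed = message_from_string(wire_text)
raw, charset = decode_header(parsed["Subject"])[0]
subject = raw.decode(charset)

text_part, binary_part = parsed.get_payload()
body = text_part.get_payload(decode=True).decode(
    text_part.get_content_charset())
blob = binary_part.get_payload(decode=True)
```

Note that the charset has to be carried along and applied by hand at
each decode step, which is where most of the str/bytes care goes.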

The package might not be aesthetically ideal, but to me it 
seems that an utterly incompatible overhaul of this in the name
of supporting potentially very different data streams is a huge
functional overload.  And to those people in Kansas who live 
outside the pydev clique, replacing it with something different 
at this point will look as if an incompatible Python is already 
incompatible with releases in its own line.  Why in the world 
would anyone base a new project on that sort of thrashing?

For my part, I've had to add far too many notes to the upcoming
edition of Programming Python about major pieces of functionality
that worked in 2.X but no longer do in 3.X.  That's disappointing
to me personally, but it will probably seem a lot worse to the
book's tens of thousands of readers.  Yet this is the reality 
that 3.X has created for itself.

> Should Python 3 have been held back until email was fixed?  Dunno, but
> I personally am very glad it was not; where I have a choice, I always
> use Python 3 now, and have yet to run into a problem. 

I guess we'll just have to disagree on that.  IMHO, Python 3 shot
itself in the foot by releasing in half-baked form.  And the 3.0 
I/O speed issue (remember that?) came very close to blowing its 
leg clean off.

The reality out there in Kansas today is that 3.X is perceived as 
so bad that it could very well go the way of POP4 if its story does
not improve.  I don't know what sort of Python world will be left
behind in the wake, but I do know it will probably be much smaller.


Steve Holden <steve at holdenweb.com> writes:
> Lest the readership think that the PSF is unaware of this issue, allow
> me to point out that we have already partially funded this effort, and
> are still offering R. David Murray some further matching funds if he can
> raise sponsorship to complete the effort (on which he has made a very
> promising start).
> 
> We are also attempting to enable tax-deductible fund raising to increase
> the likelihood of David's finding support. Perhaps we need to think
> about a broader campaign to increase the quality of the python 3
> libraries. I find it very annoying that the #python IRC group still has
> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
> until there is better library support, and they are the guys who see the
> issues day in day out so it is hard to argue with them (and I don't
> think an autocratic decision-making process would be appropriate).

I'm all for people getting paid for work they do, but with all
due respect, I think this underscores part of the problem in 
the Python world today.  If funding had been as stringent a 
prerequisite in the 90s, I doubt there would be a Python today.
It was about the fun and the code, not the bucks and the 
bureaucracy.  As far as I can recall, there was no notion of 
creating a task force to get things done.

Of course, this may just be the natural evolutionary pattern of 
human enterprises.  As it is today, though, the Python community 
has a formal diversity statement, but it still does not have a 
fully functional 3.X almost two years after the fact.  I doubt
that I'm the only one who sees the irony in that.

Again, I mean no disrespect to people contributing to Python 
today on so many fronts, and I have no answers to offer here. 
For better or worse, though, this is a personal issue to me too.
After spending much of the last 2 years updating the best selling 
Python books for all the changes this group has seen fit to make, 
I believe I can say with some authority that 3.X still faces a
very uncertain future.

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


From lutz at rmi.net  Fri Jun 18 19:22:10 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Fri, 18 Jun 2010 17:22:10 -0000
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
Message-ID: <h3sa87mevl05p5ro18062010012216@SMTP>

> Python 3.0 was *declared* to be an experimental release, and by most 
> standards 3.1 (in terms of the core language and functionality) was a 
> solid release.
> 
> Any reasonable expectation about Python 3 adoption predicted that it 
> would take years, and would include going through a phase of difficulty 
> and disappointment...

Declaring something to be a turd doesn't change the fact that
it's a turd.  I have a feeling that most people outside this
list would have much rather avoided the difficulty and 
disappointment altogether.

Let's be honest here; 3.X was released to the community in part 
as an extended beta.  That's not a problem, unless you drop the 
word "beta".  And if you're still not buying that, imagine the sort
of response you'd get if you tried to sell software that billed 
itself as "experimental", and promised a phase of "disappointment".  
Why would you expect the Python world to react any differently?

> Whilst I agree that there are plenty of issues to work on, and I don't 
> underestimate the difficulty of some of them, I think "half-baked" is 
> very much overblown. Whilst you have a lot to say about how much of a 
> problem this is I don't understand what you are suggesting be *done*?

I agree that 3.X isn't all bad, and I very much hope it succeeds.  And 
no, I have no answers; I'm just reporting the perception from downwind.

So here it is: The prevailing view is that 3.X developers hoisted things
on users that they did not fully work through themselves.  Unicode is 
prime among these: for all the talk here about how 2.X was broken in 
this regard, the implications of the 3.X string solution remain to be
fully resolved in the 3.X standard library to this day.  What is a 
common Python user to make of that?

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


From fuzzyman at voidspace.org.uk  Fri Jun 18 17:31:09 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 18 Jun 2010 16:31:09 +0100
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <cvsjrr4t84x35d3418062010110947@SMTP>
References: <cvsjrr4t84x35d3418062010110947@SMTP>
Message-ID: <4C1B913D.60401@voidspace.org.uk>

On 18/06/2010 16:09, lutz at rmi.net wrote:
> Replying en masse to save bandwidth here...
>
> Barry Warsaw<barry at python.org>  writes:
>    
>> We know it, we have extensively discussed how to fix it, we have IMO a good
>> design, and we even have someone willing and able to tackle the problem.  We
>> need to find a sufficient source of funding to enable him to do the work it
>> will take, and so far that's been the biggest stumbling block.  It will take a
>> focused and determined effort to see this through, and it's obvious that
>> volunteers cannot make it happen.  I include myself in the latter category, as
>> I've tried and failed at least twice to do it in my spare time.
>>      
> All understood, and again, not to disparage anyone here.  My
> comments are directed to the development community at large
> to underscore the grave p/r problems 3.X faces.
>
> I realize email parsing is a known issue; I also realize that
> most people evaluating 3.X today won't care that it is.  Most
> will care only that the new version of a language reportedly
> used by Google and YouTube still doesn't support CGI uploads
> a year and a half after its release.  As an author, that's a
> downright horrible story to have to tell the world.
>
>    

Really? How widely used is the CGI module these days? Maybe there is a 
reason nobody appeared to notice...


> [snip...]
>> Should Python 3 have been held back until email was fixed?  Dunno, but
>> I personally am very glad it was not; where I have a choice, I always
>> use Python 3 now, and have yet to run into a problem.
>>      
> I guess we'll just have to disagree on that.  IMHO, Python 3 shot
> itself in the foot by releasing in half-baked form.  And the 3.0
> I/O speed issue (remember that?) came very close to blowing its
> leg clean off.
>
>    

Whilst I agree that there are plenty of issues to work on, and I don't 
underestimate the difficulty of some of them, I think "half-baked" is 
very much overblown. Whilst you have a lot to say about how much of a 
problem this is I don't understand what you are suggesting be *done*?

Python 3.0 was *declared* to be an experimental release, and by most 
standards 3.1 (in terms of the core language and functionality) was a 
solid release.

Any reasonable expectation about Python 3 adoption predicted that it 
would take years, and would include going through a phase of difficulty 
and disappointment...

All the best,

Michael Foord

-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From pje at telecommunity.com  Fri Jun 18 22:48:21 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 18 Jun 2010 16:48:21 -0400
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <h3sa87mevl05p5ro18062010012216@SMTP>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
Message-ID: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>

At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>So here it is: The prevailing view is that 3.X developers hoisted things
>on users that they did not fully work through themselves.  Unicode is
>prime among these: for all the talk here about how 2.X was broken in
>this regard, the implications of the 3.X string solution remain to be
>fully resolved in the 3.X standard library to this day.  What is a
>common Python user to make of that?

Certainly, this was my impression as well, after all the Web-SIG 
discussions regarding the state of the stdlib in 3.x with respect to 
URL parsing, joining, opening, etc.

To be honest, I'm waiting to see some sort of tutorial(s) for using 
3.x that actually addresses these kinds of stdlib usage issues, so 
that I don't have to think about it or futz around with 
experimenting, possibly to find that some things can't be done at all.

IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be 
obvious ways to do it, but, as per the Zen of Python, "that way may 
not be obvious at first unless you're Dutch".  ;-)

Since at the moment Python 3 offers me only cosmetic improvements 
over 2.x (apart from argument annotations), it's hard to get excited 
enough about it to want to muck about with porting anything to it, or 
even trying to learn about all the ramifications of the changes.  :-(


From jnoller at gmail.com  Fri Jun 18 23:02:09 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 18 Jun 2010 17:02:09 -0400
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
Message-ID: <AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>

On Fri, Jun 18, 2010 at 4:48 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>>
>> So here it is: The prevailing view is that 3.X developers hoisted things
>> on users that they did not fully work through themselves.  Unicode is
>> prime among these: for all the talk here about how 2.X was broken in
>> this regard, the implications of the 3.X string solution remain to be
>> fully resolved in the 3.X standard library to this day.  What is a
>> common Python user to make of that?
>
> Certainly, this was my impression as well, after all the Web-SIG discussions
> regarding the state of the stdlib in 3.x with respect to URL parsing,
> joining, opening, etc.

Nothing is set in stone; if something is incredibly painful, or worse
yet broken, then someone needs to file a bug, bring it to this list,
or bring up a patch. This is code we're talking about - nothing is set
in stone, and if something is criminally broken it needs to be first
identified, and then fixed.

> To be honest, I'm waiting to see some sort of tutorial(s) for using 3.x that
> actually addresses these kinds of stdlib usage issues, so that I don't have
> to think about it or futz around with experimenting, possibly to find that
> some things can't be done at all.

I guess tutorial welcome, rather than patch welcome then ;)

> IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be obvious
> ways to do it, but, as per the Zen of Python, "that way may not be obvious
> at first unless you're Dutch".  ;-)

What areas? We need specifics which can either be:

1> Shot down.
2> Turned into bugs, so they can be fixed
3> Documented in the core documentation.

jesse

From nyamatongwe at gmail.com  Sat Jun 19 00:31:40 2010
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Sat, 19 Jun 2010 08:31:40 +1000
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <4C1B913D.60401@voidspace.org.uk>
References: <cvsjrr4t84x35d3418062010110947@SMTP>
	<4C1B913D.60401@voidspace.org.uk>
Message-ID: <AANLkTimhJqiLdapFKKOD9OT9YBWC0BYNnS2D_si8ruDV@mail.gmail.com>

Michael Foord:

> Python 3.0 was *declared* to be an experimental release, and by most
> standards 3.1 (in terms of the core language and functionality) was a solid
> release.

   That looks to me like an after-the-event rationalization. The
release note for Python 3.0 (and the "What's new") gives no indication
that it is experimental but does say """
We are confident that Python 3.0 is of the same high quality as our
previous releases ...
you can safely choose either version (or both) to use in your projects. """
http://mail.python.org/pipermail/python-dev/2008-December/083824.html

   Neil

From stephen at xemacs.org  Sat Jun 19 15:55:29 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 19 Jun 2010 22:55:29 +0900
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <h3sa87mevl05p5ro18062010012216@SMTP>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
Message-ID: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>

lutz at rmi.net writes:

 > I agree that 3.X isn't all bad, and I very much hope it succeeds.  And 
 > no, I have no answers; I'm just reporting the perception from downwind.

The fact is, though, that many of your "downwind" readers are not the
audience for Python 3, not yet.  If you want to do Python 3 a favor,
make sure that they understand that Python 3 is *not* an "upgrade" of
Python 2.  It's a hard task for you, but IMO one strategy is to write
in the style that we wrote the DVCS PEP (#374) in: here's how you do
the same task in these similar languages.  And just as git and Bazaar
turned out to have fatal defects in terms of adoption *in that time
frame*, Python 3 is not yet adoptable for many, many users.

Python 3 is a Python-2-like language, but even though it's built on
the same design principles, and uses nearly identical syntax, there
are fundamental differences.  And it is *very* young.  So it's a new
language and should be approached in the same way as any new language.
Try it on non-mission critical projects, on projects where its library
support has a good reputation, etc.  Many of your readers have no time
(or perhaps no approval "from upstairs") for that kind of thing.  Too
bad, but that's what happens to every great new language.

 > So here it is: The prevailing view is that 3.X developers hoisted things
 > on users that they did not fully work through themselves.  Unicode is 
 > prime among these: for all the talk here about how 2.X was broken in 
 > this regard, the implications of the 3.X string solution remain to be
 > fully resolved in the 3.X standard library to this day.  What is a 
 > common Python user to make of that?

Why should she make anything of that?  Python 3 is a *new* language,
possibly as different from Python 2 as C++ was from C (and *more*
different in terms of fundamental incompatibilities).  And as long as
C++ was almost entirely dependent on C libraries, there were problems.
(Not to mention that even today there are plenty of programmers who
are proud to be C programmers, not C++ programmers.)  Today, Python 3
is entirely dependent on Python 2 libraries.  It's human to hope there
will be no problems, but not realistic.

BTW, I think what you're missing is that you're wrong about the money.
Python 3 is still about the fun and the code.  "Fun and code" are why
the core developers spent about five years developing it, because
doing that was fun, because the new code has high value as code, and
because it promised *them* a more fun and more productive future.

Library support, on the other hand, *is* about money.  Your readers,
down in the trenches of WWW, intraweb, and sysadmin implementation and
support, depend on robust libraries to get their day jobs done.  They
really don't care that writing Python 3 was fun, and that programming
in Python 3 is more fun than ever.  That doesn't compensate for even
one lingering str/bytes bogosity to most of them, and since they don't
get paid for fixing Python library bugs, they don't, and they're in no
mood to *forgive* any, either.

So tell users who feel that way to use Python 2, for now, and check on
Python 3 progress every 6 months or so.  And users who are just a bit
more adventurous to stick to applications where the libraries already
have a good reputation *in Python 3*.  It's as simple as that, I think.

Regards,


From pje at telecommunity.com  Sat Jun 19 18:07:43 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sat, 19 Jun 2010 12:07:43 -0400
Subject: [Email-SIG] [Python-Dev] email package status in 3.X
In-Reply-To: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100619160755.4E50C3A4060@sparrow.telecommunity.com>

At 10:55 PM 6/19/2010 +0900, Stephen J. Turnbull wrote:
>They really don't care that writing Python 3 was fun, and that 
>programming in Python 3 is more fun than ever.  That doesn't 
>compensate for even one lingering str/bytes bogosity to most of 
>them, and since they don't get paid for fixing Python library bugs, 
>they don't, and they're in no mood to *forgive* any, either.

This is pretty much where I'm at, except that the only potential fun 
increases Py3 appears to offer me are argument annotations and 
keyword-only args -- but these are partly balanced by the loss of 
argument tuple unpacking.  The metaclass keyword argument is nice, 
but the loss of dynamically-settable __metaclass__ is just plain annoying.
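
For anyone who hasn't tried these features yet, here is a minimal
Python 3 sketch of the ones named above. The function and class names
are invented for illustration and come from no real project:

```python
def scale(value: float, *, factor: float = 2.0) -> float:
    """Argument and return annotations; 'factor' is keyword-only."""
    return value * factor


# Python 2 allowed "def norm((x, y)): ..."; Python 3 removed tuple
# parameters, so the unpacking moves into the function body.
def norm(point):
    x, y = point
    return (x * x + y * y) ** 0.5


class Meta(type):
    """Trivial metaclass that tags every class it creates."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        cls.tagged = True
        return cls


# Python 3 spelling: the metaclass is a keyword in the class statement.
# A "__metaclass__ = Meta" attribute in the class body is simply ignored.
class Widget(metaclass=Meta):
    pass
```

Calling scale(3.0, 3.0) raises TypeError because factor must be passed
by keyword, and the annotations are available at runtime through
scale.__annotations__.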

Really, just about everything that Py3 offers in the way of added 
fun seems offset by a matching loss somewhere else.  So it's hard to 
get excited about it - it seems like, "ho hum, a new language that's 
kind of like Python, but just different enough to be annoying."

OTOH, I don't know what to do about that, besides adding some sort of 
"killer app" feature that makes Python 3 the One Obvious Way to do 
some specific application domain.