From rdmurray at bitdance.com Fri Jun 4 18:39:50 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 04 Jun 2010 12:39:50 -0400
Subject: [Email-SIG] email package status in 3.X
Message-ID: <20100604163950.5AFE1217FD6@kimball.webabinitio.net>

On Mon May 10 20:02:46 CEST 2010 Mark Lutz wrote:
> I'm probably going to have to go ahead and finish the book
> with the email package as it is now, and include a lot of
> caveats about the problems that a new version may fix in the
> future. I can also post updated example code if/when possible.
>
> I realize everybody on this list probably knows this already,
> but email in 3.X not only doesn't support the Unicode/bytes
> dichotomy, it was also broken by it. Beyond the pre-parse
> decode issue, its mail text generation really only works for
> all-text mails. Generating text of an email with any sort of
> binary part doesn't work at all now, because the base64 text
> is still bytes, and the Generator expects str. I've coded a
> custom encoder to pass to MIMEImage that works around this
> by decoding to ASCII, but it's not a great story to have to
> tell the tens of thousands of readers of this book, many of
> whom will be evaluating 3.X in general.

This bug should now be fixed in both the py3k branch and the 3.1
maint branch. This means the fix will be in 3.1.3, as well as 3.2a1.
Hopefully that will be in time for your book, since 3.2a1 is due June
27th and I'm guessing the 3.1.3 release will be some time not too far
off that time frame as well. FYI I also fixed a related bug that made
using utf-8 as a charset problematic. Unfortunately I suspect there
may be some other charset issues waiting to be discovered.

If you have come across any other bugs that don't already have
issues in the tracker please file bug reports. Anything that
can be fixed in the current package I will endeavor to fix
before the next release. Feel free also to indicate bugs which
should be given priority.

--
R. David Murray www.bitdance.com

From lutz at rmi.net Thu Jun 10 15:21:52 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Thu, 10 Jun 2010 09:21:52 -0400 (GMT-04:00)
Subject: [Email-SIG] email package status in 3.X
Message-ID: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>

Thanks, David; that's great news. I'll update the book draft
accordingly.

For the record, despite the issues, I was able to complete a fairly
full-featured email client GUI with the email package as it currently
is. This includes parsing and generating arbitrary attachments, as
well as encoding on sends and decoding on fetches for both text
payloads and I18N mail headers.

The package is still quite powerful as is. It does take a bit of
digging to figure out how to use its many tools, but the book will
probably help on this front, especially the upcoming edition's more
complete application.

In other words, some of my concern may have been a bit premature.
I hope that in the future we'll either strive for compatibility
or keep the current version around; it's a lot of very useful code.

In fact, I recommend that any new email package be named distinctly,
and that the current package be retained for a number of releases to
come. After all the breakages that 3.X introduced in general, doing
the same to any email-based code seems a bit too much, especially
given that the current package is largely functional as is. To me,
after having just used it extensively, fixing its few issues seems
a better approach than starting from scratch.

As far as other issues, the things I found are described below my
signature. I don't know what the utf-8 issue is that you refer
to; I'm able to parse and send with this encoding as is without
problems (both payloads and headers), but I'm probably not using the
interfaces you fixed, and this may be the same as one of the items
listed.

Another thought: it might be useful to use the book's email client
as a sort of test case for the package; it's much more rigorous in
the new edition because it now has to be given 3.X's Unicode model
(it's about 4,900 lines of code, though not all is email-related).
I'd be happy to donate the code as soon as I find out what the
copyright will be this time around; it will be at O'Reilly's site
this Fall in any event.

Thanks,
--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)


Major issues I found...

------------------------------------------------------------------
1) Str required for parsing, but bytes returned from poplib

The initial decode from bytes to str of full mail text; in
retrospect, probably not a major issue, since original email
standards called for ASCII. An 8-bit encoding like Latin-1 is
probably sufficient for most conforming mails. For the book,
I try a set of different encodings, beginning with an optional
configuration module setting, then ascii, latin-1, and utf-8;
this is probably overkill, but a GUI has to be defensive.

------------------------------------------------------------------
2) Binary attachments encoding

The binary attachments byte-to-str issue that you've just
fixed. As I mentioned, I worked around this by passing in a
custom encoder that calls the original and runs an extra decode
step. Here's what my fix looked like in the book; your patch
may do better, and I will minimally add a note about the 3.1.3
and 3.2 fix for this:

    def fix_encode_base64(msgobj):
        from email.encoders import encode_base64
        encode_base64(msgobj)             # what email does normally: leaves bytes
        bytes = msgobj.get_payload()      # bytes fails in email pkg on text gen
        text = bytes.decode('ascii')      # decode to unicode str so text gen works
        ...plus line splitting logic omitted...
        msgobj.set_payload('\n'.join(lines))

    >>> from email.mime.image import MIMEImage
    >>> from mailtools.mailSender import fix_encode_base64   # use custom workaround
    >>> bytes = open('monkeys.jpg', 'rb').read()
    >>> m = MIMEImage(bytes, _encoder=fix_encode_base64)     # convert to ascii str
    >>> print(m.as_string()[:500])

------------------------------------------------------------------
3) Type-dependent text part encoding

There's a str/bytes confusion issue related to Unicode encodings
in text payload generation: some encodings require the payload to
be str, but others expect bytes. Unfortunately, this means that
clients need to know how the package will react to the encoding
that is used, and special-case based upon that.

For example, I needed to pass in str for ASCII and Latin-1 (the
former is unencoded and the latter gets QP MIME treatment), but
must pass bytes for UTF-8 (which triggers Base64). That's less
than ideal for a client trying to attach arbitrary text parts
generically from filenames.

Here's the obscure workaround I came up with; the bodytext is
str when fetched from an edit window, but may also be loaded from
an attachment file. This may or may not have been reported, and
it's entirely possible that there's a better solution that I've
missed.

    def fix_text_required(encodingname):
        """
        4E: workaround for str/bytes combination errors in email package;
        MIMEText requires different types for different Unicode encodings
        in Python 3.1, due to the different ways it MIME-encodes some types
        of text; see Chapter 13; the only other alternative is using generic
        Message and repeating much code;
        """
        from email.charset import Charset, BASE64, QP

        charset = Charset(encodingname)   # how email knows what to do for encoding
        bodyenc = charset.body_encoding   # utf8, others require bytes input data
        return bodyenc in (None, QP)      # ascii, latin1, others require str

    # on mail sends...
    # email needs either str xor bytes specifically;
    if fix_text_required(bodytextEncoding):
        if not isinstance(bodytext, str):
            bodytext = bodytext.decode(bodytextEncoding)
    else:
        if not isinstance(bodytext, bytes):
            bodytext = bodytext.encode(bodytextEncoding)

    # later
    msg.set_payload(bodytext, charset=bodytextEncoding)
    ...or...
    msg = MIMEText(bodytext, _charset=bodytextEncoding)
    mainmsg.attach(msg)

    # attachments
    # build sub-Message of appropriate kind
    maintype, subtype = contype.split('/', 1)
    if maintype == 'text':                     # 4E: text needs encoding
        if fix_text_required(fileencode):      # requires str or bytes
            data = open(filename, 'r', encoding=fileencode)
        else:
            data = open(filename, 'rb')
        msg = MIMEText(data.read(), _subtype=subtype, _charset=fileencode)
        data.close()

------------------------------------------------------------------
There are some additional cases that now require decoding per mail
headers today due to the str/bytes split, but these are just a
normal artifact of supporting Unicode character sets in general,
and seem like issues for package clients to resolve (e.g., the bytes
returned for decoded payloads in 3.X didn't play well with existing
str-based text processing code written for 2.X).
------------------------------------------------------------------

From rdmurray at bitdance.com Thu Jun 10 16:18:48 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 10 Jun 2010 10:18:48 -0400
Subject: [Email-SIG] email package status in 3.X
In-Reply-To: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>
References: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net>
Message-ID: <20100610141848.E84181FCC52@kimball.webabinitio.net>

On Thu, 10 Jun 2010 09:21:52 -0400, lutz at rmi.net wrote:
> In other words, some of my concern may have been a bit premature.
> I hope that in the future we'll either strive for compatibility
> or keep the current version around; it's a lot of very useful code.

The plan is to have a compatibility layer that will accept calls based
on the old API and forward appropriately to the new API. So far I'm
thinking I can succeed in doing this in a fairly straightforward manner,
but I won't know for sure until I get some more pieces in place.

> In fact, I recommend that any new email package be named distinctly,

I'm going to avoid that if I can (though the PyPI package will be
named email6 when we publish it for public testing). If, however,
it turns out that I can't correctly support both the old and the
new API, then I'll have to do that.

> and that the current package be retained for a number of releases to
> come. After all the breakages that 3.X introduced in general, doing
> the same to any email-based code seems a bit too much, especially
> given that the current package is largely functional as is. To me,
> after having just used it extensively, fixing its few issues seems
> a better approach than starting from scratch.

Well, the thing is, as you found, existing 2.x code needs to be fixed to
correctly handle the distinction between strings and bytes no matter what.
The goal is to make it easier to write correct programs, while providing
the compatibility layer to make porting smoother. But I doubt that any
non-trivial 2.x email program will port without significant changes,
even if the compatibility layer is close to 100% compatible with the
current Python3 email package, simply because the previous conflation
of text and bytes must be untangled in order to work correctly in
Python3, and email involves lots of transitions between text and bytes.

As for "starting from scratch", it is true that the current plan involves
considerable changes in the recommended API (in the direction of greater
flexibility and power), but I'm hoping that significant portions of the
code will carry forward with minor changes, and that this will make it
easier to support the old API.

> As far as other issues, the things I found are described below my
> signature. I don't know what the utf-8 issue is that you refer
> to; I'm able to parse and send with this encoding as is without
> problems (both payloads and headers), but I'm probably not using the
> interfaces you fixed, and this may be the same as one of the items
> listed.

It is, see below.

> Another thought: it might be useful to use the book's email client
> as a sort of test case for the package; it's much more rigorous in
> the new edition because it now has to be given 3.X's Unicode model
> (it's about 4,900 lines of code, though not all is email-related).
> I'd be happy to donate the code as soon as I find out what the
> copyright will be this time around; it will be at O'Reilly's site
> this Fall in any event.

That would be great. I am planning to write my own sample app to
demonstrate the new API, but if I can use yours to test the compatibility
layer that will help a lot, since I otherwise have no Python3 email
application to test against unless I port something from Python2.

> Major issues I found...
> ------------------------------------------------------------------
> 1) Str required for parsing, but bytes returned from poplib
>
> The initial decode from bytes to str of full mail text; in
> retrospect, probably not a major issue, since original email
> standards called for ASCII. An 8-bit encoding like Latin-1 is
> probably sufficient for most conforming mails. For the book,
> I try a set of different encodings, beginning with an optional
> configuration module setting, then ascii, latin-1, and utf-8;
> this is probably overkill, but a GUI has to be defensive.

This works (mostly) for conforming email, but some important Python email
applications need to deal with non-conforming email. That's where the
inability to parse bytes directly really causes problems.

> 2) Binary attachments encoding
>
> The binary attachments byte-to-str issue that you've just
> fixed. As I mentioned, I worked around this by passing in a
> custom encoder that calls the original and runs an extra decode
> step. Here's what my fix looked like in the book; your patch
> may do better, and I will minimally add a note about the 3.1.3
> and 3.2 fix for this:

Yeah, our patch was a lot simpler since we could fix the encoding inside
the loop producing the encoded lines :)

> 3) Type-dependent text part encoding
>
> There's a str/bytes confusion issue related to Unicode encodings
> in text payload generation: some encodings require the payload to
> be str, but others expect bytes. Unfortunately, this means that
> clients need to know how the package will react to the encoding
> that is used, and special-case based upon that.

This was the UTF-8 bug I fixed. I shouldn't have called it "the UTF-8
bug", because it applies equally to the other charsets that use base64,
as you note. I called it that because UTF-8 was where the problem was
noticed and is mentioned in the title of the bug report.

I had a suspicion that the quoted-printable encoding wasn't being done
correctly either, so to hear that it is working for you is good news.
There may still be bugs to find there, though.

So, in the next releases of Python all MIMEText input should be string,
and it will fail if you pass bytes. I consider this as email previously
not living up to its published API, but do you think I should hack
in a way for it to accept bytes too, for backward compatibility in the
3 line?

> There are some additional cases that now require decoding per mail
> headers today due to the str/bytes split, but these are just a
> normal artifact of supporting Unicode character sets in general,
> and seem like issues for package clients to resolve (e.g., the bytes
> returned for decoded payloads in 3.X didn't play well with existing
> str-based text processing code written for 2.X).

I'm not following you here. Can you give me some more specific examples?
Even if these "normal artifacts" must remain with the current API,
I'd like to make things as easy as practical when using the new API.

Thanks for all your feedback!

--David

From barry at python.org Thu Jun 10 16:42:14 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 10 Jun 2010 10:42:14 -0400
Subject: [Email-SIG] email package status in 3.X
In-Reply-To: <20100610141848.E84181FCC52@kimball.webabinitio.net>
References: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net> <20100610141848.E84181FCC52@kimball.webabinitio.net>
Message-ID: <20100610104214.2bdd8f48@heresy>

On Jun 10, 2010, at 10:18 AM, R. David Murray wrote:

>That would be great. I am planning to write my own sample app to
>demonstrate the new API, but if I can use yours to test the compatibility
>layer that will help a lot, since I otherwise have no Python3 email
>application to test against unless I port something from Python2.

I would support/help with a port of Mailman 3 to Python 3. It's
non-trivial, but would make a good test case. The dependency stack may make
that difficult.

-Barry

From rdmurray at bitdance.com Thu Jun 10 17:35:07 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 10 Jun 2010 11:35:07 -0400
Subject: [Email-SIG] email package status in 3.X
In-Reply-To: <20100610104214.2bdd8f48@heresy>
References: <5681323.1276176113106.JavaMail.root@elwamui-little.atl.sa.earthlink.net> <20100610141848.E84181FCC52@kimball.webabinitio.net> <20100610104214.2bdd8f48@heresy>
Message-ID: <20100610153507.A62C31FCB5A@kimball.webabinitio.net>

On Thu, 10 Jun 2010 10:42:14 -0400, Barry Warsaw wrote:
> On Jun 10, 2010, at 10:18 AM, R. David Murray wrote:
>
> >That would be great. I am planning to write my own sample app to
> >demonstrate the new API, but if I can use yours to test the compatibility
> >layer that will help a lot, since I otherwise have no Python3 email
> >application to test against unless I port something from Python2.
>
> I would support/help with a port of Mailman 3 to Python 3. It's
> non-trivial, but would make a good test case. The dependency stack may make
> that difficult.

I realized after I sent that email that I should have said "until"
rather than "unless", since that's one of the testing goals (seeing how
applications port both to the compatibility layer and to the new API).
Mailman is at the top of the list of test ports, but as you say
dependencies may have to be dealt with first. I'm certainly glad you
are willing to help, since that will doubtless make it go faster :)

--
R. David Murray www.bitdance.com

From lutz at rmi.net Sat Jun 12 18:52:32 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Sat, 12 Jun 2010 16:52:32 -0000
Subject: [Email-SIG] email package status in 3.X
Message-ID:

Hi David,

All sounds good, and thanks again for all your work on this.

I appreciate the difficulties of moving this package to 3.X
in a backward-compatible way. My suggestions stem from the fact
that it does work as is today, albeit in a less than ideal way.

That, and I'm seeing that Python 3.X in general is still having
a great deal of trouble gaining traction in the "real world"
almost 2 years after its release, and I'd hate to see further
disincentives for people to migrate. This is a bigger issue
than both the email package and this thread, of course.

> > 3) Type-dependent text part encoding
> >
> ...
> So, in the next releases of Python all MIMEText input should be string,
> and it will fail if you pass bytes. I consider this as email previously
> not living up to its published API, but do you think I should hack
> in a way for it to accept bytes too, for backward compatibility in the
> 3 line?

Decoding can probably be safely delegated to package clients.
Typical email clients will probably have str for display of the
main text. They may wish to read attachments in binary mode, but
can always read in text mode instead or decode manually, because
they need a known encoding to send the part correctly (my client
has to ask or use configurations in some cases).

Backward compatibility probably isn't a concern; I suspect that my
temporary workaround will still work with your patch anyhow,
and this code didn't work at all for some encodings before.

> > There are some additional cases that now require decoding per mail
> > headers today due to the str/bytes split, but these are just a
> > normal artifact of supporting Unicode character sets in general,
> > and seem like issues for package clients to resolve (e.g., the bytes
> > returned for decoded payloads in 3.X didn't play well with existing
> > str-based text processing code written for 2.X).
>
> I'm not following you here. Can you give me some more specific
> examples? Even if these "normal artifacts" must remain with
> the current API, I'd like to make things as easy as practical when
> using the new API.

This was just a general statement about things in my own code that
didn't jibe with the 3.X string model. For instance, line wrapping
logic assumed str; tkinter text widgets do much better rendering str
than the bytes fetched for decoded payloads; and my Pyedit text editor
component had to be overhauled to handle display/edit/save of payloads
of arbitrary encodings. If I remember any more specific issues with
the email package itself, I'll forward them your way.

I'll watch for an opportunity to get the book's new PyMailGUI
client code to you as a candidate test case, but please ping
me about it later if I haven't acted on this. It works well,
but largely because of all the work that went into the email
package underlying it.

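A minimal sketch of the client-side approach described above: read a
text attachment in text mode with an encoding the client already knows,
so that MIMEText always receives str. The helper name
text_part_from_file, the file name, and the sample text are assumptions
made for illustration; they come from neither the book nor the email
package. It also assumes a Python 3 release in which MIMEText accepts
str for every charset.

```python
import os
import tempfile
from email.mime.text import MIMEText


def text_part_from_file(path, encoding, subtype="plain"):
    """Read a text attachment as str and wrap it in a MIMEText part.

    Hypothetical helper: the caller must already know the file's encoding.
    """
    # Text mode decodes the file for us, so MIMEText always receives str.
    with open(path, "r", encoding=encoding) as handle:
        text = handle.read()
    return MIMEText(text, _subtype=subtype, _charset=encoding)


# Usage: write a small UTF-8 file, then build a text part from it.
with tempfile.TemporaryDirectory() as tmpdir:
    sample_path = os.path.join(tmpdir, "note.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("caf\u00e9 menu\n")
    part = text_part_from_file(sample_path, "utf-8")

# The declared charset travels with the part, so a receiver can decode it.
raw = part.get_payload(decode=True)
print(raw.decode(part.get_content_charset()))
```

The client still has to pick the encoding (by asking the user or from
configuration, as noted above); the sketch only shows that once it is
known, no str/bytes special-casing is needed at the call site.
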
Thanks,
--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)

From lutz at rmi.net Sun Jun 13 17:30:06 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Sun, 13 Jun 2010 15:30:06 -0000
Subject: [Email-SIG] email package status in 3.X
Message-ID:

Come to think of it, here was another oddness I just recalled: this
may have been reported already, but header decoding returns mixed types
depending upon the structure of the header. Converting to a str for
display isn't too difficult to handle, but this seems a bit inconsistent
and contrary to Python's type neutrality:

    >>> from email.header import decode_header
    >>> S1 = 'Man where did you get that assistant?'
    >>> S2 = '=?utf-8?q?Man_where_did_you_get_that_assistant=3F?='
    >>> S3 = 'Man where did you get that =?UTF-8?Q?assistant=3F?='

    # str: don't decode()
    >>> decode_header(S1)
    [('Man where did you get that assistant?', None)]

    # bytes: do decode()
    >>> decode_header(S2)
    [(b'Man where did you get that assistant?', 'utf-8')]

    # bytes: do decode(), using raw-unicode-escape applied in package
    >>> decode_header(S3)
    [(b'Man where did you get that', None), (b'assistant?', 'utf-8')]

I can work around this with the following code, but it feels a bit
too tightly coupled to the package's internal details (further evidence
that email.* can be made to work as is today, even if it may be seen as
less than ideal aesthetically):

    import email.header

    def decode_mixed_header(rawheader):           # enclosing def assumed; name is hypothetical
        parts = email.header.decode_header(rawheader)
        decoded = []
        for (part, enc) in parts:                 # for all substrings
            if enc is None:                       # part unencoded?
                if not isinstance(part, bytes):   # str: full hdr unencoded
                    decoded += [part]             # else do unicode decode
                else:
                    decoded += [part.decode('raw-unicode-escape')]
            else:
                decoded += [part.decode(enc)]
        return ' '.join(decoded)

Thanks,
--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)
> > > > Well, the thing is, as you found, existing 2.x code needs to be fixed to > > correctly handle the distinction between strings and bytes no matter what. > > The goal is to make it easier to write correct programs, while providing > > the compatibility layer to make porting smoother. But I doubt that any > > non-trivial 2.x email program will port without significant changes, > > even if the compatibility layer is close to 100% compatible with the > > current Python3 email package, simply because the previous conflation > > of text and bytes must be untangled in order to work correctly in > > Python3, and email involves lots of transitions between text and bytes. > > > > As for "starting from scratch", it is true that the current plan involves > > considerable changes in the recommended API (in the direction of greater > > flexibility and power), but I'm hoping that significant portions of the > > code will carry forward with minor changes, and that this will make it > > easier to support the old API. > > > > > As far as other issues, the things I found are described below my > > > signature. I don't know what the utf-8 issue is that you refer > > > too; I'm able to parse and send with this encoding as is without > > > problems (both payloads and headers), but I'm probably not using the > > > interfaces you fixed, and this may be the same as one of item listed. > > > > It is, see below. > > > > > Another thought: it might be useful to use the book's email client > > > as a sort of test case for the package; it's much more rigorous in > > > the new edition because it now has to be given 3.X'Unicode model > > > (it's abut 4,900 lines of code, though not all is email-related). > > > I'd be happy to donate the code as soon as I find out what the > > > copyright will be this time around; it will be at O'Reilly's site > > > this Fall in any event. > > > > That would be great. 
I am planning to write my own sample ap to > > demonstrate the new API, but if I can use yours to test the compatibility > > layer that will help a lot, since I otherwise have no Python3 email > > application to test against unless I port something from Python2. > > > > > Major issues I found... > > > ------------------------------------------------------------------ > > > 1) Str required for parsing, but bytes returned from poplib > > > > > > The initial decode from bytes to str of full mail text; in > > > retrospect, probably not a major issue, since original email > > > standards called for ASCII. A 8-bit encoding like Latin-1 is > > > probably sufficient for most conforming mails. For the book, > > > I try a set of different encodings, beginning with an optional > > > configuration module setting, then ascii, latin-1, and utf-8; > > > this is probably overkill, but a GUI has to be defensive. > > > > This works (mostly) for conforming email, but some important Python email > > applications need to deal with non-conforming email. That's where the > > inability to parse bytes directly really causes problems. > > > > > 2) Binary attachments encoding > > > > > > The binary attachments byte-to-str issue that you've just > > > fixed. As I mentioned, I worked around this by passing in a > > > custom encoder that calls the original and runs an extra decode > > > step. Here's what my fix looked like in the book; your patch > > > may do better, and I will minimally add a note about the 3.1.3 > > > and 3.2 fix for this: > > > > Yeah, our patch was a lot simpler since we could fix the encoding inside > > the loop producing the encoded lines :) > > > > > 3) Type-dependent text part encoding > > > > > > There's a str/bytes confusion issue related to Unicode encodings > > > in text payload generation: some encodings require the payload to > > > be str, but others expect bytes. 
Unfortunately, this means that > > > clients need to know how the package will react to the encoding > > > that is used, and special-case based upon that. > > > > This was the UTF-8 bug I fixed. I shouldn't have called it "the UTF-8 > > bug", because it applies equally to the other charsets that use base64, > > as you note. I called it that because UTF-8 was where the problem was > > noticed and is mentioned in the title of the bug report. > > > > I had a suspicion that the quoted-printable encoding wasn't being done > > correctly either, so to hear that it is working for you is good news. > > There may still be bugs to find there, though. > > > > So, in the next releases of Python all MIMEText input should be string, > > and it will fail if you pass bytes. I consider this as email previously > > not living up to its published API, but do you think I should hack > > in a way for it to accept bytes too, for backward compatibility in the > > 3 line? > > > > > There are some additional cases that now require decoding per mail > > > headers today due to the str/bytes split, but these are just a > > > normal artifact of supporting Unicode character sets in general, > > > ans seem like issues for package client to resolve (e.g., the bytes > > > returned for decoded payloads in 3.X didn't play well with existing > > > str-based text processing code written for 2.X). > > > > I'm not following you here. Can you give me some more specific > > examples? Even if these "normal artifacts" must remain with > > the current API, I'd like to make things as easy as practical when > > using the new API. > > > > Thanks for all your feedback! 
> > > > --David > > > > > > From lutz at rmi.net Wed Jun 16 22:48:49 2010 From: lutz at rmi.net (lutz at rmi.net) Date: Wed, 16 Jun 2010 20:48:49 -0000 Subject: [Email-SIG] email package status in 3.X Message-ID: <6wwifklfk7n7tup216062010044853@SMTP> [copied to pydev from email-sig because of the broader scope] Well, it looks like I've stumbled onto the "other shoe" on this issue--that the email package's problems are also apparently behind the fact that CGI binary file uploads don't work in 3.1 (http://bugs.python.org/issue4953). Yikes. I trust that people realize this is a show-stopper for broader Python 3.X adoption. Why 3.0 was rolled out anyhow is beyond me; it seems that it would have been better if Python developers had gotten their own code to work with 3.X, before expecting the world at large to do so. FWIW, after rewriting Programming Python for 3.1, 3.x still feels a lot like a beta to me, almost 2 years after its release. How did this happen? Maybe nobody is using 3.X enough to care, but I have a feeling that issues like this are part of the reason why. No offense to people who obviously put in an incredible amount of work on 3.X. As someone who remembers 0.X, though, it's hard not to find the current situation a bit disappointing. --Mark Lutz (http://learning-python.com, http://rmi.net/~lutz) > -----Original Message----- > From: lutz at rmi.net > To: "R. David Murray" > Subject: Re: email package status in 3.X > Date: Sun, 13 Jun 2010 15:30:06 -0000 > > Come to think of it, here was another oddness I just recalled: this > may have been reported already, but header decoding returns mixed types > depending upon the structure of the header. Converting to a str for > display isn't too difficult to handle, but this seems a bit inconsistent > and contrary to Python's type neutrality: > > >>> from email.header import decode_header > >>> S1 = 'Man where did you get that assistant?' 
> >>> S2 = '=?utf-8?q?Man_where_did_you_get_that_assistant=3F?=' > >>> S3 = 'Man where did you get that =?UTF-8?Q?assistant=3F?=' > > # str: don't decode() > >>> decode_header(S1) > [('Man where did you get that assistant?', None)] > > # bytes: do decode() > >>> decode_header(S2) > [(b'Man where did you get that assistant?', 'utf-8')] > > # bytes: do decode(), using raw-unicode-escape applied in package > >>> decode_header(S3) > [(b'Man where did you get that', None), (b'assistant?', 'utf-8')] > > I can work around this with the following code, but it > feels a bit too tightly coupled to the package's internal details > (further evidence that email.* can be made to work as is today, > even if it may be seen as less than ideal aesthetically): > > parts = email.header.decode_header(rawheader) > decoded = [] > for (part, enc) in parts: # for all substrings > if enc == None: # part unencoded? > if not isinstance(part, bytes): # str: full hdr unencoded > decoded += [part] # else do unicode decode > else: > decoded += [part.decode('raw-unicode-escape')] > else: > decoded += [part.decode(enc)] > return ' '.join(decoded) > > Thanks, > --Mark Lutz (http://learning-python.com, http://rmi.net/~lutz) > > > > -----Original Message----- > > From: lutz at rmi.net > > To: "R. David Murray" > > Subject: Re: email package status in 3.X > > Date: Sat, 12 Jun 2010 16:52:32 -0000 > > > > Hi David, > > > > All sounds good, and thanks again for all your work on this. > > > > I appreciate the difficulties of moving this package to 3.X > > in a backward-compatible way. My suggestions stem from the fact > > that it does work as is today, albeit in a less than ideal way. > > > > That, and I'm seeing that Python 3.X in general is still having > > a great deal of trouble gaining traction in the "real world" > > almost 2 years after its release, and I'd hate to see further > > disincentives for people to migrate. This is a bigger issue > > than both the email package and this thread, of course. 
> > > > > > 3) Type-dependent text part encoding > > > > > > > ... > > > So, in the next releases of Python all MIMEText input should be string, > > > and it will fail if you pass bytes. I consider this as email previously > > > not living up to its published API, but do you think I should hack > > > in a way for it to accept bytes too, for backward compatibility in the > > > 3 line? > > > > Decoding can probably be safely delegated to package clients. > > Typical email clients will probably have str for display of the > > main text. They may wish to read attachments in binary mode, but > > can always read in text mode instead or decode manualy, because > > they need a known encoding to send the part correctly (my client > > has to ask or use configurations in some cases). > > > > B/W compatibility probably isn't a concern; I suspect that my > > temporary workaround will still work with your patch anyhow, > > and this code didn't work at all for some encodings before. > > > > > > There are some additional cases that now require decoding per mail > > > > headers today due to the str/bytes split, but these are just a > > > > normal artifact of supporting Unicode character sets in general, > > > > ans seem like issues for package client to resolve (e.g., the bytes > > > > returned for decoded payloads in 3.X didn't play well with existing > > > > str-based text processing code written for 2.X). > > > > > > I'm not following you here. Can you give me some more specific > > > examples? Even if these "normal artifacts" must remain with > > > the current API, I'd like to make things as easy as practical when > > > using the new API. > > > > This was just a general statement about things in my own code that > > didn't jive with the 3.X string model. 
For instance, line wrapping > > logic assumed str; tkinter text widgets do much better rendering str > > than the bytes fetched for decoded payloads; and my Pyedit text editor > > component had to be overhauled to handle display/edit/save of payloads > > of arbitrary encodings. If I remember any more specific issues with > > the email package itself, I'll forward your way. > > > > I'll watch for an opportunity to get the book's new PyMailGUI > > client code to you as a candidate test case, but please ping > > me about it later if I haven't acted on this. It works well, > > but largely because of all the work that went into the email > > package underlying it. > > > > Thanks, > > --Mark Lutz (http://learning-python.com, http://rmi.net/~lutz) > > > > > > > -----Original Message----- > > > From: "R. David Murray" > > > To: lutz at rmi.net > > > Subject: Re: email package status in 3.X > > > Date: Thu, 10 Jun 2010 10:18:48 -0400 > > > > > > On Thu, 10 Jun 2010 09:21:52 -0400, lutz at rmi.net wrote: > > > > In other words, some of my concern may have been a bit premature. > > > > I hope that in the future we'll either strive for compatibility > > > > or keep the current version around; it's a lot of very useful code. > > > > > > The plan is to have a compatibility layer that will accept calls based > > > on the old API and forward appropriately to the new API. So far I'm > > > thinking I can succeed in doing this in a fairly straightforward manner, > > > but I won't know for sure until I get some more pieces in place. > > > > > > > In fact, I recommend that any new email package be named distinctly, > > > > > > I'm going to avoid that if I can (though the PyPI package will be > > > named email6 when we publish it for public testing). If, however, > > > it turns out that I can't correctly support both the old and the > > > new API, then I'll have to do that. > > > > > > > and that the current package be retained for a number of releases to > > > > come. 
After all the breakages that 3.X introduced in general, doing > > > > the same to any email-based code seems a bit too much, especially > > > > given that the current package is largely functional as is. To me, > > > > after having just used it extensively, fixing its few issues seems > > > > a better approach than starting from scratch. > > > > > > Well, the thing is, as you found, existing 2.x code needs to be fixed to > > > correctly handle the distinction between strings and bytes no matter what. > > > The goal is to make it easier to write correct programs, while providing > > > the compatibility layer to make porting smoother. But I doubt that any > > > non-trivial 2.x email program will port without significant changes, > > > even if the compatibility layer is close to 100% compatible with the > > > current Python3 email package, simply because the previous conflation > > > of text and bytes must be untangled in order to work correctly in > > > Python3, and email involves lots of transitions between text and bytes. > > > > > > As for "starting from scratch", it is true that the current plan involves > > > considerable changes in the recommended API (in the direction of greater > > > flexibility and power), but I'm hoping that significant portions of the > > > code will carry forward with minor changes, and that this will make it > > > easier to support the old API. > > > > > > > As far as other issues, the things I found are described below my > > > > signature. I don't know what the utf-8 issue is that you refer > > > > too; I'm able to parse and send with this encoding as is without > > > > problems (both payloads and headers), but I'm probably not using the > > > > interfaces you fixed, and this may be the same as one of item listed. > > > > > > It is, see below. 
> > > > > > > Another thought: it might be useful to use the book's email client > > > > as a sort of test case for the package; it's much more rigorous in > > > > the new edition because it now has to be given 3.X'Unicode model > > > > (it's abut 4,900 lines of code, though not all is email-related). > > > > I'd be happy to donate the code as soon as I find out what the > > > > copyright will be this time around; it will be at O'Reilly's site > > > > this Fall in any event. > > > > > > That would be great. I am planning to write my own sample ap to > > > demonstrate the new API, but if I can use yours to test the compatibility > > > layer that will help a lot, since I otherwise have no Python3 email > > > application to test against unless I port something from Python2. > > > > > > > Major issues I found... > > > > ------------------------------------------------------------------ > > > > 1) Str required for parsing, but bytes returned from poplib > > > > > > > > The initial decode from bytes to str of full mail text; in > > > > retrospect, probably not a major issue, since original email > > > > standards called for ASCII. A 8-bit encoding like Latin-1 is > > > > probably sufficient for most conforming mails. For the book, > > > > I try a set of different encodings, beginning with an optional > > > > configuration module setting, then ascii, latin-1, and utf-8; > > > > this is probably overkill, but a GUI has to be defensive. > > > > > > This works (mostly) for conforming email, but some important Python email > > > applications need to deal with non-conforming email. That's where the > > > inability to parse bytes directly really causes problems. > > > > > > > 2) Binary attachments encoding > > > > > > > > The binary attachments byte-to-str issue that you've just > > > > fixed. As I mentioned, I worked around this by passing in a > > > > custom encoder that calls the original and runs an extra decode > > > > step. 
Here's what my fix looked like in the book; your patch > > > > may do better, and I will minimally add a note about the 3.1.3 > > > > and 3.2 fix for this: > > > > > > Yeah, our patch was a lot simpler since we could fix the encoding inside > > > the loop producing the encoded lines :) > > > > > > > 3) Type-dependent text part encoding > > > > > > > > There's a str/bytes confusion issue related to Unicode encodings > > > > in text payload generation: some encodings require the payload to > > > > be str, but others expect bytes. Unfortunately, this means that > > > > clients need to know how the package will react to the encoding > > > > that is used, and special-case based upon that. > > > > > > This was the UTF-8 bug I fixed. I shouldn't have called it "the UTF-8 > > > bug", because it applies equally to the other charsets that use base64, > > > as you note. I called it that because UTF-8 was where the problem was > > > noticed and is mentioned in the title of the bug report. > > > > > > I had a suspicion that the quoted-printable encoding wasn't being done > > > correctly either, so to hear that it is working for you is good news. > > > There may still be bugs to find there, though. > > > > > > So, in the next releases of Python all MIMEText input should be string, > > > and it will fail if you pass bytes. I consider this as email previously > > > not living up to its published API, but do you think I should hack > > > in a way for it to accept bytes too, for backward compatibility in the > > > 3 line? > > > > > > > There are some additional cases that now require decoding per mail > > > > headers today due to the str/bytes split, but these are just a > > > > normal artifact of supporting Unicode character sets in general, > > > > ans seem like issues for package client to resolve (e.g., the bytes > > > > returned for decoded payloads in 3.X didn't play well with existing > > > > str-based text processing code written for 2.X). 
> > > > > > I'm not following you here. Can you give me some more specific > > > examples? Even if these "normal artifacts" must remain with > > > the current API, I'd like to make things as easy as practical when > > > using the new API. > > > > > > Thanks for all your feedback! > > > > > > --David > > > > > > > > > > > > From ncoghlan at gmail.com Wed Jun 16 23:47:27 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 17 Jun 2010 07:47:27 +1000 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP> References: <6wwifklfk7n7tup216062010044853@SMTP> Message-ID: On Thu, Jun 17, 2010 at 6:48 AM, wrote: > I trust that people realize this is a show-stopper for broader > Python 3.X adoption. Why 3.0 was rolled out anyhow is beyond > me; it seems that it would have been better if Python developers > had gotten their own code to work with 3.X, before expecting the > world at large to do so. > > FWIW, after rewriting Programming Python for 3.1, 3.x still feels > a lot like a beta to me, almost 2 years after its release. How > did this happen? Maybe nobody is using 3.X enough to care, but > I have a feeling that issues like this are part of the reason why. > > No offense to people who obviously put in an incredible amount of > work on 3.X. As someone who remembers 0.X, though, it's hard not > to find the current situation a bit disappointing. Agreed, but the binary/text distinction in 2.x (or rather, the lack thereof) makes the unicode handling situation so hopelessly confused that there is a lot of 2.x code (including in the standard library) that silently mixes the two, often without really testing the consequences (as clearly happened here). 3.x was rolled out anyway because the vast majority of it works.
Obviously people affected by the problems specific to the email package and any other binary vs text parsing problems that are still lingering are out of luck at the moment, but leaving 3.x sitting on a shelf indefinitely would hardly have inspired anyone to clean it up. My personal perspective is that a lot of that code was likely already broken in hard to detect ways when dealing with mixed encodings - releasing 3.x just made the associated errors significantly easier to detect. If we end up being able to add your email client code to the standard library's unit test suite, that should help the situation immensely. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Thu Jun 17 17:43:29 2010 From: barry at python.org (Barry Warsaw) Date: Thu, 17 Jun 2010 11:43:29 -0400 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP> References: <6wwifklfk7n7tup216062010044853@SMTP> Message-ID: <20100617114329.254db9ac@heresy> On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote: >Well, it looks like I've stumbled onto the "other shoe" on this >issue--that the email package's problems are also apparently >behind the fact that CGI binary file uploads don't work in 3.1 >(http://bugs.python.org/issue4953). Yikes. > >I trust that people realize this is a show-stopper for broader >Python 3.X adoption. We know it, we have extensively discussed how to fix it, we have IMO a good design, and we even have someone willing and able to tackle the problem. We need to find a sufficient source of funding to enable him to do the work it will take, and so far that's been the biggest stumbling block. It will take a focused and determined effort to see this through, and it's obvious that volunteers cannot make it happen. I include myself in the latter category, as I've tried and failed at least twice to do it in my spare time. 
-Barry From brett at python.org Thu Jun 17 21:24:54 2010 From: brett at python.org (Brett Cannon) Date: Thu, 17 Jun 2010 12:24:54 -0700 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <20100617114329.254db9ac@heresy> References: <6wwifklfk7n7tup216062010044853@SMTP> <20100617114329.254db9ac@heresy> Message-ID: On Thu, Jun 17, 2010 at 08:43, Barry Warsaw wrote: > On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote: > >>Well, it looks like I've stumbled onto the "other shoe" on this >>issue--that the email package's problems are also apparently >>behind the fact that CGI binary file uploads don't work in 3.1 >>(http://bugs.python.org/issue4953). Yikes. >> >>I trust that people realize this is a show-stopper for broader >>Python 3.X adoption. > > We know it, we have extensively discussed how to fix it, we have IMO a good > design, and we even have someone willing and able to tackle the problem. We > need to find a sufficient source of funding to enable him to do the work it > will take, and so far that's been the biggest stumbling block. It will take a > focused and determined effort to see this through, and it's obvious that > volunteers cannot make it happen. I include myself in the latter category, as > I've tried and failed at least twice to do it in my spare time. And in general I think this is the reason some modules have not transitioned as well as others: there are only so many of us. The stdlib passes its test suite, but obviously some unit tests do not cover enough of the code in the ways people need it covered. As for using Python 3 for my code, I do and have since Python 3 became more-or-less usable. I just happen to not work with internet-related stuff in my day-to-day work. Plus we have needed to maintain FOUR branches for a while.
That is a nasty time sink when you are having to port bug fixes and such. It also means that python-dev has been focused on making sure Python 2.7 is a solid release instead of getting to focus on the stdlib in Python 3. This is a nasty chicken-and-egg issue; we could ignore Python 2 and focus on Python 3, but then the community would complain about us not supporting the transition from 2 to 3 better, but obviously focusing on 2 has led to 3 not getting enough TLC. Once Python 2.7 is done and out the door the entire situation for Python 3 should start to improve as python-dev as a whole will have a chance to begin to focus solely on Python 3. From steve at holdenweb.com Fri Jun 18 04:32:51 2010 From: steve at holdenweb.com (Steve Holden) Date: Fri, 18 Jun 2010 11:32:51 +0900 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <20100617114329.254db9ac@heresy> References: <6wwifklfk7n7tup216062010044853@SMTP> <20100617114329.254db9ac@heresy> Message-ID: <4C1ADAD3.9070808@holdenweb.com> Barry Warsaw wrote: > On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote: > >> Well, it looks like I've stumbled onto the "other shoe" on this >> issue--that the email package's problems are also apparently >> behind the fact that CGI binary file uploads don't work in 3.1 >> (http://bugs.python.org/issue4953). Yikes. >> >> I trust that people realize this is a show-stopper for broader >> Python 3.X adoption. > > We know it, we have extensively discussed how to fix it, we have IMO a good > design, and we even have someone willing and able to tackle the problem. We > need to find a sufficient source of funding to enable him to do the work it > will take, and so far that's been the biggest stumbling block. It will take a > focused and determined effort to see this through, and it's obvious that > volunteers cannot make it happen. I include myself in the latter category, as > I've tried and failed at least twice to do it in my spare time.
> > -Barry > Lest the readership think that the PSF is unaware of this issue, allow me to point out that we have already partially funded this effort, and are still offering R. David Murray some further matching funds if he can raise sponsorship to complete the effort (on which he has made a very promising start). We are also attempting to enable tax-deductible fund raising to increase the likelihood of David's finding support. Perhaps we need to think about a broader campaign to increase the quality of the Python 3 libraries. I find it very annoying that the #python IRC group still has "Don't use Python 3" in its topic. They adamantly refuse to remove it until there is better library support, and they are the guys who see the issues day in, day out, so it is hard to argue with them (and I don't think an autocratic decision-making process would be appropriate). regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 See Python Video! http://python.mirocommunity.org/ Holden Web LLC http://www.holdenweb.com/ UPCOMING EVENTS: http://holdenweb.eventbrite.com/ "All I want for my birthday is another birthday" - Ian Dury, 1942-2000 From stephen at xemacs.org Fri Jun 18 07:52:17 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 18 Jun 2010 14:52:17 +0900 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP> References: <6wwifklfk7n7tup216062010044853@SMTP> Message-ID: <87d3volwfi.fsf@uwakimon.sk.tsukuba.ac.jp> lutz at rmi.net writes: > FWIW, after rewriting Programming Python for 3.1, 3.x still feels > a lot like a beta to me, almost 2 years after its release. Email, of course, is a big wart. But guess what? Python 2's email module doesn't actually work! Sure, the program runs most of the time, but every program that depends on email must acquire inches of armorplate against all the things that can go wrong. You simply can't rely on it to DTRT except in a pre-MIME, pre-HTML, ASCII-only world.
Although they're often addressing general problems, these hacks are *not* integrated back into the email module in most cases, but remain app-specific voodoo. If you live in Kansas, sure, you can concentrate on dodging tornados and completely forget about Unicode and MIME and text/bogus content. For the rest of the world, though, the problem is not Python 3. It's STD 11 (which still points at RFC 822, dated 1982!) It's really inappropriate to point at the email module, whose developers are trying *not* to punt on conformance and robustness, when even the IETF can only "run in circles, scream and shout"! Maybe there are other problems with Python 3 that deserve to be pointed at, but given the general scarcity of resources I think the email module developers are working on the right things. Unlike many other modules, email really needs to be rewritten from the ground (Python 3) up, because of the centrality of bytes/unicode confusion to all email problems. Python 3 completely changes the assumptions there; a Python 2-style email module really can't work properly. Then on top of that, today we know a lot more about handling issues like text/html content and MIME in general than when the Python 2 email module was designed. New problems have arisen over the period of Python 3 development, like "domain keys", which email doesn't handle out of the box AFAIK, but email for Python 3 should IMHO. Should Python 3 have been held back until email was fixed? Dunno, but I personally am very glad it was not; where I have a choice, I always use Python 3 now, and have yet to run into a problem. I expect that to change if I can find the time to get involved in email and Mailman 3 development, of course. 
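[Editorial illustration] The bytes/str separation discussed above can be shown in a few lines. The sketch below is not code from the email package or from anyone in this thread; it is a minimal illustration, assuming only the standard `base64` module, of why base64 output (which is `bytes` in Python 3) cannot be concatenated into `str` message text, and of the "decode to ASCII" workaround Mark Lutz describes earlier in the thread. The helper name `encode_attachment_as_text` is hypothetical.

```python
import base64


def encode_attachment_as_text(raw: bytes) -> str:
    """Hypothetical helper: base64-encode binary data and return str."""
    encoded = base64.b64encode(raw)   # b64encode returns bytes in Python 3
    return encoded.decode("ascii")    # base64 output is pure ASCII, so this is safe


raw_data = b"\x89PNG\r\n"             # a few bytes standing in for an image
header = "Content-Transfer-Encoding: base64\n\n"

# str + str works once the payload has been decoded to text.
payload = encode_attachment_as_text(raw_data)
message_text = header + payload

# str + bytes does not: Python 3 refuses to mix the two types implicitly.
try:
    header + base64.b64encode(raw_data)
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

In Python 2 the equivalent concatenation would have succeeded silently, which is the kind of assumption a text generator written for 2.X could rely on and a 3.X one cannot.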
From arcriley at gmail.com Fri Jun 18 05:16:47 2010 From: arcriley at gmail.com (Arc Riley) Date: Thu, 17 Jun 2010 23:16:47 -0400 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <4C1ADAD3.9070808@holdenweb.com> References: <6wwifklfk7n7tup216062010044853@SMTP> <20100617114329.254db9ac@heresy> <4C1ADAD3.9070808@holdenweb.com> Message-ID: David and his Google Summer of Code student, Shashwat Anand. You can read Shashwat's weekly progress updates at http://l0nwlf.in/ or subscribe to http://twitter.com/l0nwlf for more micro updates. We have more than 30 paid students working on Python 3 tasks this year, most of them participating under the PSF umbrella but also a few with 3rd party projects such as Mercurial porting those various packages to Py3. Given all this "on the horizon" work, I think the Py3 package situation will look a lot brighter by Python 3.2's release. On Thu, Jun 17, 2010 at 10:32 PM, Steve Holden wrote: > > Lest the readership think that the PSF is unaware of this issue, allow > me to point out that we have already partially funded this effort, and > are still offering R. David Murray some further matching funds if he can > raise sponsorship to complete the effort (on which he has made a very > promising start). > > We are also attempting to enable tax-deductible fund raising to increase > the likelihood of David's finding support.
From barry at python.org Fri Jun 18 15:45:57 2010 From: barry at python.org (Barry Warsaw) Date: Fri, 18 Jun 2010 09:45:57 -0400 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <4C1ADAD3.9070808@holdenweb.com> References: <6wwifklfk7n7tup216062010044853@SMTP> <20100617114329.254db9ac@heresy> <4C1ADAD3.9070808@holdenweb.com> Message-ID: <20100618094557.77a07994@heresy> On Jun 18, 2010, at 11:32 AM, Steve Holden wrote: >Lest the readership think that the PSF is unaware of this issue, allow >me to point out that we have already partially funded this effort, and >are still offering R. David Murray some further matching funds if he can >raise sponsorship to complete the effort (on which he has made a very >promising start). Right, sorry, I didn't mean to imply the PSF isn't doing anything. More that we need a coordinated effort among all the companies and organizations that use Python to help fund Python 3 library development (and not just in the stdlib). I think the PSF is best suited to coordinating and managing those efforts, and through its tax-exempt status, collecting and distributing donations specifically targeted to Python 3 work. -Barry From lutz at rmi.net Fri Jun 18 17:09:40 2010 From: lutz at rmi.net (lutz at rmi.net) Date: Fri, 18 Jun 2010 15:09:40 -0000 Subject: [Email-SIG] [Python-Dev] email package status in 3.X Message-ID: Replying en masse to save bandwidth here... Barry Warsaw writes: > We know it, we have extensively discussed how to fix it, we have IMO a good > design, and we even have someone willing and able to tackle the problem. We > need to find a sufficient source of funding to enable him to do the work it > will take, and so far that's been the biggest stumbling block.
It will take a > focused and determined effort to see this through, and it's obvious that > volunteers cannot make it happen. I include myself in the latter category, as > I've tried and failed at least twice to do it in my spare time. All understood, and again, not to disparage anyone here. My comments are directed to the development community at large to underscore the grave p/r problems 3.X faces. I realize email parsing is a known issue; I also realize that most people evaluating 3.X today won't care that it is. Most will care only that the new version of a language reportedly used by Google and YouTube still doesn't support CGI uploads a year and a half after its release. As an author, that's a downright horrible story to have to tell the world. "Stephen J. Turnbull" writes: > Email, of course, is a big wart. But guess what? Python 2's email > module doesn't actually work! Yes it does (see next point). > If you live in Kansas, sure, you can concentrate on dodging tornados > and completely forget about Unicode and MIME and text/bogus content. > For the rest of the world, though, the problem is not Python 3 Yes it is, and Kansas is a lot bigger than you seem to think. I want to reiterate that I was able to build a feature rich email client with the email package as it exists in 3.1. This includes support on both the receiving and sending sides for HTML, arbitrary attachments, and decoding and encoding of both text payloads and headers according to email, MIME, and Unicode/I18N standards. It's an amazingly useful package, and does work as is in 3.X. The two main issues I found have been recently fixed. It's unfortunate that this package is also the culprit behind CGI breakage, but it's not clear why it became a critical path for so much utility in the first place. 
The package might not be aesthetically ideal, but to me it seems that an utterly incompatible overhaul of this in the name of supporting potentially very different data streams is a huge functional overload. And to those people in Kansas who live outside the pydev clique, replacing it with something different at this point will look as if an incompatible Python is already incompatible with releases in its own line. Why in the world would anyone base a new project on that sort of thrashing? For my part, I've had to add far too many notes to the upcoming edition of Programming Python about major pieces of functionality that worked in 2.X but no longer do in 3.X. That's disappointing to me personally, but it will probably seem a lot worse to the book's tens of thousands of readers. Yet this is the reality that 3.X has created for itself. > Should Python 3 have been held back until email was fixed? Dunno, but > I personally am very glad it was not; where I have a choice, I always > use Python 3 now, and have yet to run into a problem. I guess we'll just have to disagree on that. IMHO, Python 3 shot itself in the foot by releasing in half-baked form. And the 3.0 I/O speed issue (remember that?) came very close to blowing its leg clean off. The reality out there in Kansas today is that 3.X is perceived as so bad that it could very well go the way of POP4 if its story does not improve. I don't know what sort of Python world will be left behind in the wake, but I do know it will probably be much smaller. Steve Holden writes: > Lest the readership think that the PSF is unaware of this issue, allow > me to point out that we have already partially funded this effort, and > are still offering R. David Murray some further matching funds if he can > raise sponsorship to complete the effort (on which he has made a very > promising start). > > We are also attempting to enable tax-deductible fund raising to increase > the likelihood of David's finding support. 
Perhaps we need to think > about a broader campaign to increase the quality of the python 3 > libraries. I find it very annoying that the #python IRC group still has > "Don't use Python 3" in it's topic. They adamantly refuse to remove it > until there is better library support, and they are the guys who see the > issues day in day out so it is hard to argue with them (and I don't > think an autocratic decision-making process would be appropriate). I'm all for people getting paid for work they do, but with all due respect, I think this underscores part of the problem in the Python world today. If funding had been as stringent a prerequisite in the 90s, I doubt there would be a Python today. It was about the fun and the code, not the bucks and the bureaucracy. As far as I can recall, there was no notion of creating a task force to get things done. Of course, this may just be the natural evolutionary pattern of human enterprises. As it is today, though, the Python community has a formal diversity statement, but it still does not have a fully functional 3.X almost two years after the fact. I doubt that I'm the only one who sees the irony in that. Again, I mean no disrespect to people contributing to Python today on so many fronts, and I have no answers to offer here. For better or worse, though, this is a personal issue to me too. After spending much of the last 2 years updating the best selling Python books for all the changes this group has seen fit to make, I believe I can say with some authority that 3.X still faces a very uncertain future. --Mark Lutz (http://learning-python.com, http://rmi.net/~lutz) From lutz at rmi.net Fri Jun 18 19:22:10 2010 From: lutz at rmi.net (lutz at rmi.net) Date: Fri, 18 Jun 2010 17:22:10 -0000 Subject: [Email-SIG] [Python-Dev] email package status in 3.X Message-ID: > Python 3.0 was *declared* to be an experimental release, and by most > standards 3.1 (in terms of the core language and functionality) was a > solid release. 
> > Any reasonable expectation about Python 3 adoption predicted that it > would take years, and would include going through a phase of difficulty > and disappointment... Declaring something to be a turd doesn't change the fact that it's a turd. I have a feeling that most people outside this list would have much rather avoided the difficulty and disappointment altogether. Let's be honest here; 3.X was released to the community in part as an extended beta. That's not a problem, unless you drop the word "beta". And if you're still not buying that, imagine the sort of response you'd get if you tried to sell software that billed itself as "experimental", and promised a phase of "disappointment". Why would you expect the Python world to react any differently? > Whilst I agree that there are plenty of issues to work on, and I don't > underestimate the difficulty of some of them, I think "half-baked" is > very much overblown. Whilst you have a lot to say about how much of a > problem this is I don't understand what you are suggesting be *done*? I agree that 3.X isn't all bad, and I very much hope it succeeds. And no, I have no answers; I'm just reporting the perception from downwind. So here it is: The prevailing view is that 3.X developers hoisted things on users that they did not fully work through themselves. Unicode is prime among these: for all the talk here about how 2.X was broken in this regard, the implications of the 3.X string solution remain to be fully resolved in the 3.X standard library to this day. What is a common Python user to make of that?
--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz) From fuzzyman at voidspace.org.uk Fri Jun 18 17:31:09 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 18 Jun 2010 16:31:09 +0100 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: References: Message-ID: <4C1B913D.60401@voidspace.org.uk> On 18/06/2010 16:09, lutz at rmi.net wrote: > Replying en masse to save bandwidth here... > > Barry Warsaw writes: > >> We know it, we have extensively discussed how to fix it, we have IMO a good >> design, and we even have someone willing and able to tackle the problem. We >> need to find a sufficient source of funding to enable him to do the work it >> will take, and so far that's been the biggest stumbling block. It will take a >> focused and determined effort to see this through, and it's obvious that >> volunteers cannot make it happen. I include myself in the latter category, as >> I've tried and failed at least twice to do it in my spare time. >> > All understood, and again, not to disparage anyone here. My > comments are directed to the development community at large > to underscore the grave p/r problems 3.X faces. > > I realize email parsing is a known issue; I also realize that > most people evaluating 3.X today won't care that it is. Most > will care only that the new version of a language reportedly > used by Google and YouTube still doesn't support CGI uploads > a year and a half after its release. As an author, that's a > downright horrible story to have to tell the world. > > Really? How widely used is the CGI module these days? Maybe there is a reason nobody appeared to notice... > [snip...] >> Should Python 3 have been held back until email was fixed? Dunno, but >> I personally am very glad it was not; where I have a choice, I always >> use Python 3 now, and have yet to run into a problem. >> > I guess we'll just have to disagree on that. 
IMHO, Python 3 shot > itself in the foot by releasing in half-baked form. And the 3.0 > I/O speed issue (remember that?) came very close to blowing its > leg clean off. > > Whilst I agree that there are plenty of issues to work on, and I don't underestimate the difficulty of some of them, I think "half-baked" is very much overblown. Whilst you have a lot to say about how much of a problem this is I don't understand what you are suggesting be *done*? Python 3.0 was *declared* to be an experimental release, and by most standards 3.1 (in terms of the core language and functionality) was a solid release. Any reasonable expectation about Python 3 adoption predicted that it would take years, and would include going through a phase of difficulty and disappointment... All the best, Michael Foord -- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From pje at telecommunity.com Fri Jun 18 22:48:21 2010 From: pje at telecommunity.com (P.J. Eby) Date: Fri, 18 Jun 2010 16:48:21 -0400 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: References: Message-ID: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com> At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote: >So here it is: The prevailing view is that 3.X developers hoisted things >on users that they did not fully work through themselves.
Unicode is >prime among these: for all the talk here about how 2.X was broken in >this regard, the implications of the 3.X string solution remain to be >fully resolved in the 3.X standard library to this day. What is a >common Python user to make of that? Certainly, this was my impression as well, after all the Web-SIG discussions regarding the state of the stdlib in 3.x with respect to URL parsing, joining, opening, etc. To be honest, I'm waiting to see some sort of tutorial(s) for using 3.x that actually addresses these kinds of stdlib usage issues, so that I don't have to think about it or futz around with experimenting, possibly to find that some things can't be done at all. IOW, 3.x has broken TOOOWTDI for me in some areas. There may be obvious ways to do it, but, as per the Zen of Python, "that way may not be obvious at first unless you're Dutch". ;-) Since at the moment Python 3 offers me only cosmetic improvements over 2.x (apart from argument annotations), it's hard to get excited enough about it to want to muck about with porting anything to it, or even trying to learn about all the ramifications of the changes. :-( From jnoller at gmail.com Fri Jun 18 23:02:09 2010 From: jnoller at gmail.com (Jesse Noller) Date: Fri, 18 Jun 2010 17:02:09 -0400 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com> References: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com> Message-ID: On Fri, Jun 18, 2010 at 4:48 PM, P.J. Eby wrote: > At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote: >> >> So here it is: The prevailing view is that 3.X developers hoisted things >> on users that they did not fully work through themselves. Unicode is >> prime among these: for all the talk here about how 2.X was broken in >> this regard, the implications of the 3.X string solution remain to be >> fully resolved in the 3.X standard library to this day. What is a >> common Python user to make of that?
> > Certainly, this was my impression as well, after all the Web-SIG discussions > regarding the state of the stdlib in 3.x with respect to URL parsing, > joining, opening, etc. Nothing is set in stone; if something is incredibly painful, or worse yet broken, then someone needs to file a bug, bring it to this list, or bring up a patch. This is code we're talking about - nothing is set in stone, and if something is criminally broken it needs to be first identified, and then fixed. > To be honest, I'm waiting to see some sort of tutorial(s) for using 3.x that > actually addresses these kinds of stdlib usage issues, so that I don't have > to think about it or futz around with experimenting, possibly to find that > some things can't be done at all. I guess tutorial welcome, rather than patch welcome then ;) > IOW, 3.x has broken TOOOWTDI for me in some areas. There may be obvious > ways to do it, but, as per the Zen of Python, "that way may not be obvious > at first unless you're Dutch". ;-) What areas? We need specifics which can either be: 1> Shot down. 2> Turned into bugs, so they can be fixed 3> Documented in the core documentation. jesse From nyamatongwe at gmail.com Sat Jun 19 00:31:40 2010 From: nyamatongwe at gmail.com (Neil Hodgson) Date: Sat, 19 Jun 2010 08:31:40 +1000 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <4C1B913D.60401@voidspace.org.uk> References: <4C1B913D.60401@voidspace.org.uk> Message-ID: Michael Foord: > Python 3.0 was *declared* to be an experimental release, and by most > standards 3.1 (in terms of the core language and functionality) was a solid > release. That looks to me like an after-the-event rationalization. The release note for Python 3.0 (and the "What's new") gives no indication that it is experimental but does say """ We are confident that Python 3.0 is of the same high quality as our previous releases ... you can safely choose either version (or both) to use in your projects.
""" http://mail.python.org/pipermail/python-dev/2008-December/083824.html Neil From stephen at xemacs.org Sat Jun 19 15:55:29 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 19 Jun 2010 22:55:29 +0900 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: References: Message-ID: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> lutz at rmi.net writes: > I agree that 3.X isn't all bad, and I very much hope it succeeds. And > no, I have no answers; I'm just reporting the perception from downwind. The fact is, though, that many of your "downwind" readers are not the audience for Python 3, not yet. If you want to do Python 3 a favor, make sure that they understand that Python 3 is *not* an "upgrade" of Python 2. It's a hard task for you, but IMO one strategy is to write in the style that we wrote the DVCS PEP (#374) in: here's how you do the same task in these similar languages. And just as git and Bazaar turned out to have fatal defects in terms of adoption *in that time frame*, Python 3 is not yet adoptable for many, many users. Python 3 is a Python-2-like language, but even though it's built on the same design principles, and uses nearly identical syntax, there are fundamental differences. And it is *very* young. So it's a new language and should be approached in the same way as any new language. Try it on non-mission critical projects, on projects where its library support has a good reputation, etc. Many of your readers have no time (or perhaps no approval "from upstairs") for that kind of thing. Too bad, but that's what happens to every great new language. > So here it is: The prevailing view is that 3.X developers hoisted things > on users that they did not fully work through themselves. Unicode is > prime among these: for all the talk here about how 2.X was broken in > this regard, the implications of the 3.X string solution remain to be > fully resolved in the 3.X standard library to this day. 
What is a > common Python user to make of that? Why should she make anything of that? Python 3 is a *new* language, possibly as different from Python 2 as C++ was from C (and *more* different in terms of fundamental incompatibilities). And as long as C++ was almost entirely dependent on C libraries, there were problems. (Not to mention that even today there are plenty of programmers who are proud to be C programmers, not C++ programmers.) Today, Python 3 is entirely dependent on Python 2 libraries. It's human to hope there will be no problems, but not realistic. BTW, I think what you're missing is that you're wrong about the money. Python 3 is still about the fun and the code. "Fun and code" are why the core developers spent about five years developing it, because doing that was fun, because the new code has high value as code, and because it promised *them* a more fun and more productive future. Library support, on the other hand, *is* about money. Your readers, down in the trenches of WWW, intraweb, and sysadmin implementation and support, depend on robust libraries to get their day jobs done. They really don't care that writing Python 3 was fun, and that programming in Python 3 is more fun than ever. That doesn't compensate for even one lingering str/bytes bogosity to most of them, and since they don't get paid for fixing Python library bugs, they don't, and they're in no mood to *forgive* any, either. So tell users who feel that way to use Python 2, for now, and check on Python 3 progress every 6 months or so. And users who are just a bit more adventurous to stick to applications where the libraries already have a good reputation *in Python 3*. It's as simple as that, I think. Regards, From pje at telecommunity.com Sat Jun 19 18:07:43 2010 From: pje at telecommunity.com (P.J. 
Eby) Date: Sat, 19 Jun 2010 12:07:43 -0400 Subject: [Email-SIG] [Python-Dev] email package status in 3.X In-Reply-To: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20100619160755.4E50C3A4060@sparrow.telecommunity.com> At 10:55 PM 6/19/2010 +0900, Stephen J. Turnbull wrote: >They really don't care that writing Python 3 was fun, and that >programming in Python 3 is more fun than ever. That doesn't >compensate for even one lingering str/bytes bogosity to most of >them, and since they don't get paid for fixing Python library bugs, >they don't, and they're in no mood to *forgive* any, either. This is pretty much where I'm at, except that the only potential fun increase Py3 appears to offer me are argument annotations and keyword-only args -- but these are partly balanced by the loss of argument tuple unpacking. The metaclass keyword argument is nice, but the loss of dynamically-settable __metaclass__ is just plain annoying. Really, just about everything that Py3 offers in the way of added fun, seems offset by a matching loss somewhere else. So it's hard to get excited about it - it seems like, "ho hum, a new language that's kind of like Python, but just different enough to be annoying." OTOH, I don't know what to do about that, besides adding some sort of "killer app" feature that makes Python 3 the One Obvious Way to do some specific application domain.
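[Editorial illustration] For readers who have not used the Python 3 features weighed above, the following sketch shows each one in isolation. It is an illustration only, not code from the thread; every name in it (`connect`, `distance`, `Registry`, `Plugin`) is hypothetical.

```python
# Argument annotations and a keyword-only argument (everything after the bare *).
def connect(host: str, port: int = 25, *, timeout: float = 30.0) -> str:
    return "{}:{} (timeout={})".format(host, port, timeout)


# Python 2 allowed "def distance((x, y)):"; Python 3 removed argument tuple
# unpacking, so the tuple has to be unpacked explicitly in the body.
def distance(point):
    x, y = point
    return (x * x + y * y) ** 0.5


# The metaclass is now given as a keyword in the class statement; assigning
# __metaclass__ in the class body no longer has any effect.
class Registry(type):
    classes = []

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.classes.append(name)   # record every class built by this metaclass
        return cls


class Plugin(metaclass=Registry):
    pass
```

Here `connect("mail.example.org", timeout=5.0)` is accepted, while passing the timeout positionally raises `TypeError`, which is the trade the keyword-only syntax makes: stricter call sites in exchange for clearer ones.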