From lutz at rmi.net Thu Nov 4 13:56:34 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Thu, 04 Nov 2010 12:56:34 -0000
Subject: [Email-SIG] email breakage in 3.2 alpha
Message-ID: <0any8rx56hvk6rru04112010085700@SMTP>

Actually, nevermind. I have to submit the QC book review to O'Reilly
today, so I must assume that the email changes in 3.2 are immutable
at this point.

To accommodate, I made a last minute patch to the book and its examples
package, to special-case the mail sender's workaround code for the fact
that 3.2 now returns str instead of bytes:

..
text = msgobj.get_payload()      # bytes fails in email pkg on text gen
if isinstance(text, bytes):      # payload is bytes in 3.1, str in 3.2 alpha
    text = text.decode('ascii')  # decode to unicode str so text gen works
..

With this, sends work under 3.2 too. The workaround must still split up
the base64 data into lines, though, or else emails send with one massive
line which does not play well with many text tools. Other 3.2 "fixes"
seem to be compatible with my workarounds so far (at least with the
limited testing I've been able to do).

In the end, this wasn't a big change on my end, though patching code
in books just before publication is very error-prone. The bigger
issue to me is that Python core developers seem a bit too inclined
to delegate the consequences of their actions to their users. I don't
mean this personally; it's a group-wide attitude that has escalated in
recent years. In this case, it was left to me to accommodate recent
changes, or they would have broken a new 3.X book's major example.

More to the point: if a fix made in the name of aesthetics breaks
working code, is it really a fix? I don't think so, and perhaps
we'll have to agree to disagree, but I hope this case serves as
a data point for future email changes.

--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)

> -----Original Message-----
> From: lutz at rmi.net
> To: "R. David Murray"
> Subject: email breakage in 3.2 alpha
> Date: Mon, 01 Nov 2010 17:50:20 -0000
>
> Hi David,
>
> (Sending this offlist; feel free to forward as appropriate)
>
> As promised, I've finally gotten around to testing the big email
> client example in the upcoming Programming Python under 3.2 alpha3.
> Although much still works, as expected the example is now broken
> under 3.2. So far at least, the only specific breakage I've found
> is on sending emails with non-text attachments. Obviously, this
> is a major issue by itself.
>
> Below is the change, exception, and relevant code for the send
> breakage I've found so far; the full source file is also attached.
> (See my prior mail for the book examples link; it's on oreilly.com.)
>
> In short, for 3.1, I had to add code that manually decoded the bytes
> that email in 3.1 left for base64-encoded binary parts. Because email
> in 3.2 now returns these as str instead of bytes, the workaround
> decoding step no longer is needed, but also now fails in 3.2.
>
> This change is an improvement in 3.2, of course; but because it
> breaks code that ran correctly under 3.1, it's also a regression.
> It would be straightforward for me to special-case the code for
> 3.1 only; unfortunately I'm no longer able to change the book or
> retarget it for 3.2 (it's in production now, and is already late).
> Given that this book might be seen by something like 100K readers
> evaluating 3.X in general, 3.1 compatibility seems a big deal to
> me. Leaving readers with the impression that 3.X is not even
> backwards compatible within its own line is not good.
>
> So, I see two options here:
>
> 1) I can patch the source code of the book's examples on the web and
> post an errata prior to publication. Not difficult, but again, my
> larger concern is the PR effect of having to note that the book was
> broken by 3.X changes in between the time it was written and the time
> it was printed. I'd much rather have any patch/errata work delayed
> until 3.3 if possible per the next option (as is, the book already
> has to mention more 3.X issues than I'd like). Even in 3.3+, I
> think incompatibilities should be an option that must be enabled
> if they cannot be avoided altogether (as discussed before).
>
> 2) In cases like this where you've changed email in a way that's
> incompatible with 3.1, unless the 3.1 code never worked at all,
> you probably should make the original 3.1 behavior the default,
> and allow 3.2 changes to be enabled as options (e.g., via default
> method args, top-level settings in the package, command-line args,
> and so on). In this specific case, even though I agree that
> returning a base64-encoded part as bytes might seem "wrong",
> it did work; changing it to always return str now is a regression
> from 3.1 behavior, and breaks code. Ideally, this would continue
> to return bytes, but could return str in 3.2 if an incompatibility
> switch were set in the package.
>
> Naturally, other more exotic solutions are plausible too (returning
> a str subclass instance with a no-op .decode() method added, for
> example); but they're probably too tricky to bother with for a
> temporary compatibility fix.
>
> I completely understand the desire to fix email issues asap, but
> these issues were not showstoppers, and could be managed in 3.1.
> To me, unless code absolutely cannot work with what's present in
> a release, "fixing" it in a later release also implies potentially
> "breaking" it for current release users, and is not a clear-cut
> Good Thing. Let me know what you think; given the broad impact
> the book may have on 3.X's future, please weigh this carefully.
>
> Thanks,
> --Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)
>
> ================================================================================
> [changed behavior that breaks non-text-attachment sends]
>
> C:\...>c:\python31\python
> >>> from email.mime.image import MIMEImage
> >>> bytes = open('monkeys.jpg', 'rb').read()
> >>> m = MIMEImage(bytes)
> >>> m.get_payload()[:40]
> b'/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIB'
>
> C:\...>c:\python32\python
> >>> from email.mime.image import MIMEImage
> >>> bytes = open('monkeys.jpg', 'rb').read()
> >>> m = MIMEImage(bytes)
> >>> m.get_payload()[:40]
> '/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIB'
>
> [the following also changed and may impact email clients' content
> type handling, but looks irrelevant to the email package regression
> (mimetypes now scans the Windows registry, which looks a bit iffy)]
>
> C:\...>c:\python31\python
> >>> from mimetypes import guess_type
> >>> guess_type('monkeys.jpg')
> ('image/jpeg', None)
>
> C:...>c:\python32\python
> >>> from mimetypes import guess_type
> >>> guess_type('monkeys.jpg')
> ('image/pjpeg', None)
>
> ================================================================================
> [exception on failure of send with image attachment]
>
> Adding image/pjpeg
>
> 'str' object has no attribute 'decode'
>   File "C:\Examples\PP4E\Gui\Tools\threadtools.py", line 81, in threaded
>     action(*args) # assume raises exception if fails
>   File "C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py", line 111, in sendMessage
>     bodytextEncoding, attachesEncodings)
>   File "C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py", line 201, in addAttachments
>     data.read(), _subtype=subtype, _encoder=fix_encode_base64)
>   File "c:\Python32\lib\email\mime\image.py", line 46, in __init__
>     _encoder(self)
>   File "C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py", line 39, in fix_encode_base64
>     text = bytes.decode('ascii') # decode to unicode str so text gen works
>
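The failure in the traceback above is the workaround calling .decode() on a payload that is already str. Below is a minimal sketch of an encoder that tolerates both payload types, following the isinstance check described in this thread. The names tolerant_encode_base64 and build_image_part are hypothetical and are not taken from the book's code or from the email package. The sketch also assumes the payload may already contain line breaks, so it strips them before rewrapping at 76 characters.

```python
import base64
from email.encoders import encode_base64
from email.mime.image import MIMEImage

LINE_LEN = 76  # maximum base64 line length allowed by the MIME standard


def tolerant_encode_base64(msgobj):
    """Hypothetical encoder: base64-encode the payload, then store it as
    wrapped str whether the email package left it as bytes or as str."""
    encode_base64(msgobj)                   # standard encoder from the email package
    payload = msgobj.get_payload()
    if isinstance(payload, bytes):          # behavior reported for 3.1 in this thread
        payload = payload.decode('ascii')   # base64 text is pure ASCII
    compact = ''.join(payload.split())      # drop any line breaks already present
    lines = [compact[i:i + LINE_LEN] for i in range(0, len(compact), LINE_LEN)]
    msgobj.set_payload('\n'.join(lines))


def build_image_part(data, subtype='jpeg'):
    """Hypothetical helper: wrap raw image bytes in a MIME image part."""
    return MIMEImage(data, _subtype=subtype, _encoder=tolerant_encode_base64)


if __name__ == '__main__':
    raw = bytes(range(256)) * 4             # stand-in for real image bytes
    part = build_image_part(raw)
    # The wrapped payload still decodes back to the original bytes.
    assert base64.b64decode(part.get_payload()) == raw
    print(part.as_string()[:120])
```

Because the decode step is guarded, the same encoder can be passed as _encoder regardless of which payload type the installed email package produces.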
> ================================================================================
> [relevant code in C:\Examples\PP4E\Internet\Email\mailtools\mailSender.py]
>
> def fix_encode_base64(msgobj):
>     """
>     4E: workaround for a genuine bug in Python 3.1 email package that prevents
>     mail text generation for binary parts encoded with base64 or other email
>     encodings; the normal email.encoder run by the constructor leaves payload
>     as bytes, even though it's encoded to base64 text form; this breaks email
>     text generation which assumes this is text and requires it to be str; net
>     effect is that only simple text part emails can be composed in Py 3.1 email
>     package as is - any MIME-encoded binary part cause mail text generation to
>     fail; this bug seems likely to go away in a future Python and email package,
>     in which case this should become a no-op; see Chapter 13 for more details;
>     """
>     linelen = 76 # per MIME standards
>     from email.encoders import encode_base64
>
>     encode_base64(msgobj) # what email does normally: leaves bytes
>     bytes = msgobj.get_payload() # bytes fails in email pkg on text gen
> 39=> text = bytes.decode('ascii') # decode to unicode str so text gen works
>     lines = [] # split into lines, else 1 massive line
>     while text:
>         line, text = text[:linelen], text[linelen:]
>         lines.append(line)
>     msgobj.set_payload('\n'.join(lines))
>
>
> class MailSender(MailTool):
>     def sendMessage(self, From, To, Subj, extrahdrs, bodytext, attaches,
>                     saveMailSeparator=(('=' * 80) + 'PY\n'),
>                     bodytextEncoding='us-ascii',
>                     attachesEncodings=None):
>         ...
>         msg = MIMEMultipart()
>         self.addAttachments(msg, bodytext, attaches,
>                             bodytextEncoding, attachesEncodings)
>
>     def addAttachments(self, mainmsg, bodytext, attaches,
>                        bodytextEncoding, attachesEncodings):
>         """
>         format a multipart message with attachments;
>         use Unicode encodings for text parts if passed;
>         """
>         # add main text/plain part
>         msg = MIMEText(bodytext, _charset=bodytextEncoding)
>         mainmsg.attach(msg)
>
>         # add attachment parts
>         encodings = attachesEncodings or (['us-ascii'] * len(attaches))
>         for (filename, fileencode) in zip(attaches, encodings):
>             ...
>             # guess content type from file extension, ignore encoding
>             contype, encoding = mimetypes.guess_type(filename)
>             if contype is None or encoding is not None: # no guess, compressed?
>                 contype = 'application/octet-stream' # use generic default
>             self.trace('Adding ' + contype)
>
>             # build sub-Message of appropriate kind
>             maintype, subtype = contype.split('/', 1)
>             if maintype == 'text': # 4E: text needs encoding
>                 if fix_text_required(fileencode): # requires str or bytes
>                     data = open(filename, 'r', encoding=fileencode)
>                 else:
>                     data = open(filename, 'rb')
>                 msg = MIMEText(data.read(), _subtype=subtype, _charset=fileencode)
>                 data.close()
>
>             elif maintype == 'image':
>                 data = open(filename, 'rb') # 4E: use fix for binaries
>                 msg = MIMEImage(
> 201=>              data.read(), _subtype=subtype, _encoder=fix_encode_base64)
>                 data.close()
>             ...
>
> =======================================================================

From rdmurray at bitdance.com Mon Nov 8 21:35:32 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 08 Nov 2010 15:35:32 -0500
Subject: [Email-SIG] email breakage in 3.2 alpha
In-Reply-To: <0any8rx56hvk6rru04112010085700@SMTP>
References: <0any8rx56hvk6rru04112010085700@SMTP>
Message-ID: <20101108203532.383811FED05@kimball.webabinitio.net>

On Thu, 04 Nov 2010 12:56:34 -0000, lutz at rmi.net wrote:
> Actually, nevermind. I have to submit the QC book review to O'Reilly
> today, so I must assume that the email changes in 3.2 are immutable
> at this point.
>
> To accommodate, I made a last minute patch to the book and its examples
> package, to special-case the mail sender's workaround code for the fact
> that 3.2 now returns str instead of bytes:

There is no reason we should withdraw that fix. It is a bug fix, and
makes the code work as documented. There are times when we don't fix
bugs in a maintenance release because to fix it would break a significant
amount of code "in the field" that depends on the bug being present,
but such a consideration is given less weight for a major release.
Otherwise we'd have to live with stupid bugs forever, and that would
not be a service to our users. As it is, "less weight" is not "no
weight", and we do live with an awful lot of stupid bugs (especially
design bugs) due to such considerations.

In this case, because the bug is so obviously a bug that no one should
depend on it not getting fixed, the fix was also backported to 3.1 and
will appear in 3.1.3. You may consider this as putting a burden on
the users, but we view it as *serving* the users: making the code work
as designed and documented...in other words, fulfilling the promise
made by our documentation.

I told you that the fix would appear in 3.2 and 3.1.3 back at the
beginning of June when I fixed it. In fact, I made a specific effort
to get that bug fixed well in advance of our projected release dates so
that you would be able to reference it in your book.

A good strategy for you to have followed, when you found this bug while
writing the program for your book, would have been to submit a bug report,
and then have written your code from the start to work whether or not the
bug was fixed, so that when it was fixed your code would continue to work.

> ..
> text = msgobj.get_payload()      # bytes fails in email pkg on text gen
> if isinstance(text, bytes):      # payload is bytes in 3.1, str in 3.2 alpha
>     text = text.decode('ascii')  # decode to unicode str so text gen works
> ..
>
> With this, sends work under 3.2 too. The workaround must still split up
> the base64 data into lines, though, or else emails send with one massive
> line which does not play well with many text tools. Other 3.2 "fixes"

It would have been helpful if you had reported this bug in the tracker
(and it would still be helpful). I will take a look and see if I can
reproduce it.

> seem to be compatible with my workarounds so far (at least with the
> limited testing I've been able to do).

Thanks for the testing and the report.

> In the end, this wasn't a big change on my end, though patching code
> in books just before publication is very error-prone. The bigger
> issue to me is that Python core developers seem a bit too inclined
> to delegate the consequences of their actions to their users. I don't
> mean this personally; it's a group-wide attitude that has escalated in
> recent years. In this case, it was left to me to accommodate recent
> changes, or they would have broken a new 3.X book's major example.

Well, there are *always* changes in a major release. We do try
to minimize backward compatibility problems, but there is no perfect
world.

> More to the point: if a fix made in the name of aesthetics breaks
> working code, is it really a fix? I don't think so, and perhaps
> we'll have to agree to disagree, but I hope this case serves as
> a data point for future email changes.

I suppose that depends on your definition of aesthetics. To me this
was not a matter of aesthetics, but of code not working as designed or
documented. I stand behind my decision to implement this fix; so, yes,
I suppose we will just have to agree to disagree.

-- R.
David Murray www.bitdance.com

From lutz at rmi.net Tue Nov 9 14:03:12 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Tue, 09 Nov 2010 13:03:12 -0000
Subject: [Email-SIG] email breakage in 3.2 alpha
Message-ID: <3508w1cz67cyop5l09112010080321@SMTP>

Truthfully, the tone of your reply illustrates much of my point.
I made the book's dependence on 3.1 email as clear as could be,
both in June and in later exchanges. I did so for the sake of
100K flesh-and-blood Python users, and seem to have come up short.
If Python developers don't work for their users, who do they work for?

But you gotta do what you gotta do, and so do I. So let's stick to
the constructive here. I hope that you will keep the book's deep
dependence on email in mind in future work; not for my sake, but
for its readers, a user base clearly large enough to matter.

And for the sake of Python 3 itself, can we at least agree to not
make further changes to the email package until after 3.2 is out?
I went to the trouble of porting from 3.1 to 3.2 alpha3, and would
hate to have to tell the world that the pace of Python change is so
brisk that stability cannot be relied on for even a few weeks.

--Mark Lutz (http://learning-python.com, http://rmi.net/~lutz)
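The content-type fallback in the addAttachments code quoted earlier in this thread can be isolated as a small function. This is only a sketch under stated assumptions: split_content_type and GENERIC_TYPE are hypothetical names, and only mimetypes.guess_type comes from the standard library.

```python
import mimetypes

GENERIC_TYPE = 'application/octet-stream'  # generic binary fallback


def split_content_type(filename):
    """Hypothetical helper: guess (maintype, subtype) from a filename,
    using the generic binary type when there is no usable guess."""
    contype, encoding = mimetypes.guess_type(filename)
    if contype is None or encoding is not None:   # unknown type, or compressed
        contype = GENERIC_TYPE
    maintype, subtype = contype.split('/', 1)
    return maintype, subtype


if __name__ == '__main__':
    for name in ('monkeys.jpg', 'archive.tar.gz', 'data.zzunknownext'):
        print(name, split_content_type(name))
```

Since the thread reports the guess for the same .jpg file changing from image/jpeg to image/pjpeg between releases on Windows, dispatching on the main type alone, as the quoted code does, is less sensitive to such differences than matching the full type string.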