From rdmurray at bitdance.com Tue Apr 5 16:17:54 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 05 Apr 2011 10:17:54 -0400
Subject: [Email-SIG] Hooking up Policy: work in progress
Message-ID: <20110405141815.5B458250C6D@mailhost.webabinitio.net>

Here's my blog post for my work last week on hooking up Policy:

http://www.bitdance.com/blog/2011/04/04_01_Hooking_Up_Email6_Policy_First_Steps/

All of that is pushed to the feature branch:

http://hg.python.org/features/email6#email6

I also posted a cumulative diff against default to the tracker here:

http://bugs.python.org/issue11731

which now, thanks to Martin, also means there's a Rietveld review issue for it.

No need for detailed reviews yet, I think (unless you feel like it of course; such would be welcome any time), but if anyone has time to cast an eye over it to make sure I'm headed in the right direction that would be great.

Anyone who wants me to add them to nosy on any email6 issues I create, please let me know your tracker id and I will do so.

--David

From barry at python.org Tue Apr 5 21:34:38 2011
From: barry at python.org (Barry Warsaw)
Date: Tue, 5 Apr 2011 15:34:38 -0400
Subject: [Email-SIG] Hooking up Policy: work in progress
In-Reply-To: <20110405141815.5B458250C6D@mailhost.webabinitio.net>
References: <20110405141815.5B458250C6D@mailhost.webabinitio.net>
Message-ID: <20110405153438.62b221b0@neurotica.wooz.org>

On Apr 05, 2011, at 10:17 AM, R. David Murray wrote:

>Here's my blog post for my work last week on hooking up Policy:
>
>http://www.bitdance.com/blog/2011/04/04_01_Hooking_Up_Email6_Policy_First_Steps/

Very nice post! It brings up lots of memories, and it's great to see things getting refactored, removed, and improved.

>Anyone who wants me to add them to nosy on any email6 issues I create,
>please let me know your tracker id and I will do so.

Well, you're probably adding me by default... keep doing so!
:)

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From axel.rau at chaos1.de Sat Apr 9 23:18:33 2011
From: axel.rau at chaos1.de (Axel Rau)
Date: Sat, 9 Apr 2011 23:18:33 +0200
Subject: [Email-SIG] 3.2: email.message.get_payload() delivers str, but send_message expects bytes
Message-ID:

Hi all,

[This is a repost from comp.lang.python]

I'm just starting with imaplib, email and smtplib and am trying to write a SPAM reporter. I retrieve SPAM mails from an IMAP server and add them as message/rfc822 attachments to a report mail. Sometimes my call of smtplib.send_message works; sometimes I get:
----------
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/smtplib.py", line 771, in send_message
    rcpt_options)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/smtplib.py", line 739, in sendmail
    (code,resp) = self.data(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/smtplib.py", line 495, in data
    q = _quote_periods(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/smtplib.py", line 165, in _quote_periods
    return re.sub(br'(?m)^\.', '..', bindata)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/re.py", line 167, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: sequence item 1: expected bytes, str found
----------
When I query the class of my payloads, they always show up as strings. The test case that always fails is an oversized SPAM, which my script must truncate. I do this by removing MIME parts from the end (just deleting items from the list describing the multipart structure).
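
One reading of the traceback above: the pattern passed to re.sub in _quote_periods is a bytes pattern, while the replacement '..' is a str, so the mismatch would only surface for messages that contain a line starting with a period. That may explain why the failure is intermittent. A minimal sketch under that assumption follows; quote_periods and flatten_to_bytes are hypothetical helpers written for illustration, not the stdlib functions.

```python
import re
from io import BytesIO
from email.generator import BytesGenerator
from email.message import Message


def quote_periods(bindata):
    """Dot-stuff SMTP data: double a period at the start of any line.

    Pattern and replacement are both bytes, so re.sub never has to mix
    str and bytes when it builds the result.
    """
    return re.sub(br'(?m)^\.', b'..', bindata)


def flatten_to_bytes(msg):
    """Serialize a message to bytes, as a bytes-based sender would need."""
    buf = BytesIO()
    BytesGenerator(buf).flatten(msg)
    return buf.getvalue()


# Illustrative message whose body has a line starting with a period.
msg = Message()
msg['Subject'] = 'report'
msg.set_payload('first line\n.hidden line\n')

data = quote_periods(flatten_to_bytes(msg))
```

The payload itself can stay a str here; what matters in this sketch is that everything handed to the dot-stuffing step, including the replacement, is bytes.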
Another problem comes up when I try to encode the payload of the whole report mail; I always get:
-------
  File "erdb_bt.py", line 195, in flushReports
    email.encoders.encode_base64(self.msg)
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/email/encoders.py", line 32, in encode_base64
    encdata = str(_bencode(orig), 'ascii')
  File "/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/base64.py", line 56, in b64encode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not list
-------
What am I doing wrong?

Axel
---
PGP-Key:29E99DD6 ? +49 151 2300 9283 ? computing @ chaos claudius

From rdmurray at bitdance.com Sun Apr 10 22:46:33 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Sun, 10 Apr 2011 16:46:33 -0400
Subject: [Email-SIG] Fixing the header wrapping algorithm
Message-ID: <20110410204653.797512505D7@mailhost.webabinitio.net>

I just posted a significant patch on issue 11492. After banging my head on the existing header folding algorithm and watching the if/else cases proliferate and break other things, I decided to try a rewrite. It is 70 lines shorter but passes all the tests plus the new ones I posted to the bug. And some additional ones.

In the new algorithm I'm changing the interpretation of RFC2822 that it implements. The old algorithm breaks on the 'splitchars' unconditionally, introducing whitespace if there isn't whitespace there already. This seems wrong to me. When 2822 talks about higher level syntactic breaks, I believe it means only such breaks where FWS is present. So the new algorithm breaks only where there is at least one tab or space, but prefers to break after the splitchars when such are followed by a tab or space.

We still aren't doing it "right", because we aren't paying attention to the real syntax of structured headers, and we might inadvertently break at whitespace that is not legitimate FWS.
Those cases should be pretty darn rare, though, and the old algorithm could make the same mistake.

The patch adjusts a few tests that were checking the old line breaking that was failing to break long lines even though they contained whitespace when they also contained splitchars. There is even a comment in one of them that says that it is wrong.

Since this fixes bugs and improves RFC compliance, I plan to apply it to 3.2. (As noted in the issue, 3.1 has a test failure I don't understand...really I ought to figure it out, and perhaps I will before the time comes that I can actually apply the patch.)

diffstat says the header.py portion of the patch is 107 lines added and 178 deleted, so it is a nontrivial change. Reviews welcomed.

--David

From rdmurray at bitdance.com Mon Apr 11 23:48:15 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 11 Apr 2011 17:48:15 -0400
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
Message-ID: <20110411214836.49D7C25063A@mailhost.webabinitio.net>

For the curious, I just posted a writeup of the process that produced the header folding algorithm rewrite on my blog:

http://www.bitdance.com/blog/2011/04/11_01_Email6_Rewriting_Header_Folding/

--
R. David Murray http://www.bitdance.com

From v+python at g.nevcal.com Tue Apr 12 00:07:09 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 11 Apr 2011 15:07:09 -0700
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
In-Reply-To: <20110411214836.49D7C25063A@mailhost.webabinitio.net>
References: <20110411214836.49D7C25063A@mailhost.webabinitio.net>
Message-ID: <4DA37B8D.8040005@g.nevcal.com>

On 4/11/2011 2:48 PM, R. David Murray wrote:
> For the curious, I just posted a writeup of the process that
> produced the header folding algorithm rewrite on my blog:
>
> http://www.bitdance.com/blog/2011/04/11_01_Email6_Rewriting_Header_Folding/

Updated my virus definitions, and it went away. Guess it was a false positive in Avast.
Sorry for the noise.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Tue Apr 12 00:12:05 2011
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Apr 2011 18:12:05 -0400
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
In-Reply-To: <20110411214836.49D7C25063A@mailhost.webabinitio.net>
References: <20110411214836.49D7C25063A@mailhost.webabinitio.net>
Message-ID: <20110411181205.7c2c5c4f@neurotica.wooz.org>

On Apr 11, 2011, at 05:48 PM, R. David Murray wrote:

>For the curious, I just posted a writeup of the process that
>produced the header folding algorithm rewrite on my blog:
>
>http://www.bitdance.com/blog/2011/04/11_01_Email6_Rewriting_Header_Folding/

Excellent post David. Thanks too for being so diplomatic when mentioning "the author of the algorithm" :).

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From rdmurray at bitdance.com Tue Apr 12 00:21:13 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 11 Apr 2011 18:21:13 -0400
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
In-Reply-To: <20110411181205.7c2c5c4f@neurotica.wooz.org>
References: <20110411214836.49D7C25063A@mailhost.webabinitio.net> <20110411181205.7c2c5c4f@neurotica.wooz.org>
Message-ID: <20110411222133.E83832505A2@mailhost.webabinitio.net>

On Mon, 11 Apr 2011 18:12:05 -0400, Barry Warsaw wrote:
> On Apr 11, 2011, at 05:48 PM, R. David Murray wrote:
> >For the curious, I just posted a writeup of the process that
> >produced the header folding algorithm rewrite on my blog:
> >
> >http://www.bitdance.com/blog/2011/04/11_01_Email6_Rewriting_Header_Folding/
>
> Excellent post David. Thanks too for being so diplomatic when
> mentioning "the author of the algorithm" :).

Well, (a) I wasn't sure it was you and (b) my original attempt to rewrite the algorithm ended up even uglier than yours :)

--
R. David Murray http://www.bitdance.com

From v+python at g.nevcal.com Tue Apr 12 00:26:27 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 11 Apr 2011 15:26:27 -0700
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
In-Reply-To: <20110411214836.49D7C25063A@mailhost.webabinitio.net>
References: <20110411214836.49D7C25063A@mailhost.webabinitio.net>
Message-ID: <4DA38013.405@g.nevcal.com>

On 4/11/2011 2:48 PM, R. David Murray wrote:
> For the curious, I just posted a writeup of the process that
> produced the header folding algorithm rewrite on my blog:
>
> http://www.bitdance.com/blog/2011/04/11_01_Email6_Rewriting_Header_Folding/

Interesting!

> [2] There
> may also be cases where whitespace does /not/ mark a valid folding
> point. So for email6 the folding API will further need to provide a
> way for the specific header to indicate these points before folding.
> The simplest scheme is probably to replace the non-folding whitespace
> with marker characters, fold the header, and then re-convert the
> marker characters to the original whitespace.

Another alternative may be to allow the header access to the Accumulator, and let it emit chunks directly into the accumulator. This would save reparsing.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rdmurray at bitdance.com Tue Apr 12 03:09:20 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 11 Apr 2011 21:09:20 -0400
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
In-Reply-To: <4DA38013.405@g.nevcal.com>
References: <20110411214836.49D7C25063A@mailhost.webabinitio.net> <4DA38013.405@g.nevcal.com>
Message-ID: <20110412010949.ABBB025063A@mailhost.webabinitio.net>

On Mon, 11 Apr 2011 15:26:27 -0700, Glenn Linderman wrote:
> On 4/11/2011 2:48 PM, R. David Murray wrote:
> > [2] There
> > may also be cases where whitespace does /not/ mark a valid folding
> > point. So for email6 the folding API will further need to provide a
> > way for the specific header to indicate these points before folding.
> > The simplest scheme is probably to replace the non-folding whitespace
> > with marker characters, fold the header, and then re-convert the
> > marker characters to the original whitespace.
>
> Another alternative may be to allow the header access to the
> Accumulator, and let it emit chunks directly into the accumulator. This
> would save reparsing.

Well, the algorithm doesn't work that way, since the smarts for managing the accumulator are all in the class that is building the output lines, not in the accumulator itself. Right now what gets fed into that class is strings that are pieces of the header (actually it's (string, charset) pairs), which the feed method then splits up into potential split points and uses the accumulator to manage.

What would work would be to provide an alternate API where the Header class feeds in an already-split list instead of a string. That makes sense from several perspectives.

Or, despite what I said in the blog post, I may end up rewriting the algorithm again at some point ;)

--David

From v+python at g.nevcal.com Tue Apr 12 03:42:33 2011
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 11 Apr 2011 18:42:33 -0700
Subject: [Email-SIG] blog post about the headef folding algorithm rewrite
In-Reply-To: <20110412010949.ABBB025063A@mailhost.webabinitio.net>
References: <20110411214836.49D7C25063A@mailhost.webabinitio.net> <4DA38013.405@g.nevcal.com> <20110412010949.ABBB025063A@mailhost.webabinitio.net>
Message-ID: <4DA3AE09.6060605@g.nevcal.com>

On 4/11/2011 6:09 PM, R. David Murray wrote:
> What would work would be to provide an alternate API where the Header
> class feeds in an already-split list instead of a string. That makes
> sense from several perspectives.

That's the essence of what I meant. An already-split list wouldn't (hopefully) need to do the double substitution for white-space versus placeholders, and wouldn't need to parse to find potential split points. The only fly in the ointment would be if there were an item in the already-split list that is too long and needs to be further split.

"Just a person that is interested in email and wants to post to the list and wants to use a descriptive but not real name for real name" -ly yours,
Glenn
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rdmurray at bitdance.com Thu Apr 14 18:39:27 2011
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 14 Apr 2011 12:39:27 -0400
Subject: [Email-SIG] policy patch review
Message-ID: <20110414163952.3E0012500D5@mailhost.webabinitio.net>

OK, so I've started designing/coding the more interesting stuff (the new header class), and I find I need a new policy control. So what I'd like to do is commit the policy stuff I've done to this point so I can use that as a base for further development. So I've removed the parameters that I'm not ready to implement yet and have posted a version of the patch for final review on the tracker (http://bugs.python.org/issue11731). Compared to the previous version, this version of the patch pretty much just adds the hooks for maxlinelen.

The proposed policy patch shows all the essentials of the API/framework, with several "hooked up" examples. It does not include any existing-parameter deprecations; I'll leave those for much later.

Please let me know what you think. Absent negative feedback I'll probably commit this (and the folding algorithm rewrite) early next week.
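
The line-length hook described above can be tried against the email package as it was eventually released. This is a minimal sketch, assuming the released email.policy module, where the control is spelled max_line_length rather than maxlinelen; the patch under review here may differ in naming and detail. The subject text and the 40-column limit are arbitrary values chosen for illustration.

```python
from email import policy
from email.message import EmailMessage

# Policy objects are cloned rather than mutated; clone() returns a new
# policy with the given attributes overridden.
narrow = policy.default.clone(max_line_length=40)

subject = ('a long subject line that the generator should fold '
           'at whitespace when serializing')

msg = EmailMessage(policy=narrow)
msg['Subject'] = subject

# Serializing with the message's own policy folds the header so that no
# output line is longer than max_line_length.
text = msg.as_string()
print(text)
```

Reading the header back through msg['Subject'] still gives the unfolded value; the folding only appears in the serialized form.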

--David

From barry at python.org Fri Apr 15 14:47:32 2011
From: barry at python.org (Barry Warsaw)
Date: Fri, 15 Apr 2011 08:47:32 -0400
Subject: [Email-SIG] policy patch review
In-Reply-To: <20110414163952.3E0012500D5@mailhost.webabinitio.net>
References: <20110414163952.3E0012500D5@mailhost.webabinitio.net>
Message-ID: <20110415084732.21419bfb@neurotica.wooz.org>

On Apr 14, 2011, at 12:39 PM, R. David Murray wrote:

>Please let me know what you think. Absent negative feedback I'll probably
>commit this (and the folding algorithm rewrite) early next week.

Really nice stuff David. I did a quick review, but my comments are mostly shallow.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: