From wichert at wiggy.net Wed Jan 5 10:23:23 2005
From: wichert at wiggy.net (Wichert Akkerman)
Date: Wed Jan 5 10:23:25 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <20041203151321.GL17614@wiggy.net>
References: <20041203151321.GL17614@wiggy.net>
Message-ID: <20050105092323.GB7126@wiggy.net>

It's been a month since I posted this; can I safely assume nobody knows
the answer or is it just not possible?

Wichert.

Previously Wichert Akkerman wrote:
> I am trying to figure out how to use an 'internationaled' To-header but
> not succeeding so far. The basic documented approach does not work:
>
> >>> mail["From"]=email.Header.Header(, "utf-8")
> >>> print mail.as_string()
> [...]
> From: =?utf-8?b?dMODwrhzdGk=?=
>
> this is not allowed by the RFCs: the address must not be encoded. Using
> quoted-printable would fix this but I can not find any documentation
> as to how to do that.
>
> All current documentation mentions that specifying encoders is
> deprecated, but without setting them the email module seems to insist on
> using base64 encoding.
>
> Wichert.

--
Wichert Akkerman It is simple to make things.
http://www.wiggy.net/ It is hard to make things simple.

From phd at mail2.phd.pp.ru Wed Jan 5 11:28:33 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Wed Jan 5 11:28:38 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <20050105092323.GB7126@wiggy.net>
References: <20041203151321.GL17614@wiggy.net> <20050105092323.GB7126@wiggy.net>
Message-ID: <20050105102833.GA22640@phd.pp.ru>

On Wed, Jan 05, 2005 at 10:23:23AM +0100, Wichert Akkerman wrote:
> > I am trying to figure out how to use an 'internationaled' To-header but
> > not succeeding so far. The basic documented approach does not work:
> >
> > >>> mail["From"]=email.Header.Header(, "utf-8")

You must separate the name to be encoded from the non-encodable address:

    From: "=?koi8-r?B?6dPMwc7P18Eg78zYx8E=?="

Copied from a real mail message.

Oleg.
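The advice above (encode only the display name, leave the address bare) can be sketched with the current standard library. This is an illustration under stated assumptions, not the 2005-era API from the thread: it uses the Python 3 module names `email.utils` and `email.header`, the helper `address_header` is hypothetical, and the name and address are placeholders because the archive elided the real ones.

```python
from email.header import decode_header
from email.message import Message
from email.utils import formataddr, parseaddr


def address_header(name, address, charset="utf-8"):
    """Build a From/To value with only the display name RFC 2047 encoded."""
    # formataddr() encodes the name only when it is not pure ASCII, and
    # raises UnicodeError if the address itself contains non-ASCII text.
    return formataddr((name, address), charset)


# Hypothetical name and address, for illustration only.
msg = Message()
msg["From"] = address_header("T\u00f8sti", "user@example.org")

# Reading it back: split off the address first, then decode the name.
raw_name, addr = parseaddr(msg["From"])
decoded, used_charset = decode_header(raw_name)[0]
name = decoded.decode(used_charset)
```

The point of the design is that the address never passes through the encoder, so it stays readable by any mail transport; only the human-readable part becomes an encoded word.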
--
Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.

From stuart at stuartbishop.net Wed Jan 5 11:55:12 2005
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Wed Jan 5 11:55:21 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <20050105092323.GB7126@wiggy.net>
References: <20041203151321.GL17614@wiggy.net> <20050105092323.GB7126@wiggy.net>
Message-ID: <41DBC790.9000605@stuartbishop.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Wichert Akkerman wrote:
| It's been a month since I posted this; can I safely assume nobody knows
| the answer or is it just not possible?

I never saw your original message sorry.

There is code that will correctly encode and decode email addresses at
http://stuartbishop.net/Software/EmailAddress. It handles the case you
mention, as well as handling Unicode domain names if you should ever
actually see one.

|>I am trying to figure out how to use an 'internationaled' To-header but
|>not succeeding so far. The basic documented approach does not work:
|>
|>
|>>>>mail["From"]=email.Header.Header(, "utf-8")
|>>>>print mail.as_string()
|>
|>[...]
|>From: =?utf-8?b?dMODwrhzdGk=?=
|>
|>this is not allowed by the RFCs: the address must not be encoded. Using
|>quoted-printable would fix this but I can not find any documentation
|>as to how to do that.
|>
|>All current documentation mentions that specifying encoders is
|>deprecated, but without setting them the email module seems to insist on
|>using base64 encoding.
|>
|>Wichert.
- --
Stuart Bishop http://www.stuartbishop.net/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFB28eQAfqZj7rGN0oRAmg3AJ90ZLjurHATcW2qaGDk/eU6AALvpACgi7tr
gUOlaVn/ArH40aj/YKJhS4k=
=Soir
-----END PGP SIGNATURE-----

From phd at mail2.phd.pp.ru Wed Jan 5 12:08:53 2005
From: phd at mail2.phd.pp.ru (Oleg Broytmann)
Date: Wed Jan 5 12:08:55 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <41DBC790.9000605@stuartbishop.net>
References: <20041203151321.GL17614@wiggy.net> <20050105092323.GB7126@wiggy.net> <41DBC790.9000605@stuartbishop.net>
Message-ID: <20050105110852.GA23611@phd.pp.ru>

On Wed, Jan 05, 2005 at 09:55:12PM +1100, Stuart Bishop wrote:
> http://stuartbishop.net/Software/EmailAddress

It would be nice to refactor the code into an AddressHeader class, so
instead of

> |>>>>mail["From"]=email.Header.Header(, "utf-8")

one would write

    mail["From"]=email.Header.AddressHeader(name, address, "utf-8")

Oleg.
--
Oleg Broytmann http://phd.pp.ru/ phd@phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.

From tkikuchi at is.kochi-u.ac.jp Wed Jan 12 05:20:30 2005
From: tkikuchi at is.kochi-u.ac.jp (Tokio Kikuchi)
Date: Wed Jan 12 05:28:15 2005
Subject: [Email-SIG] Re: Mailman and Scrubber.py : problem and possible Solution
In-Reply-To: <41E41F4C.9010703@ei.tum.de>
References: <41E41F4C.9010703@ei.tum.de>
Message-ID: <41E4A58E.6040007@is.kochi-u.ac.jp>

Hi,

Looks like this bug report should go to email-sig@python.

Jan Wolff wrote:
> Dear Mr. Kikuchi,
>
> I read you are maintaining the file 'Scrubber.py' of Mailman.
> We got some strange problems with one mailing list that
> prevented Mailman from sending any messages to this list.
>
> After reading your posting
> http://mail.python.org/pipermail/mailman-i18n/2002-September/000616.html
> in Mailman-Users archive, I was able to identify the message that caused
> the trouble. It appears to me that Mailman failed to send the message
> because the filename of an attachment had the character ' in it.
>
> After removing the message from the shunt-directory, all other messages
> could be unshunted normally. I have then removed the character ' by
> overwriting with whitespace in the .pck-file. After that, the
> trouble-making-message could also be sent.
>
> Below is the error-message from mailman and an excerpt from the
> original mail with the attachment-part. We are running Mailman version
> 2.1.5.
>
> I hope this is helpful. Thanks for your work with Mailman. It's a great
> piece of software.
>
> Greetings
> -Jan Wolff
>
> from prefix-dir/logs/error:
> ---------------------------------------
> Jan 11 18:48:51 2005 (7305) SHUNTING:
> 1105457671.442827+56adb191ee68a790a3817f1b3fe9c3acfb2fc5b4
> Jan 11 18:48:51 2005 (7305) Uncaught runner exception: unpack list of wrong size
> Jan 11 18:48:51 2005 (7305) Traceback (most recent call last):
>   File "/usr/local/mailman/Mailman/Queue/Runner.py", line 111, in _oneloop
>     self._onefile(msg, msgdata)
>   File "/usr/local/mailman/Mailman/Queue/Runner.py", line 167, in _onefile
>     keepqueued = self._dispose(mlist, msg, msgdata)
>   File "/usr/local/mailman/Mailman/Queue/IncomingRunner.py", line 130, in _dispose
>     more = self._dopipeline(mlist, msg, msgdata, pipeline)
>   File "/usr/local/mailman/Mailman/Queue/IncomingRunner.py", line 153, in _dopipeline
>     sys.modules[modname].process(mlist, msg, msgdata)
>   File "/usr/local/mailman/Mailman/Handlers/ToDigest.py", line 91, in process
>     send_digests(mlist, mboxfp)
>   File "/usr/local/mailman/Mailman/Handlers/ToDigest.py", line 132, in send_digests
>     send_i18n_digests(mlist, mboxfp)
>   File "/usr/local/mailman/Mailman/Handlers/ToDigest.py", line 306, in send_i18n_digests
>     msg = scrubber(mlist, msg)
>   File "/usr/local/mailman/Mailman/Handlers/Scrubber.py", line 265, in process
>     url = save_attachment(mlist, part, dir)
>   File "/usr/local/mailman/Mailman/Handlers/Scrubber.py", line 359, in save_attachment
>     fnext = os.path.splitext(msg.get_filename(''))[1]
>   File "/usr/local/mailman/pythonlib/email/Message.py", line 725, in get_filename
>     filename = self.get_param('filename', missing, 'content-disposition')
>   File "/usr/local/mailman/pythonlib/email/Message.py", line 608, in get_param
>     for k, v in self._get_params_preserve(failobj, header):
>   File "/usr/local/mailman/pythonlib/email/Message.py", line 555, in _get_params_preserve
>     params = Utils.decode_params(params)
>   File "/usr/local/mailman/pythonlib/email/Utils.py", line 337, in decode_params
>     charset, language, value = decode_rfc2231(EMPTYSTRING.join(value))
>   File "/usr/local/mailman/pythonlib/email/Utils.py", line 284, in decode_rfc2231
>     charset, language, s = parts
> ValueError: unpack list of wrong size
>
> from the original mail:
> ---------------------------------------
> --Apple-Mail-2-109131617
> Content-Transfer-Encoding: base64
> Content-Type: application/msword; x-mac-type=42494E41; x-unix-mode=0644;
>     x-mac-creator=4D535744;
>     name="miriam's file.doc"
> Content-Disposition: attachment;
>     filename*0="miriam's file"; filename*1=ths.doc
>

--
Tokio Kikuchi, tkikuchi@ is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/

From barry at python.org Sat Jan 15 17:15:12 2005
From: barry at python.org (Barry Warsaw)
Date: Sat Jan 15 17:15:13 2005
Subject: [Email-SIG] Re: Mailman and Scrubber.py : problem and possible Solution
In-Reply-To: <41E4A58E.6040007@is.kochi-u.ac.jp>
References: <41E41F4C.9010703@ei.tum.de> <41E4A58E.6040007@is.kochi-u.ac.jp>
Message-ID: <1105805711.10229.5.camel@presto.wooz.org>

On Tue, 2005-01-11 at 23:20, Tokio Kikuchi wrote:

> Looks like this bug report should go to email-sig@python.

Yes, this is a legitimate email package bug. It affects email 2.5 and
email 3.0, so that includes Python 2.3, 2.4 and the head.
I don't have time to fix this right now, but I've submitted a SF bug
report on it:

http://sourceforge.net/tracker/index.php?func=detail&aid=1102973&group_id=5470&atid=105470

I've assigned it to myself, but feel free to take a crack at it in the
meantime!

> > from the original mail:
> > ---------------------------------------
> > --Apple-Mail-2-109131617
> > Content-Transfer-Encoding: base64
> > Content-Type: application/msword; x-mac-type=42494E41; x-unix-mode=0644;
> >     x-mac-creator=4D535744;
> >     name="miriam's file.doc"
> > Content-Disposition: attachment;
> >     filename*0="miriam's file"; filename*1=ths.doc

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20050115/dd61b219/attachment.pgp

From barry at python.org Sat Jan 15 17:22:17 2005
From: barry at python.org (Barry Warsaw)
Date: Sat Jan 15 17:22:18 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <41DBC790.9000605@stuartbishop.net>
References: <20041203151321.GL17614@wiggy.net> <20050105092323.GB7126@wiggy.net> <41DBC790.9000605@stuartbishop.net>
Message-ID: <1105806137.9882.14.camel@presto.wooz.org>

On Wed, 2005-01-05 at 05:55, Stuart Bishop wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Wichert Akkerman wrote:
> | It's been a month since I posted this; can I safely assume nobody knows
> | the answer or is it just not possible?
>
> I never saw your original message sorry.
>
> There is code that will correctly encode and decode email addresses at
> http://stuartbishop.net/Software/EmailAddress. It handles the case you
> mention, as well as handling Unicode domain names if you should ever
> actually see one.
>
> |>I am trying to figure out how to use an 'internationaled' To-header but
> |>not succeeding so far. The basic documented approach does not work:
> |>
> |>
> |>>>>mail["From"]=email.Header.Header(, "utf-8")
> |>>>>print mail.as_string()
> |>
> |>[...]
> |>From: =?utf-8?b?dMODwrhzdGk=?=
> |>
> |>this is not allowed by the RFCs: the address must not be encoded. Using
> |>quoted-printable would fix this but I can not find any documentation
> |>as to how to do that.
> |>
> |>All current documentation mentions that specifying encoders is
> |>deprecated, but without setting them the email module seems to insist on
> |>using base64 encoding.
> |>
> |>Wichert.

Stuart,

I wonder if it makes sense to try to fold this into the email package?

On a related note, Pycon2005 is coming up soon and I will be attending.
There are four days of sprints preceding the conference and last year,
Anthony, Thomas and I sprinted on the new feed parser. While we didn't
finish that work during the sprint, I think that session was more
successful than my following Mailman3 sprint. The fruits of the email
sprint are evident in the new FeedParser that's part of email 3.0.

Does anybody want to sprint on the email package again this year? Some
things we could work on include:

- An RFC compliant email address parser

- Adding better header support

- Adding a persistency and/or external storage API, or developing a
  relational model for storing email messages.

- Another round of API fixes, and/or sprinting on bugs.

- Switching the Python stdlib away from deprecated modules such as
  rfc822.

Let's gather some thoughts here and if there's enough interest, I'll
sign up for a sprint at Pycon2005.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20050115/91f23327/attachment.pgp

From tim at sitefusion.co.uk Sat Jan 15 20:59:42 2005
From: tim at sitefusion.co.uk (Tim Hicks)
Date: Sat Jan 15 20:59:44 2005
Subject: [Fwd: [Email-SIG] persisting email.Message.Message instances]
Message-ID: <1149.82.45.210.74.1105819182.squirrel@82.45.210.74>

Hello all,

I'm just re-sending this message. I guess it got lost in the holiday
season...

tim

---------------------------- Original Message ----------------------------
Subject: [Email-SIG] persisting email.Message.Message instances
From: "Tim Hicks"
Date: Sun, December 26, 2004 5:17 pm
To: email-sig@python.org
--------------------------------------------------------------------------

Hi all,

I'm trying to come up with the best way to persist Message instances
(using ZODB). After playing around for a bit, I came up with some code
that works (I think). If you look at , you can see what I've done. The
PersistentMessage class is my second attempt, while the
DeeplyPersistentMessage class was my first (rather flawed) attempt.

The reason I'm writing to the list is that I've realised a couple of
things that I'd quite like the email package to offer, so I thought I'd
ask.

My PersistentMessage class seems like a fair few too many LOC for what
it really does. If email.Message.Message automatically called a specific
method, say '_ob_changed' whenever the instance was changed, with a
default implementation looking like::

    def _ob_changed(self):
        pass

then my PersistentMessage implementation could simply look like::

    class PersistentMessage(Message, Persistent):

        def _ob_changed(self):
            self._p_changed = True

Does this sound like a good idea?

The second thing is that I think I want to be able to specify the
class/factory that is used when Message sub-parts are constructed.
That is, I want each message-part of my PersistentMessage instances to
also be PersistentMessage instances. This is because I want to (a) stop
ZODB bloat when I make changes to message attachments where other
(potentially large) attachments are also present; (b) have the benefits
of the solution to the _ob_changed method outlined above; (c) have
security declarations from Zope3 ZCML in effect; (d) related to b and c,
not have to worry about protecting PersistentMessage methods that return
objects (i.e. message-parts) that can then be mutated in place.

I hope that all makes sense. Comments and guidance very much
appreciated.

cheers,

Tim
_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
Your options: http://mail.python.org/mailman/options/email-sig/tim%40sitefusion.co.uk

From cce at clarkevans.com Sat Jan 15 21:43:10 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Sat Jan 15 21:43:12 2005
Subject: [Email-SIG] FeedParser: Incremental Payload Reporting
Message-ID: <20050115204310.GA19556@prometheusresearch.com>

Hello. First, I want to sing praises to Baxter, Wouters, and
Warsaw; this new FeedParser is just fantastic -- the whole design.
I'm going to steal the entire design for the next generation YAML
parser I'm putting together. Incremental processing, hierarchical
iterators... you fellas rock.

I've got one suggestion. In my application domain (medical imaging)
the payloads are _huge_ and I'd like to incrementally add them to
the database as they arrive. Therefore, lines 416-422 cause me a
bit of concern. The parser has made a Herculean effort to be incremental,
and then blows it in the last mile -- couldn't the Message class be
extended to have a "write_payload()" method instead?

I'm also confused at push() in lines 96 to 110. How does this work
with a binary payload?

Cheers,

Clark

--
Clark C. Evans Prometheus Research, LLC.
                 http://www.prometheusresearch.com/
    o            office: +1.203.777.2550
  ~/ ,           mobile: +1.203.444.0557
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *

From barry at python.org Sat Jan 15 21:58:58 2005
From: barry at python.org (Barry Warsaw)
Date: Sat Jan 15 21:59:01 2005
Subject: [Email-SIG] FeedParser: Incremental Payload Reporting
In-Reply-To: <20050115204310.GA19556@prometheusresearch.com>
References: <20050115204310.GA19556@prometheusresearch.com>
Message-ID: <1105822738.10230.30.camel@presto.wooz.org>

On Sat, 2005-01-15 at 15:43, Clark C. Evans wrote:
> Hello. First, I want to sing praises to Baxter, Wouters, and
> Warsaw; this new FeedParser is just fantastic -- the whole design.
> I'm going to steal the entire design for the next generation YAML
> parser I'm putting together. Incremental processing, hierarchical
> iterators... you fellas rock.

1/3 thanks! :)

> I've got one suggestion. In my application domain (medical imaging)
> the payloads are _huge_ and I'd like to incrementally add them to
> the database as they arrive. Therefore, lines 416-422 cause me a
> bit of concern. The parser has made a Herculean effort to be incremental,
> and then blows it in the last mile -- couldn't the Message class be
> extended to have a "write_payload()" method instead?

Yep, and there have been some discussions about this, so check the
archives. It's what I meant when I talked about 'external storage API'
in the list of possible sprint topics.

> I'm also confused at push() in lines 96 to 110. How does this work
> with a binary payload?

It doesn't, but that's not a use case the FeedParser really needs to
worry about. Email is always line-oriented text, and binary messages
are encoded into stuff like base64 so we can cheat this way.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20050115/6c82c6d1/attachment.pgp

From cce at clarkevans.com Sat Jan 15 22:55:39 2005
From: cce at clarkevans.com (Clark C. Evans)
Date: Sat Jan 15 22:55:40 2005
Subject: [Email-SIG] FeedParser: Incremental Payload Reporting
In-Reply-To: <1105822738.10230.30.camel@presto.wooz.org>
References: <20050115204310.GA19556@prometheusresearch.com> <1105822738.10230.30.camel@presto.wooz.org>
Message-ID: <20050115215539.GA36831@prometheusresearch.com>

On Sat, Jan 15, 2005 at 03:58:58PM -0500, Barry Warsaw wrote:
| > couldn't the Message class be
| > extended to have a "write_payload()" method instead?
|
| Yep, and there have been some discussions about this, so check the
| archives. It's what I meant when I talked about 'external storage API'
| in the list of possible sprint topics.

Fantastic... I can hardly wait!

| > I'm also confused at push() in lines 96 to 110. How does this work
| > with a binary payload?
|
| It doesn't, but that's not a use case the FeedParser really needs to
| worry about. Email is always line-oriented text, and binary messages
| are encoded into stuff like base64 so we can cheat this way.

I'd like to use the FeedParser to parse multipart/form-data
from an HTTP POST, and in this case the most common file content is
binary and the encoding is... none. Could this be added to the
wish-list? It'd also allow cgi.py to be refactored nicely.

Clark

--
Clark C. Evans Prometheus Research, LLC.
                 http://www.prometheusresearch.com/
    o            office: +1.203.777.2550
  ~/ ,           mobile: +1.203.444.0557
 //
((   Prometheus Research: Transforming Data Into Knowledge
 \\  ,
   \/    - Research Exchange Database
   /\    - Survey & Assessment Technologies
   ` \   - Software Tools for Researchers
    ~ *

From anthony at interlink.com.au Sun Jan 16 17:04:32 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Sun Jan 16 17:04:20 2005
Subject: [Email-SIG] FeedParser: Incremental Payload Reporting
In-Reply-To: <20050115215539.GA36831@prometheusresearch.com>
References: <20050115204310.GA19556@prometheusresearch.com> <1105822738.10230.30.camel@presto.wooz.org> <20050115215539.GA36831@prometheusresearch.com>
Message-ID: <200501170304.32852.anthony@interlink.com.au>

On Sunday 16 January 2005 08:55, Clark C. Evans wrote:
> I'd like to use the FeedParser to parse multipart/form-data
> from an HTTP POST, and in this case the most common file content is
> binary and the encoding is... none. Could this be added to the
> wish-list? It'd also allow cgi.py to be refactored nicely.

Hm. While this seems like a good idea, I'd be a little bit nervous that
it might not be a perfect match - how identical are the requirements of
HTTP and MIME? Completely, or are there vile little gotchas in the
details? The MIME parser goes to extraordinary lengths to deal with the
fundamentally broken nature of much of the MIME that's generated - as
far as I'm aware, web browsers are a fair bit more sane (and a heck of a
lot simpler).

The partial storage thing, though, is a good idea. As to why it wasn't
implemented - because we didn't think of it during the sprint. As you've
no doubt noticed, the current email parser took a lot of time to get
right, and this meant we didn't get to everything we could have.
It is, however, immensely satisfying that it's just so damn robust in
the presence of so many utterly utterly shiteful MIME messages (*cough*
MS Entourage *cough*)

I've also used it on a number of occasions to demonstrate good coding
practices - in particular, the BufferedSubFile iterator (although I still
prefer the original name, ReloadableLumpOfText)

Anthony
--
Anthony Baxter
It's never too late to have a happy childhood.

From barry at python.org Sun Jan 16 21:45:30 2005
From: barry at python.org (Barry Warsaw)
Date: Sun Jan 16 21:45:32 2005
Subject: [Email-SIG] FeedParser: Incremental Payload Reporting
In-Reply-To: <200501170304.32852.anthony@interlink.com.au>
References: <20050115204310.GA19556@prometheusresearch.com> <1105822738.10230.30.camel@presto.wooz.org> <20050115215539.GA36831@prometheusresearch.com> <200501170304.32852.anthony@interlink.com.au>
Message-ID: <1105908330.5931.24.camel@geddy.wooz.org>

On Sun, 2005-01-16 at 11:04, Anthony Baxter wrote:

> I've also used it on a number of occasions to demonstrate good coding
> practices - in particular, the BufferedSubFile iterator (although I still
> prefer the original name, ReloadableLumpOfText)

Maybe we should have called it PushmePullyu?

-B
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20050116/d9ae1a87/attachment.pgp

From stuart at stuartbishop.net Mon Jan 17 06:59:39 2005
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Mon Jan 17 06:59:47 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <1105806137.9882.14.camel@presto.wooz.org>
References: <20041203151321.GL17614@wiggy.net> <20050105092323.GB7126@wiggy.net> <41DBC790.9000605@stuartbishop.net> <1105806137.9882.14.camel@presto.wooz.org>
Message-ID: <41EB544B.7070200@stuartbishop.net>

Barry Warsaw wrote:

> I wonder if it makes sense to try to fold this into the email package?

http://www.python.org/sf/963906

I personally quite like the 'subclass unicode' approach, but I'm happy
to sacrifice that for getting the code into the core.

> On a related note, Pycon2005 is coming up soon and I will be attending.
> There are four days of sprints preceding the conference and last year,
> Anthony, Thomas and I sprinted on the new feed parser. While we didn't
> finish that work during the sprint, I think that session was more
> successful than my following Mailman3 sprint. The fruits of the email
> sprint are evident in the new FeedParser that's part of email 3.0.
>
> Does anybody want to sprint on the email package again this year? Some
> things we could work on include:

> - Adding a persistency and/or external storage API, or developing a
>   relational model for storing email messages.

Hmmm... maybe I should talk to my boss about getting to PyCon this year.
Although I suspect we might want it before end-of-March so might have to
implement the bits we need sooner.
--
Stuart Bishop http://www.stuartbishop.net/

From bkirsch at osafoundation.org Tue Jan 18 20:08:39 2005
From: bkirsch at osafoundation.org (Brian Kirsch)
Date: Tue Jan 18 20:13:14 2005
Subject: [Email-SIG] encoding To headers
In-Reply-To: <1105806137.9882.14.camel@presto.wooz.org>
References: <20041203151321.GL17614@wiggy.net> <20050105092323.GB7126@wiggy.net> <41DBC790.9000605@stuartbishop.net> <1105806137.9882.14.camel@presto.wooz.org>
Message-ID: <5CEEF6E0-6984-11D9-ABA6-000A95CA1ECC@osafoundation.org>

Count me in!

A robust email address parser and improved header support sound like
worthwhile causes to me.

Brian Kirsch - Email Framework Engineer
Open Source Applications Foundation
543 Howard St. 5th Floor
San Francisco, CA 94105
(415) 946-3056
http://www.osafoundation.org

On Jan 15, 2005, at 8:22 AM, Barry Warsaw wrote:

> On Wed, 2005-01-05 at 05:55, Stuart Bishop wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Wichert Akkerman wrote:
>> | It's been a month since I posted this; can I safely assume nobody knows
>> | the answer or is it just not possible?
>>
>> I never saw your original message sorry.
>>
>> There is code that will correctly encode and decode email addresses at
>> http://stuartbishop.net/Software/EmailAddress. It handles the case you
>> mention, as well as handling Unicode domain names if you should ever
>> actually see one.
>>
>> |>I am trying to figure out how to use an 'internationaled' To-header but
>> |>not succeeding so far. The basic documented approach does not work:
>> |>
>> |>
>> |>>>>mail["From"]=email.Header.Header(, "utf-8")
>> |>>>>print mail.as_string()
>> |>
>> |>[...]
>> |>From: =?utf-8?b?dMODwrhzdGk=?=
>> |>
>> |>this is not allowed by the RFCs: the address must not be encoded. Using
>> |>quoted-printable would fix this but I can not find any documentation
>> |>as to how to do that.
>> |>
>> |>All current documentation mentions that specifying encoders is
>> |>deprecated, but without setting them the email module seems to insist on
>> |>using base64 encoding.
>> |>
>> |>Wichert.
>
> Stuart,
>
> I wonder if it makes sense to try to fold this into the email package?
>
> On a related note, Pycon2005 is coming up soon and I will be attending.
> There are four days of sprints preceding the conference and last year,
> Anthony, Thomas and I sprinted on the new feed parser. While we didn't
> finish that work during the sprint, I think that session was more
> successful than my following Mailman3 sprint. The fruits of the email
> sprint are evident in the new FeedParser that's part of email 3.0.
>
> Does anybody want to sprint on the email package again this year? Some
> things we could work on include:
>
> - An RFC compliant email address parser
>
> - Adding better header support
>
> - Adding a persistency and/or external storage API, or developing a
>   relational model for storing email messages.
>
> - Another round of API fixes, and/or sprinting on bugs.
>
> - Switching the Python stdlib away from deprecated modules such as
>   rfc822.
>
> Let's gather some thoughts here and if there's enough interest, I'll
> sign up for a sprint at Pycon2005.
>
> -Barry
>
> _______________________________________________
> Email-SIG mailing list
> Email-SIG@python.org
> Your options:
> http://mail.python.org/mailman/options/email-sig/bkirsch%40osafoundation.org
-------------- next part --------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iQCVAwUAQelDOXEjvBPtnXfVAQJ6iAP/cq2AGbgF6iKAK4k69RqnsCurxl7Hv4XG
qzw8pmYYYgIU9HNrZghu5kYEbhWYlmP8WlvRJ3vZAYbCEGJ/Anw2xB0jYSEC+wP3
4z0CWoWz7RiTAmyPdNcjRWe6+6w4nMZvoDSsCAb5Gg/1e4eZ56TqJt30264XntB6
/kKorI2CiOg=
=/ah1
-----END PGP SIGNATURE-----

From stuart at stuartbishop.net Thu Jan 20 03:53:35 2005
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu Jan 20 03:53:44 2005
Subject: [Email-SIG] Reconstituting stored messages and quoted printable encoding
Message-ID: <41EF1D2F.6080208@stuartbishop.net>

Hi.

I'm currently looking into persistent email storage for some internal
applications.

One of our requirements is that we need to be able to produce on demand
a byte-by-byte identical copy of the original message (we expect GPG
signed messages to be quite common). My understanding of this means that
to do this we would either need to store the original message, or split
it up into chunks but store these chunks in their encoded form. This is
because quoted printable encoding does not necessarily round trip:

    >>> '=66=6f=6f'.decode('quopri').encode('quopri')
    'foo'

Anyone know if this assumption is correct for email messages? I suspect
I'm going to have to double our storage requirements and store a copy of
the original message as well as decoded text for indexing and decoded
attachments for easy retrieval. Or just decide to break GPG signed
messages for the pathological cases.
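The round-trip problem described above can be reproduced with the current standard library. The following is only a sketch under assumptions: it uses the Python 3 `quopri` module on bytes rather than the Python 2 string codec shown in the message, and the `record` layout (raw bytes, a digest, decoded text for indexing) is a hypothetical illustration of the "store both" option, not anything the email package provides.

```python
import email
import hashlib
import quopri

# Decoding and then re-encoding normalises the text, so the bytes change.
encoded = b"=66=6f=6f"                    # legal, but not the canonical form
decoded = quopri.decodestring(encoded)    # b"foo"
reencoded = quopri.encodestring(decoded)
round_trips = reencoded == encoded

# Hypothetical record layout: keep the raw bytes for exact reproduction
# (signature checks) next to decoded text for indexing.
raw = (
    b"Subject: test\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"=66=6f=6f\r\n"
)
record = {
    "raw": raw,
    "sha1": hashlib.sha1(raw).hexdigest(),
    "text": email.message_from_bytes(raw).get_payload(decode=True),
}
```

Keeping `raw` untouched costs storage, but it is the only field here that can be handed back byte for byte; everything else is derived from it and can be rebuilt.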
--
Stuart Bishop http://www.stuartbishop.net/

From kdart at kdart.com Thu Jan 20 05:00:18 2005
From: kdart at kdart.com (Keith Dart)
Date: Thu Jan 20 05:00:26 2005
Subject: [Email-SIG] Reconstituting stored messages and quoted printable encoding
In-Reply-To: <41EF1D2F.6080208@stuartbishop.net>
References: <41EF1D2F.6080208@stuartbishop.net>
Message-ID: <41EF2CD2.8090509@kdart.com>

Stuart Bishop wrote:
> Hi.
>
> I'm currently looking into persistent email storage for some internal
> applications.
>

Just FYI, I have been using the Durus persistence package. I like it a
lot. You can check it out at:

http://www.mems-exchange.org/software/durus/

--
-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Keith Dart
public key: ID: F3D288E4
=====================================================================

From anthony at interlink.com.au Thu Jan 20 05:37:44 2005
From: anthony at interlink.com.au (Anthony Baxter)
Date: Thu Jan 20 05:37:58 2005
Subject: [Email-SIG] Reconstituting stored messages and quoted printable encoding
In-Reply-To: <41EF1D2F.6080208@stuartbishop.net>
References: <41EF1D2F.6080208@stuartbishop.net>
Message-ID: <200501201537.46168.anthony@interlink.com.au>

On Thursday 20 January 2005 13:53, Stuart Bishop wrote:
> Hi.
>
> I'm currently looking into persistent email storage for some internal
> applications.
>
> One of our requirements is that we need to be able to produce on demand
> a byte-by-byte identical copy of the original message (we expect GPG
> signed messages to be quite common). My understanding of this means that
> to do this we would either need to store the original message,

I think you'll want to do this. Once upon a time, the email package made
heroic efforts to make the output of the parsed message byte-for-byte
identical with the input.
This was sacrificed for a parser that was more robust - the feeling at
the time was that if you really wanted a byte-for-byte perfect copy of
the original, you could save it off yourself.

--
Anthony Baxter
It's never too late to have a happy childhood.

From bkirsch at osafoundation.org Thu Jan 20 20:38:34 2005
From: bkirsch at osafoundation.org (Brian Kirsch)
Date: Thu Jan 20 20:42:52 2005
Subject: [Email-SIG] i18n Message Set
Message-ID:

Can anyone point me to an i18n message test set that I can use to
validate an email client's internationalization compliance?

The set should contain headers / bodies / attachments in various
languages, character sets, and encodings.

Thanks,

Brian Kirsch - Email Framework Engineer
Open Source Applications Foundation
543 Howard St. 5th Floor
San Francisco, CA 94105
(415) 946-3056
http://www.osafoundation.org

From wichert at wiggy.net Thu Jan 20 21:12:29 2005
From: wichert at wiggy.net (Wichert Akkerman)
Date: Thu Jan 20 21:12:30 2005
Subject: [Email-SIG] i18n Message Set
In-Reply-To:
References:
Message-ID: <20050120201229.GK3476@wiggy.net>

Previously Brian Kirsch wrote:
> Can anyone point me to an i18n message test set that I can use to
> validate an email clients internationalization compliance?

My spam folders have a reasonably diverse set of i18n messages. I'm not
sure if it has any attachments though.

Wichert.

--
Wichert Akkerman It is simple to make things.
http://www.wiggy.net/ It is hard to make things simple.

From barry at python.org Sat Jan 29 18:52:42 2005
From: barry at python.org (Barry Warsaw)
Date: Sat Jan 29 18:52:43 2005
Subject: [Email-SIG] Sprint page
Message-ID: <1107021161.7314.49.camel@presto.wooz.org>

I've filled out the Email SIG sprint page at:

http://www.python.org/moin/EmailSprint?action=show

Feel free to add your own topics or your name to this page.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/email-sig/attachments/20050129/7f657bb5/attachment.pgp From tim at sitefusion.co.uk Mon Jan 31 12:19:31 2005 From: tim at sitefusion.co.uk (Tim Hicks) Date: Mon Jan 31 12:19:33 2005 Subject: [Email-SIG] Re: persisting email.Message.Message instances In-Reply-To: <1107010292.7307.39.camel@presto.wooz.org> References: <2469.81.105.12.31.1104075066.squirrel@81.105.12.31> <1104079861.9109.56.camel@presto.wooz.org> <1948.81.105.12.31.1104373355.squirrel@81.105.12.31> <1104446320.10296.24.camel@presto.wooz.org> <1470.82.45.210.74.1106771265.squirrel@82.45.210.74> <1107010292.7307.39.camel@presto.wooz.org> Message-ID: <1640.82.45.210.74.1107170371.squirrel@82.45.210.74> Barry Warsaw said: > On Wed, 2005-01-26 at 15:27, Tim Hicks wrote: > >> sorry to bother you again. It seems no one has anything to say about >> the >> email I sent to the email-sig list. I resent it just in case people >> missed it over Christmas. >> >> Am I talking nonsense or is there just no interest/prospect of these >> sorts >> of things getting into the email package? > > Neither actually. I think there /is/ interest in these kinds of things, > but nobody has done it and no one has any brilliant ideas. I know that > I don't ;) and unfortunately I have no time to spend on this right now. Thanks for the reply... it's nice to know I wasn't talking nonsense :-). > If you're coming to Pycon, you might sign up for the email-sig sprint, > which I would like to have. This could certainly be one of the topics > for that sprint. I'm in the UK, and only a hobbyist, so can't really justify getting to pycon. Wrt persisting messages, am I correct to assume that you would *not* accept a patch that adds the following to email.Message.Message? def _persist(self): pass ... and that also arranges for this to be called at the end of all Message mutators? 
If I am right, is that because such a mechanism would be too ZODB-specific? Wrt providing an optional factory for message subparts, is it correct to say that the problem is that the code that generates the internal structure of Message instances is spread around, in different parsers etc.? Or are there other difficulties? cheers, tim
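For readers following the thread, the `_persist` idea can be approximated today without patching the email package, by subclassing. This is only a rough sketch under stated assumptions, not anything the package provides: `PersistentMessage` and `persist_calls` are hypothetical names made up for illustration, the import uses the modern spelling `email.message.Message` rather than `email.Message.Message`, and only a handful of mutators are wrapped (a real version would need to cover all of them).

```python
from email.message import Message  # modern spelling of email.Message.Message


class PersistentMessage(Message):
    """Hypothetical subclass that calls _persist() after selected mutators."""

    def __init__(self, *args, **kwargs):
        # Illustration only: count hook calls so the effect is observable.
        self.persist_calls = 0
        super().__init__(*args, **kwargs)

    def _persist(self):
        # Assumption: a ZODB-backed subclass might set self._p_changed = True
        # here; this sketch just records that the hook fired.
        self.persist_calls += 1

    def __setitem__(self, name, val):
        super().__setitem__(name, val)
        self._persist()

    def __delitem__(self, name):
        super().__delitem__(name)
        self._persist()

    def set_payload(self, payload, charset=None):
        super().set_payload(payload, charset)
        self._persist()

    def attach(self, payload):
        super().attach(payload)
        self._persist()
```

Setting a header, setting the payload, and deleting a header on a fresh `PersistentMessage` each trigger the hook once. The drawback is the one raised above: subparts created by the parsers are only instances of this class if the parser is told to build them that way, which is where a message factory would come in.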