From barry at python.org Mon Nov 1 05:11:36 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 1 05:11:42 2004
Subject: [Email-SIG] SF bug #1017329
Message-ID: <1099282296.8092.9.camel@geddy.wooz.org>

http://sourceforge.net/tracker/index.php?func=detail&aid=1017329&group_id=5470&atid=105470

This one wants to extend the Message API by enabling iteration over
headers. I'd like to get your opinion about whether this would be a
useful thing (or even a good thing ), and whether to add this for Python
2.4/email 3.0.

This might be considered a new feature rather than a bug so perhaps it
missed the Python 2.4 beta cut.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041031/2a514a43/attachment.pgp

From t-meyer at ihug.co.nz Mon Nov 1 05:28:09 2004
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon Nov 1 05:28:14 2004
Subject: [Email-SIG] SF bug #1017329
In-Reply-To:
Message-ID:

[Not sure if you wanted discussion here or in the tracker, so guessed here.
Sorry if that was wrong.]

> http://sourceforge.net/tracker/index.php?func=detail&aid=10173
> 29&group_id=5470&atid=105470
>
> This one wants to extend the Message API by enabling
> iteration over headers. I'd like to get your opinion about
> whether this would be a useful thing (or even a good thing
> ), and whether to add this for Python 2.4/email 3.0.

Iteration over headers is already provided by:

>>> import email
>>> msg = email.message_from_file(open("d:\\example.txt"))
>>> for i in msg.keys():
...     print i
...

Isn't it? (Works for me, anyway).

I wouldn't expect that "for i in msg:" would iterate only through the
headers - I would expect the body [parts] to be iterated through somehow,
too.

So -0 from me. +1 on raising a more understandable error, though, if that's
not difficult.
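For readers on a current Python 3 interpreter, the same header iteration can be sketched as follows. This is a minimal illustration rather than code from the thread; the sample message text is made up, and only keys() and items() from the standard email package are used.

```python
import email

# Made-up sample message; any RFC 2822 style text would do.
RAW = "From: a@example.com\nTo: b@example.com\nSubject: hi\n\nbody text\n"
msg = email.message_from_string(RAW)

# keys() returns the header names in the order they appear in the message.
for name in msg.keys():
    print(name)

# items() returns (name, value) pairs, which avoids a second lookup per header.
for name, value in msg.items():
    print(f"{name}: {value}")
```

Both calls build a list up front, which matches the point made in the thread that a message rarely has enough headers for this to matter.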
> This might be considered a new feature rather than a bug so
> perhaps it missed the Python 2.4 beta cut.

I'd definitely say it was a feature rather than a bug.

=Tony.Meyer

From barry at python.org Mon Nov 1 14:38:27 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 1 14:38:30 2004
Subject: [Email-SIG] SF bug #1017329
In-Reply-To:
References:
Message-ID: <1099316307.8097.30.camel@geddy.wooz.org>

On Sun, 2004-10-31 at 23:28, Tony Meyer wrote:

> Iteration over headers is already provided by:
>
> >>> import email
> >>> msg = email.message_from_file(open("d:\\example.txt"))
> >>> for i in msg.keys():
> ...     print i
> ...

Good point. I wouldn't expect a Message had so many keys that
instantiating a list would be prohibitively expensive.

> So -0 from me. +1 on raising a more understandable error, though, if that's
> not difficult.

I generally dislike isinstance() tests or try/excepts just to transform
one exception into another, so I think we'll leave things the way they
are and close the bug as Won't Fix.

Thanks,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041101/14e58cef/attachment.pgp

From menno at freshfoo.com Mon Nov 1 23:47:29 2004
From: menno at freshfoo.com (Menno Smits)
Date: Mon Nov 1 23:46:57 2004
Subject: [Email-SIG] Handling large emails: DiskMessage and DiskFeedParser
In-Reply-To: <1097354173.29448.30.camel@presto.wooz.org>
References: <40B29E3D.5010004@netbox.biz> <1096849991.21012.27.camel@geddy.wooz.org> <41610768.4090403@NetBoxBlue.com> <1097354173.29448.30.camel@presto.wooz.org>
Message-ID: <4186BD01.1060109@freshfoo.com>

Barry Warsaw wrote:

>>An alternative solution I've been thinking of... what if we abstract
>>message payloads to a "Payload" class?
>>We could have MemoryPayload for
>>in-memory storage (the default), TmpFilePayload for temporary disk
>>storage etc etc. The read/write interface to the payload would always be
>>the same and all Message methods would only ever access the payload via
>>the API. Each Message instance would have exactly one MessagePayload
>>instance internally. I realise this would be a big change and probably
>>isn't suited for Python 2.4 but do you think this is useful?
>
>
> It might be the right way to do it, much like headers can be strings or
> instances of Header. I don't think we can really do either for Python
> 2.4, but we can continue to pursue this for email 3.1 / Python 2.5.

I really think a separate payload storage class is the right way to do
it too. The trick of course is to get the interface right. I've been
thinking about how to do it this morning. Here's a first attempt:

class PayloadStorage:

    def __init__(self):
        '''Payload specific initialisation'''

    def write(self, buf):
        '''Add new data to the end of the payload'''

    def readblocks(self, blocksize):
        '''Iterate over the payload data, returning fixed sized blocks
        '''

    def close(self):
        '''Close/cleanup the Payload instance
        '''

Some things to consider...

I've gone with an iterator interface for reading out the payload data
because a classic file-like read interface would get messy in a
multi-threaded/forking situation. For example, if a subclass of
PayloadStorage were to keep the payload in a disk file, each call to
readblocks() could re-open the file for reading and yield the payload in
blocks. This would be difficult to achieve with a standard file-like read.

This approach only allows sequential writing and reading of the payload
data. I've checked through various parts of the email library and I
can't find any obvious places where this would be a problem although
some refactoring will be required in parts. Can anyone see a part of the
library where random access to the payload data is required?
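The proposed interface can be sketched roughly as follows, assuming Python 3 and bytes payloads. MemoryPayload and TmpFilePayload are the names floated earlier in the thread, but the method bodies here are illustrative guesses and not code from the email package.

```python
import os
import tempfile


class PayloadStorage:
    """Interface from the proposal: sequential write, block-wise read."""

    def write(self, buf):
        raise NotImplementedError

    def readblocks(self, blocksize):
        raise NotImplementedError

    def close(self):
        raise NotImplementedError


class MemoryPayload(PayloadStorage):
    """Hypothetical in-memory variant; keeps the payload in a bytearray."""

    def __init__(self):
        self._data = bytearray()

    def write(self, buf):
        self._data.extend(buf)

    def readblocks(self, blocksize):
        # The final block may be shorter than blocksize.
        for start in range(0, len(self._data), blocksize):
            yield bytes(self._data[start:start + blocksize])

    def close(self):
        self._data = bytearray()


class TmpFilePayload(PayloadStorage):
    """Hypothetical disk-backed variant; every read re-opens the file."""

    def __init__(self):
        fd, self._path = tempfile.mkstemp()
        self._writer = os.fdopen(fd, "wb")

    def write(self, buf):
        self._writer.write(buf)

    def readblocks(self, blocksize):
        # Flush pending writes, then give each reader its own file handle
        # so that concurrent readers do not share a file position.
        self._writer.flush()
        with open(self._path, "rb") as reader:
            while True:
                block = reader.read(blocksize)
                if not block:
                    break
                yield block

    def close(self):
        self._writer.close()
        os.remove(self._path)
```

Because readblocks() is a generator that opens its own handle, two callers can read the same payload independently, which is the property the iterator design is meant to provide.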
Note that in order for this approach to work FeedParser/Parser will need
to be able to take a PayloadStorage factory class option for use when
creating message instances.

As Barry suggested, I see this class working alongside the existing
string-based payload code. A message's payload could be either a string
or a PayloadStorage instance. This is needed for backwards compatibility.

Thoughts/feedback anyone?

Regards,
Menno

ps. Barry: unfortunately I don't have as much time up my sleeve to play
with this as I had hoped. I've had to suddenly move to another city and
am in the middle of the mess at the moment. I'm still keen to work on
this however and will keep at it when I have time.

pps. Matthew (matt at mondoinfo.com): The company I work for
definitely has a need for this sort of thing and I can think of at least
one other external project where large messages in RAM is a problem.
There's definitely a case for this :)

From hpj at urpla.net Sun Nov 21 18:25:36 2004
From: hpj at urpla.net (Hans-Peter Jansen)
Date: Sun Nov 21 18:25:43 2004
Subject: [Email-SIG] problem with base64 encoded message/rfc822 attachment
Message-ID: <200411211825.36009.hpj@urpla.net>

Hi,

I'm trying to understand the behaviour of email.Message, processing a
base64 encoded message/rfc822 payload. Using the slightly modified
version of email-unpack.py, the attached mail results in 2
attachments written:

write 2:text/plain [7bit] to tmp/part-001.ksh
ignore 3:message/delivery-status []
ignore 4:text/plain []
ignore 5:text/plain []
ignore 6:message/rfc822 [base64]
write 7:text/plain [] to tmp/part-006.bin

While part 2 is fine as expected, part 7 looks strange, since it seems
to belong to the base64 encoded part 6. Consequently, the written
file is undecoded, which is not what I want/need. The encoding itself
is consistent, BTW. Who is wrong here?

Could somebody shed some light on this?
I would really like to base an
attachment filter on this module, and its precursor based on rfc822/
mimetools felt much more backward to me..

I'm on python 2.3[.4] here.

Cheers,
Pete
-------------- next part --------------
An embedded message was scrubbed...
From: unknown sender
Subject: no subject
Date: no date
Size: 6131
Url: http://mail.python.org/pipermail/email-sig/attachments/20041121/be8a50fb/DSN4.mht
-------------- next part --------------
A non-text attachment was scrubbed...
Name: email-unpack.py
Type: text/x-python
Size: 2450 bytes
Desc: not available
Url : http://mail.python.org/pipermail/email-sig/attachments/20041121/be8a50fb/email-unpack.py

From matt at lickey.com Sun Nov 21 21:51:25 2004
From: matt at lickey.com (Matt Armstrong)
Date: Sun Nov 21 21:51:29 2004
Subject: [Email-SIG] problem with base64 encoded message/rfc822 attachment
In-Reply-To: <200411211825.36009.hpj@urpla.net> (Hans-Peter Jansen's message of "Sun, 21 Nov 2004 18:25:36 +0100")
References: <200411211825.36009.hpj@urpla.net>
Message-ID: <87653zgc36.fsf@naz.lickey.com>

Hans-Peter Jansen writes:

> Hi,
>
> I'm trying to understand the behaviour of email.Message, processing a
> base64 encoded message/rfc822 payload.

RFC 2046 section 5.2.1 has this to say about message/rfc822:

    No encoding other than "7bit", "8bit", or "binary" is permitted for
    the body of a "message/rfc822" entity. The message header fields are
    always US-ASCII in any case, and data within the body can still be
    encoded, in which case the Content-Transfer-Encoding header field in
    the encapsulated message will reflect this. Non-US-ASCII text in the
    headers of an encapsulated message can be specified using the
    mechanisms described in RFC 2047.
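A minimal sketch, assuming Python 3, of how one might flag parts that break this rule. The helper name badly_encoded_rfc822_parts and the sample message are hypothetical; only walk(), get_content_type() and header lookup from the standard email package are used.

```python
import email

# Encodings that RFC 2046 section 5.2.1 permits on a message/rfc822 body.
ALLOWED_ENCODINGS = {"7bit", "8bit", "binary"}


def badly_encoded_rfc822_parts(msg):
    """Return message/rfc822 parts whose Content-Transfer-Encoding is not allowed."""
    offenders = []
    for part in msg.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        # A missing header means the default encoding, 7bit.
        encoding = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
        if encoding not in ALLOWED_ENCODINGS:
            offenders.append(part)
    return offenders


# Made-up sample: a multipart message whose second part is declared as
# message/rfc822 but carries a base64 Content-Transfer-Encoding.
RAW = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="BOUND"',
    "",
    "--BOUND",
    "Content-Type: text/plain",
    "",
    "hello",
    "--BOUND",
    "Content-Type: message/rfc822",
    "Content-Transfer-Encoding: base64",
    "",
    "U3ViamVjdDogaGkKCmJvZHkK",
    "--BOUND--",
    "",
])

for part in badly_encoded_rfc822_parts(email.message_from_string(RAW)):
    print(part.get_content_type(), part["Content-Transfer-Encoding"])
```

A filter could use a check like this to decide whether to decode the flagged part by hand before parsing it as a nested message.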
So, a base64 encoded

> Using the slightly modified
> version of email-unpack.py, the attached mail results in 2
> attachments written:
>
> write 2:text/plain [7bit] to tmp/part-001.ksh
> ignore 3:message/delivery-status []
> ignore 4:text/plain []
> ignore 5:text/plain []
> ignore 6:message/rfc822 [base64]
> write 7:text/plain [] to tmp/part-006.bin
>
> While part 2 is fine as expected, part 7 looks strange, since it seems
> to belong to the base64 encoded part 6. Consequently, the written
> file is undecoded, which is not what I want/need. The encoding itself
> is consistent, BTW. Who is wrong here?
>
> Could somebody shed some light on this? I would really like to base an
> attachment filter on this module, and its precursor based on rfc822/
> mimetools felt much more backward to me..
>
> I'm on python 2.3[.4] here.
>
> Cheers,
> Pete

--
matt

From barry at python.org Tue Nov 23 15:14:00 2004
From: barry at python.org (Barry Warsaw)
Date: Tue Nov 23 15:14:03 2004
Subject: [Email-SIG] Modifying messages loaded with message_from_string() with 3.0
In-Reply-To: <1098214106.5818.14.camel@hercules.dustbite.org>
References: <1098214106.5818.14.camel@hercules.dustbite.org>
Message-ID: <1101219240.8202.187.camel@presto.wooz.org>

On Tue, 2004-10-19 at 15:28, Indrek Järve wrote:

> While testing our webmail client code with email 3.0, I found that
> modifying message objects loaded with message_from_string() (attach()ing
> new files) breaks boundaries - the added file will become inaccessible
> after the next reload with message_from_string(). I've attached a
> testcase and the output I got on Suse 9.1, Python 2.3.3, email 3.0 from
> the python 2.4b1 package. The second run in testcase1.output is with the
> default Python 2.3.3 email library.
>
> Is this something that shouldn't be done this way or something just
> gone a bit broken?

From a quick test, this looks like a real bug. Could you please submit a
bug report with SF:

http://sourceforge.net/tracker/?group_id=5470&atid=105470

Attach the test case and assign it to me (bwarsaw). I want to fix this
for Python 2.4 final.

Note: I think this can be boiled down to simply calling

Y = message_from_string(a.as_string())

since Y will end up with the end boundary in its payload, which causes
the double end boundary to be printed.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041123/376d46ad/attachment.pgp

From t-meyer at ihug.co.nz Mon Nov 29 01:07:12 2004
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon Nov 29 01:07:45 2004
Subject: [Email-SIG] Messages that start with '>'
Message-ID:

A change in email 3.0 is that messages that start with '>' are considered to
be all body and no header, e.g:

>>> import email
>>> email.__version__
'2.5.5'
>>> s = '>From 6r331g9@bigfoot.com Fri Jan 23 11:31:44 2004\r\nSubject:test\r\n\r\nTest\r\n'
>>> m = email.message_from_string(s)
>>> m.keys()
['>From 6r331g9@bigfoot.com Fri Jan 23 11', 'Subject']
>>> m.as_string()
'>From 6r331g9@bigfoot.com Fri Jan 23 11: 31:44 2004\nSubject: test\n\nTest\r\n'

>>> import email
>>> email.__version__
'3.0b1'
>>> s = '>From 6r331g9@bigfoot.com Fri Jan 23 11:31:44 2004\r\nSubject:test\r\n\r\nTest\r\n'
>>> m = email.message_from_string(s)
>>> m.keys()
[]
>>> m.as_string()
'\n>From 6r331g9@bigfoot.com Fri Jan 23 11:31:44 2004\r\nSubject:test\r\n\r\nTest\r\n'

Is this deliberate? I don't know (and haven't checked) whether it's valid
to start a message like this, but one of my POP servers does (it puts a >From
header at the start of any message).
If this is deliberate, I'll just strip that off before doing
message_from_string, but if it's not, then I guess it needs to be fixed,
and I'll open a tracker for it.

Thanks,
Tony Meyer

From barry at python.org Mon Nov 29 02:34:48 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 29 02:34:52 2004
Subject: [Email-SIG] Messages that start with '>'
In-Reply-To:
References:
Message-ID: <1101692088.19172.27.camel@presto.wooz.org>

On Sun, 2004-11-28 at 19:07, Tony Meyer wrote:

> A change in email 3.0 is that messages that start with '>' are considered to
> be all body and no header, e.g:
>
> >>> import email
> >>> email.__version__
> '2.5.5'
> >>> s = '>From 6r331g9@bigfoot.com Fri Jan 23 11:31:44 2004\r\nSubject:test\r\n\r\nTest\r\n'
> >>> m = email.message_from_string(s)
> >>> m.keys()
> ['>From 6r331g9@bigfoot.com Fri Jan 23 11', 'Subject']
> >>> m.as_string()
> '>From 6r331g9@bigfoot.com Fri Jan 23 11: 31:44 2004\nSubject: test\n\nTest\r\n'
>
> >>> import email
> >>> email.__version__
> '3.0b1'
> >>> s = '>From 6r331g9@bigfoot.com Fri Jan 23 11:31:44 2004\r\nSubject:test\r\n\r\nTest\r\n'
> >>> m = email.message_from_string(s)
> >>> m.keys()
> []
> >>> m.as_string()
> '\n>From 6r331g9@bigfoot.com Fri Jan 23 11:31:44 2004\r\nSubject:test\r\n\r\nTest\r\n'
>
> Is this deliberate? I don't know (and haven't checked) whether it's valid
> to start a message like this, but one of my POP servers does (it puts a
> >From header at the start of any message). If this is deliberate, I'll just
> strip that off before doing message_from_string, but if it's not, then I
> guess it needs to be fixed, and I'll open a tracker for it.

It's not valid to start a message with ">From". Strictly speaking, RFC
2822 doesn't say anything about the "Unix-From" line that can start
messages in mbox formats, but the parsers in email 2.5 and 3.0 know how
to parse them. However the test is that the line starts exactly with
"From ".
If you look closely enough at the email 2.5 example, the parser thought
the >From line was a normal RFC 2822 header, and split it at the first
colon -- inside the time string! So that's clearly broken.

I think email 3.0 is doing the right thing.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041128/5046dde3/attachment.pgp

From t-meyer at ihug.co.nz Mon Nov 29 03:10:26 2004
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon Nov 29 03:11:04 2004
Subject: [Email-SIG] Messages that start with '>'
In-Reply-To:
Message-ID:

[Tony Meyer]
>> A change in email 3.0 is that messages that start with '>' are
>> considered to be all body and no header, e.g:

[Barry Warsaw]
> It's not valid to start a message with ">From".

Are you sure? I'm not as familiar with RFC2822 as I'm sure you are, but
reading it, it seems that a header name can be any characters from (ASCII
ord) 33-57 and 59-126, which includes ">" (e.g. you could have a ">Dumbname:
value" header). Non-standard headers typically start with "X-", of course,
but that's not required by RFC2822 (again, as I understand it).

It isn't (AFAICT) valid to have a space in the name, so having the
"Unix-From" line is invalid (so I'll work around this stupid POP3 server I
connect to), but without the space, the header name is valid, I think, so it
should work.
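The character rule described above can be checked with a short regular expression. This is a sketch assuming Python 3; is_valid_field_name is a hypothetical helper and not part of the email package.

```python
import re

# RFC 2822 field names: one or more printable US-ASCII characters
# (codes 33-126) excluding the colon (code 58).
FIELD_NAME = re.compile(r"[\x21-\x39\x3b-\x7e]+")


def is_valid_field_name(name):
    """Return True if name is a syntactically valid RFC 2822 field name."""
    return FIELD_NAME.fullmatch(name) is not None


for candidate in (">Dumbname", "Subject", ">From 6r331g9@bigfoot.com Fri Jan 23 11"):
    print(repr(candidate), is_valid_field_name(candidate))
```

Under this rule a leading ">" is acceptable, while the space in the ">From ..." line is what makes it an invalid header name.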
New example (email 3.0b1 only):

>>> s = ">Message:bad, bad, bad!\n\nTest"
>>> m = email.message_from_string(s)
>>> m.keys()
[]
>>> m.as_string()
'\n>Message:bad, bad, bad!\n\nTest'

>>> s = "Message:bad, bad, bad!\n\nTest"
>>> m = email.message_from_string(s)
>>> m.keys()
['Message']
>>> m.as_string()
'Message: bad, bad, bad!\n\nTest'

> If you look closely enough at the email 2.5 example, the
> parser thought the >From line was a normal RFC 2822 header,
> and split it at the first colon -- inside the time string!
> So that's clearly broken.
>
> I think email 3.0 is doing the right thing.

I'm fine with the 'invalid header means all headers in body' rule, but I'm
not convinced that having '>' in the header name makes the header invalid.

Cheers,
Tony

From barry at python.org Mon Nov 29 04:35:47 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 29 04:35:51 2004
Subject: [Email-SIG] Messages that start with '>'
In-Reply-To:
References:
Message-ID: <1101699347.19175.39.camel@presto.wooz.org>

On Sun, 2004-11-28 at 21:10, Tony Meyer wrote:

> [Tony Meyer]
> >> A change in email 3.0 is that messages that start with '>' are
> >> considered to be all body and no header, e.g:
>
> [Barry Warsaw]
> > It's not valid to start a message with ">From".
>
> Are you sure? I'm not as familiar with RFC2822 as I'm sure you are, but
> reading it, it seems that a header name can be any characters from (ASCII
> ord) 33-57 and 59-126, which includes ">" (e.g. you could have a ">Dumbname:
> value" header). Non-standard headers typically start with "X-", of course,
> but that's not required by RFC2822 (again, as I understand it).

Oops, I really meant ">From ". Your other point is a fair cop. The
relevant section is 3.6.8 in RFC 2822. I have a patch and a test case
for this, which I'll sneak in momentarily, assuming Python's "make test"
succeeds.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041128/a5456486/attachment.pgp

From t-meyer at ihug.co.nz Mon Nov 29 04:44:11 2004
From: t-meyer at ihug.co.nz (Tony Meyer)
Date: Mon Nov 29 04:44:45 2004
Subject: [Email-SIG] Messages that start with '>'
In-Reply-To:
Message-ID:

> Oops, I really meant ">From ". Your other point is a fair
> cop. The relevant section is 3.6.8 in RFC 2822. I have a
> patch and a test case for this, which I'll sneak in
> momentarily, assuming Python's "make test" succeeds.

I won't tell Anthony if you won't .

Thanks!

Cheers,
Tony

From anthony at interlink.com.au Mon Nov 29 08:14:06 2004
From: anthony at interlink.com.au (Anthony Baxter)
Date: Mon Nov 29 08:15:18 2004
Subject: [Email-SIG] Messages that start with '>'
In-Reply-To:
References:
Message-ID: <200411291814.06708.anthony@interlink.com.au>

On Monday 29 November 2004 14:44, Tony Meyer wrote:

> > Oops, I really meant ">From ". Your other point is a fair
> > cop. The relevant section is 3.6.8 in RFC 2822. I have a
> > patch and a test case for this, which I'll sneak in
> > momentarily, assuming Python's "make test" succeeds.
>
> I won't tell Anthony if you won't .

*cough*

Assuming Barry can produce a patch that doesn't break anything else, I
can't see a problem with this going in. But do it _soon_ - 2.4 final is
less than 24 hours away, and I want to start the build-fest on a
crapload of different machines soon.

Anthony

From barry at python.org Mon Nov 29 08:16:15 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 29 08:16:17 2004
Subject: [Email-SIG] Standalone email package 3.0 final
Message-ID: <1101712575.19174.59.camel@presto.wooz.org>

Python 2.4 final will probably be released in a few hours so this seems
like a good time to release the standalone email package, version 3.0
final.
Unless there's some last second snafu, this will be identical to the
version released with Python 2.4.

email 3.0 is compatible with Python 2.3 and 2.4. If you need to support
earlier versions of Python, stick with email 2.5.5.

For documentation (until Fred flips the "current" docs switch) and
download links, please see the email-sig home page:

http://www.python.org/sigs/email-sig

Changes in email 3.0 include:

* New FeedParser provides an incremental parsing API for applications
  that may need to read email messages from blocking sources (e.g.
  sockets). FeedParser is also more standards compliant than the old
  parser and is "non-strict", so that it should never raise parse errors
  when parsing broken messages.

* The old Parser API is (mostly) supported for backward compatibility.

* Previously deprecated API features have been removed, while a few more
  deprecations have been added.

* Support for Pythons earlier than 2.3 has been removed.

* Lots and lots of fixes.

Feel free to join the email-sig mailing list for further discussion.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041129/78ae018b/attachment.pgp

From barry at python.org Mon Nov 29 08:19:21 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 29 08:19:23 2004
Subject: [Email-SIG] Messages that start with '>'
In-Reply-To: <200411291814.06708.anthony@interlink.com.au>
References: <200411291814.06708.anthony@interlink.com.au>
Message-ID: <1101712761.19173.62.camel@presto.wooz.org>

On Mon, 2004-11-29 at 02:14, Anthony Baxter wrote:

> > I won't tell Anthony if you won't .
>
> *cough*
>
> Assuming Barry can produce a patch that doesn't break anything
> else, I can't see a problem with this going in.
> But do it _soon_ -
> 2.4 final is less than 24 hours away, and I want to start the build-fest
> on a crapload of different machines soon.

The deed is done. I don't plan on touching anything before you release
Python 2.4 final.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041129/361d3657/attachment.pgp

From kdart at kdart.com Mon Nov 29 14:52:16 2004
From: kdart at kdart.com (Keith Dart)
Date: Mon Nov 29 14:52:19 2004
Subject: [Email-SIG] Application MIME message
Message-ID: <41AB2990.60101@kdart.com>

I had a need for a MIME attachment of type Application. I was surprised
to find that the email package does not include one. So I created one. I
have attached it here.

--
\/ \/
(O O)
-- --------------------oOOo~(_)~oOOo----------------------------------------
Keith Dart
vcard:
public key: ID: F3D288E4 URL:
============================================================================
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MIMEApplication.py
Type: text/x-python
Size: 1046 bytes
Desc: not available
Url : http://mail.python.org/pipermail/email-sig/attachments/20041129/91375e6a/MIMEApplication.py
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kdart.vcf
Type: text/x-vcard
Size: 179 bytes
Desc: not available
Url : http://mail.python.org/pipermail/email-sig/attachments/20041129/91375e6a/kdart.vcf

From barry at python.org Mon Nov 29 18:49:20 2004
From: barry at python.org (Barry Warsaw)
Date: Mon Nov 29 18:49:33 2004
Subject: [Email-SIG] Application MIME message
In-Reply-To: <41AB2990.60101@kdart.com>
References: <41AB2990.60101@kdart.com>
Message-ID: <1101750560.22647.17.camel@geddy.wooz.org>

On Mon, 2004-11-29 at 08:52, Keith Dart wrote:

> I had a need for a MIME attachment of type Application. I was surprised
> to find that the email package does not include one. So I created one. I
> have attached it here.

It's probably a worthwhile thing to add (tho' it'll have to wait for
email 3.1). Can you please submit a SourceForge patch with your file
attached? Assign it to me. Otherwise, it'll get lost in my inbox.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/email-sig/attachments/20041129/d5924f46/attachment.pgp
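Current Python 3 releases ship a class of this kind as email.mime.application.MIMEApplication. A minimal usage sketch follows; it is not Keith's attached file, and the attach_binary helper and the file name are made up for illustration.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def attach_binary(container, data, filename, subtype="octet-stream"):
    """Attach raw bytes to a multipart container as an application/* part."""
    # MIMEApplication base64-encodes the payload by default.
    part = MIMEApplication(data, subtype)
    part.add_header("Content-Disposition", "attachment", filename=filename)
    container.attach(part)
    return part


outer = MIMEMultipart()
part = attach_binary(outer, b"\x00\x01\x02", "blob.bin")
print(part.get_content_type())
```

Passing a different subtype, for example "pdf", yields an application/pdf part with the same encoding behaviour.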