From albrecht.andi at googlemail.com  Wed Jun 25 16:14:31 2008
From: albrecht.andi at googlemail.com (Andi Albrecht)
Date: Wed, 25 Jun 2008 16:14:31 +0200
Subject: [Email-SIG] continuation_ws in Generator and Header
Message-ID: <11497d880806250714j5d311250j2709e5d61328de56@mail.gmail.com>

There's currently a discussion about the different continuation
whitespaces used in Generator._write_headers() and the Header class
regarding issue 1974 (http://bugs.python.org/1974). Barry pointed out
to move this discussion to this list, so here it is...

The problem can be summarized as follows: The default continuation
whitespace for long headers in the Header class is the space
character, but in Generator._write_headers() a tab is used to create a
Header class from a string. This resulted in at least two bug reports
(1974, 1645148) where there were problems with some email clients
(e.g. Outlook) that didn't display the subject as expected. It turned
out that the problem only occurs when a string is used to set the
subject header but not when using the Header class directly.

I've uploaded a patch to codereview
(http://codereview.appspot.com/2407). It is just the result from
spending some time on isolating the subject issue and maybe it has
some use for the discussion here.

Regards,

Andi

P.s.: As it is my first post on this list: Hello to all! ;-)

- http://andialbrecht.de

From mark at msapiro.net  Thu Jun 26 17:13:38 2008
From: mark at msapiro.net (Mark Sapiro)
Date: Thu, 26 Jun 2008 08:13:38 -0700
Subject: [Email-SIG] continuation_ws in Generator and Header
In-Reply-To: <11497d880806250714j5d311250j2709e5d61328de56@mail.gmail.com>
Message-ID:

Andi Albrecht wrote:

>There's currently a discussion about the different continuation
>whitespaces used in Generator._write_headers() and the Header class
>regarding issue 1974 (http://bugs.python.org/1974). Barry pointed out
>to move this discussion to this list, so here it is...
>
>The problem can be summarized as follows: The default continuation
>whitespace for long headers in the Header class is the space
>character, but in Generator._write_headers() a tab is used to create a
>Header class from a string. This resulted in at least two bug reports
>(1974, 1645148) where there were problems with some email clients
>(e.g. Outlook) that didn't display the subject as expected. It turned
>out that the problem only occurs when a string is used to set the
>subject header but not when using the Header class directly.


There are a couple of problems here that historically result from
ambiguities in RFC-822. RFC-2822, sec. 2.2.3 clarifies the standard
and is now clear on how folding and unfolding should be done, but the
email library doesn't do it that way, and for historical reasons, many
MUAs don't either.

The email library is pretty good I think about folding 'structured'
headers at higher level breaks. Most problems seem to occur in
Subject: headers which are unstructured and in which commas and
semi-colons are just text and not field separators.

According to RFC-2822, we shouldn't have a continuation-ws character
at all, because we shouldn't be inserting anything other than a CRLF.
The real problem is RFC-822 said we could insert CRLF followed by
whitespace, and MUAs in an attempt to deal with that tend to remove at
least the first whitespace character following the CRLF even though
both RFC-822 and RFC-2822 say that unfolding is accomplished by
removing any (only) CRLF that is immediately followed by whitespace.

While I think the patch will help somewhat by providing consistency,
and by not putting a tab in Subject: headers that doesn't get removed,
I need to look more closely to see if when continuation_ws is a space
does an extra space get inserted.

In any case, I think the goal should be RFC-2822 compliance,
especially since it seems that Outlook and Tbird appear to be going
that way.

I may have the urge to look at this after Mailman 2.1.11 is released.

-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
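
For illustration, the unfolding rule Mark cites from RFC-2822 sec. 2.2.3
can be sketched in a few lines of Python. This is only a minimal sketch
and not the email library's code: the function names and the sample
subject lines are hypothetical, and the second function merely models
the lenient MUA behaviour described above (dropping the first
whitespace character after the CRLF as well).

```python
import re

# A CRLF that is immediately followed by a space or tab marks a fold.
# The lookahead leaves the whitespace itself untouched.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold_rfc2822(value):
    """Unfold by removing only the CRLF; the whitespace after it is kept."""
    return _FOLD.sub("", value)


def unfold_strip_first_ws(value):
    """Hypothetical lenient unfolding: also drop the first whitespace
    character that follows the CRLF."""
    return re.sub(r"\r\n[ \t]", "", value)


# Hypothetical headers: one folded with an inserted tab, one folded at a
# space that was already part of the subject text.
folded_with_tab = "Subject: a long subject\r\n\tfolded with a tab"
folded_at_space = "Subject: a long subject\r\n folded at a space"

print(repr(unfold_rfc2822(folded_with_tab)))
print(repr(unfold_rfc2822(folded_at_space)))
print(repr(unfold_strip_first_ws(folded_with_tab)))
```

Under the strict rule the tab-folded subject keeps a literal tab in its
value, whereas the space-folded one comes back exactly as written. The
lenient variant removes the tab but joins the two neighbouring words,
so neither reading of a tab-folded Subject: gives the original text.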