[Email-SIG] email.header.decode_header eats my spaces

Tokio Kikuchi tkikuchi at is.kochi-u.ac.jp
Wed Mar 28 02:06:49 CEST 2007


Barry Warsaw wrote:

> On Mar 27, 2007, at 3:06 AM, Tokio Kikuchi wrote:
> 
>> In my opinion (may not be true to RFC2822 in detail), ascii strings in 
>> header object should be strip()ped and separated by FWS (including 
>> '\r\n ' or '\r\n\t').
> 
> I actually think we should be doing the opposite, namely preserving any 
> FWS in the existing text and /not/ substituting continuation_ws for it 
> when we re-break the headers.  This is the only way to maintain 
> idempotency short of saving the original header intact (but then memory 
> usage doubles).  continuation_ws should be used only when we're forced 
> to break at a non-existing FWS location, e.g. if we've split a non-ascii 
> header or at a non-whitespace header-specific syntactic break.  In the 
> case of RFC 2047 headers, the FWS gets consumed anyway so it isn't 
> idempotentially (?!) significant.
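
For reference, the RFC 2047 point quoted above (FWS between encoded
words is consumed on decoding) can be checked with a small sketch.  It
assumes a current Python 3 email.header, which may not behave like the
2.5-era code under discussion, and the header values are made up:

```python
from email.header import decode_header, make_header

# Made-up example: two encoded words separated by whitespace,
# once folded across lines and once on a single line.
folded = '=?utf-8?q?foo?=\r\n =?utf-8?q?bar?='
unfolded = '=?utf-8?q?foo?= =?utf-8?q?bar?='


def decode_to_text(raw):
    """Decode an RFC 2047 header value to a plain str."""
    return str(make_header(decode_header(raw)))


folded_text = decode_to_text(folded)
unfolded_text = decode_to_text(unfolded)
```

Both values decode to the same text, so the whitespace between the two
encoded words cannot be recovered from the decoded form.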

Well, this will surely break my contribution to Mailman 2.2's 
CookHeaders.py, which unifies the subject prefix munging code for 
both ASCII and RFC 2047 headers.  :-(

Almost all MUAs munge the subject by adding 'Re:' and adjusting the 
header length.  If the patch goes in this direction, the Python email 
package can no longer be used for, e.g., a webmail application.  If I 
understand correctly, of course.
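
To make the subject munging concrete, here is a minimal sketch of the
kind of unified prefixing I mean.  It is not the CookHeaders.py code;
the helper name add_subject_prefix is hypothetical, and it assumes a
current Python 3 email.header:

```python
from email.header import Header, decode_header, make_header


def add_subject_prefix(raw_subject, prefix):
    """Hypothetical helper: prepend prefix to a Subject header value.

    The value is decoded first, so plain ASCII and RFC 2047 encoded
    input go through the same code path.
    """
    text = str(make_header(decode_header(raw_subject)))
    if text.startswith(prefix):
        return raw_subject  # already prefixed, leave it untouched
    munged = prefix + ' ' + text
    try:
        munged.encode('ascii')
        charset = 'us-ascii'
    except UnicodeEncodeError:
        charset = 'utf-8'
    # Header.encode() folds the value to the maximum line length as needed.
    return Header(munged, charset, header_name='Subject').encode()
```

The point is that the munged header is rebuilt from the decoded text, so
this style of code relies on the package choosing the folding itself
rather than on the whitespace of the original header.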
> 
> That's where my patch is headed anyway.  I have one test case failure 
> left to resolve.  It's a bear, but when I get that working I'll submit a 
> patch for review.  My gut is telling me not to apply this to Python 2.5 
> but only Python 2.6 since enough of the semantics of continuation_ws and 
> folding has changed that it isn't appropriate for a patch release.
> 
Maybe we should add an option to email.header.Header(), like 
idempotent=True/False.  ;-)


-- 
Tokio Kikuchi, tkikuchi at is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/

