[Python-bugs-list] [ python-Bugs-513683 ] email.Parser uses LF as line sep.

Fri, 15 Mar 2002 08:51:27 -0800

Bugs item #513683, was opened at 2002-02-06 04:35
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=513683&group_id=5470

Category: Python Library
Group: Python 2.2
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Brian Takashi Hooper (bthooper)
Assigned to: Barry Warsaw (bwarsaw)
Summary: email.Parser uses LF as line sep.

Initial Comment:
I'm not sure what the best solution is for this, but some email 
clients send multipart MIME messages using CRLF as the line 
separator instead of just LF, which seems to be assumed in 
email.Parser.Parser._parsebody. Maybe I'm reading the RFC 
wrong, but it seems like it says that lines of a mail message should be 
separated using CRLF (although I'm sure many clients don't do that 
either)...

----------------------------------------------------------------------

>Comment By: Barry Warsaw (bwarsaw)
Date: 2002-03-15 11:51

Message:
Logged In: YES 
user_id=12800

Closing this as Won't Fix because the email package's
philosophy is to process messages using native line endings.
Any conversions to/from RFC line endings must happen
outside the package (e.g. by the delivering MTA, or by
smtplib for outgoing mail).

----------------------------------------------------------------------

Comment By: Brian Takashi Hooper (bthooper)
Date: 2002-02-06 09:41

Message:
Logged In: YES 
user_id=450505

OK, that seems like a satisfactory answer.
I do actually happen to be using Postfix on FreeBSD, albeit a slightly
old version (maybe a year or so), and am piping mails to a Python
script, which is where I observed this problem. Maybe it is something
in my local setup? (I didn't set up Postfix, but I don't see why it
wouldn't be doing the default thing.)

Maybe it would be safer not to make assumptions about the input
message, and convert line endings to native before parsing? This would
be my vote anyway (I tend to avoid thoroughly reading documentation
unless something doesn't work as I intuit it should :-)
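
A minimal sketch of the "convert to native before parsing" idea
suggested above, written for current Python 3 rather than the 2.2-era
package discussed in this thread. It assumes LF is the native line
ending (as on FreeBSD); the helper names to_native_line_endings and
parse_piped_message are hypothetical and not part of the email package.

```python
import email


def to_native_line_endings(raw: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF.

    Hypothetical helper; assumes LF is the native line ending.
    """
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def parse_piped_message(raw: bytes):
    """Normalize line endings first, then hand the bytes to the email package."""
    return email.message_from_bytes(to_native_line_endings(raw))


# A small multipart message that uses CRLF throughout, as some clients send it.
sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XXX"\r\n'
    b"\r\n"
    b"--XXX\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"first part\r\n"
    b"--XXX\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"second part\r\n"
    b"--XXX--\r\n"
)

msg = parse_piped_message(sample)
parts = msg.get_payload()  # list of sub-messages for a multipart message
```

In a script that receives mail through a pipe, the raw bytes could be
read with sys.stdin.buffer.read() and passed to parse_piped_message.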

----------------------------------------------------------------------

Comment By: Barry Warsaw (bwarsaw)
Date: 2002-02-06 07:32

Message:
Logged In: YES 
user_id=12800

My philosophy so far (and I *think* this is documented in
the latest rev of the .tex file) is that the email package
should deal with native line endings, and that it is the job
of a delivering MTA to convert from RFC line endings (CRLF)
to native.  It is certainly the case that smtplib converts
from native to RFC line endings when sending the message
out.  Most MTAs (e.g. Postfix), when piping the message to a
process or onto a file, will convert to native line endings,
at least in my experience.
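
The outgoing direction described above (native to RFC line endings)
can be sketched as follows. This is only an illustration of the
conversion, not smtplib's actual code; the helper name
to_rfc_line_endings is hypothetical.

```python
import re

# Matches CRLF, a lone CR, or a lone LF. CRLF is listed first so an
# existing CRLF is treated as one line ending, not two.
_ANY_EOL = re.compile(r"\r\n|\r|\n")


def to_rfc_line_endings(text: str) -> str:
    """Rewrite every line ending as CRLF (hypothetical helper)."""
    return _ANY_EOL.sub("\r\n", text)


native = "Subject: hello\n\nline one\nline two\n"
wire = to_rfc_line_endings(native)
```

Because CRLF is matched as a single unit, applying the helper to text
that already uses CRLF leaves it unchanged.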

This may not be a very useful assumption though, and it is
probably more robust to be able to deal with either kind of
line ending.  There have been some movements in this direction
in the CVS snapshot of the mimelib/email package, where
support for multibyte charsets (e.g. Japanese) has been
added.  You might want to check out that project's CVS trunk
and see if it helps your situation, or submit a bug report
there and we'll prototype the fix in that project first. 
Eventually all that code will be ported back to the Python
2.3 tree.

----------------------------------------------------------------------
