email.Message.get_payload() surprising behavior

lrotger lrotger at aircomp.aero
Mon Jul 3 11:23:22 EDT 2006


get_payload() behaves differently depending on whether the
quoted-printable text has \n or \r\n line endings. With \n endings, if
the last decoded byte of a line is 0x0D, the decoder mistakes that byte
plus the \n for a \r\n line ending and strips both bytes.

This does not happen if the same file has \r\n line endings, because
the affected line then ends in \r\r\n; the \r\n is correctly stripped,
leaving the \r at the end of the line, as it should.

More graphically:

Quoted-printable encoded file with \n newlines:
'Byte number 13 is =0D\n' <--- get_payload(decode = True) will strip the 
last 0x0D byte, leaving the file one byte short.

Quoted-printable encoded file with \r\n newlines:
'Byte number 13 is =0D\r\n' <--- get_payload(decode = True) will leave 
the 0x0D byte in the file.
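
To compare the two cases side by side, something like the following
sketch could be used. It assumes a current Python 3 and uses
email.message_from_bytes, which did not exist when this was posted; the
headers and the helper name decode_with_newline are made up for
illustration. The output is printed rather than stated here because it
may differ between Python versions.

```python
import email

# Minimal headers for a single-part quoted-printable message (illustrative).
HEADERS = [
    b"MIME-Version: 1.0",
    b"Content-Type: application/octet-stream",
    b"Content-Transfer-Encoding: quoted-printable",
]


def decode_with_newline(newline: bytes) -> bytes:
    """Build the same tiny message with the given line ending and decode it."""
    body = b"Byte number 13 is =0D"
    raw = newline.join(HEADERS) + newline + newline + body + newline
    msg = email.message_from_bytes(raw)
    return msg.get_payload(decode=True)


lf_result = decode_with_newline(b"\n")
crlf_result = decode_with_newline(b"\r\n")

# Compare the tails of the two results to see whether 0x0D survived.
print("LF  :", repr(lf_result))
print("CRLF:", repr(crlf_result))
```

If the two printed values differ in whether the 0x0D byte is present,
the interpreter in use shows the behavior described above.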

I think this is a bug? It has bitten me in a script run from procmail.
Postfix hands over \n line endings, and the bug shows up about once
every four months. I can work around it by replacing \n with \r\n, which
ensures that any \r that is part of the file and happens to fall at the
end of a line is never stripped, but this is inelegant.
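
The workaround could look roughly like this. It is only a sketch for a
current Python 3; normalize_to_crlf and parse_normalized are
hypothetical helper names, not part of the email package. It rewrites
only bare \n, so input that already uses \r\n is left unchanged.

```python
import email
import re

# Matches a \n that is not already preceded by \r.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def normalize_to_crlf(raw: bytes) -> bytes:
    """Turn every bare \\n into \\r\\n, leaving existing \\r\\n alone."""
    return _BARE_LF.sub(b"\r\n", raw)


def parse_normalized(raw: bytes):
    """Parse a raw message after normalizing its line endings."""
    return email.message_from_bytes(normalize_to_crlf(raw))


raw_from_mta = (
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"Byte number 13 is =0D\n"
)
msg = parse_normalized(raw_from_mta)
payload = msg.get_payload(decode=True)
```

The normalization is applied to the raw message before parsing, so the
encoded =0D is always followed by a real \r\n when the decoder sees it.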

Thanks
Lucia


