email.Message.get_payload() surprising behavior
lrotger
lrotger at aircomp.aero
Mon Jul 3 11:23:22 EDT 2006
get_payload() behaves differently depending on whether the
quoted-printable text has \n or \r\n line endings. With \n endings, if
the last byte of a line in the file is 0x0D, that byte and the \n are
mistaken for a \r\n line ending and both bytes are stripped.
This behavior does not occur if the same file has \r\n line endings
because the confusing line will end in \r\r\n; then \r\n will be
correctly stripped, leaving a \r at the end of a line, as it should.
More graphically:
Quoted-printable encoded file with \n newlines:
'Byte number 13 is =0D\n' <--- get_payload(decode = True) will strip the
last 0x0D byte, leaving the file one byte short.
Quoted-printable encoded file with \r\n newlines:
'Byte number 13 is =0D\r\n' <--- get_payload(decode = True) will leave
the 0x0D byte in the file.
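A minimal sketch for reproducing the two cases side by side. It assumes Python 3, where email.message_from_bytes() is the parsing entry point; the helper names build_message and decoded_payload are made up for illustration. The result may well depend on the Python version, so no expected output is shown.

```python
import email


def build_message(newline: bytes) -> bytes:
    """Return a tiny quoted-printable message using the given line ending.

    Hypothetical helper: the headers are the minimum needed for
    get_payload(decode=True) to apply quoted-printable decoding.
    """
    lines = [
        b"Content-Type: application/octet-stream",
        b"Content-Transfer-Encoding: quoted-printable",
        b"",
        b"Byte number 13 is =0D",
    ]
    return newline.join(lines) + newline


def decoded_payload(raw: bytes) -> bytes:
    """Parse the raw message and return its decoded payload bytes."""
    msg = email.message_from_bytes(raw)
    return msg.get_payload(decode=True)


if __name__ == "__main__":
    # Print the decoded payload for each line-ending style so the two
    # results can be compared byte for byte.
    for newline in (b"\n", b"\r\n"):
        print(repr(newline), "->", repr(decoded_payload(build_message(newline))))
```

Comparing the two repr() lines shows whether the trailing 0x0D survives in each case.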
I think this is a bug. It has bitten me in a script run from procmail:
postfix hands over messages with \n line endings, and the bug shows up
about once every four months. I can work around it by replacing \n with
\r\n, which ensures that a \r that is part of the file and happens to
fall in the last position of a line is never stripped, but this is
inelegant.
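The workaround amounts to something like the sketch below, which converts every bare \n to \r\n before the message is parsed. The function name to_crlf is hypothetical, and the code assumes the raw message is available as bytes (Python 3); it leaves existing \r\n pairs alone so they are not doubled.

```python
import re

# Matches a \n that is not already preceded by \r.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def to_crlf(raw: bytes) -> bytes:
    """Return raw with every bare LF replaced by CRLF."""
    return _BARE_LF.sub(b"\r\n", raw)
```

The normalized bytes would then be handed to the email parser in place of the original message.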
Thanks
Lucia