[ python-Bugs-1588217 ] quoted printable parse the sequence '= ' incorrectly

SourceForge.net noreply at sourceforge.net
Thu Nov 16 18:09:14 CET 2006


Bugs item #1588217, was opened at 2006-10-31 21:06
Message generated for change (Comment added) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1588217&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Wai Yip Tung (tungwaiyip)
>Assigned to: Georg Brandl (gbrandl)
Summary: quoted printable parse the sequence '= ' incorrectly

Initial Comment:
>>> import quopri

>>> s = 'I say= a secret message\r\nThank you'

>>> quopri.a2b_qp
<built-in function a2b_qp>
>>> quopri.decodestring(s)  # uses the C version, binascii.a2b_qp(), to decode
'I sayThank you'

>>> quopri.a2b_qp=None
>>> quopri.decodestring(s)  # uses the Python version, quopri.decode(), to decode
'I say= a secret message\nThank you'


Note that the sequence '= ' is invalid according to 
RFC 2045 section 6.7:

-------------------------------------------------------
An "=" followed by a character that is neither a 
hexadecimal digit (including "abcdef") nor the CR 
character of a CRLF pair is illegal ... A reasonable 
approach by a robust implementation might be to 
include the "=" character and the following character 
in the decoded data without any transformation
-------------------------------------------------------

The Python parser, quopri.decode(), uses this lenient 
interpretation to produce the second string. Most email 
clients use a similar lenient interpretation.

The C parser, binascii.a2b_qp(), which is used in 
preference to the Python version, produces a surprising 
result in which the string ' a secret message' is 
omitted.

This may create an opportunity for spammers to insert 
a secret message after '= ' so that it is invisible to 
a Python-based spam filter but would be displayed by a 
non-Python-based email client.
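
The lenient approach suggested by the RFC can be sketched 
roughly as follows. This is only an illustration and not the 
code of quopri or binascii; the name decode_qp_lenient is 
hypothetical. It assumes a bytes input, accepts lowercase hex 
digits, and does not strip trailing whitespace or normalise 
line endings the way quopri.decode() does.

```python
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def decode_qp_lenient(data: bytes) -> bytes:
    """Hypothetical lenient quoted-printable decoder (illustration only)."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] != b"=":
            # Ordinary byte: copy it through unchanged.
            out += data[i:i + 1]
            i += 1
            continue
        rest = data[i + 1:i + 3]
        if rest == b"\r\n":
            # Soft line break: "=" immediately followed by CRLF.
            i += 3
        elif rest[:1] == b"\n":
            # Assumption: also tolerate a bare LF after "=".
            i += 2
        elif len(rest) == 2 and all(c in HEX_DIGITS for c in rest):
            # "=XX" escape: emit the byte with that hex value.
            out.append(int(rest.decode("ascii"), 16))
            i += 3
        else:
            # Illegal sequence such as "= ": keep the "=" as it is.
            # The following character is copied by the branch above.
            out += b"="
            i += 1
    return bytes(out)


print(decode_qp_lenient(b"I say= a secret message\r\nThank you"))
# b'I say= a secret message\r\nThank you'
```

With this sketch the text after '= ' survives decoding, so a 
filter sees the same content a lenient mail client would show.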


----------------------------------------------------------------------

>Comment By: Georg Brandl (gbrandl)
Date: 2006-11-16 17:09

Message:
Logged In: YES 
user_id=849994
Originator: NO

Thanks for the report, this is now fixed in rev. 52765, 52766 (2.5).

----------------------------------------------------------------------

Comment By: Wai Yip Tung (tungwaiyip)
Date: 2006-10-31 21:18

Message:
Logged In: YES 
user_id=561546

The problem may come from binascii_a2b_qp() in binascii.c. It 
treats the '= ' or '=\t' sequence as a soft line break. Such an 
interpretation appears to have no basis. It could be a 
misinterpretation of RFC 2045:

-------------------------------------------------------------------
In particular, an "=" at the end of an encoded line, indicating a 
soft line break (see rule #5) may follow one or more TAB (HT) or 
SPACE characters.
-------------------------------------------------------------------

This passage reminds readers they might find TAB or SPACE before 
an "=", but not after it. "= " is plain illegal as far as I know.
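
A filter that wants to flag such input could look for "=" 
characters that do not start a valid escape. The sketch below 
is an assumption-laden illustration: the names ILLEGAL_EQ and 
find_illegal_escapes are hypothetical, lowercase hex digits are 
accepted, and whitespace between "=" and CRLF is reported as 
illegal, following the rule quoted in the initial comment.

```python
import re

# "=" not followed by two hex digits and not followed by CRLF.
ILLEGAL_EQ = re.compile(rb"=(?![0-9A-Fa-f]{2}|\r\n)")


def find_illegal_escapes(data: bytes) -> list:
    """Return the offsets of "=" characters that start no valid escape."""
    return [m.start() for m in ILLEGAL_EQ.finditer(data)]


print(find_illegal_escapes(b"I say= a secret message\r\nThank you"))  # [5]
print(find_illegal_escapes(b"caf=C3=A9=\r\nok"))                      # []
```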


----------------------------------------------------------------------
