[Python-checkins] cpython (3.3): #17369: Improve handling of broken RFC2231 values in get_filename.

r.david.murray python-checkins at python.org
Fri Feb 7 21:05:03 CET 2014


http://hg.python.org/cpython/rev/63f8ea0eeb6d
changeset:   89025:63f8ea0eeb6d
branch:      3.3
parent:      89022:aecc0a4be052
user:        R David Murray <rdmurray at bitdance.com>
date:        Fri Feb 07 15:02:19 2014 -0500
summary:
  #17369: Improve handling of broken RFC2231 values in get_filename.

This fixes a regression relative to python2.

files:
  Lib/email/utils.py                |   4 +++
  Lib/test/test_email/test_email.py |  20 +++++++++++++++++++
  Misc/NEWS                         |   4 +++
  3 files changed, 28 insertions(+), 0 deletions(-)


diff --git a/Lib/email/utils.py b/Lib/email/utils.py
--- a/Lib/email/utils.py
+++ b/Lib/email/utils.py
@@ -337,6 +337,10 @@
     # object.  We do not want bytes() normal utf-8 decoder, we want a straight
     # interpretation of the string as character bytes.
     charset, language, text = value
+    if charset is None:
+        # Issue 17369: if charset/lang is None, decode_rfc2231 couldn't parse
+        # the value, so use the fallback_charset.
+        charset = fallback_charset
     rawbytes = bytes(text, 'raw-unicode-escape')
     try:
         return str(rawbytes, charset, errors)
diff --git a/Lib/test/test_email/test_email.py b/Lib/test/test_email/test_email.py
--- a/Lib/test/test_email/test_email.py
+++ b/Lib/test/test_email/test_email.py
@@ -5018,6 +5018,26 @@
         self.assertNotIsInstance(param, tuple)
         self.assertEqual(param, "Frank's Document")
 
+    def test_rfc2231_missing_tick(self):
+        m = '''\
+Content-Disposition: inline;
+\tfilename*0*="'This%20is%20broken";
+'''
+        msg = email.message_from_string(m)
+        self.assertEqual(
+            msg.get_filename(),
+            "'This is broken")
+
+    def test_rfc2231_missing_tick_with_encoded_non_ascii(self):
+        m = '''\
+Content-Disposition: inline;
+\tfilename*0*="'This%20is%E2broken";
+'''
+        msg = email.message_from_string(m)
+        self.assertEqual(
+            msg.get_filename(),
+            "'This is\ufffdbroken")
+
     # test_headerregistry.TestContentTypeHeader.rfc2231_single_quote_in_value_with_charset_and_lang
     def test_rfc2231_tick_attack_extended(self):
         eq = self.assertEqual
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -45,6 +45,10 @@
 Library
 -------
 
+- Issue #17369: get_filename was raising an exception if the filename
+  parameter's RFC2231 encoding was broken in certain ways.  This was
+  a regression relative to python2.
+
 - Issue #20013: Some imap servers disconnect if the current mailbox is
   deleted, and imaplib did not handle that case gracefully.  Now it
   handles the 'bye' correctly.
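
For readers who want to try the get_filename change from this changeset, here is a minimal sketch, assuming an interpreter that already includes the fix. The header text is copied from test_rfc2231_missing_tick in the diff. The constant name BROKEN_HEADER is made up for illustration. The direct call to collapse_rfc2231_value with a (None, None, text) tuple is an assumption about what decode_rfc2231 hands over when it cannot parse the value, based on the comment added in Lib/email/utils.py.

```python
import email
from email.utils import collapse_rfc2231_value

# Header taken from the new test: the RFC 2231 value has only one tick,
# so no charset or language can be parsed out of it.
BROKEN_HEADER = (
    "Content-Disposition: inline;\n"
    "\tfilename*0*=\"'This%20is%20broken\";\n"
)

msg = email.message_from_string(BROKEN_HEADER)
filename = msg.get_filename()
print(repr(filename))

# Assumed shape of the unparsable value: charset and language are None.
# With the fix, charset falls back to fallback_charset ('us-ascii' by default).
collapsed = collapse_rfc2231_value((None, None, "'This is broken"))
print(repr(collapsed))
```

On an interpreter without the fix, the get_filename call is expected to raise rather than return a string, which is the regression the NEWS entry describes.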

-- 
Repository URL: http://hg.python.org/cpython

