[Python-checkins] r54370 - in python/branches/release25-maint/Lib/email: header.py test/test_email.py test/test_email_renamed.py

barry.warsaw python-checkins at python.org
Wed Mar 14 05:29:10 CET 2007


Author: barry.warsaw
Date: Wed Mar 14 05:29:06 2007
New Revision: 54370

Modified:
   python/branches/release25-maint/Lib/email/header.py
   python/branches/release25-maint/Lib/email/test/test_email.py
   python/branches/release25-maint/Lib/email/test/test_email_renamed.py
Log:
SF bug #1582282: decode_header() incorrectly splits non-conformant RFC
2047-like headers in which there is no whitespace between encoded words.  This
fix adds a trailing lookahead assertion to the matching regexp, requiring that
the closing ?= be followed by whitespace, newline, or end-of-string.
It also adds the MULTILINE flag to the regexp, so that $ matches before each
newline and not only at the end of the string.
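
The effect of the lookahead and the MULTILINE flag can be sketched in
isolation. The following is a minimal illustration, not the module's code: the
diff below shows only the tail of the pattern, so the leading charset and
encoding groups are an assumed reconstruction, and the names ENCODED_WORD and
find_encoded_words are hypothetical helpers introduced here.

```python
import base64
import re

# Assumed reconstruction of the encoded-word pattern.  Only the last four
# pattern lines and the flags are taken from the diff; the charset and
# encoding groups above them are an assumption made for this sketch.
ENCODED_WORD = re.compile(r'''
  =\?                   # literal =?
  (?P<charset>[^?]*?)   # (assumed) non-greedy up to the next ? is the charset
  \?                    # literal ?
  (?P<encoding>[qb])    # (assumed) either "q" or "b", case insensitive
  \?                    # literal ?
  (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
  \?=                   # literal ?=
  (?=[ \t]|$)           # whitespace or the end of the string
  ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)


def find_encoded_words(header):
    """Return (charset, encoding, encoded) for each encoded word matched."""
    return [m.group('charset', 'encoding', 'encoded')
            for m in ENCODED_WORD.finditer(header)]


# No whitespace after the closing ?= : the lookahead rejects every candidate.
unspaced = 'Sm=?ISO-8859-1?B?9g==?=rg=?ISO-8859-1?B?5Q==?=sbord'
# Whitespace around each encoded word: both are matched.
spaced = 'Sm =?ISO-8859-1?B?9g==?= rg =?ISO-8859-1?B?5Q==?= sbord'
# A closing ?= directly before a newline matches only because of MULTILINE.
folded = 'a =?utf-8?q?x?=\nb'

# Decode the base64 payloads of the matched words to text.
decoded = [base64.b64decode(payload).decode(charset)
           for charset, _encoding, payload in find_encoded_words(spaced)]
```

With this sketch, the unspaced header yields no matches at all, which
corresponds to the new test expecting the whole string back undecoded, while
the spaced header yields two base64 words that decode to the ISO-8859-1
characters 0xf6 and 0xe5.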


Modified: python/branches/release25-maint/Lib/email/header.py
==============================================================================
--- python/branches/release25-maint/Lib/email/header.py	(original)
+++ python/branches/release25-maint/Lib/email/header.py	Wed Mar 14 05:29:06 2007
@@ -39,7 +39,8 @@
   \?                    # literal ?
   (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
   \?=                   # literal ?=
-  ''', re.VERBOSE | re.IGNORECASE)
+  (?=[ \t]|$)           # whitespace or the end of the string
+  ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)
 
 # Field name regexp, including trailing colon, but not separating whitespace,
 # according to RFC 2822.  Character range is from tilde to exclamation mark.

Modified: python/branches/release25-maint/Lib/email/test/test_email.py
==============================================================================
--- python/branches/release25-maint/Lib/email/test/test_email.py	(original)
+++ python/branches/release25-maint/Lib/email/test/test_email.py	Wed Mar 14 05:29:06 2007
@@ -1527,6 +1527,18 @@
         hu = make_header(dh).__unicode__()
         eq(hu, u'The quick brown fox jumped over the lazy dog')
 
+    def test_rfc2047_without_whitespace(self):
+        s = 'Sm=?ISO-8859-1?B?9g==?=rg=?ISO-8859-1?B?5Q==?=sbord'
+        dh = decode_header(s)
+        self.assertEqual(dh, [(s, None)])
+
+    def test_rfc2047_with_whitespace(self):
+        s = 'Sm =?ISO-8859-1?B?9g==?= rg =?ISO-8859-1?B?5Q==?= sbord'
+        dh = decode_header(s)
+        self.assertEqual(dh, [('Sm', None), ('\xf6', 'iso-8859-1'),
+                              ('rg', None), ('\xe5', 'iso-8859-1'),
+                              ('sbord', None)])
+
 
 
 # Test the MIMEMessage class

Modified: python/branches/release25-maint/Lib/email/test/test_email_renamed.py
==============================================================================
--- python/branches/release25-maint/Lib/email/test/test_email_renamed.py	(original)
+++ python/branches/release25-maint/Lib/email/test/test_email_renamed.py	Wed Mar 14 05:29:06 2007
@@ -1525,6 +1525,18 @@
         hu = make_header(dh).__unicode__()
         eq(hu, u'The quick brown fox jumped over the lazy dog')
 
+    def test_rfc2047_missing_whitespace(self):
+        s = 'Sm=?ISO-8859-1?B?9g==?=rg=?ISO-8859-1?B?5Q==?=sbord'
+        dh = decode_header(s)
+        self.assertEqual(dh, [(s, None)])
+
+    def test_rfc2047_with_whitespace(self):
+        s = 'Sm =?ISO-8859-1?B?9g==?= rg =?ISO-8859-1?B?5Q==?= sbord'
+        dh = decode_header(s)
+        self.assertEqual(dh, [('Sm', None), ('\xf6', 'iso-8859-1'),
+                              ('rg', None), ('\xe5', 'iso-8859-1'),
+                              ('sbord', None)])
+
 
 
 # Test the MIMEMessage class

