[Python-checkins] r81659 - in python/branches/release26-maint: Doc/library/email.message.rst Doc/library/email.mime.rst Lib/email/message.py Lib/email/test/test_email.py Misc/NEWS

r.david.murray python-checkins at python.org
Thu Jun 3 00:11:01 CEST 2010


Author: r.david.murray
Date: Thu Jun  3 00:11:01 2010
New Revision: 81659

Log:
Merged revisions 81658 via svnmerge from 
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81658 | r.david.murray | 2010-06-02 18:03:15 -0400 (Wed, 02 Jun 2010) | 9 lines
  
  #1368247: make set_charset/MIMEText automatically encode unicode _payload.
  
  Fixes (mysterious, to the end user) UnicodeErrors when using utf-8 as
  the charset and unicode as the _text argument.  Also makes the way in
  which unicode gets encoded to quoted printable for other charsets more
  sane (it only worked by accident previously).  The _payload now is encoded
  to the charset.output_charset if it is unicode.
........
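
As an illustration only (not part of the commit), the behaviour described in
the log can be sketched as follows. It mirrors test_8bit_unicode_input from
the test changes below and adds the plain Message.set_charset path. The
variable names are hypothetical, and the snippet assumes an interpreter that
already includes this fix.

```python
from email.message import Message
from email.mime.text import MIMEText

# Same Cyrillic test string as test_8bit_unicode_input in this commit.
text = u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430'

# MIMEText path: the unicode _text is encoded to the charset's output_charset.
msg = MIMEText(text, _charset='utf-8')
content_type = msg['content-type']
decoded = msg.get_payload(decode=True)

# set_charset path: a unicode payload set directly on a Message is encoded
# when the charset is attached.
raw = Message()
raw.set_payload(text)
raw.set_charset('utf-8')
raw_decoded = raw.get_payload(decode=True)

# Decoding the transfer encoding yields the utf-8 bytes of the original text.
assert decoded == text.encode('utf-8')
assert raw_decoded == text.encode('utf-8')
```

Both paths end up with a Content-Type of text/plain with charset="utf-8";
before this change the same calls could raise UnicodeError on 2.x.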


Modified:
   python/branches/release26-maint/   (props changed)
   python/branches/release26-maint/Doc/library/email.message.rst
   python/branches/release26-maint/Doc/library/email.mime.rst
   python/branches/release26-maint/Lib/email/message.py
   python/branches/release26-maint/Lib/email/test/test_email.py
   python/branches/release26-maint/Misc/NEWS

Modified: python/branches/release26-maint/Doc/library/email.message.rst
==============================================================================
--- python/branches/release26-maint/Doc/library/email.message.rst	(original)
+++ python/branches/release26-maint/Doc/library/email.message.rst	Thu Jun  3 00:11:01 2010
@@ -136,9 +136,10 @@
       :mailheader:`Content-Type` header. Anything else will generate a
       :exc:`TypeError`.
 
-      The message will be assumed to be of type :mimetype:`text/\*` encoded with
-      *charset.input_charset*.  It will be converted to *charset.output_charset*
-      and encoded properly, if needed, when generating the plain text
+      The message will be assumed to be of type :mimetype:`text/\*`, with the
+      payload either in unicode or encoded with *charset.input_charset*.
+      It will be encoded or converted to *charset.output_charset*
+      and transfer encoded properly, if needed, when generating the plain text
       representation of the message.  MIME headers (:mailheader:`MIME-Version`,
       :mailheader:`Content-Type`, :mailheader:`Content-Transfer-Encoding`) will
       be added as needed.

Modified: python/branches/release26-maint/Doc/library/email.mime.rst
==============================================================================
--- python/branches/release26-maint/Doc/library/email.mime.rst	(original)
+++ python/branches/release26-maint/Doc/library/email.mime.rst	Thu Jun  3 00:11:01 2010
@@ -191,9 +191,11 @@
    minor type and defaults to :mimetype:`plain`.  *_charset* is the character
    set of the text and is passed as a parameter to the
    :class:`~email.mime.nonmultipart.MIMENonMultipart` constructor; it defaults
-   to ``us-ascii``.  No guessing or encoding is performed on the text data.
+   to ``us-ascii``.  If *_text* is unicode, it is encoded using the
+   *output_charset* of *_charset*, otherwise it is used as-is.
 
    .. versionchanged:: 2.4
-      The previously deprecated *_encoding* argument has been removed.  Encoding
-      happens implicitly based on the *_charset* argument.
+      The previously deprecated *_encoding* argument has been removed.  Content
+      Transfer Encoding now happens implicitly based on the *_charset*
+      argument.
 

Modified: python/branches/release26-maint/Lib/email/message.py
==============================================================================
--- python/branches/release26-maint/Lib/email/message.py	(original)
+++ python/branches/release26-maint/Lib/email/message.py	Thu Jun  3 00:11:01 2010
@@ -256,6 +256,8 @@
                             charset=charset.get_output_charset())
         else:
             self.set_param('charset', charset.get_output_charset())
+        if isinstance(self._payload, unicode):
+            self._payload = self._payload.encode(charset.output_charset)
         if str(charset) != charset.get_output_charset():
             self._payload = charset.body_encode(self._payload)
         if not self.has_key('Content-Transfer-Encoding'):

Modified: python/branches/release26-maint/Lib/email/test/test_email.py
==============================================================================
--- python/branches/release26-maint/Lib/email/test/test_email.py	(original)
+++ python/branches/release26-maint/Lib/email/test/test_email.py	Thu Jun  3 00:11:01 2010
@@ -1025,6 +1025,31 @@
         eq(msg.get_charset().input_charset, 'us-ascii')
         eq(msg['content-type'], 'text/plain; charset="us-ascii"')
 
+    def test_7bit_unicode_input(self):
+        eq = self.assertEqual
+        msg = MIMEText(u'hello there', _charset='us-ascii')
+        eq(msg.get_charset().input_charset, 'us-ascii')
+        eq(msg['content-type'], 'text/plain; charset="us-ascii"')
+
+    def test_7bit_unicode_input_no_charset(self):
+        eq = self.assertEqual
+        msg = MIMEText(u'hello there')
+        eq(msg.get_charset(), 'us-ascii')
+        eq(msg['content-type'], 'text/plain; charset="us-ascii"')
+        self.assertTrue('hello there' in msg.as_string())
+
+    def test_8bit_unicode_input(self):
+        teststr = u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430'
+        eq = self.assertEqual
+        msg = MIMEText(teststr, _charset='utf-8')
+        eq(msg.get_charset().output_charset, 'utf-8')
+        eq(msg['content-type'], 'text/plain; charset="utf-8"')
+        eq(msg.get_payload(decode=True), teststr.encode('utf-8'))
+
+    def test_8bit_unicode_input_no_charset(self):
+        teststr = u'\u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430'
+        self.assertRaises(UnicodeEncodeError, MIMEText, teststr)
+
 
 
 # Test complicated multipart/* messages

Modified: python/branches/release26-maint/Misc/NEWS
==============================================================================
--- python/branches/release26-maint/Misc/NEWS	(original)
+++ python/branches/release26-maint/Misc/NEWS	Thu Jun  3 00:11:01 2010
@@ -58,6 +58,9 @@
 Library
 -------
 
+- Issue #1368247: set_charset (and therefore MIMEText) now automatically
+  encodes a unicode _payload to the output_charset.
+
 - Issue #7150: Raise OverflowError if the result of adding or subtracting
   timedelta from date or datetime falls outside of the MINYEAR:MAXYEAR range.
 

