[Python-checkins] cpython (2.7): Issue #1548891: The cStringIO.StringIO() constructor now encodes unicode

antoine.pitrou python-checkins at python.org
Sat Oct 22 21:31:18 CEST 2011


http://hg.python.org/cpython/rev/27ae7d4e1983
changeset:   73054:27ae7d4e1983
branch:      2.7
parent:      73046:d30482d51c25
user:        Antoine Pitrou <solipsis at pitrou.net>
date:        Sat Oct 22 21:26:01 2011 +0200
summary:
  Issue #1548891: The cStringIO.StringIO() constructor now encodes unicode
arguments with the system default encoding just like the write() method
does, instead of converting it to a raw buffer.
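
A minimal sketch of the difference described above, for illustration only.
It does not call cStringIO (which exists only in Python 2). Instead it
emulates the two behaviours in pure Python. The helper names are
hypothetical, and the "raw buffer" layout assumes a little-endian wide
(UCS-4) build; a narrow build would hold UTF-16 code units instead.

```python
# Illustrative emulation only; these helpers are not part of cStringIO.

def raw_buffer_bytes(text):
    """Approximate the old constructor behaviour: the object was filled
    with the in-memory code units of the unicode string.

    Assumption: little-endian UCS-4 build (4 bytes per character).
    """
    return text.encode("utf-32-le")


def default_encoded_bytes(text, encoding="ascii"):
    """Approximate the new constructor behaviour: the unicode string is
    encoded with the system default encoding, as write() already did.

    Assumption: the default encoding is ASCII, as on a stock Python 2.7.
    """
    return text.encode(encoding)


old_value = raw_buffer_bytes(u"abcde")       # 20 bytes, padded with NULs
new_value = default_encoded_bytes(u"abcde")  # 5 bytes, plain ASCII

# A non-ASCII character can no longer be passed silently: encoding fails.
try:
    default_encoded_bytes(u"\xf4")
    non_ascii_rejected = False
except UnicodeEncodeError:
    non_ascii_rejected = True
```

Under these assumptions, the old path yields four bytes per character
while the new path yields the same bytes that write() would store, and a
non-ASCII argument raises UnicodeEncodeError rather than producing
unexpected buffer contents.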

files:
  Doc/library/stringio.rst  |   5 +----
  Lib/test/test_StringIO.py |  21 +++++++++++++++++++++
  Misc/NEWS                 |   4 ++++
  Modules/cStringIO.c       |   6 +++++-
  4 files changed, 31 insertions(+), 5 deletions(-)


diff --git a/Doc/library/stringio.rst b/Doc/library/stringio.rst
--- a/Doc/library/stringio.rst
+++ b/Doc/library/stringio.rst
@@ -82,10 +82,7 @@
    those cases.
 
    Unlike the :mod:`StringIO` module, this module is not able to accept Unicode
-   strings that cannot be encoded as plain ASCII strings.  Calling
-   :func:`StringIO` with a Unicode string parameter populates the object with
-   the buffer representation of the Unicode string instead of encoding the
-   string.
+   strings that cannot be encoded as plain ASCII strings.
 
    Another difference from the :mod:`StringIO` module is that calling
    :func:`StringIO` with a string parameter creates a read-only object. Unlike an
diff --git a/Lib/test/test_StringIO.py b/Lib/test/test_StringIO.py
--- a/Lib/test/test_StringIO.py
+++ b/Lib/test/test_StringIO.py
@@ -134,6 +134,27 @@
         f = self.MODULE.StringIO(a)
         self.assertEqual(f.getvalue(), '\x00\x01\x02')
 
+    def test_unicode(self):
+
+        if not test_support.have_unicode: return
+
+        # The cStringIO module converts Unicode strings to character
+        # strings when writing them to cStringIO objects.
+        # Check that this works.
+
+        f = self.MODULE.StringIO()
+        f.write(u'abcde')
+        s = f.getvalue()
+        self.assertEqual(s, 'abcde')
+        self.assertEqual(type(s), str)
+
+        f = self.MODULE.StringIO(u'abcde')
+        s = f.getvalue()
+        self.assertEqual(s, 'abcde')
+        self.assertEqual(type(s), str)
+
+        self.assertRaises(UnicodeEncodeError, self.MODULE.StringIO, u'\xf4')
+
 
 import sys
 if sys.platform.startswith('java'):
diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -66,6 +66,10 @@
 Library
 -------
 
+- Issue #1548891: The cStringIO.StringIO() constructor now encodes unicode
+  arguments with the system default encoding just like the write() method
+  does, instead of converting it to a raw buffer.
+
 - Issue #9168: now smtpd is able to bind privileged port.
 
 - Issue #12529: fix cgi.parse_header issue on strings with double-quotes and
diff --git a/Modules/cStringIO.c b/Modules/cStringIO.c
--- a/Modules/cStringIO.c
+++ b/Modules/cStringIO.c
@@ -661,7 +661,11 @@
   char *buf;
   Py_ssize_t size;
 
-  if (PyObject_AsReadBuffer(s, (const void **)&buf, &size)) {
+  if (PyUnicode_Check(s)) {
+    if (PyObject_AsCharBuffer(s, (const char **)&buf, &size) != 0)
+      return NULL;
+  }
+  else if (PyObject_AsReadBuffer(s, (const void **)&buf, &size)) {
     PyErr_Format(PyExc_TypeError, "expected read buffer, %.200s found",
                  s->ob_type->tp_name);
     return NULL;

-- 
Repository URL: http://hg.python.org/cpython

