[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

Stefan Behnel report at bugs.python.org
Fri Aug 20 18:43:11 CEST 2010


Stefan Behnel <scoder at users.sourceforge.net> added the comment:

Here's a patch against the latest py3k. The following will call the new code, for example:

  str(memoryview(b'abc'), 'ASCII')

whereas bytes and bytearray continue to use their own special-casing code (which has also changed a bit, since I wanted to avoid code duplication).
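
As a rough illustration of the three input types mentioned above, here is a small plain-Python sketch (written for this note, not part of the patch) that decodes the same data from a bytes object, a bytearray and a memoryview. The helper name decode_all is hypothetical, and the comments about which C code path is taken simply restate the description above.

```python
def decode_all(data, encoding="ASCII"):
    """Decode the same data via bytes, bytearray and memoryview."""
    return [
        str(data, encoding),              # bytes: special-cased
        str(bytearray(data), encoding),   # bytearray: special-cased
        str(memoryview(data), encoding),  # generic buffer-protocol object
    ]


results = decode_all(b"abc")
print(results)
```

All three calls should produce the same string; only the last one goes through an object that is neither bytes nor bytearray.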

For testing, I wrote a short Cython module that implements the buffer protocol in an extension type and freshly allocates a new bytes object as buffer on each access:

  from cpython.ref cimport Py_INCREF, Py_DECREF, PyObject

  cdef class Test:
      def __getbuffer__(self, Py_buffer* buffer, int flags):
          s = b'abcdefg' * 10
          buffer.buf = <char*> s
          buffer.obj = self
          buffer.len = len(s)
          Py_INCREF(s)
          buffer.internal = <void*> s

      def __releasebuffer__(self, Py_buffer* buffer):
          Py_DECREF(<object>buffer.internal)

Put it into a file "buftest.pyx", build it, start up Python 3.x and call

    >>> import buftest
    >>> print(len( str(buftest.Test(), "ASCII") ))

Under the unpatched Py3, this raises a decoding exception for me when it tries to decode data from the deallocated bytes object. Other systems may happily crash here. The patched Python runtime prints '70' as expected.
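
The lifetime guarantee that the test relies on can also be observed from pure Python. The following is a minimal sketch, assuming only standard CPython behaviour of bytearray and memoryview: while a buffer is held, the exporter refuses to resize its memory, and it may only change again after the buffer has been released. A consumer that releases the buffer before it has finished reading the data gives up exactly this guarantee.

```python
data = bytearray(b"abcdefg" * 10)
view = memoryview(data)        # acquires a buffer on 'data'

try:
    data.extend(b"x")          # resizing is refused while the buffer is held
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

text = str(view, "ASCII")      # decode while the buffer is still held
view.release()                 # give the buffer back to the exporter
data.extend(b"x")              # now the exporter may resize again

print(resized_while_exported, len(text), len(data))
```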

----------
keywords: +patch
Added file: http://bugs.python.org/file18585/unicodeobject-PyUnicode_FromEncodedObject-buffer.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7415>
_______________________________________

