[Python-3000] PEP3137: "".encode() return type

Christian Heimes lists at cheimes.de
Thu Nov 1 15:20:31 CET 2007


I'm trying to fix some of the errors in the py3k-pep3137 branch and I
stumbled over an issue with encode().

TypeError: encoder did not return a bytes object (type=bytes)

The PEP's sentence "... encoding always takes a Unicode string and returns a
bytes sequence, and decoding always takes a bytes sequence and returns a
Unicode string." doesn't settle this, because "bytes sequence" is ambiguous.
Does it mean the old bytes type (PyBytes) or the new bytes type (PyString)?
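
For illustration only, here is the contract from that sentence as it can be
checked from Python. This is a minimal sketch that assumes a released
Python 3 interpreter, where str.encode() returns the immutable bytes type and
the mutable buffer is called bytearray. It says nothing about what the
py3k-pep3137 branch did at the time, and the variable names are made up for
the example.

```python
# Round trip through a codec, looking only at the types involved.
# Assumption: a released Python 3 interpreter, not the py3k-pep3137 branch.
text = "h\u00e9llo"

encoded = text.encode("utf-8")     # Unicode string in, bytes sequence out
decoded = encoded.decode("utf-8")  # bytes sequence in, Unicode string out

# The encoder hands back the immutable type, not the mutable buffer.
assert type(encoded) is bytes
assert not isinstance(encoded, bytearray)

# Decoding restores the original Unicode string.
assert type(decoded) is str
assert decoded == text
```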

One of the encode functions in unicodeobject.c, PyUnicode_AsEncodedString(),
checks for PyString and converts it into a buffer (PyBytes):

    if (!PyBytes_Check(v)) {
        if (PyString_Check(v)) {
            /* Old codec, turn it into bytes */
            PyObject *b = PyBytes_FromObject(v);
            Py_DECREF(v);
            return b;
        }
        /* ... */
    }

unicode_encode(), on the other hand, does not accept PyString at all and
immediately raises a TypeError.
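
The difference between the two code paths can be modelled in a few lines of
Python. This is only a sketch under stated assumptions: bytearray stands in
for the old mutable PyBytes, bytes stands in for the new PyString, and both
function names are hypothetical. What PyUnicode_AsEncodedString() does for
any other type is not shown in the excerpt, so the final TypeError in the
lenient variant is an assumption, as is the idea that the error quoted at the
top comes from the strict check.

```python
def lenient_result(v):
    """Model of the PyUnicode_AsEncodedString() excerpt (hypothetical name)."""
    if not isinstance(v, bytearray):       # stands in for !PyBytes_Check(v)
        if isinstance(v, bytes):           # stands in for PyString_Check(v)
            # "Old codec, turn it into bytes"
            return bytearray(v)            # stands in for PyBytes_FromObject(v)
        # Assumption: the excerpt does not show this branch.
        raise TypeError("encoder did not return a bytes object (type=%s)"
                        % type(v).__name__)
    return v


def strict_result(v):
    """Model of the check in unicode_encode() (hypothetical name)."""
    if not isinstance(v, bytearray):
        # No conversion is attempted: anything else is rejected outright.
        raise TypeError("encoder did not return a bytes object (type=%s)"
                        % type(v).__name__)
    return v


# A codec that returns the new type passes the lenient path after conversion.
converted = lenient_result(b"abc")

# The strict path rejects the same value.
try:
    strict_result(b"abc")
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)
```

In this model the same codec result succeeds through one entry point and
fails through the other, which is the inconsistency the questions below are
about.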

* What's the correct return type, PyBytes or PyString?

* If PyString is wrong, shouldn't we fix PyUnicode_AsEncodedString() and
remove the PyBytes_FromObject() conversion?

Christian


