[Patches] [ python-Patches-470578 ] Fixes to synchronize unicode() and str()

noreply@sourceforge.net
Fri, 12 Oct 2001 07:03:28 -0700


Patches item #470578, was opened at 2001-10-12 07:03
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=470578&group_id=5470

Category: Core (C code)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: M.-A. Lemburg (lemburg)
Assigned to: Guido van Rossum (gvanrossum)
Summary: Fixes to synchronize unicode() and str()

Initial Comment:
This patch implements what we discussed on python-dev in late September: str(obj) and 
unicode(obj) should behave similarly, while the old behaviour is retained for unicode(obj, encoding, errors).

The patch also adds a new feature that lets objects supply unicode(obj) with input data: the 
__unicode__ method. No new tp_unicode slot is implemented for now; this is left as an option for the 
future.
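
As a rough illustration of the __unicode__ hook, here is a minimal sketch written 
for Python 3, where str plays the role of the old unicode type. The names to_unicode() 
and Greeting are hypothetical and not part of the patch, and the lookup order shown is 
an assumption rather than a description of the C code.

```python
# Hypothetical emulation of the unicode(obj) lookup; not the patch itself.
# In Python 3, str is the text type, so it stands in for unicode here.

def to_unicode(obj):
    """Return text for obj, preferring obj.__unicode__() when it exists."""
    if isinstance(obj, str):
        return obj  # already text, nothing to convert
    method = getattr(type(obj), "__unicode__", None)
    if method is not None:
        result = method(obj)
        if not isinstance(result, str):
            raise TypeError("__unicode__ must return text")
        return result
    return str(obj)  # assumed fallback: the str()-style conversion


class Greeting:
    """Example object that offers both a plain and a Unicode rendering."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        # ASCII-only rendering; unencodable characters become "?"
        return "Hello, " + self.name.encode("ascii", "replace").decode("ascii")

    def __unicode__(self):
        # Full rendering with non-ASCII characters preserved
        return "Hello, " + self.name
```

With this sketch, to_unicode(Greeting("Zo\u00eb")) keeps the "\u00eb", while 
str(Greeting("Zo\u00eb")) replaces it with "?".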

Note that PyUnicode_FromEncodedObject() no longer accepts Unicode objects as input. The API name 
already suggests that Unicode objects do not belong in the list of acceptable objects, and the functionality 
was only needed because PyUnicode_FromEncodedObject() was being used directly by unicode(). 
unicode() itself was changed as discussed:

* unicode(obj) calls PyObject_Unicode()
* unicode(obj, encoding, errors) calls PyUnicode_FromEncodedObject()
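
The two call paths above can be sketched roughly as follows, again as a Python 3 
emulation. The function names unicode_like(), object_unicode() and from_encoded_object() 
are hypothetical stand-ins for unicode(), PyObject_Unicode() and 
PyUnicode_FromEncodedObject(); the "ascii" default encoding and the error messages are 
assumptions.

```python
# Hypothetical emulation of the dispatch; the real logic lives in C.

def object_unicode(obj):
    """Stand-in for PyObject_Unicode(): accepts any object."""
    return obj if isinstance(obj, str) else str(obj)


def from_encoded_object(obj, encoding, errors):
    """Stand-in for PyUnicode_FromEncodedObject(): decodes byte data only."""
    if isinstance(obj, str):
        # Text objects are rejected: they are not encoded data
        raise TypeError("decoding Unicode is not supported")
    if not isinstance(obj, (bytes, bytearray, memoryview)):
        raise TypeError("need string or buffer, %s found" % type(obj).__name__)
    return bytes(obj).decode(encoding, errors)


def unicode_like(obj, encoding=None, errors=None):
    """Pick the code path based on whether encoding/errors were given."""
    if encoding is None and errors is None:
        return object_unicode(obj)
    # Assumed defaults when only one of the two arguments is supplied
    return from_encoded_object(obj, encoding or "ascii", errors or "strict")
```

In this sketch unicode_like(123) gives "123", unicode_like(b"caf\xc3\xa9", "utf-8") 
decodes the bytes, and unicode_like("abc", "utf-8") raises TypeError.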

One point still open for discussion is whether to leave the PyUnicode_FromObject() API as a thin API 
extension on top of PyUnicode_FromEncodedObject() or to turn it into a (macro) alias for 
PyObject_Unicode() and deprecate it. Turning it into an alias would have some surprising consequences 
though, e.g. u"abc" + 123 would turn out as u"abc123"...

Please check and then reassign to me for the checkin.
