[Python-Dev] unicode inconsistency?

M.-A. Lemburg mal at egenix.com
Thu Sep 9 23:11:53 CEST 2004


Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>> No, it would not "work" the way I want.  I don't want to force
>>> things to unicode strings unless necessary.
>>
>> Unicode always causes coercion towards Unicode, just like floats
>> always cause coercion towards floats. Nothing's going to
>> change at that end.
> 
> Not always. As we are discussing right now, str() (and indirectly
> %s) coerce Unicode objects into string objects. Also,
> PyArg_ParseTuple coerces Unicode into byte strings for the "s"
> and "t" formats.

I may have been misunderstanding Neil, but I was referring
to Neil's comment that he would not like things to get
forced to Unicode.

If I look at his initial posting, it looks as if Neil wanted
'%s' % A() to return u'\u1234'.

The current implementation tests for Unicode subclasses,
but does not inspect the object returned by __str__. To
support the latter we'd either have to add a new C API,
e.g. PyObject_Text(), that returns an instance of one of the
StringTypes (str or unicode), or catch the UnicodeError raised
by the ASCII codec and let it trigger a redirection to the
Unicode formatting routine. The second option is dangerous,
however, since it would cause the object to be evaluated twice.
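
To illustrate the double-evaluation risk of the catch-and-redirect
approach, here is a rough sketch. It is only an emulation written in
present-day Python, not the actual C formatting code; the names
Counter and format_with_fallback are made up for the example, and the
ASCII-only byte-string path is approximated with .encode("ascii").

```python
class Counter:
    """Hypothetical object whose text conversion has a visible side effect."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        # Count how often the object is asked for its text form.
        self.calls += 1
        return "\u1234"


def format_with_fallback(obj):
    """Emulate 'try the ASCII path, then redirect to the Unicode path'."""
    try:
        # First evaluation: the byte-string path, which only accepts ASCII.
        return str(obj).encode("ascii")
    except UnicodeError:
        # Second evaluation: the redirect converts the object all over again.
        return str(obj)


c = Counter()
result = format_with_fallback(c)
```

After the call, c.calls is 2: the object's conversion ran once for the
failed ASCII attempt and once more for the Unicode redirect. Any side
effect in __str__ would be triggered twice in the same way.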

-- 
Marc-Andre Lemburg
eGenix.com


