[Python-Dev] More Unicode support

M.-A. Lemburg mal@lemburg.com
Mon, 06 Nov 2000 19:15:27 +0100


Guido van Rossum wrote:
> 
> [Guido]
> > > Adding unistr() and StreamRecoder isn't enough.  The problem is that
> > > when you set sys.stdout to a StreamRecoder, the print statement
> > > doesn't do the right thing!  Try it.  print u"foo" will work, but
> > > print u"\u1234" will fail because print always applies the default
> > > encoding.
> 
> [MAL]
> > Hmm, that's due to PyFile_WriteObject() calling PyObject_Str().
> > Perhaps we ought to let it call PyObject_Unicode() (which you
> > find in the patch on SF) instead for Unicode objects. That way
> > the file-like .write() method will be given a Unicode object
> > and StreamRecoder could then do the trick.
> 
> That's still not enough. Classes and types should be able to have a
> __str__ (or tp_str) that yields Unicode too.

Instances are already allowed to return Unicode from their __str__
method, and PyObject_Unicode() will pass it along unchanged.
PyObject_Str() will still convert it to an 8-bit string, though,
because too much existing code (even in the Python core) expects a
string object without checking for anything else.
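
To make the difference between the two calls concrete, here is a rough
model of it. This is only a sketch, not the C code: it assumes a
Python 3 interpreter (where str plays the role of the Unicode object
and bytes the 8-bit string), it assumes an ASCII default encoding, and
the names object_unicode(), object_str() and Label are made up for
illustration.

```python
# Hypothetical model of the two conversion paths; not CPython source.
DEFAULT_ENCODING = "ascii"  # assumption: the default encoding is ASCII


class Label:
    """Illustrative instance whose __str__ returns non-ASCII text."""

    def __str__(self):
        return "\u1234"


def object_unicode(obj):
    """Model of PyObject_Unicode(): pass the text from __str__ along."""
    return str(obj)


def object_str(obj):
    """Model of PyObject_Str(): force the result into an 8-bit string."""
    return str(obj).encode(DEFAULT_ENCODING)


# The Unicode path keeps the text intact.
text = object_unicode(Label())

# The 8-bit path fails as soon as the text is outside the default encoding.
try:
    object_str(Label())
    forced_ok = True
except UnicodeEncodeError:
    forced_ok = False
```

Plain ASCII text goes through both paths, which is why print u"foo"
appears to work while print u"\u1234" does not.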

So if you print an instance which returns Unicode through __str__
(and PyFile_WriteObject() calls PyObject_Unicode() as suggested above),
the wrapper's .write() method should receive a real Unicode object.
At least I think we're getting closer ;-)
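
A small sketch of what the wrapper would then see. The assumptions are
spelled out: it is written for Python 3, codecs.getwriter() stands in
for the StreamRecoder discussed in this thread, and write_object() and
Greeting are hypothetical names that only model the proposed
PyFile_WriteObject() behaviour.

```python
import codecs
import io


class Greeting:
    """Illustrative instance that returns non-ASCII text from __str__."""

    def __str__(self):
        return "caf\u00e9 \u1234"


# An in-memory byte stream stands in for the real output file.
raw = io.BytesIO()

# The wrapper encodes whatever text reaches its .write() method as UTF-8.
wrapper = codecs.getwriter("utf-8")(raw)


def write_object(obj, stream):
    """Hypothetical model: hand the stream text, not a pre-encoded string."""
    stream.write(str(obj))


write_object(Greeting(), wrapper)
encoded = raw.getvalue()  # bytes produced by the wrapper, not by the caller
```

The point of the design is that the choice of encoding stays with the
file-like object instead of being fixed by the default encoding before
.write() is ever called.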

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/