__unicode__ vs. __str__: not quite parallel?

Mon Nov 11 12:38:09 EST 2002

ht at cogsci.ed.ac.uk (Henry S. Thompson) writes:

> If you print an object to a normal stream, and object's class has a
>  __str__ method, what appears is the result of the __str__ method.

What is "a normal stream"?

>>> f=open("/tmp/bla","w")
>>> class X:
...   def __str__(self):
...     print "STR"
...     return "str"
... 
>>> x=X()
>>> str(x)
STR
'str'
>>> f.write(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: argument 1 must be string or read-only character buffer, not instance

So it is not at all common that you can write arbitrary things into a
byte stream.
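
For readers on a current interpreter, here is a minimal sketch of the same split in Python 3, where str has taken over the role of unicode. It is not part of the session above; the class X and the in-memory stream are only illustrative. print converts its argument with str(), while write() on a text stream insists on being handed a string.

```python
import io

class X:
    def __str__(self):
        return "str"

x = X()
stream = io.StringIO()

# print() calls str() on its argument, so __str__ is used.
print(x, file=stream)

# write() performs no conversion and rejects anything that is not a str.
rejected = False
try:
    stream.write(x)
except TypeError:
    rejected = True

print(repr(stream.getvalue()), rejected)
```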

> I've searched the archives but found no joy for this one -- any help
> out there?

I can't offer help, but I will instead ask for help.

This looks like a bug. PyUnicode_FromObject does not consider invoking
__unicode__, but I think it should. In fact, I cannot understand why
PyObject_Unicode and PyUnicode_FromObject are different functions.
There is already a comment in this function suggesting that.

So please either submit a bug report, or, better yet, a patch (I
*will* forget about this if there is no reminder on SF).

In return, I can offer a work-around: when you look up a stream writer,
don't use it directly. Instead, do

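 import codecs

 basewriter = codecs.getwriter(encodingname)
 class writer(basewriter):
   def write(self, data):
     # Convert first, so that __unicode__ is honoured.
     data = unicode(data)
     # Delegate to the base class; instances have no __bases__.
     return basewriter.write(self, data)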
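
As a rough illustration of how such a wrapper might be used, here is a self-contained sketch written for Python 3, where str() stands in for unicode() and __str__ for __unicode__. The helper name coercing_writer, the class X, and the BytesIO target are assumptions made for the example, not part of the codecs module or of the original problem.

```python
import codecs
import io

def coercing_writer(encoding):
    """Return a StreamWriter subclass that stringifies whatever it is given."""
    base = codecs.getwriter(encoding)

    class Writer(base):
        def write(self, data):
            # str() here corresponds to unicode() in the work-around above.
            return base.write(self, str(data))

    return Writer

class X:
    # Hypothetical object whose text form contains non-ASCII characters.
    def __str__(self):
        return "gr\u00fc\u00df"

buf = io.BytesIO()                    # stands in for a real byte stream
out = coercing_writer("utf-8")(buf)
out.write(X())                        # converted first, then encoded
print(buf.getvalue())
```

Wrapping the class lookup in a function keeps the encoding name in one place; the subclass still has to call the base class's write explicitly.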

Regards,
Martin
