Can I make unicode in a repr() print readably?

Terry Hancock hancock at anansispaceworks.com
Mon Sep 11 09:21:38 EDT 2006


Martin v. Löwis wrote:
>  Terry Hancock schrieb:
> > Or, put another way, what exactly does 'print' do when it gets a
> > class instance to print? It seems to do the right thing if given a
> > unicode or string object, but I can't figure out how to make it do
> > the same thing for a class instance.
>
>  It won't. PyFile_WriteObject checks for Unicode objects, and whether
>  the file has an encoding attribute set, and if so, encodes the
>  Unicode object.
>
>  If it is not a Unicode object, it falls through to PyObject_Print,
>  which first checks for the tp_print slot (which can't be set in
>  Python), then uses PyObject_Str (which requires that the __str__
>  result is a true byte string), or PyObject_Repr (if the RAW flag
>  isn't set - it is when printing). PyObject_Str first checks for
>  tp_str; if that isn't set, it falls back to PyObject_Repr.

>  You can save some typing, of course, with a helper function:
>
>  def p(o): print unicode(o)

Yeah, that's what I've done as it stands.  I think it's actually fewer
keystrokes that way, but it is still inconsistent* with other objects,
of course.

>  I agree that this is not optimal; contributions are welcome. It would
>  probably be easiest to drop the guarantee that PyObject_Str returns a
>  true string, or use _PyObject_Str (which does not make this
>  guarantee) in PyObject_Print. One would have to think what the effect
>  on backwards compatibility is of such a change.

Ah, contribute to Python itself.  I'll have to think about it -- I don't do
a lot of C programming these days, but it sounds like an idea.

I don't know about the backwards compatibility issue. I'm not sure
what would be affected.  But "print" frequently generates encoded
Unicode output if the stream supports it, so there is no guarantee
whether it produces unicode or string output now.  I think it's clear
that str() *must* return an ordinary Python string.

I think what would make sense is for the "print" statement to attempt
to call __unicode__ on an instance before attempting to call __str__
(just as it currently falls back from __str__ to __repr__).  That seems like
it would be pretty consistent, right?
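
To make that lookup order concrete, here is a minimal sketch written as
an ordinary function rather than as a change to the interpreter. The
names text_for_print, Named and Plain are made up for illustration, and
the function only models the order I have in mind (__unicode__ first,
then __str__, which in turn falls back to __repr__). It is not what
"print" does today.

```python
def text_for_print(obj):
    """Return the text a __unicode__-aware print might use for obj.

    Hypothetical helper: it models the proposed lookup order for class
    instances only, not the interpreter's actual behaviour.
    """
    # 1. Prefer __unicode__ when the instance provides it.
    to_unicode = getattr(obj, "__unicode__", None)
    if to_unicode is not None:
        return to_unicode()
    # 2. Otherwise use str(), which itself falls back to __repr__
    #    when the class defines no __str__.
    return str(obj)


class Named(object):
    """Example class that only knows how to describe itself in unicode."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u"Named: " + self.name

    def __repr__(self):
        return "<Named %r>" % (self.name,)


class Plain(object):
    """Example class with no __unicode__ and no __str__."""

    def __repr__(self):
        return "<Plain>"
```

With this, text_for_print(Named(u"L\xf6wis")) hands back the unicode
text, which the stream could then encode as it already does for plain
unicode objects, while text_for_print(Plain()) falls through to the
__repr__ result as before.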

Cheers,
Terry

*Okay, actually it is perfectly consistent in a technical sense, but not in
the practical, "this is what you do to examine the object" sense.

-- 
Terry Hancock (hancock at AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com



