Python 3.0 crashes displaying Unicode at interactive prompt

Paul Boddie paul at boddie.org.uk
Sun Dec 14 18:32:42 EST 2008


On 14 Dec, 22:13, "Martin v. Löwis" <mar... at v.loewis.de> wrote:
> > But shouldn't the production of an object's representation via repr be
> > a "safe" operation?
>
> It's a trade-off. It should also be legible.

Right. I can understand that, unlike in Python 2.x, the representation
of a string in Python 3.x (whose equivalent in Python 2.x would be a
Unicode object) must itself be a string (whereas Python 2.x produces a
byte string), and that repr cannot decide to choose "safe"
representations for characters which cannot be displayed in a
terminal. As an example, for Python 2.x...

>>> u"æøå"
u'\xe6\xf8\xe5'
>>> repr(u"æøå")
"u'\\xe6\\xf8\\xe5'"

...and for Python 3.x...

>>> "æøå"
'æøå'
>>> repr("æøå")
"'æøå'"

...with an ISO-8859-15 terminal. Python 2.x could conceivably be
smarter about encoding representations, but chooses not to be, since
the smarter behaviour would require knowing that an "output
situation" was imminent. Python 3.x, on the other hand, leaves issues
of encoding to the generic I/O pipeline, which causes the described
problem whenever the output encoding cannot represent a character.
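
To make the I/O pipeline point concrete, here is a minimal sketch that
pushes repr output through a text stream with a chosen encoding. The
helper name write_repr is made up for illustration, and the in-memory
stream only stands in for sys.stdout; a real terminal may be configured
differently.

```python
import io

def write_repr(value, encoding):
    """Write repr(value) through a strict text stream and return the bytes.

    Hypothetical helper: the BytesIO-backed stream imitates an output
    stream such as sys.stdout with the given encoding.
    """
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding, errors="strict")
    stream.write(repr(value))
    stream.flush()
    return buffer.getvalue()

# An ISO-8859-15 stream can encode the representation.
latin9_bytes = write_repr("æøå", "iso-8859-15")
print(latin9_bytes)

# An ASCII-only stream cannot: repr itself succeeded, the write fails.
try:
    write_repr("æøå", "ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print("write failed:", exc.reason)
```

The representation is produced without trouble in both cases; it is
only the encoding step in the stream that can fail.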

Of course, repr will always work if its output does not get sent to
sys.stdout or an insufficiently capable output stream, but I suppose
usage of repr for debugging purposes, where one may wish to inspect
character values, must be superseded by usage of the ascii function,
as you point out. It's unfortunate that the default behaviour isn't
optimal at the interactive prompt for some configurations, though.
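
As a small illustration of the difference, under the assumption that
the output stream can only handle ASCII, the following compares repr
and ascii on the same string. The helper encodable is a name invented
for this sketch.

```python
text = "æøå"

legible = repr(text)  # keeps the non-ASCII characters as they are
safe = ascii(text)    # escapes every non-ASCII character

def encodable(s, encoding):
    """Return True if s can be encoded with the given encoding."""
    try:
        s.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

print(safe)                           # '\xe6\xf8\xe5'
print(encodable(safe, "ascii"))       # True
print(encodable(legible, "ascii"))    # False
```

So ascii gives up legibility in exchange for output that any
ASCII-compatible stream can carry, which is the trade-off mentioned
above.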

Paul


