[Python-3000] PEP 3138 - String representation in Python 3000

Paul Moore p.f.moore at gmail.com
Thu May 15 18:49:06 CEST 2008


On 15/05/2008, Atsuo Ishimoto <ishimoto at gembook.org> wrote:
> I would like to call it "improve", not break :)

Could you help me understand the impact here? I am running
Windows XP (UK English, console code page 850, a DOS variant of
Latin-1). Currently, printing a character outside that code page gives
me an exception. For example:

>>> print("Hello\u03C8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Apps\Python30\lib\io.py", line 1103, in write
    b = s.encode(self._encoding)
  File "D:\Apps\Python30\lib\encodings\cp850.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character '\u03c8' in
position 5: character maps to <undefined>

(This is 3.0a1. I don't know if much has changed in more recent
alphas; if it's significant, I can upgrade and try again.)
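
The failure above does not depend on the console itself; it can be
reproduced with the codec alone. A minimal sketch, assuming a current
Python 3 rather than 3.0a1 (the variable names are only illustrative):

```python
text = "Hello\u03C8"  # U+03C8 GREEK SMALL LETTER PSI, absent from code page 850

try:
    text.encode("cp850")          # same codec the console stream uses above
    encodable = True
    failed_char = None
except UnicodeEncodeError as exc:
    encodable = False
    # exc.object is the input string; start/end delimit the offending span
    failed_char = exc.object[exc.start:exc.end]
    failed_position = exc.start

print(encodable, ascii(failed_char))
```

The position reported by the exception matches the "position 5" in the
traceback, since the psi is the sixth character of the string.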

Can you explain what I need to change to make sys.stdout behave as you
propose? If you can do that, I can test what I will see in your
proposal if I type print(repr("Hello\u03C8")). My suspicion is that I
will see unreadable garbage, rather than what I currently get, which
is backslash-escaped, but readable.
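
One way to try this without touching the real console is to write to an
in-memory stream that uses the cp850 codec. This is only a sketch on a
current Python 3, where repr() keeps printable non-ASCII characters and
ascii() produces the backslash-escaped form; the helper name render is
hypothetical, and it says nothing about what 3.0a1 itself does:

```python
import io

text = "Hello\u03C8"

def render(value, encoding="cp850", errors="strict"):
    """Write value to an in-memory text stream and return the bytes produced."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    stream.write(value)
    stream.flush()
    return raw.getvalue()

# ascii() escapes the psi before it reaches the codec, so strict mode is fine.
escaped = render(ascii(text))

# repr() keeps the psi; the backslashreplace handler escapes it at encode time.
replaced = render(repr(text), errors="backslashreplace")

print(escaped)
print(replaced)
```

Both calls produce the same escaped bytes. With the default strict error
handler, render(repr(text)) raises UnicodeEncodeError instead, just like
the print() call in the traceback.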

The key point here is that I don't think you're proposing to detect
the user's display capabilities and adapt the output to match, so if
my display can't cope with the full Unicode character set, I'll have
to make manual adjustments or see broken output.
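
To make the idea of adapting to the display concrete, here is a rough
sketch of what such a check could look like. The function repr_for is
hypothetical, not part of the proposal or of Python, and it can only
test whether the stream's encoding can represent the text, which is not
the same as knowing that the console font can draw it:

```python
import io
import sys

def repr_for(value, stream=None):
    """Return repr(value) if the stream's encoding can represent it,
    otherwise fall back to the fully escaped ascii(value)."""
    stream = sys.stdout if stream is None else stream
    # Streams without a usable encoding attribute are treated as ASCII-only.
    encoding = getattr(stream, "encoding", None) or "ascii"
    text = repr(value)
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return ascii(value)
    return text

# Stand-ins for a cp850 console and a UTF-8 terminal.
cp850_console = io.TextIOWrapper(io.BytesIO(), encoding="cp850")
utf8_console = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

print(ascii(repr_for("Hello\u03C8", cp850_console)))
print(ascii(repr_for("Hello\u03C8", utf8_console)))
```

The fallback here escapes the whole string rather than only the
characters that fail, which keeps the sketch short.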

Like it or not, a large proportion of Python's users still work in
environments where much of the Unicode character space is not
displayed readably.

My apologies if I misunderstood your proposal - I have almost no
Unicode experience, and that probably shows :-)

Paul.

