break unichr instead of fix ord?

Sat Aug 29 15:43:58 EDT 2009

2009/8/29  <rurpy at yahoo.com>:
> On 08/28/2009 02:12 AM, "Martin v. Löwis" wrote:
>
> So far, it seems not and that unichr/ord
> is a poster child for "purity beats practicality".

As Mark Tolonen pointed out earlier in this thread, in Python 3
practicality apparently beat purity in this respect: on a narrow
build, ord() accepts a surrogate pair and chr() returns one:

Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.

>>> goth_urus_1 = '\U0001033f'
>>> list(goth_urus_1)
['\ud800', '\udf3f']
>>> len(goth_urus_1)
2
>>> ord(goth_urus_1)
66367
>>> goth_urus_2 = chr(66367)
>>> len(goth_urus_2)
2
>>> import unicodedata
>>> unicodedata.name(goth_urus_1)
'GOTHIC LETTER URUS'
>>> goth_urus_3 = unicodedata.lookup("GOTHIC LETTER URUS")
>>> goth_urus_4 = "\N{GOTHIC LETTER URUS}"
>>> goth_urus_1 == goth_urus_2 == goth_urus_3 == goth_urus_4
True
>>>
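
The two items that list() shows above are the UTF-16 code units of
U+1033F. A small sketch to check that independently of the build;
the helper name utf16_code_units is made up for this illustration
and is not part of any library:

```python
import struct

def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    Hypothetical helper for illustration only.
    """
    # Big-endian UTF-16 without a BOM: two bytes per code unit.
    data = text.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(data) // 2), data))

units = utf16_code_units(u"\U0001033f")
print([hex(u) for u in units])  # ['0xd800', '0xdf3f']
```

This gives the same pair whether the interpreter stores the
character as one code point or as two surrogates.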

As for the behaviour in Python 2.x, it is probably good enough that
surrogates aren't prohibited; any behaviour that turns out to be
needed can easily be added via custom functions.
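
Such custom functions might look roughly like the following. This is
only a sketch: the names wide_unichr and wide_ord are my own, and I
am assuming that unichr() raises ValueError for code points above
0xFFFF on a narrow 2.x build. The unichr = chr fallback is only there
so the same code also runs on Python 3.

```python
try:
    unichr              # Python 2
except NameError:
    unichr = chr        # Python 3: chr() already takes any code point

def wide_unichr(code_point):
    """Like unichr(), but return a surrogate pair on a narrow build.

    Hypothetical helper, not part of the standard library.
    """
    try:
        return unichr(code_point)
    except ValueError:
        # Only supplementary-plane code points can be split into a pair.
        if not 0x10000 <= code_point <= 0x10FFFF:
            raise
        offset = code_point - 0x10000
        high = 0xD800 + (offset >> 10)      # top 10 bits
        low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
        return unichr(high) + unichr(low)

def wide_ord(text):
    """Like ord(), but also accept a two-unit surrogate pair.

    Hypothetical helper, not part of the standard library.
    """
    if len(text) == 2:
        high, low = ord(text[0]), ord(text[1])
        if 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF:
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    # Anything else is left to the builtin, including its error handling.
    return ord(text)

print(wide_ord(u"\ud800\udf3f"))       # 66367
print(wide_ord(wide_unichr(66367)))    # 66367
```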

vbr