[Python-Dev] UTF-16 code point comparison
Finn Bock
bckfnn@worldonline.dk
Wed, 26 Jul 2000 19:42:29 GMT
CPython's unicode compare function contains some code to compare surrogate
characters in code-point order (I think). This is probably a very neat
feature, but it differs from Java's way of comparing strings.
Python 2.0b1 (#0, Jul 26 2000, 21:29:11) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
Copyright 1995-2000 Corporation for National Research Initiatives (CNRI)
>>> print u'\ue000' < u'\ud800'
1
>>> print ord(u'\ue000') < ord(u'\ud800')
0
>>>
Java (and JPython) compares the 16-bit characters numerically, which results in:
JPython 1.1+08 on java1.3.0 (JIT: null)
Copyright (C) 1997-1999 Corporation for National Research Initiatives
>>> print u'\ue000' < u'\ud800'
0
>>> print ord(u'\ue000') < ord(u'\ud800')
0
>>>
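To make the two orderings concrete, here is a rough sketch written for a
current Python 3 interpreter. It assumes strings are given as lists of 16-bit
code units. The helper names are made up for illustration, and the remapping
is the usual UTF-16 fix-up trick for getting code-point order, not necessarily
the exact code CPython uses:

```python
# Hypothetical helpers: names are illustrative, not taken from CPython or JPython.

def fixup(unit):
    """Remap a 16-bit code unit so that surrogates sort above U+E000..U+FFFF."""
    if 0xD800 <= unit <= 0xDFFF:
        return unit + 0x2000   # surrogates move up to 0xF800..0xFFFF
    if 0xE000 <= unit <= 0xFFFF:
        return unit - 0x0800   # U+E000..U+FFFF move down to 0xD800..0xF7FF
    return unit                # everything below 0xD800 is unchanged


def less_code_unit_order(a, b):
    """Compare two sequences of 16-bit code units numerically (Java-style)."""
    return list(a) < list(b)


def less_code_point_order(a, b):
    """Compare two sequences of UTF-16 code units in code-point order."""
    return [fixup(u) for u in a] < [fixup(u) for u in b]


print(less_code_unit_order([0xE000], [0xD800]))   # False, like the JPython session
print(less_code_point_order([0xE000], [0xD800]))  # True, like the CPython session
```

The same difference shows up with a real surrogate pair: U+10000 is encoded as
the units 0xD800 0xDC00, which sorts below 0xFFFF in code-unit order but above
it in code-point order.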
I don't think I can come up with a solution that allows JPython to emulate
CPython on this type of comparison.
regards,
finn