[XML-SIG] DC DOM tests
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Tue, 20 Feb 2001 19:42:16 +0100
> I wonder if that's what Martijn means. I've read that most Java
> implementations have trouble with characters outside the BMP. I
> wonder if Python handles these properly.
Not sure what "properly" would be:
>>> s=unichr(0xD000)+unichr(0xD800)
>>> s
u'\ud000\ud800'
>>> len(s)
2
Do I even use them in the right order here? It can store them, and
reproduce what was stored. Apart from that, it does not special-case
surrogates at all.
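On the ordering question, a minimal sketch of the UTF-16 pairing
arithmetic may help. In UTF-16 the high surrogate (U+D800..U+DBFF) comes
first and the low surrogate (U+DC00..U+DFFF) second; U+D000 lies outside
both ranges, so it is an ordinary BMP character rather than a surrogate.
The helper names to_surrogates and from_surrogates are made up for
illustration and are not part of Python, and the code assumes Python 3
syntax rather than the interpreter shown above.

```python
def to_surrogates(cp):
    """Split a code point outside the BMP into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = cp - 0x10000              # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits, stored first
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits, stored second
    return high, low


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1D11E)     # U+1D11E MUSICAL SYMBOL G CLEF
print(hex(high), hex(low))             # 0xd834 0xdd1e
```

A string that stores surrogates without special-casing them would count
such a pair as two characters, which matches the len() result above.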
Regards,
Martin
P.S. I really think Python should have used a 32-bit wide character
representation instead.