[Ironpython-users] Unicode

Markus Schaber m.schaber at codesys.com
Tue Sep 16 17:59:56 CEST 2014


Hi,

I just noticed that non-BMP Unicode literals don't seem to work correctly in IronPython, although they work correctly in CPython 2.7.8:

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
>>> u"\U00010042"
u'\U00010042'

IronPython 2.7.4 (2.7.0.40) on .NET 4.0.30319.18444 (32-bit)
>>> u"\U00010042"
'B'

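I guess this truncation is caused by the UTF-16 nature of .NET strings: the code point 0x10042 does not fit into a single 16-bit code unit, and its low 16 bits, 0x0042, are 'B'. The most sensible thing IronPython could do here is probably to create the correct surrogate pair: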

>>> import System
>>> System.Text.Encoding.UTF32.GetString((0x42, 0, 1, 0))
u'\ud800\udc42'
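
For illustration, the arithmetic behind that surrogate pair can be sketched in plain Python. This is only a sketch of the standard UTF-16 encoding rule, not IronPython's actual code; the helper names to_surrogate_pair and truncate_to_16_bits are made up for this example, and the truncation helper merely mimics the behaviour guessed at above.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000      # 20-bit value to distribute
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> low surrogate
    return high, low


def truncate_to_16_bits(code_point):
    """Hypothetical model of the suspected bug: keep only the low 16 bits."""
    return code_point & 0xFFFF


high, low = to_surrogate_pair(0x10042)
print("%04x %04x" % (high, low))               # prints: d800 dc42
print("%04x" % truncate_to_16_bits(0x10042))   # prints: 0042, i.e. 'B'
```

The first result matches what System.Text.Encoding.UTF32.GetString returns above, and the second matches the 'B' that IronPython 2.7.4 prints.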


Best regards

Markus Schaber
