[Ironpython-users] string/unicode

Markus Schaber m.schaber at codesys.com
Mon Sep 19 05:58:22 EDT 2016


Hi, Pawel,

I’m not that deep into the inner workings of IronPython string handling; my impression until now was that we just use .NET Strings 1:1, with clever binding for the members.

For your approach, I think we’d need our own wrapper instance around the .NET String objects, so the string instances don’t lose their type (though maybe we could restrict that to “byte” strings). But I can see that your approach would avoid converting our own strings to .NET Strings by using .NET Strings as the underlying storage, which eliminates most of the performance problems I feared.

Additionally, you mentioned the risk of breaking existing code, so maybe there could be an option to switch back to the old behavior…

I’m not in a position to tell others how to spend their time, but my suggestion is: if you do not have a specific incentive to fix this problem in the generic way for Python 2.7, I think the time is better spent working on Python 3 support in IronPython (as the world should migrate to Python 3 anyway ☺).

Best regards

Markus Schaber

From: Ironpython-users [mailto:ironpython-users-bounces+m.schaber=codesys.com at python.org] On Behalf Of Pawel Jasinski
Sent: Saturday, September 17, 2016 12:14 PM
To: ironpython-users at python.org
Subject: [Ironpython-users] string/unicode

hi,
I have noticed that the long-standing string/unicode subject surfaced again in chat (#1414).
For a long time I was convinced that jython uses the same strategy as ironpython with regard to str/unicode aliasing. There was hope to get cpython compatibility at a similar level to jython (e.g. django works under jython).
Thanks to Kuno (see https://github.com/IronLanguages/main/pull/1331), I was corrected. jython made a change in 2.5, so str and unicode are distinct types. I believe this change alone made a big difference with respect to their cpython compatibility.


Now the question is what stops us from doing the same?
We could have a distinctive type entry for str and for unicode, but keep the existing implementation, which uses a .net string as storage for byte strings.
The arguments passed to .net will be mapped exactly as they are today.
The results coming back from .net would surface as unicode (which they are).
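
A rough sketch of the idea, as a plain Python 3 model rather than actual IronPython internals. The built-in str stands in for a .net string, and every name here (NetBackedString, ByteStr, UnicodeStr, to_net, from_net, describe) is hypothetical and only for illustration. It also assumes that a byte string keeps one byte value per character of the underlying storage.

```python
# Illustrative model only: the built-in str stands in for a .net string.
# None of these class or function names exist in IronPython.

class NetBackedString(object):
    """Common base: keeps the native string as the only storage."""

    def __init__(self, storage):
        if not isinstance(storage, str):
            raise TypeError("storage must be a native string")
        self._storage = storage

    def to_net(self):
        # Arguments passed to .net are handed over as the raw storage,
        # so no conversion or copy is needed at the call boundary.
        return self._storage


class ByteStr(NetBackedString):
    """Stands in for the Python 2 'str' type."""

    def __init__(self, storage):
        super().__init__(storage)
        # Assumption: each character of the storage holds one byte value.
        if any(ord(ch) > 0xFF for ch in storage):
            raise ValueError("byte string characters must be in 0..255")


class UnicodeStr(NetBackedString):
    """Stands in for the Python 2 'unicode' type."""


def from_net(value):
    # Results coming back from .net surface as unicode.
    return UnicodeStr(value)


def describe(value):
    # Type dispatch of the kind library code relies on; it only works
    # when the two string types are distinct.
    if isinstance(value, UnicodeStr):
        return "text"
    if isinstance(value, ByteStr):
        return "bytes"
    return "other"
```

With this model, describe(ByteStr("abc")) and describe(from_net("abc")) give different answers even though both objects hold the same native string, which is the property that the current aliasing cannot provide.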

Markus expressed concerns: "shipping our own string implementation would be somehow overkill, and could severely hurt performance when interfacing with .NET. I think it's not worth the effort for 2.x"
Markus, does your concern still apply if the implementation follows the "distinctive type entry" idea?

The positives I can see:
- able to use things out of PyPI without tweaking
- no patching of stdlib
- no time spent on investigating yet another bug report which turns out to be caused by the str/unicode alias. I would never have expected it to cause a stack overflow (#1414).
- chance to move forward with ironclad - the str/unicode aliasing is the biggest roadblock when trying to fix numpy integration

The negatives I am aware of:
- this has the potential to break existing ironpython code which already has a lot of handcrafted tweaks for str/unicode aliasing. These would have to be taken out.
- working on this would take resources away from working on 3

I am sure that, as usual, I am overlooking and/or trivializing something, so please speak up.

cheers,
--pawel
