Benefits of unicode identifiers (was: Allow additional separator in identifiers)

Fri Nov 24 02:03:02 EST 2017

On Fri, Nov 24, 2017 at 2:52 PM, Mikhail V <mikhailwas at gmail.com> wrote:
> On Fri, Nov 24, 2017 at 4:13 AM, Chris Angelico <rosuav at gmail.com> wrote:
>> On Fri, Nov 24, 2017 at 1:44 PM, Mikhail V <mikhailwas at gmail.com> wrote:
>>> From my above example, you could probably see that I prefer somewhat
>>> middle-sized identifiers, one-two syllables. And naturally, they tend to
>>> reflect some process/meaning, it is not always achievable,
>>> but yes there is such a natural tendency, although by me personally
>>> not so strong, and quite often I use totally meaningless names,
>>> mainly to avoid visual similarity to already created names.
>>> So for very expanded names, it ends up with a lot of underscores :(
>>
>> Okay. So if it makes sense for you to use English words instead of
>> individual letters, since you are fluent in English, does it stand to
>> reason that it would make sense for other programmers to use Russian,
>> Norwegian, Hebrew, Korean, or Japanese words the same way?
>
> I don't know. Probably, especially if those *programmers* don't know Latin
> letters, then they would want to write code with their letters and their
> language. This target group, as I said, will have a really hard time
> with programming,
> and in Python in particular, because they will be not only forced to learn
> some English, but also will have all the 'pleasures' of multi-script editing.
> But wait, probably one can write python code in, say Arabic script *only*?
> How about such feature proposal?

If Python supports ASCII identifiers only, people have no choice but
to transliterate. As it is, people get to choose which is better for
them - to transliterate or not to transliterate, that is the
readability question.
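
For illustration, here is a minimal sketch of what that choice looks like
in Python 3, which accepts non-ASCII identifiers (PEP 3131). All the names
are made up for the example: `цена` is Russian for "price", `tsena` is
its transliteration, and `итого` ("total") is a hypothetical function.

```python
# Illustrative only: the same value under a native-script name and
# under an ASCII transliteration of it.
цена = 10    # "price", written in Cyrillic
tsena = 10   # the same word transliterated to ASCII


def итого(количество, цена_за_штуку):
    """Return quantity times unit price (hypothetical example function)."""
    return количество * цена_за_штуку


print(итого(3, цена))          # 30
print("цена".isidentifier())   # True: a valid Python 3 identifier
```

Both spellings work; which one reads better depends on who is going to
read the code.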

> As for non-English speakers who know some English already, they
> could of course want to include identifiers in those scripts.
> But how about libraries?

If you want to use numpy, you have to understand the language of
numpy. That's a lot of technical jargon, so even if you understand
English, you have to learn that. So there's ultimately no difference.

The most popular libraries, just like the standard library, are
usually going to choose to go ASCII-only as the
lowest-common-denominator. But that is, again, their own choice.

> Ok, so we return to my original question: apart from the
> ability to do so, how beneficial is it on a pragmatic basis?
> I mean, e.g. Cyrillic will introduce homoglyph issues.
> CJK and Arabic scripts are metrically and optically incompatible with
> Latin, so such mixing will end up with a messy look. So just for
> the experiment, yes, it's fun.

Does it really introduce homoglyph issues in real-world situations,
though? Are there really cases where people can't figure out from
context what's going on? I haven't seen that happening. Usually there
are *entire words* (and more) in a single language, making it pretty
easy to figure out.
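
If anyone wants to check their own code for this, a rough heuristic is
sketched below. It assumes that the first word of a character's Unicode
name (`LATIN`, `CYRILLIC`, ...) is a good enough stand-in for its script,
which is an approximation and not a proper script lookup. `scripts_in`
and `is_mixed_script` are hypothetical helper names, not stdlib functions.

```python
import unicodedata

latin_a = "a"          # U+0061 LATIN SMALL LETTER A
cyrillic_a = "\u0430"  # U+0430 CYRILLIC SMALL LETTER A, looks the same

# The two letters are different characters, so they give different names.
print(latin_a == cyrillic_a)   # False


def scripts_in(name):
    """Approximate the scripts used in a name.

    Assumption: the first word of each letter's Unicode name stands in
    for its script. Non-letters (digits, underscores) are ignored.
    """
    return {unicodedata.name(ch).split()[0] for ch in name if ch.isalpha()}


def is_mixed_script(name):
    """Return True if the letters of a name come from more than one script."""
    return len(scripts_in(name)) > 1
```

A whole word in one script passes; a Latin word with a single Cyrillic
letter slipped in gets flagged, which is the suspicious case.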

ChrisA