[beginner] What's wrong?

Wed Apr 6 15:56:26 EDT 2016

Rustom Mody wrote:

> On Sunday, April 3, 2016 at 5:17:36 PM UTC+5:30, Thomas 'PointedEars' Lahn
> wrote:
>> Rustom Mody wrote:
>> > When python went to full unicode identifiers it should have also added
>> > pragmas for which blocks the programmer intended to use -- something
>> > like a charset declaration of html.
>> > 
>> > This way if the programmer says "I want latin and greek"
>> > and then A and Α get mixed up well he asked for it.
>> > If he didn't ask then springing it on him seems unnecessary and
>> > uncalled for
>> 
>> Nonsense.
> 
> Some misunderstanding of what I said it looks
> [Guessing also from Marko's "...silly..."]

First of all, while bandwidth might not be precious anymore to some, free 
time still is.  So please trim your quotations to the relevant minimum, to 
the parts you are actually referring to, and summarize properly if 
necessary.  For if you continue this mindbogglingly stupid full-quoting, 
this is going to be my last reply to you for a long time.  You have been 
warned.

<https://www.netmeister.org/news/learn2quote.html>

> So here are some examples to illustrate what I am saying:
> 
> Example 1 -- Ligatures:
> 
> Python3 gets it right
> >>> ﬂag = 1
> >>> flag
> 1

Fascinating; confirmed with

| $ python3 
| Python 3.4.4 (default, Jan  5 2016, 15:35:18) 
| [GCC 5.3.1 20160101] on linux
| […]

I do not think this is correct, though.  Different Unicode code point 
sequences should, after normalization, still result in different symbols.
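
The behavior confirmed above can be reproduced without the parser.  A 
minimal sketch, assuming (as PEP 3131 states) that Python 3 converts 
identifiers to normalization form NFKC while parsing; the variable names 
are chosen for illustration only:

```python
import unicodedata

ligature_name = "\ufb02ag"   # U+FB02 LATIN SMALL LIGATURE FL, then "ag"
plain_name = "flag"          # four ASCII letters

# As code point sequences the two strings differ ...
print(ligature_name == plain_name)

# ... but NFKC (compatibility) normalization folds the ligature into "fl".
normalized = unicodedata.normalize("NFKC", ligature_name)
print(normalized == plain_name)

# NFC (canonical) normalization leaves the ligature untouched.
print(unicodedata.normalize("NFC", ligature_name) == ligature_name)
```

So whether “ﬂag” and “flag” name the same variable appears to depend on 
which normalization form the language applies to identifiers.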

> Whereas haskell gets it wrong:
> Prelude> let ﬂag = 1
> Prelude> flag
> 
> <interactive>:3:1: Not in scope: ‘flag’
> Prelude> ﬂag
> 1
> Prelude>

I think Haskell gets it right here, while Py3k does not.  The “ﬂ” is not to 
be decomposed to “fl”.

> Example 2 Case Sensitivity
> Scheme¹ gets it right
> 
> > (define a 1)
> > A
> 1
> > a
> 1

So Scheme is case-insensitive there.  So is (Visual) Basic.  That does not 
make it (any) better.

> Python gets it wrong
> >>> a=1
> >>> A
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'A' is not defined

This is not wrong; it is just different.  And given that identifiers 
starting with uppercase ought to be class names in Python (and other OOPLs 
that are case-sensitive there), and that a class name serves in constructor 
calls (in Python, instantiating a class is otherwise indistinguishable from 
a function call), it makes sense that the (maybe local) variable “a” should 
be different from the (probably global) class “A”.
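
A minimal sketch of that convention; the class “A” and its attribute 
“value” are hypothetical and serve only to mirror the example:

```python
class A:
    """Hypothetical class, named only to mirror the quoted example."""

    def __init__(self, value):
        self.value = value


# Instantiating the class is written exactly like a function call.
a = A(1)

# Because identifiers are case-sensitive, "A" still refers to the class
# and "a" to the instance; neither shadows the other.
print(isinstance(a, A))
print(a is A)
print(a.value)
```

In a case-insensitive language the assignment would have rebound the class 
name itself.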

> [Likewise filenames windows gets right; Unix wrong]

Utter nonsense.  Apparently you are blissfully unaware of how much grief it 
has caused WinDOS lusers and users alike over the years that Micro$~1 
decided in their infinite wisdom that letter case was not important.

Example: In contrast to previous versions, FAT32 supports long filenames 
(VFAT).  Go try changing a long filename from uppercase (“Really Long 
Filename.txt”) to partial lowercase (“Really long filename.txt”).  It does 
not work, you get an error, because the underlying “short filename” 
(“REALLY~1.TXT”) is the same, as it has to be case-insensitive for 
backwards compatibility.  First you have to rename the file so that its name 
results in a different “short filename” (“REALLY~2.TXT”).  Then you have to 
rename it again to get the proper letter case (by which the “short filename” 
might either become “REALLY~1.TXT” again or “REALLY~3.TXT”).

> Unicode Identifiers in the spirit of IDN homograph attack.
> Every language that 'supports' unicode gets it wrong

NAK, see above.

> Python3
> >>> A=1
> >>> Α
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'Α' is not defined
> >>> A
> 1
> 
> Can you make out why A both is and is not defined?

Fallacy.  “A” is _not_ both defined and not defined.  There is only one “A”.

However, given the proper font, I might see at a glance what is wrong there.  
In fact, in my Konsole[tm] where the default font is “Courier 10 Pitch” I 
clearly see what is wrong there.  “A” (U+0041 LATIN CAPITAL LETTER A) is 
displayed using that serif font where the letter has a serif to the left at 
cap height and serifs left and right on the baseline, while “Α” (U+0391 
GREEK CAPITAL LETTER ALPHA) is displayed using a sans-serif font, where also 
the cap height is considerably higher.
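
Where the font does not give it away, the code points do.  A minimal 
sketch using the standard “unicodedata” module; the helper “describe” is 
hypothetical, not something from this thread:

```python
import unicodedata


def describe(identifier):
    """Return one "U+XXXX NAME" string per character of the identifier."""
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))
            for ch in identifier]


latin_a = "A"            # U+0041
greek_alpha = "\u0391"   # U+0391, looks like "A" in many fonts

print(describe(latin_a))
print(describe(greek_alpha))

# Unlike the ligature case, NFKC does not unify these two letters.
print(unicodedata.normalize("NFKC", greek_alpha) == latin_a)
```

Feeding a suspicious identifier through such a helper shows which script 
each letter comes from, independent of how the terminal renders it.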

> When the language does not support it eg python2 the behavior is better

NAK.  Being able to use Unicode strings verbatim in a program without having 
to declare them is infinitely useful.  Unicode identifiers appear to be 
merely a (happy?) side effect of that.

> The notion of 'variable' in programming language is inherently based on
> that of 'identifier'.

ACK.

> With ASCII the problems are minor: Case-distinct identifiers are distinct
> -- they dont IDENTIFY.

I do not think this is a problem.

> This contradicts standard English usage and practice 

No, it does not.  English distinguishes between proper *nouns* and proper 
*names* (the latter can be the former).  For example, “Wednesday”, 
regardless of where it occurs in a sentence, is an English word, a proper 
*name*; by contrast, “wednesday” is not only neither a proper noun nor a 
proper name; it is not a proper English *word* in the first place.  “i” 
might be the imaginary unit or a marketing abbreviation for “internet” [1]; 
“I” is (AFAIK) *only* the English pronoun for referring to oneself.

[1] <https://en.wikipedia.org/wiki/IMac#History>

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.