case-sensitivity (was Re: True, False, None)

Wed Nov 12 11:03:42 EST 2003

Michele Simionato wrote:
   ...
>> Why is staticmethod "a class" to you while int presumably isn't?
> 
> Actually, I would prefer "Int" over "int" ;)

A consistent preference -- which will never be satisfied, of course.

So, when I used to have a factory function (as 'int' was), and change
it into a type (or class, same thing), I should rename it and break all
existing programs that would otherwise keep working just fine?  Or
would you prefer to bloat the built-in namespace (or module namespace
where such refactoring is going on) by having both 'int' _and_ 'Int'?

I consider such a desire to distinguish callables that are for the
most part quite polymorphic, such as types/classes on one hand and
factory functions on the other, quite misplaced.  Python is all about
signature-based polymorphism: why should we sacrifice the wonders of
this on the altar of "Capitalization"?!
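
A minimal sketch of that polymorphism, assuming two hypothetical names
(Point, a class, and make_point, a factory function) that come from no
real library: the client code only cares that it can call the thing
with the same signature.

```python
# Hypothetical names: Point (a class) and make_point (a factory function).
# Neither comes from a real library; they only illustrate the argument.

class Point:
    """A class: calling it builds an instance."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def make_point(x, y):
    """A factory function with the same call signature as Point."""
    return Point(x, y)


def build_all(factory, pairs):
    """Client code: it only needs `factory` to be callable as factory(x, y)."""
    return [factory(x, y) for x, y in pairs]


pairs = [(0, 0), (1, 2)]
from_class = build_all(Point, pairs)
from_function = build_all(make_point, pairs)

# Either callable yields the same kind of objects.
print([(p.x, p.y) for p in from_class])
print([(p.x, p.y) for p in from_function])
```

Swapping one callable for the other requires no change in build_all,
which is why renaming on such a refactoring buys nothing.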

> saying "classes are capitalized". Of course, I am not proposing to
> change anything. It is a very minor annoyance I can very well live
> with.

To me, case sensitivity is somewhat more than a minor annoyance,
though I still live with it because the case-insensitive languages
I know of (such as Lisp) have many more things I don't really like.

> I never understood why you advocate case insensitivity. I would expect
> case insensitivity to be a source of bugs: for instance, identifiers
> with similar names could be confused (e.g. myfunction vs. myFunction):
> one would expect them to be different (at least people coming from
> case insensitive languages) whereas they would be the same.

People coming from (an exclusive diet of) case _sensitive_ (not, as you
say, INsensitive!) languages would surely expect these identifiers to
be separate, but there are enough people coming from case INsensitive
languages (Pascal, Fortran, Basic, Lisp, ...) to basically even this
factor out.  One factor that makes me prefer insensitivity is that
people who meet Python as their _first_ programming language quite
rightly see case-sensitivity as just one more hassle being thrown at
them, since in real life people aren't really case-sensitive.  E.g.,
intel's trademark is all-lowercase, but on the website they keep
referring to themselves as uppercase-I Intel and nobody's confused;
hostnames and protocols in URLs are case-insensitive, too, so people
don't particularly associate computers with case-sensitivity;
filenames are case-insensitive as well in Windows, the most widespread
OS, and in MacOS, widely considered the most user-friendly one (in the
default filesystem, although since it has Unix underneath you can use
a case-sensitive FS on it if you deliberately go for it); etc, etc,
ad nauseam.

But people's expectations when they first meet Python are only a
part of it.  More relevant is, how usable are the two possibilities?
Case sensitivity means you must basically memorize several more bits
to go with each name -- for no good reason whatsoever.  You must
remember that module FCNTL has an all-uppercase name, htmllib all-lower,
cStringIO weirdly mixed, mimetypes lower, MimeWriter mixed, etc, etc --
totally wasted mnemonic effort.  Then you get into the single modules
for more of the same -- unless you know what is conceptualized as
"a class" vs "a type" vs "a function", memorization's your only hope,
and if you DO know, it's still learning by rote, incessantly (ah yes,
class dispatcher in module asyncore, that's LOWER-case; ah yes, the
'error' exception class, that's lowercase in sunaudiodev, it's
lowercase in socket, and in anydbm, and thread -- it's uppercase Error
in sunau, and also in shutil, and multifile, and binhex ...).  And
functions are supposed to start lowercase?  Yeah right, look at
imaplib and weep.  Or stat.  Or token...  And you think your troubles
are over once you've got the casing of the FIRST letter right?  HA!
E.g., would the letter 'f' in the word 'file' be uppercased or not
when it occurs within a composite word?  Take your pick...
shelve.DbfilenameShelf, zipfile.BadZipfile, zipfile.ZipFile,
mimify.HeaderFile, ...

Basically, you end up looking all of these things up -- again and
again and again -- for no good reason.  Case-sensitivity inevitably
causes that, because people sometimes think of e.g. "zipfile" as ONE
word, sometimes as two, so they uppercase the 'f' or not "wantonly".

Some will inevitably say that's just the fault of the human beings
who choose each of these many names -- case sensitivity as an abstract
ideal is pristine and perfect.  To which, my witty repartee is "Yeah,
right".  When you present me a language whose entire libraries have
been written by superhumanly perfect beings my attitude to case
sensitivity may change.  Until you do, I'll surmise that _most_ such
languages and libraries ARE going to be written by humans, and there
really is no added value in me having to memorize or constantly look
up the capitalization of all of these names -- misspellings are bad
enough (and fortunately are generally acknowledged as mistakes, and
fixed, when found, which isn't the case for capitalization issues).

Moreover, many of the distinctions you're supposed to be drawing
with this precious capitalization, allegedly worth making me and a
zillion learners suffer under silly gratuitous mnemonic load, are
distinctions I'd much rather *NOT* see, such as ones between types
and classes (traditionally), or types/classes and factory functions.
Some factory functions get capitalized, like threading.RLock, because
it's "sorta like a class"; some don't, because, hey, it's a function.
More useless distinction and more memorization.

But at least constants, I hear some claim?  THOSE are invariably
ALWAYS spelled in all-uppercase...?

"Sure", say I, "just like math.pi"...

> Also, now I can write
> 
> ONE=1
> 
> and
> 
> def one(): return 1
> 
> and it is clear that the first name (ONE) refers to a constant
> whereas the second name (one) refers to something which is not a constant.

Except that it is, unless your design intention (unlikely though
possible) is that the user will rebind name 'one' in your module
to indicate some other function object.  Unless such is your meaning,
names 'one' and 'ONE' are "constants" in exactly the same sense: you
do not intend those names to be re-bound to different objects.  One
of the objects is callable, the other is not, but that's quite another
issue.  One is technically immutable -- the other one isn't (you can
set arbitrary attributes on it) but it IS hashable (the attributes
don't enter into the hash(one) computation) which is more often than
not the key issue we care about when discussing mutability.  So,
what IS "a constant" in Python?  If you wanted to bind a name (asking
implicitly that it never be re-bound) to an "indisputably mutable" (not
hashable) object, how would you capitalize that?  Python itself does
not seem to care particularly.  E.g.:

>>> def f(): pass
...
>>> f.func_name = 'feep'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: readonly attribute
>>> f.func_defaults = ()

I can rebind f.func_defaults, NOT f.func_name -- but they have exactly
the same capitalization.  So, no guidance here...
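
A rough Python 3 rendition of the same two points, as a sketch only.
The attribute names __defaults__ and __globals__ are the Python 3
spellings and are a substitution made for this example, since
func_name and func_defaults belong to the Python 2 of the session
above; the name 'note' is an arbitrary made-up attribute.

```python
def one():
    return 1


ONE = 1

# Both names are bound to hashable objects, so both work as dict keys.
table = {ONE: "the integer", one: "the function"}

# A function accepts arbitrary attributes, yet its hash does not change.
hash_before = hash(one)
one.note = "attributes do not enter into the hash"  # made-up attribute
hash_after = hash(one)
print(hash_before == hash_after)

# Assumption: Python 3 attribute names. __defaults__ can be rebound,
# while __globals__ is read-only, and nothing in the spelling says so.
one.__defaults__ = ()
try:
    one.__globals__ = {}
except AttributeError as exc:
    rebind_error = exc
else:
    rebind_error = None
print(type(rebind_error).__name__)
```

The two dunder attributes are capitalized identically, so the spelling
still offers no hint about which one may be rebound.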

> In a case insensitive language I am sure I would risk overriding
> constants with functions.

Why is that any more likely than 'overriding' (e.g.) types (in the old
convention which makes them lowercase) or "variables" (names _meant_
to be re-bound, but not necessarily to functions)?  And if you ever use
one-letter names, is G a class/type, or is it a constant?

I consider the desire to draw all of these distinctions by lexical
conventions quite close to the concepts of "hungarian notation", and
exactly as misguided as those.

> When I learned C (coming from Basic and Pascal)
> I thought case sensitivity was a good idea, why don't you think so?

When I learned C (coming from a lot of languages, mostly case
insensitive) I thought case sensitivity was a ghastly idea, and I
still do.  Much later I met the concept of "case _preservation_" --
an identifier is to be spelled with the same case throughout, and
using a different casing for it is either impossible or causes an
error -- and THAT one is a concept I might well accept if tools did
support it well (but I suspect it's more suitable for languages
where there are declarations, or other ways to single out ONE
spelling/capitalization of each identifier as the "canonical, mandated" 
one -- I'm not sure how I'd apply that to Python, for example).
As long as I could input, e.g., simplexmlrpcserver and have the
editor (or whatever tool) change it to SimpleXMLRPCServer (or
whatever other SpelLing they've chOseN) I wouldn't mind as much.
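
What such a tool might do can be sketched in a few lines.  This is only
an illustration under stated assumptions: the list CANONICAL_NAMES and
the function canonicalize are hypothetical, and a real editor would
have to collect the canonical spellings from the modules it knows.

```python
# Hypothetical canonical spellings; a real tool would gather these
# from the modules or declarations it knows about.
CANONICAL_NAMES = ["SimpleXMLRPCServer", "cStringIO", "MimeWriter", "htmllib"]

# Map each lowercased spelling to its one canonical form.
_BY_LOWER = {name.lower(): name for name in CANONICAL_NAMES}


def canonicalize(typed):
    """Return the canonical spelling for `typed`, or `typed` unchanged
    when the name is not known."""
    return _BY_LOWER.get(typed.lower(), typed)


print(canonicalize("simplexmlrpcserver"))
print(canonicalize("CSTRINGIO"))
print(canonicalize("unknown_name"))
```

The sketch assumes each lowercased spelling maps to exactly one
canonical name, which is the very property case preservation needs.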

> (you are free to point me to old posts if this, as I suspect, has
> been debated to death already ;)

Oh, this isn't the first round, but it isn't the second one, either.
It's essentially a discussion on how languages should be designed,
rather than anything that could ever concretely change in Python:
the very existence of "indistinguishable except for spelling" names
such as fileinput and FileInput, random and Random, etc, etc, makes
it certain that this characteristic can never be changed.
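
Such collisions are easy to find mechanically.  A small sketch, where
the helper name case_collisions is made up for this example, groups the
names in a namespace by their lowercased spelling and keeps the groups
with more than one member:

```python
from collections import defaultdict
import random


def case_collisions(names):
    """Group names that differ only by capitalization."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    # Keep only the groups with more than one distinct spelling.
    return {key: sorted(spellings)
            for key, spellings in groups.items() if len(spellings) > 1}


# Module random exposes both a function random and a class Random.
collisions = case_collisions(dir(random))
print(collisions.get("random"))
```

Any name that shows up in such a group would become ambiguous the
moment the language stopped distinguishing case.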

Perhaps it's exactly because the discussion is totally moot, that
it keeps getting hot each time it's vented (wanna bet...?-).

Alex