python 2.7.12 on Linux behaving differently than on Windows

Chris Angelico rosuav at gmail.com
Thu Dec 8 19:55:44 EST 2016


On Fri, Dec 9, 2016 at 10:19 AM, BartC <bc at freeuk.com> wrote:
> I get this (although I suspect Thunderbird will screw up the tabs); the code I used follows at the end:

Actually it came through just fine. Although, point of note: the case
conversions of individual characters are not the same as the case
conversions of the whole string.
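
A minimal sketch of that point, assuming Python 3 (Python 2.7's unicode
type may behave differently). The Greek sample word is only an
illustrative choice: lowercasing the whole string can pick the word-final
form of sigma, which a character-by-character conversion has no context
to do.

```python
# Assumes Python 3. The sample word is an illustrative choice.
word = "\u03a3\u0391\u03a3"  # GREEK CAPITAL LETTERS SIGMA, ALPHA, SIGMA

whole = word.lower()                            # sees the whole string
per_char = "".join(ch.lower() for ch in word)   # each character on its own

print(whole == per_char)                        # False
print(ascii(whole), ascii(per_char))
```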

> As I said some characters have ill-defined upper and lower case conversions,
> even if some aren't as esoteric as I'd thought.

Can you show me any character with ill-defined conversions?

> In English however the conversions are perfectly well defined for A-Z and
> a-z, while they are not meaningful for characters such as space, and for
> digits.

Correct; caseless characters don't change. There's no concept of
"upper-case 5", and no, it isn't "%", despite people who think that
"upper-case" means "hold shift" :)

> In English such conversions are immensely useful, and it is invaluable for
> many purposes to have upper and lower case interchangeable (for example, you
> don't have separate sections in a dictionary for letters starting with A and
> those starting with a).

That's not quite the same; that's a *collation order*. And some
dictionaries will sort them all in together, but others will put the
uppercase before the lowercase (i.e. sorting "AaBbCcDd"). It's not that
they're considered identical; it's that they're placed near each other
in the sort. In fact, it's possible to have collations without
conversions - for instance, it's common for McFoo and MacFoo to be
sorted together in a list of names.
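
To make the distinction concrete, here is a rough sketch in Python 3.
The first sort key keeps upper and lower case adjacent without treating
them as equal. The second, name_key, is a hypothetical helper invented
for this example (not a real collation library) that files "Mc" names
alongside "Mac" names without converting anything in the data itself.

```python
letters = ["b", "A", "d", "C", "a", "D", "c", "B"]

# Key on the lowercased form first, then on the original, so that upper
# and lower case land next to each other but are never considered equal.
interleaved = sorted(letters, key=lambda s: (s.lower(), s))
print("".join(interleaved))  # AaBbCcDd


def name_key(name):
    """Hypothetical collation key: sort 'Mc...' as if spelled 'Mac...'."""
    folded = name.lower()
    if folded.startswith("mc"):
        folded = "mac" + folded[2:]
    return folded


names = ["MacDonald", "Mabel", "McAdam", "Madison"]
by_codepoint = sorted(names)            # plain code point order
collated = sorted(names, key=name_key)  # names themselves are unchanged
print(by_codepoint)
print(collated)
```

The key only affects ordering; every string comes out of the sort
exactly as it went in.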

> So it is perfectly possible to have case conversion defined for English,
> while other alphabets can do what they like.

Aaaaand there we have it. Not only do you assume that English is the
only thing that matters, you're happy to give the finger to everyone
else on the planet.

> It is a little ridiculous however to have over two thousand distinct files
> all with the lower-case normalised name of "harry_potter".

It's also ridiculous to have hundreds of files with unique two-letter
names, but I'm sure someone has done it. The file system shouldn't
stop you, any more than it should stop you from having "swimmer" and
"swirnrner" in the same directory (in some fonts, they're
indistinguishable).
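
For what it's worth, the "over two thousand" figure in the quoted text is
presumably 2**11 = 2048: "harry_potter" has eleven letters, each of which
can independently be upper or lower case, and the underscore has no case.
A quick sketch (Python 3) that enumerates the variants rather than
trusting the arithmetic:

```python
from itertools import product

name = "harry_potter"

# For each character, its distinct case variants (just one if caseless).
choices = [sorted({ch.lower(), ch.upper()}) for ch in name]
variants = {"".join(combo) for combo in product(*choices)}

print(len(variants))                             # 2048
print(all(v.lower() == name for v in variants))  # True
```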

> What were we talking about again? Oh yes, belittling me because I work with
> Windows!

Or because you don't understand anything outside of what you have
worked with, and assume that anything you don't understand must be
inferior. It's not a problem to not know everything, but you
repeatedly assert that other ways of doing things are "wrong", without
acknowledging that these ways have worked for forty years and are
strongly *preferred* by myriad people around the world. I grew up on
OS/2, using codepage 437 and case-insensitive file systems, and there
was no way for me to adequately work with other Latin-script
languages, much less other European languages (Russian, Greek), and
certainly I had no way of working with non-alphabetic languages
(Chinese, Japanese). Nor right-to-left languages (Hebrew, Arabic). And
I didn't know anything about the Unix security model, with different
processes being allowed to do different things. Today, the world is a
single place, not lots of separate places that sorta kinda maybe
grudgingly talk to each other. We aren't divided according to "people
who use IBMs" and "people who use DECs". We shouldn't be divided
according to "people who use CP-850" and "people who use
Windows-1253". The tools we have today are capable of serving everyone
properly. Let's use them.
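
As a small illustration of the code page problem, sketched in Python 3:
the byte value and the sample text below are arbitrary choices made for
this example. The same byte decodes to different characters depending on
which legacy code page is assumed, while a single Unicode string can hold
several scripts at once and survive a UTF-8 round trip.

```python
raw = b"\xe1"  # one byte above the ASCII range (arbitrary choice)

# The same byte means different things under different code pages.
for codec in ("cp437", "cp850", "cp1253"):
    print(codec, ascii(raw.decode(codec)))

# Latin, Greek, Cyrillic, Hebrew and Japanese text in one string.
mixed = ("caf\u00e9 \u03b1\u03b2\u03b3 \u0434\u0430 "
         "\u05e9\u05dc\u05d5\u05dd \u65e5\u672c")
roundtrip = mixed.encode("utf-8").decode("utf-8")
print(roundtrip == mixed)  # True

# A single legacy code page cannot represent all of it.
try:
    mixed.encode("cp437")
    fits_cp437 = True
except UnicodeEncodeError:
    fits_cp437 = False
print(fits_cp437)  # False
```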

ChrisA


