Python usage numbers

Roy Smith roy at panix.com
Sun Feb 12 10:48:36 EST 2012


In article <4f375347$0$29986$c3e8da3$5496439d at news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:

> ASCII truly is a blight on the world, and the sooner it fades into 
> obscurity, like EBCDIC, the better.

That's a fair statement, but it's also fair to say that at the time it 
came out (49 years ago!) it was a revolutionary improvement on the 
extant state of affairs (every manufacturer inventing their own code, 
and often different codes for different machines).  Given the cost of 
both computer memory and CPU cycles at the time, sticking to a 7-bit 
code (the 8th bit was for parity) was a necessary evil.

As Steven D'Aprano pointed out, it was missing some commonly used US 
symbols such as ¢ or ©.  This was a small price to pay for the 
simplicity ASCII afforded.  It wasn't a bad encoding.  It was a very good 
encoding.  But the world has moved on and computing hardware has become 
cheap enough that supporting richer encodings and character sets is 
realistic.

And, before people complain about the character set being US-Centric, 
keep in mind that the A in ASCII stands for American, and it was 
published by what is now ANSI (whose A also stands for American).  I'm not trying to 
wave the flag here, just pointing out that it was never intended to be 
anything other than a national character set.

Part of the complexity of Unicode is that when people switch from 
working with ASCII to working with Unicode, they're really having to 
master two distinct things at the same time (and often conflate them 
into a single confusing mess).  One is the Unicode character set.  The 
other is a specific encoding (UTF-8, UTF-16, etc.).  Not to mention silly 
things like BOM (Byte Order Mark).  I expect that some day, storage 
costs will become so cheap that we'll all just be using UTF-32, and 
programmers of the day will wonder how their poor parents and 
grandparents ever managed in a world where nobody quite knew what you 
meant when you asked, "how long is that string?".
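
To make the character set vs. encoding distinction concrete, here is a 
minimal sketch, assuming Python 3 (where str holds code points and 
bytes holds an encoded form).  The sample string and the helper name 
encoded_lengths are made up for illustration.  It asks "how long is 
that string?" several ways and gets several answers:

```python
import codecs

# "café ©": six code points, two of them (é and ©) outside ASCII.
text = "caf\u00e9 \u00a9"

def encoded_lengths(s):
    """Return the code point count and the byte length under several encodings."""
    return {
        "code points": len(s),                        # length in the character set
        "utf-8": len(s.encode("utf-8")),              # 1 to 4 bytes per code point
        "utf-16-le": len(s.encode("utf-16-le")),      # 2 or 4 bytes per code point
        "utf-32-le": len(s.encode("utf-32-le")),      # always 4 bytes per code point
        "utf-16 (with BOM)": len(s.encode("utf-16")), # native order, BOM prepended
    }

lengths = encoded_lengths(text)
for name, n in lengths.items():
    print(f"{name:>18}: {n}")

# The plain "utf-16" codec writes a byte order mark in front of the data.
has_bom = text.encode("utf-16").startswith(codecs.BOM_UTF16)
print("starts with BOM:", has_bom)
```

For this string the answers are 6 code points, 8 bytes in UTF-8, 12 in 
UTF-16, 24 in UTF-32, and 14 once the two-byte BOM is included.  Only 
in UTF-32 is the byte count a fixed multiple of the code point count.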


