Python usage numbers

Roy Smith roy at panix.com
Sun Feb 12 17:27:34 EST 2012


In article <mailman.5739.1329084873.27778.python-list at python.org>,
 Chris Angelico <rosuav at gmail.com> wrote:

> On Mon, Feb 13, 2012 at 9:07 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> > The situation before ascii is like where we ended up *before* unicode.
> > Unicode aims to replace all those byte encoding and character sets with
> > *one* byte encoding for *one* character set, which will be a great
> > simplification. It is the idea of ascii applied on a global rather than
> > local basis.
> 
> Unicode doesn't deal with byte encodings; UTF-8 is an encoding, but so
> are UTF-16, UTF-32, and as many more as you could hope for. But
> broadly yes, Unicode IS the solution.

I could hope for one and only one, but I know I'm just going to be 
disappointed.  The last project I worked on used UTF-8 in most places, 
but also used some C and Java libraries which were only available for 
UTF-16.  So it was transcoding hell all over the place.
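
To illustrate the kind of transcoding involved, here is a minimal Python 3
sketch. The helper names are hypothetical and not taken from the project
mentioned above, and the choice of little-endian UTF-16 without a BOM is
only an assumption about what such a library might expect.

```python
# Hypothetical helpers: convert between UTF-8 and UTF-16 byte strings
# by decoding to a Python str and re-encoding.

def utf8_to_utf16(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as UTF-16 little-endian (no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 little-endian bytes (no BOM) as UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


# Round trip: every crossing of a library boundary costs a decode
# plus an encode, but the text itself survives unchanged.
original = "na\u00efve caf\u00e9".encode("utf-8")
round_tripped = utf16_to_utf8(utf8_to_utf16(original))
```

Each call is cheap on its own; the pain comes from having to do this at
every boundary between the UTF-8 side and the UTF-16 side.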

Hopefully, we will eventually reach the point where storage is so cheap 
that nobody minds how inefficient UTF-32 is and we all just start using 
that.  Life will be a lot simpler then.  No more transcoding, a string 
will take exactly four bytes per character, and everybody will be happy 
again.
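
A quick way to see the trade-off is to compare encoded sizes. The sketch
below assumes Python 3.3 or later, where len() on a str counts code points;
encoded_sizes is a hypothetical helper, and the "-le" codec variants are
used only so that no BOM is included in the byte counts.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes of text under each UTF form."""
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return {codec: len(text.encode(codec)) for codec in codecs}


# ASCII text, a Latin-1 letter, and a code point outside the BMP.
for sample in ("abc", "n\u00e9", "\U0001F600"):
    sizes = encoded_sizes(sample)
    # UTF-32 is fixed width: always four bytes per code point.
    assert sizes["utf-32-le"] == 4 * len(sample)
    print(repr(sample), len(sample), sizes)
```

UTF-8 and UTF-16 sizes vary with the content, while the UTF-32 size is
always a fixed multiple of the code point count, which is what makes
indexing trivial and storage expensive.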


