why isn't Unicode the default encoding?

Robert Kern robert.kern at gmail.com
Mon Mar 20 15:27:36 EST 2006


John Salerno wrote:
> Robert Kern wrote:
> 
>>Well, *I* use UTF-8, but that's neither here nor there.
> 
> I see UTF-8 a lot, but this particular book also mentions that UTF-16 is 
> the most common. Is that true?

I think it unlikely, but I have no numbers to give. And I'll bet that that book
doesn't either.

>>>Why can't Unicode replace them so we no longer need the 'u' 
>>>prefix or the encoding tricks?
>>
>>It would break a hell of a lot of code. Try using the -U command line argument
>>to the Python interpreter. That makes unicode strings default.
> 
> I figured this might have something to do with it, but then again I 
> thought that Unicode was created as a subset of ASCII and Latin-1 so 
> that they would be compatible...but I guess it's never that easy. :)

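No, it isn't. You seem to be somewhat confused about Unicode; at the least, you
are misusing terminology quite a bit. Unicode is a superset of ASCII and Latin-1,
not a subset of them, and it is a character set rather than an encoding like
UTF-8 or UTF-16. You may want to read the following articles: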

  http://www.joelonsoftware.com/articles/Unicode.html
  http://effbot.org/zone/unicode-objects.htm
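
As a rough illustration of the distinction those articles draw, here is a small
sketch. It assumes Python 3, where every str is already a Unicode string (so no
'u' prefix is needed), and the variable names are made up for the example. It
shows one string with a single sequence of code points producing different bytes
under UTF-8, UTF-16 and Latin-1, and checks that ASCII and Latin-1 byte values
map onto the Unicode code points with the same numbers:

```python
# Assumption: Python 3, where str holds Unicode code points, not bytes.
text = "caf\u00e9"  # 'cafe' with an acute accent: three ASCII letters plus U+00E9

# Code points are abstract numbers, independent of any byte encoding.
code_points = [ord(ch) for ch in text]

# Each encoding turns the same code points into a different byte sequence.
utf8_bytes = text.encode("utf-8")       # U+00E9 takes two bytes here
utf16_bytes = text.encode("utf-16-le")  # two bytes per character, no BOM
latin1_bytes = text.encode("latin-1")   # one byte per character

# ASCII and Latin-1 are subsets of Unicode: byte value N decodes to code point N.
ascii_ok = all(bytes([i]).decode("ascii") == chr(i) for i in range(128))
latin1_ok = all(bytes([i]).decode("latin-1") == chr(i) for i in range(256))

print(code_points)
print(utf8_bytes, utf16_bytes, latin1_bytes)
print(ascii_ok, latin1_ok)
```

The byte strings differ in length (5, 8 and 4 bytes respectively) even though the
text is the same, which is why "Unicode" alone does not name an encoding.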

-- 
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco



