[XML-SIG] Re: Issues with Unicode type

Fred L. Drake, Jr. fdrake@acm.org
Wed, 25 Sep 2002 16:40:42 -0400


Martin v. Loewis writes:
 > I think he does. However, he might come to the conclusion that we
 > would be better off if the Unicode type had not been added to the
 > language :-)

Ah, but the concern over backward compatibility will prevent him from
removing it.

 > I think there is also some unpleasant feeling about having to make
 > decisions without fully understanding all consequences, together with
 > the feeling that all people who give guidance lack full understanding
 > as well...

I think there are a couple of other issues, which perhaps you're
implying here:

- The addition of Unicode has proven to be quite invasive in the C
  code of the core; it has certainly affected more of that code than
  we expected (though others may have had more realistic expectations).

- The Unicode data type is quite contagious -- string operations that
  involve Unicode objects tend to produce Unicode objects even when
  all the data is ASCII, which carries a serious element of surprise.
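
A minimal sketch of the contagion described in the second point.  It
is a model rather than the interpreter's actual code: it is written
for a current Python, where bytes stands in for the 8-bit string type
and str stands in for the Unicode type, and concat_py2_style is a
hypothetical helper invented for the illustration.  It assumes the
default codec is ASCII.

```python
# Hypothetical model of the mixing rule: bytes plays the role of the
# 8-bit string type, str plays the role of the Unicode type.

def concat_py2_style(left, right):
    """Concatenate two strings, coercing mixed operands to Unicode."""
    if isinstance(left, bytes) and isinstance(right, bytes):
        # Both operands are 8-bit strings: the result stays 8-bit.
        return left + right
    # At least one operand is Unicode, so the 8-bit side is decoded with
    # the (assumed) default ASCII codec and the result is Unicode.
    if isinstance(left, bytes):
        left = left.decode("ascii")
    if isinstance(right, bytes):
        right = right.decode("ascii")
    return left + right


plain = concat_py2_style(b"abc", b"def")      # 8-bit + 8-bit
mixed = concat_py2_style(b"abc", "def")       # 8-bit + Unicode
chained = concat_py2_style(mixed, b"ghi")     # earlier result + 8-bit

print(type(plain).__name__, type(mixed).__name__, type(chained).__name__)
```

Every character here is ASCII, yet one Unicode operand is enough to
make the result, and everything later built from it, Unicode.  Under
this model an 8-bit operand containing non-ASCII bytes fails to decode
instead of being combined.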

The traditional view that text strings and byte strings are the same
certainly fosters a reactionary position with regard to Unicode.  I
don't know how to work around this historical baggage without calling
it "Python3K" (doesn't that sound a lot like "MST3K"? ;).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation