[I18n-sig] Unicode strings: an alternative

Walter Dörwald walter.doerwald@catsystems.de
Fri, 05 May 2000 18:42:07 +0200


Guido van Rossum wrote:

> [...]
> > Will this ASCII restriction only be enforced when converting
> > to Unicode, or will the string type itself be restricted to
> > ASCII?
> 
> No, 8-bit strings will always be 8-bit clear, of course!  The ASCII
> restriction is only used for conversion to Unicode when no explicit
> encoding is given.  For example, "abc" + u"xyz" is u"abcxyz", but "èé"
> + u"xyz" raises an exception.

Which has to be considered an "artificial" overflow error, an error
that is raised because of the value of some object.
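
As a rough illustration of this value dependence, here is a minimal sketch in present-day Python 3, where `bytes` stands in for the 8-bit string type under discussion (an assumption; the proposal itself concerns the string types of the time). The helper name `concat_with_unicode` is hypothetical and only mimics the implicit conversion described above.

```python
def concat_with_unicode(raw: bytes, text: str, encoding: str = "ascii") -> str:
    """Decode an 8-bit string with `encoding`, then append Unicode text.

    Hypothetical helper: it imitates the implicit ASCII conversion
    described in the quoted proposal; it is not part of any real API.
    """
    return raw.decode(encoding) + text


# Pure ASCII bytes convert without trouble.
print(concat_with_unicode(b"abc", "xyz"))

# Same types, different values: the non-ASCII bytes make the call fail.
try:
    concat_with_unicode(b"\xe8\xe9", "xyz")
except UnicodeDecodeError as exc:
    print("raised:", type(exc).__name__)

# With an explicit encoding the same bytes are accepted.
print(ascii(concat_with_unicode(b"\xe8\xe9", "xyz", "latin-1")))
```

The second call fails even though its argument types match the first call, which is the sense in which the error depends on the value.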

> However you can write
> unicode("èé","latin-1") and it will yield u"\350\351".

I would like to be able to change the default encoding on a global 
scale. So when my terminal and keyboard support latin-1 I want to be 
able to specify that str() and repr() return latin-1 strings.
The __str__ and __repr__ implemented by classes should return
Unicode strings, which are converted to the system global encoding
by Python.
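
A minimal sketch of what I have in mind, written in present-day Python 3 for the sake of a runnable example. The names `DEFAULT_ENCODING` and `to_system_bytes` are hypothetical, and the explicit helper only stands in for a conversion that Python itself would perform.

```python
# Hypothetical global setting; would be chosen to match the terminal.
DEFAULT_ENCODING = "latin-1"


class Person:
    """Example class whose __str__ returns a Unicode string."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __str__(self) -> str:
        return self.name


def to_system_bytes(obj, encoding=None) -> bytes:
    """Encode str(obj) using the global default unless one is given.

    Stands in for the conversion the interpreter would do implicitly.
    """
    return str(obj).encode(encoding or DEFAULT_ENCODING)


person = Person("D\u00f6rwald")
print(to_system_bytes(person))           # latin-1 bytes
print(to_system_bytes(person, "utf-8"))  # explicit override
```

The class never needs to know the terminal's encoding; only the global setting does.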

> [...]

Bye,
	Walter Dörwald