Python 2.1 and Unicode

Tue Jan 30 21:24:55 EST 2001

I've just been catching up with the new 2.1 release and there's lots of nice stuff in there.

However, I was disappointed not to find any improvements to the way Unicode is handled.

Currently we have to use constructs like this all over the place when dealing with values
from international databases:

	import types

	# Encode Unicode as Latin-1, dropping what doesn't fit; str() everything else.
	if isinstance(val, types.UnicodeType):
		val = val.encode('Latin-1', 'ignore')
	else:
		val = str(val)

And it gets even uglier when lists of values are handled.

encode() blows up if you give it an integer (integers have no encode() method), and str() raises
an exception if the Unicode value contains characters it cannot convert to ASCII.
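
For illustration, here is a rough sketch of how the list case tends to end up looking. The helper
names to_latin1 and to_latin1_list are made up for this example, the sample row is invented, and
type(u'') is used only as a portable way to name the Unicode string type. Treat it as a sketch
under those assumptions, not as anything the library provides.

```python
# Hypothetical helpers for illustration; not part of any library.
text_type = type(u'')  # the interpreter's Unicode string type


def to_latin1(val):
    """Encode Unicode values as Latin-1, dropping characters that do not fit.

    Every other value is passed through str().
    """
    if isinstance(val, text_type):
        return val.encode('latin-1', 'ignore')
    return str(val)


def to_latin1_list(values):
    """Apply to_latin1 to each item of a sequence and return a new list."""
    return [to_latin1(v) for v in values]


# Made-up database row: Unicode text, an integer, and text with a euro sign.
row = [u'Z\u00fcrich', 42, u'\u20ac5']
converted = to_latin1_list(row)
```

The euro sign in the last item has no Latin-1 equivalent, so it is silently dropped, which is
exactly the kind of lossy behaviour that is acceptable for debugging output.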

On top of that, print should default to tolerant handling of Unicode rather than raising an
exception. Inserting print statements is normally the quickest way to gather debugging info, but it
becomes a big problem when dealing with Unicode, requiring code like that above.

If you need to print multiple values or lists of Unicode items, well, forget it.

We need two important features:

1. A print statement that defaults to ignoring Unicode conversion errors, or at least has the option
to do so.
2. A function, such as str(), that converts *anything* to an ASCII string, ignoring Unicode errors.
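
To make the request concrete, here is a minimal sketch of the behaviour being asked for. Both
to_ascii and safe_print are hypothetical names, and the stream parameter is an assumption added so
the output can be redirected. This is only one possible shape for such helpers, not a proposal for
the actual implementation.

```python
import sys

# Hypothetical helpers for illustration; not part of any library.
text_type = type(u'')  # the interpreter's Unicode string type


def to_ascii(val):
    """Return val as a string, with non-ASCII characters in Unicode text dropped.

    Values that are not Unicode strings go through str() first.
    """
    if not isinstance(val, text_type):
        val = str(val)
    if isinstance(val, text_type):
        # 'ignore' discards anything that cannot be encoded instead of raising.
        val = val.encode('ascii', 'ignore').decode('ascii')
    return val


def safe_print(values, stream=None):
    """Write a sequence of values on one line, separated by spaces."""
    if stream is None:
        stream = sys.stdout
    stream.write(' '.join([to_ascii(v) for v in values]) + '\n')


safe_print([u'Z\u00fcrich', 42, None])  # writes "Zrich 42 None"
```

Dropping characters loses information, of course, but for quick debugging output that is a better
trade than an exception in the middle of a print.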

Thanks
--
Dale Strickland-Clark
Out-Think Ltd
Business Technology Consultants