Python 2.1 and Unicode

Martin von Loewis loewis at informatik.hu-berlin.de
Fri Feb 2 16:32:32 EST 2001


Dale Strickland-Clark <dale at out-think.NOSPAMco.uk> writes:

> However, I was disappointed not to find any improvements to the way
> Unicode is handled.

Did you report the problems you had as bugs? If not, why do you think
it might have "improved"?

> Currently we are having to use constructs such as this all over the
> place when dealing with values from international databases:
> 
> 	if type(val) == types.UnicodeType:
> 		val = val.encode('Latin-1', 'ignore')
> 	else:
> 		val = str(val)

It surely sounds wrong if you do this in many places. However, it
would be equally wrong if Python set string conversion to Latin-1 by
default - what do you do if your terminal does not support Latin-1?

> And it gets even more ugly when lists of values are handled.
>
> encode() blows if you give it an integer and str() blows if it
> doesn't like the Unicode.

Well, explicit is better than implicit. This fragment looks like an
application-specific conversion function - so it should be one,
instead of being inlined in many places.
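
A minimal sketch of such a function, assuming the application really
wants Latin-1 byte strings and is content to drop characters that do
not fit. The names to_latin1 and row_to_latin1 are made up for
illustration and do not come from any library:

```python
# Hypothetical helpers; the names are illustrative only.
UnicodeType = type(u'')   # the type of Unicode strings


def to_latin1(val, errors='ignore'):
    """Return val as a Latin-1 byte string, or str(val) for non-Unicode."""
    if isinstance(val, UnicodeType):
        # With errors='ignore', characters outside Latin-1 are dropped.
        return val.encode('latin-1', errors)
    # Integers, floats, None and so on go through the usual str().
    return str(val)


def row_to_latin1(row):
    """Apply to_latin1 to every value in a sequence, e.g. a database row."""
    return [to_latin1(v) for v in row]
```

With this in one place, the policy (which encoding, what to do with
unencodable characters) can be changed once, e.g. by passing
errors='replace', rather than at every call site.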

> On top of that, print needs to default to rugged handling of Unicode
> and not raise an exception.

That turns out to be difficult, since print uses str() for each value.
What print *really* should do is to use the full capabilities of the
terminal (e.g. loading fonts, or whatever else is necessary). That is
even more difficult.

> Inserting print statements is normally the quickest way to gather
> debugging info but it becomes a big problem when dealing with
> Unicode, requiring code like that above.

Nah, for debugging, I think

  print `val`

would work fine in most cases.

> 1. A print command that defaults to ignoring Unicode conversion
> errors - or at least has the option to do so.

So how does repr() sound to you?

> 2. A function, such as str() that converts *anything* to an ASCII
> string - ignoring Unicode errors.

Well, this is actually the question to which repr() is the answer...

Regards,
Martin


