[Python-Dev] re: Unicode as argument for 8-bit strings

M.-A. Lemburg mal@lemburg.com
Sat, 08 Apr 2000 11:51:32 +0200


Bill Tutt wrote:
> 
> > There has been a bug report about the treatment of Unicode
> > objects together with 8-bit format strings. The current
> > implementation converts the Unicode object to UTF-8 and then
> > inserts this value in place of the %s....
> >
> > I'm inclined to change this to have '...%s...' % u'abc'
> > return u'...abc...' since this is just another case of
> > coercing data to the "bigger" type to avoid information loss.
> >
> > Thoughts ?
> 
> Suddenly returning a Unicode string from an operation on an 8-bit
> string is likely to give some code extreme fits of despondency.
> 
> Converting to UTF-8 didn't give you any data loss; however, it certainly
> might be unexpected to now find UTF-8 characters in what the user
> originally thought was a binary string containing whatever they had
> wanted it to contain.

Well, the design is to always coerce to Unicode when 8-bit
string objects and Unicode objects meet. This is done for
all string methods and that's the reason I'm also implementing
this for %-formatting (internally this is just another string
method).
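
To make the rule above concrete, here is a rough sketch in present-day
Python, where bytes stands in for the 8-bit string type and str for the
Unicode type. The helper name format_coerce() is hypothetical, and the
UTF-8 default used to promote 8-bit data is an assumption made for
illustration; this is not the actual implementation.

```python
# Sketch only: bytes plays the role of the 8-bit string, str the role
# of the Unicode object. format_coerce() is a hypothetical helper and
# the "utf-8" default encoding is an assumption.

def format_coerce(fmt, arg, encoding="utf-8"):
    """Apply fmt % arg, promoting to Unicode if either side is Unicode."""
    if isinstance(fmt, bytes) and isinstance(arg, str):
        # 8-bit format string meets a Unicode argument:
        # promote the format string, so the result is Unicode.
        fmt = fmt.decode(encoding)
    elif isinstance(fmt, str) and isinstance(arg, bytes):
        # Unicode format string meets an 8-bit argument:
        # promote the argument instead.
        arg = arg.decode(encoding)
    # Both operands now have the same type, so plain %-formatting applies.
    return fmt % arg


result = format_coerce(b"...%s...", "abc")
print(type(result).__name__, result)
```

The sketch only handles a single argument, not a tuple of mixed
arguments. When both operands are 8-bit, nothing is promoted and the
result stays an 8-bit string.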
 
> Throwing an exception would at the very least force the user to make a
> decision one way or the other about what they want to do with the data.
> They might want to do a codepage translation, or something else. (aka Hey,
> here's a bug I just found for you!)

True; but Guido's intention was to have strings and Unicode
interoperate without too much user intervention.

> In what other cases are you suddenly returning a Unicode string object from
> an operation which previously returned a string object?

All string methods automatically coerce to Unicode when they
see a Unicode argument, e.g. " ".join(("abc", u"def")) will
return u"abc def".
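
The join case can be sketched the same way. Again this is written for
present-day Python with bytes standing in for the 8-bit string and str
for Unicode; join_coerce() is a hypothetical name and the UTF-8 default
is an assumption, so treat it as an illustration of the coercion rule
rather than the real code.

```python
# Sketch only: join_coerce() is a hypothetical helper; the "utf-8"
# default used to promote 8-bit data is an assumption.

def join_coerce(sep, items, encoding="utf-8"):
    """Join items with sep, promoting everything to Unicode if any
    operand (separator or item) is Unicode."""
    items = list(items)

    def promote(value):
        # Decode 8-bit data; leave Unicode objects untouched.
        return value.decode(encoding) if isinstance(value, bytes) else value

    if any(isinstance(value, str) for value in [sep] + items):
        sep = promote(sep)
        items = [promote(item) for item in items]
    return sep.join(items)


mixed = join_coerce(b" ", (b"abc", "def"))
plain = join_coerce(b" ", (b"abc", b"def"))
print(type(mixed).__name__, type(plain).__name__)
```

A single Unicode item is enough to make the whole result Unicode; with
only 8-bit operands the result stays 8-bit.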

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/