[Python-Dev] Unicode as argument for 8-bit format strings

Guido van Rossum guido@python.org
Fri, 07 Apr 2000 09:01:45 -0400


> There has been a bug report about the treatment of Unicode
> objects together with 8-bit format strings. The current
> implementation converts the Unicode object to UTF-8 and then
> inserts this value in place of the %s.
> 
> I'm inclined to change this to have '...%s...' % u'abc'
> return u'...abc...' since this is just another case of
> coercing data to the "bigger" type to avoid information loss.
> 
> Thoughts?

Makes sense.  But note that it's going to be difficult to catch all
cases: you could have

'...%d...%s...%s...' % (3, "abc", u"abc")

and

'...%(foo)s...' % {'foo': u'abc'}

and even

'...%(foo)s...' % {'foo': 'abc', 'bar': u'def'}

(the last one should *not* convert to Unicode, since the Unicode value
stored under 'bar' is never used by the format string).
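
A minimal sketch of the rule these cases imply: the result should become
Unicode only if an argument the format string actually consumes is
Unicode. This is an illustration, not the interpreter's implementation.
It is written for Python 3, which no longer has separate 8-bit and Unicode
string types, so Unicode values are marked with a hypothetical `U`
subclass of str. The helper name `needs_unicode` and the key-matching
regex are also assumptions, and the regex ignores corner cases such as an
escaped `%%(foo)`.

```python
import re

# Hypothetical stand-in for a Unicode object; plain str plays the role
# of an 8-bit string in this sketch.
class U(str):
    pass

# Matches the key in mapping-style specifiers such as %(foo)s.
# Simplification: does not account for an escaped "%%(foo)".
_KEY = re.compile(r'%\((\w+)\)')

def needs_unicode(fmt, args):
    """Return True if any argument consumed by fmt is a U instance."""
    if isinstance(args, dict):
        # Only the keys the format string refers to are consumed.
        used = [args[key] for key in _KEY.findall(fmt)]
    elif isinstance(args, tuple):
        # Every element of a tuple is consumed by the format.
        used = list(args)
    else:
        used = [args]
    return any(isinstance(value, U) for value in used)

case_tuple = needs_unicode('...%d...%s...%s...', (3, "abc", U("abc")))
case_dict = needs_unicode('...%(foo)s...', {'foo': U('abc')})
case_unused = needs_unicode('...%(foo)s...', {'foo': 'abc', 'bar': U('def')})
```

With these inputs the first two cases report that a Unicode result is
needed, and the third does not, because 'bar' is never referenced.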

--Guido van Rossum (home page: http://www.python.org/~guido/)