[I18n-sig] Format strings

"Martin v. Löwis" martin at v.loewis.de
Fri Nov 25 23:16:17 CET 2005


Josef Spillner wrote:
> # -*- coding: utf-8 -*-
> print "'%2s'" % "a"
> print "'%2s'" % "á"
> print "'%2s'" % u"á"
> 
> In the second case, while the string literal is recognized as utf-8 (thus two 
> bytes being one character in this case), it eats the two character format 
> string alone and doesn't leave any space for the empty character.

This is correct behaviour, and by design.

> Note that if the file encoding is not given, then it would display as 'á', 
> which is correct under the circumstances.

It is correct either way. A byte string is a string of bytes, not a
Unicode string.

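The string in the second print statement actually *has* two bytes, so
it is correct that it takes two bytes of output. For a byte string, the
width in %2s counts bytes, not characters, so the field is already full
and no padding is added. Only the Unicode string in the third statement
is one character long and gets padded.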
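
For illustration, here is a minimal sketch of the same three cases. It
is written for Python 3 (3.5 or later), which is an assumption on my
part: the original example is Python 2, where a plain "..." literal is
a byte string. In the sketch, bytes stands in for the Python 2 byte
string and str for the Python 2 unicode type; the variable names are
made up for the example.

```python
# Python 3 sketch: bytes ~ Python 2 str (byte string),
#                  str   ~ Python 2 unicode.

ascii_bytes = b"a"                       # one byte
utf8_bytes = "\u00e1".encode("utf-8")    # "á" as UTF-8: two bytes, 0xC3 0xA1
text = "\u00e1"                          # "á" as text: one code point

# The field width in %2s counts units of the string being formatted:
# bytes for a byte string, code points for a text string.
padded_ascii = b"'%2s'" % ascii_bytes    # one byte, so one space of padding
padded_utf8 = b"'%2s'" % utf8_bytes      # two bytes, field already full
padded_text = "'%2s'" % text             # one code point, so one space of padding

print(len(utf8_bytes), len(text))        # 2 1
print(padded_ascii)                      # b"' a'"
print(padded_utf8)                       # b"'\xc3\xa1'"
print(padded_text)                       # ' á'
```

If the padding should follow characters rather than bytes, decode the
bytes to a text string first and format that.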

Regards,
Martin

