right adjusted strings containing umlauts

MRAB python at mrabarnett.plus.com
Thu Aug 8 11:19:38 EDT 2013


On 08/08/2013 15:40, Neil Cerutti wrote:
> On 2013-08-08, Kurt Mueller <kurt.alfred.mueller at gmail.com> wrote:
>> I'd like to print strings right adjusted.
>> ( Python 2.7.3, Linux 3.4.47-2.38-desktop )
>>
>> from __future__ import print_function
>> print( '>{0:>3}<'.format( 'a' ) )
>>>  a<
>>
>> But if the string contains an Umlaut:
>> print( '>{0:>3}<'.format( 'ä' ) )
>>> ä<
>>
>> Same with % notation:
>> print( '>%3s<' % ( 'a' ) )
>>>  a<
>> print( '>%3s<' % ( 'ä' ) )
>>> ä<
>>
>> For a string with no Umlaut it uses 3 characters, but for an
>> Umlaut it uses only 2 characters.
>>
>> I guess it has to do with unicode.
>> How do I get it right?
>
> You guessed it!
>
> Use unicode strings instead of byte strings, e.g., u"...".
>
It also matters which actual codepoints you're using in the Unicode
string.

You could have u'ä', which is one codepoint (u'\xE4' or u'\N{LATIN
SMALL LETTER A WITH DIAERESIS}'), or u'ä', which is two codepoints
(u'a\u0308' or u'\N{LATIN SMALL LETTER A}\N{COMBINING DIAERESIS}').
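
A minimal sketch of the difference, assuming you are happy to normalize
to NFC before padding. The helper name rjust_nfc is made up for this
example; unicodedata.normalize is in the standard library. The padding
counts codepoints, so the decomposed form gets one space fewer unless it
is composed first:

```python
from __future__ import print_function
import unicodedata

composed = u'\xe4'        # one codepoint: LATIN SMALL LETTER A WITH DIAERESIS
decomposed = u'a\u0308'   # two codepoints: 'a' followed by COMBINING DIAERESIS

# Both display as the same glyph, but their lengths differ.
print(len(composed), len(decomposed))

# Field widths count codepoints, so the two forms are padded differently.
padded_composed = u'>{0:>3}<'.format(composed)      # two spaces of padding
padded_decomposed = u'>{0:>3}<'.format(decomposed)  # only one space of padding


def rjust_nfc(text, width):
    """Right-adjust text after composing it to NFC (hypothetical helper)."""
    return unicodedata.normalize('NFC', text).rjust(width)


# After normalization both forms pad to the same result.
print(rjust_nfc(composed, 3) == rjust_nfc(decomposed, 3))
```

This only helps where a precomposed codepoint exists; combining
sequences without one, and double-width characters, would still need
something smarter than counting codepoints.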



