How to find number of characters in a unicode string?

faulkner faulkner612 at comcast.net
Mon Sep 18 16:29:49 EDT 2006


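Are you sure you're using unicode objects? len() on a unicode object
counts characters, not bytes:
len(u'\uffff') == 1
If what you have is a byte string, decode it first. The codecs in the
encodings package do the work behind str.decode(), e.g.
'\xef\xbf\xbf'.decode('utf-8') gives u'\uffff'.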
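
Here is a rough sketch of the difference, in case it helps. It assumes
the bytes you get back are UTF-8 encoded (I'm only guessing that from the
6-bytes-for-3-characters figure), and the sample letters and variable
names are made up for illustration. The b'' and u'' prefixes let it run
on Python 2.6+ as well as Python 3.

```python
# Three Norwegian letters, written as the UTF-8 bytes that encode them.
# Assumption: the real data is UTF-8; use the right codec name otherwise.
raw = b'\xc3\xa6\xc3\xb8\xc3\xa5'

# Decoding turns the byte string into a unicode object.
text = raw.decode('utf-8')

byte_count = len(raw)    # counts bytes
char_count = len(text)   # counts code points

print(byte_count)
print(char_count)
```

char_count is the number you would want for the Entry width (for
example via set_width_chars). Note that len() counts code points, so a
letter built from a base character plus a combining accent would still
count as two.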

Preben Randhol wrote:
> Hi
>
> If I use len() on a string containing unicode letters I get the number
> of bytes the string uses. This means that len() can report size 6 when
> the unicode string only contains 3 characters (that one would write by
> hand or see on the screen). Is there a way to calculate in characters
> and not in bytes to represent the characters.
>
> The reason for asking is that PyGTK needs number of characters to set
> the width of Entry widgets to a certain length, and it expects viewable
> characters and not number of bytes to represent them.
> 
> 
> Thanks in advance
> 
> 
> Preben



