Python-list Digest, Vol 61, Issue 443

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Thu Oct 30 06:58:30 EDT 2008


On Thu, 30 Oct 2008 13:50:47 +0300, Seid Mohammed wrote:

> ok
> but still i am not clear with my problem. if i test this one
> ==============
> >>> kk = 'how old are you'
> >>> len(kk)
> 15
> ==========
> but in my case
> ==========
> >>> abebe = 'አበበ በሶ በላ'
> >>> len(abebe)
> 23
> ==========
> why is the length 23 when I am expecting it to be only 9? because I have
> typed just 9 characters (including spaces). there must be some kind of
> trick to it.

You have typed 9 characters, but they are not stored as 9 bytes.  I guess
your environment uses UTF-8 as its encoding, because mine does too and
gives the same result:

In [124]: abebe = 'አበበ በሶ በላ'

In [125]: len(abebe)
Out[125]: 23

In [126]: s = 'አ'

In [127]: len(s)
Out[127]: 3

In [128]: s
Out[128]: '\xe1\x8a\xa0'

So that one character is encoded in three bytes.  Your string has seven
such characters plus two one-byte spaces, so 7*3 + 2 = 23 bytes.  If you
really want to operate on characters instead of bytes, use `unicode`
objects:

In [129]: u = abebe.decode('utf-8')

In [130]: len(u)
Out[130]: 9

In [131]: print u[0]
አ
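
The same check as a minimal sketch, assuming Python 3, where `str` is
already a Unicode string and the UTF-8 bytes have to be produced
explicitly with `encode`.  The variable names `encoded`, `sizes` and
`decoded` are only for illustration:

```python
# Python 3 sketch: str holds characters, bytes holds the encoded form.
abebe = 'አበበ በሶ በላ'

# Encoding to UTF-8 gives the byte sequence a Python 2 byte string held.
encoded = abebe.encode('utf-8')

print(len(abebe))    # number of characters (code points)
print(len(encoded))  # number of UTF-8 bytes

# Bytes needed for each character: Ethiopic letters vs. ASCII spaces.
sizes = [len(ch.encode('utf-8')) for ch in abebe]
print(sizes)

# Decoding the bytes gives back the original character string.
decoded = encoded.decode('utf-8')
print(decoded[0])
```

The per-character sizes should add up to the byte length, which is why
`len` on the encoded form is so much larger than the character count.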

Ciao,
	Marc 'BlackJack' Rintsch


