Unicode Problem

Ulrich Eckhardt eckhardt at satorlaser.com
Thu Oct 30 04:28:39 EDT 2008


Seid Mohammed wrote:
> I am new to python.

Welcome! :)

>>>> abebe = 'አበበ በሶ በላ'
>>>> abebe
> '\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0 \xe1\x89\xa0\xe1\x88\xb6
> \xe1\x89\xa0\xe1\x88\x8b'
>>>> print abebe
> አበበ በሶ በላ
>>>> abeba = ['አበበ','በሶ','በላ']
>>>> abeba
> ['\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0', '\xe1\x89\xa0\xe1\x88\xb6',
> '\xe1\x89\xa0\xe1\x88\x8b']
>>>> print abeba
> ['\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0', '\xe1\x89\xa0\xe1\x88\xb6',
> '\xe1\x89\xa0\xe1\x88\x8b']
>>>> len(abebe)
> 23
> ========================
> so my question is
> 1)why >>> abebe prints  '\xe1\x8a\xa0\xe1\x89\xa0\xe1\x89\xa0
> \xe1\x89\xa0\xe1\x88\xb6 \xe1\x89\xa0\xe1\x88\x8b' instead of አበበ በሶ
> በላ
> 2) why >>> print abeba don't print the expected አበበ በሶ በላ string

When you type just an identifier X at the interactive prompt, Python outputs
the result of calling repr(X). This typically gives you something that you
could paste into a Python program as a literal. Note that the string
'አበበ በሶ በላ' written out directly is only usable in a source file that is
saved in (and declares) an encoding supporting those characters, e.g. UTF-8.
That is why repr() shows the bytes as \x escapes instead.
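
To make the repr() part concrete, here is a small sketch. The variable
names are my own, and it encodes the text explicitly so that it should
behave the same on Python 3, where the bytes would carry a b prefix in the
output:

```python
# -*- coding: utf-8 -*-
# Sketch only: variable names are illustrative, not from the original post.
text = u'አበበ በሶ በላ'           # the text as characters
data = text.encode('utf-8')     # the same text as UTF-8 bytes

# repr() gives an escaped, literal-like form; this is what the
# interactive prompt echoes when you type a bare name.
shown_at_prompt = repr(data)

print(len(text))        # number of characters
print(len(data))        # number of bytes, the 23 from the quoted session
print(shown_at_prompt)  # ASCII-only text full of \x escapes
```

Each of these letters takes three bytes in UTF-8, which is where the
length of 23 comes from: seven letters plus two spaces.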

Now, if you type "print X" at the prompt, it outputs str(X) instead, which
for a string gives you the original contents. A list has no separate string
form of its own: its str() is the same as its repr(), which in turn calls
repr() on each element, so the escapes show up again.
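
The list case can be sketched like this. Again the names are only for
illustration, and the words are encoded explicitly so the elements are
byte strings as in the quoted session. Joining the elements before
printing is one way to get readable output, assuming the terminal can
display the characters:

```python
# -*- coding: utf-8 -*-
# Sketch only: names are illustrative, not from the original post.
words = [w.encode('utf-8') for w in (u'አበበ', u'በሶ', u'በላ')]

# str() of a list builds its text from repr() of each element,
# so the escapes show up even under print.
as_list = str(words)

# Joining the elements first yields a single string, which print
# shows as text rather than as escapes.
joined = b' '.join(words).decode('utf-8')

print(as_list)
print(joined)
```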


Disclaimer: I'm not a pro yet myself, but I think this covers the background
a bit. Maybe someone will correct me if I'm horribly wrong.

Uli

-- 
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932



