Swedish characters in Python strings

Stefan stefan at wale.dyndns.dk
Sat Oct 12 15:17:14 EDT 2002


Urban Anjar wrote:
> Hi,
> I have found something that looks like a bug, or at least a not so
> pleasant feature. In Swedish we often use the characters å, ä and ö (a
> with a ring, a with two dots and o with two dots) and I don't get them
> to work perfectly well in Python.
> 
> Python 2.2.1 (#1, Aug 30 2002, 12:15:30)
> [GCC 3.2 20020822 (Red Hat Linux Rawhide 3.2-4)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>>
>>>> S = 'abc'
>>>> print S
> abc
>>>> print len(S)
> 3
> 
> That is perfectly OK, but...
> 
>>>> S = 'åäö'
>>>> print S
> åäö
>>>> print len(S)
> 6

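This is not a bug. å, ä and ö are each a single character in Unicode, but 
your terminal is apparently sending them UTF-8 encoded, and in UTF-8 each 
of them takes two bytes. A plain Python string is a sequence of bytes, so 
len(S) counts the 6 bytes rather than the 3 characters. If you decode the 
string first, len(unicode(S, 'utf-8')) gives 3.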
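
A minimal sketch of the difference between counting characters and counting
bytes. The variable names are made up for illustration, and the characters
are written as \u escapes so the result does not depend on the terminal or
source encoding. It should run unchanged on Python 2 and Python 3.

```python
# Three Unicode characters: å (U+00E5), ä (U+00E4), ö (U+00F6).
text = u"\u00e5\u00e4\u00f6"

# The same text encoded two different ways.
utf8_bytes = text.encode("utf-8")      # two bytes per character here
latin1_bytes = text.encode("latin-1")  # one byte per character

print(len(text))          # number of characters
print(len(utf8_bytes))    # number of bytes in UTF-8
print(len(latin1_bytes))  # number of bytes in Latin-1

# Decoding the UTF-8 bytes gives back the original three characters.
print(len(utf8_bytes.decode("utf-8")))
```

The character count stays at 3 whichever encoding is used. Only the byte
count changes, and that byte count is what len() reports for a plain byte
string like the one in your session.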

You can learn more about Unicode and UTF-8 on this page:

http://www.cl.cam.ac.uk/~mgk25/unicode.html

> 
> Please let me know if I do something wrong or if you too think
> about this as a bug.
> 
> Sincerely,
> Urban Anjar



