Multibyte Character Support for Python
Chris Liechti
cliechti at gmx.net
Thu May 9 17:03:55 EDT 2002
huaiyu at gauss.almadan.ibm.com (Huaiyu Zhu) wrote in
news:slrnadlmm2.5kg.huaiyu at gauss.almadan.ibm.com:
> Out of curiosity: If a character is two bytes, what would len()
> report? If s is a unicode string with wide characters, would list(s)
> be made of characters or bytes? Would that be different under the
> current situation, or the PEP 263, or under Stephen's proposal? Would
> it change depending on how the unicode is encoded?
we have an interactive console, so you can simply try it:
>>> len(unicode("hello"))
5
len gives you the number of characters no matter how many bytes are needed
to represent them.
>>> list(unicode("hello"))
[u'h', u'e', u'l', u'l', u'o']
so you get a list of unicode characters.
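to address the "would it change depending on how the unicode is encoded?" part, here is a minimal sketch. it assumes Python 3, where the built-in str type plays the role that unicode() plays in the session above; the sample string and the variable names are only illustrative. the character count stays the same, and only the byte count of the encoded form changes:

```python
# Assumption: Python 3, where str is the unicode string type and
# bytes is the encoded form. The sample text is an arbitrary choice.
s = "h\u00e9llo \u4e16\u754c"  # 8 characters, some outside ASCII

# len() and list() on the string work on characters.
char_count = len(s)
chars = list(s)

# The byte count depends on the encoding you pick when encoding.
utf8_len = len(s.encode("utf-8"))       # 1 to 3 bytes per character here
utf16_len = len(s.encode("utf-16-le"))  # 2 bytes per character here

# Iterating over the encoded bytes yields integers, not characters.
utf8_bytes = list(s.encode("utf-8"))

print(char_count, utf8_len, utf16_len)
```

so len() and list() only count bytes once you explicitly encode the string; the string itself is always seen as a sequence of characters.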
> A list of such simple questions and answers for various proposals
> would help many more people to understand the relevant PEPs.
i think most of that becomes clear when you play around with the current
python and its unicode handling, so it does not need a special mention.
chris
--
Chris <cliechti at gmx.net>