byte count unicode string

willie willie at jamots.com
Wed Sep 20 01:53:21 EDT 2006


# What's the correct way to get the byte count of a
# unicode string once it is encoded as UTF-8?
# I couldn't find a builtin method, and the following is
# memory-inefficient: it builds the whole encoded copy just to measure it.

# "\xC2\x9D" is the two-byte UTF-8 encoding of U+009D
ustr = "example\xC2\x9D".decode('UTF-8')

num_chars = len(ustr)    # 8

buf = ustr.encode('UTF-8')

num_bytes = len(buf)     # 9


# Thanks.
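
# One possible way to avoid the temporary encoded buffer, offered as a
# minimal sketch and not as a builtin: sum the UTF-8 length of each code
# point. The helper name utf8_len is hypothetical. It assumes every
# element of the string is a full code point (Python 3, or a wide Python 2
# build) and that the string holds no lone surrogates; on a narrow build a
# surrogate pair would be counted as 6 bytes instead of 4.

```python
def utf8_len(ustr):
    """Return the number of bytes ustr would occupy in UTF-8 (hypothetical helper)."""
    total = 0
    for ch in ustr:
        cp = ord(ch)
        if cp < 0x80:        # U+0000..U+007F: 1 byte
            total += 1
        elif cp < 0x800:     # U+0080..U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:   # U+0800..U+FFFF: 3 bytes
            total += 3
        else:                # U+10000..U+10FFFF: 4 bytes
            total += 4
    return total


ustr = u"example\x9d"        # same text as above: 8 characters
num_bytes = utf8_len(ustr)   # agrees with len(ustr.encode('UTF-8'))
```

# This trades memory for a Python-level loop, so for short strings the
# encode-and-measure version above is likely to be faster.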



