Unicode Newbie

Fredrik Lundh fredrik at pythonware.com
Tue Sep 9 10:57:44 EDT 2003


Manuel Huesser wrote:

> The unicode function implies that you can only use 2**16 chars
> (unichr supports only this range), but with a given encoding, e.g.
> unicode(",,,", "utf-8"), I should be able to encode
> up to 2**31 chars.

Nope.  Read on.

> "\xfc\x12\x12\x12\x12\x12\x12" is an example of a 7-byte
> UTF-8 string. But on encoding I get the following
> error:
>
> UTF-8 decoding error: unsupported Unicode code range

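Unicode defines 17 planes of 64k code points each (U+0000 through
U+10FFFF, about 1.1 million, or ~2**20), not 2**31 characters.
Your example is not a valid UTF-8 string: 0xFC could at most start a
6-byte sequence in the old ISO 10646 scheme (never 7 bytes), and 0x12
is not a continuation byte (those are 0x80-0xBF).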
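
To see the limits for yourself, here is a minimal sketch. It assumes
Python 3.3 or later, where chr() and bytes.decode() stand in for the
unichr() and unicode() calls discussed in this thread; the variable
names are only illustrative.

```python
import sys

# Unicode has 17 planes of 0x10000 code points each, so the highest
# code point is U+10FFFF.
MAX_CODE_POINT = 17 * 0x10000 - 1
assert MAX_CODE_POINT == sys.maxunicode

# The highest code point needs only four bytes in UTF-8.
encoded = chr(MAX_CODE_POINT).encode("utf-8")
print(len(encoded), encoded)

# One past the limit is not a code point at all.
try:
    chr(MAX_CODE_POINT + 1)
    chr_rejected = False
except ValueError as exc:
    chr_rejected = True
    print("chr() refused:", exc)

# The byte string from the question is rejected by the UTF-8 decoder.
try:
    b"\xfc\x12\x12\x12\x12\x12\x12".decode("utf-8")
    decode_rejected = False
except UnicodeDecodeError as exc:
    decode_rejected = True
    print("decode refused:", exc.reason)
```

The exact error messages differ between Python versions, so the sketch
prints them rather than relying on their wording.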

> Is there any possibility to do the job?

Not if you're using a conforming Unicode implementation.

</F>

More information about the Python-list mailing list