[Tutor] how to struct.pack a unicode string?

eryksun eryksun at gmail.com
Thu Jan 3 13:52:46 CET 2013


On Tue, Jan 1, 2013 at 1:29 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>
> 2 Since "wide builds" use so much extra memory for the average ASCII
>   string, hardly anyone uses them.

On Windows (and I think OS X, too), a narrow build has been practical
since the wchar_t type is 16-bit. As for Linux, I'm most familiar with
Debian, which uses a wide build. Do you know offhand which distros
release a narrow build?
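
For anyone who wants to check their own interpreter, here is a minimal
sketch. It relies on sys.maxunicode, which is 0xFFFF on a narrow build
and 0x10FFFF on a wide build; the helper name build_kind is only an
illustration, not something from the standard library.

```python
import sys

def build_kind():
    """Classify the running interpreter by its largest supported code point.

    build_kind is a hypothetical helper name used for illustration.
    """
    # Narrow builds cap sys.maxunicode at 0xFFFF; wide builds (and
    # Python 3.3+, which drops the distinction) report 0x10FFFF.
    return 'narrow' if sys.maxunicode == 0xFFFF else 'wide'

print(build_kind())
```

On 3.3 and later this should always report 'wide', since the
narrow/wide split no longer exists there.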

> But more important than the memory savings, it means that for the first
> time Python's handling of Unicode strings is correct for the entire range
> of all one million plus characters, not just the first 65 thousand.

Still, be careful not to split 'characters':

    >>> from unicodedata import normalize
    >>> list(normalize('NFC', '\u1ebf'))
    ['ế']
    >>> list(normalize('NFD', '\u1ebf'))
    ['e', '̂', '́']
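
One way to avoid splitting them, as a rough sketch: keep each combining
mark attached to the base character before it. This uses
unicodedata.combining(), which returns a nonzero class for combining
marks. The function name split_clusters is made up for this example, and
the approach is a simplification, not full grapheme cluster segmentation
as defined by Unicode.

```python
from unicodedata import combining, normalize

def split_clusters(text):
    """Split text into base characters plus their trailing combining marks.

    split_clusters is a hypothetical helper; it only handles combining
    marks and is not a complete grapheme cluster algorithm.
    """
    clusters = []
    for ch in text:
        if clusters and combining(ch):
            # Nonzero combining class: attach the mark to the previous cluster.
            clusters[-1] += ch
        else:
            # A base character starts a new cluster.
            clusters.append(ch)
    return clusters

decomposed = normalize('NFD', '\u1ebf')
print(len(decomposed))                  # 3 code points
print(len(split_clusters(decomposed)))  # 1 cluster
```

Slicing or reversing the list of clusters, rather than the raw string,
keeps the accents with their base letter.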

