New internal string format in 3.3

Chris Angelico rosuav at gmail.com
Sun Aug 19 06:26:44 EDT 2012


On Sun, Aug 19, 2012 at 8:19 PM,  <wxjmfauth at gmail.com> wrote:
> This is precisely the weak point of this flexible
> representation. It uses latin-1 and latin-1 is for
> most users simply unusable.

No, it uses Unicode, and as an optimization, attempts to store each
codepoint in fewer than four bytes for most strings. The fact that a
one-byte storage format happens to look like latin-1 is rather
coincidental.
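
For anyone who wants to see the storage widths for themselves, here is a
rough sketch. It assumes CPython 3.3 or later, and bytes_per_char is a
made-up helper name, not anything from the standard library. It compares
the sizes of two strings of different lengths so that the fixed object
header cancels out, leaving the per-codepoint cost.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-codepoint storage for a string made only of `ch`.

    Hypothetical helper for illustration. Relies on sys.getsizeof
    reporting the real buffer size, which holds on CPython 3.3+ but is
    an assumption on other implementations.
    """
    # Subtracting the two sizes cancels the fixed header and terminator.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# Widest codepoint in the string decides the width for the whole string.
widths = {
    "U+0061 (a)": bytes_per_char("a"),
    "U+00E9 (e-acute)": bytes_per_char("\u00e9"),
    "U+20AC (euro sign)": bytes_per_char("\u20ac"),
    "U+1F600 (emoji)": bytes_per_char("\U0001F600"),
}

for name, width in widths.items():
    print(name, width)

# The first 256 Unicode codepoints line up with latin-1 byte values,
# which is why the one-byte form looks like latin-1.
same_as_latin1 = "\u00e9".encode("latin-1") == bytes([ord("\u00e9")])

# The euro sign is not in latin-1, yet the string holds it fine; it is
# simply stored at a wider width.
try:
    "\u20ac".encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False
```

Either way the string is a sequence of Unicode codepoints; only the
amount of memory behind it changes.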

ChrisA


