How to waste computer memory?

Random832 random832 at fastmail.com
Sun Mar 20 14:17:22 EDT 2016


On Sun, Mar 20, 2016, at 10:55, Ben Bacarisse wrote:
> It's 21.  The reason being (or at least part of the reason being) that
> 21 bits can be UTF-8 encoded in 4 bytes: 11110xxx 10xxxxxx 10xxxxxx
> 10xxxxxx (3 + 3*6).

The reason is the UTF-16 limit. Before UTF-8 was restricted to match
it, UTF-8 had no such limit (it could encode up to 31 bits, in six
bytes). The four-byte explanation also doesn't account for the fact
that four bytes can encode up to U+1FFFFF rather than U+10FFFF.
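
A rough sketch of the arithmetic, in case it helps. The variable names
are just for illustration; the only assumptions are the standard
surrogate ranges (1024 high by 1024 low) and the bit layouts quoted
above.

```python
# Largest code point reachable through a UTF-16 surrogate pair:
# the BMP (0x0000-0xFFFF) plus 1024 * 1024 surrogate combinations.
utf16_max = 0x10000 + (0x400 * 0x400) - 1

# Payload bits in the four-byte UTF-8 form:
# 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
four_byte_bits = 3 + 3 * 6
four_byte_max = (1 << four_byte_bits) - 1

# Payload bits in the original six-byte UTF-8 form:
# 1111110x followed by five 10xxxxxx continuation bytes
six_byte_bits = 1 + 5 * 6
six_byte_max = (1 << six_byte_bits) - 1

print(hex(utf16_max))                      # 0x10ffff
print(four_byte_bits, hex(four_byte_max))  # 21 0x1fffff
print(six_byte_bits, hex(six_byte_max))    # 31 0x7fffffff
print(utf16_max.bit_length())              # 21
```

So U+10FFFF needs 21 bits, but 21 bits can hold values up to U+1FFFFF;
the cap at U+10FFFF comes from what surrogate pairs can reach, not from
the four-byte form.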


