How to waste computer memory?

Ian Kelly ian.g.kelly at gmail.com
Fri Mar 18 23:56:54 EDT 2016


On Fri, Mar 18, 2016 at 10:44 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Sat, 19 Mar 2016 02:31 am, Random832 wrote:
>
>> On Fri, Mar 18, 2016, at 11:17, Ian Kelly wrote:
>>> If the string is simple UCS-2, that's easy.
>
> Hmmm, well, nobody uses UCS-2 any more, since that only covers the first
> 65536 code points. Rather, languages like Javascript and Java, and the
> Windows OS, use UTF-16, which is a *variable width* extension to UCS-2. I
> don't know about Windows, but Javascript implements this badly, so that
> 4-byte UTF-16 code points are treated as *two* surrogate code points
> instead of the single code point they are meant to be.

The reason I specifically brought up UCS-2 is that *Python* uses
UCS-2 (or Latin-1, or UCS-4, depending on the widest code point in
the string).
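
For anyone who wants to see this for themselves, here is a rough
sketch, assuming CPython 3.3 or later (the flexible string
representation from PEP 393). The helper name bytes_per_char is made
up for illustration; it compares the sizes of two strings built from
the same character so that the fixed object header cancels out.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage cost of one character in a str.

    Hypothetical helper, assuming CPython 3.3+ (PEP 393). Subtracting
    the sizes of two strings of different lengths cancels the fixed
    object header, leaving only the cost of the n extra characters.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# One sample character from each range: ASCII, Latin-1, BMP, astral.
for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
    print("U+%04X: %d byte(s) per character" % (ord(ch), bytes_per_char(ch)))
```

On such an interpreter I would expect 1 byte each for the ASCII and
Latin-1 characters, 2 for the euro sign (UCS-2), and 4 for the emoji
(UCS-4). Other implementations, such as PyPy, may store strings
differently.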


