RE Module Performance

Michael Torrie torriem at gmail.com
Fri Jul 26 22:05:03 EDT 2013


On 07/26/2013 07:21 AM, wxjmfauth at gmail.com wrote:
>>>> sys.getsizeof('––') - sys.getsizeof('–')
> 
> I have already explained / commented this.

Maybe it got lost in translation, but I don't understand your point with
that.
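
For anyone following along, here is a minimal sketch of what that quoted expression seems to be measuring: the extra bytes CPython reports for one more character in a string. It assumes CPython 3.3 or later (the flexible string representation from PEP 393). The helper name marginal_bytes is made up for illustration, and it compares two freshly built strings rather than the exact literals in the quote, so cached data on a literal cannot skew the numbers.

```python
import sys


def marginal_bytes(ch: str, n: int = 8) -> int:
    """Bytes sys.getsizeof reports for one extra copy of `ch`.

    Hypothetical helper: compares two freshly built strings of n and
    n + 1 copies so the fixed object header cancels out.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


# One character from each storage class used by the flexible representation.
ascii_cost = marginal_bytes("a")            # plain ASCII
bmp_cost = marginal_bytes("\u2013")         # EN DASH, as in the quoted line
astral_cost = marginal_bytes("\U0001F600")  # outside the Basic Multilingual Plane

print(ascii_cost, bmp_cost, astral_cost)
```

On a CPython build with PEP 393 this should print 1 2 4: the per-character cost depends on the widest character in the string, which is presumably the behaviour being complained about.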

> Hint: To understand Unicode (and every coding scheme), you should
> understand "utf". The how and the *why*.

Hmm, so if Python used UTF-8 internally to represent Unicode strings,
wouldn't that punish *all* users (not just non-ASCII users), since
finding the character at a given position in a string would require an
O(n) scan?  UTF-32 I could see (and indeed that's essentially what the
FSR uses when necessary, isn't it?), but not UTF-8 or UTF-16.
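
To make the O(n) point concrete, here is a rough sketch of indexing by character position over raw UTF-8 bytes versus raw UTF-32 bytes. This is only an illustration under stated assumptions, not how CPython or any real implementation stores strings: the function names utf8_char_at, utf32_char_at and _utf8_width are hypothetical, the input is assumed to be valid UTF-8 (or little-endian UTF-32 without a BOM), and negative indexes are not handled.

```python
def _utf8_width(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def utf8_char_at(data: bytes, index: int) -> str:
    """Return the index-th code point of UTF-8 data.

    Characters have variable width, so every earlier character must be
    stepped over first: O(index) work.
    """
    if index < 0:
        raise IndexError("negative index not supported in this sketch")
    pos = 0
    for _ in range(index):
        if pos >= len(data):
            raise IndexError("index out of range")
        pos += _utf8_width(data[pos])
    if pos >= len(data):
        raise IndexError("index out of range")
    return data[pos:pos + _utf8_width(data[pos])].decode("utf-8")


def utf32_char_at(data: bytes, index: int) -> str:
    """Return the index-th code point of little-endian UTF-32 data.

    Every character is exactly 4 bytes, so the offset is computed
    directly: O(1) work.
    """
    start = 4 * index
    if index < 0 or start + 4 > len(data):
        raise IndexError("index out of range")
    return data[start:start + 4].decode("utf-32-le")


text = "a\u2013b\U0001F600c"
as_utf8 = text.encode("utf-8")
as_utf32 = text.encode("utf-32-le")
print(utf8_char_at(as_utf8, 3) == utf32_char_at(as_utf32, 3) == text[3])
```

The UTF-8 version has to walk past every earlier character, while the UTF-32 version jumps straight to byte 4 * index. UTF-16 would have the same scanning problem once surrogate pairs are involved.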



