Finding size of Variable

wxjmfauth at gmail.com wxjmfauth at gmail.com
Mon Feb 10 09:07:14 EST 2014


On Saturday, 8 February 2014 03:48:12 UTC+1, Steven D'Aprano wrote:
> We consider it A GOOD THING that Python spends memory for programmer
> convenience and safety. Python looks for memory optimizations when it can
> save large amounts of memory, not utterly trivial amounts. So in a Python
> wide build, a ten-thousand block character string requires a little bit
> more than 40KB. In Python 3.3, that can be reduced to only 10KB for a
> purely Latin-1 string, or 20K for a string without any astral characters.
> That's the sort of memory savings that are worthwhile, reducing memory
> usage by 75%.

In its attempt to save memory, Python only succeeds in
doing worse than any of the utf* coding schemes.

---

Python does not save memory at all. A str (Unicode string)
uses less memory only because, and only when, one explicitly
uses characters that consume less memory.

Not only is the memory gain zero, Python also falls back to the
worst case: a single wider character makes the whole string
use the wider representation.

>>> import sys
>>> sys.getsizeof('a' * 1000000)
1000025
>>> sys.getsizeof('a' * 1000000 + 'œ')
2000040
>>> sys.getsizeof('a' * 1000000 + 'œ' + '\U00010000')
4000048
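
The absolute figures above depend on the build and platform. As a
rough sketch, the fixed object header can be cancelled out by
differencing two string lengths. This assumes CPython 3.3 or later
(PEP 393 strings), and assumes the second character in the session is
U+0153, which is consistent with the sizes reported. per_char_cost is
a hypothetical helper name, not something from the session above.

```python
import sys

def per_char_cost(tail, n=100000):
    """Bytes added per extra 'a' when the string ends with `tail`."""
    small = sys.getsizeof('a' * n + tail)
    large = sys.getsizeof('a' * (2 * n) + tail)
    # The fixed object header and the tail cancel in the difference.
    return (large - small) // n

# Assumed sample tails: none, one BMP non-Latin-1 char, one astral char.
for tail in ('', '\u0153', '\U00010000'):
    print(ascii(tail), per_char_cost(tail))
```

On such a build this should report 1, 2 and 4 bytes per 'a'
respectively, whatever the header size is.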

The opposite of what UTF-8 and UTF-16 do!

>>> sys.getsizeof(('a' * 1000000 + 'œ' + '\U00010000').encode('utf-8'))
1000023
>>> sys.getsizeof(('a' * 1000000 + 'œ' + '\U00010000').encode('utf-16'))
2000025
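
The getsizeof figures for bytes objects also include a
platform-dependent header. A minimal sketch that looks only at the
encoded payload, using len() instead, and again assuming the second
character is U+0153 (the variable names are illustrative only):

```python
# One million ASCII characters, one BMP character, one astral character.
s = 'a' * 1000000 + '\u0153' + '\U00010000'

utf8 = s.encode('utf-8')     # 1, 2 and 4 bytes per character respectively
utf16 = s.encode('utf-16')   # 2-byte BOM, 2 bytes per BMP char, 4 for astral
utf32 = s.encode('utf-32')   # 4-byte BOM, 4 bytes per character

print(len(s))       # 1000002 code points
print(len(utf8))    # 1000006
print(len(utf16))   # 2000008
print(len(utf32))   # 4000012
```

These lengths do not depend on the interpreter build, so they are
easier to compare across machines than the getsizeof values.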


jmf


