Finding size of Variable
Tim Chase
python.list at tim.thechases.com
Mon Feb 10 09:43:08 EST 2014
On 2014-02-10 06:07, wxjmfauth at gmail.com wrote:
> Python does not save memory at all. A str (unicode string)
> uses less memory only - and only - because and when one uses
> explicitly characters which are consuming less memory.
>
> Not only the memory gain is zero, Python falls back to the
> worse case.
>
> >>> sys.getsizeof('a' * 1000000)
> 1000025
> >>> sys.getsizeof('a' * 1000000 + 'œ')
> 2000040
> >>> sys.getsizeof('a' * 1000000 + 'œ' + '\U00010000')
> 4000048
If Python used UTF-32 for EVERYTHING, then all three of those cases
would come to roughly 4000048 bytes, so your own numbers disprove
your claim that "Python does not save memory at all".
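
To check this on your own build, here is a minimal sketch, assuming
CPython 3.3 or later. The variable names are mine, and '\u0153' is
only an assumed stand-in for any BMP character outside Latin-1. It
prints each string's length, its in-memory size, and the payload the
same text would need at a fixed 4 bytes per code point. Exact sizes
depend on the build, so no expected output is shown.

```python
import sys

# Three strings of nearly equal length whose widest code point differs.
ascii_only = 'a' * 1000000           # every code point fits in 1 byte
bmp = ascii_only + '\u0153'          # one code point needs 2 bytes
astral = bmp + '\U00010000'          # one code point needs 4 bytes

for s in (ascii_only, bmp, astral):
    # UTF-32 without a BOM: always 4 bytes per code point.
    utf32_payload = len(s.encode('utf-32-le'))
    print(len(s), sys.getsizeof(s), utf32_payload)
```

Only the last string should cost about as much as the UTF-32 payload;
the first two should come in well under it.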
> The opposite of what the utf8/utf16 do!
>
> >>> sys.getsizeof(('a' * 1000000 + 'œ' +
> ...               '\U00010000').encode('utf-8'))
> 1000023
> >>> sys.getsizeof(('a' * 1000000 + 'œ' +
> ...               '\U00010000').encode('utf-16'))
> 2000025
However, as pointed out repeatedly, string indexing in a fixed-width
encoding is O(1), while indexing into a variable-width encoding (e.g.
UTF-8/UTF-16) is O(N). The FSR (the Flexible String Representation
from PEP 393) gives the benefits of O(1) indexing while saving space
when a string doesn't need a full 32-bit width.
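
To illustrate the O(N) part, here is a rough sketch of what indexing
by code point into UTF-8 bytes involves. The helper names utf8_width
and utf8_char_at are hypothetical, not part of any library, and the
code assumes its input is valid UTF-8. It has to walk every preceding
character to find where the requested one starts.

```python
def utf8_width(lead):
    """Byte length of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >= 0xF0:
        return 4
    if lead >= 0xE0:
        return 3
    return 2  # lead bytes 0xC2..0xDF in valid UTF-8


def utf8_char_at(data, index):
    """Return the code point at `index` in UTF-8 `data` by scanning: O(N)."""
    if index < 0:
        raise IndexError(index)
    pos = 0
    # Skip `index` characters one at a time; there is no way to jump.
    for _ in range(index):
        if pos >= len(data):
            raise IndexError(index)
        pos += utf8_width(data[pos])
    if pos >= len(data):
        raise IndexError(index)
    return data[pos:pos + utf8_width(data[pos])].decode('utf-8')


text = 'a\u0153b\U00010000c'
encoded = text.encode('utf-8')
print(utf8_char_at(encoded, 3) == text[3])
```

With a fixed-width layout the same lookup is a single multiplication
and a read, which is what text[3] gets under the FSR.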
-tkc