Finding size of Variable

Mon Feb 10 10:02:50 EST 2014

On 2/10/14 9:43 AM, Tim Chase wrote:
> On 2014-02-10 06:07, wxjmfauth at gmail.com wrote:
>> Python does not save memory at all. A str (unicode string)
>> uses less memory only - and only - because and when one explicitly
>> uses characters which consume less memory.
>>
>> Not only is the memory gain zero, Python falls back to the
>> worst case.
>>
>> >>> sys.getsizeof('a' * 1000000)
>> 1000025
>> >>> sys.getsizeof('a' * 1000000 + 'œ')
>> 2000040
>> >>> sys.getsizeof('a' * 1000000 + 'œ' + '\U00010000')
>> 4000048
>
> If Python used UTF-32 for EVERYTHING, then all three of those cases
> would be 4000048, so your output clearly disproves your claim that
> "Python does not save memory at all".
>
>> The opposite of what the utf8/utf16 do!
>>
>> >>> sys.getsizeof(('a' * 1000000 + 'œ' +
>> ... '\U00010000').encode('utf-8'))
>> 1000023
>> >>> sys.getsizeof(('a' * 1000000 + 'œ' +
>> ... '\U00010000').encode('utf-16'))
>> 2000025
>
> However, as pointed out repeatedly, string indexing in fixed-width
> encodings is O(1), while indexing into variable-width encodings (e.g.
> UTF-8/UTF-16) is O(N).  The FSR gives the benefits of O(1) indexing
> while saving space when a string doesn't need to use a full 32-bit
> width.
>
> -tkc
>
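
For anyone who wants to check the per-character widths quoted above,
here is a minimal sketch, assuming CPython 3.3 or later (where the FSR
applies). The helper name bytes_per_char is made up for illustration.
It differences the sizes of two strings of the same character so that
the fixed object header cancels out, leaving only the storage per code
point.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per code point for strings made only of `ch`.

    Hypothetical helper: subtracting the sizes of two strings of
    different lengths cancels the fixed per-object header.
    """
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n

# One sample character from each width class used by the FSR.
samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),
    ("BMP", "\u0153"),
    ("astral", "\U00010000"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On such an interpreter I would expect 1, 1, 2 and 4 respectively: the
width is chosen per string from its widest character, which is what
keeps indexing O(1). Other Python implementations may report different
numbers.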

Please don't engage in this debate with JMF.  His mind is made up, and 
he will not be swayed, no matter how persuasive and reasonable your 
arguments.  Just ignore him.

-- 
Ned Batchelder, http://nedbatchelder.com