Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

Peter Otten __peter__ at web.de
Sat Aug 7 02:50:35 EDT 2010


dmtr wrote:

>> > Well...  63 bytes per item for very short unicode strings... Is there
>> > any way to do better than that? Perhaps some compact unicode objects?
>>
>> There is a certain price you pay for having full-feature Python objects.
> 
> Are there any *compact* Python objects? Optimized for compactness?
> 
>> What are you trying to accomplish anyway? Maybe the array module can be
>> of some help. Or numpy?
> 
> Ultimately a dict that can store ~20,000,000 entries: (u'short
> string' : (int, int, int, int, int, int, int)).

I don't know to what extent it still applies, but switching off cyclic garbage
collection with

import gc
gc.disable()

while building large data structures used to speed things up significantly.
That's what I would try first with your real data.
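
A minimal sketch of that approach, assuming the table is built in one function and the collector should be restored afterwards. The build_table() helper and the key/value layout are hypothetical, chosen only to resemble the dict described above:

```python
import gc

def build_table(n):
    """Build a dict of n entries with the cyclic collector switched off."""
    was_enabled = gc.isenabled()
    gc.disable()  # no cycle detection while many container objects are allocated
    try:
        table = {}
        for i in range(n):
            # Hypothetical record layout: short unicode key -> tuple of 7 ints.
            table[u'key%d' % i] = (i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
        return table
    finally:
        if was_enabled:
            gc.enable()  # restore the collector even if building fails

table = build_table(100000)
```

Reference counting still frees objects while the collector is off; only the detection of reference cycles is suspended.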

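Encoding your unicode strings as UTF-8 and storing the resulting byte strings
could save some memory, since short, mostly-ASCII text then takes one byte
per character.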
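
A rough sketch of the idea, assuming keys are encoded once on insertion and encoded the same way on lookup. pack_key() and unpack_key() are hypothetical helper names, and the actual saving depends on the interpreter build, so it is measured rather than quoted:

```python
import sys

def pack_key(text):
    """Encode a unicode key as a UTF-8 byte string for storage."""
    return text.encode('utf-8')

def unpack_key(raw):
    """Decode a stored UTF-8 byte string back to unicode."""
    return raw.decode('utf-8')

key = u'short string'
raw = pack_key(key)

# Per-object sizes of the unicode key and its encoded form; compare them
# on your own interpreter, since the numbers vary between builds.
sizes = (sys.getsizeof(key), sys.getsizeof(raw))

# Lookups must encode the key the same way it was stored.
table = {raw: (1, 2, 3, 4, 5, 6, 7)}
value = table[pack_key(u'short string')]
```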

If your integers fit into two bytes, say, you can use an array.array()
with typecode 'h' instead of the tuple.
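
As an illustration only, assuming every value fits a signed short (typecode 'h', two bytes on common platforms). pack_record() is a hypothetical helper, not something from the thread:

```python
from array import array

def pack_record(values):
    """Store small integers in a compact array of signed shorts."""
    # Values outside the range of a signed short raise OverflowError.
    return array('h', values)

record = pack_record((1, 2, 3, 4, 5, 6, 7))

# Bytes used by the items themselves, excluding the array object's header.
payload_bytes = record.itemsize * len(record)

# Check whether a larger value would still fit the chosen typecode.
try:
    pack_record((70000,))
    fits = True
except OverflowError:
    fits = False
```

Each array object carries its own fixed overhead, so it is worth checking with sys.getsizeof() that the array really is smaller than the tuple for seven items on your build.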

Peter



