memory utilization blow up with dict structure

Peter Otten __peter__ at web.de
Fri Sep 23 08:34:14 EDT 2016


Christian wrote:

> Hi,
> 
> I'm wondering why Python blows up a dictionary structure so much.
> 
> The ids and cat substructures can have 0..n entries, but in most cases
> they have <= 10 entries; t is limited to <= 6.
> 
> Thanks for any advice to save memory.
> Christian
> 
> 
> Example:
> 
> {'0a0f7a3a0e09826caef1bff707785662': {'ids':
> {'aa316b86-8169-11e6-bab9-0050563e2d7c',
>  'aa3174f0-8169-11e6-bab9-0050563e2d7c',
>  'aa319408-8169-11e6-bab9-0050563e2d7c',
>  'aa3195e8-8169-11e6-bab9-0050563e2d7c',
>  'aa319732-8169-11e6-bab9-0050563e2d7c',
>  'aa319868-8169-11e6-bab9-0050563e2d7c',
>  'aa31999e-8169-11e6-bab9-0050563e2d7c',
>  'aa319b06-8169-11e6-bab9-0050563e2d7c'},
>   't': {'type1', 'type2'},
>   'dt': datetime.datetime(2016, 9, 11, 15, 15, 54, 343000),
>   'nids': 8,
>   'ntypes': 2,
>   'cat': [('ABC', 'aa316b86-8169-11e6-bab9-0050563e2d7c', '74', ''),
>    ('ABC','aa3174f0-8169-11e6-bab9-0050563e2d7c', '3', 'type1'),
>    ('ABC','aa319408-8169-11e6-bab9-0050563e2d7c','3', 'type1'),
>    ('ABC','aa3195e8-8169-11e6-bab9-0050563e2d7c', '3', 'type2'),
>    ('ABC','aa319732-8169-11e6-bab9-0050563e2d7c', '3', 'type1'),
>    ('ABC','aa319868-8169-11e6-bab9-0050563e2d7c', '3', 'type1'),
>    ('ABC','aa31999e-8169-11e6-bab9-0050563e2d7c', '3', 'type1'),
>    ('ABC','aa319b06-8169-11e6-bab9-0050563e2d7c', '3', 'type2')]},

Not so much to save memory, but because redundant data always bears the risk 
of getting out of sync:

For a value v in your dict, do

v["ids"] == {t[1] for t in v["cat"]} 
len(v["ids"]) == len(v["cat"])

v["nids"] ==  len(v["ids"])
v["ntypes"] == len(v["t"])
v["t"] == {t[-1] for t in v["cat"]} - {""}

always hold? 
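
If they do, the checks above can be bundled into one function, and everything 
except "cat" and "dt" can be recomputed on demand instead of being stored. A 
minimal sketch; is_consistent() and derived_fields() are names made up for 
this illustration, and the record is a shortened copy of the posted example:

```python
import datetime


def is_consistent(v):
    """Return True if the redundant fields of one record agree with "cat"."""
    cat_ids = {t[1] for t in v["cat"]}
    cat_types = {t[-1] for t in v["cat"]} - {""}
    return (
        v["ids"] == cat_ids
        and len(v["ids"]) == len(v["cat"])
        and v["nids"] == len(v["ids"])
        and v["ntypes"] == len(v["t"])
        and v["t"] == cat_types
    )


def derived_fields(cat):
    """Recompute ids, t, nids and ntypes from the "cat" list alone."""
    ids = {t[1] for t in cat}
    types = {t[-1] for t in cat} - {""}
    return {"ids": ids, "t": types, "nids": len(ids), "ntypes": len(types)}


# Shortened version of the record from the original post.
record = {
    "ids": {
        "aa316b86-8169-11e6-bab9-0050563e2d7c",
        "aa3174f0-8169-11e6-bab9-0050563e2d7c",
    },
    "t": {"type1"},
    "dt": datetime.datetime(2016, 9, 11, 15, 15, 54, 343000),
    "nids": 2,
    "ntypes": 1,
    "cat": [
        ("ABC", "aa316b86-8169-11e6-bab9-0050563e2d7c", "74", ""),
        ("ABC", "aa3174f0-8169-11e6-bab9-0050563e2d7c", "3", "type1"),
    ],
}

print(is_consistent(record))
print(derived_fields(record["cat"]))
```

If is_consistent() is true for every value in the real data, storing only 
"cat" and "dt" per key loses no information.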

And if you want to get fancy: are the IDs always 128-bit integers that share 
all but the leading 32 bits?
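
A rough sketch of what that could look like, assuming every ID in a record 
parses as a UUID and that all of them share the trailing 96 bits, as the eight 
IDs in the posted sample do. pack_ids() and unpack_ids() are hypothetical 
helper names, not part of any library; only uuid.UUID is the real thing:

```python
import uuid

LOW_MASK = (1 << 96) - 1  # selects the trailing 96 bits of a 128-bit integer


def pack_ids(ids):
    """Split UUID strings into one shared 96-bit suffix and 32-bit prefixes.

    Raises ValueError if ids is empty or the IDs do not share their suffix.
    """
    ints = [uuid.UUID(s).int for s in ids]
    suffixes = {n & LOW_MASK for n in ints}
    if len(suffixes) != 1:
        raise ValueError("IDs are empty or do not share their trailing 96 bits")
    return suffixes.pop(), sorted(n >> 96 for n in ints)


def unpack_ids(suffix, prefixes):
    """Rebuild the original set of UUID strings."""
    return {str(uuid.UUID(int=(p << 96) | suffix)) for p in prefixes}


ids = {
    "aa316b86-8169-11e6-bab9-0050563e2d7c",
    "aa3174f0-8169-11e6-bab9-0050563e2d7c",
    "aa319408-8169-11e6-bab9-0050563e2d7c",
}

suffix, prefixes = pack_ids(ids)
print(hex(suffix), [hex(p) for p in prefixes])
print(unpack_ids(suffix, prefixes) == ids)
```

Whether this actually saves memory depends on how the prefixes are stored (a 
list of int objects, an array.array, ...), so measure with sys.getsizeof() on 
real data before committing to it.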



