Better dict of dicts

DillonCo dillonco at comcast.net
Thu Apr 19 18:10:19 EDT 2007


On Thursday 19 April 2007, Bill Jackson wrote:
> I have a dictionary of dictionaries where the keys are typically very
> long tuples and repeated in each inner dictionary.  The dictionary
> representation is nice because it handles sparseness well...and it is
> nice to be able to look up values based on a string rather than a
> number.  However, since my keys are quite long, I worry that I am
> wasting a lot of memory.  I'm looking for better data structures.

I think you may want to look into the rarely used function "intern"
(under "Non-essential Built-in Functions").

Basically, Python keeps a cache of certain strings that are frequently used, so 
equal interned strings share a single object, and comparisons and dictionary 
lookups can often be settled with a pointer comparison.  You could then 
subclass dict (though using "DictMixin" could be better) like:

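from UserDict import DictMixin  # Python 2 module

class IDict(DictMixin):
    def __init__(self):
        self._dict = {}  # backing store for the mapping
    def __setitem__(self, key, value):
        key = intern(key)  # intern() accepts only str keys
        self._dict[key] = value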

That's totally untested and incomplete, but you hopefully get the idea.
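
For anyone reading this on Python 3, here is a slightly fuller sketch of the same idea. It assumes Python 3, where intern() lives at sys.intern and collections.abc.MutableMapping plays the role DictMixin used to. The class name InternDict and the helper _canon are made up for illustration. Note that sys.intern only accepts str, so non-string keys (tuples included) are passed through unchanged here.

```python
import sys
from collections.abc import MutableMapping


class InternDict(MutableMapping):
    """Dict-like mapping that interns string keys before storing them."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        self.update(*args, **kwargs)

    @staticmethod
    def _canon(key):
        # sys.intern accepts only exact str objects; other keys pass through.
        return sys.intern(key) if type(key) is str else key

    def __setitem__(self, key, value):
        self._data[self._canon(key)] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Two equal strings built at runtime, so they start as separate objects.
k1 = "".join(["alpha", "-", "beta"])
k2 = "".join(["alpha", "-", "beta"])

outer = {"x": InternDict(), "y": InternDict()}
outer["x"][k1] = 1
outer["y"][k2] = 2

kx = next(iter(outer["x"]))
ky = next(iter(outer["y"]))
print(kx is ky)  # True: both inner dicts hold the same interned key object
```

The point is that each inner dict ends up referencing one shared key object instead of its own copy, which is where any memory saving would come from. Tuple keys would need some other canonicalization step, since they cannot be interned this way.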

Python (or at least CPython) also seems to intern some strings automatically 
(you could look at the source if you care about the exact rules).
Example:
>>> a="1234567890"
>>> b="1234567890"
>>> a is b
True

So you don't have all that much to worry about.
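
One caveat worth checking for yourself: as far as I can tell the automatic interning applies to certain literals, not to strings assembled at runtime. A small sketch, assuming Python 3 (sys.intern) and CPython; the variable names are just for illustration:

```python
import sys

# Build two equal strings at runtime rather than writing them as literals.
a = "".join(["key", "_", "1234567890"])
b = "".join(["key", "_", "1234567890"])

print(a == b)  # True: same contents
print(a is b)  # usually False in CPython: two separate objects

# Explicit interning maps both to one canonical object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True
```

So if the keys are read from a file or built up piece by piece, calling intern explicitly is probably still worthwhile.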



