dictionary of dictionaries

Mon Dec 10 04:58:54 EST 2007

kettle wrote:

> On Dec 9, 5:49 pm, Marc 'BlackJack' Rintsch <bj_... at gmx.net> wrote:
>> On Sun, 09 Dec 2007 00:35:18 -0800, kettle wrote:
>> > Hi,
>> >  I'm wondering what the best practice is for creating an extensible
>> > dictionary-of-dictionaries in python?
>>
>> >  In perl I would just do something like:
>>
>> > my %hash_of_hashes;
>> > for(my $i=0;$i<10;$i++){
>> >     for(my $j=0;$j<10;$j++){
>> >    ${$hash_of_hashes{$i}}{$j} = int(rand(10));
>> >     }
>> > }
>>
>> > but it seems to be more hassle to replicate this in python.  I've
>> > found a couple of references around the web but they seem cumbersome.
>> > I'd like something compact.
>>
>> Use `collections.defaultdict`:
>>
>> from collections import defaultdict
>> from random import randint
>>
>> data = defaultdict(dict)
>> for i in xrange(11):
>>     for j in xrange(11):
>>         data[i][j] = randint(0, 10)
>>
>> If the keys `i` and `j` are not "independent" you might use a "flat"
>> dictionary with a tuple of both as keys:
>>
>> data = dict(((i, j), randint(0, 10)) for i in xrange(11) for j in xrange(11))
>>
>> And just for completeness: The given data in the example can be stored in a
>> list of lists of course:
>>
>> data = [[randint(0, 10) for dummy in xrange(11)] for dummy in xrange(11)]
>>
>> Ciao,
>>         Marc 'BlackJack' Rintsch
> 
> Thanks for the heads up.  Indeed it's just as nice as perl.  One more
> question though, this defaultdict seems to only work with python2.5+
> in the case of python < 2.5 it seems I have to do something like:
> #!/usr/bin/python
> from random import randint
> 
> dict_dict = {}
> for x in xrange(10):
>     for y in xrange(10):
>         r = randint(0,10)
>         try:
>             dict_dict[x][y] = r
>         except:
>             if x in dict_dict:
>                 dict_dict[x][y] = r
>             else:
>                 dict_dict[x] = {}
>                 dict_dict[x][y] = r

You can clean that up a bit. The inner dict for each x can be built in
one expression, so no try/except is needed:

from random import randrange

dict_dict = {}
for x in xrange(10):
    dict_dict[x] = dict((y, randrange(11)) for y in xrange(10))

> what I really want to / need to be able to do is autoincrement the
> values when I hit another word.  Again in perl I'd just do something
> like:
> 
> my %my_hash;
> while(<FILE>){
>   chomp;
>   @_ = split(/\s+/);
>   grep{$my_hash{$_}++} @_;
> }
> 
> and this generalizes transparently to a hash of hashes or hash of a
> hash of hashes etc.  In python < 2.5 this seems to require something
> like:
> 
> for line in file:
>   words = line.split()
>   for word in words:
>     my_dict[word] = 1 + my_dict.get(word, 0)
> 
> which I guess I can generalize to a dict of dicts but it seems it will
> require more if/else statements to check whether or not the higher-
> level keys exist.  I guess the real answer is that I should just
> migrate to python2.5...!

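Well, there's also dict.setdefault(), which returns the value stored
under a key, inserting the given default first if the key is missing: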

>>> pairs = ["ab", "ab", "ac", "bc"]
>>> outer = {}
>>> for a, b in pairs:
...     inner = outer.setdefault(a, {})
...     inner[b] = inner.get(b, 0) + 1
... 
>>> outer
{'a': {'c': 1, 'b': 2}, 'b': {'c': 1}}
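
The same setdefault() idea stretches to deeper nesting, which is what the
"hash of a hash of hashes" question is after. Here is a minimal sketch,
assuming a helper I'm calling increment() (a made-up name, not a standard
function) that walks a tuple of keys and bumps the counter at the end. It
sticks to plain dict methods, so it should not need defaultdict:

```python
def increment(nested, keys, amount=1):
    """Add `amount` to the counter stored under the key path `keys`.

    `increment` is a hypothetical helper, not part of the standard library.
    """
    node = nested
    # Walk (and create where missing) every level except the last one.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    last = keys[-1]
    node[last] = node.get(last, 0) + amount


counts = {}
for path in ["abc", "abc", "abd", "xyz"]:
    increment(counts, tuple(path))  # e.g. ('a', 'b', 'c')
```

After the loop, counts holds three levels of dicts with the counters at the
leaves; a one-element key path degenerates to the flat word-count case.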

and it's not hard to write your own simplified defaultdict for counting
(it returns 0 for missing keys without storing them):

>>> class Dict(dict):
...     def __getitem__(self, key):
...         return self.get(key, 0)
... 
>>> d = Dict()
>>> for c in "abbbcdeafgh": d[c] += 1
... 
>>> d
{'a': 2, 'c': 1, 'b': 3, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'h': 1}
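
The Dict above only covers flat counters. For something closer to Perl's
autovivification, here is a sketch, assuming a class I'm calling AutoDict
(a made-up name) that creates and stores a nested instance of itself
whenever a missing key is looked up. It avoids defaultdict, so it is meant
as one possible approach for older Pythons rather than the canonical one:

```python
class AutoDict(dict):
    """dict that creates nested AutoDicts on first access (hypothetical name)."""

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # Missing key: store a fresh nested AutoDict and return it.
            value = self[key] = type(self)()
            return value


counts = AutoDict()
for a, b in ["ab", "ab", "ac", "bc"]:
    inner = counts[a]               # inner dict is created on first access
    inner[b] = inner.get(b, 0) + 1  # .get() does not go through __getitem__
```

The catch is that merely reading a missing key creates it, so test for
membership with `in` or .get() rather than with a subscript.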

Peter