dictionary of dictionaries

kettle Josef.Robert.Novak at gmail.com
Tue Dec 11 02:51:00 EST 2007


On Dec 10, 6:58 pm, Peter Otten <__pete... at web.de> wrote:
> kettle wrote:
> > On Dec 9, 5:49 pm, Marc 'BlackJack' Rintsch <bj_... at gmx.net> wrote:
> >> On Sun, 09 Dec 2007 00:35:18 -0800, kettle wrote:
> >> > Hi,
> >> >  I'm wondering what the best practice is for creating an extensible
> >> > dictionary-of-dictionaries in python?
>
> >> >  In perl I would just do something like:
>
> >> > my %hash_of_hashes;
> >> > for(my $i=0;$i<10;$i++){
> >> >     for(my $j=0;$j<10;$j++){
> >> >         ${$hash_of_hashes{$i}}{$j} = int(rand(10));
> >> >     }
> >> > }
>
> >> > but it seems to be more hassle to replicate this in python.  I've
> >> > found a couple of references around the web but they seem cumbersome.
> >> > I'd like something compact.
>
> >> Use `collections.defaultdict`:
>
> >> from collections import defaultdict
> >> from random import randint
>
> >> data = defaultdict(dict)
> >> for i in xrange(11):
> >>     for j in xrange(11):
> >>         data[i][j] = randint(0, 10)
>
> >> If the keys `i` and `j` are not "independent" you might use a "flat"
> >> dictionary with a tuple of both as keys:
>
> >> data = dict(((i, j), randint(0, 10)) for i in xrange(11) for j in xrange(11))
>
> >> And just for completeness: The given data in the example can be stored in a
> >> list of lists of course:
>
> >> data = [[randint(0, 10) for dummy in xrange(11)] for dummy in xrange(11)]
>
> >> Ciao,
> >>         Marc 'BlackJack' Rintsch
>
> > Thanks for the heads up.  Indeed it's just as nice as perl.  One more
> > question though: this defaultdict seems to work only with python 2.5+.
> > With python < 2.5 it seems I have to do something like:
> > #!/usr/bin/python
> > from random import randint
>
> > dict_dict = {}
> > for x in xrange(10):
> >     for y in xrange(10):
> >         r = randint(0,10)
> >         try:
> >             dict_dict[x][y] = r
> >         except KeyError:
> >             if x in dict_dict:
> >                 dict_dict[x][y] = r
> >             else:
> >                 dict_dict[x] = {}
> >                 dict_dict[x][y] = r
>
> You can clean that up a bit:
>
> from random import randrange
>
> dict_dict = {}
> for x in xrange(10):
>     dict_dict[x] = dict((y, randrange(11)) for y in xrange(10))
>
>
>
> > what I really want to / need to be able to do is autoincrement the
> > values when I hit another word.  Again in perl I'd just do something
> > like:
>
> > my %my_hash;
> > while(<FILE>){
> >   chomp;
> >   @_ = split(/\s+/);
> >   grep{$my_hash{$_}++} @_;
> > }
>
> > and this generalizes transparently to a hash of hashes or hash of a
> > hash of hashes etc.  In python < 2.5 this seems to require something
> > like:
>
> > for line in file:
> >   words = line.split()
> >   for word in words:
> >     my_dict[word] = 1 + my_dict.get(word, 0)
>
> > which I guess I can generalize to a dict of dicts but it seems it will
> > require more if/else statements to check whether or not the higher-
> > level keys exist.  I guess the real answer is that I should just
> > migrate to python2.5...!
>
> Well, there's also dict.setdefault()
>
> >>> pairs = ["ab", "ab", "ac", "bc"]
> >>> outer = {}
> >>> for a, b in pairs:
> ...     inner = outer.setdefault(a, {})
> ...     inner[b] = inner.get(b, 0) + 1
> ...
> >>> outer
> {'a': {'c': 1, 'b': 2}, 'b': {'c': 1}}
>
> and it's not hard to write your own defaultdict
>
> >>> class Dict(dict):
> ...     def __getitem__(self, key):
> ...         return self.get(key, 0)
> ...
> >>> d = Dict()
> >>> for c in "abbbcdeafgh": d[c] += 1
> ...
> >>> d
> {'a': 2, 'c': 1, 'b': 3, 'e': 1, 'd': 1, 'g': 1, 'f': 1, 'h': 1}
>
> Peter
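
For comparison, the pair-counting loop quoted above can also be written with nested defaultdicts, which is roughly how the perl hash-of-hashes idiom carries over. This is only a minimal sketch, assuming Python 3 (so it differs from the Python 2 snippets in the thread); the function name count_pairs is made up for illustration.

```python
from collections import defaultdict

def count_pairs(pairs):
    """Count (a, b) occurrences in a two-level dict without explicit key checks."""
    # The outer dict creates an inner defaultdict(int) on first access;
    # the inner dict starts every missing counter at 0.
    outer = defaultdict(lambda: defaultdict(int))
    for a, b in pairs:
        outer[a][b] += 1
    # Convert to plain dicts so later lookups of missing keys raise KeyError again.
    return {a: dict(inner) for a, inner in outer.items()}

counts = count_pairs(["ab", "ab", "ac", "bc"])
print(counts)
```

Converting back to plain dicts at the end is a design choice, not a requirement: it keeps the auto-creating behavior local to the counting step.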

One last question.  I've heard the 'explicit vs. implicit' argument,
but this seems to boil down to typical use cases and what most people
'expect' as default behavior.  The home-made defaultdict above, which
overrides __getitem__, seems more generally useful than the built-in
behavior.  What is the reasoning behind NOT making this the default
behavior of a dict in python?
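
To make the behavioral difference in question concrete, here is a minimal sketch, assuming Python 3. It is not offered as the official rationale, only as an illustration of what changes when missing keys read as 0. ZeroDict and lookup are hypothetical names; ZeroDict follows the same idea as the Dict class quoted above.

```python
class ZeroDict(dict):
    """Hypothetical name; same idea as the Dict class quoted above."""
    def __getitem__(self, key):
        # Missing keys read as 0 instead of raising KeyError.
        return self.get(key, 0)

def lookup(mapping, key):
    """Return mapping[key], or the string 'KeyError' if the lookup fails."""
    try:
        return mapping[key]
    except KeyError:
        return "KeyError"

stock = {"apple": 3}
plain_result = lookup(stock, "aple")              # misspelled key is reported
silent_result = lookup(ZeroDict(stock), "aple")   # misspelled key looks like a real 0

print(plain_result, silent_result)
```

With the plain dict the misspelled key surfaces as an error, while the defaulting dict returns a value that cannot be told apart from a stored zero.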


