Python version of STL multimap?

Alex Martelli aleax at aleax.it
Sat Jul 6 17:27:14 EDT 2002


Roy Smith wrote:
        ...
> multimap container template.  Exactly what I've been doing in Python
> without knowing the right name for it.

Not quite -- multimap is a sorted container, not a hashed one.

> So, the question is, is there a standard way in Python to do multimaps,
> or should I just continue to roll my own with one of the above idioms
> whenever I need one?

There's a recipe in the Cookbook about this, but I'm not sure it's
entirely usable or complete in the online form.

Summarizing: there are a few optimal choices depending on what
exactly you want.  For many purpose the ideal form is a dict
of dicts -- the value being either True or False, if you want
items to be "in" the data structure either once or not at all
(set-like behavior) or a count of occurrences if you want items
to be "in" the data structure a number of times (bag-like behavior).

Using the set-like behavior for definiteness, and Python 2.2:

Adding an item under a key (whether or not it was already there):
    s.setdefault(key, {})[item] = True
removing an item from a key (whether or not it was previously there):
    s.get(key, {})[item] = False
checking if an item exists under a key:
    s.get(key, {}).get(item, False)
iterating on all items of a key, in arbitrary order:
    for item in s.get(key, {}):
        whatever(item)

Basically, as you see, just about everything relies on the setdefault and
get items of dictionaries.  s.get(key, X) and s.setdefault(key, X)
return the same value (s[key] if key in s, otherwise X) but X also
sets s[key] to X if previously key wasn't in s.

This representation may leave False entries as values of s[key][item]
and indeed entire, useless dicts filled with False entries as values
of some s[key]'s.  That's OK in terms of behavior but may waste too
much memory for a long-running application.  So you may want to
periodically recompact your data structure s at otherwise idle times:

for key in s.keys():
    for item in s[key].keys():
        if not s[key][item]: del s[key][item]
    if not s[key]: del s[key]

If such periodic "recompacting" is a problem then you may want to
choose other idioms for "removing an item from a key", such as:

try: del s[key][item]
except KeyError: pass

This never stores a False value explicitly and only leaves you
with the possibility of empty sub-dict's at some s[key] (which you
may still want to re-compact -- either at each such deletion, or
periodically as above).  For immediate self-compaction you could use:

try: 
    del s[key][item]
    if not s[key]: del s[key]
except KeyError: pass

However, these idioms are slower than the previously recommended
ones in some common cases.  If your performance is critical, try
benchmarking with the various possibilities.


Alex




More information about the Python-list mailing list