Hash map with multiple keys per value ?

Alex Martelli aleax at mail.comcast.net
Sat Nov 12 13:47:47 EST 2005


Chris Stiles <b9a9-3arx at spamex.com> wrote:

> "snoe" <case.nelson at gmail.com> writes:
> > Are you looking for this type of thing?
> >
> > class Test:
> >     value = 900
> 
> No, what I'm trying to do is this, assume two sets of aliases:
> 
> a, b, c, d and e
> 
> x, y and z
> 
> a returns b c d and e, b returns a c d and e, x returns y and z, y returns x
> and z etc.
> 
> Where any of the sets of aliases can grow or shrink at any time.

So, it seems a reasonable underlying representation would be a mapping
from item to set of items.  Keeping N different sets for a set of N
aliases would appear a bit wasteful of memory (it's hard to say without
knowing usecases and performance desiderata...), so you might have all
aliases map to the same set of aliases and dynamically construct the
required result by subtracting the lookup key itself.

Another design issue is, however, how do you identify, when adding an
alias, what it's an alias of -- from previous posts I'll assume that all
aliases are 'names' for some (hopefully hashable) object, so we'll also
need a mapping from said object back to its set of aliases.  Or, would
alias insertion just be of the form "x aliases to y" rather than "x
names object XXX"?

Also not clarified in terms of specs: when an alias is inserted, do you
need to check if it was already an alias for something else (in another
set) and [a] raise an exception or [b] silently remove it from the
previous set?  That applies to the "x names object XXX" style but not
necessarily to the "x aliases y" style, unless in the latter case you
want to say "...replacing any previous aliases of x" to change rather
than merge alias-sets.  But it would then seem simpler to simply have a
separate removal function "x doesn't alias to anything any longer", to
be optionally called before the insertion method.

So, I think there are enough questions open about the specs to warrant
more exploration.  Under some given set of interpretations for the above
open questions, you might have, e.g. (untested code, and no special
attention paid to performance):

class Aliases(object):
  def __init__(self):
     self.d = {}
  def lookup(self, x):
     if x in self.d:
       return self.d[x]-set([x])
     else:
       return set()
  def alias(self, x, y):
     dx = self.d.get(x,set([x]))
     dy = self.d.get(y,set([y]))
     self.d[x]=self.d[y]=dx+dy
  def unalias(self, x):
      if x in self.d:
          self.d[x].remove(x)
          if not self.d[x]: del self.d[x]

Perhaps by working on this concrete example (which likely has bugs,
being untested, and even more likely misinterpret some specs, since I'm
having to guess at quite a few details) you can nail down the exact kind
of API syntax and semantics that you require...?  Once the specs are
entirely clear, we might (if needed) worry about efficiency, but that
seems somewhat premature at this stage...


Alex



More information about the Python-list mailing list