An ordered dictionary for the Python library?

Paul Rubin http
Thu Sep 13 16:14:14 EDT 2007


Mark Summerfield <m.n.summerfield at googlemail.com> writes:
> > Yes.  It should use a functional data structure.
> Could you elaborate?

I mean use a data structure like an AVL tree, that can make a new dict
object quickly, whose contents are the same as the old object except
for one element (i.e. most of the contents are shared with the old
dict).  You should also have ordered versions of both lists and sets.

Consider the situation with lists:

   a = some 1000 element list
   b = a + [23]

Now a has 1000 elements and b has 1001 elements (23 followed by a's 
elements), but constructing b required copying all of a's elements.

It's sort of the same with sets:

   a = 1000 element set
   b = a + set((23,))

b had to copy the contents of a.  By using a tree structure, you can
construct b in O(log(1000)) operations so that most of the elements
are shared with a, while at the same time you don't mutate a.  That
makes it a lot easier to program in a style where you have a bunch of
slightly different versions of some set or dict flying around.  For
an example of where this came up, see this thread:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/78b5953488d772e9/82f701c302122777

For sample implemetations and API's, you could look at the Hedgehog
Lisp version of AVL trees (including a Python dict-like interface):

   http://hedgehog.oliotalo.fi/

and Haskell's Data.Map library:

   http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Map.html

I think there is something similar in the ML Basis library but I'm not
familiar with that.

For more about functional data structures, see some of Chris Okasaki's
articles:

  http://www.eecs.usma.edu/webs/people/okasaki/pubs.html

Dr. Okasaki also has written a book about the topic, called "Purely
Functional Data Structures", which is very good.

That said, in Python maybe this stuff would be easier to use
improperly because Python coding style uses mutation so much.  It
might be best to limit sharing to the case of frozen dicts and frozen
sets.



More information about the Python-list mailing list