[Python-ideas] ordered dict

Talin talin at acm.org
Fri Apr 20 21:50:39 CEST 2007


Josiah Carlson wrote:
> Mathias Panzenböck <grosser.meister.morti at gmx.net> wrote:
>> Some kind of ordered dictionary would be nice to have in the
>> standard library. e.g. a AVL tree or something like that.
>> It would be nice so we can do things like that:
>>
>> for value in tree[:end_key]:
>> 	do_something_with(value)
>>
>> del tree[:end_key]
>>
>>
>> A alternative would be just to sort the keys of a dict but
>> that's O( n log n ) for each sort. Depending on what's the more
>> often occurring case (lookup, insert, get key-range, etc.) a
>> other kind of dict object would make sense.
>>
>> What do you think?
> 
> This has been brought up many times.  The general consensus has been
> 'you don't get what you think you get'.
> 
>     >>> u'a' < 'b' < () < u'a'
>     True
> 
> That is to say, there isn't a total ordering on objects that would make
> sense as a sorted key,value dictionary.  In Python 3.0, objects that
> don't make sense to compare won't be comparable, so list.sort() and/or
> an AVL tree may make sense again.
> 
> However, the problem with choosing to use an AVL (Red-Black, 2-3, etc.)
> tree is deciding semantics.  Do you allow duplicate keys?  Do you allow
> insertion and removal by position?  Do you allow the fetching of the
> key/value at position X?  Do you allow the fetching of the position for
> key X?  Insertion before/after (bisect_left, bisect_right equivalents).
> Etcetera.

I generally agree. I also think that the term "ordered dictionary" ought 
to be avoided.

One the one hand, I have no particular objection to someone creating an 
implementation of RB trees, B+-trees, PATRICIA radix trees and so on - 
in fact, these might be very useful things to have as standard 
collection classes.

However, 'dict' has a whole set of semantic baggage that goes along with 
it that may or may not apply to these other container types; And 
similarly, these other container types may have operations and semantics 
that don't correspond to the standard Python dictionary. One expects to 
be able to do certain things with an RB-tree that are either disallowed 
or very inefficient with a regular dict, and the converse is true as 
well. You give a number of examples such as fetching the position for a 
given key.

So my feeling is - let trees be trees, and dicts be dicts, and don't 
attempt to conflate the two. Otherwise, you end up with what I like to 
call the "overfactoring" anti-pattern - that is, attempt to generalize 
and unify two disparate systems that have different purposes and design 
intents into a single uniform interface.

-- Talin



More information about the Python-ideas mailing list