sorteddict [was a PEP proposal, but isn't anymore!]

Duncan Booth duncan.booth at invalid.invalid
Sat Sep 29 10:23:06 EDT 2007


Antoon Pardon <apardon at forel.vub.ac.be> wrote:

> On 2007-09-27, gatti at dsdata.it <gatti at dsdata.it> wrote:
> 
>> Is this a practical use case? When are sequential visits of all
>> elements in order frequently suspended to make insertions and
>> deletions, with a need for efficient lookup by key?
> 
> Does it need to be a sequential visit of *all* elements?
> 
> Suppose you have a mapping of start times to tasks. You can then want
> to iterate over all tasks that need to be started between noon en 4 pm
> next monday.  If you have a hashtable you still will need to sort all
> the keys even if you will visit only 10%. If you have a tree you can
> just visit the specified keys.
> 
It still depends on the exact pattern of use. If you have an implementation 
which tracks additions and deletions and sorts them into the list when 
required (as we came up with earlier) then this is much more efficient that 
re-sorting the whole list every time. Sorting a large sorted list with a 
few unsorted elements on the end is comparable to inserting the elements 
individually into a tree, and you still have the hashtable benefits on 
accessing elements to help level the playing field.

For me though, the most convincing use-case for a sorted dictionary is one 
that I don't think has been mentioned yet:
There are situations when you want to use a dictionary with existing 
library code that doesn't care about the random key ordering, but you have 
additional requirements which the original code didn't know about. 

For example, in the sorteddict code I added an implementation for the 
__repr__ method and an associated doctest. Unlike iteration over sorteddict 
itself, I didn't bother to make __repr__ stable, so in that particular 
doctest it only tests the repr of a sorteddict with a single element. 
If that was fixed to make the repr stable it would be a real benefit for 
writing any doctests which want to produce a dictionary as a result.

Another example would be if you had a library which serialised a dictionary 
to xml. There is nothing wrong with the library if it doesn't care about 
order, but if you have some other reason why you want the xml to be stable 
(e.g. because you store it in a version control system and want to compare 
revisions) then a sorteddict would allow you to impose that behaviour on 
the library from outside.

Contrary to my earlier insistence that sorteddict is only really useful if 
you can have a key parameter, both of these examples simply want an 
arbitrary but defined order of iteration for dictionary keys. A much 
simpler sorteddict that has been discussed earlier would be sufficient.



More information about the Python-list mailing list