sorteddict PEP proposal [started off as orderedict]
Duncan Booth
duncan.booth at invalid.invalid
Wed Sep 26 04:31:15 EDT 2007
Mark Summerfield <m.n.summerfield at googlemail.com> wrote:
> As you've no doubt realised, this particular implementation gives best
> performance when the pattern of use is: lots of edits, lots of
> lookups, ..., and gives worst performance when the pattern of use is:
> edit, lookup, edit, lookup (in which case using a dict and sorted() is
> probably better).
>
> So there is lots of scope for someone to do a version that has good
> performance for all patterns of use:-)
I that's the point though: you can't write one implementation that has good
performance for all patterns of use: you have to either compromise on
performance somewhere, or use an implementation specifically matched to
your use case.
For example, if the use case was some updates followed by iterating over
the first few keys only, it may be faster to use a heap for iteration.
I suspect that for many use patterns you could improve performance
significantly by having two levels of invalidation for the the key list: in
__setitem__, if the key already exists you don't need to discard the list
in __keycache (although if a key or cmp function was provided it may no
longer be sorted). In that case it might be worthwhile keeping __keycache
but flagging it as needing to be sorted again next time it is used. Given
that Python's sort is very fast on sorted or nearly sorted lists this could
provide a worthwhile speedup for cases where the values change but the keys
don't change often.
More information about the Python-list
mailing list