should python have a sort list-map object in the std-lib?

Alex Martelli aleax at mail.comcast.net
Mon Nov 28 10:46:45 EST 2005


Tim Henderson <tim.tadh at gmail.com> wrote:

> ahh, i hadn't thought of using a proirity queue but that is the correct
> solution most of the time, except i suppose when you have input that
> causes you to excessively reheap which could be problematic.

The worst case is still O(N logN) for heap as well as timsort.  But
since timsort can take advantage of order or reverse order already
present in the sequence, it's surely possible to find real use cases in
which one sort at the end of all insertions is O(N) and thus much
faster.  Nevertheless, that also depends on having a lot of insertions
before you need any sorting; if you need to "walk sorted" after each
insertion or thereabouts, I would guess heap would be faster again.


> i may still look into writing a general sorted map though, it could be
> useful especially if there were and easy way to set the type of
> functionality required with out resorting to several different types of
> sorted-maps.

You can record a callable equivalent to sort's key= once and for all at
the creation of the "sorted map".  heapq's functions don't directly
support that in 2.4 (they will in 2.5) but it's easy to implement via
explicit key-extraction upon insertion (once known as the D step in the
DSU, decorate-sort-undecorate, idiom).  It's unclear whether you want to
be able to key the sorting off keys only, or keys AND values: the latter
is more general but also takes more work and complication.

As for "different types", if I catch your drift...:

I would suggest avoiding the temptation to overload the type with a
bazillion options-flags requiring deeply different behavior, e.g.
copying keys and/or values versus just taking references to either or
both, or either requiring or foregoing hashable keys (with different
implementations).  If such differences are warranted by use cases, it's
better to have several different types than one complicated one.  I
would personally suggest mimicking dict's semantic: require hashable
keys, make no copies.


Alex



More information about the Python-list mailing list