Balanced trees

Dan Stromberg drsalists at gmail.com
Thu Mar 13 23:12:41 EDT 2014


On Thu, Mar 13, 2014 at 4:57 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> On Mon, 10 Mar 2014 19:34:48 +0200, Marko Rauhamaa wrote:
>
>>> With a high level language like Python, using the provided hash table
>>> will almost always cream any hand-built tree, no matter how
>>> advantageous the data is to the tree.
>>
>> The main thing is there are use cases where order is essential. That's
>> why I have had to implement the AVL tree in Python myself.
>
> from collections import OrderedDict
>
> gives you a fast, ordered mapping. In this case, the order is that of
> insertion order. If you want sort order instead, for "small" amounts of
> data, say below a million or ten million items, you're likely to have
> acceptable if not superior results by just sorting them items when and as
> needed.

This is one of my pet things.

Sorting the same (slightly tweaked) data inside of a tight loop is
rarely a good idea - despite the fact that the "sort" itself tends to
be O(n) with Python's rather awesome builtin sorting algorithm.  This
is because sorting inside a loop tends to yield O(n^2) algorithms, or
even O((n^2)*logn).

But if you're sorting once at the end of whatever other processing,
sorting is great.  It's (still) O(nlogn), but it's got a terrific
constant.



More information about the Python-list mailing list