Balanced trees

Tue Mar 18 18:21:28 EDT 2014

On Tue, Mar 18, 2014 at 3:03 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
> Dan Stromberg <drsalists at gmail.com>:
> For a proper comparison, I'd like a fixed, identical dataset and set of
> operations run against each data structure.
>
> How about this test program:

I used to do essentially this, but it was time-prohibitive and
produced harder-to-read graphs - harder to read because the enormous
values of the bad trees were dwarfing the values of the good trees.

Imagine doing 100000000 operation tests for the unbalanced binary
tree. For a series of random keys, it would do quite well (probably
second only to dict), but for a series of sequential keys it would
take longer than anyone would reasonably want to wait because it's
basically a storage-inefficient linked list.

Rather than throw out unbalanced binary tree altogether, it makes more
sense to run it until it gets "too slow".

The workload+interpreter pairs are all tested the same way, it's just
that the ones that are doing badly are thrown out before they're able
to get a lot worse. Studying the graphs will likely help develop an
intuition for what's happening.