Balanced trees
Roy Smith
roy at panix.com
Mon Mar 10 22:13:54 EDT 2014
In article <mailman.8031.1394499924.18130.python-list at python.org>,
Dan Stromberg <drsalists at gmail.com> wrote:
> On Mon, Mar 10, 2014 at 6:59 AM, Roy Smith <roy at panix.com> wrote:
> > On the other hand, log n, for n = 1 million, is just 20. It's not hard
> > to imagine a hash function which costs 20x what a node traversal does,
> > in which case, the log n lookup is ahead for all n < 1 million.
>
> FWIW, both the hash table and the tree will have constants. So a tree
> would be c*20 in its most significant term, and the hash table would
> be d*1 in its. The real-world performance depends quite a bit on
> those constants at small values of n. I don't really consider a
> million all that big, but the meaning of "big" of course depends.
Well, the largest disk volume I can configure in AWS is 1 TB. So, I
guess we can take that to be "big".
Assuming 1-character strings, and no overhead (both strange assumptions,
but it makes the math easier), that's 10^12 nodes in our binary tree.
That's still only 40 layers deep. Log n is your friend.
More information about the Python-list
mailing list