Balanced trees

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Mar 10 22:02:51 EDT 2014


On Mon, 10 Mar 2014 09:01:23 -0700, Rustom Mody wrote:

> 2. Being pointed out that a finite-input table-lookup being called a
> hash-function is a rather nonsensical claim and goes counter to the
> basic tenets of asymptotic notation. (In CS unlike in math 'asymptote'
> is always infinity) IOW

That's backwards. In maths "asymptote" always implies infinity. Within 
any finite range, there must be a minimum distance that a line approaches 
a curve, beyond which it gets no closer. But that's not an asymptote: an 
asymptote requires that the line approaches the curve arbitrarily 
closely, which requires taking the limit approaching infinity.

But in computer science, while it may be possible to ignore all real-
world considerations and imagine what happens when the size of your list 
approaches infinity, that's not terribly common or useful. In reality, 
all lists and hash tables contain only a finite number of slots, many 
data types only have a finite number of possible keys, and beyond a 
certain point the Big Oh analysis breaks down. Computer scientists are 
not so foolish as to not understand this.

Big Oh analysis doesn't require behaviour as the input approaches 
infinity, but rather as the input exceeds a certain (usually unstated) 
size. "For large enough N, this function scales as O(N**2)" is a typical 
conclusion. Not "As N approaches infinity, this function scales as 
O(N**2)".

Many data types used as keys only have a fixed number of possible values 
(256 bytes, 65536 short ints, etc.) and even those with no fixed limit 
still have a practical finite limit. There's only so much matter in the 
universe, so talking about limits as the amount of data approaches 
infinity is nonsense. Where would you store it?

My example may have been extreme in its simplicity, but if you find 
yourself in the lucky circumstance that you can afford a slot for every 
possible key, and have a perfect hash function that guarantees no 
collisions, then you will have the same conclusion: guaranteed O(1) 
lookups, insertions and deletions.

http://en.wikipedia.org/wiki/Perfect_hash_function




-- 
Steven D'Aprano
http://import-that.dreamwidth.org/



More information about the Python-list mailing list