Balanced trees
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Mon Mar 10 22:02:51 EDT 2014
On Mon, 10 Mar 2014 09:01:23 -0700, Rustom Mody wrote:
> 2. Being pointed out that a finite-input table-lookup being called a
> hash-function is a rather nonsensical claim and goes counter to the
> basic tenets of asymptotic notation. (In CS unlike in math 'asymptote'
> is always infinity) IOW
That's backwards. In maths "asymptote" always implies infinity. Within
any finite range, there must be a minimum distance that a line approaches
a curve, beyond which it gets no closer. But that's not an asymptote: an
asymptote requires that the line approaches the curve arbitrarily
closely, which requires taking the limit approaching infinity.
But in computer science, while it may be possible to ignore all real-
world considerations and imagine what happens when the size of your list
approaches infinity, that's not terribly common or useful. In reality,
all lists and hash tables contain only a finite number of slots, many
data types only have a finite number of possible keys, and beyond a
certain point the Big Oh analysis breaks down. Computer scientists are
not so foolish as to not understand this.
Big Oh analysis doesn't require behaviour as the input approaches
infinity, but rather as the input exceeds a certain (usually unstated)
size. "For large enough N, this function scales as O(N**2)" is a typical
conclusion. Not "As N approaches infinity, this function scales as
O(N**2)".
Many data types used as keys only have a fixed number of possible values
(256 bytes, 65536 short ints, etc.) and even those with no fixed limit
still have a practical finite limit. There's only so much matter in the
universe, so talking about limits as the amount of data approaches
infinity is nonsense. Where would you store it?
My example may have been extreme in its simplicity, but if you find
yourself in the lucky circumstance that you can afford a slot for every
possible key, and have a perfect hash function that guarantees no
collisions, then you will have the same conclusion: guaranteed O(1)
lookups, insertions and deletions.
http://en.wikipedia.org/wiki/Perfect_hash_function
--
Steven D'Aprano
http://import-that.dreamwidth.org/
More information about the Python-list
mailing list