Newbie: finding the key/index of the min/max element

Armin Steinhoff a-steinhoff at web.de
Sun May 5 12:54:24 EDT 2002


Tim Peters <tim.one at comcast.net> wrote in message news:<mailman.1020590831.7588.python-list at python.org>...
> [David Eppstein]
> > ...
> 

Have someone tried to speed it up wit Pyrex ??

Armin

> 
> # Find the rank'th-smallest value in a, in worst-case quadratic time.
> def short_find(a, rank):
>     a.sort()
>     return a[rank - 1]
> 
> # Find the rank'th-smallest value in a, in worst-case linear time.
> def find(a, rank):
>     n = len(a)
>     assert 1 <= rank <= n
>     if n <= 7:
>         return short_find(a, rank)
> 
>     # Find median of median-of-7's.
>     medians = [short_find(a[i : i+7], 4) for i in xrange(0, n-6, 7)]
>     median = find(medians, (len(medians) + 1) // 2)
> 
>     # Partition around the median.
>     # a[:i]   <= median
>     # a[j+1:] >= median
>     i, j = 0, n-1
>     while i <= j:
>         while a[i] < median:
>             i += 1
>         while a[j] > median:
>             j -= 1
>         if i <= j:
>             a[i], a[j] = a[j], a[i]
>             i += 1
>             j -= 1
> 
>     if rank <= i:
>         return find(a[:i], rank)
>     else:
>         return find(a[i:], rank - i)
> 
> def tryone(n, tries):
>     from random import random
>     from time import clock as now
> 
>     x = [None] * n
>     median_rank = (n+1) // 2
>     sum1 = sum2 = 0.0
>     for i in range(tries):
>         for i in xrange(n):
>             x[i] = random()
>         y = x[:]
> 
>         start = now()
>         got1 = find(x, median_rank)
>         elapsed = now() - start
>         sum1 += elapsed
> 
>         start = now()
>         got2 = short_find(y, median_rank)
>         elapsed = now() - start
>         sum2 += elapsed
> 
>         if got1 != got2:
>             print "OUCH got1 %r got2 %r" % (got1, got2)
> 
>     return sum1, sum2
> 
> def drive(tries):
>     for i in range(23):
>         n = 2**i + 1
>         fast, slow = tryone(n, tries)
>         print "%8d %7.3f %7.3f" % (n, fast/tries, slow/tries)
> 
> if __name__ == "__main__":
>     drive(3)
> 
> 
> PS:  If you're more interested in expected time than worst-case time,
> replacing the median-of-median-of-7s business with
> 
>         median = short_find([a[0], a[-1], a[n//2]], 2)
> 
> yields a Python median-finder that's usually significantly faster than the
> sort method starting in the range of 512K to 1M elements.
> 
> PPS:  If you do care about worst-case time, the Python version can be made
> significantly faster by boosting the median-of-7 gimmick to median-of-k for
> larger odd values of k.  Fun for the whole family:  plot speedups against k
> until you're all baffled <wink>.



More information about the Python-list mailing list