[Python-checkins] python/dist/src/Lib heapq.py,1.22,1.23
rhettinger at users.sourceforge.net
rhettinger at users.sourceforge.net
Sat Jun 12 04:33:38 EDT 2004
Update of /cvsroot/python/python/dist/src/Lib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19530
Modified Files:
heapq.py
Log Message:
Improve the memory performance and speed of heapq.nsmallest() by using
an alternate algorithm when the number of selected items is small
relative to the full iterable.
Index: heapq.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/heapq.py,v
retrieving revision 1.22
retrieving revision 1.23
diff -C2 -d -r1.22 -r1.23
*** heapq.py 10 Jun 2004 05:03:16 -0000 1.22
--- heapq.py 12 Jun 2004 08:33:36 -0000 1.23
***************
*** 131,134 ****
--- 131,135 ----
from itertools import islice, repeat
+ import bisect
def heappush(heap, item):
***************
*** 197,200 ****
--- 198,223 ----
Equivalent to: sorted(iterable)[:n]
"""
+ if hasattr(iterable, '__len__') and n * 10 <= len(iterable):
+ # For smaller values of n, the bisect method is faster than a minheap.
+ # It is also memory efficient, consuming only n elements of space.
+ it = iter(iterable)
+ result = sorted(islice(it, 0, n))
+ if not result:
+ return result
+ insort = bisect.insort
+ pop = result.pop
+ los = result[-1] # los --> Largest of the nsmallest
+ for elem in it:
+ if los <= elem:
+ continue
+ insort(result, elem)
+ pop()
+ los = result[-1]
+ return result
+ # An alternative approach manifests the whole iterable in memory but
+ # saves comparisons by heapifying all at once. Also, saves time
+ # over bisect.insort() which has O(n) data movement time for every
+ # insertion. Finding the n smallest of an m length iterable requires
+ # O(m) + O(n log m) comparisons.
h = list(iterable)
heapify(h)
More information about the Python-checkins
mailing list