[Python-checkins] python/dist/src/Lib heapq.py,1.22,1.23

rhettinger at users.sourceforge.net rhettinger at users.sourceforge.net
Sat Jun 12 04:33:38 EDT 2004


Update of /cvsroot/python/python/dist/src/Lib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19530

Modified Files:
	heapq.py 
Log Message:
Improve the memory performance and speed of heapq.nsmallest() by using
an alternate algorithm when the number of selected items is small 
relative to the full iterable.



Index: heapq.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/heapq.py,v
retrieving revision 1.22
retrieving revision 1.23
diff -C2 -d -r1.22 -r1.23
*** heapq.py	10 Jun 2004 05:03:16 -0000	1.22
--- heapq.py	12 Jun 2004 08:33:36 -0000	1.23
***************
*** 131,134 ****
--- 131,135 ----
  
  from itertools import islice, repeat
+ import bisect
  
  def heappush(heap, item):
***************
*** 197,200 ****
--- 198,223 ----
      Equivalent to:  sorted(iterable)[:n]
      """
+     if hasattr(iterable, '__len__') and n * 10 <= len(iterable):
+         # For smaller values of n, the bisect method is faster than a minheap.
+         # It is also memory efficient, consuming only n elements of space.
+         it = iter(iterable)
+         result = sorted(islice(it, 0, n))
+         if not result:
+             return result
+         insort = bisect.insort
+         pop = result.pop
+         los = result[-1]    # los --> Largest of the nsmallest
+         for elem in it:
+             if los <= elem:
+                 continue
+             insort(result, elem)
+             pop()
+             los = result[-1]
+         return result
+     # An alternative approach manifests the whole iterable in memory but
+     # saves comparisons by heapifying all at once.  Also, saves time
+     # over bisect.insort() which has O(n) data movement time for every
+     # insertion.  Finding the n smallest of an m length iterable requires
+     #    O(m) + O(n log m) comparisons.
      h = list(iterable)
      heapify(h)




More information about the Python-checkins mailing list