What about an EXPLICIT naming scheme for built-ins?

Raymond Hettinger python at rcn.com
Wed Sep 8 22:10:42 EDT 2004


Carlos Ribeiro <carribeiro at gmail.com> wrote in message news:<mailman.2838.1094223946.5135.python-list at python.org>...
> On Sat, 4 Sep 2004 00:15:41 +1000, Andrew Durdin <adurdin at gmail.com> wrote:
> > However, I think it'd be hard to make an iterator implementation of
> > sorted() without either copying the sequence or resorting to O(n^2)
> > performance, which is not good.
> > This may just be an example where inconsistency is a practical necessity.
> 
> You may be right. At the least, this reasoning could possibly be
> included in the FAQ, as it will probably turn out to be a frequent
> question in the near future. But having sorted() to return a iterator
> still sounds like a logical thing to do, so I'm not sure about it.
> 
> So we now have a different question -- does it make sense to have
> sort() to return a iterator? What would be the advantages, and what
> does it take to implement it?
> 
> Today, sorted() returns a brand new sorted sequence. So it already
> takes the full memory and time penalty while doing so. However, **if
> sorted() was implemented as a generator**, it would have some
> advantage in special cases:
> 
> -- sorting long sequences to use only a few of the first elements. For
> example, if you need to sort a long list to get the top 10 elements. A
> generator would yield only those elements, saving a lot of time that
> would go into sorting everything else.

  . . .

All of the needs you describe are adequately met by heaps which
provide just-in-time sorting.

In Py2.4, two new heapq functions, nlargest() and nsmallest(),
specifically address your use case of finding the top or bottom 10
elements while using no more than n elements of memory.

Also in Py2.4, all heapq functions including the two new ones run at C
speed.  Pure python equivalents have been kept for those who like to
dicker with the code.

If you can't live without an itersort, generators make it easy to wrap
an iterator interface around the above.  Since the O(n log n) portion
is done at C speed, the performance will be on par with what you would
expect from a built-in:

    def itersort(arr):
	heapify(a)
	try:
	    while a:
		yield heappop(a)
	except IndexError:
	    pass

With all the real needs having been met, this thread can now return to
naming debates which can be joyfully fought in perpetuity without
resolution.

Cheers,


Raymond Hettinger



More information about the Python-list mailing list