[issue6614] heapq.nsmallest and nlargest should be smarter/more usable/more consistent
Joshua Bronson
report at bugs.python.org
Fri Jul 31 21:22:33 CEST 2009
Joshua Bronson <jabronson at gmail.com> added the comment:
> I prefer the docs the way they are. They help the reader understand
> the relationship between min, max, nsmallest, nlargest, and sorted.
Except that it's no longer true that "when n==1, it is more efficient to use the
builtin min() and max() functions." Shouldn't this be updated to say "when n==1,
it is equivalent to using the builtin min() and max() functions"?
> I'm not sure where you got the n * 10 <= len(iterable) switch-over
> point.
It's right in the file you linked to. Search for "n * 10" in
http://svn.python.org/view/python/branches/release31-maint/Lib/heapq.py?
revision=73579&view=markup.
> That is arbitrary. The correct switchover point depends on the
> cost of the comparison function, whether the length of the input is
> known, whether the input data is partially ordered, memory constraints,
> whether a key function is used, and on other factors.
So should that be removed, then?
> FWIW, I also wrote the logic for random.sample(). The switchover logic
> was straight-forward because performance depended on factors that were
> fully known (length of input, sample size, memory utilization, and
> average number of probes for each strategy).
> One other thought: When memory is tight, the programmer needs to be
> able to select the heap algorithm in favor of sorted() even for
> relatively large values of n. I do not want an automatic switchover
> point that takes away a programmer's choice between speed and space
> efficiency.
Agreed, and thanks for the additional info.
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6614>
_______________________________________
More information about the Python-bugs-list
mailing list