[issue28754] Argument Clinic for bisect.bisect_left

Sat Nov 26 14:00:04 EST 2016

Raymond Hettinger added the comment:

[SS]
> This is rather a random difference. Try to run the bench several times.

I did run several times.  The results were perfectly stable (+/- 1ns).  It tells us that METH_FASTCALL is something we want to use as broadly as possible.

[JP]
> it yielded a consistent 18% improvement on bisect.bisect("abcdef", "c")
> on two different machines.

Thanks for running timings.  When you run timings in the future, try to choose benchmarks that are indicative of actual use.  No one uses bisect to search strings for a character -- the typical use case is searching a list of range breakpoints like the example shown in the docs:

    def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
        i = bisect(breakpoints, score)
        return grades[i]

Hash tables are faster for exact lookup.  Bisect is mainly used for searching ranges (e.g. tax brackets or position in a cumulative distribution).
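As a rough illustration of the range-search use case, here is a small sketch of a tax-bracket lookup.  The bracket limits, the rates, and the name marginal_rate are made up for the example; only bisect itself comes from the standard library.

```python
from bisect import bisect

# Hypothetical bracket upper limits and the rate for each bracket.
# RATES has one more entry than BRACKET_LIMITS: the last rate applies
# to everything above the final limit.
BRACKET_LIMITS = [10_000, 40_000, 85_000]
RATES = [0.10, 0.12, 0.22, 0.24]

def marginal_rate(income):
    # bisect() returns the index of the first limit greater than income,
    # which is exactly the index of the bracket the income falls into.
    return RATES[bisect(BRACKET_LIMITS, income)]
```

A dict keyed on income could not answer this question, because the lookup is for the range containing the value rather than for an exact match.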

[VS]
> Speedup for tiny lists is just a nice side effect.

To you perhaps; however, the entire reason I put this code in many years ago was for performance.  The existing pure Python code was already very fast.  Anyway, this patch is a nice improvement in that regard.

[Everyone]

What to do about the "hi" argument is a sticky point.  This is a long-standing API, so any API changes would likely break some code or lock us into exposing unintuitive implementation details (like hi=-1).

The point of view of the existing C code, the pure Python version, and the docs is that "hi" is an optional argument that, when omitted, defaults to "len(a)".  It was not intended that a user would explicitly pass in "hi=None" (allowed by the pure Python version but not by the C version), nor that a user would pass in "hi=-1" (allowed by the C version but causing incorrect behavior in the pure Python version).
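To make the divergence concrete, here is a minimal sketch written in the style of a pure Python bisect_left that uses None to mean "hi was omitted".  The name py_bisect_left is hypothetical and the body is a simplified illustration, not a copy of the stdlib source.

```python
def py_bisect_left(a, x, lo=0, hi=None):
    # Sketch only: None stands in for "hi was omitted".
    if lo < 0:
        raise ValueError('lo must be non-negative')
    if hi is None:
        hi = len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo

data = [1, 2, 3]
omitted = py_bisect_left(data, 3)                 # hi defaults to len(data)
explicit_none = py_bisect_left(data, 3, 0, None)  # works, but is an accident of the implementation
negative_hi = py_bisect_left(data, 3, 0, -1)      # loop never runs, so lo is returned unchanged
```

With hi=-1 the loop condition lo < hi is false from the start, so this sketch returns lo without searching at all, whereas a version that treats -1 as "use len(a)" would search the whole list.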

It is unfortunate that the argument clinic has not yet grown support for the concept of "there is special meaning for an omitted argument that cannot be simulated by passing in a particular default value".  I believe when this issue has arisen elsewhere, the decision was to skip applying argument clinic at all.  See:  builtin_getattr, builtin_next, dict_pop, etc.
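For readers unfamiliar with the pattern, here is a Python-level sketch of "an omitted argument means something no passed value can express", loosely modeled on how dict.pop behaves.  The names _MISSING and pop_item are hypothetical; the C functions listed above implement the idea by checking how many arguments were actually received.

```python
# A private sentinel that callers cannot reasonably pass in by accident.
_MISSING = object()

def pop_item(mapping, key, default=_MISSING):
    try:
        value = mapping[key]
    except KeyError:
        if default is _MISSING:
            # No default supplied at all: propagate the KeyError.
            raise
        # A default was supplied, even if it is None: return it.
        return default
    del mapping[key]
    return value
```

Here pop_item(d, k) and pop_item(d, k, None) behave differently for a missing key, which is why a plain "default=None" in the signature would not describe the function accurately.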

In this case though, we already have a conflict, so it may be possible to resolve the issue through clear parameter descriptions indicating that "hi=-1" and "hi=None" are implementation details, that the user is never intended to pass in either value explicitly, and that the only guaranteed ways to get the default are to omit the argument entirely or to pass in "hi=len(a)" as a numeric argument.
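The two guaranteed spellings can be checked side by side with the standard library function.  The list and the search value below are arbitrary sample data.

```python
from bisect import bisect_left

a = [10, 20, 30, 40]
x = 25

# Spelling 1: omit hi entirely.
omitted = bisect_left(a, x)

# Spelling 2: pass the documented default explicitly as a number.
explicit = bisect_left(a, x, 0, len(a))
```

Both calls search the whole list and return the same insertion point; neither relies on hi=None or hi=-1.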

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28754>
_______________________________________