Dynamically swapping between two algorithms

Thu Sep 25 20:30:42 EDT 2014

On 24Sep2014 00:48, Steven D'Aprano <steve+comp.lang.python at pearwood.info> wrote:
>I have a certain calculation which can be performed two radically different
>ways. With the first algorithm, let's call it SHORT, performance is very
>fast for small values of the argument, but terrible for large values. For
>the second algorithm, LARGE, performance is quite poor for small values,
>but excellent for large values.
>
>To add to the complexity, I perform this calculation repeatedly, in a loop:
>
>value = 1
>while True:
>    x = SHORT(value)  # Or should I use LARGE? Decisions, decisions...
>    process(x)
>    value += some_increment()
>    if condition: break
>
>Luckily, the value never gets smaller. So if I could somehow determine a
>cut-over point, CUTOFF, I might write my loop like this:
>
>value = 1
>while True:
>    f = SHORT if value < CUTOFF else LARGE
>    x = f(value)
>    process(x)
>    value += some_increment()
>    if condition: break
>
>alas, the CUTOVER point is likely to be machine-dependent. Take it as a
>given that inserting a fixed CUTOVER point into the source code (say,
>``CUTOVER = 123456``) is not likely to be very effective, and dynamically
>calculating it at import time is impractical.

I have two suggestions. The first suggestion presumes multiple CPUs, and is a 
bit of a brute force hack. If the total runtime domain for SHORT (total runtime 
for when SHORT is better) is small compared to the domain for LARGE it may be 
worthwhile.

FIRST SUGGESTION

Run SHORT and LARGE in subprocesses, flat out. When either stops, kill the 
other. Once LARGE is the first to stop, do not bother with SHORT any more.

FIRST SUGGESTION

Chris Angelico's approach looks pretty cool, but seems unstable (in choices) 
initially.

Are the input values predictable? Or a fixed sequences?

I am wondering if there's any scope for gathering all the values (or all the 
values up to comfortably past the unknown cutover point) before running SHORT 
or LARGE at all?

I was thinking bisecting to find the CUTOFF, but the cost of the wrong 
algorithm for values well over the CUTOFF would be nasty.

Alternatively:

Did you say SHORT becomes quadratic in cost as it grows and LARGE roughtly 
linear? Pretend they really are purely quadratic and linear respectively.

Then you could start with both SHORT and LARGE for your first value.
Run SHORT and LARGE. Measure the elapsed time for each. Compute their ratio.

I think you should be able to estimate the cutover point as the point where the 
two curves intersect. Run the SHORT algorithm until them. Then run both.  
Remeasure. Re-estimate. Repeat until definitely LARGE domain (estimate points 
into the past, maybe with an error margin). Then just use LARGE.

Thoughts?

Cheers,
Cameron Simpson <cs at zip.com.au>