Dynamically swapping between two algorithms

MRAB python at mrabarnett.plus.com
Tue Sep 23 11:42:59 EDT 2014


On 2014-09-23 15:48, Steven D'Aprano wrote:
> I have a certain calculation which can be performed two radically different
> ways. With the first algorithm, let's call it SHORT, performance is very
> fast for small values of the argument, but terrible for large values. For
> the second algorithm, LARGE, performance is quite poor for small values,
> but excellent for large values.
>
> To add to the complexity, I perform this calculation repeatedly, in a loop:
>
> value = 1
> while True:
>      x = SHORT(value)  # Or should I use LARGE? Decisions, decisions...
>      process(x)
>      value += some_increment()
>      if condition: break
>
> Luckily, the value never gets smaller. So if I could somehow determine a
> cut-over point, CUTOFF, I might write my loop like this:
>
> value = 1
> while True:
>      f = SHORT if value < CUTOFF else LARGE
>      x = f(value)
>      process(x)
>      value += some_increment()
>      if condition: break
>
> Alas, the CUTOFF point is likely to be machine-dependent. Take it as a
> given that inserting a fixed CUTOFF point into the source code (say,
> ``CUTOFF = 123456``) is not likely to be very effective, and dynamically
> calculating it at import time is impractical.
>
> *If* Python were a different language, I would spawn two threads, one using
> SHORT and the other using LARGE, then whichever completes first, I'd just
> kill the other. Alas, this won't work because (1) the GIL and (2) you
> cannot forcibly kill threads, only ask them to die and hope they listen.
> [A multiprocessing approximation is sketched just after the quoted post.]
>
> I am seeking other ideas for dynamically swapping between the two
> algorithms, based on runtime information. Any thoughts?
>
>
> (1) I can't tell in advance how many loops I will make.
>
> (2) Both SHORT and LARGE get slower as the size of their argument increases.
> This is unavoidable due to the nature of the problem.
>
> (3) SHORT starts off relatively speedy, significantly faster than LARGE for
> the first few tens of thousands of loops. I'm not talking about trivial
> micro-optimizations here; I'm talking about the difference between 0.1
> seconds for SHORT versus 10 seconds for LARGE.
>
> (4) But as the size of the argument increases, SHORT suffers catastrophic
> quadratic slowdown, while LARGE slows down only linearly. (Give or take.)
>
> (5) Consequently, by the time I reach a few million loops, the difference is
> now between (say) 5 minutes for LARGE versus 15 hours for SHORT.
>
> (6) There is no single algorithm which performs acceptably across the entire
> range of values I'm expecting to process in practice.
>
> (7) Leaving the choice up to the caller is not an option. I am the caller.
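
A side note on the thread-racing idea first: processes, unlike threads,
sidestep the GIL and *can* be forcibly terminated, so the race can be
approximated with the multiprocessing module. A rough sketch (SHORT and
LARGE stand in for the real algorithms, as in the pseudocode above, and
must be top-level functions with picklable arguments and results; the
helper names _run and race are mine):

import multiprocessing as mp

def _run(fn, value, queue, name):
    # Compute fn(value) and report the result, tagged with the
    # algorithm's name in case you want to log which one won.
    queue.put((name, fn(value)))

def race(value):
    """Run SHORT and LARGE in parallel processes, return the first
    result to arrive, and terminate the loser."""
    queue = mp.Queue()
    procs = [mp.Process(target=_run, args=(SHORT, value, queue, 'SHORT')),
             mp.Process(target=_run, args=(LARGE, value, queue, 'LARGE'))]
    for p in procs:
        p.start()
    _winner, result = queue.get()   # blocks until the winner reports
    for p in procs:
        p.terminate()               # processes, unlike threads, can be killed
        p.join()
    return result

The price is two process launches per call (and, on Windows, the usual
if __name__ == '__main__' guard), and you always pay for the loser's
work up to the moment of the kill, so this only makes sense when each
call is expensive.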
>
How about this:

Try both of them initially, and use the faster one until it takes
longer than the slower one did initially. Then try both of them again.
Repeat.
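
A minimal sketch of that strategy in code (again with SHORT and LARGE
as stand-ins; the helper names timed and make_chooser are mine, and
Python 3 is assumed for nonlocal):

import time

def timed(fn, value):
    # Call fn(value) and return (elapsed seconds, result).
    start = time.perf_counter()
    result = fn(value)
    return time.perf_counter() - start, result

def make_chooser(short, large):
    """Return choose(value): time both algorithms once, then keep
    calling the winner until a single call takes longer than the
    loser's benchmark time, at which point re-benchmark both."""
    current = threshold = None

    def choose(value):
        nonlocal current, threshold
        if current is None:
            # Race both. They compute the same answer, so either
            # result will do; the loser's time becomes the trigger
            # for the next race.
            t_short, result = timed(short, value)
            t_large, _ = timed(large, value)
            if t_short <= t_large:
                current, threshold = short, t_large
            else:
                current, threshold = large, t_short
            return result
        elapsed, result = timed(current, value)
        if elapsed > threshold:
            current = None   # force a re-race on the next call
        return result

    return choose

The original loop then shrinks to:

choose = make_chooser(SHORT, LARGE)
value = 1
while True:
    x = choose(value)
    process(x)
    value += some_increment()
    if condition: break

One caveat: every re-race runs both algorithms, so a race triggered
after the crossover still pays for one expensive SHORT call. Since the
values only grow, LARGE will win every race from then on, and the
threshold keeps pace with LARGE's own gradual slowdown.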



