Dynamically swapping between two algorithms

C Smith illusiontechniques at gmail.com
Tue Sep 23 16:54:54 EDT 2014


What about switching once the runtime of SHORT increases past a certain
threshold, such as 2x, 4x, or 16x its last runtime? The other ideas
already proposed sound better, but I am wondering if this would work.
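
A rough sketch of what I mean (Python 3, for ``nonlocal``), with SHORT
and LARGE assumed to be defined elsewhere and the 4x threshold pulled
out of thin air. I compare each call against the *first* observed
runtime rather than the immediately previous one, since consecutive
calls differ only by a small increment:

    import time

    def make_adaptive(SHORT, LARGE, threshold=4.0):
        baseline = None    # duration of the first SHORT call
        switched = False
        def adaptive(value):
            nonlocal baseline, switched
            if switched:
                return LARGE(value)
            start = time.perf_counter()
            x = SHORT(value)
            elapsed = time.perf_counter() - start
            if baseline is None:
                baseline = elapsed    # real code should average a few
                                      # calls to smooth out timer noise
            elif elapsed > threshold * baseline:
                switched = True       # SHORT has hit its quadratic wall
            return x
        return adaptive

The loop would then call f = make_adaptive(SHORT, LARGE) once up front
and f(value) thereafter.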

On Tue, Sep 23, 2014 at 12:21 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 9/23/2014 10:48 AM, Steven D'Aprano wrote:
>>
>> I have a certain calculation which can be performed two radically
>> different
>> ways. With the first algorithm, let's call it SHORT, performance is very
>> fast for small values of the argument, but terrible for large values. For
>> the second algorithm, LARGE, performance is quite poor for small values,
>> but excellent for large values.
>>
>> To add to the complexity, I perform this calculation repeatedly, in a
>> loop:
>>
>> value = 1
>> while True:
>>      x = SHORT(value)  # Or should I use LARGE? Decisions, decisions...
>>      process(x)
>>      value += some_increment()
>>      if condition: break
>>
>> Luckily, the value never gets smaller. So if I could somehow determine a
>> cut-over point, CUTOFF, I might write my loop like this:
>>
>> value = 1
>> while True:
>>      f = SHORT if value < CUTOFF else LARGE
>>      x = f(value)
>>      process(x)
>>      value += some_increment()
>>      if condition: break
>>
>> Alas, the CUTOFF point is likely to be machine-dependent. Take it as a
>> given that inserting a fixed CUTOFF point into the source code (say,
>> ``CUTOFF = 123456``) is not likely to be very effective, and dynamically
>> calculating it at import time is impractical.
>>
>> *If* Python were a different language, I would spawn two threads, one using
>> SHORT and the other using LARGE, then which ever completes first, I'd just
>> kill the other. Alas, this won't work because (1) the GIL and (2) you
>> cannot forcibly kill threads, only ask them to die and hope they listen.
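
(For what it's worth, the race idea isn't completely dead: processes,
unlike threads, sidestep the GIL and *can* be forcibly terminated. A
minimal sketch, assuming SHORT and LARGE are picklable module-level
functions and that paying for both computations on every call is
acceptable:

    import multiprocessing

    def _worker(func, value, queue):
        # Child process: compute and report the result back.
        queue.put(func(value))

    def race(value):
        queue = multiprocessing.Queue()
        procs = [multiprocessing.Process(target=_worker,
                                         args=(f, value, queue))
                 for f in (SHORT, LARGE)]
        for p in procs:
            p.start()
        x = queue.get()        # blocks until the faster one finishes
        for p in procs:
            p.terminate()      # forcibly kill the loser
            p.join()
        return x

On Windows this needs the usual ``if __name__ == '__main__'`` guard,
and the per-call process overhead may well swamp SHORT's sub-second
wins, so I mention it only for completeness.)
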
>>
>> I am seeking other ideas for dynamically swapping between the two
>> algorithms, based on runtime information. Any thoughts?
>>
>>
>> (1) I can't tell in advance how many loops I will make.
>>
>> (2) Both SHORT and LARGE get slower as the size of their argument
>> increases.
>> This is unavoidable due to the nature of the problem.
>>
>> (3) SHORT starts off relatively speedy, significantly faster than LARGE
>> for
>> the first few tens of thousands of loops. I'm not talking about trivial
>> micro-optimizations here, I'm talking about the difference between 0.1
>> second for SHORT versus 10 seconds for LARGE.
>>
>> (4) But as the size of the argument increases, SHORT suffers catastrophic
>> quadratic slowdown, while LARGE slows down only linearly. (Give or take.)
>
>
> One possibility is to apply both algorithms to a few values below any
> plausible cutoff value, fit curves (linear and quadratic), and find the
> intersection of the extrapolated curves to estimate the cutoff.  Possibly
> calculate a few more values to verify and refine the extrapolation.
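
Here is a rough cut of that curve-fitting idea using numpy.polyfit; the
sample points below are placeholders, and real timings would want
several repetitions per point to be stable:

    import time
    import numpy as np

    def clock(f, value):
        start = time.perf_counter()
        f(value)
        return time.perf_counter() - start

    def estimate_cutoff(SHORT, LARGE, samples=(1, 10, 100, 1000, 10000)):
        xs = np.array(samples, dtype=float)
        t_short = np.array([clock(SHORT, v) for v in samples])
        t_large = np.array([clock(LARGE, v) for v in samples])
        a, b, c = np.polyfit(xs, t_short, 2)   # quadratic fit for SHORT
        m, k = np.polyfit(xs, t_large, 1)      # linear fit for LARGE
        # Crossover where a*x**2 + b*x + c == m*x + k:
        roots = np.roots([a, b - m, c - k])
        real = roots[np.isreal(roots)].real
        positive = real[real > 0]
        return positive.max() if positive.size else None

Plugging the result in as CUTOFF, the second loop above could then be
used as-is.
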
>
>
>> (5) Consequently, by the time I reach a few million loops, the difference
>> is
>> now between (say) 5 minutes for LARGE versus 15 hours for SHORT.
>>
>> (6) There is no single algorithm which performs acceptably across the
>> entire
>> range of values I'm expecting to process in practice.
>>
>> (7) Leaving the choice up to the caller is not an option. I am the caller.
>>
>
> --
> Terry Jan Reedy


