Working with the set of real numbers

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Mar 6 07:27:16 EST 2014


On 5 March 2014 12:57, Dave Angel <davea at davea.name> wrote:
>  Oscar Benjamin <oscar.j.benjamin at gmail.com> Wrote in message:
>> On 4 March 2014 23:20, Dave Angel <davea at davea.name> wrote:
>>>
>>> On the assumption that division by 2 is very fast,  and that a
>>>  general multiply isn't too bad, you could improve on Newton by
>>>  observing that the slope is 2.
>>>
>>>    err = n - guess * guess
>>>     guess += err/2
>>
>> I guess you mean like this:
>>
>> def sqrt(n):
>>     err = guess = 1
>>     while err > 1e-10:
>>         err = n - guess * guess
>>         guess += err/2
>>     return guess
>>
>> This requires log2(10)*N iterations to get N digits.
>
> No idea how you came up with that,  but I see an error in my
>  stated algorithm,  which does surely penalize it. The slope isn't
>  2, but 2x.  So the line should have been
>         guess += err/(2*guess)

Ah, now I understand. This is the same algorithm that Chris posted,
which in turn is the same as the one that I had previously posted: all
three are just using Newton's method.

Newton's method uses the slope to find a root of f(x), i.e. an x0 such
that f(x0) = 0. We start with a guess x[n], evaluate the function
f(x[n]) and its slope fp(x[n]), and extrapolate along the tangent line
to the point where it crosses the x-axis, which gives the improved
estimate

    x[n+1] = x[n] - f(x[n]) / fp(x[n])

In our case f(x) = x**2 - y and so fp(x) = 2*x which gives

    x[n+1] = x[n] - (x[n]**2 - y) / (2*x[n])

and after rearranging we can express this as

    x[n+1] = (x[n] + y/x[n]) / 2.
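
To make that concrete, here is a minimal float sketch of the iteration
(hypothetical code, just for illustration):

def newton_sqrt(y):
    # Newton's update x <- (x + y/x)/2 for f(x) = x**2 - y
    assert y > 0
    x = float(y)
    for _ in range(100):        # cap in case rounding causes a 2-cycle
        x_new = (x + y / x) / 2
        if x_new == x:          # no further change: machine precision
            return x
        x = x_new
    return x

newton_sqrt(2.0) agrees with math.sqrt(2) to within a bit or so, and
the number of correct digits roughly doubles on each pass (quadratic
convergence).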

So my loop

        while x ** 2 - y > x * eps:
            x = (x + y/x) / 2

and Chris' loop:

    while abs(guess1-guess2) > epsilon:
        guess1 = n/guess2
        guess2 = (guess1 + guess2)/2

and now your loop

    while err > 1e-10:
        err = n - guess * guess
        guess += err/(2 * guess)

are all performing the same basic algorithm. The only significant
difference is that mine tests for a relative error condition whereas
the other two test for absolute error. This means that mine will still
converge to an answer with the correct precision even when the root is
large, e.g. sqrt(1e100). The other two are susceptible to infinite
loops in that case, because a fixed absolute tolerance can fall below
the rounding error of a single iteration at that magnitude.
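
For example, here is a hedged float sketch of a relative test (the
frexp/ldexp seed is just to get a starting guess whose square won't
overflow):

import math

def sqrt_rel(y, rel_tol=1e-12):
    # Stop when the *relative* residual |x**2 - y| / x**2 is small,
    # so the test scales correctly with the size of the root.
    assert y > 0
    m, e = math.frexp(y)               # y = m * 2**e, 0.5 <= m < 1
    x = math.ldexp(1.0, (e + 1) // 2)  # rough seed near sqrt(y)
    while abs(x * x - y) > rel_tol * x * x:
        x = (x + y / x) / 2
    return x

This terminates for sqrt_rel(1e100), whereas a fixed absolute tolerance
like 1e-10 can never be met there: the spacing between adjacent floats
near 1e50 is around 1e34.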

> Now if you stop the loop after 3 iterations (or use some other
>  approach to get a low-precision estimate,  then you can calculate
>  scale = 1/(2*estimate)
>
> and then for remaining iterations,
>         guess += err *scale

Fair enough. That's a reasonable suggestion for improvement. This is
often known as the "modified Newton's method". For lack of a better
reference see here:
http://math.stackexchange.com/questions/645088/modified-newtons-method

Using an approximate slope to avoid division can be a significant
optimisation. This is especially true for the multidimensional
generalisation of Newton's method, where the division is replaced by
solving a system of linear equations (vastly more expensive). However,
as noted in that stackexchange answer, an approximate slope sacrifices
quadratic convergence, so in the asymptotic case where we want large
numbers of digits it can be much slower.

Actually it's worse than that, since with a frozen slope we are
guaranteed to converge to an incorrect answer. The exact update step
x = (x + y/x)/2 converges when x^2 = y, but we replace it with
x = (x + y/x0)/2 where x0 is merely near the true root. This converges
when x == (x + y/x0)/2, i.e. when x = y/x0, which is not the correct
answer (and could be computed without a loop in any case). It's
possible to fix that by alternating between steps that improve the
estimate of the slope and steps that use that estimate for refinement
(sketched below), but I don't think the advantage of avoiding division
outweighs the loss of quadratic convergence once we get to more than
tens of digits.
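
One way to implement that alternating idea (a sketch, not anything
posted in this thread). Note that it uses the residual form
x -= (x**2 - y) * x0r / 2, whose fixed point is x**2 = y even when the
slope estimate is stale, unlike the x = (x + x0r*y)/2 form used in
sqrt2 below:

from decimal import Decimal as D, localcontext

def sqrt3(y, prec=1000, refresh=4):
    '''Solve x**2 = y; divide only every `refresh` steps (sketch)'''
    assert y > 0
    eps = D(10) ** -(prec + 5)
    x = D(y)
    with localcontext() as ctx:
        ctx.prec = prec + 10
        n = 0
        while abs(x ** 2 - y) > x * eps:
            if n % refresh == 0:
                x0r = 1 / x                   # occasional division
            x = x - (x ** 2 - y) * x0r / 2    # multiply-only step
            n += 1
    return x

The refresh cadence is a tuning knob; the point is only that the slope
steps and the refinement steps can be interleaved.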

You can compare them yourself with this code:

from decimal import Decimal as D, localcontext

def sqrt1(y, prec=1000):
    '''Solve x**2 = y with Newton's method (one division per step)'''
    assert y > 0
    eps = D(10) ** -(prec + 5)
    x = D(y)
    with localcontext() as ctx:
        ctx.prec = prec + 10
        while abs(x ** 2 - y) > x * eps:
            x = (x + y/x) / 2
    return x

def sqrt2(y, prec=1000):
    '''Solve x**2 = y with a frozen slope (one division in total)'''
    assert y > 0
    eps = D(10) ** -(prec + 5)
    x = D(y)
    with localcontext() as ctx:
        ctx.prec = prec + 10
        # Warm up with a few true Newton steps to get a good x0...
        for n in range(7):
            x = (x + y/x) / 2
        # ...then freeze its reciprocal so later steps need no division.
        x0r = 1 / x
        while abs(x ** 2 - y) > x * eps:
            x = (x + x0r * y) / 2
    return x
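
As a quick sanity check, the output of sqrt1 can be compared with
Decimal's built-in square root:

from decimal import getcontext
getcontext().prec = 50
print(sqrt1(2, prec=50))  # carries ~10 guard digits internally
print(D(2).sqrt())        # the first 50 digits should agree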

sqrt2(2) goes into an infinite loop at an error level of ~1e-100 (the
exact point where this happens depends on how many warmup iterations
it uses).

[snip]
>>
>> This method works out much slower than Newton with division at 10000
>> digits: 40s (based on a single trial) vs 80ms (timeit result).
>
> Well considering you did not special-case the divide by 2, I'm not
>  surprised it's slower.

Multiplying by D('0.5') instead of dividing by 2 makes no measurable
difference to the timings. I don't know whether that's because Decimal
already special-cases division by 2 or because, for the multiprecision
algorithms, division by a single-digit number simply isn't much more
expensive than multiplication by one.
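
If you want to check that on your own machine, something along these
lines (timings will of course vary):

import timeit
from decimal import Decimal as D, getcontext

getcontext().prec = 10000
x = D(2).sqrt()     # a 10000-digit operand
half = D('0.5')

print(timeit.timeit(lambda: x / 2, number=100))
print(timeit.timeit(lambda: x * half, number=100))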


Oscar


