[Python-bugs-list] [ python-Bugs-465527 ] sqrt on a long int returns invalid resul

Fri, 28 Sep 2001 14:21:22 -0700

Bugs item #465527, was opened at 2001-09-26 23:54
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=465527&group_id=5470

Category: Python Library
Group: Not a Bug
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Edward Kandrot (arcanelord)
Assigned to: Tim Peters (tim_one)
Summary: sqrt on a long int returns invalid resul

Initial Comment:
n = 
1881988129206079638386972394616504398071635633794173827
0076335642298885971523466548531906060650474304531738801
1303396716199692321205734031879550656996221305168759307
650257059L
x = long(math.sqrt(n))

the result of this is that x is off by many orders of 
magnitude.  There should be a sqrt function that 
returns a long integer result so that large integer 
math will work as expected.  I have written this 
function in python, one that would set x to be the 
integer part of a sqrt, but the hd it was on is 
currently damaged.  If this is determined to be an 
actual problem that needs to be fixed, I will submit 
the code in this bug (unless someone beats me to it).

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-09-28 14:21

Message:
Logged In: YES 
user_id=31435

Well, a correct algorithm is more than a little subtle 
(floating division can't work here due to overflows, and if 
by "/" you're thinking "//" the algorithm is incorrect), 
and an *efficient* correct algorithm much more so.  My own 
long-int integral-root code is both correct and quick, and 
is about 500(!) lines of Python (more than half in comment 
blocks sketching a correctness proof).

Part of that was due to the inability to compute logs of 
longs quickly in 2.1, in order to get a good starting 
guess.  I fixed that in 2.2, though, by generalizing 
math.log(long) and math.log10(long) to generate good log 
approximations for longs of any magnitude.

The rest is that Newton steps approximately double the 
number of good bits on each iteration, and in the early 
iterations, when you've only got a few dozen good bits, 
computing the Newton steps with full long precision is a 
monstrous waste of time (when you've only got 12 good bits, 
(say) hundred-thousand bit arithmetic is factors of 
millions slower than necessary).

So an efficient algorithm scales each step to use no more 
precision than is justified during that step.

All of which is indeed an argument not to write it in C, 
though:  the complex logic is much easier expressed in 
Python.  The vast bulk of the runtime is consumed by 
division, and that's already in C (although GMP does long 
division much quickker than Python does it -- and probably 
always will).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-28 06:45

Message:
Logged In: YES 
user_id=6380

Also, is the algorithm more than this?

while abs(n-x*x) > 2*x+1:
    x = (x + n/x) / 2

I see no advantage to coding that in C, since all the time
goes (presumably) into the arithmetic, which is already all
in C.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-09-27 20:31

Message:
Logged In: YES 
user_id=31435

I don't dispute that the function can be useful.  That's 
not enough, though, and sqrt(2L) =- 1.414... is surely more 
useful to more people more often than sqrt(2L) == 1L.

You should investigate the crypto work already done in 
Python, for example

http://www.amk.ca/python/code/crypto.html

There are also at least 2 Python wrappers for GMP, one 
hosted at SourceForge and another from Marc-Andre Lemburg.  
If you're doing serious crypto work, you certainly want 
GMP.  If you're just putzing around, inventing long int 
algorithms on your own is seriously educational <0.9 wink>.

----------------------------------------------------------------------

Comment By: Edward Kandrot (arcanelord)
Date: 2001-09-27 16:42

Message:
Logged In: YES 
user_id=102159

Actually, this is a very useful function, since it is used 
in crytpography and other factoring-large-number projects.  
Without this function, I can not see how people would use 
python for this work, unless they did like I did, found out 
where the problem was, and write my own library for it.  
Everyone working on crypto using python will have to write 
their own sqrt, as it currently stands.

I would recommend adding this to some library, maybe a 
large integer library could be made.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2001-09-27 13:48

Message:
Logged In: YES 
user_id=31435

Closing as a Not-A-Bug, because it's functioning as 
designed.  The math functions convert their arguments to C 
doubles, so you're never going to get more than 53 bits of 
precision of out them.

>>> >>> sqrt(n)
4.338188710978442e+86
>>>

To 53 signficant bits, that's an excellent approximation, 
and that's all the math.* functions can hope to deliver.

Wholly accurate unbounded-int sqrt, pow etc are things 
Python doesn't have, and it's unclear whether it should 
(lots of code for a tiny audience).  If that's wanted, I 
suggest that writing a PEP is in order first.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-27 13:43

Message:
Logged In: YES 
user_id=6380

The math module does all its work in double precision
floating point.  I believe that the precision you are asking
is unobtainable using floating point: x is much larger than
the number of bits retained in the mantissa of a double
precision number (53 I believe). So the best error bound on
n-x*x that you can get is of the order of n/2**53. The value
you get for n-x*x is significantly smaller than that, so I
think the math module has done the best it can.

----------------------------------------------------------------------

Comment By: Edward Kandrot (arcanelord)
Date: 2001-09-27 13:29

Message:
Logged In: YES 
user_id=102159

I am using 2.1.  I saw the problem on Linux previously, but 
I am on a Win2k box now with these results:

>>> x
433818871097844195711080363479661366933416624844037360445918
708654950711556266965073920L
>>> n-x*x
579455590187726959148083548468770884154260745498744050023793
649517960092688750631137087163290051389345006612268597554742
3732078994891141758379026196586090659L

Since I do not have the program to give the correct answer, 
I can not tell you what it is right now (until I fix my 
Linux HD).  I can tell you it is wrong by how much larger 
the remainder is from the sqrt.  n-x*x should be less than
2*x+1, wereas the results from above is 
133570858437146139501587560404236730965976658068588390668591
16036188472L*x+151281127770264061379550826654612638192882839
13990470767939415731499169681374054240419L  (ie many 
magnitudes of order greater than expected).

I will shortly recover the algorithm that calcs sqrt 
correctly and post it here, so at least we can see what I 
was expecting as a result.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-27 10:58

Message:
Logged In: YES 
user_id=6380

Which Python version? On which platform? In 2.1 and 2.2 I
get a decent result.

Assigning to Tim who usually answers these questions.

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=465527&group_id=5470