[Tutor] Limitation of int() in converting strings

Sat Dec 22 03:06:44 CET 2012

Oh, another comment...

On 18/12/12 01:36, Oscar Benjamin wrote:

> I have often found myself writing awkward functions to prevent a
> rounding error from occurring when coercing an object with int().
> Here's one:
>
> def make_int(obj):
>      '''Coerce str, float and int to int without rounding error
>      Accepts strings like '4.0' but not '4.1'
>      '''
>      fnum = float('%s' % obj)
>      inum = int(fnum)
>      assert inum == fnum
>      return inum

Well, that function is dangerously wrong. In no particular order,
I can find four bugs and one design flaw.

1) It completely fails to work as advertised when Python runs with
optimizations on (the -O switch), because assert statements are then
compiled away:

[steve@ando python]$ cat make_int.py
def make_int(obj):
     '''Coerce str, float and int to int without rounding error
     Accepts strings like '4.0' but not '4.1'
     '''
     fnum = float('%s' % obj)
     inum = int(fnum)
     assert inum == fnum
     return inum

print make_int('4.0')
print make_int('4.1')  # this should raise an exception

[steve@ando python]$ python -O make_int.py
4
4

2) Even when it does work, it is misleading and harmful to raise
AssertionError. The problem is with the argument's *value*, hence
*ValueError* is the appropriate exception, not ImportError or
TypeError or KeyError ... or AssertionError. Don't use assert as
a lazy way to get error checking for free.
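
As a rough illustration of points 1 and 2 only, the assert can be
replaced by an explicit test that raises ValueError. The name
make_int_checked is hypothetical and not from the original post, and
the sketch is written for Python 3. It still converts through float,
so it does nothing for large values.

```python
def make_int_checked(obj):
    '''Like make_int, but with an explicit check instead of assert.

    Hypothetical name, for illustration only. Still goes through
    float, so large values are NOT handled correctly.
    '''
    fnum = float('%s' % obj)
    inum = int(fnum)
    # An explicit test survives "python -O"; an assert does not.
    if inum != fnum:
        raise ValueError('not an integer value: %r' % (obj,))
    return inum
```

With this version, make_int_checked('4.1') raises ValueError whether
or not -O is given.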

3) Worse, it falls over when given a sufficiently large int value:

py> make_int(10**500)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 6, in make_int
OverflowError: cannot convert float infinity to integer

but at least you get an exception to warn you that something has
gone wrong.

4) Disturbingly, the function silently does the wrong thing even
for exact integer arguments:

py> n = 10**220  # an exact integer value
py> make_int(n) == n
False

5) It loses precision for string values:

py> s = "1"*200
py> make_int(s) % 10
8L

And not by a little bit:

py> make_int(s) - int(s)  # should be zero
13582401819835255060712844221836126458722074364073358155901190901
52694241435026881979252811708675741954774190693711429563791133046
96544199238575935688832088595759108887701431234301497L

Lest you think this only matters for humongous numbers, it does not.
A mere seventeen digits is enough:

py> s = "10000000000000001"
py> make_int(s) - int(s)
-1L
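
The seventeen-digit failure follows from the float format. Assuming
the usual IEEE 754 double, a float carries 53 bits of mantissa, so
not every integer above 2**53 (roughly 9.007e15) can be represented.
The following check is a sketch written for Python 3, not part of
the original post:

```python
import sys

# Number of mantissa bits in a float (53 on IEEE 754 doubles).
mant_bits = sys.float_info.mant_dig
limit = 2 ** mant_bits

# 2**53 itself survives a round trip through float ...
exact = int(float(limit)) == limit
# ... but 2**53 + 1 does not: it is rounded to a neighbouring float.
lossy = int(float(limit + 1)) == limit + 1

# The same rounding explains the seventeen-digit example.
s = "10000000000000001"
error = int(float(s)) - int(s)

print(mant_bits, exact, lossy, error)
```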

And at that point I stopped looking for faults.
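
For comparison, here is one possible way to avoid all five problems,
offered as a sketch under assumptions rather than as the author's
recommendation. It never converts through float: ints are returned
directly, and everything else goes through decimal.Decimal, whose
constructor is exact for strings and floats. The name make_int_exact
is hypothetical, and the code targets Python 3 (Python 2 would also
need to check for long).

```python
from decimal import Decimal, InvalidOperation


def make_int_exact(obj):
    '''Coerce str, float and int to int, or raise ValueError.

    Hypothetical helper: accepts '4.0' but not '4.1', and keeps
    every digit of large values.
    '''
    if isinstance(obj, int):
        # Already exact; int() also turns bool into a plain int.
        return int(obj)
    try:
        # Decimal(str) and Decimal(float) are both exact conversions.
        dnum = Decimal(obj)
    except InvalidOperation:
        raise ValueError('not a number: %r' % (obj,))
    if not dnum.is_finite():
        # Rejects 'inf', 'nan' and their float equivalents.
        raise ValueError('not a finite number: %r' % (obj,))
    inum = int(dnum)  # truncates towards zero
    if inum != dnum:
        # A fractional part was dropped, e.g. '4.1'.
        raise ValueError('not an integer value: %r' % (obj,))
    return inum
```

Under these assumptions, make_int_exact("1" * 200) equals
int("1" * 200), and make_int_exact(10**500) returns its argument
unchanged.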

-- 
Steven