Overflow-less integer arithmetic?

Mon Aug 20 03:33:28 EDT 2001

[Erik Max Francis]
> I'm working on an application where I need arithmetic to be done modulo
> 2^32 (actually, often it will be 2^n with n < 32).  Python (very nicely)
> throws OverflowErrors when an arithmetic calculation breaks these
> bounds,

Do you really find such OverflowErrors more useful than not?  I'm asking
because PEP 237 seeks to eradicate them:

    http://python.sf.net/peps/pep-0237.html

and "Part A" is a candidate for Python 2.2 (in short, "Part A" *just*
auto-converts overflowing int + - * / to longs).  In full,

   A. Short int operations that currently raise OverflowError return
      a long int value instead.  This is the only change in this
      phase.  Literals will still distinguish between short and long
      ints.  The other semantic differences listed above (including
      the behavior of <<) will remain.  Because this phase only
      changes situations that currently raise OverflowError, it is
      assumed that this won't break existing code.  (Code that
      depends on this exception would have to be too convoluted to be
      concerned about it.)  For those concerned about extreme
      backwards compatibility, a command line option will allow a
      warning to be issued at this point, but this is off by default.

> but this is one rare case where I'd prefer it not to happen.
> Essentially I'd like things to behave just as if you had a signed
> 32-bit two's-complement number.

Python longs are unbounded signed 2's-comp integers, so you're very close.

> What is the standard Pythonic way to accomplish this?

Use Python longs.  For op in + - * ^ | & ^, the last N bits of

    long1 op long2

are exactly the same as if you had "real" N-bit signed 2's-comp arithmetic.
"Signed" vs "unsigned" is pretty much a mental conceit here too -- faking
2's-comp N-bit unsigned arithmetic is just as easy.  For signed, it's the
conversion at the end that's delicate.

> The obvious thing (catching the OverflowError and then doing the
> computation in longs and converting back) didn't seem to work, since
> the obvious bitmask didn't do the job, since obviously the bitwise
> and operator doesn't have quite the same meaning for longs:

The & operator isn't really the problem.

> max at oxygen:~% python
> Python 2.0 (#2, Apr  4 2001, 19:28:30)
> [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
> Type "copyright", "credits" or "license" for more information.
> >>> i = 2147483648L # one greater than the largest int

I took the liberty of changing the impossible-to-read trailing "l" to an "L"
there.

> >>> i
> 2147483648L
> >>> i & 0xffffffff
> 2147483648L # whoops

You're getting screwed by the hex literal:  you're doing

    long & int

so the int part gets converted to long first.  But 0xffffffff is -1 as a
short int (on a 32-bit box), so coerces to -1L, i.e. to an infinite string
of 1 bits.

    i & 0xffffffff == i

is thus always true on a 32-bit box, regardless of the value of i, and
regardless of whether i is short or long.  Stick an L at the end of the hex
literal to get 32 1-bits, which is infinitely fewer than you're getting now
<wink>.

It's unclear to me whether you want a negative value in the end represented
as a short int or still as a long.  Here's both ways:

def tolong(i, bits):
    i &= (1L << bits) - 1  # get last "bits" bits, as unsigned
    if i & (1L << (bits - 1)):  # is negative in N-bit 2's comp
        i -= 1L << bits         # ... so make it negative
    return i

def toshort(i, bits):
    return int(tolong(i, bits))

print tolong(1 << 31, 32)
print tolong(1L << 31, 32)

print toshort(1 << 31, 32)
print toshort(1L << 31, 32)

That prints -2147483648 four times.