[Cython] Use of long type for intermediate integral variables

Robert McGibbon rmcgibbo at gmail.com
Thu Jul 2 09:49:54 CEST 2015


> "libc.stdint.int64_t" is hand-wavingly declared as "long"

There are some deeper issues in the rest of your message, but as a
preliminary matter, isn't this a clear error for 32-bit Linux and Windows?

-Robert


On Wed, Jul 1, 2015 at 11:30 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Robert McGibbon schrieb am 01.07.2015 um 11:12:
> > I noticed an issue on Windows when debugging an issue in scipy
> > <https://github.com/scipy/scipy/issues/4907>, but I think it might be a
> > little more general.  In some places in the generated code, it looks like
> > intermediate integral variables are declared with type long, even when
> long
> > is too small to hold the necessary value. For example, with the code pasted
> > below, the value n+1 is stored in a variable of type long (using Cython
> > 0.22.1) before being supplied to F.__getitem__.
> >
> > This is especially pertinent on Windows (32-bit and 64-bit) and 32-bit
> > Linux, where longs are 32 bits, so you get an overflow for a program like
> > the example below. The result is that it prints 1 instead of the expected
> > value, 2**53+1 = 9007199254740993. But this same issue comes up basically
> > whenever you do arithmetic on an array index in 64-bit Windows, for
> indices
> > larger than 2**31-1, since sizeof(long) << sizeof(void*).
> >
> > ```
> > from libc.stdint cimport int64_t
> >
> > class F(object):
> >     def __getitem__(self, i):
> >         print(i)
> >
> > cdef int64_t n = 2**53
> > f = F()
> > f[n+1]
> > ```
>
> Thanks for the report and the investigation. I can imagine why this is the
> case. "libc.stdint.int64_t" is hand-wavingly declared as "long" and the
> literal 1 is also of type "long" in Cython, so it infers that using "long"
> is good enough to hold the result of the sum.
>
> You can work around this by casting the 1 to <int64_t>, but that's clumsy
> and error prone. The problem is that Cython doesn't know the exact type of
> typedefs at translation time, only the C compilers will eventually know and
> might have diverging ideas about it. Your specific issue could be helped by
> preferring typedefs over standard integer types when deciding which type
> to use for arithmetic expressions, but that would then break the case where
> the typedef-ed type happens to be smaller than the standard one, e.g.
>
>     cdef extern from "...":
>         ctypedef long long SomeInt  # declared large enough, just in case
>
>     def test(SomeInt x):
>         cdef long long y = 1
>         return x + y
>
> If Cython preferred the typedef and inferred "SomeInt" for the type of the
> result, the C code would be correct if sizeof(SomeInt) >= sizeof(long long),
> but not if it's smaller.
>
> Also, what should happen in expressions that use two different
> user-provided typedefs of the same declared base type? The decision here must
> necessarily be arbitrary.
>
> So, I agree that what you have found is a problem. Fixing it by preferring
> typedefs feels like it would generally be a good idea, but on the other
> hand, it might break existing code (which usually means that it *will*
> break someone's code), and it would not fix all possible problematic cases.
>
> Not an easy decision...
>
> Stefan
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> https://mail.python.org/mailman/listinfo/cython-devel
>

