Undefined behaviour in C [was Re: The Cost of Dynamism]

Oscar Benjamin oscar.j.benjamin at gmail.com
Sun Mar 27 17:16:40 EDT 2016


On 27 Mar 2016 23:11, "Ben Bacarisse" <ben.usenet at bsb.me.uk> wrote:
>
> Steven D'Aprano <steve at pearwood.info> writes:
>
> > On Sun, 27 Mar 2016 05:13 pm, Paul Rubin wrote:
> >
> >> Steven D'Aprano <steve at pearwood.info> writes:
> >>> For example, would you consider that this isolated C code is
> >>> "meaningless"?
> >>> int i = n + 1;
> >>
> >> It's meaningful as long as n is in a certain range of values so there's
> >> no overflow.
> >>
> >>> But according to the standard, it's "meaningless", since it might
> >>> overflow, and signed int overflow is Undefined Behaviour.
> >>
> >> No it's not meaningless if it "might" overflow, it's meaningless if it
> >> -does- overflow,
> >
> > No! That's exactly wrong!
> >
> > Paul, thank you for inadvertently proving the point I am trying to get
> > across. People, even experienced C coders, simply don't understand what the
> > C standard says and what C compilers can and will do.
> >
> > If the C compiler cannot prove that n is strictly less than MAXINT (or is
> > that spelled INT_MAX?),
>
> (the latter)
>
> > the *entire program* (or at least the bits reachable from this line,
> > in both directions) is Undefined, and the compiler has no obligations
> > at all.
>
> If I understand you correctly, you are claiming that in this program
>
>   #include <stdio.h>
>   #include <stdlib.h>
>
>   int main(int argc, char **argv)
>   {
>      int n = argc > 1 ? atoi(argv[1]) : 0;
>      int i = n + 1;  // not needed but used because it's the line in question
>      printf("Hello world\n");
>   }
>
> everything after "int i = n + 1;" is undefined because the compiler
> can't prove that n is strictly less than INT_MAX.

Steve is wrong to say that everything is undefined merely because the
compiler can't prove that n != INT_MAX, but he is right about one thing:
undefined behaviour applies to the whole program. So if the n+1 does lead to
undefined behaviour (i.e. if n == INT_MAX), then the behaviour is also
undefined *before* that line. If we change the line to use INT_MAX+1, then a
compiler is free to do whatever it likes with the *entire* program.
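
To make the "might overflow" versus "does overflow" distinction concrete,
here is a minimal sketch. The names successor and successor_is_defined are
hypothetical and only for illustration. The addition is well defined for
every argument below INT_MAX; only a call that actually passes INT_MAX makes
the behaviour of that execution undefined.

```c
#include <limits.h>

/* Hypothetical helper: the same expression as in the thread.
   Well defined for every n < INT_MAX; passing INT_MAX overflows a
   signed int, which is undefined behaviour. */
int successor(int n)
{
    return n + 1;
}

/* Hypothetical predicate: true exactly when successor(n) is defined. */
int successor_is_defined(int n)
{
    return n < INT_MAX;
}
```

A caller that only ever passes values for which successor_is_defined returns
true has a fully defined program, whether or not the compiler can prove it.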

In practice this means that an optimising compiler can see n+1 and then
optimise on the assumption that n != INT_MAX (since there are *zero*
constraints on the behaviour of the *entire* program if that assumption is
broken). This optimisation could, for example, remove an if(n==INT_MAX) block
as dead code even if the check occurs *before* the line that involves n+1,
which can be surprising. Of course, if the check is used to conditionally
execute n+1, then the optimiser cannot apply the assumption, so it's still
entirely possible to avoid this particular undefined behaviour.
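
A rough sketch of the two cases, with hypothetical function names. In
unguarded the check does not control whether n+1 is evaluated, so a compiler
is permitted to drop the if block; whether a particular compiler actually
does so depends on its version and optimisation flags. In guarded the check
decides whether n+1 runs at all, so no such assumption is available.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical example: n + 1 is evaluated on every path, so an
   optimiser may assume n != INT_MAX throughout the function. */
int unguarded(int n)
{
    if (n == INT_MAX) {
        /* May be treated as dead code, even though it comes first. */
        fprintf(stderr, "n is INT_MAX\n");
    }
    return n + 1;   /* undefined behaviour when n == INT_MAX */
}

/* Hypothetical example: the addition only runs when it cannot overflow.
   Returns 1 and stores n + 1 in *result on success, 0 otherwise. */
int guarded(int n, int *result)
{
    if (n == INT_MAX) {
        return 0;   /* n + 1 is never evaluated on this path */
    }
    *result = n + 1;
    return 1;
}
```

Calling guarded(INT_MAX, &r) is fully defined and leaves r untouched, while
unguarded(INT_MAX) is not defined at all.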

--
Oscar


