Verifiably better, validated Enum for Python

Steve D'Aprano steve+python at pearwood.info
Fri May 26 18:48:33 EDT 2017


On Fri, 26 May 2017 10:56 pm, Rhodri James wrote:

> On 26/05/17 12:46, Steve D'Aprano wrote:
>> On Thu, 25 May 2017 11:26 am, Chris Angelico wrote:
>> 
>>> On Thu, May 25, 2017 at 9:34 AM, bartc<bc at freeuk.com>  wrote:
>>>> That was quite likely with older Fortrans, where subroutines only used
>>>> pass-by-reference, but which didn't stop you passing references to
>>>> constants that the subroutine could then modify.
>>>>
>>>> I doubt that's still the case. With C, however, I tried this today:
>>>>
>>>>    "abcdef"[3] = 'z';
>>>>    puts("abcdef");
>>>>
>>>> Some versions crashed; some printed 'abczef', and some 'abcdef'. (The
>>>> language says this is undefined, but it doesn't try too hard to stop
>>>> you doing it.)
>>> And why should they try to stop you? The whole point of undefined
>>> behaviour is that you shouldn't be doing this, so if you do, the
>>> interpreter's allowed to do anything.
>> Does the C specification actually refer to this as undefined? I'm not a C
>> expert, but it seems to me that it is defined as an error.
> 
> The C99 standard (the one I happen to have beside me) says:
> 
> Section 6.4.5 (String Literals) Para 6
> 
> "It is unspecified whether these arrays are distinct provided their
> elements have the appropriate values.  If the program attempts to modify
> such an array, the behavior is undefined."

That seems to be definitive then: it is undefined behaviour to assign to a
string literal, which means the C compiler can:

(1) modify the literal;
(2) emit a compile-time error (which is what gcc appears to do);
(3) erase your hard drive;
(4) or anything else that it wishes to do.

 
> The equivalent section in the rationale is a bit clearer if less
> definitive:  "String literals are not required to be modifiable.  This
> specification allows implementations to share copies of strings with
> identical text, to place string literals in read-only memory, and to
> perform certain optimisations."  It also observes that some members of
> the C89 committee insisted that string literals be modifiable.
> 
> In other words it's not an error but it won't always work, which is what
> the standard generally means by "undefined".  

The standard means something more than that by undefined. Something much
nastier.

> Anyone doing 
> cross-platform work should avoid it like the plague it will become to
> them.

s/Anyone doing cross-platform work/Everyone/

It's not just *cross-platform* where this is a problem. It means that the
behaviour of undefined code can vary from version to version of the same
compiler on the same platform, or even according to the phase of the moon
on the day you compile the code.

I don't actually believe that any real compiler will *literally* contain
code that looks like this:


if phase_of_moon(now()) != X:
    # emit machine code you expected
else:
    # erase your hard drive


but if one did, that would be perfectly legal and permitted by the standard.
More realistically, the compiler is allowed to re-arrange your code, delete
dead code, etc. If it sees undefined behaviour, it is allowed to do the
same **even if it changes the semantics of the source code**. Hence, the
compiler might decide that the optimal way to proceed is to erase your hard
drive.
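
A commonly cited illustration of this kind of rearrangement is a null check
that comes after the pointer has already been dereferenced. The sketch below
is only an illustration: first_element is a hypothetical name, and whether
any particular compiler actually deletes the check depends on its version
and optimisation settings.

```c
#include <stddef.h>

/* Hypothetical function, for illustration only. */
int first_element(const int *p)
{
    int value = *p;     /* undefined behaviour if p is NULL */

    /* Because the dereference above is only defined when p != NULL,
       an optimiser is permitted to assume p != NULL from here on and
       treat this branch as dead code. */
    if (p == NULL) {
        return -1;
    }
    return value;
}
```

Called with a valid pointer, the function is well defined and returns the
value pointed to. Called with NULL, the standard makes no promise that the
-1 branch is ever taken.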

There's a subtle point here: according to the C standard, code with
undefined behaviour *has no semantics*, so the compiler isn't really
changing the semantics of your source code.

It isn't just the section of code which performs the undefined behaviour
that has no meaning, but any part of the program which is reachable from
that code, or reaches that code. So according to the standard, since there
are no semantics to start with, the compiler can do whatever it likes.

You or I see code like:

    "abcdef"[3] = 'z';

and say "that code is supposed to set the fourth character (index 3) of the
string literal to 'z'". But that's not what the C99 standard says it means.
The C99 standard says that it has no meaning at all, and the compiler can do
whatever it likes.
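
For contrast, here is a minimal sketch of a well-defined way to get the
effect the assignment above was presumably after. It modifies a char array
that is merely initialised from the literal, not the literal itself. The
function name modified_copy is my own invention for the example.

```c
#include <string.h>

/* Hypothetical helper: writes a modified copy of "abcdef" into out.
   The caller must supply room for at least 7 bytes. */
void modified_copy(char *out)
{
    /* buf is our own writable array; the literal only supplies its
       initial contents, so assigning to buf is well defined. */
    char buf[] = "abcdef";

    buf[3] = 'z';       /* index 3 is the fourth character */
    strcpy(out, buf);   /* out now holds "abczef" */
}
```

Passing a 7-byte buffer and then calling puts() on it should print abczef
on any conforming implementation, with no dependence on the phase of the
moon.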



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



