Verifiably better, validated Enum for Python

Steve D'Aprano steve+python at pearwood.info
Fri May 26 22:59:54 EDT 2017


On Sat, 27 May 2017 09:36 am, Chris Angelico wrote:

> On Sat, May 27, 2017 at 8:48 AM, Steve D'Aprano
> <steve+python at pearwood.info> wrote:
>> I don't actually believe that any real compiler will *literally* contain
>> code that looks like this:
>>
>>
>> if phase_of_moon(now()) != X:
>>     # emit machine code you expected
>> else:
>>     # erase your hard drive
>>
>>
>> but if one did, that would be perfectly legal and permitted by the
>> standard. More realistically, the compiler is allowed to re-arrange your
>> code, delete dead code, etc. If it sees undefined behaviour, it is
>> allowed to do the same **even if it changes the semantics of the source
>> code**. Hence, the compiler might decide that the optimal way to proceed
>> is to erase your hard drive.
>>
>> There's a subtle point here: according to the C standard, code with
>> undefined behaviour *has no semantics*, so the compiler isn't really
>> changing the semantics of your source code.
>>
> 
> In the Python spec, is there anything ANYWHERE that makes any stronger
> claim about the example I gave? Here's an example:
> 
> from ctypes import cast, POINTER, c_int
> def demo_function():
>     cast(id(29), POINTER(c_int))[6] = 100
>     if 29 > 50:
>         print(29)
> 
> What is a standards-compliant Python interpreter allowed to do?

There's no such thing, so your question is moot.

There is no Python standard. There's only:

- do what CPython does;

- do what the documentation says;

- if they disagree, or don't say, ask Guido;

- if he doesn't answer, or doesn't care, do what you like.

But in this specific case, we can reason like this:

29 is not larger than 50, so the only *correct* behaviour of demo_function()
is to call cast() and then return None. And a peep-hole optimizer would be
allowed to forgo the pointless test of whether 29 > 50 (of course it isn't)
and remove the call to print as dead code.
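
A minimal sketch of that reasoning, with a hypothetical stand-in called
fake_cast in place of the real ctypes call so that nothing touches the
interpreter's memory. The names fake_cast, calls and printed are invented
for this illustration only:

```python
import contextlib
import io

calls = []

def fake_cast(address, pointer_type):
    # Hypothetical stand-in for ctypes.cast: it only records that it was
    # called and never writes to memory.
    calls.append((address, pointer_type))

def demo_function():
    fake_cast(id(29), "POINTER(c_int)")
    if 29 > 50:
        print(29)  # unreachable as long as 29 really is 29

# Capture anything written to stdout while the function runs.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    result = demo_function()
printed = buffer.getvalue()

print(repr(result), repr(printed), len(calls))
```

With an untampered interpreter the function calls the stand-in once, prints
nothing, and returns None, which is why dropping the print branch would not
change anything observable.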

However, since ctypes allows you to interface with the CPython
implementation, the call to cast risks messing up the CPython interpreter's
internal state. But if you make arbitrary changes to the internal state of
the interpreter, no behaviour is guaranteed *at all*.

The (actual or de facto) Python standard doesn't come into this. You're not
running Python, you're running some hacked interpreter.

I assume the intention is to find the cached int 29 and change its internal
value to 100. That *certainly* is implementation dependent: there's no
language promise that 29 is cached *at all*, and indeed in Python 2.7 on my
computer, your cast fails to have any visible effect I can see:

py> from ctypes import cast, POINTER, c_int
py> cast(id(29), POINTER(c_int))[6] = 100
py> print 29, 29+1, 29>50
29 30 False


Not even if I run a variant:

py> x = 29
py> cast(id(x), POINTER(c_int))[6] = 100
py> x
29
py> x + 1
30


So I don't know what the cast assignment has done, but it's not what you
expected.
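
Whether small ints are cached can be probed without ctypes at all. This is
only a sketch: the names a, b, same_object and equal_values are made up for
the illustration, and the identity result is deliberately not relied upon,
since the language makes no promise about it either way.

```python
# Build two ints at run time so they are not simply the same literal.
a = int("29")
b = int("29")

# Equality of values is guaranteed by the language.
equal_values = (a == b)

# Identity is an implementation detail: CPython happens to reuse small
# int objects, but another implementation need not.
same_object = (a is b)

print("equal values:", equal_values)
print("same object (implementation detail):", same_object)
```

Code that depends on the second result, or on where in memory such an
object lives, is relying on the implementation rather than on Python.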



> 1) print "29"

That would be a strange thing to do. If it happened, it would mean the
interpreter was broken.

> 2) print "100"

That too would mean the interpreter is broken.

> 3) pretend the entire body of the function was "pass"

Absolutely not, because the compiler doesn't know what cast() does or if it
has any side-effects (or if it even exists) until it actually calls it.
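
To illustrate why, here is a small sketch in which a global name is rebound,
and then deleted, after the function that uses it has been compiled. The
cast defined here is a hypothetical placeholder, not the ctypes function:

```python
def cast(*args):
    # Placeholder standing in for whatever "cast" is bound to at first.
    return "original"

def demo():
    # The name "cast" is looked up each time this line runs,
    # not when demo() is compiled.
    return cast()

first = demo()

def cast(*args):
    # Rebinding the name changes what demo() does from now on.
    return "rebound"

second = demo()

del cast
try:
    demo()
    lookup_failed = False
except NameError:
    # The name no longer exists, and that is only discovered at call time.
    lookup_failed = True

print(first, second, lookup_failed)
```

Since the same compiled function gives three different outcomes, a compiler
cannot assume anything about cast() unless it can prove the name is never
rebound.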

> 4) erase your hard disk

Well, if that's what cast() does, then that's what it will do.


> 5) something else

It will do whatever cast() does.


> Suppose that someone writes a highly-optimized Python that tracks
> global namespace updates (so it knows that these really are the things
> imported from ctypes), and does all manner of dead code removal and so
> on. Is there anything in the spec that defines exactly what the
> semantics of this function are, after you mess around in ctypes? 
> And if not, then wouldn't it be legal to remove the print call, as dead
> code? Show me something that precludes that.

Guido dislikes optimizers in general, so there's that :-)


> Show me something that 
> makes this any more defined than C's behaviour in those cases you hate
> so much.

If you think there is *any* similarity between what ctypes is doing and what
the C standard says C compilers are allowed to do in the face of undefined
behaviour, you really haven't understood what C means about undefined
behaviour.

I've linked to these before, written by *actual* C programmers who know what
they're talking about, unlike me. They have forgotten more about C than
I've ever learned. Read them. Pay attention to what they say. When they say
that undefined behaviour can lead to the compiler erasing your hard drive,
and time travel, they mean it. These things *can and do* happen, and real
bugs happen because the compiler reinterprets the programmer's code.

http://blog.regehr.org/archives/213
http://blog.regehr.org/archives/226
http://blog.regehr.org/archives/232

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html

https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=633/

https://randomascii.wordpress.com/2014/05/19/undefined-behavior-can-format-your-drive/


And unlike me, some of them at least think that the optimization benefits
outweigh the disadvantages.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



