Best search algorithm to find condition within a range

Fri Apr 10 09:18:31 EDT 2015

On Fri, 10 Apr 2015 03:56 am, Chris Angelico wrote:

> On Fri, Apr 10, 2015 at 3:35 AM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> It's not so much the undefined behaviour part that gets me. That's bad
>> enough. But the concept that having the compiler ignore what you write in
>> the source code because you've accidentally hit some undefined part of
>> the spec *is a feature* rather than a horrible, horrible design flaw that
>> blows my mind.
> 
> In Python, you can create an assert with a side effect. This is
> something you should never do, 

"Never" is a bit strong. There's nothing wrong with:

assert log("some message") or condition

for example.

> and the compiler is free to optimize that side effect out. 

Not really. The compiler is free to not execute the assertion at all, which
is not the same thing. A Python compiler which determined that log() always
returned None, and optimized the assertion to "assert condition", would be
buggy.

> Is that outright malicious? The only difference  
> is that in Python, it's well-defined - you can *depend* on that code
> being optimized out under certain specific conditions, rather than it
> depending on the compiler.

There's more to my complaint than just that. In Python, the side-effects of
optimizing assert are localised to the assert statement. That's good! But
in C, the compiler can and will throw away more than assertions. Any chunk
of code could be thrown away, including code that should execute *before*
the undefined behaviour. You have spooky action at a distance and time
travel.

Explaining this requires a bit of set up, and I'm going to steal it from
Raymond Chen, whose blog is awesome despite him working for the Evil
Empire. (Not Google, the other one. No, not Apple. The one that used to be
scary. No, not IBM, more recently than that. Yes, Microsoft, that's the
one!) If you want to read about it in the original C, here it is:

http://blogs.msdn.com/b/oldnewthing/archive/2014/06/27/10537746.aspx

Suppose you write this function in Python:

def value_or_fallback(thingy):
    print("The value of thingy is %d" % thingy)
    return int(thingy) if thingy is not None else 42

`value_or_fallback` is buggy, because %d of None is an error. In actual
Python, the semantics of this are completely well defined: a runtime
TypeError. But in BizarroPython, that operation is undefined behaviour. So
the compiler can optimize that to:

def value_or_fallback(thingy):
    print("The value of thingy is %d" % thingy)
    return int(thingy)

BizarroPython can reason that if `thingy` is None, the first line (the print
call) is already performing undefined behaviour, so the compiler can do
anything, including act as if `thingy` is not None.

To put it another way, the compiler knows that if thingy is None, the print
call will crash before reaching the return call, so it can ignore the error
checking around the return. If thingy is None, the error checking will
never get the chance to run. If thingy isn't None, the error checking is
superfluous. (If this reasoning seems dubious to you, it's because it is.)

So far, this isn't too surprising. If we don't think too hard about it, we
might even consider this to be reasonable behaviour. But now the undefined
behaviour of `value_or_fallback` infects everything it touches, and allows
BizarroPython free reign to brutalise your code with a rusty shovel.

def unwitting(door_is_open):
    if door_is_open:
        walk_on_in()
    else:
       ring_bell()
       # wait for the door to open using the fallback value
       fallback = value_or_fallback(thingy)
       wait_for_door_to_open(fallback)

BizarroPython can optimize this entire function to:

def unwitting(door_is_open):
    walk_on_in()

The compiler observes that if `door_is_open` is false, the else branch
always performs undefined behaviour. Therefore, the entire else branch can
be treated as dead code.

Now let's start travelling in time:

def keep_checking_door():
    while True:
        response = input("Is the door open?")
        if len(response) != 1:
            return None
        door_is_open = response == 'Y'
        unwitting(door_is_open)

Our BizarroPython compiler may reason that "if door_is_open is false, then
the behaviour is undefined" and compile the function as if it were this:

def keep_checking_door():
    while True:
        response = input("Is the door open?")
        if len(response) != 1:
            return None
        door_is_open = response == 'Y'
        if not door_is_open:
            os.abort()  # Dumps core.
        walk_on_in()

All this is well and good, if your intention was to crash faster. But if
you're trying to debug the crash, you have lost a valuable debugging hint.
If you thought you could debug your code by tracing it in your head, or on
paper, you've got a nasty surprise waiting. If you reason that since the
bell never gets rung, the bug causing the crash must be somewhere *before*
the call to ring_bell, but that's not the case. The bug occurs after
ring_bell, and propagates "backwards in time" to crash the application
before reaching the buggy code.

Obviously this isn't *actual* time travel, a less provocative phrase is the
one used by the C standard:

    However, if any such execution contains an undefined operation, 
    this International Standard places no requirement on the 
    implementation executing that program with that input (not even 
    with regard to operations preceding the first undefined operation).

I prefer to call it time travel.

> (At least, I think that's the case. Is it 
> legal for a Python implementation to fully evaluate asserts and then
> simply not raise the exception?)

The documented semantics of assertions in Python, like most (all?) other
languages which have them, is that assertions either run or don't run. What
doesn't happen is for them to run but suppress any exceptions. That would
be pointless: all the expensive of running the test, without the benefit of
seeing whether the test fails or not.

Could another language work the way you suggest? Well sure, another language
might define assertions any way they like, just as it might use "if x in
sequence" as a loop statement and have a rule that print() outputs to
stdout except on public holidays, when it outputs to a temporary file in
your home directory. There's no limit to how foolish languages could be.

> Take this example:
> 
> def spam(lst):
>     assert lst[0] < lst[1]
>     # carry on with the rest of the code
> 
> If lst is a Python list, this is side-effect free. But what if it
> isn't? What if __getitem__ actually mutates the object? The assertion
> can be dropped, and the getitem calls not made. Is that a horrible,
> horrible design flaw in Python, or is it an ill-designed object that
> does things that programmers should be allowed to assume never happen?

Neither. This is perfectly reasonable behaviour, side-effects or not.

assert can be considered syntactic sugar for this:

# assert expression
if assertions_enabled:
    if expression: raise AssertionError

If expression happens to have side-effects, the fact that it isn't evaluated
when asserts are disabled is no more surprising, or horrible, than the fact
that expression won't be evaluated here:

if False:
    print(expression)

(In Python's case, `assertions_enabled` is actually the read-only built-in
dunder variable __debug__.)

-- 
Steven