unintuitive for-loop behavior

Steve D'Aprano steve+python at pearwood.info
Sun Oct 2 00:19:39 EDT 2016


On Sun, 2 Oct 2016 11:44 am, Gregory Ewing wrote:

> Steve D'Aprano wrote:
>> When you say:
>> 
>>     x = 0
>>     x = 1
>> 
>> inside a function, and the interpreter does the name binding twice,
>> there's no way of telling whether it writes to the same cell each time or
>> not.
> 
> Yes, there is:
> 
>  >>> def f():
> ...  x = 0
> ...  f1 = lambda: x
> ...  x = 1
> ...  f2 = lambda: x
> ...  print(f1(), f2())
> ...
>  >>> f()
> 1 1
> 
> This indicates that both assignments updated the same slot.
> Otherwise the result would have been "0 1".


No, it doesn't mean that at all. The result you see is compatible with *both*
the "update existing slot" behaviour and the "create a new slot" behaviour.
The *easiest* way to prove that is to categorically delete the existing
"slot" and re-create it:

x = 0
f1 = lambda: x
del x
assert 'x' not in locals()
x = 1
f2 = lambda: x
print(f1(), f2())


which will still print exactly the same results. 
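
To make that concrete, here is a minimal sketch that runs the same
statements twice: once as a function body, and once as top-level code in a
fresh namespace via exec(). The names in_function, at_top_level and
TOP_LEVEL_SOURCE are mine, purely for illustration, and I have swapped the
print for a returned tuple so that the two results can be compared.

```python
def in_function():
    # Same statements as above, executed as a function body.
    x = 0
    f1 = lambda: x
    del x
    assert 'x' not in locals()
    x = 1
    f2 = lambda: x
    return f1(), f2()


# Same statements again, to be executed as top-level (module-style) code.
TOP_LEVEL_SOURCE = """
x = 0
f1 = lambda: x
del x
assert 'x' not in locals()
x = 1
f2 = lambda: x
result = (f1(), f2())
"""


def at_top_level():
    namespace = {}  # fresh namespace standing in for a module's globals
    exec(TOP_LEVEL_SOURCE, namespace)
    return namespace["result"]


print(in_function(), at_top_level())  # (1, 1) (1, 1)
```

Both calls give the same answer, so nothing in the output says whether the
second assignment reused an old slot or created a new one.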

Objection: I predict that you're going to object that despite the `del x`
and the assertion, *if* this code is run inside a function, the "x slot"
actually does still exist. It's not "really" deleted, the interpreter just
makes sure that the high-level behaviour is the same as if it actually were
deleted.

Well yes, but that's exactly what I'm saying: that's not an objection, it
supports my argument! The way CPython handles local variables is an
implementation detail. The high-level semantics is *identical* between
CPython functions, where local variables live in a static array of "slots"
and re-binding always updates an existing slot, and IronPython, where they
don't. The only way you can tell them apart is by studying the
implementation.

In IronPython, you could have the following occur in a function's locals,
just as it could happen in CPython for globals:

- delete the name binding "x"
- which triggers a dictionary resize
- bind a value to x again
- because the dictionary is resized, the new "slot" for x is in a 
  completely different position of the dictionary to the old one

There is no user-visible difference between these two situations. Your code
works the same way whether it is executed inside a function or at the
global top level, whether functions use the CPython local variable
optimization or not.
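
A rough sketch of the dictionary case, using a plain dict as the global
namespace for a small piece of code. The names namespace and get_x are
hypothetical, and whether the dict really resizes or moves the entry for x
is an assumption about the implementation that this code cannot confirm.
That is rather the point: all it can observe is the value.

```python
# A dict used as the global namespace for a tiny piece of code.
namespace = {}
exec("x = 0\ndef get_x():\n    return x\n", namespace)
get_x = namespace["get_x"]
before = get_x()

# Delete the binding, then churn the dict. This *may* make the
# implementation resize its table; we have no portable way to check.
del namespace["x"]
for i in range(10000):
    namespace["filler_%d" % i] = i
for i in range(10000):
    del namespace["filler_%d" % i]

# Bind x again. Wherever the new entry lives, lookups behave the same.
namespace["x"] = 1
after = get_x()

print(before, after)  # 0 1
```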


> It's not currently possible to observe the other behaviour in
> Python, because the only way to create new bindings for local
> names is to enter a function.

Not at all. Delete a name, and the binding for that name is gone. Now assign
to that name again, and a new binding must be created, by definition, since
a moment ago it no longer existed.

x = 1
del x
assert 'x' not in locals()
x = 2
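
Here is a small sketch of the same thing inside a function, where the
binding can be checked between the del and the second assignment. The
function name rebind is made up for the example.

```python
def rebind():
    x = 1
    del x
    try:
        _ = x  # the name has no binding at this point
    except UnboundLocalError:
        was_gone = True
    else:
        was_gone = False
    x = 2  # a binding exists again
    return was_gone, x


print(rebind())  # (True, 2)
```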


> The change to for-loop semantics 
> I'm talking about would introduce another way.

I have lost track of how this is supposed to change for-loop semantics.

I'm especially confused because you seem to be arguing that by using an
implementation which CPython already uses, for-loops would behave
differently. So either I'm not reading you right or you're not explaining
yourself well.



>> Certainly when you call a function, the local bindings need to be
>> created. Obviously they didn't exist prior to calling the function! I
>> didn't think that was the difference you were referring to, and I fail to
>> see how it could be relevant to the question of for-loop behaviour.
> 
> My proposed change is (mostly) equivalent to turning the
> loop body into a thunk and passing the loop variable in as
> a parameter.

A thunk is not really well-defined in Python, because it doesn't exist, and
therefore we don't know what properties it will have. But generally when
people talk about thunks, they mean something like a light-weight anonymous
function without any parameters:

https://en.wikipedia.org/wiki/Thunk
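
For illustration only, the nearest thing in today's Python is probably a
zero-argument callable that packages up a computation to be run later. The
helper make_thunk and the function noisy_square below are hypothetical
names, not an existing API, and this is a sketch of the general idea rather
than of anything Python actually calls a thunk.

```python
def make_thunk(func, *args):
    """Package a call up as a zero-argument callable (hypothetical helper)."""
    return lambda: func(*args)


calls = []


def noisy_square(n):
    calls.append(n)  # record that the computation really ran
    return n * n


thunk = make_thunk(noisy_square, 7)
assert calls == []  # nothing has been computed yet

value = thunk()  # the computation happens only now
print(value, calls)  # 49 [7]
```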


You say "passing the loop variable in as a parameter" -- this doesn't make
sense. Variables are not values in Python. You cannot pass in a
*variable* -- you can pass in a name (the string 'x') or the *value* bound
to the name, but there's no existing facility in Python to pass in a
variable. If there was, you could do the classic "pass by reference" test:

    Write a *procedure* which takes as argument two variables 
    and swaps their contents

but there's no facility to do something like that in Python. So it's hard to
talk about your hypothetical change except in hand-wavy terms:

"Something magical and undefined happens, which somehow gives the result I
want."
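
For anyone who hasn't tried the swap test, a minimal sketch follows. The
function name swap is just the obvious one for the example.

```python
def swap(a, b):
    # Rebinds the local names a and b only; the caller's names are untouched.
    a, b = b, a


x, y = 1, 2
swap(x, y)
print(x, y)  # 1 2
```

The function receives the values bound to x and y, not the variables
themselves, so the caller's bindings are unchanged.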



> This is the way for-loops or their equivalent are actually
> implemented in Scheme, Ruby, Smalltalk and many other similar
> languages, which is why they don't have the same "gotcha".

Instead, they presumably have some other gotcha -- "why don't for loops work
the same as unrolled loops?", perhaps.

In Python, the most obvious gotcha would be that if for-loops introduced
their own scope, you would have to declare any other variables in the outer
scope nonlocal. So this would work fine as top-level code:

x = 1
for i in range(10):
    print(x)
    x += 1

but inside a function it would raise UnboundLocalError, unless you changed
it to this:


def spam():
    x = 1
    for i in range(10):
        nonlocal x
        print(x)
        x += 1
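
Today's Python has no for-loop scope, so the version above cannot be run as
written. As a rough stand-in, the sketch below moves the loop body into a
nested function, which does get its own scope. The names
spam_without_nonlocal, spam_with_nonlocal and body are hypothetical, and I
have dropped the print and returned x instead, to keep the output short.

```python
def spam_without_nonlocal():
    x = 1

    def body(i):
        x += 1  # x is treated as local to body, so this fails

    for i in range(10):
        body(i)
    return x


def spam_with_nonlocal():
    x = 1

    def body(i):
        nonlocal x  # explicitly refer to spam_with_nonlocal's x
        x += 1

    for i in range(10):
        body(i)
    return x


try:
    spam_without_nonlocal()
except UnboundLocalError:
    print("UnboundLocalError without the declaration")

print(spam_with_nonlocal())  # 11
```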


So we would be swapping a gotcha that only affects a small number of people,
using a fairly advanced concept (functional programming techniques), with
at least one trivial work-around, for a gotcha that would affect nearly
everyone nearly all the time. Yay for progress!
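
For reference, here is a sketch of the existing gotcha together with one
commonly used work-around, the default-argument idiom. I am assuming that is
the sort of work-around meant above; the list names late and early are made
up for the example.

```python
# The gotcha: each lambda looks up i when it is called, not when it is made.
late = []
for i in range(3):
    late.append(lambda: i)
print([f() for f in late])  # [2, 2, 2]

# One work-around: capture the current value as a default argument.
early = []
for i in range(3):
    early.append(lambda i=i: i)
print([f() for f in early])  # [0, 1, 2]
```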



> (Incidentally, this is why some people describe Python's
> behaviour here as "broken". They ask -- it works perfectly
> well in these other languages, why is Python different?)

*shrug* 

Define "it". Python works perfectly well too. It just works differently from
what some people expect, especially if they don't think about the meaning
of what they're doing and want the interpreter to DWIM.


> The trick with cells is a way to get the same effect in
> CPython, without the overhead of an actual function call
> on each iteration (and avoiding some of the other problems
> that using a thunk would entail).

"The trick with cells" -- what trick do you mean?


> The cell trick isn't strictly necessary, though -- other
> Python implementations could use a thunk if they had to.




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



