unintuitive for-loop behavior

Steve D'Aprano steve+python at pearwood.info
Sat Oct 1 03:06:02 EDT 2016


Earlier, I wrote:

> On Sat, 1 Oct 2016 10:46 am, Gregory Ewing wrote:
[...]
>> Whenever there's binding going on, it's necessary to decide
>> whether it should be creating a new binding or updating an
>> existing one.
> 
> Right.

I changed my mind -- I don't think that's correct.

I think Greg's suggestion only makes sense for languages where variables are
boxes with fixed locations, like C or Pascal. In that case, the difference
between creating a new binding and updating an existing one is *possibly*
meaningful:

# create a new binding
x: address 1234 ----> [  box contains 999 ]
x: address 5678 ----> [  a different box, containing 888 ]

What happens to the old x? I have no idea, but the new x is a different box.
Maybe the old box remains there, for any existing code that refers to the
address of (old) x. Maybe the compiler is smart enough to add address 1234
to the free list of addresses ready to be used for the next variable. Maybe
it's just lost and unavailable until this function exits.

# update an existing binding
x: address 1234 ----> [  box contains 999 ]
x: address 1234 ----> [  same box, now contains 888 ]


That's the normal behaviour of languages like C and Pascal. But it's distinct
from the previous, hypothetical behaviour.

But Python doesn't work that way! Variables aren't modelled by boxes in
fixed locations, and there is no difference between "create a new binding"
and "update an existing one". They are indistinguishable. In both
cases, 'x' is a key in a namespace dict, which is associated with a value.
There's no difference between:

    x = 999
    del x
    x = 888

and 

    x = 999
    x = 888
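
As a rough check of the claim above, here is a minimal sketch that runs both
snippets in fresh dict namespaces and compares what is left behind. The
helper name run() is made up for this illustration, and using exec() with an
explicit dict is only a stand-in for a real module namespace:

```python
def run(source):
    # Execute the source with an explicit, initially empty local namespace
    # and return that namespace so we can inspect the resulting bindings.
    namespace = {}
    exec(source, {}, namespace)
    return namespace

rebound = run("x = 999\ndel x\nx = 888")  # bind, delete, bind again
updated = run("x = 999\nx = 888")         # bind, then rebind

# Both namespaces end up with the same single binding for 'x'.
assert rebound == updated == {"x": 888}
```

Either way, all that remains is the key 'x' associated with 888.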


If you consider the namespace dict as a hash table, e.g. something like this
(actual implementations may differ):

[ UNUSED, UNUSED, (key='y', value=23), (key='a', value=True), 
  UNUSED, (key='x', value=999), UNUSED ]

then binding 888 to 'x' must put the key in the same place in the dict,
since that's where hash('x') will point.

Subject to linear probing, chaining, re-sizes, or other implementation
details of hash tables of course. But all else being equal, you cannot
distinguish between the "new binding" and "update existing binding"
cases -- in both cases, the same cell in the hash table gets affected,
because that's where hash('x') points. If you put it somewhere else, it
cannot be found.
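
To make that concrete, here is a toy sketch of a fixed-size hash table with
linear probing. It is emphatically not how CPython's dict is implemented;
the class ToyTable and its methods are invented for this example, and it
ignores deletion and resizing altogether:

```python
class ToyTable:
    """Hypothetical fixed-size hash table, for illustration only."""

    def __init__(self, size=8):
        self.slots = [None] * size  # None marks an UNUSED slot

    def _find(self, key):
        # Start where hash(key) points, then probe forward until we find
        # either the key itself or an unused slot.
        i = hash(key) % len(self.slots)
        for _ in range(len(self.slots)):
            slot = self.slots[i]
            if slot is None or slot[0] == key:
                return i
            i = (i + 1) % len(self.slots)
        raise RuntimeError("table full")

    def bind(self, key, value):
        # The same code path handles "new binding" and "update binding".
        i = self._find(key)
        self.slots[i] = (key, value)
        return i

table = ToyTable()
table.bind("y", 23)
table.bind("a", True)
first = table.bind("x", 999)   # 'x' not present yet
second = table.bind("x", 888)  # 'x' already present

# Both bindings land in the same slot of the table.
assert first == second
```

The table has a single bind() operation; whether 'x' was already present
makes no difference to where the entry ends up.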

So I think I was wrong to agree with Greg's statement. I think that for
languages like Python where variables are semantically name bindings in a
namespace rather than fixed addresses, there is no difference between
updating an existing binding and creating a new one.

In a language like Python, the only distinction we can make between name
bindings is, which namespace is the binding in? In other words, what is the
current block of code's scope?
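
A small sketch of that one distinction, using two nested functions (the
names outer and inner are arbitrary). The same name 'x' is bound in two
different namespaces, and the two bindings do not interfere:

```python
def outer():
    x = 999  # binding in outer's namespace

    def inner():
        x = 888  # a separate binding in inner's own namespace
        return locals()

    inner_ns = inner()
    # outer's x is untouched by the assignment inside inner.
    return x, inner_ns

outer_x, inner_ns = outer()
assert outer_x == 999
assert inner_ns == {"x": 888}
```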




-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



