What is a function parameter =[] for?

Thu Nov 26 21:44:05 EST 2015

On Fri, 27 Nov 2015 12:40 pm, Ben Finney wrote:

> Steven D'Aprano <steve at pearwood.info> writes:
> 
>> On Thu, 26 Nov 2015 09:34 pm, Dave Farrance wrote:
>>
>>
>> > (Conversely, I see that unlike CPython, all PyPy's numbers have
>> > unchanging ids, even after exiting PyPy and restarting, so it seems
>> > that PyPy's numerical ids are "faked".)
>>
>> I'm pretty sure that they are faked.
> 
> It's still not been expressed what “fake” refers to here. Or, rather,
> what “real” thing was being expected, and how these don't qualify.

They are faked in the sense that in this implementation, the object lifespan
that you think of as the Python programmer has little if any connection to
the actual lifespan of the chunk of memory representing that object.

Suppose you have a Python object which in turn contains three other objects:

L = [10001, 10002, 10003]

Take the id of that list:

myid = id(L)

You do some long-running processing of that list, and then compare the id at
the end:

assert myid == id(L)

This is guaranteed to pass by the language definition.

We can say that at *every instant* from the creation of the list to the
moment it is garbage collected, if you took a snap-shot of the process'
memory, and manually scanned through it, you could be able to identify four
object structs, corresponding to the list and the three ints. They might
move around and be found in different memory locations, (as in IronPython
and Jython), but they will be there.

But not with PyPy. Not only might the objects have moved location, but in
some snapshots you won't find them at all. The JIT compiler may convert the
high-level Python objects (a list, and three ints) to (let's say) a
low-level array containing three C shorts, perform processing on that, and
then recreate the Python objects only when finished.

Looked at from the perspective of the implementation, the actual lifespan of
the objects (in the sense of the struct in memory that represents that
object) may be much less than what we see from the perspective of the
high-level Python code.

Of course in Python, the object continues to exist the whole time, as far as
we can tell -- there is no test we can do from Python code that can
distinguish the state of the object, just as there is no test we can do
from Python to tell whether an object has moved memory location or not. But
from outside of Python -- say, a debugger that hooks into the particular
implementation, or by taking a dump of the memory and looking at it in a
hex editor -- we can see the high-level objects come and go and move
around.

The PyPy implementation has to take special actions to preserve the ID
across object recreations. That is what I mean by "faked".

-- 
Steven