What other languages use the same data model as Python?

John Nagle nagle at animats.com
Wed May 4 17:52:11 EDT 2011


On 5/4/2011 3:51 AM, Steven D'Aprano wrote:
> On Wed, 04 May 2011 02:56:28 -0700, Devin Jeanpierre wrote:
>
>> Python is pass-by-value in a
>> meaningful sense, it's just that by saying that we say that the values
>> being passed are references/pointers. This is maybe one level of
>> abstraction below what's ideal,
>
> "Maybe"?
>
> Given the following statement of Python code:
>
>>>> x = "spam"
>
> what is the value of the variable x? Is it...?
>
> (1) The string "spam".
>
> (2) Some invisible, inaccessible, unknown data structure deep in the
> implementation of the Python virtual machine, which the coder cannot
> access in any way using pure Python code.
>
> (Possibly a pointer, but since it's an implementation detail, other
> implementations may make different choices.)
>
> (3) Something else.
>
>
> I argue that any answer except for (1) is (almost always)
> counter-productive: it adds more confusion than it sheds light. It
> requires thinking at the wrong level, at the implementation level instead
> of the level of Python code. If we define "value" to mean the invisible,
> inaccessible reference, then that leaves no word to describe what the
> string "spam" is.
>
> (I say "almost always" counter-productive because abstractions leak, and
> sometimes you do need to think about implementation.)

    Yes.  In Python, the main leak involves the "is" operator and the 
"id()" function.  Consider:

 >>> x = "spam"
 >>> y = "spam"
 >>> x == y
True
 >>> x is y
True
 >>> z = x + 'a'
 >>> z = z[:4]
 >>> z
'spam'
 >>> x is z
False
 >>> x == z
True
 >>> id(x)
30980704
 >>> id(y)
30980704
 >>> id(z)
35681952

There, the abstraction has broken down.  x, y, and z all reference
the value "spam", but they reference two, not one or three, instances
of it.
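
For anyone who wants to poke at this outside the interactive prompt, here
is a minimal sketch of the same experiment as a script. The helper name
compare() is made up for illustration. Only the equality result is
something the language promises; the identity result for the two strings
is whatever the implementation happened to do, so the script prints it
rather than assuming it. The list example at the end is there for
contrast: for mutable objects, identity is part of the documented
semantics.

```python
def compare(a, b):
    """Return (same value, same object) for two references."""
    return a == b, a is b


x = "spam"
z = (x + "a")[:4]   # rebuilt at run time; equal to x by value

value_equal, same_object = compare(x, z)
print(value_equal)   # equality is defined by the language
print(same_object)   # identity here is the implementation's choice

# With mutable objects, identity is observable by design.
a = [1, 2]
b = a          # a second name for the same list
c = list(a)    # an equal copy, guaranteed to be a distinct object
b.append(3)    # visible through a, not through c
print(a == [1, 2, 3], a is b, a is c)   # True True False
```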

    Arguably, Python should not allow "is" or "id()" on
immutable objects.  The programmer shouldn't be able to tell when
the system decides to optimize an immutable by sharing one instance.

"is" is more of a problem than "id()"; "id()" is an explicit peek
into an implementation detail.  The "is" operator is a normal
part of the language, and can result in weird semantics.  Consider:

 >>> 1 is (1+1-1)
True
 >>> 100000 is (100000+1-1)
False

That's a quirk of CPython's boxed number implementation.   All
integers are boxed, but there's a set of canned objects for
small integers.  CPython's range for this is -5 to +256,
incidentally.  That's visible through the "is" operator.
Arguably, it should not be.
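
A rough way to see where the canned objects start and stop, assuming
nothing beyond "is" itself: the hypothetical helper below rebuilds each
integer at run time (so the compiler cannot fold constants) and records
the values for which the rebuilt integer is the very same object. The
function name and the scan bounds are my own choices. From the
description above, the reported span on CPython would be expected to run
from -5 to 256; other implementations are free to report something else
entirely, including nothing or everything.

```python
def shared_small_ints(lo=-10, hi=300):
    """Return the ints in [lo, hi] whose recomputed value is the same object."""
    hits = []
    for n in range(lo, hi + 1):
        recomputed = int(str(n))   # built at run time, no constant folding
        if recomputed is n:        # identity test, not equality
            hits.append(n)
    return hits


hits = shared_small_ints()
span = (hits[0], hits[-1]) if hits else None
print(span)   # implementation-dependent
```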

				John Nagle


