anything like C++ references?

Bengt Richter bokr at oz.net
Wed Jul 16 01:18:48 EDT 2003


On Tue, 15 Jul 2003 22:14:20 +0100, Stephen Horne <intentionally at blank.co.uk> wrote:

>On 15 Jul 2003 20:32:30 GMT, bokr at oz.net (Bengt Richter) wrote:
>
>>On Tue, 15 Jul 2003 02:40:27 +0100, Stephen Horne <intentionally at blank.co.uk> wrote:
>>
>>>On Mon, 14 Jul 2003 00:07:44 -0700, Erik Max Francis <max at alcyone.com>
>>>wrote:
>>>
>>>>Stephen Horne wrote:
>>>>
>>>>> Imagine, for instance, changing all assignments and other 'copying' to
>>>>> use a copy-on-write system. Then add a pointer type and a 'newcopyof'
>>>>> operator. And nick the C-style prefix '*' for dereferencing and add
>>>>> '&' as a 'make a pointer to this object' operator (absolutely NOT a
>>>>> physical memory address).
>>>>
>>>>The problem is that this misses the point that Python is not C++.  You
>>>>shouldn't try grafting syntaxes and approaches that make Python look
>>>>more like C++, you should be learning to use Python on its own merits.
>>>
>>>Not true.
>>>
>>>If you eliminate all violations of the idea of variables bound to
>>>values (basically the whole point of my thread), then mutable
>>>containers cannot achieve the same thing.
>>>
>>>If a copy-on-write scheme is added to Python, you'd get the following
>>>results from using mutable containers...
>>>
>>>>>> a = [1, 2, 3]
>>>>>> b = a
>>>>>> b[2] = 4
>>>>>> a
>>>[1, 2, 3]
>>>>>> b
>>>[1, 2, 4]
>>>
>>Yes, but it would be hugely inefficient for the case where, e.g., you are
>>modifying lines read from a 20MB log file. Imagine inducing a 20MB copy
>>every time you wanted to delete or modify a line. Now you have to invent
>>a way to program around what can be done very conveniently and efficiently
>>thanks to Python's semantics.
>
>What makes you say that!
>
>Copies are only made when they are needed. The lazy copy optimisation,
>in other words, still exists.
(BTW, don't get hung up on this part, ok? Skip forward to the ==[ code part ]==
below if you're going to skip ;-)

OK, so the copy happens at b[2]=4, right? That is still useless copying
if the holder of the "a" reference *wants* it to remain a reference to the
same single shared list. You can program that in C++ if you want to, and
presumably you would. Except that now you have to create a pointer variable.

OK, now how would you handle the case of multiple copies of that pointer
variable? Make it "smart", so that you get copy-on-write of what it points to?
That's the same discussion we're having about the first level of Python
references, ISTM.
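
To make the "copy happens at b[2]=4" point concrete, here is a minimal sketch
of a copy-on-write handle. The names CowList, _Shared and alias() are
hypothetical, invented for this illustration only; alias() stands in for what
plain "b = a" would do under the proposed semantics. It is not a claim about
how any real implementation would work.

```python
class _Shared:
    """Storage that may be shared by several handles (hypothetical helper)."""
    def __init__(self, data):
        self.data = data
        self.owners = 1          # number of handles currently sharing `data`


class CowList:
    """Hypothetical copy-on-write handle around a list (illustration only)."""
    def __init__(self, items=()):
        self._shared = _Shared(list(items))

    def alias(self):
        # Cheap "assignment": share the storage, copy nothing yet.
        other = CowList.__new__(CowList)
        other._shared = self._shared
        self._shared.owners += 1
        return other

    def __getitem__(self, i):
        return self._shared.data[i]

    def __setitem__(self, i, value):
        if self._shared.owners > 1:
            # First write through a shared handle: detach with a private copy.
            self._shared.owners -= 1
            self._shared = _Shared(list(self._shared.data))
        self._shared.data[i] = value

    def as_list(self):
        return list(self._shared.data)


a = CowList([1, 2, 3])
b = a.alias()          # stands in for "b = a"
b[2] = 4               # the copy is made here, whether or not it was wanted
print(a.as_list())     # [1, 2, 3]
print(b.as_list())     # [1, 2, 4]
```

Note that the holder of "a" has no way to say "keep sharing" short of an
explicit pointer-like object, which is the objection raised above.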

>
>Delete or modify one string in a list of strings, and the same stuff
>would happen as happens now. Unless, perhaps somewhere you don't know
So a 20MB log file in the form of a string list might be 500,000 lines of
40-byte log entries, and the list itself would be 2MB of 4-byte pointers.
Is that a cheap copy when you explicitly don't want one, and want to share
the same list instead?

>about, the caller of your function who passed that list in to you has
>a separate reference they expect to stay unchanged. In that case, the
>first changed string triggers a copy of the list - but as the list
>only contains references to strings, it doesn't trigger copying of all
>the strings. The second line changed doesn't require the list to be
>modified again because you already have your separate copy.
Sure, but the first unwanted copy cost you.
>
>C++ uses exactly this kind of approach for the std::string class among
>others. Copy-on-write is a pretty common transparent implementation
>detail in 'heavy' classes, including those written by your everyday
>programmers. Does that mean C++ is slower that Python? Of course not!
>
It could be, if the unwanted copying was big enough. Except you wouldn't
program it like that. You'd program it to share mutable data like Python
for a case like that.
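
For the list-of-strings case discussed above, a rough sketch of what one
triggered copy involves: the list of references is duplicated, the strings
themselves are not. The 500,000 x 40-byte figures are the ones from this post;
the line contents are made up, and the pointer size is whatever the running
interpreter uses (8 bytes on most current builds rather than the 4 assumed
above).

```python
import struct

N_LINES = 500_000

# Made-up log: 500,000 distinct 40-character lines (39 digits plus newline).
lines = ["%039d\n" % i for i in range(N_LINES)]

# What a copy-on-write scheme would do on the first modification:
# duplicate the list of references, not the strings they refer to.
copied = list(lines)

pointer_bytes = struct.calcsize("P")   # size of one object reference
print("approx. bytes of references copied:", N_LINES * pointer_bytes)

# The copy is a new list, but every element is the very same string object.
assert copied is not lines
assert all(s is t for s, t in zip(lines, copied))

# After the copy, a change to one list is invisible through the other.
copied[0] = "changed\n"
assert lines[0] != copied[0]
```

So the copy is cheap relative to the 20MB of text, but it is still work done
only to break sharing that the program may have wanted.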

In any case, the performance is a side issue, however practically important,
to the discussion of the language semantics.

===[ code part ]=================
I'm a bit disappointed in not getting a comment on class NSHorne ;-)

BTW, that was not a faked interactive log. It doesn't have lazy copy, but
that could be arranged. Though that's an implementation optimization issue, right?

Does that namespace have the semantics you are calling for or not? ;-)
If I add a mechanism to create and dereference "pointers" re that namespace, will
that do it?  Here's a go at that:

 >>> class NSHorne(object):
 ...     from cPickle import dumps, loads
 ...     class Ptr(object):
 ...         def __init__(self, ns, name): self.ns=ns; self.name=name
 ...         def __getitem__(self, i): return getattr(self.ns, self.name)
 ...         def __setitem__(self, i, v): setattr(self.ns, self.name, v)
 ...     def __setattr__(self, name, val):
 ...         if isinstance(val, self.Ptr):
 ...             object.__setattr__(self, name, val)
 ...         else:
 ...             object.__setattr__(self, name, self.loads(self.dumps(val)))
 ...     def __getitem__(self, name): return self.Ptr(self, name)
 ...
 >>> ns = NSHorne()
Set the value of x:
 >>> ns.x = [1, 2, 3]

Make a "pointer" to x:
 >>> ns.px = ns['x']

You can use ordinary Python aliases with the "pointer"
 >>> p=ns.px

You dereference it (like *p) with:
 >>> p[:]
 [1, 2, 3]

And here's a function that expects such a pointer
 >>> def foo(bar):
 ...    bar[:] = 'something from bar'  # like *bar=
 ...

You can pass it to an ordinary Python function

 >>> foo(p)

And dereference it again to see what happened
 >>> p[:]  #like *p
 'something from bar'

Or look at what it was pointing to
 >>> ns.x
 'something from bar'

or look via the original pointer in ns
 >>> ns.px[:]  # like *px
 'something from bar'

and alias it some more
 >>> p2 = p
 >>> p2[:]=(3,4,5)
 >>> ns.x
 (3, 4, 5)

Or to play with one of your original examples:

 >>> ns.a = [1, 2, 3]
 >>> ns.b = ns.a
 >>> ns.b[2] = 4
 >>> ns.a
 [1, 2, 3]
 >>> ns.b
 [1, 2, 4]

Make a "pointer" to b:
 >>> p = ns['b']
 >>> ns.b
 [1, 2, 4]
 >>> ns.b[0]=-1
 >>> ns.b
 [-1, 2, 4]

Now dereference p:
 >>> p[:]
 [-1, 2, 4]

Use dereferenced pointer to mod value
 >>> p[:][1]=-1

Check values
 >>> ns.b
 [-1, -1, 4]
 >>> ns.a
 [1, 2, 3]
 >>> p[:]
 [-1, -1, 4]
 >>>

But we can also set something other than that list:
 >>> p[:] = 'something else'
 >>> ns.b
 'something else'

IOW,
    p = &x is spelled ns.p = ns['x']
and
    *p = x is spelled ns.p[:] = ns.x
and
    y = *p is spelled ns.y = ns.p[:]
and
    you can pass ns.p to a function def foo(bar): and dereference it as bar[:] there

Does this give names in ns the "variable" semantics you wanted? Is there any difference
other than "semantic sugar?" Explaining the difference, if any, will also clarify what
you are after, to me, and maybe others ;-)
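
For anyone who wants to try this on a Python without cPickle, here is a sketch
of the same class with one substitution: copy.deepcopy replaces the
dumps/loads round trip used in the session above. That swap is an assumption
of this sketch, not what the log above ran, and it is meant to behave the same
for plain lists, tuples and strings like the ones used here.

```python
import copy


class NSHorne:
    """Namespace whose attribute assignment stores a copy of the value."""

    class Ptr:
        """A "pointer" to a name in the namespace; p[:] dereferences it."""
        def __init__(self, ns, name):
            self.ns = ns
            self.name = name

        def __getitem__(self, i):
            # The index is ignored: any subscript means "what I point to".
            return getattr(self.ns, self.name)

        def __setitem__(self, i, v):
            # Rebinding goes through NSHorne.__setattr__, so v is copied.
            setattr(self.ns, self.name, v)

    def __setattr__(self, name, val):
        if isinstance(val, self.Ptr):
            # Pointers are stored as-is so that aliases keep pointing.
            object.__setattr__(self, name, val)
        else:
            # Everything else is stored by value (deep copy).
            object.__setattr__(self, name, copy.deepcopy(val))

    def __getitem__(self, name):
        # ns['x'] plays the role of &x
        return self.Ptr(self, name)


ns = NSHorne()
ns.a = [1, 2, 3]
ns.b = ns.a            # stores a copy, so a and b are independent
ns.b[2] = 4
p = ns['b']            # p = &b
p[:][1] = -1           # (*p)[1] = -1
print(ns.a, ns.b)
```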

Regards,
Bengt Richter



