Question about references/copies

Alex Martelli aleaxit at yahoo.com
Sat Aug 28 05:32:39 EDT 2004


Henning Kage <c0dec at gmx.de> wrote:

> I'm using Python only for some months now and I'm wondering, whether such
> assignments as above

"As above" _where_?

> are creating bitwise copies of an object or just
> receive a reference. That means I wanted to know, whether Python in general
> distinguishes between references and copies:
> 
> class someclass:
>   def __init__( self, otherobject):
>     self.someattribute = otherobject

Python makes no copy unless you explicitly ask for one -- assignment and
argument passing always give you a reference.  There are several ways to
ask for a copy; the most general and powerful are in the standard library
module copy: copy.copy makes a shallow copy and copy.deepcopy a deep one.
Every other way of asking for a copy gives you a shallow copy.
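A minimal sketch of the distinction, written for Python 3 and reusing the shape of the class from the question (the names SomeClass, data and holder are made up for illustration):

```python
import copy


class SomeClass:
    def __init__(self, other_object):
        # Plain assignment stores a reference; nothing is copied here.
        self.some_attribute = other_object


data = [[1, 2], [3, 4]]
holder = SomeClass(data)

shallow = copy.copy(data)      # new outer list, same inner lists
deep = copy.deepcopy(data)     # new outer list and new inner lists

data[0].append(99)             # mutate an inner list through the original

print(holder.some_attribute is data)  # the attribute is the very same object
print(shallow[0])                     # shares the mutated inner list
print(deep[0])                        # unaffected by the mutation
```

The mutation is visible through holder.some_attribute and through the shallow copy, but not through the deep copy.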

The only area where one might have some doubt is _slicing_.  When you
code x=y[z:t] you're getting a (shallow) copy, if y belongs to any of
the builtin types that implement slicing, BUT there are important
third-party extensions whose types behave differently -- specifically,
Numeric; if the type of y is Numeric.array, then x shares (some of) the
data of y.

So we can say that Python _allows_ third-party and user-coded types with
slicing that works by data sharing, even though the preferred semantics
of slicing is by (shallow) copy.
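To illustrate both behaviours, here is a sketch in Python 3.  The first part shows a built-in list slice; the second is a purely hypothetical user-coded class, SharedSlice, whose slices write through to the parent list.  It is not how Numeric is implemented, only one way a type could choose data sharing (it handles step 1 and non-negative indices only):

```python
# Built-in list: slicing gives a new list (a shallow copy).
y = [[0], [1], [2], [3]]
x = y[1:3]
x.append("extra")      # changes x only; y keeps its four items
x[0].append("seen")    # the items themselves are shared with y


class SharedSlice:
    """Hypothetical sequence whose slices share data with the parent list."""

    def __init__(self, data, start=0, stop=None):
        self._data = data
        self._start = start
        self._stop = len(data) if stop is None else stop

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch supports only step 1")
            stop = max(stop, start)  # an empty slice must not get negative length
            # No copying: the new object refers to the same underlying list.
            return SharedSlice(self._data, self._start + start, self._start + stop)
        if not 0 <= index < len(self):
            raise IndexError(index)
        return self._data[self._start + index]

    def __setitem__(self, index, value):
        if not 0 <= index < len(self):
            raise IndexError(index)
        self._data[self._start + index] = value


base = [10, 20, 30, 40]
view = SharedSlice(base)[1:3]
view[0] = 99           # writes through to base[1]
print(base)
```

After the last assignment, base itself has changed, which is exactly what would not happen with a slice of a plain list.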


> And my second question is, whether I should use a cast in such cases or
> not (I know, a cast isn't mandatory here...)

Python does not really have the concept of a cast.  Something that looks
like a cast to you is probably one of the ways to ask for a copy: for
example, if x is a list, list(x) is a (shallow) copy of x -- if y is a
dict, dict(y) is a (shallow) copy of y, and so on.
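A short Python 3 sketch of what these "cast-like" calls do; the variable names are only for illustration:

```python
x = [1, [2, 3]]
x_copy = list(x)       # a new list object holding the same items

y = {"a": [1]}
y_copy = dict(y)       # a new dict object holding the same values

# Calling a type builds a new object from its argument; nothing is
# reinterpreted in place the way a C-style cast would do.
n = int("42")

print(x_copy is x, x_copy[1] is x[1])
print(y_copy is y, y_copy["a"] is y["a"])
```

In both cases the container is new, but the nested list inside it is the same object as in the original, which is what "shallow" means here.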

If you take an argument X that may be of any iterable type, and you need
to perform local operations on its items without risking changes to the
caller's original object, it is quite common to start with something
like:
    X = list(X)
If X was already a list, this makes a copy (so the original does not get
altered); if X was, say, a tuple, an open file, a dict, a string, ..., in
other words any bounded (finite) iterable, this ensures in any case that
X is now a list with that iterable's items.  Be careful, though: some
iterables change state when iterated upon.  If X was a file, for example,
that file object is now at end-of-file, and the caller may need, if
feasible, to call its seek method to get the file back to its previous
state -- so the "without risking changes" claim above does need
qualification.

So anyway you might now call such mutator methods on X as sort, reverse,
extend, pop -- all the nice methods that list offers and other iterable
types don't -- and in the end presumably return or store some result
based on the suitably-mutated X.
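As a sketch of this idiom in Python 3, here is a hypothetical helper, largest_two, that accepts any finite iterable, works on a private list, and leaves a caller's list alone.  The last lines show the caveat about iterables that change state when iterated:

```python
def largest_two(X):
    """Return the two largest items of any finite iterable (hypothetical helper)."""
    X = list(X)      # rebinds the local name to a fresh list
    X.sort()         # list-only mutators are now safe to call
    X.reverse()
    return X[:2]


original = [3, 1, 2]
top = largest_two(original)            # original is not reordered
from_tuple = largest_two((5, 9, 7))
from_string = largest_two("python")

# An iterator is consumed by list(X): the caller's object has changed state.
it = iter([4, 8, 6])
from_iterator = largest_two(it)
leftover = list(it)

print(top, original)
print(from_tuple, from_string)
print(from_iterator, leftover)
```

The caller's list keeps its order, while the iterator passed in the last call has nothing left to yield afterwards.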

Another similar operation that you may do reasonably frequently as your
familiarity with Python grows is to ensure you have an _iterator_ for a
generically _iterable_ argument, and sometimes the best way is something
like:
    X = iter(X)
You don't get many methods on X this way (basically, only X.next...), but
you do ensure that X "keeps state" that tracks its current iteration.
Note a big difference here: if X is already an iterator, iter(X) will
NOT make a copy -- it will return a reference to X unchanged.  An example
generator using this technique:

def interleave(X, Y):
    X = iter(X)
    Y = iter(Y)
    while True:
        yield X.next()
        yield Y.next()

Not necessarily the absolute best approach, but pretty clear -- and the
calls to iter are indispensable here, in case X and/or Y as originally
passed were iterABLES that are not iterATORS, such as lists.
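For readers on Python 3, a sketch of the same generator under that assumption: the method call X.next() is spelled next(X) there, and since PEP 479 a StopIteration must not escape from a generator body, so it is caught and turned into a plain return.  The first lines also check the point about iter made above:

```python
def interleave(X, Y):
    X = iter(X)
    Y = iter(Y)
    while True:
        try:
            yield next(X)
            yield next(Y)
        except StopIteration:
            # Either input ran out: end the generator cleanly.
            return


numbers = [1, 2, 3]
it = iter(numbers)
same = iter(it) is it                       # iterator: iter returns it unchanged
fresh = iter(numbers) is not iter(numbers)  # list: each call gives a new iterator

result = list(interleave([1, 2, 3], "ab"))
print(same, fresh, result)
```

Without the calls to iter, passing a list would fail, because next() applied directly to a list raises TypeError.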


Alex


