Semi-newbie, rolling my own __deepcopy__

ladasky at my-deja.com
Thu Apr 21 19:53:28 EDT 2005


I'm back...

Thanks to Michael Spencer and Steven Bethard for their excellent help.
It has taken me a few sessions of reading, and programming, and I've
had to pick up the exploded fragments of my skull from time to time.
But I now have succeeded in making deepcopy work for a simple class
that I wrote myself.  Unfortunately, I'm still getting errors when I
try to copy the object I really want.  This will be a long post
including code examples and tracebacks, so please bear with me.

First things first.  For days, I didn't realize or appreciate that
there is a difference between "class C:" and "class C(object):".
Making this one change made the difference between working code and
incomprehensible exceptions being thrown by the interpreter.  Where can
I read a description of the 'object' class?  If subclassing 'object' is
necessary to make common functions like __new__ and deepcopy work out
of the box, why doesn't Chapter 9 of the Python tutorial discuss this?

Now, here's some working code, with minimal distractions from the main
point, which is the deepcopy process:

----------------------------------------------------------------------------

from copy import deepcopy
from random import randint
requiredItems = ["a", "b", "c"]

class Test(object):

    def __init__(self, items = {}):
        self.addedItems = []
        for key in items:
            if not hasattr(self, key):
                setattr(self, key, items[key])
                if key not in requiredItems:
                    self.addedItems.append(key)
        for var in requiredItems:
            if not hasattr(self, var):
                setattr(self, var, randint(1,5))

    def contents(self):
        items = {}
        for key in (requiredItems + self.addedItems):
            items[key] = getattr(self, key)
        return items

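    def __deepcopy__(self, memo=None):
        # A mutable default ({}) would be shared between direct calls.
        if memo is None:
            memo = {}
        newTestObj = Test.__new__(Test)
        # Register the copy before recursing, so cycles resolve to it.
        memo[id(self)] = newTestObj
        newTestObj.__init__(deepcopy(self.contents(), memo))
        return newTestObj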

    def show(self):
        report = ""
        for x in (requiredItems + self.addedItems):
            report=report+"  "+str(x)+" = "+str(getattr(self,x))
        return report

foo = Test()
bar = Test({"c":10})
snafu = Test({"c":20, "d":30})
print "foo:", foo.show()
print "bar:", bar.show()
print "snafu:", snafu.show()
clone = deepcopy(foo)
print "clone, should be deepcopy of foo:", clone.show()
clone.b = 40
print "clone, should be changed:", clone.show()
print "foo, should NOT change:", foo.show()

----------------------------------------------------------------------------

Here's a sample output from the above program:

> foo:    a = 2   b = 5   c = 2
> bar:    a = 3   b = 1   c = 10
> snafu:    a = 2   b = 2   c = 20   d = 30
> clone, should be deepcopy of foo:    a = 2   b = 5   c = 2
> clone, should be changed:    a = 2   b = 40   c = 2
> foo, should NOT change:    a = 2   b = 5   c = 2
> Exit code: 0

You can see that I'm playing with the extension of the objects by
passing dictionaries containing novel attributes.  You can ignore that,
though doing so was in fact my stepping stone to the useful 'contents'
method.  I've come to appreciate that there are several ways to pass
attributes between objects.  I like this dictionary approach, because
it explicitly passes the names of the attributes along with their
values.

It took me a while to figure out this expression...

newTestObj = Test.__new__(Test)

That's interesting, and also a bit scary.  It suggests that you can ask
the Test object/class to create a new object of some class *other than
Test*!  Is this because '__new__' is actually a method of the 'object'
class, and not of 'Test'?  Or are there some strange games to be played
here?  (What the heck are metaclasses?  Are they relevant here?  Do I
*really* want to know?)
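
As far as I can tell, the answer to my own question is "yes": unless a
class defines its own __new__, the lookup falls through to the one on
'object', and that one takes the class to instantiate as an explicit
argument.  No metaclass is involved.  Here is a small sketch of that
idea; 'Plain' and 'Other' are hypothetical classes made up for the
illustration, and I have used assert instead of print so that nothing
depends on the interpreter version:

```python
class Plain(object):
    def __init__(self):
        self.initialised = True


class Other(object):
    pass


# Plain defines no __new__ of its own, so the attribute lookup
# finds the one that lives on 'object'.
assert Plain.__new__ is object.__new__

# __new__ only allocates the instance; __init__ has not run yet.
p = Plain.__new__(Plain)
assert type(p) is Plain
assert not hasattr(p, "initialised")

# The class is an ordinary argument, so a different one can be passed.
o = Plain.__new__(Other)
assert type(o) is Other
```

So naming the class twice looks redundant but is not: the first 'Test'
says where to find __new__, the second says what to build.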

Anyway, this test program works fine when the attributes that I want to
copy are numeric objects, lists or strings.  But the real object that I
want to copy also contains *arrays* as attributes.  When I change the
program thus...

Near the top with the other import statements, add:
from array import array
from random import random

Change the last line of __init__ to read:
setattr(self, var, array("d", [random() for x in range(3)]))

... I am provoking an exception from deep within Python's guts.  New
instances are created fine, but they aren't getting deepcopied:

> foo:    a = array('d', [0.52605955021044, 0.48584632687459, etc.
> bar:    a = array('d', [0.91072903604066, 0.63424430516644, etc.
> snafu:  a = array('d', [0.63255804677449, 0.67492348886257, etc.
>
> Traceback (most recent call last):
>   File "deepcopy experiment.py", line 66, in ?
>     clone = deepcopy(foo)
>   File "C:\Program Files\Python2_3\lib\copy.py", line 190,
>   in deepcopy
>     y = copier(memo)
>   File "deepcopy experiment.py", line 43, in __deepcopy__
>     newTestObj.__init__(deepcopy(self.contents(), memo))
>   File "C:\Program Files\Python2_3\lib\copy.py", line 179,
>   in deepcopy
>     y = copier(x, memo)
>   File "C:\Program Files\Python2_3\lib\copy.py", line 270,
>   in _deepcopy_dict
>     y[deepcopy(key, memo)] = deepcopy(value, memo)
>   File "C:\Program Files\Python2_3\lib\copy.py", line 206,
>   in deepcopy
>     y = _reconstruct(x, rv, 1, memo)
>   File "C:\Program Files\Python2_3\lib\copy.py", line 338,
>   in _reconstruct
>     y = callable(*args)
>   File "C:\Program Files\Python2_3\lib\copy_reg.py", line 92,
>   in __newobj__
>     return cls.__new__(cls, *args)
> TypeError: array() takes at least 1 argument (0 given)
> Exit code: 1

I have also tried this with an object with mixed attributes, which are
bundled into a dictionary with numerics first, and then an array (code
not shown).  In this case, deepcopy works fine on the numerics, and
does not throw the TypeError until the array is reached.

Does the array class not implement deepcopy correctly?  Is this a
language bug, or do I need to do something extra with arrays?  Maybe I
should just tolerate the overhead, junk the arrays, and revert to using
lists?
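
One workaround I can imagine, short of junking the arrays, is to copy
them by hand and let deepcopy handle everything else.  The traceback
suggests that the generic path calls cls.__new__(cls) with no
arguments, which array refuses.  The sketch below assumes that building
a fresh array from the old one's typecode is an acceptable copy;
'copy_value' and 'Holder' are hypothetical names invented for this
example, not part of my real program.  Other Python releases may
handle arrays in deepcopy without any of this.

```python
from array import array
from copy import deepcopy


def copy_value(value, memo):
    """Hypothetical helper: copy arrays by hand, defer to deepcopy otherwise."""
    if isinstance(value, array):
        # array(typecode, initializer) builds a new, independent array.
        duplicate = array(value.typecode, value)
        memo[id(value)] = duplicate
        return duplicate
    return deepcopy(value, memo)


class Holder(object):
    """Stripped-down stand-in for the Test class above."""

    def __init__(self, items):
        self.names = list(items)
        for key in items:
            setattr(self, key, items[key])

    def contents(self):
        items = {}
        for key in self.names:
            items[key] = getattr(self, key)
        return items

    def __deepcopy__(self, memo):
        new = Holder.__new__(Holder)
        memo[id(self)] = new
        copied = {}
        for key, value in self.contents().items():
            copied[key] = copy_value(value, memo)
        new.__init__(copied)
        return new


original = Holder({"a": array("d", [1.0, 2.0]), "n": 3})
clone = deepcopy(original)
clone.a[0] = 9.0          # change the copy only
assert original.a[0] == 1.0
```

The cost is one isinstance check per attribute, which seems cheaper
than giving up arrays altogether.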

I'm SO close to having this work -- awaiting your sage advice once
again...

--
Rainforest laid low.
"Wake up and smell the ozone,"
Says man with chainsaw.
John J. Ladasky Jr., Ph.D.



