in a pickle

duncan smith duncan at invalid.invalid
Wed Mar 6 20:18:59 EST 2019


On 06/03/2019 20:24, Peter Otten wrote:
> duncan smith wrote:
> 
>> On 06/03/2019 16:14, duncan smith wrote:
>>> Hello,
>>>       I've been trying to figure out why one of my classes can be
>>> pickled but not unpickled. (I realise the problem is probably with the
>>> pickling, but I get the error when I attempt to unpickle.)
>>>
>>> A relatively minimal example is pasted below.
>>>
>>>
>>>>>> import pickle
>>>>>> class test(dict):
>>>     def __init__(self, keys, shape=None):
>>>         self.shape = shape
>>>         for key in keys:
>>>             self[key] = None
>>>
>>>     def __setitem__(self, key, val):
>>>         print (self.shape)
>>>         dict.__setitem__(self, key, val)
>>>
>>>
>>>>>> x = test([1,2,3])
>>> None
>>> None
>>> None
>>>>>> s = pickle.dumps(x)
>>>>>> y = pickle.loads(s)
>>> Traceback (most recent call last):
>>>   File "<pyshell#114>", line 1, in <module>
>>>     y = pickle.loads(s)
>>>   File "<pyshell#111>", line 8, in __setitem__
>>>     print (self.shape)
>>> AttributeError: 'test' object has no attribute 'shape'
>>>
>>>
>>> I have DuckDuckGo'ed the issue and have tinkered with __getnewargs__ and
>>> __getnewargs_ex__ without being able to figure out exactly what's going
>>> on. If I comment out the print call, then it seems to be fine. I'd
>>> appreciate any pointers to the underlying problem. I have one or two
>>> other things I can do to try to isolate the issue further, but I think
>>> the example is perhaps small enough that someone in the know could spot
>>> the problem at a glance. Cheers.
>>>
>>> Duncan
>>>
>>
>> OK, this seems to be a "won't fix" bug dating back to 2003
>> (https://bugs.python.org/issue826897). The workaround,
>>
>>
>> class DictPlus(dict):
>>   def __init__(self, *args, **kwargs):
>>     self.extra_thing = ExtraThingClass()
>>     dict.__init__(self, *args, **kwargs)
>>   def __setitem__(self, k, v):
>>     try:
>>       do_something_with(self.extra_thing, k, v)
>>     except AttributeError:
>>       self.extra_thing = ExtraThingClass()
>>       do_something_with(self.extra_thing, k, v)
>>     dict.__setitem__(self, k, v)
>>   def __setstate__(self, adict):
>>     pass
>>
>>
>> doesn't work around the problem for me because I need the actual value
>> of self.shape from the original instance. But I only need it for sanity
>> checking, and under the assumption that the original instance was valid,
>> I don't need to do this when unpickling. I haven't managed to find a
>> workaround that exploits that (yet?). Cheers.
> 
> I've been playing around with __getnewargs__(), and it looks like you can 
> get it to work with a custom __new__(). Just set the shape attribute there 
> rather than in __init__():
> 
> $ cat pickle_dict_subclass.py 
> import pickle
> 
> 
> class A(dict):
>     def __new__(cls, keys=(), shape=None):
>         obj = dict.__new__(cls)
>         obj.shape = shape
>         return obj
> 
>     def __init__(self, keys=(), shape=None):
>         print("INIT")
>         for key in keys:
>             self[key] = None
>         print("EXIT")
> 
>     def __setitem__(self, key, val):
>         print(self.shape, ": ", key, " <-- ", val, sep="")
>         super().__setitem__(key, val)
> 
>     def __getnewargs__(self):
>         print("GETNEWARGS")
>         return ("xyz", self.shape)
> 
> x = A([1, 2, 3], shape="SHAPE")
> x["foo"] = "bar"
> print("pickling:")
> s = pickle.dumps(x)
> print("unpickling:")
> y = pickle.loads(s)
> print(y)
> $ python3 pickle_dict_subclass.py 
> INIT
> SHAPE: 1 <-- None
> SHAPE: 2 <-- None
> SHAPE: 3 <-- None
> EXIT
> SHAPE: foo <-- bar
> pickling:
> GETNEWARGS
> unpickling:
> SHAPE: 1 <-- None
> SHAPE: 2 <-- None
> SHAPE: 3 <-- None
> SHAPE: foo <-- bar
> {1: None, 2: None, 3: None, 'foo': 'bar'}
> 
> It's not clear to me how the dict items survive when they are not included 
> in the __getnewargs__() result, but apparently they do.
> 
> 
> 
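
On how the dict items survive: as far as I can tell, they never go
through __getnewargs__() at all. For protocol 2 and later,
object.__reduce_ex__() returns a five-item tuple, and for a dict
subclass the last item is an iterator over the items. The unpickler
feeds those back through __setitem__() before the saved attributes are
restored. A minimal sketch for inspecting that tuple, using a trimmed
copy of the class above (the empty tuple in __getnewargs__() is my
substitution for "xyz", and the variable names are only illustrative):

```python
class A(dict):
    def __new__(cls, keys=(), shape=None):
        obj = dict.__new__(cls)
        obj.shape = shape
        return obj

    def __init__(self, keys=(), shape=None):
        for key in keys:
            self[key] = None

    def __getnewargs__(self):
        # Only the arguments for __new__() go here, not the items.
        return ((), self.shape)


x = A([1, 2, 3], shape="SHAPE")

# Protocol 2 is the lowest protocol that uses __getnewargs__().
func, args, state, listitems, dictitems = x.__reduce_ex__(2)
items = list(dictitems)

print(func.__name__)  # name of the reconstruction helper
print(args)           # the class followed by the __getnewargs__() result
print(state)          # the instance __dict__, applied after the items
print(listitems)      # None, since this is not a list subclass
print(items)          # the dict items, carried separately from args
```

So the items are a separate part of the pickle, independent of whatever
__getnewargs__() returns.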

Thanks Peter. The docs for pickle say "When a class instance is
unpickled, its __init__() method is usually not invoked. The default
behaviour first creates an uninitialized instance and then restores the
saved attributes." Your outputs above seem to confirm that it isn't
calling my __init__(), but it *is* calling my __setitem__() when it is
restoring the dict items. I actually have about 50-60 lines of code in
my real __setitem__(), but I probably don't need it if I'm unpickling.
My __setitem__() actually updates a secondary data structure which was
why I thought I needed to do more than just restore the items, but if
that's going to be restored in the same way I just need the dict items
to be restored. The following seems to achieve that.


    def __setitem__(self, key, val):
        try:
            self.shape
        except AttributeError:
            # shape is not set yet, so we are being unpickled
            dict.__setitem__(self, key, val)
        else:
            ...  # do other stuff


That's a bit close to the original workaround, which I didn't think would
work (although that used __setstate__() for some reason). I'm not sure I
prefer it to defining __new__() though. Ideally I'd have something like,


    def __setitem__(self, key, val):
        if unpickling:
            dict.__setitem__(self, key, val)
        else:
            # do other stuff


rather than potentially masking a bug. Anyway, thanks for your solution
and the outputs that pointed me to another possible solution. Cheers.
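
One way to get the effect of an "unpickling" test without a flag might
be to take over reconstruction with __reduce__(). This is only a sketch:
the class name Checked, the helper _rebuild() and the calls counter are
hypothetical, and the body of __setitem__() is a stand-in for the real
checks. It relies on dict.update() not calling an overridden
__setitem__().

```python
import pickle


def _rebuild(cls, items, state):
    """Hypothetical helper: recreate an instance without __setitem__()."""
    obj = dict.__new__(cls)
    obj.__dict__.update(state)  # restore attributes such as shape first
    dict.update(obj, items)     # does not call the overridden __setitem__()
    return obj


class Checked(dict):
    def __init__(self, keys=(), shape=None):
        self.shape = shape
        self.calls = 0
        for key in keys:
            self[key] = None

    def __setitem__(self, key, val):
        # Stand-in for the real sanity checks, which need self.shape.
        _ = self.shape
        self.calls += 1
        dict.__setitem__(self, key, val)

    def __reduce__(self):
        # pickle stores (callable, args) and calls _rebuild(*args) on load.
        return (_rebuild, (type(self), dict(self), self.__dict__.copy()))


x = Checked([1, 2, 3], shape=(3,))
y = pickle.loads(pickle.dumps(x))

print(type(y).__name__, y, y.shape, y.calls)
```

If __setitem__() had run while loading, calls would have gone up (or
the shape lookup would have failed). The cost is that _rebuild() has to
stay importable under the same name for old pickles to load.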

Duncan


