[Tutor] BadPickleGet error

Peter Otten __peter__ at web.de
Sat Jun 25 19:53:38 CEST 2011


Rick Pasotto wrote:

> Traceback (most recent call last):
>   File "/usr/share/rss2email/rss2email.py", line 748, in <module>
>     else: run()
>   File "/usr/share/rss2email/rss2email.py", line 488, in run
>     feeds, feedfileObject = load()
>   File "/usr/share/rss2email/rss2email.py", line 439, in load
>     feeds = pickle.load(feedfileObject)
> cPickle.BadPickleGet: 2574'8
> 
> Could someone please explain what the number(s) after 'BadPickleGet'
> mean?

It's a key into the unpickler's memo, a dictionary of objects that have 
already been loaded. When the same object occurs more than once in the data 
being pickled, it is written out only once (followed by a PUT that stores it 
under a key); every later occurrence is written as a reference (GET) to that 
key:

>>> import cPickle
>>> import pickletools
>>> data = ("alpha", "alpha")
>>> cPickle.dumps(data)
"(S'alpha'\np1\ng1\ntp2\n."
>>> dump = _
>>> pickletools.dis(dump)
    0: (    MARK
    1: S        STRING     'alpha'
   10: p        PUT        1
   13: g        GET        1
   16: t        TUPLE      (MARK at 0)
   17: p    PUT        2
   20: .    STOP
highest protocol among opcodes = 0
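
The PUT/GET pairing can also be listed without reading the whole 
disassembly. What follows is only a minimal sketch and assumes Python 3, 
where cPickle no longer exists and pickle.dumps() returns bytes; the names 
`shared` and `memo_ops` are made up for illustration, and the memo keys you 
see may differ from the ones in the Python 2 session above.

```python
import pickle
import pickletools

# One list object referenced twice, so the second reference becomes a GET.
shared = ["alpha"]
dump = pickle.dumps((shared, shared), protocol=0)

# Collect (opcode name, memo key, byte offset) for the memo-related opcodes.
memo_ops = [
    (opcode.name, arg, pos)
    for opcode, arg, pos in pickletools.genops(dump)
    if opcode.name in ("PUT", "GET")
]

for name, key, pos in memo_ops:
    print("%5d: %s %r" % (pos, name, key))
```

The single GET should carry a key that an earlier PUT stored, which is why 
loading the pickle gives back one list referenced twice rather than two 
copies.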

Now let's break it:

>>> broken = dump.replace("g1", "g2")
>>> cPickle.loads(broken)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
cPickle.BadPickleGet: 2

> I assume the data file is corrupt but the question is can I fix it with
> an editor using that info?
 
Probably not. You could run your file through pickletools.dis() to find the 
location of the bad key:

>>> pickletools.dis(broken)
    0: (    MARK
    1: S        STRING     'alpha'
   10: p        PUT        1
   13: g        GET        2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/pickletools.py", line 2006, in dis
    raise ValueError(errormsg)
ValueError: memo key 2 has never been stored into

but even if you replace it with a valid key, the reference will most likely 
point to the wrong object, so your data will still be messed up. 
Also, there are probably other errors in the file.
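
If you only want the offsets of the bad references rather than a full 
listing, something along these lines might do. It is a rough sketch, 
assuming Python 3; find_bad_gets() is a hypothetical helper, not part of 
pickletools, and the two pickles are written by hand in protocol 0 to mirror 
the example above. Note that pickletools.genops() itself raises ValueError 
if a memo key is not a valid number, so it won't get past damage of that 
kind.

```python
import pickletools

PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}
GET_OPS = {"GET", "BINGET", "LONG_BINGET"}

def find_bad_gets(data):
    """Return (byte offset, memo key) for each GET whose key was never stored."""
    stored = set()
    bad = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in PUT_OPS:
            stored.add(arg)
        elif opcode.name == "MEMOIZE":
            # Protocol 4+ stores at the next free index, with no explicit key.
            stored.add(len(stored))
        elif opcode.name in GET_OPS and arg not in stored:
            bad.append((pos, arg))
    return bad

# Hand-written protocol 0 pickles of ("alpha", "alpha").
good = b"(Valpha\np0\ng0\ntp1\n."
broken = b"(Valpha\np0\ng99\ntp1\n."  # GET refers to a key that was never PUT

print(find_bad_gets(good))
print(find_bad_gets(broken))
```

genops() also accepts a file object opened in binary mode, so the same 
function could be pointed at the feed file directly. Unlike dis(), it does 
not stop at the first bad key, but it tells you nothing about what the 
reference should have been.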

Another idea, patching the Unpickler implemented in Python to replace bad 
references with a dummy object:

>>> import pickle
>>> class U(pickle.Unpickler):
...     def my_load_get(self):
...             try:
...                     # load_get() pushes the memoized object itself
...                     self.load_get()
...             except KeyError:
...                     # unknown memo key: push a placeholder instead
...                     self.append("<object could not be retrieved>")
...     dispatch = pickle.Unpickler.dispatch.copy()
...     dispatch[pickle.GET] = my_load_get
...
>>> from cStringIO import StringIO
>>> U(StringIO(broken)).load()
('alpha', '<object could not be retrieved>')

suffers from the same problems.
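
For what it's worth, roughly the same patch under Python 3 might look like 
the sketch below. It leans on pickle._Unpickler, the pure-Python unpickler, 
which is an undocumented implementation detail and may change; 
TolerantUnpickler and PLACEHOLDER are made-up names, and the broken pickle 
is written by hand. Only the protocol 0 GET opcode is patched here, BINGET 
and LONG_BINGET are left alone.

```python
import io
import pickle

PLACEHOLDER = "<object could not be retrieved>"

class TolerantUnpickler(pickle._Unpickler):
    """Pure-Python unpickler that substitutes a placeholder for bad GETs."""

    def load_get_or_placeholder(self):
        try:
            # On success load_get() pushes the memoized object itself.
            self.load_get()
        except (KeyError, ValueError, pickle.UnpicklingError):
            # Unknown or unparsable memo key: push a placeholder instead.
            # (Which exception is raised depends on the Python 3 version.)
            self.append(PLACEHOLDER)

    # Copy the dispatch table so the stock unpickler stays untouched.
    dispatch = pickle._Unpickler.dispatch.copy()
    dispatch[pickle.GET[0]] = load_get_or_placeholder

# Hand-written protocol 0 pickles of ("alpha", "alpha").
good = b"(Valpha\np0\ng0\ntp1\n."
broken = b"(Valpha\np0\ng99\ntp1\n."  # GET refers to a key that was never PUT

print(TolerantUnpickler(io.BytesIO(good)).load())
print(TolerantUnpickler(io.BytesIO(broken)).load())
```

As with the Python 2 version, this only papers over the missing reference; 
whatever object was supposed to be there is still lost.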


