TypeError: can't pickle HASH objects?

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Thu Oct 2 05:07:09 EDT 2008


On Wed, 01 Oct 2008 16:50:05 -0300, est <electronixtar at gmail.com> wrote:

>>>> import md5
>>>> a=md5.md5()
>>>> import pickle
>>>> pickle.dumps(a)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Python25\lib\pickle.py", line 1366, in dumps
>     Pickler(file, protocol).dump(obj)
>   File "C:\Python25\lib\pickle.py", line 224, in dump
>     self.save(obj)
>   File "C:\Python25\lib\pickle.py", line 306, in save
>     rv = reduce(self.proto)
>   File "C:\Python25\lib\copy_reg.py", line 69, in _reduce_ex
>     raise TypeError, "can't pickle %s objects" % base.__name__
> TypeError: can't pickle HASH objects
>
> Why can't I pickle a md5 object? Is it because md5 algorithm needs to
> read 512-bits at a time?
>
> I need to md5() some stream, pause(python.exe quits), and resume
> later.  It seems that the md5 and hashlib in  std module could not be
> serialized?

Yes: they're implemented in C and have no provision for serialization.
If you can use the old _md5 module, its state is far simpler to serialize:
an md5object just contains a small struct with 6 integers and 64 chars
(88 bytes when the integers are 4 bytes wide), and no pointers.
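
To make the arithmetic concrete, here is a rough sketch of that struct as a
ctypes.Structure. The field names and their order (count, abcd, buf) are my
assumption about the C layout, not something the code in this post relies on,
and describe_state is a made-up helper. It is written for a current Python and
only decodes a byte string; it never touches a live md5 object.

```python
import ctypes
import struct

class MD5State(ctypes.Structure):
    # Hypothetical mirror of the C struct; names and order are assumed.
    _fields_ = [
        ("count", ctypes.c_uint32 * 2),  # message length in bits, low word first
        ("abcd", ctypes.c_uint32 * 4),   # the four 32-bit chaining words
        ("buf", ctypes.c_ubyte * 64),    # partially filled 64-byte input block
    ]

# 6 four-byte integers + 64 chars
STATE_SIZE = ctypes.sizeof(MD5State)

def describe_state(state):
    """Decode a raw state string into a dict (illustrative only)."""
    if len(state) != STATE_SIZE:
        raise ValueError("state must be %d bytes" % STATE_SIZE)
    s = MD5State.from_buffer_copy(state)
    bits = s.count[0] | (s.count[1] << 32)
    pending = (bits >> 3) & 63           # bytes waiting in the block buffer
    return {
        "bits_hashed": bits,
        "abcd": [hex(w) for w in s.abcd],
        "pending": bytes(bytearray(s.buf[:pending])),
    }

# A synthetic state, built by hand in native byte order: 10 bytes fed so far.
sample = struct.pack("=2I4I64s", 80, 0,
                     0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476,
                     b"this is a ")
info = describe_state(sample)
```

The point is only that the whole state is a flat, fixed-size block of bytes,
which is why copying it out and back in is enough.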

With some help from ctypes (and a lot of black magic!) one can extract the  
desired state, and restore it afterwards:

--- begin code ---
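import _md5
import ctypes

HEADER_SIZE = 8    # object header on a 32-bit build: refcount + type pointer
STATE_SIZE = 88    # 6 four-byte integers + 64-byte block buffer

# Refuse to run if the object layout is not the one assumed here.
assert _md5.MD5Type.__basicsize__ == HEADER_SIZE + STATE_SIZE

def get_md5_state(m):
    """Return a copy of the raw hash state of m as an 88-byte str."""
    if type(m) is not _md5.MD5Type:
        raise TypeError('not a _md5.MD5Type instance')
    # In CPython, id(m) is the address of the object; skip its header.
    return ctypes.string_at(id(m) + HEADER_SIZE, STATE_SIZE)

def set_md5_state(m, state):
    """Overwrite the raw hash state of m with a previously saved str."""
    if type(m) is not _md5.MD5Type:
        raise TypeError('not a _md5.MD5Type instance')
    if not isinstance(state, str):
        raise TypeError('state must be str')
    if len(state) != STATE_SIZE:
        raise ValueError('len(state) must be %d' % STATE_SIZE)
    # Copy the string into a ctypes buffer, then overwrite the object in place.
    pstate = (ctypes.c_char * STATE_SIZE)(*list(state))
    ctypes.memmove(id(m) + HEADER_SIZE, ctypes.byref(pstate), STATE_SIZE)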

--- end code ---

py> m1 = _md5.new()
py> m1.update("this is a ")
py> s = get_md5_state(m1)
py> del m1
py>
py> m2 = _md5.new()
py> set_md5_state(m2, s)
py> m2.update("short test")
py> print m2.hexdigest()
95ad1986e9a9f19615cea00b7a44b912
py> print _md5.new("this is a short test").hexdigest()
95ad1986e9a9f19615cea00b7a44b912
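
Since the extracted state is just a byte string, it pickles without trouble,
which is what "pause and resume later" needs. A minimal sketch, written for a
current Python: save_checkpoint and load_checkpoint are hypothetical helper
names, and the 88-byte length check assumes the layout described above. The
stream offset is stored too, so the reader knows where to seek on restart.

```python
import os
import pickle
import tempfile

STATE_SIZE = 88  # assumed size of the raw md5 state

def save_checkpoint(path, state, offset):
    """Write the raw hash state and the number of bytes consumed so far."""
    if len(state) != STATE_SIZE:
        raise ValueError("state must be %d bytes" % STATE_SIZE)
    with open(path, "wb") as f:
        pickle.dump({"state": state, "offset": offset}, f)

def load_checkpoint(path):
    """Return (state, offset) as stored by save_checkpoint."""
    with open(path, "rb") as f:
        data = pickle.load(f)
    if len(data["state"]) != STATE_SIZE:
        raise ValueError("corrupt checkpoint")
    return data["state"], data["offset"]

# Round trip with a dummy state; a real one would come from the hash object.
ckpt_path = os.path.join(tempfile.mkdtemp(), "md5.ckpt")
save_checkpoint(ckpt_path, b"\x00" * STATE_SIZE, 10)
restored_state, restored_offset = load_checkpoint(ckpt_path)
```

On restart one would create a fresh hash object, restore the state into it,
seek the input to the saved offset, and go on calling update().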

The code above was tested only with Python 2.5.2 on Windows, and no further
than the session shown above. It might or might not work with other versions
or platforms. It may even create a (small) black hole and eat your whole
town. Use at your own risk.

-- 
Gabriel Genellina



