marshal vs pickle
Aaron Watters
aaron.watters at gmail.com
Wed Oct 31 09:45:00 EDT 2007
On Oct 31, 3:31 am, "Evan Klitzke" <e... at yelp.com> wrote:
> Can anyone elaborate more on the difference between marshal and
> pickle? In what conditions would using marshal be unsafe? If one can
> guarantee that the marshalled objects would be created and read by the
> same version of Python, is that enough?
Yes, I think that's enough. I like to use
marshal a lot because it's the absolute fastest
way to store and load data to/from Python. Furthermore,
because marshal is "stupid", the programmer has complete
control. A lot of the overhead that makes pickles
generally much slower than marshal comes from the
cleverness by which pickle recognizes shared objects
and all that junk. When I serialize, I generally
don't need that because I know what I'm doing.
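To make the "cleverness" concrete, here is a minimal sketch (Python 3, standard library only; the variable names are just for illustration). It shows pickle restoring a shared object as one object, and marshal accepting only core built-in types. Whether marshal also preserves sharing depends on the Python and marshal format version, so the sketch does not rely on that.

```python
import marshal
import pickle

shared = [1, 2, 3]
data = [shared, shared]  # the same list object appears twice

# pickle records that both slots refer to one object and restores that link.
via_pickle = pickle.loads(pickle.dumps(data))
pickle_keeps_sharing = via_pickle[0] is via_pickle[1]

# marshal round-trips the values of core built-in types...
via_marshal = marshal.loads(marshal.dumps(data))
marshal_values_match = via_marshal == data

# ...but refuses anything outside them, such as an arbitrary instance.
try:
    marshal.dumps(object())
    marshal_rejects_instance = False
except ValueError:
    marshal_rejects_instance = True

print(pickle_keeps_sharing, marshal_values_match, marshal_rejects_instance)
```

If your data is only numbers, strings, tuples, lists and dicts, and you never depend on object identity, the restriction costs you nothing.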
For example, both gadfly SQL
http://gadfly.sourceforge.net
and nucular full text/fielded search
http://nucular.sourceforge.net
use marshal as the underlying serializer. Using cPickle
would probably make serialization more than 2x slower.
This is one of the 2 or 3 key tricks that make these
packages as fast as they are.
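If you want to check the speed difference on your own data rather than take my word for it, something like the following rough sketch should do. It is written for Python 3, where the C-accelerated pickle replaces cPickle; the sample records and the repeat count are made up for illustration, and the ratio you see will depend on the data and the Python version.

```python
import marshal
import pickle
import timeit

# Hypothetical sample data: plain built-in types only, which marshal supports.
records = [(i, "name%d" % i, float(i)) for i in range(10000)]

def roundtrip_marshal():
    # Serialize and immediately deserialize with marshal.
    return marshal.loads(marshal.dumps(records))

def roundtrip_pickle():
    # Same round trip with pickle, using its most compact protocol.
    return pickle.loads(pickle.dumps(records, pickle.HIGHEST_PROTOCOL))

# Time 20 round trips of each; the count is arbitrary.
marshal_secs = timeit.timeit(roundtrip_marshal, number=20)
pickle_secs = timeit.timeit(roundtrip_pickle, number=20)

print("marshal: %.4f s" % marshal_secs)
print("pickle:  %.4f s" % pickle_secs)
```

Run it a few times before drawing conclusions, since small timings are noisy.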
-- Aaron Watters
===
http://www.xfeedme.com/nucular/gut.py/go?FREETEXT=halloween