marshal vs pickle

Wed Oct 31 09:45:00 EDT 2007

On Oct 31, 3:31 am, "Evan Klitzke" <e... at yelp.com> wrote:
> Can anyone elaborate more on the difference between marshal and
> pickle? In what conditions would using marshal be unsafe? If one can
> guarantee that the marshalled objects would be created and read by the
> same version of Python, is that enough?

Yes, I think that's enough.  I use marshal a lot because
it's the absolute fastest way to store and load data
to/from Python.  Furthermore, because marshal is "stupid",
the programmer has complete control.  A lot of the overhead
that makes pickle generally much slower than marshal comes
from the cleverness by which pickle recognizes shared
objects and all that junk.  When I serialize, I generally
don't need that because I know what I'm doing.
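
To make the "stupid" versus "clever" distinction concrete, here is a
minimal sketch, assuming a current Python 3 interpreter.  The sample
record and the variable names are made up for illustration.  It shows
marshal round-tripping plain built-in data, marshal refusing an object
it does not know how to write, and pickle handling that object while
also preserving a shared reference.

```python
import datetime
import marshal
import pickle

# Hypothetical record made only of built-in types, which marshal supports.
record = {"id": 7, "tags": ["a", "b"], "score": 1.5, "blob": b"\x00\x01"}
restored = marshal.loads(marshal.dumps(record))
marshal_roundtrip_ok = restored == record

# marshal only knows a fixed set of core types; anything else is rejected.
day = datetime.date(2007, 10, 31)
try:
    marshal.dumps(day)
    marshal_rejects_date = False
except ValueError:
    marshal_rejects_date = True

# pickle handles the same object, and it remembers that "a" and "b"
# refer to one list, so the loaded copy shares a single list too.
shared = [1, 2, 3]
loaded = pickle.loads(pickle.dumps({"day": day, "a": shared, "b": shared}))
pickle_keeps_sharing = loaded["a"] is loaded["b"]

print(marshal_roundtrip_ok, marshal_rejects_date, pickle_keeps_sharing)
```

If you only ever store dicts, lists, tuples, strings and numbers, the
restriction to core types costs you nothing, which is the situation I
am describing.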

For example, both gadfly SQL

   http://gadfly.sourceforge.net

and nucular full text/fielded search

   http://nucular.sourceforge.net

use marshal as the underlying serializer.  Using cPickle
would probably make serialization more than 2x slower.
This is one of the 2 or 3 key tricks that make these
packages as fast as they are.
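
The speed difference is easy to measure on your own data.  The sketch
below is one way to do it, assuming Python 3, where the C
implementation formerly called cPickle is what the pickle module uses
by default.  The payload and the helper name time_roundtrip are
hypothetical, and the ratio you see will depend on the interpreter
version and on the shape of the data, so no particular result is
claimed here.

```python
import marshal
import pickle
import timeit

# Hypothetical payload: many small records built only from core types.
payload = [{"id": i, "name": "row%d" % i, "vals": [i, i * 2.5]}
           for i in range(10000)]


def time_roundtrip(module, data, repeat=5):
    """Return the best time in seconds for one dumps + loads round trip."""
    timings = timeit.repeat(lambda: module.loads(module.dumps(data)),
                            number=1, repeat=repeat)
    return min(timings)


marshal_secs = time_roundtrip(marshal, payload)
pickle_secs = time_roundtrip(pickle, payload)

print("marshal: %.4fs  pickle: %.4fs  pickle/marshal: %.2fx"
      % (marshal_secs, pickle_secs, pickle_secs / marshal_secs))
```

Taking the minimum over a few repeats reduces noise from whatever else
the machine is doing at the time.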

   -- Aaron Watters

===
http://www.xfeedme.com/nucular/gut.py/go?FREETEXT=halloween