marshal vs pickle

Raymond Hettinger python at rcn.com
Wed Oct 31 13:37:19 EDT 2007


On Oct 31, 6:45 am, Aaron Watters <aaron.watt... at gmail.com> wrote:
>  I like to use
> marshal a lot because it's the absolutely fastest
> way to store and load data to/from Python. Furthermore
> because marshal is "stupid" the programmer has complete
> control.  A lot of the overhead you get with the
> pickles which make them generally much slower than
> marshal comes from the cleverness by which pickle will
> recognize shared objects and all that junk.  When I
> serialize,

I believe this FUD is somewhat out-of-date.  Marshalling
became smarter about repeated and shared objects.  The
pickle module (using protocol 2) has a similar implementation
to marshal and both use the same tricks, but pickle is
much more flexible in the range of objects it can handle
(e.g. sets became marshalable only recently, while deques
can pickle but not marshal).
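
To make the difference in supported types concrete, here is a
minimal sketch. The helper name try_marshal and the sample data
are made up for illustration; the only library calls used are
pickle.dumps/loads and marshal.dumps/loads.

```python
import marshal
import pickle
from collections import deque


def try_marshal(obj):
    """Return the marshalled bytes, or None if marshal rejects the object."""
    try:
        return marshal.dumps(obj)
    except ValueError:
        # marshal raises ValueError for types it does not support
        return None


# One list referenced twice, plus a deque.
shared = [1, 2, 3]
data = {"left": shared, "right": shared, "recent": deque([4, 5, 6])}

# pickle (protocol 2) round-trips the whole structure.
restored = pickle.loads(pickle.dumps(data, protocol=2))
same_object = restored["left"] is restored["right"]  # sharing is preserved
deque_ok = restored["recent"] == deque([4, 5, 6])

# marshal copes with the plain list but not with the deque.
marshal_list = try_marshal([1, 2, 3])
marshal_deque = try_marshal(deque([4, 5, 6]))
```

After the round trip the two dictionary entries still point at
the same list object, and the deque comes back intact, while
try_marshal returns None for the deque.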

Users are almost always better off using pickle, which
is version independent, fast, and can handle many more
types of objects than marshal.

Also FWIW, in most applications of pickling/marshaling,
the storage or transmission times dominate computation
time. I've gotten nice speed-ups by zipping the pickle
before storing, transmitting, or sharing (RPC apps
for example).
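
A rough sketch of the zipping idea, assuming zlib as the
compressor. The function names dump_compressed and
load_compressed, the compression level, and the sample records
are all illustrative choices, not anything prescribed above.

```python
import pickle
import zlib


def dump_compressed(obj, protocol=2, level=6):
    """Pickle obj, then compress the resulting bytes with zlib."""
    return zlib.compress(pickle.dumps(obj, protocol), level)


def load_compressed(blob):
    """Reverse dump_compressed: decompress, then unpickle."""
    return pickle.loads(zlib.decompress(blob))


# Repetitive sample data, the kind that compresses well.
records = [
    {"id": i, "name": "user%d" % i, "tags": ["a", "b", "c"]}
    for i in range(1000)
]

raw = pickle.dumps(records, 2)
packed = dump_compressed(records)

# Fewer bytes to store or send, and the round trip is lossless.
saved_bytes = len(raw) - len(packed)
roundtrip_ok = load_compressed(packed) == records
```

How much this helps depends on the data and on how slow the
disk or network is relative to the CPU, so it is worth timing
on a real workload before committing to it.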


Raymond



