pickle alternative
Andrew Dalke
dalke at dalkescientific.com
Tue May 31 03:11:12 EDT 2005
simonwittber wrote:
> I've written a simple module which serializes these python types:
>
> IntType, TupleType, StringType, FloatType, LongType, ListType, DictType
For simple data types consider "marshal" as an alternative to "pickle".
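
For reference, a minimal sketch of the marshal route. The sample data is made up for illustration; marshal.dumps and marshal.loads are the standard-library calls. Keep in mind that the marshal format is not guaranteed to stay the same between Python versions and should not be fed untrusted data.

```python
import marshal

# Hypothetical sample data built only from the simple types listed above.
record = {"id": 42, "name": "spam", "scores": [1.5, 2.5], "pair": (1, 2)}

blob = marshal.dumps(record)      # serialize to a byte string
restored = marshal.loads(blob)    # rebuild the object from the bytes

print(restored == record)
```
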
> It appears to work faster than pickle, however, the decode process is
> much slower (5x) than the encode process. Has anyone got any tips on
> ways I might speed this up?
  def dec_int_type(data):
      value = int(unpack('!i', data.read(4))[0])
      return value
That 'int' isn't needed -- unpack returns a tuple, and its element [0]
is already an int, not a string representation of the int.
BTW, your code won't work on 64-bit machines.
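
A quick sketch of both points, written against a current Python with example values only. It shows that unpack hands back a tuple whose element is already an int, and that '!i' is always a 4-byte signed field, so a value outside that range cannot be packed. That this is the 64-bit problem meant above is my assumption: on such builds a plain int can be wider than 32 bits.

```python
import struct

packed = struct.pack("!i", 12345)
value = struct.unpack("!i", packed)[0]   # unpack returns a tuple; [0] is already an int

print(type(value), len(packed))

# "!i" uses the standard size (4 bytes) no matter what the platform is,
# so a value that needs more than 32 bits is rejected.
try:
    struct.pack("!i", 2**40)
    fits = True
except struct.error:
    fits = False

print(fits)
```
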
  def enc_long_type(obj):
      return "%s%s%s" % ("B", pack("!L", len(str(obj))), str(obj))
There's no need to compute str(long) twice -- for large longs
it takes a lot of work to convert to base 10. For that matter,
it's faster to convert to hex, and the hex form is more compact.
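
A rough sketch of the hex idea, assuming the same "B" type code and "!L" length prefix as in enc_long_type. The names enc_long_hex and dec_long_hex are hypothetical, and the code uses byte strings so that it also runs on a current Python.

```python
import struct

def enc_long_hex(obj):
    # Convert once, and to hex rather than base 10.
    digits = ("%x" % obj).encode("ascii")
    return b"B" + struct.pack("!L", len(digits)) + digits

def dec_long_hex(payload):
    # payload is everything after the one-byte type code.
    (count,) = struct.unpack("!L", payload[:4])
    return int(payload[4:4 + count].decode("ascii"), 16)

n = 2**200 + 12345
encoded = enc_long_hex(n)
print(dec_long_hex(encoded[1:]) == n)
print(len("%x" % n), len(str(n)))   # hex digits vs. decimal digits
```

Whether this is actually faster for your data is worth timing; the gain should grow with the size of the numbers.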
Every decode you do requires several function calls. While
less elegant, you'll likely get better performance (test it!)
if you minimize that; try something like this:
  import struct
  from StringIO import StringIO

  def decode(data):
      return _decode(StringIO(data).read)

  def _decode(read, unpack = struct.unpack):
      code = read(1)
      if not code:
          raise IOError("reached the end of the file")
      if code == "I":
          return unpack("!i", read(4))[0]
      if code == "F":
          return unpack("!f", read(4))[0]
      if code == "L":
          # unpack returns a tuple, so take element 0 for the count
          count = unpack("!i", read(4))[0]
          return [_decode(read) for i in range(count)]
      if code == "D":
          count = unpack("!i", read(4))[0]
          # assumes each entry decodes to a (key, value) tuple
          return dict([_decode(read) for i in range(count)])
      ...
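
To "test it", something along these lines could compare the two styles. This is only a sketch: it assumes the original module dispatches to one small function per type (as dec_int_type suggests), it handles just the "I" and "L" codes, and names such as DECODERS, decode_table and decode_inline are made up here. It uses byte strings and io.BytesIO so that it runs on a current Python.

```python
import struct
import timeit
from io import BytesIO

def dec_int(read):
    return struct.unpack("!i", read(4))[0]

def dec_list(read):
    count = struct.unpack("!i", read(4))[0]
    return [decode_table(read) for _ in range(count)]

# Hypothetical dispatch table: one function per type code.
DECODERS = {b"I": dec_int, b"L": dec_list}

def decode_table(read):
    # One extra function call per decoded value.
    return DECODERS[read(1)](read)

def decode_inline(read, unpack=struct.unpack):
    # Same format, but the per-type work is inlined in an if-chain.
    code = read(1)
    if code == b"I":
        return unpack("!i", read(4))[0]
    if code == b"L":
        count = unpack("!i", read(4))[0]
        return [decode_inline(read) for _ in range(count)]
    raise ValueError("unknown type code %r" % code)

# Sample input: a list of 1000 ints in the same wire format.
values = list(range(1000))
data = b"L" + struct.pack("!i", len(values)) + b"".join(
    b"I" + struct.pack("!i", v) for v in values)

for fn in (decode_table, decode_inline):
    secs = timeit.timeit(lambda: fn(BytesIO(data).read), number=200)
    print(fn.__name__, secs)
```

The numbers will vary with machine and Python version, so treat them as a guide for your own data rather than a result.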
Andrew
dalke at dalkescientific.com