[Python-Dev] Re: [Python-checkins] python/nondist/sandbox/datetime picklesize.py,NONE,1.1
Michael Hudson
mwh@python.net
03 Dec 2002 18:27:59 +0000
Michael Hudson <mwh@python.net> writes:
> tim_one@users.sourceforge.net writes:
>
> > New program just to display pickle sizes. This makes clear that the
> > copy_reg based C implementation is much more space-efficient in the
> > end than the __getstate__/__setstate__ based Python implementation,
> > but that 4-byte date objects still suffer > 10 bytes of overhead each
> > no matter how many of them you pickle in one gulp.
>
> Presumably there's a possibility of an optimization for pickling
> homogeneous (i.e. all the same type) lists (in pickle.py, not here).
>
> Hard to say whether it would be worth it, though.
Here's a fairly simple-minded patch to the pickling side of pickle.py;
it seems to save about 6 bytes per object in the good cases.
With the patch:
list of 100 dates via C -- 1236 bytes, 12.36 bytes/obj
Without the patch:
list of 100 dates via C -- 1871 bytes, 18.71 bytes/obj
I'm not going to pursue this further unless someone thinks it's a
worthwhile move.
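
For anyone who wants to play with the idea without patching pickle.py,
here is a rough standalone sketch in present-day Python. It is not the
patch itself: the helper names pack_homogeneous and unpack_homogeneous
are made up for illustration, it leans on __reduce_ex__ rather than on
copy_reg's dispatch_table, and it only handles objects whose reduce
value is a plain (constructor, args) pair. The point is just that the
constructor gets written once and only the argument tuples repeat.

```python
import pickle
from datetime import date, timedelta


def pack_homogeneous(objs, protocol=pickle.HIGHEST_PROTOCOL):
    """Return (constructor, [args, ...]) for a same-typed list, else None.

    Hypothetical helper: only two-element reduce values are supported,
    so anything carrying extra state falls back to None.
    """
    if not objs:
        return None
    first_type = type(objs[0])
    if any(type(o) is not first_type for o in objs):
        return None
    reduced = [o.__reduce_ex__(protocol) for o in objs]
    constructor = reduced[0][0]
    for r in reduced:
        if not (isinstance(r, tuple) and len(r) == 2 and r[0] is constructor):
            return None
    # Keep the constructor once; keep one argument tuple per object.
    return constructor, [r[1] for r in reduced]


def unpack_homogeneous(packed):
    """Rebuild the original list from a pack_homogeneous() result."""
    constructor, arg_tuples = packed
    return [constructor(*args) for args in arg_tuples]


dates = [date(2002, 12, 3) + timedelta(days=i) for i in range(100)]
plain = pickle.dumps(dates, pickle.HIGHEST_PROTOCOL)
packed = pickle.dumps(pack_homogeneous(dates), pickle.HIGHEST_PROTOCOL)

print("plain : %d bytes, %.2f bytes/obj" % (len(plain), len(plain) / len(dates)))
print("packed: %d bytes, %.2f bytes/obj" % (len(packed), len(packed) / len(dates)))
```

The exact byte counts depend on the pickle protocol and Python version,
so don't expect them to match the figures above. A mixed-type or empty
list makes pack_homogeneous return None, and the caller would then
pickle the list the ordinary way.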
Cheers,
M.
Index: pickle.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/pickle.py,v
retrieving revision 1.72
diff -c -r1.72 pickle.py
*** pickle.py 13 Nov 2002 22:01:26 -0000 1.72
--- pickle.py 3 Dec 2002 18:24:37 -0000
***************
*** 109,114 ****
--- 109,115 ----
INST = 'i'
LONG_BINGET = 'j'
LIST = 'l'
+ HOM_LIST = 'k'
EMPTY_LIST = ']'
OBJ = 'o'
PUT = 'p'
***************
*** 439,445 ****
--- 440,478 ----
      def save_empty_tuple(self, object):
          self.write(EMPTY_TUPLE)
+     def save_hom_list(self, object):
+         reduce = dispatch_table[type(object[0])]
+
+         write = self.write
+         save = self.save
+         memo = self.memo
+
+         write(HOM_LIST)
+
+         o = object[0]
+
+         c, a = reduce(o)
+
+         l = [a]
+
+         for o in object[1:]:
+             l.append(reduce(o)[1])
+
+         save(c)
+         save(l)
+
      def save_list(self, object):
+         t = {}
+
+         for o in object:
+             t[type(o)] = 1
+             if len(t) > 1:
+                 break
+         else:
+             if t and dispatch_table.has_key(t.iterkeys().next()):
+                 self.save_hom_list(object)
+                 return
+
          d = id(object)
          write = self.write
--
Unfortunately, nigh the whole world is now duped into thinking that
silly fill-in forms on web pages is the way to do user interfaces.
-- Erik Naggum, comp.lang.lisp