[issue3873] Unpickling is really slow
STINNER Victor
report at bugs.python.org
Thu Jul 29 00:58:04 CEST 2010
STINNER Victor <victor.stinner at haypocalc.com> added the comment:
New version of my patch:
- add a "used" attribute to the UnpicklerBuffer structure: the read buffer is disabled for non-seekable files and for protocol 0 (at the first call to unpickle_readline())
- check whether PyObject_GetAttrString(file, "seek") returns NULL
- unpickle_readline() also flushes the buffer
- add a new patch specific to the read buffer: ensure that the unpickler doesn't eat data at the end of the file
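As a rough illustration of the last point, the property can be checked from Python by writing two pickles back to back into one file: after loading the first one, the file position should sit exactly at the start of the second. This is only a sketch; io.BytesIO is an assumed stand-in for a seekable file, and this is not the code from the patch.

```python
import io
import pickle

# Two independent pickles stored one after the other in the same stream.
first = pickle.dumps([0] * 10, protocol=2)
second = pickle.dumps("trailing data", protocol=2)

f = io.BytesIO(first + second)

obj1 = pickle.load(f)
# If the unpickler consumed nothing beyond its own pickle,
# the position equals the length of the first pickle.
pos_after_first = f.tell()

obj2 = pickle.load(f)
remaining = f.read()  # nothing should be left after the second pickle
```

If the unpickler read ahead without giving the extra bytes back, the second load would start in the middle of the stream and fail or return garbage.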
test_pickle passes without any error.
The read buffer is disabled at the first call to unpickle_readline() because unpickle_readline() has to flush the buffer. It will be very difficult to optimize protocol 0, but I hope that nobody uses it nowadays.
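The interaction between the read buffer, seek() and readline can be modelled in a few lines of Python. The sketch below is an assumption about the general technique, not a transcription of the C patch: the class name ReadAheadBuffer and the chunk_size parameter are hypothetical, and only the "used" flag mirrors the attribute named above.

```python
import io


class ReadAheadBuffer:
    """Hypothetical Python model of a read-ahead buffer (not the patch's C code)."""

    def __init__(self, file, chunk_size=4096):
        self.file = file
        self.chunk_size = chunk_size
        self.data = b""
        self.pos = 0
        # Reading ahead is only safe if unread bytes can be given back with seek().
        self.used = hasattr(file, "seek")

    def read(self, n):
        if not self.used:
            return self.file.read(n)  # buffer disabled: plain pass-through
        while len(self.data) - self.pos < n:
            chunk = self.file.read(max(self.chunk_size, n))
            if not chunk:
                break  # end of file
            self.data = self.data[self.pos:] + chunk
            self.pos = 0
        result = self.data[self.pos:self.pos + n]
        self.pos += len(result)
        return result

    def flush(self):
        """Give unread bytes back to the file and empty the buffer."""
        unread = len(self.data) - self.pos
        if unread:
            self.file.seek(-unread, io.SEEK_CUR)
        self.data = b""
        self.pos = 0

    def readline(self):
        # Line-oriented reads (protocol 0) bypass the buffer:
        # flush it first, then disable it for the rest of the stream.
        self.flush()
        self.used = False
        return self.file.readline()
```

In this model, calling flush() when unpickling finishes is what keeps the reader from eating data that follows the pickle, and a file without seek() never reads ahead at all.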
===========
Benchmark with [0]*10**6, Python compiled with pydebug.
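The benchmark script itself is not included in this message. A minimal sketch of such a measurement, assuming in-memory streams instead of real files, could look like the following; the NonSeekableReader class and the bench() helper are hypothetical names introduced only for illustration.

```python
import io
import pickle
import time


class NonSeekableReader:
    """Hypothetical stand-in for a pipe or socket: exposes read() and readline() only."""

    def __init__(self, data):
        self._file = io.BytesIO(data)

    def read(self, size=-1):
        return self._file.read(size)

    def readline(self):
        return self._file.readline()


def bench(obj, protocol):
    """Return dump and load timings in milliseconds for one protocol."""
    timings = {}
    start = time.perf_counter()
    data = pickle.dumps(obj, protocol)
    timings["dump"] = (time.perf_counter() - start) * 1000.0

    readers = (("seekable=False", NonSeekableReader), ("seekable=True", io.BytesIO))
    for label, make_reader in readers:
        reader = make_reader(data)
        start = time.perf_counter()
        loaded = pickle.load(reader)
        timings["load (%s)" % label] = (time.perf_counter() - start) * 1000.0
        assert loaded == obj  # sanity check: the round trip must be lossless
    return timings


if __name__ == "__main__":
    big_list = [0] * 10**6
    for protocol in (0, 1, 2):
        print("Protocol %d:" % protocol)
        for name, ms in bench(big_list, protocol).items():
            print("- %s: %.1f ms" % (name, ms))
```

Absolute numbers from such a script depend heavily on the build (a pydebug build is much slower than a release build), so only the relative differences are meaningful.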
Without the patch
-----------------
Protocol 0:
- dump: 598.0 ms
- load (seekable=False): 3337.3 ms
- load (seekable=True): 3309.6 ms
Protocol 1:
- dump: 217.8 ms
- load (seekable=False): 864.2 ms
- load (seekable=True): 873.3 ms
Protocol 2:
- dump: 226.5 ms
- load (seekable=False): 867.8 ms
- load (seekable=True): 854.6 ms
With the patch
--------------
Protocol 0:
- dump: 615.5 ms
- load (seekable=False): 3201.3 ms
- load (seekable=True): 3223.4 ms
Protocol 1:
- dump: 219.8 ms
- load (seekable=False): 942.1 ms
- load (seekable=True): 175.2 ms
Protocol 2:
- dump: 221.1 ms
- load (seekable=False): 943.9 ms
- load (seekable=True): 175.5 ms
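To put the load figures in proportion, the ratio of the time without the patch to the time with the patch can be computed directly from the numbers above (a ratio above 1 means the patched build is faster). This is plain arithmetic on the reported timings, not a new measurement.

```python
# Load times in ms copied from the figures above:
# (protocol, seekable) -> (without the patch, with the patch)
load_ms = {
    (0, False): (3337.3, 3201.3),
    (0, True): (3309.6, 3223.4),
    (1, False): (864.2, 942.1),
    (1, True): (873.3, 175.2),
    (2, False): (867.8, 943.9),
    (2, True): (854.6, 175.5),
}

# Speedup factor: time before divided by time after.
speedup = {key: before / after for key, (before, after) in load_ms.items()}

for (protocol, seekable), ratio in sorted(speedup.items()):
    print("protocol %d, seekable=%s: %.2fx" % (protocol, seekable, ratio))
```

With these figures, seekable loads for protocols 1 and 2 come out roughly 5 times faster, non-seekable loads for the same protocols take about 9% longer, and protocol 0 is essentially unchanged.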
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3873>
_______________________________________