[Python-ideas] o(n**2) problem with marshal.dumps for large objects, with patch

Christian Heimes lists at cheimes.de
Thu Jan 10 20:04:31 CET 2008


Aaron Watters wrote:
> Hi  folks.  Much to my surprise I found that one of
> my applications seemed to be running slow as a result
> of marshal.dumps.  I think the culprit is the w_more(...)
> function, which grows the marshal buffer in 1k units.
> This means that a marshal of size 100k will have 100
> reallocations and string copies.  Other parts of Python
> (and java etc) have a proportional reallocation strategy
> which reallocates a new size based on the existing size.
> This mean a 100k marshal requires just 5 or so
> reallocations and string copies (n log n versus n**2
> asymptotic performance).
> 
> I humbly submit the following patch (based on python 2.6a0
> source).  I think it is a strict improvement on the existing
> code, but I've been wrong before (twice ;)).

Could you please submit the patch at http://bugs.python.org/ ? Patches
in mailing list posts have a tendency to get lost. An unified diff (diff
-u) is preferred.

Thanks

Christian




More information about the Python-ideas mailing list