socket send O(N**2) complexity

exarkun at twistedmatrix.com exarkun at twistedmatrix.com
Mon Sep 21 16:29:41 EDT 2009


On 08:00 pm, rtw at freenet.co.uk wrote:
>Zac Burns wrote in news:mailman.211.1253559803.2807.python- 
>list at python.org
>in comp.lang.python:
>>The mysocket.mysend method given at
>>http://docs.python.org/howto/sockets.html has an (unwitting?) O(N**2)
>>complexity for long msg due to the string slicing.
>>
>>I've been looking for a way to optimize this, but aside from a pure
>>python 'string slice view' that looks at the original string I can't
>>think of anything. Perhaps start and end keywords could be added to
>>send? I can't think of a reason for the end keyword,  but it would be
>>there for symmetry.
>
>I ran this script on various versions of python I have access to:
>
>#encoding: utf-8
>raw_input( "start" )
>
>s = 'x' * 1000000
>r = [None] * 1000
>
>raw_input( "allocated 1 meg + " )
>
>for i in xrange(1000):
>  r[i] = s[:]
>
>raw_input( "end" )
>
>Niether of the CPython versions (2.5 and 3.0 (with modified code))
>exibited any memory increase between "allocated 1 meg + " and "end"

You bumped into a special case that CPython optimizes.  s[:] is s.  If 
you repeat your test with s[1:], you'll see memory climb as one might 
normally expect.
>pypy-c (1.0.0) showed a 30k jump, and IronPython 2.0 showed a few megs
>jump.
>
>AIUI, as a python string is imutable, a slice of a string is a
>new string which points (C char *) to the start of the slice data
>and with a length that is the length of the slice, about 8 bytes
>on 32 bit machine.
>
>So even though a slice assignment new_s = s[:] appears to a python
>programmer to make a "copy" of s, its only the a few bytes of
>metadata (the pointer and the length) that is really copied, the
>strings character data stays where it is.
>
>So the code you cite is in fact O(N) as the copy is constant size.

This all (basically) valid for the special case of s[:].  For any other 
string slicing, though, the behavior is indeed O(N**2).

To the OP, you can get view-like behavior with the "buffer" builtin. 
Here's an example of its usage from Twisted, where it is employed for 
exactly the reason raised here:

http://twistedmatrix.com/trac/browser/trunk/twisted/internet/abstract.py#L93

Jean-Paul



More information about the Python-list mailing list