How naive is Python?

Steven D'Aprano steve at REMOVEME.cybersource.com.au
Sun Jan 14 23:17:59 EST 2007


On Mon, 15 Jan 2007 03:25:01 +0000, John Nagle wrote:

>     How naive (in the sense that compiler people use the term)
> is the current Python system?  For example:
> 
> 	def foo() :
> 	    s = "This is a test"
> 	    return(s)
> 
>          s2 = foo()
> 
> How many times does the string get copied?


Let's find out. Results below for Python 2.3 -- other versions may vary.

>>> def foo():
...     s = "This is a test"
...     return s
...
>>> def foo2():
...     return "This is a test"
...
>>> import dis
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 ('This is a test')
              3 STORE_FAST               0 (s)

  3           6 LOAD_FAST                0 (s)
              9 RETURN_VALUE
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
>>> dis.dis(foo2)
  2           0 LOAD_CONST               1 ('This is a test')
              3 RETURN_VALUE
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE

foo and foo2 functions compile to different byte-code. foo does a little
more work than foo2, but it is likely to be a trivial difference.

>>> s1 = foo()
>>> s2 = foo()
>>> s1 == s2, s1 is s2
(True, True)

So the string "This is a test" within foo is not copied each time the
function is called. However, the string "This is a test" is duplicated
between foo and foo2 (the two functions don't share the same string
instance):

>>> s3 = foo2()
>>> s3 == s1, s3 is s1
(True, False)


 
> Or, for example:
> 
>          s1 = "Test1"
> 	s2 = "Test2"
> 	s3 = "Test3"
> 	s = s1 + s2 + s3
> 
> Any redundant copies performed, or is that case optimized?

I don't believe it is optimized. I believe that in Python 2.5 simple
numeric optimizations are performed, so that "x = 1 + 3" would compile to
"x = 4", but I don't think that holds for strings. If you are running 2.5,
you could find out with dis.dis.



> How about this?
> 
> 	kcount = 1000
> 	s = ''
> 	for i in range(kcount) :
> 	    s += str(i) + ' '
> 
> Is this O(N) or O(N^2) because of recopying of "s"?

That will be O(N**2), except that CPython version 2.4 (or is it 2.5?) can,
sometimes, optimize it. Note that this is an implementation detail, and
doesn't hold for other Pythons like Jython, IronPython or PyPy.



-- 
Steven D'Aprano 




More information about the Python-list mailing list