question on string object handling in Python 2.7.8

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Dec 24 06:22:32 EST 2014


Dave Tian wrote:

> Hi,
> 
> There are 2 statements:
> A: a = 'h'
> B: b = 'hh'
> 
> According to my understanding, A should be faster, as the characters cache
> would shortcut this 1-byte string 'h' without a malloc; B should be slower
> than A, as the characters cache does not work for the 2-byte string 'hh',
> which triggers a malloc. However, when I put A/B into a big loop and try to
> measure the performance using cProfile, B always seems faster than A.
>
> Testing code: 
> for i in range(0, 100000000): a = 'h' #or b = 'hh'
> Testing cmd: python -m cProfile test.py

Any performance difference is entirely an artifact of your testing method.
You have completely misinterpreted what this piece of code will do.

What happens here is that you are timing a piece of code which does this:

- Build a large list containing 100 million individual int objects. Each int
object has to be allocated at run time, as does the list. Each int object
is about 12 bytes in size.

- Then, the name i is bound to one of those int objects. This is a fast
pointer assignment.

- Then, a string object containing either 'h' or 'hh' is allocated. In
either case, that requires 21 bytes, plus one byte per character. So either
22 or 23 bytes.

- The name a is bound to that string object. This is also a fast pointer
assignment.

- The loop returns to the top, and the name i is bound to the next int
object.

- Then, the name a is bound *to the same string object*, since it will have
been cached. No further malloc will be needed.
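That last point is easy to check directly. In CPython, the literal 'h' is stored once in the code object, so every pass through the loop binds the name to the very same object; a small sketch relying on CPython's id() behaviour:

```python
# CPython detail: the literal 'h' is a single pre-built constant, so
# id(a) is identical on every iteration -- no per-iteration allocation.
ids = set()
for i in range(1000):
    a = 'h'
    ids.add(id(a))
print(len(ids))  # 1
```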


So as you can see, the time you measure is dominated by allocating a massive
list containing 100 million int objects. Only a single string object is
allocated, and the time difference between creating 'h' versus 'hh' is
insignificant.
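You can time the two costs separately to see which one matters. A sketch, scaled down and run under Python 3, where list(range(n)) stands in for Python 2's eager range(n):

```python
from timeit import timeit

n = 1000000  # scaled down from 100 million

# Building the list allocates n int objects plus the list itself
# (Python 2's range() did this eagerly, before the loop even started).
t_build = timeit("list(range(%d))" % n, number=10)

# The loop body itself is just one constant load and one name binding.
t_assign = timeit("a = 'h'", number=n * 10)

print("build list: %.3fs   assignments: %.3fs" % (t_build, t_assign))
```

On a typical CPython build the list construction takes at least as long as all the assignments put together, despite the assignments running the same total number of times.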

The byte code can be inspected like this:

py> code = compile("for i in range(100000000): a = 'h'", '', 'exec')
py> from dis import dis
py> dis(code)
  1           0 SETUP_LOOP              26 (to 29)
              3 LOAD_NAME                0 (range)
              6 LOAD_CONST               0 (100000000)
              9 CALL_FUNCTION            1
             12 GET_ITER
        >>   13 FOR_ITER                12 (to 28)
             16 STORE_NAME               1 (i)
             19 LOAD_CONST               1 ('h')
             22 STORE_NAME               2 (a)
             25 JUMP_ABSOLUTE           13
        >>   28 POP_BLOCK
        >>   29 LOAD_CONST               2 (None)
             32 RETURN_VALUE


Notice instructions 19 and 22:

             19 LOAD_CONST               1 ('h')
             22 STORE_NAME               2 (a)

The string object is built at compile time, not run time, and Python simply
binds the name a to the pre-existing string object.
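You can confirm this by looking at the code object's constants table; in CPython, co_consts holds every literal the compiler pre-built:

```python
# The compiler stores 'h' in the code object's constant table,
# alongside the loop bound and the implicit return value None.
code = compile("for i in range(100000000): a = 'h'", '', 'exec')
print('h' in code.co_consts)  # True
```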

If you looked at the output of your timing code, you would see something
like this (only the times would be much larger, I cut the loop down from
100 million to only a few tens of thousands):


[steve at ando ~]$ python -m cProfile /tmp/x.py
         3 function calls in 0.062 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.051    0.051    0.062    0.062 x.py:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable'
of '_lsprof.Profiler' objects}
        1    0.012    0.012    0.012    0.012 {range}


The profiler doesn't even show the time required to bind the name to the
string object.

Here is a better way of demonstrating the same thing:


py> from timeit import Timer
py> t = Timer("a = 'h'")
py> min(t.repeat())
0.0508120059967041
py> t = Timer("a = 'hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh'")
py> min(t.repeat())
0.050585031509399414

No meaningful difference in time. Any difference you do see is just timing
noise.
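Since Timer runs the statement a million times per repeat by default, the minimum of several repeats converts cleanly into a per-assignment cost; a sketch:

```python
from timeit import Timer

# timeit executes the statement `number` times per repeat; taking the
# minimum over several repeats filters out scheduler and cache noise.
t = Timer("a = 'h'")
best = min(t.repeat(repeat=5, number=1000000))
print("%.1f ns per assignment" % (best / 1000000 * 1e9))
```

Whatever the exact figure on your machine, it will be the same for 'h' and 'hh' to within noise, because both compile to the same LOAD_CONST/STORE_NAME pair.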




-- 
Steven



