question on string object handling in Python 2.7.8

Gregory Ewing greg.ewing at canterbury.ac.nz
Wed Dec 24 23:23:31 EST 2014


Dave Tian wrote:

> A: a = 'h'
> B: b = 'hh'
>
> According to my understanding, A should be faster as characters would
> shortcut this 1-byte string 'h' without malloc;

It sounds like you're expecting characters to be stored
"unboxed" like in Java.

That's not the way Python works. Objects are used for
everything, including numbers and characters (there is
no separate character type in Python, they're just
length-1 strings).
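
A quick way to see this for yourself is sketched below. The
variable names s and c are just made up for illustration;
the code should behave the same under Python 2.7 and 3.

```python
# Indexing a string does not produce a special "char" value;
# it produces another str object, of length 1.
s = 'hh'
c = s[0]

print(type(c) is str)          # the "character" is an ordinary str
print(len(c))                  # ... whose length happens to be 1
print(isinstance(c, object))   # and, like everything else, it is an object
print(isinstance(1, object))   # the same goes for numbers
```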

> for i in range(0, 100000000):
>     a = 'h' #or b = 'hh'
> Testing cmd: python -m cProfile test.py

Since you're assigning a string literal, there's just
one string object being allocated (at the time the code
is read in and compiled). All the loop is doing is
repeatedly assigning a reference to that object to a
or b, which doesn't require any further mallocs;
all it does is adjust reference counts. This will
be swamped by the overhead of the for-loop itself,
which is allocating and deallocating 100 million
integer objects.
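
If you want to check this, something like the following
should do. It is only a sketch: loop_body is a hypothetical
stand-in for the body of your loop, wrapped in a function so
that its code object is easy to inspect.

```python
import dis

def loop_body():
    # Same assignment as in the loop being discussed.
    a = 'h'
    return a

# The literal was created once, at compile time, and is kept
# among the constants of the function's code object.
print('h' in loop_body.__code__.co_consts)

# Every call hands back a reference to that same object;
# no new string is allocated per call.
print(loop_body() is loop_body())

# The disassembly shows the assignment as a constant load
# followed by a store to the local variable.
dis.dis(loop_body)
```

The exact bytecode listing varies between Python versions,
but in each case the assignment just loads an existing
constant rather than building a new string.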

I would expect both of these to be exactly the same
speed, within measurement error. Any difference you're
seeing is probably just noise, or the result of some
kind of warming-up effect.
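
For a comparison this small, the timeit module is probably a
better tool than cProfile. A minimal sketch, where the repeat
and number values are arbitrary choices of mine and the names
t_a and t_b are made up:

```python
import timeit

# Run each statement a million times, repeat that five times,
# and keep the fastest run, which is the one least disturbed
# by whatever else the machine was doing.
t_a = min(timeit.repeat("a = 'h'", repeat=5, number=1000000))
t_b = min(timeit.repeat("b = 'hh'", repeat=5, number=1000000))

print("a = 'h'  : %.4f s" % t_a)
print("b = 'hh' : %.4f s" % t_b)
```

Run it a few times; if the two figures swap places from one
run to the next, that's a good sign you're looking at noise.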

-- 
Greg


