Is empty string cached?

Fri Feb 17 08:06:11 EST 2006

Farshid Lashkari wrote:
>> I really don't understand why it's so important: it's not a part of 
>> the language definition at all, and therefore whatever behavior you 
>> see is simply an artifact of the implementation you observe.
> 
> 
> I guess I should rephrase my question in the form of an example. Should 
> I assume that a new string object is created in each iteration of the 
> following loop?
> 
> for x in xrange(1000000):
>     func(x,'some string')
> 
> Or would it be better to do the following?
> 
> stringVal = 'some string'
> for x in xrange(1000000):
>     func(x,stringVal)
> 
> Or, like you stated, is it not important at all?

In this particular case, it's no big deal: you are passing a
literal, which is something Python knows won't change, so the
same string object can be reused on every iteration.
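
As a rough way to check this yourself, the sketch below passes a
literal inside a loop and records the identity of what arrives. It
assumes CPython and Python 3 (so range instead of xrange); func and
loop_with_literal are hypothetical stand-ins, not anything from the
original code.

```python
def func(x, s):
    # Hypothetical stand-in for the poster's func: report which object it got.
    return id(s)

def loop_with_literal(n):
    seen = set()
    for x in range(n):
        # The literal is compiled once and stored with the function's code.
        seen.add(func(x, 'some string'))
    return seen

ids = loop_with_literal(1000)
literal_is_constant = 'some string' in loop_with_literal.__code__.co_consts

print(len(ids))             # number of distinct objects passed to func
print(literal_is_constant)  # whether the literal is a stored constant
```

On CPython the loop should see a single object, because the literal
lives in the code object's constants. That is an implementation
detail, not something the language promises.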

In general, it's semantically very different to create an
object at one point and then use a reference to that over
and over in a loop, or to create a new object over and
over again in a loop. E.g.

for x in xrange(1000000):
    func(x, str(5))

vs.

stringVal = str(5)
for x in xrange(1000000):
    func(x,stringVal)

This isn't just a matter of extra function call overhead. In
the latter case, you are telling Python that every call to
"func(x,stringVal)" receives the same object as its second
argument (assuming that stringVal isn't rebound somewhere else
in the loop). In the former case, no such guarantee can be made
from studying the loop.
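
To make the difference visible, here is a small sketch that swaps the
string for a list, since lists are never cached or shared behind your
back. It assumes Python 3; func and the seen parameter are
hypothetical helpers added only so the arguments can be inspected
afterwards.

```python
def func(x, arg, seen):
    # Hypothetical stand-in for func: keep every argument alive for inspection.
    seen.append(arg)

# Create a new object on every iteration.
fresh = []
for x in range(3):
    func(x, [5], fresh)

# Create one object up front and pass a reference to it each time.
value = [5]
shared = []
for x in range(3):
    func(x, value, shared)

fresh_count = len({id(obj) for obj in fresh})
shared_count = len({id(obj) for obj in shared})
print(fresh_count, shared_count)  # 3 1

# Mutating one "shared" argument is visible through all of them.
shared[0].append(6)
print(shared[2])  # [5, 6]
print(fresh[2])   # [5]
```

The arguments are kept in a list on purpose: if they were discarded
after each call, a freed object's id could be reused by the next one
and the count would be misleading.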

As for interning strings, it's my understanding that current
CPython interns strings that look like identifiers, i.e. those
that start with an ASCII letter or an underscore, followed by
zero or more ASCII letters, underscores or digits. On the other
hand, it seems id(str(5)) is persistent as well, so the current
implementation seems slightly simplified compared to the
perceived need. Anyway, this is just an implementation choice
made to improve performance, nothing to rely on.
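
A short sketch of what "nothing to rely on" means in practice,
assuming Python 3, where the explicit interning function is
sys.intern (in the Python 2 of this thread it was the builtin
intern). The strings are built at run time so the compiler cannot
merge them.

```python
import sys

# Two equal strings assembled at run time.
a = ''.join(['some', ' ', 'string'])
b = ''.join(['some', ' ', 'string'])

# Equality is what the language guarantees.
print(a == b)  # True

# Identity of a and b is up to the implementation, so don't test for it.
# If you really need a single shared object, ask for it explicitly.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)  # True
```

In other words, compare strings with == and leave the caching to the
interpreter.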