[Python-Dev] redefining is

Tim Peters tim.one at comcast.net
Tue Mar 23 21:48:24 EST 2004


[François Pinard]
> There is a question which crossed my mind recently.  While I would
> never use `is' between ordinary strings, I might be tempted to use
> `is' between explicitly interned strings, under the hypothesis that
> for example:
>
>     a = intern('a b')
>     b = intern('a b')
>     a is b
>
> dependably prints `True'.  However, I remember a thread in this forum
> according to which strings might be un-interned on the fly -- but I do
> not remember the details; I presume it was for strings which the
> garbage collector may not reach anymore.

Right, interned strings are no longer immortal (they used to be, but that
changed).  That can't hurt the kind of case you show above, though -- so
long as either a or b remains bound to its interned string, that string is
reachable and so won't be reclaimed by the garbage collector.
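
A minimal sketch of that guarantee, assuming Python 3, where the builtin
intern() used in this thread is spelled sys.intern().  The helper name
interned_pair is made up for illustration.  The two strings are built at run
time so the compiler cannot merge them as constants; the identity holds
because `a` keeps the interned string alive while `b` is looked up.

```python
import sys

def interned_pair(parts):
    """Intern two separately built but equal strings and return both.

    Hypothetical helper, for illustration only.
    """
    a = sys.intern(' '.join(parts))   # first lookup: this object becomes the interned one
    b = sys.intern(' '.join(parts))   # second lookup: `a` is still bound, so it is found
    return a, b

a, b = interned_pair(['a', 'b'])
print(a == b)   # equal contents
print(a is b)   # same object, as long as a reference is kept
```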

The kind of thing that can fail now is so obscure I doubt anyone was doing
it:  relying on the id() of an interned string remaining valid forever.
Like doing

    address_of_a = id(intern('a b'))

and then later assuming that

    id(some_string) == address_of_a

if and only if some_string is the interned string 'a b'.  That was
"reliable" when interned strings were immortal, but not anymore.  For
example (and it may or may not work anything like this under your Python
installation -- depends on internal vagaries):

>>> id(intern('a b'))
6973920
>>> id(intern('c d'))
6973824
>>> id(intern('e f'))
6973760
>>> id(intern('g h')) # it turns out this one is a repeat of the first
6973920
>>>
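
One way to avoid depending on a stale id() is to hold a reference to the
interned string itself and compare with `is`.  The sketch below assumes
Python 3 (sys.intern() rather than the builtin intern()), and the class name
InternedKey is hypothetical.  Because the object stays referenced, it cannot
be reclaimed, so its address cannot be handed to a different string.

```python
import sys

class InternedKey:
    """Remember an interned string by reference rather than by id().

    Hypothetical class, for illustration only.
    """

    def __init__(self, text):
        # Keeping the object itself keeps it reachable, so it stays interned.
        self._ref = sys.intern(text)

    def matches(self, candidate):
        # Identity test against the live object, not against a saved address.
        return candidate is self._ref

key = InternedKey(' '.join(['a', 'b']))
print(key.matches(sys.intern('a b')))   # the interned 'a b' is the held object
```

Saving only id(sys.intern(...)) and dropping the string leaves nothing that
keeps the object alive, which is the situation the session above illustrates.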

> There are a few real-life cases where speed considerations would
> invite programmers to use `is' over `==' for strings, given they all
> get interned to start with so the speed-up could be gleaned later.
> The fact that `is' exposes the implementation is quite welcome in
> such cases.

Sure, and I've done that myself in several programs too.  I like "is"!
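
A small sketch of that pattern, assuming Python 3's sys.intern() and a made-up
function count_keyword: every incoming string is interned once up front, and
the later comparisons use `is` instead of `==`.  Whether this is measurably
faster depends on the workload; the example only shows that the identity test
is sound once both sides are interned and kept referenced.

```python
import sys

KEYWORD_IF = sys.intern('if')   # kept bound for the life of the module

def count_keyword(tokens, keyword=KEYWORD_IF):
    """Count tokens identical to an interned keyword.

    Hypothetical function, for illustration only.
    """
    # Intern once at the boundary; equal strings collapse to one object.
    interned = [sys.intern(token) for token in tokens]
    # After interning, identity and equality agree for these strings.
    return sum(1 for token in interned if token is keyword)

tokens = ['if', 'x', ''.join(['i', 'f']), 'else']
print(count_keyword(tokens))
```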



