Internals of interning strings
Michael Hudson
mwh21 at cam.ac.uk
Fri Mar 24 03:43:42 EST 2000
"Jason Stokes" <jstok at bluedog.apana.org.au> writes:
[schnipp]
>
> On the next call to intern, the string referred to by the name "d" has the
> same value as that referred to by "a". When we go to intern it, we find the
> value of the string referred to by d is already present in the dictionary.
> The string referred to by a is returned as the result of the function.
> Also, the interpreter sets the internal field "ob_sinterned" of the object
> referred to by d to *also* point to a. Now, anywhere the object referred to
> by d is used certain operations can be slightly optimized. If you invoke
> intern on the object referred to by d again, the PyString_InternInPlace
> routine sees that its "ob_sinterned" field already points to an object, and
> returns that, instead of looking it up in the dictionary again. And if "d"
> is hashed, the hash function returns the cached hash value of the object
> currently pointed to by "a".
Yup, I think that's right.
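To make the flow concrete, here is a toy model of it in Python. It is only a sketch of the behaviour described in the quoted paragraph, not the C source: the names ToyString, intern_in_place, _interned and lookups are made up for illustration, and the sinterned attribute merely stands in for the "ob_sinterned" field.

```python
class ToyString:
    """Hypothetical stand-in for a string object with an ob_sinterned-style field."""

    def __init__(self, value):
        self.value = value
        self.sinterned = None   # models the "ob_sinterned" pointer


_interned = {}   # value -> canonical ToyString; models the intern dictionary
lookups = 0      # counts dictionary probes, for illustration only


def intern_in_place(s):
    """Return the canonical object for s, caching it on s itself."""
    global lookups
    if s.sinterned is not None:
        # Fast path: this object was resolved before, skip the dictionary.
        return s.sinterned
    lookups += 1
    # First string seen with this value becomes the canonical one.
    canonical = _interned.setdefault(s.value, s)
    s.sinterned = canonical
    return canonical


a = ToyString("spam")
d = ToyString("spam")
first = intern_in_place(a)    # a becomes the canonical object
second = intern_in_place(d)   # dictionary hit: returns a, d.sinterned now points to a
third = intern_in_place(d)    # fast path: no dictionary probe this time
```

In this model the third call never touches the dictionary, which is the saving the extra field buys.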
> I don't know if that's clear, but I didn't want to include the whole source
> listing. Anyway, the question is: is this the only reason for the extra
> entry "ob_sinterned" in the PyString struct? That is, a couple of
> optimisations, costing an extra 4 bytes per string object?
What I think you're missing is that the `intern' builtin is an
interface to what is essentially an *internal* optimisation strategy.
It's there mainly to optimise the lookup of strings in dictionaries -
because if two strings are both interned, then testing them for
equality is just a pointer comparison. The compiler automatically
interns likely-looking strings, so when executing
    self.foobar = self.foobar + 1
the string "foobar" is already interned and so the lookups of it in
self.__dict__ are quicker than otherwise.
At least, that's my understanding of the situation.
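A small sketch of the same point, with some assumptions stated up front: it targets a current CPython 3, where the `intern' builtin lives at sys.intern (in the interpreter under discussion it is a plain builtin), and it relies on CPython interning the names stored in a code object's co_names, which is an implementation detail rather than a language guarantee.

```python
import sys

# Build two equal strings at run time so the compiler cannot share them.
parts = ["foo", "bar"]
a = "".join(parts)
b = "".join(parts)

equal_by_value = (a == b)      # same characters

# After interning, both results refer to one canonical object, so an
# identity ("pointer") check is enough to know they are equal.
ia = sys.intern(a)
ib = sys.intern(b)
same_object = (ia is ib)

# Attribute names used in compiled code are stored in co_names.
code = compile("self.foobar = self.foobar + 1", "<example>", "exec")
name = [n for n in code.co_names if n == "foobar"][0]

# If the compiler already interned the name, interning it again
# hands back the very same object.
already_interned = (sys.intern(name) is name)
```

The dictionary lookup for self.foobar can then usually succeed on the identity check alone, without comparing the characters.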
You can build Python without INTERN_STRINGS to see how the space/time
behaviour changes, but I doubt you'll like the results.
Cheers,
M.
--
very few people approach me in real life and insist on proving they are
drooling idiots. -- Erik Naggum, comp.lang.lisp