interning strings

Tim Peters tim.peters at gmail.com
Sun Nov 7 21:27:24 EST 2004


[Mike Thompson]
> ...
> From your explanation there seem to be no language rules, just
> implementation accidents.  And none of those will be particularly
> helpful in my case.

String interning is purely an optimization.  Python added the concept
to speed its own name lookups, and the rules it uses for
auto-interning are effective for that.  It wasn't necessary to expose
the interning facilities to users to meet its goal, and, especially
since interned strings were originally immortal, it would have been a
horrible idea to intern all strings.  The machinery was exposed just
because it's Pythonic to expose internals when reasonably possible. 
There wasn't, and shouldn't be, an expectation that exposed internals
will be perfectly suited as-is to arbitrary applications.

> However, I still think I'm going to try using the builtin 'intern'
> rather than my own dict cache.

That's fine -- that's why it got exposed.  Don't assume that any
string is interned unless you explicitly intern() it, and you'll be
happy (and it doesn't hurt to intern() a string that's already
interned -- you just get back a reference to the already-interned copy
then).
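
For instance (a minimal sketch, not from the original post; intern() was
a builtin when this was written, and later Python versions moved it to
sys.intern):

    try:
        intern                       # a builtin in Python 2
    except NameError:
        from sys import intern       # later versions moved it into sys

    a = ''.join(['sym', 'bol'])      # built at run time, so not auto-interned
    b = ''.join(['sym', 'bol'])
    print(a == b)                    # True -- same contents
    print(a is b)                    # typically False -- distinct objects

    a = intern(a)
    b = intern(b)                    # harmless: returns the copy interned via a
    print(a is b)                    # True -- identity comparison now suffices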

[earlier]
> I can find documentation of a builtin called 'intern' but its use seems
> frowned upon these days.

Not by me, but it's never been useful to *most* apps, apart from the
indirect benefits they get from Python's internal uses of string
interning.  It's rare that an app really wants some strings stored
uniquely, and possibly never that an app wants all strings stored
uniquely.  Most apps that use explicit string interning appear to be
looking for no more than a partial workalike for Lisp symbols.
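
A workalike along those lines needn't use intern() at all; the "own dict
cache" approach mentioned above might look something like this sketch
(the symbol() helper name here is just for illustration):

    _symbols = {}                    # app-level table of canonical strings

    def symbol(s):
        """Return the canonical copy of s, remembering it on first sight."""
        return _symbols.setdefault(s, s)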


