[Python-Dev] Alternative implementation of interning, take 2

M.-A. Lemburg mal@lemburg.com
Fri, 12 Jul 2002 22:31:41 +0200


Tim Peters wrote:
> [M.-A. Lemburg]
> 
>>If you could spell out what exactly you mean by "indirect interning"
>>that would help.
> 
> 
> Actually, I don't think it would -- the issue is whether the possibility for
> the ob_sinterned member of a PyStringObject not to *be* the string object
> itself ever saves time in your extensions, and it's darned hard to guess
> that.  If you apply the attached patch to current CVS, though, it will tell
> you whenever your code benefits from it.

Cool, I'll try that... hmm, I'll have to backport it to Python 2.1.3
though ;-)

> AFAICT, there are only 3 routines where it *might* save cycles (but note
> that checking for the possibility costs cycles whether or not it pays; it's
> a net loss when it doesn't pay):
> 
> + PyDict_SetItem:  I believe this is the only real possibility for gain.  If
> it ever helps you here, the patch arranges to print
> 
>     ii paid on a setitem

Scanning my source code: I hardly ever call PyDict_SetItem(); most calls
are to PyDict_SetItemString().

> to stderr whenever it does pay.  I haven't yet seen that get printed.
> 
> + PyString_InternInPlace:  Whenever it pays here, the patch spits
> 
>     ii paid on an InternInPlace

I do use this API, but only in mxURL and mxXMLTools (which is
closed source and uses the evil code I mentioned, see below ;-).

> That triggers 6 times in the Python test suite, all from test_descr.  Since
> this one is an optimization *of* setting ob_sinterned, it's a
> snake-eating-its-tail kind of thing -- it's of no real benefit unless
> ob_sinterned pays off somewhere else too.
> 
> + string_hash:  The patch spits
> 
>     ii paid on a hash???
> 
> The question marks are there because I don't see how it's possible for this
> to get printed.
> 
> 
>>What I do need and rely on is the fact that the
>>Python compiler interns all constant strings and identifiers in
>>Python programs. This makes switching like so:
> 
> 
> Ya, while that's evil, it's not affected by indirect interning.

Cool :-)

If Guido should ever decide to rip this out, I can always switch
to a different technique, e.g. use my own interning token type.
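
For illustration only, such a token type might look roughly like the
sketch below. It is written in Python rather than C for brevity, and the
names Token, _registry, START, STOP and dispatch are hypothetical; they
are not taken from any mx extension. The idea is that each distinct name
maps to exactly one object, so code can switch on tokens with identity
checks ("is") without depending on how the interpreter interns strings.

```python
class Token(object):
    """Hypothetical interned token: at most one instance per name."""

    # Maps each name to the single Token instance created for it.
    _registry = {}

    def __new__(cls, name):
        token = cls._registry.get(name)
        if token is None:
            # First request for this name: create and remember it.
            token = super().__new__(cls)
            token.name = name
            cls._registry[name] = token
        return token

    def __repr__(self):
        return "Token(%r)" % (self.name,)


# Assumed example tokens; a real extension would define its own set.
START = Token("start")
STOP = Token("stop")


def dispatch(token):
    # Identity comparisons only; the name strings are never compared here.
    if token is START:
        return "starting"
    elif token is STOP:
        return "stopping"
    return "unknown"
```

Asking for Token("stop") again, even with a name string built at run
time, hands back the same object as STOP, so the identity test in
dispatch() still matches. The price is one dictionary lookup whenever a
token is requested.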

-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting:                               http://www.egenix.com/
Python Software:                    http://www.egenix.com/files/python/