interning strings
Mike Thompson
none.by.e-mail
Sun Nov 7 18:09:02 EST 2004
[snip very useful explanation]
>
> By the way, why would you want to mess with these implementation details?
> Use the == operator to compare strings and be happy ever after :-)
>
'==' won't help me, I'm afraid.
I need to improve the speed and memory footprint of an application which
reads in a very large XML document.
Some elements in the incoming documents can be filtered out, so I've
written my own SAX handler to extract just what I want. All the same,
the content being read in is substantial.
So, to further reduce memory footprint, my SAX handler tries to manually
intern (using dicts of strings) a lot of the duplicated content and
attributes coming from the XML documents. Also, I use the SAX feature
'feature_string_interning' to hopefully intern the strings used for
attribute names etc.
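
For anyone following along, here is a minimal sketch of the kind of handler described above, not my actual code. The class name InterningHandler, the helper parse() and the element name 'item' are made up for illustration, and it is written for a current Python 3 rather than the interpreter I am using. The dict maps each string to the first copy seen, so later equal strings are swapped for that shared copy.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_string_interning


class InterningHandler(ContentHandler):
    """Hypothetical handler: keeps one shared copy of each repeated string."""

    def __init__(self):
        super().__init__()
        self._cache = {}   # string -> first copy seen of that string
        self.values = []   # attribute values kept after filtering

    def _intern(self, s):
        # setdefault returns the stored copy if an equal string is cached,
        # otherwise it stores s and returns it
        return self._cache.setdefault(s, s)

    def startElement(self, name, attrs):
        if name != "item":   # filter: 'item' is an assumed element name
            return
        for key in attrs.getNames():
            self.values.append(self._intern(attrs.getValue(key)))


def parse(xml_bytes):
    parser = xml.sax.make_parser()
    try:
        # ask the parser to intern element and attribute names
        parser.setFeature(feature_string_interning, True)
    except xml.sax.SAXException:
        pass   # not every parser recognises or supports this feature
    handler = InterningHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler
```

Calling parse() on a document with many repeated attribute values should leave handler.values holding references to a small number of shared string objects rather than one object per occurrence.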
Which is all working fine, except that now, as a final step, I'd like
to understand interning a bit more.
From your explanation there seem to be no language rules, just
implementation accidents. And none of those will be particularly
helpful in my case.
However, I still think I'm going to try using the builtin 'intern'
rather than my own dict cache. That may provide an advantage, even if it
doesn't work with unicode.
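
A rough sketch of the two approaches side by side, assuming a current Python 3, where the builtin has moved to sys.intern and accepts ordinary (unicode) str objects. In the Python 2 series it is the builtin intern() and rejects unicode strings. The helper name dict_intern is made up for illustration.

```python
import sys


def dict_intern(cache, s):
    # hand-rolled cache: return the first copy seen of an equal string
    return cache.setdefault(s, s)


# Build two equal strings at runtime, as a parser would produce them
a = "".join(["attr", "-", "value"])
b = "".join(["attr", "-", "value"])

cache = {}
a1 = dict_intern(cache, a)
b1 = dict_intern(cache, b)   # same object as a1

a2 = sys.intern(a)
b2 = sys.intern(b)           # same object as a2
```

Either way the caller has to keep the returned object and drop the original; interning only saves memory if the duplicate copies are actually released.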
--
Mike