chunking a long string?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Nov 10 01:39:38 EST 2013


On Sun, 10 Nov 2013 09:14:28 +1100, Chris Angelico wrote:

> And as is typical of python-list, it's this extremely minor point
> that became the new course of the thread -

You say that as if it were a bad thing :-P


> my main point was not whether all,
> some, or no strings get interned, but that string interning makes the
> storage space of duplicate strings immaterial :)

True. It's not just a memory saver[1], but a time saver too. Using Python 
3.3:

py> from timeit import Timer
py> t1 = Timer('s == t', setup='s = "a b"*10000; t = "a b"*10000')
py> t2 = Timer('s == t', 
...     setup='from sys import intern; s = intern("a b"*10000); '
...           't = intern("a b"*10000)')
py> min(t1.repeat(number=100000))
7.651959054172039
py> min(t2.repeat(number=100000))
0.00881262868642807


String equality takes a short-cut by checking for identity first. If both 
strings are interned, equal strings are the same object, so the comparison 
can return without looking at a single character.
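To see the identity short-cut directly rather than infer it from timings, 
here is a minimal sketch, assuming CPython. The variable names and the 
repeat count n are only for illustration; the strings are built at runtime 
so that the two uninterned copies really are separate objects.

```python
from sys import intern

# Build the strings at runtime (n is a variable), so the compiler
# cannot fold both expressions into one shared constant.
n = 10000
s = "a b" * n
t = "a b" * n

# Equal contents, but two separate objects: == has to compare characters.
equal_plain = (s == t)
same_object_plain = (s is t)

# intern() returns the one canonical object for a given string value,
# so both calls hand back the very same object.
si = intern(s)
ti = intern(t)
equal_interned = (si == ti)
same_object_interned = (si is ti)

print(equal_plain, same_object_plain)        # equal, but distinct objects
print(equal_interned, same_object_interned)  # equal, and the same object
```

The uninterned pair is the slow case in the timings above: the strings 
match, so the comparison has to walk all the way to the end before it can 
say so.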



[1] Assuming that you actually do have duplicate strings. If every string 
is unique, interning them potentially wastes memory.
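As a rough illustration of the duplicate case, the sketch below counts 
distinct string objects before and after interning. It assumes CPython; 
count_objects and the "chunk-N" data are made up for the example and are 
not from the thread.

```python
from sys import intern

def count_objects(strings):
    """Count distinct string objects (by identity, not by value)."""
    return len({id(s) for s in strings})

# Hypothetical data: 1000 strings built at runtime, only 3 distinct values.
words = ["chunk-%d" % (i % 3) for i in range(1000)]

# Each equal value is replaced by its single canonical interned object.
interned = [intern(w) for w in words]

print(count_objects(words))     # one object per list element
print(count_objects(interned))  # one object per distinct value
```

If all 1000 values were different, the second count would be no smaller 
than the first, and the interpreter would still have to keep an entry for 
each string in its intern table.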



-- 
Steven


