[issue19187] Use a set for interned strings

STINNER Victor report at bugs.python.org
Wed Oct 9 21:17:40 CEST 2013


STINNER Victor added the comment:

Raymond> FWIW, I'm dubious that there will be any benefit from this at all.  The savings of one pointer in the dictionary is likely to be insignificant compared to the size of the string objects themselves.

As I wrote on python-dev, the dictionary is usually the largest memory block, at least at Python startup. The dictionary alone (not counting the strings it holds) takes between 192 KB and 1.5 MB on x86_64.

In the implementation of PEP 454 (issue #18874), I added a function to get the length and size of the dictionary of interned Unicode strings.

Objects/unicodeobject.c:

PyObject*
_PyUnicode_GetInterned(void)
{
    return interned;
}

tracemalloc.get_unicode_interned():

http://hg.python.org/features/tracemalloc/file/b797779940a5/Modules/_tracemalloc.c#l4606

You can use this function to see how many KB are saved. In embedded systems, every byte of memory counts :-)
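
For a rough idea without the tracemalloc branch, the sketch below compares the size of a dict and a set holding the same strings. It is only an approximation: it uses sys.getsizeof() on ordinary containers instead of the real interned dictionary, the helper name container_sizes() and the sample strings are made up for illustration, and the result depends on the Python version and platform, so it may not show a saving at all.

```python
import sys


def container_sizes(strings):
    """Return (dict_bytes, set_bytes) for containers holding the same strings.

    Only the container itself is measured, not the string objects.
    """
    # The interned dictionary maps each string to itself, so mimic that here.
    as_dict = {s: s for s in strings}
    as_set = set(strings)
    return sys.getsizeof(as_dict), sys.getsizeof(as_set)


if __name__ == "__main__":
    # Hypothetical sample data; a real process interns identifiers, names, etc.
    names = ["name_%d" % i for i in range(10000)]
    dict_bytes, set_bytes = container_sizes(names)
    print("dict: %.1f KB" % (dict_bytes / 1024))
    print("set:  %.1f KB" % (set_bytes / 1024))
    print("difference: %.1f KB" % ((dict_bytes - set_bytes) / 1024))
```

The numbers only describe the containers built in this script. To measure the real interned dictionary, use the function above.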

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19187>
_______________________________________

