[Python-Dev] Rethinking intern() and its data structure

Collin Winter collinw at gmail.com
Thu Apr 9 18:29:00 CEST 2009


Hi John,

On Thu, Apr 9, 2009 at 8:02 AM, John Arbash Meinel
<john at arbash-meinel.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I've been doing some memory profiling of my application, and I've found
> some interesting results with how intern() works. I was pretty surprised
> to see that the "interned" dict was actually consuming a significant
> amount of total memory.
> To give the specific values, after doing:
>  bzr branch A B
> of a small project, the total memory consumption is ~21MB

[snip]

> Anyway, I think the internals of intern() could be done a bit better. Here
> are some concrete things:

[snip]

Memory usage is definitely something we're interested in improving.
Since you've already looked at this in some detail, could you try
implementing one or two of your ideas and see if it makes a difference
in memory consumption? Changing from a dict to a set looks promising,
and should be a fairly self-contained way of starting on this. If it
works, please post the patch on http://bugs.python.org with your
results and assign it to me for review.
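
As a rough way to get a first number before touching the C code, something
like the sketch below compares the container overhead of a dict that maps
each string to itself against a set of the same strings. This is only an
illustration: the helper name table_sizes and the sample strings are made
up, sys.getsizeof() reports the container alone (not the strings), and the
result depends on the interpreter version and table size, so it says
nothing definitive about the interpreter's own interned table.

```python
import sys


def table_sizes(strings):
    """Return (dict_bytes, set_bytes) for two containers holding `strings`.

    Hypothetical helper for illustration; it measures ordinary Python
    containers, not the interpreter's internal "interned" table.
    """
    # Dict-style table: every string is stored as both key and value.
    as_dict = dict((s, s) for s in strings)
    # Set-style table: every string is stored once, as the element.
    as_set = set(strings)
    # getsizeof() counts only the container's own memory, not the strings.
    return sys.getsizeof(as_dict), sys.getsizeof(as_set)


if __name__ == "__main__":
    # Assumed workload: 100,000 distinct identifier-like strings.
    sample = ["name_%d" % i for i in range(100000)]
    dict_bytes, set_bytes = table_sizes(sample)
    print("dict: %d bytes, set: %d bytes" % (dict_bytes, set_bytes))
```

A real patch would still need to be measured against an actual workload
such as the bzr branch run, since that is where the interned table
reaches its real size.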

Thanks,
Collin Winter

