[Python-Dev] Idea: more compact, interned string key only dict for namespace.

INADA Naoki songofacandy at gmail.com
Thu Jun 23 00:43:17 EDT 2016


Hi, Mark.  Thank you for your reply.

On Thu, Jun 23, 2016 at 10:30 AM, Mark Shannon <mark at hotpy.org> wrote:
> Hi all,
>
> I think we need some more data before going any further reimplementing
> dicts.
>
> What I would like to know is, across a set of Python programs (ideally a
> representative set), what the proportion of dicts in memory at any one time
> are:
>
> a) instance dicts
> b) other namespace dicts (classes and modules)
> c) data dicts with all string keys
> d) other data dicts
> e) keyword argument dicts (I'm guessing this is vanishingly small)
>
> I would expect that (a) far exceeds (b) and depending on the application
> also considerably exceeds (c), but I would like some real data.
> From that we can compute the (approximate) memory costs of the competing
> designs.

I think you're right.
But I don't have a clear idea of how to do it.
Is there any existing effort to collect statistics about dicts?
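
One rough way to get such numbers might be to walk the objects the
cycle GC knows about and bucket every dict by who owns it.  The sketch
below is only an approximation under several assumptions: dict_census
and the category labels are names made up here, dicts the GC does not
track are missed, empty dicts land in (c), keyword argument dicts (e)
cannot be told apart from data dicts, and reading __dict__ can create
a dict lazily on some objects (those new dicts are not in the snapshot,
so they are not counted).

```python
import gc
import types
from collections import Counter


def dict_census():
    """Roughly classify the GC-tracked dicts alive right now."""
    # Snapshot of tracked objects; holding it keeps them all alive.
    objects = gc.get_objects()
    namespace_kind = {}  # id(dict) -> category, for dicts used as a namespace
    for obj in objects:
        kind = type(obj)
        try:
            if issubclass(kind, types.ModuleType):
                namespace_kind.setdefault(id(vars(obj)), "b) module namespace")
            elif issubclass(kind, type):
                # vars(cls) is a read-only proxy; look through it for the dict.
                for ref in gc.get_referents(vars(obj)):
                    if type(ref) is dict:
                        namespace_kind.setdefault(id(ref), "b) class namespace")
            else:
                ns = getattr(obj, "__dict__", None)
                if type(ns) is dict:
                    namespace_kind.setdefault(id(ns), "a) instance")
        except Exception:
            continue  # exotic objects (dead proxies etc.) are simply skipped

    counts = Counter()
    for obj in objects:
        if type(obj) is not dict:
            continue
        kind = namespace_kind.get(id(obj))
        if kind is None:
            # An empty dict has vacuously "all string keys" and ends up in (c).
            if all(type(key) is str for key in list(obj)):
                kind = "c) data, all str keys"
            else:
                kind = "d) other data"
        counts[kind] += 1
    return counts
```

Calling dict_census() at a few points in a long-running program and
printing the returned Counter would give the proportions, which could
then be multiplied by the per-dict size of each design.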

>
> As an aside, if anyone is really keen to save memory, then removing the
> cycle GC header is the thing to do.
> That uses 24 bytes per object and *half* of all live objects have it.
> And don't forget that any Python object is really two objects, the object
> and its dict, so that is 48 extra bytes every time you create a new object.
>

It's a great idea.  But I can't do it before Python 3.6.

My main concern is not saving memory, but getting an ordered dict for
**kwargs without significant overhead.
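
To make the goal concrete, here is a minimal sketch of the behaviour
I mean.  It assumes an interpreter whose **kwargs dict preserves the
order in which the keywords were passed (CPython 3.6 and later do);
show is just a made-up helper name.

```python
def show(**kwargs):
    # With an ordered kwargs dict, iteration follows the call-site order.
    return list(kwargs)


print(show(b=1, a=2, c=3))
```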

If "ordered, except for key-sharing dicts" is acceptable, there is no problem.
A key-sharing compact dict is smaller than the current key-sharing dict of
Python 3.5 in most cases.
https://docs.google.com/spreadsheets/d/1nN5y6IsiJGdNxD7L7KBXmhdUyXjuRAQR_WbrS8zf6mA/edit#gid=0

Regards,

--
INADA Naoki  <songofacandy at gmail.com>

