[Python-Dev] CPython optimization: storing reference counters outside of objects

Antoine Pitrou solipsis at pitrou.net
Sun May 22 14:48:37 CEST 2011


Hello,

On Sun, 22 May 2011 01:57:55 +0200
Artur Siekielski <artur.siekielski at gmail.com> wrote:
> 1. CPU cache lines (64 bytes on x86) containing the beginning of a
> PyObject are very often invalidated, resulting in losing many chances
> to use the CPU caches

Mutating data doesn't invalidate a cache line in the cache of the core
doing the write. It just marks the line dirty, so it has to be written
back to memory at some point.

> 2. The copy-on-write after fork() optimization (Linux) is almost
> useless in CPython, because even if you don't modify data directly,
> refcounts are modified, and PyObjects with refcounts inside are spread
> all over the process's memory (and one small refcount modification
> causes the whole page - 4 kB - to be copied into a child process).

Indeed.
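
As a rough illustration (a sketch assuming CPython, where
sys.getrefcount() reports the count kept in the object header;
refcount_delta is a name made up for this example), merely binding
another name to an object already changes its refcount, and that write
is what dirties the page in a forked child:

```python
import sys


def refcount_delta():
    """Return how much an object's refcount changes when it is only aliased."""
    data = object()                  # fresh object, referenced only by 'data'
    before = sys.getrefcount(data)
    alias = data                     # read-only use: just one more reference
    after = sys.getrefcount(data)
    del alias                        # drop the extra reference again
    return after - before


print(refcount_delta())
```

Nothing in the object's payload was modified here, yet the memory holding
the object was written to.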

> I'm not a compiler/profiling expert so the main question is whether
> such a design can work, and maybe someone was thinking about something
> similar? And has CPython been profiled for CPU cache usage?

This has already been proposed a couple of times. I guess what's needed
is for someone to experiment and post benchmark results.

Regards

Antoine.



