[Python-Dev] The untuned tunable parameter ARENA_SIZE

Tim Peters tim.peters at gmail.com
Mon Jun 5 23:10:49 EDT 2017


[Larry]
> ...
> Oh!  I thought it also allocated the arenas themselves, in a loop.  I
> thought I saw that somewhere.  Happy to be proved wrong...

There is a loop in `new_arena()`, but it doesn't do what a casual
glance may assume it's doing ;-)  It's actually looping over the
newly-allocated teensy arena descriptor structs, linking them in to a
freelist and recording that they're _not_ (yet) associated with any
address space.
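
For anyone who'd rather see that loop than read about it, here is a
rough Python sketch of the idea.  It is not the C code in obmalloc.c:
the names `ArenaDescriptor`, `next_free` and `grow_descriptor_table`
are made up for illustration, and the count of 16 is just an example
batch size.  The point is only that the loop touches descriptors, not
arena memory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class ArenaDescriptor:
    """Hypothetical stand-in for a tiny arena descriptor struct."""
    address: int = 0  # 0 means "not (yet) associated with any address space"
    next_free: Optional["ArenaDescriptor"] = None


def grow_descriptor_table(table: List[ArenaDescriptor],
                          count: int) -> Optional[ArenaDescriptor]:
    """Append `count` fresh descriptors and return the freelist head.

    No arena memory is allocated here; the loop only links the new
    descriptors together and marks each one as unassociated.
    """
    fresh = [ArenaDescriptor() for _ in range(count)]
    for i, desc in enumerate(fresh):
        desc.address = 0
        desc.next_free = fresh[i + 1] if i + 1 < count else None
    table.extend(fresh)
    return fresh[0] if fresh else None


def freelist_length(head: Optional[ArenaDescriptor]) -> int:
    """Walk the freelist and count its descriptors."""
    n = 0
    while head is not None:
        n += 1
        head = head.next_free
    return n


table: List[ArenaDescriptor] = []
head = grow_descriptor_table(table, 16)  # 16 is an illustrative batch size
```

After the call, `freelist_length(head)` is 16 and every descriptor
still has `address == 0`, i.e. no arena has actually been allocated.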


[Tim]
>> So at most 9 arenas ("highwater mark") were ever simultaneously allocated [by the
>> time the REPL prompt appeared in a 64-bit 3.6.1].

> ... though not completely off-base.

Yes, 9 is in the ballpark of 16.

>> I was hoping to spur a discussion of much higher level issues.  I bet
>> Larry was too ;-)

> Actually I was hoping everyone would just tell me how right I was and thank
> me for my profound insights.

Thank you!  It should be thought about again.

I think _some_ increase of arena size should be a no-brainer, but I
don't expect it to help a lot.  For reasons already partly explained,
I expect we'd get much better bang for the buck by increasing the pool
size:

- Roughly speaking, we bash into a slows-the-code pool boundary 64
times as frequently as an arena boundary (a 256 KiB arena holds 64
4 KiB pools).  If the arena size increases, that ratio only gets
worse.

- On 64-bit boxes the bytes lost to pool headers increased, but the
pool size did not.  Thus we're guaranteed to "waste" a higher
_percentage_ of allocated bytes than we did on a 32-bit box.

- The small object threshold also doubled.  Generally (not always),
the bytes lost to quantization increase the larger the size class.
For example, for the largest size class now, on a 64-bit box we can
only fit 7 512-byte objects into the 4096 - 48 = 4048 bytes that
remain in a pool.  So, in addition to the 48 pool header bytes, we
_also_ lose the 464 leftover bytes.  So we've lost 512 of the 4096
bytes:  12.5%.  That's certainly not typical, but in any case
quantization losses as a percentage of total bytes decrease the
larger the pool.
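
The arithmetic in the points above is easy to replay.  The sketch
below is only a back-of-the-envelope model, not anything lifted from
obmalloc.c: `pool_waste` is a made-up helper, the 256 KiB arena and
4 KiB pool are the 3.6-era sizes, the 48-byte header is the 64-bit
figure quoted above, and the larger pool sizes are hypothetical.  It
also assumes the header would stay at 48 bytes if the pool grew.

```python
ARENA_SIZE = 256 * 1024  # bytes; 3.6-era arena size
POOL_SIZE = 4 * 1024     # bytes; 3.6-era pool size
POOL_HEADER = 48         # bytes; 64-bit pool header size quoted above


def pool_waste(pool_size, obj_size, header=POOL_HEADER):
    """Return (objects per pool, leftover bytes, fraction of pool lost).

    "Lost" counts both the header and the tail too small to hold
    another object of this size class.
    """
    usable = pool_size - header
    count = usable // obj_size
    leftover = usable - count * obj_size
    lost = header + leftover
    return count, leftover, lost / pool_size


# How many pool boundaries we cross per arena boundary.
pools_per_arena = ARENA_SIZE // POOL_SIZE

# Largest size class (512 bytes) at the current and at hypothetical
# larger pool sizes.
for size in (4096, 16384, 65536):
    count, leftover, frac = pool_waste(size, 512)
    print(f"pool={size:6d}  objects={count:4d}  "
          f"leftover={leftover}  lost={frac:.2%}")
```

For the 4096-byte pool this reproduces the 7 objects, 464 leftover
bytes and 12.5% loss worked out above.  Under the fixed-header
assumption the same 512 bytes are lost in the bigger pools, so the
percentage falls as the pool grows.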

Alas, I haven't thought of a plausibly practical way to replace
Py_ADDRESS_IN_RANGE unless the pool size increases a whole frickin'
lot :-(

