[pypy-dev] BList strategy

Fri May 6 04:56:22 EDT 2016

Hi Omer,

On 6 May 2016 at 09:02, Omer Katz <omer.drow at gmail.com> wrote:
> I'm wondering if PyPy could by a chance benefit from implementing the blist
> algorithm as a strategy. Possibly even switching it to be the default.
>
> What do you guys think?

It would certainly be interesting to try.

The problem with using *only* this approach is the following.  It is
marketed as having "similar performance on small lists", but that's a
"similar" on top of CPython.  That means that if we need (say) to read
three memory locations instead of one, with a test in the middle, then
it is "similar" on CPython; indeed it would hardly be a noticable
overhead with the rest of the CPython interpreter around.  However, it
does have more impact inside JITted code.  I fear we're going to
replace a direct memory read with a call to a helper, for every read.
It is possibly still reasonable; but possibly not.  To be tried,
basically.

In the past we tried something similar with "ropes" (for strings).  In
the future we might try it with utf-8-encoded unicode strings.  They
all have the same drawback, and it needs to be tried and measured on a
case-by-case basis.

As a workaround for that problem, we could implement it as a list
strategy that we only switch to if:

* the number of items is large enough;
* and we try to do one of the operations that would take much less
time in the blist world.

In that way, we would keep most lists unmodified, and only switch to
blists when there is a clear benefit.

A bientôt,

Armin.