[pypy-dev] Copy of list

Tuom Larsen tuom.larsen at gmail.com
Tue Sep 29 12:27:25 CEST 2015


Hi Armin,

thanks a lot, both for the explanation and the fix! I will try it soon.

Have a nice day!

Tuom

PS: The speed difference came from a larger piece of code, which I tried
to reduce to a "minimal viable test case". Hence the `timeit`, where
it showed up as well. But in any case, thanks a lot once more!


On Tue, Sep 29, 2015 at 9:25 AM, Armin Rigo <arigo at tunes.org> wrote:
> Hi Tuom,
>
> On Tue, Sep 29, 2015 at 7:31 AM, Tuom Larsen <tuom.larsen at gmail.com> wrote:
>> Please, let me rephrase my question: currently I use `[:]` because it
>> is faster in CPython (0.131 usec vs 0.269 usec per loop). I almost
>> don't mind changing it to `list()` because of PyPy, but I was wondering
>> what PyPy developers recommend. I don't understand why `[:]` is
>> twice as slow as `list()`, as it seems it should do the same thing
>> (create a list and copy the content).
>
> Looking at the JIT logs, it is tripped up by an RPython function with a
> loop in its slow path.  Fixed in 4e688540cfe9.
>
> There is still a bit of overhead.  For example, lst[:] is equivalent
> to lst[0:9223372036854775807].  The general logic looks like this:
> when doing lst[a:b], if b > len(lst) then replace b with len(lst).
> Here this means a check whether 9223372036854775807 > len(lst)...  A
> list cannot possibly be that long, but this knowledge is not encoded
> explicitly.
>
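
The clamping rule described above can be sketched at the Python level.
This is only an illustration, not PyPy's RPython code: the helper name
`clamp_slice_bounds` is hypothetical, and `sys.maxsize` equals
9223372036854775807 only on a 64-bit build.

```python
import sys


def clamp_slice_bounds(start, stop, length):
    """Hypothetical helper: clamp non-negative bounds as lst[start:stop] would."""
    if stop > length:      # the check discussed above
        stop = length
    if start > stop:       # an empty slice if start lies past the end
        start = stop
    return start, stop


lst = list(range(5))

# slice.indices() reports the effective (start, stop, step) for a given length.
print(slice(None, None).indices(len(lst)))

# The "huge" upper bound is simply replaced by len(lst).
print(clamp_slice_bounds(0, sys.maxsize, len(lst)))

# Both spellings produce an equal, new list.
print(lst[0:sys.maxsize] == lst[:], lst[:] is not lst)
```
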
> Yes, we could improve that in the future.
> But these are really advanced details.  You should write 'list()' or
> '[:]', whichever feels more natural to you, or maybe whichever is
> faster in CPython if it makes an important difference there.  Using
> 'timeit' to measure microbenchmarks in PyPy may or may not give a
> useful result.  In this case it did, but only after you stopped using
> range(), and only because we don't have more advanced optimizations
> that realize that the resulting list is not needed at all.  In
> general, you should not rely on it.
>
> What you should do instead is measure how much time is spent in some
> real loop of your algorithm, and compare it with variants.  (Make sure
> every variant is run in its own process, otherwise the JITting of
> similar pieces of code might interfere in unexpected ways.)  If you're
> lucky you may be able to find a variant that is overall much faster.
> If you're not, it means that what you're changing is not relevant for
> performance.
>
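
One way to follow the advice above is to start a fresh interpreter for
each variant.  The following is a minimal sketch under assumptions: the
loop in `TEMPLATE` is a hypothetical stand-in for a real loop of your
algorithm, and the names `TEMPLATE`, `VARIANTS` and `time_variant` are
made up for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for "some real loop of your algorithm".
TEMPLATE = """
import time
data = list(range(1000))
total = 0
start = time.perf_counter()
for _ in range(20000):
    copy = {expr}
    total += len(copy)   # use the copy so its result is actually needed
print(time.perf_counter() - start)
"""

VARIANTS = {"slice": "data[:]", "list": "list(data)"}


def time_variant(expr, interpreter=sys.executable):
    """Run one variant in its own process and return the elapsed seconds."""
    result = subprocess.run(
        [interpreter, "-c", TEMPLATE.format(expr=expr)],
        check=True,
        capture_output=True,
        text=True,
    )
    return float(result.stdout)


if __name__ == "__main__":
    for name, expr in VARIANTS.items():
        print(name, time_variant(expr))
```

Passing the path of a PyPy binary as `interpreter` would time the same
variants under PyPy instead of the interpreter running the script.
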
>
> A bientôt,
>
> Armin.

