[Python-Dev] Inplace operations for PyLong objects
Manciu, Catalin Gabriel
catalin.gabriel.manciu at intel.com
Thu Aug 31 19:35:34 EDT 2017
>On my machine, the more realistic code, with an implicit C loop,
>the_value = sum(the_increment for i in range(total_iters))
>gives the same value twice as fast as your explicit Python loop.
>(I cut total_iters down to 10**7).
Your code is faster for a number of reasons:
- range in Python 3 is implemented in C, so it is considerably faster,
and because your range only goes up to 10 ** 7, the fastest iterator
is used: rangeiterobject, whose 'next' function is implemented
with native C longs instead of CPython PyLongs, see:
rangeiter_next(rangeiterobject *r) in rangeobject.c
- my code also does some extra work to output a progress indicator
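
For reference, here is a minimal sketch of the two loop styles being compared.
The original benchmark script is not reproduced in this thread, so the function
names, the while-loop form of the explicit loop and the iteration counts below
are assumptions made for illustration. The iterator type names are a CPython
implementation detail.

```python
import timeit


def explicit_loop(increment, total_iters):
    """Explicit Python-level loop: every step runs in the bytecode interpreter."""
    value = 0
    i = 0
    while i < total_iters:
        value += increment
        i += 1
    return value


def implicit_loop(increment, total_iters):
    """Implicit C loop: sum() and the range iterator drive the iteration."""
    return sum(increment for _ in range(total_iters))


# CPython detail: a range whose bounds fit in a C long gets the fast
# iterator; a range that does not fit gets a slower, PyLong-based one.
small_iter_name = type(iter(range(10**7))).__name__
big_iter_name = type(iter(range(2**70))).__name__

if __name__ == "__main__":
    n = 10**5  # kept small so the sketch finishes quickly
    for fn in (explicit_loop, implicit_loop):
        seconds = timeit.timeit(lambda: fn(1, n), number=5)
        print(f"{fn.__name__}: {seconds:.3f}s")
    print(small_iter_name, big_iter_name)
```

Timings depend on the machine and the interpreter build, so no expected numbers
are given here.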
>You might check whether sum uses an in-place accumulator for ints.
- You're right: sum works with native longs until the accumulator overflows
or it encounters an item that is not a PyLong, and then it falls back to
PyNumber_Add. See:
static PyObject * builtin_sum_impl(PyObject *module, PyObject *iterable,
PyObject *start)
in bltinmodule.c
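
The two-stage strategy described above can be modelled roughly in pure Python.
This is only a sketch of the idea, not a transcription of builtin_sum_impl:
the name sum_sketch is hypothetical, the 64-bit bounds for a C long are an
assumption (they differ on some platforms), and the real function has further
fast paths that are left out here.

```python
import operator

# Assumption: a 64-bit C long. On platforms with a 32-bit long the
# bounds would be smaller.
LONG_MAX = 2**63 - 1
LONG_MIN = -2**63


def fits_in_long(value):
    """True if value is an exact int that a C long could hold."""
    return type(value) is int and LONG_MIN <= value <= LONG_MAX


def sum_sketch(iterable, start=0):
    it = iter(iterable)
    result = start

    if fits_in_long(start):
        acc = start  # stands in for the native C accumulator
        for item in it:
            if fits_in_long(item):
                candidate = acc + item
                if fits_in_long(candidate):
                    acc = candidate  # fast path: no new object in the C version
                    continue
            # Overflow or a non-int item: leave the fast path for good.
            result = operator.add(acc, item)
            break
        else:
            # The iterable was exhausted while still on the fast path.
            return acc

    # Generic path, corresponding to the PyNumber_Add fallback.
    for item in it:
        result = operator.add(result, item)
    return result
```

For example, sum_sketch(range(10)) never leaves the fast path, while
sum_sketch([2**62, 2**62, 1]) overflows on the second item and finishes
on the generic path.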
The focus of this experiment was inplace adds in general. While, as you've
shown, there are ways to write the loop optimally, the benchmark was written
as a huge loop just to showcase that there is an improvement using this
approach. The performance improvement is a result of not having to
allocate/deallocate a PyLong per iteration.
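
To illustrate the allocation in question: in stock CPython an int is immutable,
so "value += increment" builds a new PyLong and rebinds the name rather than
updating the old object. The sketch below shows that from the Python side. The
helper name is hypothetical, and the value is chosen well outside CPython's
small-int cache so that a fresh object is always created.

```python
def add_and_keep_alias(start, increment):
    """Return the updated value and a second reference to the original object."""
    value = start
    alias = value       # second reference to the original int object
    value += increment  # allocates a new int; the original is left untouched
    return value, alias


value, alias = add_and_keep_alias(10**20, 1)

# The original object still holds the old number, and the name 'value'
# now refers to a different object.
print(alias == 10**20, value == 10**20 + 1, value is not alias)
```

Because a second reference to the original object exists here, a true in-place
update would not be safe in this case, which is why the result has to be a new
object.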
A large Python program with lots of PyLong inplace operations (not just
adds; this can be applied to all PyLong inplace operations), whether or not
they occur in a loop, might benefit from such an optimization.
Thank you,
Catalin