[Numpy-discussion] Dynamic array list implementation

Chris Barker chris.barker at noaa.gov
Thu Dec 31 13:08:34 EST 2015


On Wed, Dec 30, 2015 at 6:34 AM, Nicolas P. Rougier <
Nicolas.Rougier at inria.fr> wrote:

>
> > On 28 Dec 2015, at 19:58, Chris Barker <chris.barker at noaa.gov> wrote:
> >
> > >>> python benchmark.py
> > Python list, append 100000 items: 0.01161
> > Array list, append 100000 items: 0.46854
> >
> > are you pre-allocating any extra space? if not -- it's going to be
> really, really pokey when adding a little bit at a time.
>
>
> Yes, I’m preallocating but it might not be optimal at all given your
> implementation is much faster.
> I’ll try to adapt your code. Thanks.


sounds good -- I'll try to take a look at yours soon - maybe we can merge
the projects. MIne is only operational in one small place, I think.

-CHB






>
> >
> > With my Accumulator class:
> >
> >
> https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py
> >
> > I pre-allocate a larger numpy array to start, and it gets re-allocated,
> with some extra, when filled, using ndarray.resize()
> >
> > this is quite fast.
> >
> > These are settable parameters in the class:
> >
> > DEFAULT_BUFFER_SIZE = 128 # original buffer created.
> > BUFFER_EXTEND_SIZE = 1.25 # array.array uses 1+1/16 -- that seems small
> to me.
> >
> >
> > I looked at the code in array.array (and list, I think), and it does
> stuff to optimize very small arrays, which I figured wasn't the use-case
> here :-)
> >
> > But I did a bunch of experimentation, and as long as you pre-allocate
> _some_ it doesn't make much difference how much :-)
> >
> > BTW,
> >
> > I just went in an updated and tested the Accumulator class code -- it
> needed some tweaks, but it's working now.
> >
> > The cython version is in an unknown state...
> >
> > some profiling:
> >
> > In [11]: run profile_accumulator.py
> >
> >
> > In [12]: timeit accum1(10000)
> >
> > 100 loops, best of 3: 3.91 ms per loop
> >
> > In [13]: timeit list1(10000)
> >
> > 1000 loops, best of 3: 1.15 ms per loop
> >
> > These are simply appending 10,000 integers in a loop -- with teh list,
> the list is turned into a numpy array at the end. So it's still faster to
> accumulate in a list, then make an array, but only a about a factor of 3 --
> I think this is because you are staring with a python integer -- with the
> accumulator function, you need to be checking type and pulling a native
> integer out with each append. but a list can append a python object with no
> type checking or anything.
> >
> > Then the conversion from list to array is all in C.
> >
> > Note that the accumulator version is still more memory efficient...
> >
> > In [14]: timeit accum2(10000)
> >
> > 100 loops, best of 3: 3.84 ms per loop
> >
> > this version pre-allocated the whole internal buffer -- not much faster
> the buffer re-allocation isn't a big deal (thanks to ndarray.resize using
> realloc(), and not creating a new numpy array)
> >
> > In [24]: timeit list_extend1(100000)
> >
> > 100 loops, best of 3: 4.15 ms per loop
> >
> > In [25]: timeit accum_extend1(100000)
> >
> > 1000 loops, best of 3: 1.37 ms per loop
> >
> > This time, the stuff is added in chunks 100 elements at a time -- the
> chunks being created ahead of time -- a list with range() the first time,
> and an array with arange() the second. much faster to extend with arrays...
> >
> > -CHB
> >
> >
> >
> > --
> >
> > Christopher Barker, Ph.D.
> > Oceanographer
> >
> > Emergency Response Division
> > NOAA/NOS/OR&R            (206) 526-6959   voice
> > 7600 Sand Point Way NE   (206) 526-6329   fax
> > Seattle, WA  98115       (206) 526-6317   main reception
> >
> > Chris.Barker at noaa.gov
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at scipy.org
> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20151231/8c9dac38/attachment.html>


More information about the NumPy-Discussion mailing list