[Numpy-discussion] Dynamic array list implementation

Chris Barker chris.barker at noaa.gov
Thu Dec 24 13:23:24 EST 2015


On Thu, Dec 24, 2015 at 10:19 AM, Chris Barker <chris.barker at noaa.gov>
wrote:

> I'll try to get the code up on gitHub.
>

Hey look -- it's already there:

https://github.com/PythonCHB/NumpyExtras

too many gitHub accounts.....

Here is the list/growable array/ accumulator:

https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py

And here is the ragged array:

https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/ragged_array.py

I haven't touched either of these for a while -- not really sure what state
they are in.

-CHB





> It would be nice to combine efforts.
>
> -CHB
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>> In my case I need to ensure a contiguous storage to allow easy upload
>> onto the GPU.
>> But my implementation is quite slow, especially when you add one item at
>> a time:
>>
>> >>> python benchmark.py
>> Python list, append 100000 items: 0.01161
>> Array list, append 100000 items: 0.46854
>> Array list, append 100000 items at once: 0.05801
>> Python list, prepend 100000 items: 1.96168
>> Array list, prepend 100000 items: 12.83371
>> Array list, append 100000 items at once: 0.06002
>>
>>
>>
>> I realize I did not answer all Chris' questions:
>>
>> >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] )
>> >>> for item in L: print(item)
>> [0]
>> [1 2]
>> [3 4 5]
>> [6 7 8 9]
>>
>> >>> print (type(L.data))
>> <class 'numpy.ndarray'>
>> >>> print(L.data.dtype)
>> int64
>> >>> print(L.data.shape)
>> (10,)
>>
>>
>> I did not implement operations yet, but it would be a matter for
>> transferring call to the underlying numpy data array.
>> >>> L._data *= 2
>> >>> print(L)
>> [[0], [4 8], [12 16 20], [24 28 32 36]]
>>
>>
>>
>> > On 23 Dec 2015, at 09:34, Stephan Hoyer <shoyer at gmail.com> wrote:
>> >
>> > We have a type similar to this (a typed list) internally in pandas,
>> although it is restricted to a single dimension and far from feature
>> complete -- it only has .append and a .to_array() method for converting to
>> a 1d numpy array. Our version is written in Cython, and we use it for
>> performance reasons when we would otherwise need to create a list of
>> unknown length:
>> > https://github.com/pydata/pandas/blob/v0.17.1/pandas/hashtable.pyx#L99
>> >
>> > In my experience, it's several times faster than using a builtin list
>> from Cython, which makes sense given that it needs to copy about 1/3 the
>> data (no type or reference count for individual elements). Obviously, it
>> uses 1/3 the space to store the data, too. We currently don't expose this
>> object externally, but it could be an interesting project to adapt this
>> code into a standalone project that could be more broadly useful.
>> >
>> > Cheers,
>> > Stephan
>> >
>> >
>> >
>> > On Tue, Dec 22, 2015 at 8:20 PM, Chris Barker <chris.barker at noaa.gov>
>> wrote:
>> >
>> > sorry for being so lazy as to not go look at the project pages, but....
>> >
>> > This sounds like it could be really useful, and maybe supercise a coupl
>> eof half-baked projects of mine. But -- what does "dynamic" mean?
>> >
>> > - can you append to these arrays?
>> > - can it support "ragged arrrays" -- it looks like it does.
>> >
>> > >>> L = ArrayList( [[0], [1,2], [3,4,5], [6,7,8,9]] )
>> > >>> print(L)
>> > [[0], [1 2], [3 4 5], [6 7 8 9]]
>> >
>> > so this looks like a ragged array -- but what do you get when you do:
>> >
>> > for row in L:
>> >     print row
>> >
>> >
>> > >>> print(L.data)
>> > [0 1 2 3 4 5 6 7 8
>> >
>> > is .data a regular old 1-d numpy array?
>> >
>> > >>> L = ArrayList( np.arange(10), [3,3,4])
>> > >>> print(L)
>> > [[0 1 2], [3 4 5], [6 7 8 9]]
>> > >>> print(L.data)
>> > [0 1 2 3 4 5 6 7 8 9]
>> >
>> >
>> > does an ArrayList act like a numpy array in other ways:
>> >
>> > L * 5
>> >
>> > L* some_array
>> >
>> > in which case, how does it do broadcasting???
>> >
>> > Thanks,
>> >
>> > -CHB
>> >
>> > >>> L = ArrayList(["Hello", "world", "!"])
>> > >>> print(L[0])
>> > 'Hello'
>> > >>> L[1] = "brave new world"
>> > >>> print(L)
>> > ['Hello', 'brave new world', '!']
>> >
>> >
>> >
>> > Nicolas
>> >
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at scipy.org
>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>> >
>> >
>> >
>> >
>> > --
>> >
>> > Christopher Barker, Ph.D.
>> > Oceanographer
>> >
>> > Emergency Response Division
>> > NOAA/NOS/OR&R            (206) 526-6959   voice
>> > 7600 Sand Point Way NE   (206) 526-6329   fax
>> > Seattle, WA  98115       (206) 526-6317   main reception
>> >
>> > Chris.Barker at noaa.gov
>> >
>> > _______________________________________________
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion at scipy.org
>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20151224/568adf02/attachment.html>


More information about the NumPy-Discussion mailing list