[Numpy-discussion] load from text files Pull Request Review

Christopher Jordan-Squire cjordan1 at uw.edu
Tue Sep 13 16:01:23 EDT 2011


On Tue, Sep 13, 2011 at 2:41 PM, Chris.Barker <Chris.Barker at noaa.gov> wrote:
> On 9/12/11 4:38 PM, Christopher Jordan-Squire wrote:
>> I did some timings to see what the advantage would be, in the simplest
>> case possible, of taking multiple lines from the file to process at a
>> time.
>
> Nice work, only a minor comment:
>> f6 and f7 use stripped down versions of Chris
>> Barker's accumulator idea. The difference is that f6 uses resize when
>> expanding the array while f7 uses np.empty followed by np.append. This
>> avoids the penalty from copying data that np.resize imposes.
>
> I don't think it does:
>
> """
> In [3]: np.append?
> ----------
> arr : array_like
>     Values are appended to a copy of this array.
>
> Returns
> -------
> out : ndarray
>     A copy of `arr` with `values` appended to `axis`.  Note that `append`
>     does not occur in-place: a new array is allocated and filled.
> """
>
> There is no getting around the copying. However, I think resize() uses
> the OS memory re-allocate call, which may, in some instances, have
> over-allocated the memory in the first place, and thus not require a
> copy. So I'm pretty sure ndarray.resize is as good as it gets.
>
>> f6 : 3.26ms
>> f7 : 2.77ms (Apparently it's a lot cheaper to do np.empty followed by
>> append than to do resize)
>
> Darn that profiling, proving my expectations wrong again! Though I'm
> really confused as to how that could be!
>

Sorry, I cheated by reading the docs. :-)
"""
numpy.resize(a, new_shape)

Return a new array with the specified shape.

If the new array is larger than the original array, then the new array
is filled with repeated copies of a. Note that this behavior is
different from a.resize(new_shape) which fills with zeros instead of
repeated copies of a.
"""

The copying I meant is that numpy.resize (the function, not the
ndarray.resize method) fills the enlarged array with repeated copies of
the original data. Using np.empty followed by np.append avoids that
extra fill.
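
To make the distinction concrete, here is a minimal sketch. The array
values and variable names are made up for illustration and are not taken
from the f6/f7 timing code; refcheck=False is only there so the in-place
resize also works in an interactive session.

```python
import numpy as np

a = np.arange(3)                      # [0 1 2]

# numpy.resize (the function) returns a new array padded with
# repeated copies of `a`.
r = np.resize(a, 7)                   # [0 1 2 0 1 2 0]

# ndarray.resize (the method) grows the array in place and pads
# with zeros.
b = np.arange(3)
b.resize(7, refcheck=False)           # [0 1 2 0 0 0 0]

# np.append allocates a new array and copies both inputs into it;
# `a` itself is left untouched. The last four values of `c` are
# uninitialized because they come from np.empty.
c = np.append(a, np.empty(4, dtype=a.dtype))

print(r)
print(b)
print(c.shape, np.shares_memory(a, c))
```

So np.append still copies the existing data once, as Chris Barker points
out above; what it skips is the repeated-copies fill that numpy.resize
does for the new part of the array.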

-Chris

> -Chris
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov


