[SciPy-user] Dealing with Large Data Sets
Damian Eads
eads at soe.ucsc.edu
Sun May 11 03:38:50 EDT 2008
Anne Archibald wrote:
> 2008/5/10 Damian Eads <eads at soe.ucsc.edu>:
>> Damian Eads wrote:
>>
>>> which perform the operations in an in-place fashion. If data.sum(axis =
>>> 2) is large, preallocate an array to store the sum,
>>>
>>> # for summing over columns
>>> sum_result = numpy.zeros(data.shape[0:2])
>> I meant to include
>>
>> data **= 2
>> np.sum(data, axis=2, out=sum_result)
>>
>> which does an in-place, element-wise exponentiate, sums over the
>> columns, and stores the result in sum_result.
>
> What is the advantage to preallocating the result rather than letting
> sum() do the allocation?
If the computation is repeated millions of times and the sum array is
large (hundreds of MBs), then it is certainly advantageous to allocate the
sum array once rather than reallocating it for each computation.
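Putting the pieces of the thread together, a minimal sketch of the pattern
looks like the following. The array shape and the loop count are
illustrative placeholders, not values from the thread:

```python
import numpy as np

data = np.random.rand(4, 5, 6)

# Allocate the result buffer once, outside the loop.
sum_result = np.zeros(data.shape[0:2])

for _ in range(3):  # stands in for the many repeated computations
    data **= 2                            # in-place element-wise square
    np.sum(data, axis=2, out=sum_result)  # sum over columns into the buffer
```

Because `out=sum_result` reuses the same buffer on every iteration, no new
array is allocated inside the loop.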
Damian