[Numpy-discussion] NumPy re-factoring project
Pauli Virtanen
pav at iki.fi
Fri Jun 11 12:42:11 EDT 2010
Fri, 11 Jun 2010 15:31:45 +0200, Sturla Molden wrote:
[clip]
>> The innermost dimension is handled via the ufunc loop, which is a
>> simple for loop with constant-size step and is given a number of
>> iterations. The array iterator objects are used only for stepping
>> through the outer dimensions. That is, it essentially steps through
>> your dtype** array, without explicitly constructing it.
>
> Yes, exactly my point. And because the iterator does not explicitly
> construct the array, it sucks for parallel programming (e.g. with
> OpenMP):
>
> - The iterator becomes a bottleneck to which access must be serialized
> with a mutex.
> - We cannot do proper work scheduling (load balancing)

I don't necessarily agree: you can do

    for parallelized outer loop {
        critical section {
            p = get iterator pointer
            ++iterator
        }
        inner loop in region `p`
    }

This does allow load balancing etc., as a free processor can immediately
grab the next available slice. Also, it would be easier to implement with
OpenMP pragmas in the current code base.

Of course, this assumes that the outer iterator overhead (and hence the
time spent in the critical section) is small compared to the duration of
the inner loop. That cost must then be weighed against the memory access
overhead involved in the dtype** array.
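
For concreteness, the scheme above might look roughly like the following
in C with OpenMP. This is only a sketch under stated assumptions, not
NumPy code: `outer_iter`, `inner_loop_double` and `scale_by_two` are
made-up names, the "iterator" is a plain counter over slice start
addresses rather than a real array iterator object, and the inner loop
just doubles each element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the outer-dimension iterator: it hands out
   the start address of one innermost-dimension slice at a time. */
typedef struct {
    char *base;             /* start of the array data */
    ptrdiff_t outer_stride; /* bytes between consecutive slices */
    size_t index;           /* next slice to hand out */
    size_t count;           /* total number of slices */
} outer_iter;

/* ufunc-style inner loop: constant byte step, fixed iteration count. */
static void inner_loop_double(char *p, size_t n, ptrdiff_t step)
{
    for (size_t i = 0; i < n; i++, p += step) {
        *(double *)p *= 2.0;
    }
}

/* Parallel outer loop; only the iterator access is serialized. */
void scale_by_two(outer_iter *it, size_t n_inner, ptrdiff_t inner_step)
{
    assert(it != NULL);
    long n_outer = (long)it->count;
    long k;

    /* Dynamic scheduling: a free thread takes the next iteration. */
    #pragma omp parallel for schedule(dynamic)
    for (k = 0; k < n_outer; k++) {
        char *p;

        /* Critical section: read the current pointer, then advance. */
        #pragma omp critical
        {
            p = it->base + (ptrdiff_t)it->index * it->outer_stride;
            it->index++;
        }

        /* The actual work runs outside the critical section. */
        inner_loop_double(p, n_inner, inner_step);
    }
}
```

With gcc or clang this would be built with -fopenmp; without that flag
the pragmas are ignored and the loop simply runs serially. Which slice a
given thread gets is not deterministic, but each slice is handed out
exactly once.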
--
Pauli Virtanen