[Cython] CEP: prange for parallel loops

mark florisson markflorisson88 at gmail.com
Tue Apr 5 19:33:42 CEST 2011


On 5 April 2011 18:32, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> On 04/05/2011 05:26 PM, Robert Bradshaw wrote:
>>
>> On Tue, Apr 5, 2011 at 8:02 AM, Dag Sverre Seljebotn
>> <d.s.seljebotn at astro.uio.no>  wrote:
>>>
>>> On 04/05/2011 04:58 PM, Dag Sverre Seljebotn wrote:
>>>>
>>>> On 04/05/2011 04:53 PM, Robert Bradshaw wrote:
>>>>>
>>>>> On Tue, Apr 5, 2011 at 3:51 AM, Stefan Behnel<stefan_ml at behnel.de>
>>>>>  wrote:
>>>>>>
>>>>>> mark florisson, 04.04.2011 21:26:
>>>>>>>
>>>>>>> For clarity, I'll add an example:
>>>>>>>
>>>>>>> def f(np.ndarray[double] x, double alpha):
>>>>>>>     cdef double s = 0
>>>>>>>     cdef double tmp = 2
>>>>>>>     cdef double other = 6.6
>>>>>>>
>>>>>>>     with nogil:
>>>>>>>         for i in prange(x.shape[0]):
>>>>>>>             # reading 'tmp' makes it firstprivate in addition to lastprivate
>>>>>>>             # 'other' is only ever read, so it's shared
>>>>>>>             printf("%lf %lf %lf\n", tmp, s, other)
>>>>>>
>>>>>> So, adding a printf() to your code can change the semantics of your
>>>>>> variables? That sounds like a really bad design to me.
>>>>>
>>>>> That's what I was thinking. Basically, if you do an inplace operation,
>>>>> then it's a reduction variable, no matter what else you do to it
>>>>> (including possibly a direct assignment, though we could make that a
>>>>> compile-time error).
>>>>
>>>> -1, I think that's too obscure. Not being able to use inplace operators
>>>> for certain variables will at the very least be nagging.
>>
>> You could still use inplace operators to your heart's content--just
>> don't bother using the reduced variable outside the loop. (I guess I'm
>> assuming reducing a variable has negligible performance overhead,
>> which it should.) For the rare cases that you want the non-aggregated
>> private, make an assignment to another variable, or use non-inplace
>> operations.
>
> Ahh! Of course! With some control flow analysis we could even eliminate the
> reduction if the variable isn't used after the loop, although I agree the
> cost should be trivial.
>
>
>> Not being able to mix inplace operators might be an annoyance. We
>> could also allow explicit declarations, as per Pauli's suggestion, but
>> not require them. Essentially, as long as we have
>
> I think you should be able to mix them, but if you do, a reduction doesn't
> happen. This is slightly uncomfortable, but I believe control flow analysis
> and disabling firstprivate can solve it, see below.
>
> I believe I'm back in the implicit-camp. And the CEP can probably be
> simplified a bit too, I'll try to do that tomorrow.
>
> Two things:
>
>  * It'd still be nice with something like a parallel block for thread
> setup/teardown rather than "if firstthreaditeration():". So, a prange for
> the 50% simplest cases, followed by a parallel-block for the next 30%.

Definitely, I think it could also make way for things such as sections
etc, but I'll bring that up later :)

>  * Control flow analysis can help us tighten it up a bit: For loops where you
> actually depend on values of thread-private variables computed in the
> previous iteration (beyond reduction), it'd be nice to raise a warning
> unless the variable is explicitly declared thread-local or similar. There
> are uses for such variables but they'd be rather rare, and such a hint could
> be very helpful.
>
> I'm still not sure if we want firstprivate, even if we can do it. It'd be
> good to see a use case for it. I'd rather have NaN and 0x7FFFFFFF personally,
> as relying on the firstprivate value is likely a bug -- yes, it makes the
> sequential case work, but that is exactly in the case where parallelizing
> the sequential case would be wrong!!

Yeah, I think if we go the implicit route then firstprivate might be
quite a surprise for users.

> Grepping through 30000 lines of heavily OpenMP-ified Fortran code here
> there's no mention of firstprivate or lastprivate (although we certainly
> want lastprivate to align with the sequential case).
>
> Dag Sverre

Basically I'm fine with either implicit or explicit, although I think
the explicit case would be easier to understand for people that have
used OpenMP. In either case it would be nice to give prange a 'nogil'
option.

So to be clear, when we assign to a variable it will be lastprivate,
and when we assign to the subscript of a variable we make that
variable shared (unless it is declared inside the parallel with
block), right?
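
To make the implicit rules concrete, here is a rough sketch in plain Python
(not Cython, and not how the compiler would actually do it) that classifies
the names in a loop body with the standard `ast` module. The helper name
`classify_loop_variables` and the category labels are made up for
illustration. It assumes Robert's rule that an inplace operation always wins
and makes the variable a reduction, and it ignores the "declared inside the
parallel with block" exception, nested loops and tuple assignments.

```python
import ast

# Higher number wins when a name matches several rules.
PRIORITY = {"shared": 0, "lastprivate": 1, "reduction": 2}


def classify_loop_variables(source):
    """Classify names used in the body of the first for-loop in `source`.

    Hypothetical helper sketching the implicit rules from this thread:
      inplace operator on a name (s += ...)   -> "reduction"
      plain assignment to a name (tmp = ...)  -> "lastprivate"
      assignment to a subscript (x[i] = ...)  -> "shared"
      name that is only read                  -> "shared"
    """
    tree = ast.parse(source)
    loop = next(n for n in ast.walk(tree) if isinstance(n, ast.For))
    # The loop index is handled separately, so leave it out of the result.
    index_names = {n.id for n in ast.walk(loop.target)
                   if isinstance(n, ast.Name)}
    result = {}

    def mark(name, kind):
        if name in index_names:
            return
        if PRIORITY[kind] > PRIORITY.get(result.get(name), -1):
            result[name] = kind

    for stmt in loop.body:
        for node in ast.walk(stmt):
            if isinstance(node, ast.AugAssign) and isinstance(node.target, ast.Name):
                mark(node.target.id, "reduction")
            elif isinstance(node, ast.Assign):
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        mark(target.id, "lastprivate")
            elif isinstance(node, ast.Name):
                # Reads, and the base name of a subscript assignment.
                mark(node.id, "shared")
    return result


example = """
for i in prange(n):
    tmp = x[i] * alpha
    s += tmp
    out[i] = tmp
"""

for name, kind in sorted(classify_loop_variables(example).items()):
    print(name, kind)
```

With these assumptions 's' comes out as a reduction, 'tmp' as lastprivate,
and 'x', 'alpha' and 'out' as shared. Note that a later read of 'tmp' does
not change its class here, which is the behaviour Stefan was asking for.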

