[Cython] prange CEP updated

Thu Apr 21 11:21:18 CEST 2011

On 21 April 2011 10:59, mark florisson <markflorisson88 at gmail.com> wrote:
> On 21 April 2011 10:37, Robert Bradshaw <robertwb at math.washington.edu> wrote:
>> On Mon, Apr 18, 2011 at 7:51 AM, mark florisson
>> <markflorisson88 at gmail.com> wrote:
>>> On 18 April 2011 16:41, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
>>>> Excellent! Sounds great! (as I won't have my laptop for some days I can't
>>>> have a look yet but I will later)
>>>>
>>>> You're right about (the current) buffers and the gil. A testcase explicitly
>>>> for them would be good.
>>>>
>>>> Firstprivate etc: I think it'd be nice myself, but it is probably better to
>>>> take a break from it at this point so that we can think more about that and
>>>> not do anything rash; perhaps open up a specific thread on them and ask for
>>>> more general input. Perhaps you want to take a break or task-switch to
>>>> something else (fused types?) until I can get around to reviewing and merging
>>>> what you have so far? You'll know best what works for you though. If you
>>>> decide to implement explicit threadprivate variables because you've got the
>>>> flow I certainly won't object myself.
>>>>
>>> Ok, cool, I'll move on :) I already included a test with a prange and
>>> a numpy buffer with indexing.
>>
>> Wow, you're just plowing away at this. Very cool.
>>
>> +1 to disallowing nested prange, that seems to get really messy with
>> little benefit.
>>
>> In terms of the CEP, I'm still unconvinced that firstprivate is not
>> safe to infer, but let's leave the initial values undefined rather than
>> specifying them to be NaNs (we can do that as an implementation detail
>> if you want), which will give us flexibility to change later once we've
>> had a chance to play around with it.
>
> Yes, they are currently undefined (and not initialized to NaN etc).
> The thing is that without the control flow analysis (or perhaps not
> until runtime) you won't know whether a variable is initialized at all
> before the parallel section, so making it firstprivate might actually
> copy an undefined value (perhaps with a trap representation!) into the
> thread-private copy, which might invalidate valid code. e.g. consider
>
> x_is_initialized = False
> if condition:
>    x = 1
>    x_is_initialized = True
>
> for i in prange(10, schedule='static'):
>    if x_is_initialized:
>        printf("%d\n", x)
>    x = i

Erm, that snippet I posted is invalid in any case, as x will be
private. So I guess initializing things to NaN in such a case would
have to occur in the parallel section that should enclose the for.
So e.g. we'd have to do

#pragma omp parallel private(x)
{
    x = INT_MAX;
    #pragma omp for lastprivate(i)
    for (...)
        ...
}

Which would then mean that 'x' cannot be lastprivate anymore :). So
it's either "uninitialized and undefined" or "firstprivate". I
personally prefer the former for the implicit route.
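
To make the two options concrete, here is a minimal C/OpenMP sketch. It
is only an illustration, not the code Cython generates; the function
names and the X_SENTINEL macro are hypothetical. The first function
sets a private x to a sentinel inside the parallel region, the second
copies the outer value in with firstprivate. Neither loop writes to x,
so the counts do not depend on the number of threads.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sentinel standing in for "not initialized yet". */
#define X_SENTINEL INT_MAX

/* Option A: x is private and is set to a sentinel inside the parallel
 * region, before the worksharing loop. The value x had before the
 * region is never seen by the loop body. Returns the number of
 * iterations that observed the sentinel. */
int count_sentinel_reads(int n)
{
    int x = 1;      /* outer value, not copied into the private x */
    int hits = 0;

    assert(n >= 0);
    #pragma omp parallel private(x)
    {
        x = X_SENTINEL;             /* executed once per thread */
        #pragma omp for reduction(+:hits)
        for (int i = 0; i < n; i++) {
            if (x == X_SENTINEL)
                hits++;
        }
    }
    return hits;
}

/* Option B: firstprivate copies the outer value of x into every
 * thread's private copy. Returns the number of iterations that
 * observed that outer value. */
int count_initial_reads(int n)
{
    int x = 1;
    int hits = 0;

    assert(n >= 0);
    #pragma omp parallel for firstprivate(x) reduction(+:hits)
    for (int i = 0; i < n; i++) {
        if (x == 1)
            hits++;
    }
    return hits;
}
```

Both functions return n. If the file is compiled without OpenMP
support the pragmas are ignored and the loops run serially with the
same result.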

I do like the threadlocal=a arguments to parallel. It's basically what
I proposed a while back, except that the variables aren't passed as
strings, and it's better because most of your variables can be
inferred, so the messiness is gone.

>> The "cdef threadlocal(int) foo" declaration syntax feels odd to me...
>> We also probably want some way of explicitly marking a variable as
>> shared and still be able to assign to/flush/sync it. Perhaps the
>> parallel context could be used for these declarations, i.e.
>>
>>    with parallel(threadlocal=a, shared=(b,c)):
>>        ...
>>
>> which would be considered an "expert" usecase.
>
> Indeed, assigning to elements in an array instead doesn't seem very
> convenient :)
>
>> For all the discussion of threadsavailable/threadid, the most common
>> usecase I see is for allocating a large shared buffer and partitioning
>> it. This seems better handled by allocating separate thread-local
>> buffers, no? I still like the context idea, but everything in a
>> parallel block before and after the loop(s) also seems like a natural
>> place to put any setup/teardown code (though the context has the
>> advantage that __exit__ is always called, even if exceptions are
>> raised, which makes cleanup a lot easier to handle).
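
As a rough illustration of the "separate thread-local buffers" idea
with setup and teardown placed before and after the loop inside the
parallel block, here is a C/OpenMP sketch. The function name, the
buffer size parameter and the -1 error return are assumptions made for
the example, not anything from the CEP.

```c
#include <assert.h>
#include <stdlib.h>

/* Sums i*i for i in [0, n), routing every value through a per-thread
 * scratch buffer. Returns -1 if any thread failed to allocate. */
long sum_squares_with_scratch(int n, int scratch_len)
{
    long total = 0;
    int failed = 0;

    assert(n >= 0 && scratch_len > 0);
    #pragma omp parallel reduction(+:total) reduction(|:failed)
    {
        /* Setup: runs once per thread, before the loop. */
        long *scratch = malloc((size_t)scratch_len * sizeof *scratch);
        if (scratch == NULL)
            failed = 1;

        /* Every thread must reach the worksharing loop, so the
         * allocation check is done inside the loop body. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (scratch != NULL) {
                long *slot = &scratch[i % scratch_len];
                *slot = (long)i * i;
                total += *slot;
            }
        }

        /* Teardown: runs once per thread, after the loop. */
        free(scratch);
    }
    return failed ? -1 : total;
}
```

Nothing here needs threadid or threadsavailable, since no thread ever
indexes into a shared buffer. The teardown only runs because the loop
cannot exit early, which is the part a context's __exit__ would
guarantee in the presence of exceptions.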
>
> Currently 'with gil' isn't merged into that branch, and if it is, it
> will be disallowed, as I'm not yet sure how (if at all) it could be
> handled with regard to exceptions. It seems a lot easier to disallow
> it and have the user write a 'with gil' function, from which nothing
> can propagate.
>
>> - Robert