[Cython] memoryview slices can't be None?

mark florisson markflorisson88 at gmail.com
Fri Feb 3 22:59:03 CET 2012


On 3 February 2012 18:06, mark florisson <markflorisson88 at gmail.com> wrote:
> On 3 February 2012 17:53, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no> wrote:
>> On 02/03/2012 12:09 AM, mark florisson wrote:
>>>
>>> On 2 February 2012 21:38, Dag Sverre Seljebotn
>>> <d.s.seljebotn at astro.uio.no>  wrote:
>>>>
>>>> On 02/02/2012 10:16 PM, mark florisson wrote:
>>>>>
>>>>>
>>>>> On 2 February 2012 12:19, Dag Sverre Seljebotn
>>>>> <d.s.seljebotn at astro.uio.no>    wrote:
>>>>>>
>>>>>>
>>>>>> I just realized that
>>>>>>
>>>>>> cdef int[:] a = None
>>>>>>
>>>>>> raises an exception, even though I'd argue that 'a' is of the
>>>>>> "reference" kind of type where Cython usually allows None (i.e.,
>>>>>> "cdef MyClass b = None" is allowed even if type(None) is NoneType).
>>>>>> Is this a bug or not, and is it possible to do something about it?
>>>>>>
>>>>>> Dag Sverre
>>>>>> _______________________________________________
>>>>>> cython-devel mailing list
>>>>>> cython-devel at python.org
>>>>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>>>
>>>>>
>>>>>
>>>>> Yeah, I disabled that quite early. It was supposed to be working but
>>>>> gave a lot of trouble in some cases (segfaults, mainly). At the time I was
>>>>> trying to get rid of all the segfaults and get the basic functionality
>>>>> working, so I disabled it. Personally, I have never liked how things
>>>>
>>>>
>>>>
>>>> Well, you can segfault quite easily with
>>>>
>>>> cdef MyClass a = None
>>>> print a.field
>>>>
>>>> so it doesn't make sense to treat slices differently from cdef classes IMO.
>>>>
>>>>
>>>>> can be None unchecked. I personally prefer to write
>>>>>
>>>>> cdef foo(obj=None):
>>>>>     cdef int[:] a
>>>>>     if obj is None:
>>>>>         obj = ...
>>>>>     a = obj
>>>>>
>>>>> Often you forget to write 'not None' when declaring the parameter (and
>>>>> apparently that is only allowed for 'def' functions).
>>>>>
>>>>> As such, I never bothered to re-enable it. However, it does support
>>>>> control flow with uninitialized slices, and will raise an error if it
>>>>> is uninitialized. Do we want this behaviour (e.g. for consistency)?
>>>>
>>>>
>>>>
>>>> When in doubt, go for consistency. So +1 for that reason. I do believe
>>>> that setting stuff to None is rather vital in Python.
>>>>
>>>> What I typically do is more like this:
>>>>
>>>> def f(double[:] input, double[:] out=None):
>>>>    if out is None:
>>>>        out = np.empty_like(input)
>>>>    ...
>>>>
>>>> Having to use another variable name is a bit of a pain. (Come on -- do
>>>> you use "a" in real code? What do you actually call "the other obj"? I
>>>> sometimes end up with "out_" and so on, but it creates smelly code quite
>>>> quickly.)
>>>
>>>
>>> No, it was just a contrived example.
>>>
>>>> It's easy to segfault with cdef classes anyway, so decent none-checking
>>>> should be implemented at some point, and then memoryviews would use the
>>>> same mechanisms. Java has decent null-checking...
>>>>
>>>
>>> The problem with none checking is that it has to occur at every point.
>>
>>
>> Well, using control flow analysis etc. it doesn't really. E.g.,
>>
>> for i in range(a.shape[0]):
>>    print i
>>    a[i] *= 3
>>
>> can be unrolled and none-checks inserted as
>>
>> print 0
>> if a is None: raise ....
>> a[0] *= 3
>> for i in range(1, a.shape[0]):
>>    print i
>>    a[i] *= 3 # no need for none-check
>>
>> It's very similar to what you'd want to do to pull boundschecking out of the
>> loop...
>>
>
> Oh, definitely. Both optimizations may not always be possible to do,
> though. The optimization (for boundschecking) is easier for prange()
> than range(), as you can immediately raise an exception as the
> exceptional condition may be issued at any iteration.  What do you do
> with bounds checking when some accesses are in-bound, and some are
> out-of-bound? Do you immediately raise the exception? Are we fine with
> aborting (like Fortran compilers do when you ask them for bounds
> checking)? And how do you detect that the code doesn't already raise
> an exception or break out of the loop itself to prevent the
> out-of-bound access? (Unless no exceptions are propagating and no
> break/return is used, but exceptions are so very common).

I enabled bounds checking in nogil contexts:
https://github.com/markflorisson88/cython/commit/73c6b0ea8e7e1c243e87b3966ade834b02664a4f

It's not optimized yet, but at least it doesn't force users to use
boundscheck(False); it just hints that it would be faster to disable
the bounds checking.

When we actually start optimizing these things (e.g. moving the checks
outside loops), it might also be useful to consider inlining functions at
the Cython level (otherwise the optimizations cannot extend past the
function boundary).
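
To make the trade-off concrete, here is a rough sketch in plain Python
(not Cython, and not what the compiler generates). The function names
scale_checked and scale_hoisted are made up for illustration, and the
explicit checks stand in for the none-check and bounds check that
generated code would perform. It shows the semantic difference raised
earlier in the thread: with hoisted checks the exception is raised
before any element is touched, while per-iteration checks raise only
after the in-bound elements have already been modified.

```python
from array import array


def scale_checked(a, n, factor=3):
    """Check on every iteration: raises only once the bad index is reached."""
    for i in range(n):
        if a is None:
            raise TypeError("a is None")
        if not 0 <= i < len(a):
            raise IndexError("index out of bounds")
        a[i] *= factor


def scale_hoisted(a, n, factor=3):
    """Checks pulled out of the loop: raises before any element is touched."""
    if n > 0:  # the loop body runs at least once, so the checks are needed
        if a is None:
            raise TypeError("a is None")
        if n > len(a):
            raise IndexError("index out of bounds")
    for i in range(n):
        a[i] *= factor  # no per-iteration check needed


if __name__ == "__main__":
    a = array("i", [1, 2, 3])
    try:
        scale_checked(a, 5)
    except IndexError:
        pass
    print(list(a))  # [3, 6, 9]: partially modified before the error

    b = array("i", [1, 2, 3])
    try:
        scale_hoisted(b, 5)
    except IndexError:
        pass
    print(list(b))  # [1, 2, 3]: untouched, the error came first
```

Whether the second behaviour is acceptable is exactly the open question:
it is only equivalent to the first when the loop body has no other
observable side effects and cannot break, return or raise on its own
before the offending iteration.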

>>> With initialized slices the control flow knows when the slices are
>>> initialized, or when they might not be (and it can raise a
>>> compile-time or runtime error, instead of a segfault if you're lucky).
>>> I'm fine with implementing the behaviour, I just always left it at the
>>> bottom of my todo list.
>>
>>
>> Wasn't saying you should do it, just checking.
>>
>> I'm still not sure about this. I think what I'd really like is
>>
>>  a) Stop cdef classes from being None as well
>>
>>  b) Sort-of deprecate cdef in favor of cast/assertion type statements that
>> help the type inferences:
>>
>> def f(arr):
>>    if arr is None:
>>        arr = ...
>>    arr = int[:](arr) # equivalent to "cdef int[:] arr = arr", but
>>                      # acts as statement, with a specific point
>>                      # for the none-check
>>    ...
>>
>> or even:
>>
>> def f(arr):
>>    if arr is None:
>>        return 'foo'
>>    else:
>>        arr = int[:](arr) # takes effect *here*, does none-check
>>        ...
>>    # arr still typed as int[:] here
>>
>> If we can make this work well enough with control flow analysis I'd never
>> cdef declare local vars again :-)
>
> Hm, what about the following?
>
> def f(arr):
>    if arr is None:
>        return 'foo'
>
>    cdef int[:] arr # arr may not be None
>
>> Dag

