[Cython] Cython 0.16 RC 1

Sat Apr 14 17:32:31 CEST 2012

On 14 April 2012 14:57, Dag Sverre Seljebotn <d.s.seljebotn at astro.uio.no> wrote:
> On 04/14/2012 12:46 PM, mark florisson wrote:
>>
>> On 12 April 2012 22:00, Wes McKinney<wesmckinn at gmail.com>  wrote:
>>>
>>> On Thu, Apr 12, 2012 at 10:38 AM, mark florisson
>>> <markflorisson88 at gmail.com>  wrote:
>>>>
>>>> Yet another release candidate, this will hopefully be the last before
>>>> the 0.16 release. You can grab it from here:
>>>> http://wiki.cython.org/ReleaseNotes-0.16
>>>>
>>>> There were several fixes for the numpy attribute rewrite, memoryviews
>>>> and fused types. Accessing the 'base' attribute of a typed ndarray now
>>>> goes through the object layer, which means direct assignment is no
>>>> longer supported.
>>>>
>>>> If there are any problems, please let us know.
>>>> _______________________________________________
>>>> cython-devel mailing list
>>>> cython-devel at python.org
>>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>
>>>
>>> I'm unable to build pandas using git master Cython. I just released
>>> pandas 0.7.3 today which has no issues at all with 0.15.1:
>>>
>>> http://pypi.python.org/pypi/pandas
>>>
>>> For example:
>>>
>>> 16:57 ~/code/pandas  (master)$ python setup.py build_ext --inplace
>>> running build_ext
>>> cythoning pandas/src/tseries.pyx to pandas/src/tseries.c
>>>
>>> Error compiling Cython file:
>>> ------------------------------------------------------------
>>> ...
>>>        self.store = {}
>>>
>>>        ptr =<int32_t**>  malloc(self.depth * sizeof(int32_t*))
>>>
>>>        for i in range(self.depth):
>>>            ptr[i] =<int32_t*>  (<ndarray>  label_arrays[i]).data
>>>                                                          ^
>>> ------------------------------------------------------------
>>>
>>> pandas/src/tseries.pyx:107:59: Compiler crash in
>>> AnalyseExpressionsTransform
>>>
>>> ModuleNode.body = StatListNode(tseries.pyx:1:0)
>>> StatListNode.stats[23] = StatListNode(tseries.pyx:86:5)
>>> StatListNode.stats[0] = CClassDefNode(tseries.pyx:86:5,
>>>    as_name = u'MultiMap',
>>>    class_name = u'MultiMap',
>>>    doc = u'\n    Need to come up with a better data structure for
>>> multi-level indexing\n    ',
>>>    module_name = u'',
>>>    visibility = u'private')
>>> CClassDefNode.body = StatListNode(tseries.pyx:91:4)
>>> StatListNode.stats[1] = StatListNode(tseries.pyx:95:4)
>>> StatListNode.stats[0] = DefNode(tseries.pyx:95:4,
>>>    modifiers = [...]/0,
>>>    name = u'__init__',
>>>    num_required_args = 2,
>>>    py_wrapper_required = True,
>>>    reqd_kw_flags_cname = '0',
>>>    used = True)
>>> File 'Nodes.py', line 342, in analyse_expressions:
>>> StatListNode(tseries.pyx:96:8)
>>> File 'Nodes.py', line 342, in analyse_expressions:
>>> StatListNode(tseries.pyx:106:8)
>>> File 'Nodes.py', line 5903, in analyse_expressions:
>>> ForInStatNode(tseries.pyx:106:8)
>>> File 'Nodes.py', line 342, in analyse_expressions:
>>> StatListNode(tseries.pyx:107:21)
>>> File 'Nodes.py', line 4767, in analyse_expressions:
>>> SingleAssignmentNode(tseries.pyx:107:21)
>>> File 'Nodes.py', line 4872, in analyse_types:
>>> SingleAssignmentNode(tseries.pyx:107:21)
>>> File 'ExprNodes.py', line 7082, in analyse_types:
>>> TypecastNode(tseries.pyx:107:21,
>>>    result_is_used = True,
>>>    use_managed_ref = True)
>>> File 'ExprNodes.py', line 4274, in analyse_types:
>>> AttributeNode(tseries.pyx:107:59,
>>>    attribute = u'data',
>>>    initialized_check = True,
>>>    is_attribute = 1,
>>>    member = u'data',
>>>    needs_none_check = True,
>>>    op = '->',
>>>    result_is_used = True,
>>>    use_managed_ref = True)
>>> File 'ExprNodes.py', line 4360, in analyse_as_ordinary_attribute:
>>> AttributeNode(tseries.pyx:107:59,
>>>    attribute = u'data',
>>>    initialized_check = True,
>>>    is_attribute = 1,
>>>    member = u'data',
>>>    needs_none_check = True,
>>>    op = '->',
>>>    result_is_used = True,
>>>    use_managed_ref = True)
>>> File 'ExprNodes.py', line 4436, in analyse_attribute:
>>> AttributeNode(tseries.pyx:107:59,
>>>    attribute = u'data',
>>>    initialized_check = True,
>>>    is_attribute = 1,
>>>    member = u'data',
>>>    needs_none_check = True,
>>>    op = '->',
>>>    result_is_used = True,
>>>    use_managed_ref = True)
>>>
>>> Compiler crash traceback from this point on:
>>>  File "/home/wesm/code/repos/cython/Cython/Compiler/ExprNodes.py",
>>> line 4436, in analyse_attribute
>>>    replacement_node = numpy_transform_attribute_node(self)
>>>  File "/home/wesm/code/repos/cython/Cython/Compiler/NumpySupport.py",
>>> line 18, in numpy_transform_attribute_node
>>>    numpy_pxd_scope = node.obj.entry.type.scope.parent_scope
>>> AttributeError: 'TypecastNode' object has no attribute 'entry'
>>> building 'pandas._tseries' extension
>>> creating build
>>> creating build/temp.linux-x86_64-2.7
>>> creating build/temp.linux-x86_64-2.7/pandas
>>> creating build/temp.linux-x86_64-2.7/pandas/src
>>> gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -O2 -fPIC
>>> -I/home/wesm/epd/lib/python2.7/site-packages/numpy/core/include
>>> -I/home/wesm/epd/include/python2.7 -c pandas/src/tseries.c -o
>>> build/temp.linux-x86_64-2.7/pandas/src/tseries.o
>>> pandas/src/tseries.c:1:2: error: #error Do not use this file, it is
>>> the result of a failed Cython compilation.
>>> error: command 'gcc' failed with exit status 1
>>>
>>>
>>> -----
>>>
>>> I kludged this particular line in the pandas/timeseries branch so it
>>> will build on git master Cython, but I was treated to dozens of
>>> failures, errors, and finally a segfault in the middle of the test
>>> suite. Suffice to say I'm not sure I would advise you to release the
>>> library in its current state until all of this is resolved. Happy to
>>> help however I can but I'm back to 0.15.1 for now.
>>>
>>> - Wes
>>> _______________________________________________
>>> cython-devel mailing list
>>> cython-devel at python.org
>>> http://mail.python.org/mailman/listinfo/cython-devel
>>
>>
>> It seems that the numpy stopgap solution broke something in Pandas,
>> I'm not sure what or how, but it leads to segfaults where code is
>> trying to retrieve objects from a numpy array that are NULL. I tried
>> disabling the numpy rewrites which unbreaks this with the cython
>> release branch, so I think we should do another RC either with the
>> attribute rewrite disabled or fixed.
>>
>> Dag, do you know what could have been broken by this fix that could
>> lead to these results?
>
>
> I can't imagine what causes a change like you say... one thing that could
> cause a segfault is that technically we should now call import_array in
> every module using numpy.pxd; while we don't do that. If a NumPy version is
> used where PyArray_DATA or similar is not a macro, you would
> segfault....that should be fixed...
>
> Dag
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

Yeah that makes sense, but the thing is that pandas is already calling
import_array everywhere, and the function calls themselves work, it's
the result that's NULL. Now this could be a bug in pandas, but seeing
that pandas works fine without the stopgap solution (that is, it
doesn't pass all the tests but at least it doesn't segfault), I think
it's something funky on our side.

So I suppose I'll disable the fix for 0.16, and we can try to fix it
for the next release.