From wesmckinn at gmail.com Tue May 1 00:05:34 2012 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 30 Apr 2012 18:05:34 -0400 Subject: [Cython] Wacky idea: proper macros In-Reply-To: References: <4F9EFABD.5080408@astro.uio.no> <339e72d3-3ef5-44d9-ac89-79dd494cc460@email.android.com> Message-ID: On Mon, Apr 30, 2012 at 5:36 PM, William Stein wrote: > On Mon, Apr 30, 2012 at 2:32 PM, Dag Sverre Seljebotn > wrote: >> >> >> Wes McKinney wrote: >> >>>On Mon, Apr 30, 2012 at 4:55 PM, Nathaniel Smith wrote: >>>> On Mon, Apr 30, 2012 at 9:49 PM, Dag Sverre Seljebotn >>>> wrote: >>>>> JIT is really the way to go. It is one thing that a JIT could >>>optimize the >>>>> case where you pass a callback to a function and inline it run-time. >>>But >>>>> even if it doesn't get that fancy, it'd be great to just be able to >>>write >>>>> something like "cython.eval(s)" and have that be compiled (I guess >>>you could >>>>> do that now, but the sheer overhead of the C compiler and all the >>>.so files >>>>> involved means nobody would sanely use that as the main way of >>>stringing >>>>> together something like pandas). >>>> >>>> The overhead of running a fully optimizing compiler over pandas on >>>> every import is pretty high, though. You can come up with various >>>> caching mechanisms, but they all mean introducing some kind of >>>compile >>>> time/run time distinction. So I'm skeptical we'll just be able to get >>>> rid of that concept, even in a brave new LLVM/PyPy/Julia world. >>>> >>>> -- Nathaniel >>>> _______________________________________________ >>>> cython-devel mailing list >>>> cython-devel at python.org >>>> http://mail.python.org/mailman/listinfo/cython-devel >>> >>>I'd be perfectly OK with just having to compile pandas's "data engine" >>>and generate loads of C/C++ code. JIT-compiling little array >>>expressions would be cool too. I've got enough of an itch that I might >>>have to start scratching pretty soon. >> >> I think a good start is: >> >> Myself I'd look into just using Jinja2 to generate all the Cython code, rather than those horrible Python interpolated strings...that should give you something that's at least rather pleasant for you to work with once you are used to it (even if it is a bit horrible to newcomers to the code base). >> >> You can even check in the generated sources. >> >> And we've discussed letting cython be smart with templating languages and report errors on a line in the original template; such features will certainly be accepted once somebody codes it up. >> >> (I can give you my breakdown of how I eliminated other templating languages than Jinja2 for this purpose tomorrow if you are interested). > > Can you point us to a good example of you using jinja2 for this purpose? > > I'm a big fan of Jinja2 in general (e.g., for HTML)... > >> >> Dag >> >>>_______________________________________________ >>>cython-devel mailing list >>>cython-devel at python.org >>>http://mail.python.org/mailman/listinfo/cython-devel >> >> -- >> Sent from my Android phone with K-9 Mail. Please excuse my brevity. >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > > > -- > William Stein > Professor of Mathematics > University of Washington > http://wstein.org > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel I agree, it'd be cool to see an example or two.
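To make the request concrete, here is a minimal sketch of the kind of Jinja2-driven Cython code generation being discussed. It is not from the thread; the file name "engine.pyx" and the dtype list are hypothetical, and the template simply renders one specialized reduction function per C type:

from jinja2 import Template

pyx_template = Template("""\
{% for name, ctype in specs %}
def sum_{{ name }}({{ ctype }}[:] values):
    cdef Py_ssize_t i
    cdef {{ ctype }} total = 0
    for i in range(values.shape[0]):
        total += values[i]
    return total
{% endfor %}
""")

# Render one function per (Python-visible name, C type) pair; the
# generated .pyx can then be checked into the source tree if desired.
specs = [("float64", "double"), ("int64", "long long")]
with open("engine.pyx", "w") as f:
    f.write(pyx_template.render(specs=specs))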
I have some ideas for a mini DSL / code-generation framework that might suit my needs; jinja2 might be then the right tool for doing the templating / codegen. If I could cut the amount of Cython code I have in half (and make it easier to write simple functions, which are currently more than 50% boilerplate) that would be a big win for me. - Wes From ask at linet.dk Tue May 1 09:53:17 2012 From: ask at linet.dk (Ask F. Jakobsen) Date: Tue, 1 May 2012 09:53:17 +0200 (CEST) Subject: [Cython] Code generated for the expression int(x)+1 In-Reply-To: <2012339557.72.1335857921632.JavaMail.root@pippin.linet.dk> Message-ID: <1359224693.78.1335858797533.JavaMail.root@pippin.linet.dk> Hi all, I am having a simple performance problem that can be resolved by splitting up an expression in two lines. I don't know if it is a bug or I am missing something. The piece of code below is translated to slow code 1) cdef int i i=int(x)+1 whereas the code below is translated to fast code 2) cdef int i i=int(x) i=i+1 Snippet of generated code by cython 1) /* "test.pyx":4 * cdef double x=3.2 * cdef int i * i=int(x)+1 # <<<<<<<<<<<<<< * return i * */ __pyx_t_1 = PyFloat_FromDouble(__pyx_v_x); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __Pyx_GOTREF(__pyx_t_1); __pyx_t_2 = PyTuple_New(1); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __Pyx_GOTREF(((PyObject *)__pyx_t_2)); PyTuple_SET_ITEM(__pyx_t_2, 0, __pyx_t_1); __Pyx_GIVEREF(__pyx_t_1); __pyx_t_1 = 0; __pyx_t_1 = PyObject_Call(((PyObject *)((PyObject*)(&PyInt_Type))), ((PyObject *)__pyx_t_2), NULL); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __Pyx_GOTREF(__pyx_t_1); __Pyx_DECREF(((PyObject *)__pyx_t_2)); __pyx_t_2 = 0; __pyx_t_2 = PyNumber_Add(__pyx_t_1, __pyx_int_1); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __Pyx_GOTREF(__pyx_t_2); __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; __pyx_t_3 = __Pyx_PyInt_AsInt(__pyx_t_2); if (unlikely((__pyx_t_3 == (int)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;} __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0; __pyx_v_i = __pyx_t_3; 2) /* "test.pyx":11 * cdef double x=3.2 * cdef int i * i=int(x) # <<<<<<<<<<<<<< * i=i+1 * return i */ __pyx_v_i = ((int)__pyx_v_x); /* "test.pyx":12 * cdef int i * i=int(x) * i=i+1 # <<<<<<<<<<<<<< * return i * */ __pyx_v_i = (__pyx_v_i + 1); I am using Cython-0.15.1 Best regards, Ask From stefan_ml at behnel.de Tue May 1 10:28:58 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 May 2012 10:28:58 +0200 Subject: [Cython] Code generated for the expression int(x)+1 In-Reply-To: <1359224693.78.1335858797533.JavaMail.root@pippin.linet.dk> References: <1359224693.78.1335858797533.JavaMail.root@pippin.linet.dk> Message-ID: <4F9F9ECA.6010102@behnel.de> Ask F. Jakobsen, 01.05.2012 09:53: > I am having a simple performance problem that can be resolved by splitting up an expression in two lines. I don't know if it is a bug or I am missing something. > > The piece of code below is translated to slow code > > 1) > cdef int i > i=int(x)+1 What you are saying here is: Convert x (known to be a C double) to an arbitrary size Python integer value, add 1, convert it to a C int and assign it to i. 
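(For concreteness, a sketch of the fast variant that Stefan's explanation arrives at below -- writing the C cast yourself keeps the whole expression in C, with the overflow responsibility that entails:)

cdef double x = 3.2
cdef int i
i = <int>x + 1   # bare C cast plus C addition; no Python integer object involved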
> whereas the code below is translated to fast code > > 2) > cdef int i > i=int(x) > i=i+1 This means: Convert x (known to be a C double) to an arbitrary size Python integer value, convert that to a C int and assign it to i, then add 1 and assign the result to i. In the first case, Cython cannot safely assume that the result of the int() conversion will fit into a C int and will therefore evaluate the expression in Python space. Note that your "+1" just happens to be a specific case that looks safe; if you had written "int(x) // 200", this decision would make a lot more sense, because the intermediate result of int(x) could really be larger than a C int, even though the result of the division will have to fit into one (or will be made to fit, because you say so). In the second case, you explicitly tell Cython that the result of the int() conversion will fit into a C int and that *you* accept the responsibility for any overflows, so Cython can safely optimise the Python coercion away and reduce the int() call to a bare C cast from double to int. You can get the same result by writing down the cast yourself. Stefan From d.s.seljebotn at astro.uio.no Tue May 1 10:29:04 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 01 May 2012 10:29:04 +0200 Subject: [Cython] Wacky idea: proper macros In-Reply-To: References: <4F9EFABD.5080408@astro.uio.no> <339e72d3-3ef5-44d9-ac89-79dd494cc460@email.android.com> Message-ID: <4F9F9ED0.2010905@astro.uio.no> On 04/30/2012 11:36 PM, William Stein wrote: > On Mon, Apr 30, 2012 at 2:32 PM, Dag Sverre Seljebotn > wrote: >> >> >> Wes McKinney wrote: >> >>> On Mon, Apr 30, 2012 at 4:55 PM, Nathaniel Smith wrote: >>>> On Mon, Apr 30, 2012 at 9:49 PM, Dag Sverre Seljebotn >>>> wrote: >>>>> JIT is really the way to go. It is one thing that a JIT could >>> optimize the >>>>> case where you pass a callback to a function and inline it run-time. >>> But >>>>> even if it doesn't get that fancy, it'd be great to just be able to >>> write >>>>> something like "cython.eval(s)" and have that be compiled (I guess >>> you could >>>>> do that now, but the sheer overhead of the C compiler and all the >>> .so files >>>>> involved means nobody would sanely use that as the main way of >>> stringing >>>>> together something like pandas). >>>> >>>> The overhead of running a fully optimizing compiler over pandas on >>>> every import is pretty high, though. You can come up with various >>>> caching mechanisms, but they all mean introducing some kind of >>> compile >>>> time/run time distinction. So I'm skeptical we'll just be able to get >>>> rid of that concept, even in a brave new LLVM/PyPy/Julia world. >>>> >>>> -- Nathaniel >>>> _______________________________________________ >>>> cython-devel mailing list >>>> cython-devel at python.org >>>> http://mail.python.org/mailman/listinfo/cython-devel >>> >>> I'd be perfectly OK with just having to compile pandas's "data engine" >>> and generate loads of C/C++ code. JIT-compiling little array >>> expressions would be cool too. I've got enough of an itch that I might >>> have to start scratching pretty soon. >> >> I think a good start is: >> >> Myself I'd look into just using Jinja2 to generate all the Cython code, rather than those horrible Python interpolated strings...that should give you something that's at least rather pleasant for you to work with once you are used to it (even if it is a bit horrible to newcomers to the code base). >> >> You can even check in the generated sources.
>> >> And we've discussed letting cython be smart with templating languages and report errors on a line in the original template; such features will certainly be accepted once somebody codes it up. >> >> (I can give you my breakdown of how I eliminated other templating languages than Jinja2 for this purpose tomorrow if you are interested). > > Can you point us to a good example of you using jinja2 for this purpose? Sure, I just needed some sleep... I only use it for C code, haven't used it for Cython so far (I tend to write things in C and wrap it in Cython). 1) https://github.com/dagss/elemental4py/blob/master/src/elemental_wrapper.cpp.in (work-in-progress) Here I use Jinja2 to write a C wrapper around Elemental (Elemental is a library for dense linear algebra over MPI). The C++ library is a heavy user of templates; I replace the templates with run-time dispatches using if-tests, so that rather than "DistMatrix<double, MC, MR>" you have an elem_matrix struct with ELEM_DOUBLE, ELEM_MC, ELEM_MR. 2) https://github.com/wavemoth/wavemoth/blob/master/src/legendre_transform.c.in This is a numerical kernel where I do loop unrolling etc. using metaprogramming (with Tempita, not Jinja2). 3) https://github.com/wavemoth/wavemoth/blob/cuda/wavemoth/cuda/legendre_transform.cu.in Numerical kernel in templated CUDA using Tempita. On the templating languages I tried: I've scanned through a few and actually tried Tempita, Mako, Jinja2. The features I need: - Pythonic syntax and ability to embed arbitrary Python code - A "call-block", such as this {% call catch('A->grid->ctx') %} BODY {% endcall %} i.e. in Jinja 2, one of the arguments to the function "catch" here is "caller", which when called invokes the body (and can be called multiple times with different arguments) I started out with Tempita because it's so simple to ship, but the lack of a call-block construct + the inability to break lines where I wanted drove me crazy. Then I tried Mako, because it has the largest set of features, but the syntax was simply too gruesome. I first tried to ignore this, but simply couldn't, it made my code totally unreadable. Finally, Jinja2 has most of what I need. Slight disadvantage is it tries to be "pure" and not allow too much arbitrary Python, Ideally what I'd like is something like Tempita but developed further to allow line-breaks and call-blocks, but lacking that I use Jinja2. I don't remember why I didn't like Cheetah (perhaps it doesn't do call-blocks?) Dag From d.s.seljebotn at astro.uio.no Tue May 1 10:32:48 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 01 May 2012 10:32:48 +0200 Subject: [Cython] Wacky idea: proper macros In-Reply-To: <4F9F9ED0.2010905@astro.uio.no> References: <4F9EFABD.5080408@astro.uio.no> <339e72d3-3ef5-44d9-ac89-79dd494cc460@email.android.com> <4F9F9ED0.2010905@astro.uio.no> Message-ID: <4F9F9FB0.7060600@astro.uio.no> On 05/01/2012 10:29 AM, Dag Sverre Seljebotn wrote: > On 04/30/2012 11:36 PM, William Stein wrote: >> On Mon, Apr 30, 2012 at 2:32 PM, Dag Sverre Seljebotn >> wrote: >>> >>> >>> Wes McKinney wrote: >>> >>>> On Mon, Apr 30, 2012 at 4:55 PM, Nathaniel Smith wrote: >>>>> On Mon, Apr 30, 2012 at 9:49 PM, Dag Sverre Seljebotn >>>>> wrote: >>>>>> JIT is really the way to go. It is one thing that a JIT could >>>> optimize the >>>>>> case where you pass a callback to a function and inline it run-time.
>>>> But >>>>>> even if it doesn't get that fancy, it'd be great to just be able to >>>> write >>>>>> something like "cython.eval(s)" and have that be compiled (I guess >>>> you could >>>>>> do that now, but the sheer overhead of the C compiler and all the >>>> .so files >>>>>> involved means nobody would sanely use that as the main way of >>>> stringing >>>>>> together something like pandas). >>>>> >>>>> The overhead of running a fully optimizing compiler over pandas on >>>>> every import is pretty high, though. You can come up with various >>>>> caching mechanisms, but they all mean introducing some kind of >>>> compile >>>>> time/run time distinction. So I'm skeptical we'll just be able to get >>>>> rid of that concept, even in a brave new LLVM/PyPy/Julia world. >>>>> >>>>> -- Nathaniel >>>>> _______________________________________________ >>>>> cython-devel mailing list >>>>> cython-devel at python.org >>>>> http://mail.python.org/mailman/listinfo/cython-devel >>>> >>>> I'd be perfectly OK with just having to compile pandas's "data engine" >>>> and generate loads of C/C++ code. JIT-compiling little array >>>> expressions would be cool too. I've got enough of an itch that I might >>>> have to start scratching pretty soon. >>> >>> I think a good start is: >>> >>> Myself I'd look into just using Jinja2 to generate all the Cython >>> code, rather than those horrible Python interpolated strings...that >>> should give you something that's at least rather pleasant for you to >>> work with once you are used to it (even if it is a bit horrible to >>> newcomers to the code base). >>> >>> You can even check in the generated sources. >>> >>> And we've discussed letting cython be smart with templating languages >>> and report errors on a line in the original template; such features >>> will certainly be accepted once somebody codes it up. >>> >>> (I can give you my breakdown of how I eliminated other templating >>> languages than Jinja2 for this purpose tomorrow if you are interested). >> >> Can you point us to a good example of you using jinja2 for this purpose? > > Sure, I just needed some sleep... > > I only use it for C code, haven't used it for Cython so far (I tend to > write things in C and wrap it in Cython). > > 1) > > https://github.com/dagss/elemental4py/blob/master/src/elemental_wrapper.cpp.in > > > (work-in-progress) Here I use Jinja2 to write a C wrapper around > Elemental (Elemental is a library for dense linear algebra over MPI). > The C++ library is a heavy user of templates; I replace the templates > with run-time dispatches using if-tests, so that rather than > "DistMatrix<double, MC, MR>" you have an elem_matrix struct with > ELEM_DOUBLE, ELEM_MC, ELEM_MR. > > 2) > > https://github.com/wavemoth/wavemoth/blob/master/src/legendre_transform.c.in > > > This is a numerical kernel where I do loop unrolling etc. using > metaprogramming (with Tempita, not Jinja2). > > 3) > > https://github.com/wavemoth/wavemoth/blob/cuda/wavemoth/cuda/legendre_transform.cu.in > > > Numerical kernel in templated CUDA using Tempita. > > On the templating languages I tried: > > I've scanned through a few and actually tried Tempita, Mako, Jinja2. > > The features I need: > > - Pythonic syntax and ability to embed arbitrary Python code > > - A "call-block", such as this > > {% call catch('A->grid->ctx') %} > BODY > {% endcall %} > > > i.e.
in Jinja 2, one of the arguments to the function "catch" here is > "caller", which when called invokes the body (and can be called multiple > times with different arguments) > > I started out with Tempita because it's so simple to ship, but the lack > of a call-block construct + the inability to break lines where I wanted > drove me crazy. > > Then I tried Mako, because it has the largest set of features, but the > syntax was simply too gruesome. I first tried to ignore this, but simply > couldn't, it made my code totally unreadable. > > Finally, Jinja2 has most of what I need. Slight disadvantage is it tries > to be "pure" and not allow too much arbitrary Python, [sorry:] Slight disadvantage is it tries to be "pure" and not allow too much arbitrary Python, but one can work around that by using an auxiliary Python module and pass that Python module to the template when instantiating it -- so one kind of can use arbitrary Python in the templating process, one just needs to edit separate files. (Which is perhaps better -- I'm unable to make up my mind on such trivial issues.) Dag > > Ideally what I'd like is something like Tempita but developed further to > allow line-breaks and call-blocks, but lacking that I use Jinja2. > > I don't remember why I didn't like Cheetah (perhaps it doesn't do > call-blocks?) > > Dag From stefan_ml at behnel.de Tue May 1 11:21:12 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 May 2012 11:21:12 +0200 Subject: [Cython] [cython-users] Conditional import in pure Python mode In-Reply-To: References: Message-ID: <4F9FAB08.2050506@behnel.de> >>> On 29 April 2012 01:33, Ian Bell wrote: >>>> idiom like >>>> >>>> if cython.compiled: >>>> cython.import('from libc.math cimport sin') >>>> else: >>>> from math import sin Actually, in this particular case, I would even accept a solution that special cases the "math" module internally by automatically cimporting libc.math as an override (or rather an adapted version as plain "math.pxd"). This CEP describes a general approach: http://wiki.cython.org/enhancements/overlaypythonmodules It's partly outdated, so things may have become easier these days. Stefan From stefan_ml at behnel.de Tue May 1 21:14:34 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 May 2012 21:14:34 +0200 Subject: [Cython] [cython-users] Conditional import in pure Python mode In-Reply-To: References: <4F9FAB08.2050506@behnel.de> Message-ID: <4FA0361A.4040005@behnel.de> Ian Bell, 01.05.2012 15:50: > On Tue, May 1, 2012 at 9:21 PM, Stefan Behnel wrote: >>>>> On 29 April 2012 01:33, Ian Bell wrote: >>>>>> idiom like >>>>>> >>>>>> if cython.compiled: >>>>>> cython.import('from libc.math cimport sin') >>>>>> else: >>>>>> from math import sin >> >> Actually, in this particular case, I would even accept a solution that >> special cases the "math" module internally by automatically cimporting >> libc.math as an override (or rather an adapted version as plain >> "math.pxd"). >> >> This CEP describes a general approach: >> >> http://wiki.cython.org/enhancements/overlaypythonmodules >> >> It's partly outdated, so things may have become easier these days. > > That is exactly what I was looking for. If we could implement that, it > would solve all my problems. It would meet all my needs - on this front at > least. There are two things to do here: 1) Write up a math.pxd that contains declarations equivalent to Python's math module.
Note that this may not be entirely trivial because the math module does some error handling and type special casing under the hood. Some of this may still be required for the C level equivalents, although the type special casing would better be done by overriding function signatures using this feature: http://docs.cython.org/src/userguide/external_C_code.html#resolving-naming-conflicts-c-name-specifications Basically, you would declare two (or more) function signatures under the same name, but with different C names. 2) Use math.pxd as an override for the math module. I'm not sure yet how that would best be made to work, but it shouldn't be all that complex. It already works (mostly?) for numpy.pxd, for example, although that's done explicitly in user code. I think we should start with 2) to see how to get this to work in general, before we put too much work into 1). Could you sign up for the cython-devel mailing list please, so that we can coordinate the work there? Stefan From stefan_ml at behnel.de Tue May 1 21:22:21 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 May 2012 21:22:21 +0200 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: <4FA0361A.4040005@behnel.de> References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> Message-ID: <4FA037ED.90205@behnel.de> Stefan Behnel, 01.05.2012 21:14: > 2) Use math.pxd as an override for the math module. I'm not sure yet how > that would best be made to work, but it shouldn't be all that complex. It > already works (mostly?) for numpy.pxd, for example, although that's done > explicitly in user code. BTW, I think it would be helpful to make the numpy.pxd cimport automatic as well whenever someone does "import numpy" and friends, right? Stefan From markflorisson88 at gmail.com Tue May 1 21:39:12 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Tue, 1 May 2012 20:39:12 +0100 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: <4FA037ED.90205@behnel.de> References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> Message-ID: On 1 May 2012 20:22, Stefan Behnel wrote: > Stefan Behnel, 01.05.2012 21:14: >> 2) Use math.pxd as an override for the math module. I'm not sure yet how >> that would best be made to work, but it shouldn't be all that complex. It >> already works (mostly?) for numpy.pxd, for example, although that's done >> explicitly in user code. > > BTW, I think it would be helpful to make the numpy.pxd cimport automatic as > well whenever someone does "import numpy" and friends, right? > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel I'm not sure, it means the user has to have numpy development headers. From faltet at pytables.org Tue May 1 21:49:40 2012 From: faltet at pytables.org (Francesc Alted) Date: Tue, 01 May 2012 14:49:40 -0500 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> Message-ID: <4FA03E54.1020406@pytables.org> On 5/1/12 2:39 PM, mark florisson wrote: > On 1 May 2012 20:22, Stefan Behnel wrote: >> Stefan Behnel, 01.05.2012 21:14: >>> 2) Use math.pxd as an override for the math module. I'm not sure yet how >>> that would best be made to work, but it shouldn't be all that complex. It >>> already works (mostly?) 
for numpy.pxd, for example, although that's done >>> explicitly in user code. >> BTW, I think it would be helpful to make the numpy.pxd cimport automatic as >> well whenever someone does "import numpy" and friends, right? >> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > I'm not sure, it means the user has to have numpy development headers. But if the user is going to compile a NumPy application, it sounds like strange to me that she should not be required to install the NumPy development headers, right? -- Francesc Alted From stefan_ml at behnel.de Tue May 1 21:51:18 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 May 2012 21:51:18 +0200 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> Message-ID: <4FA03EB6.6010801@behnel.de> mark florisson, 01.05.2012 21:39: > On 1 May 2012 20:22, Stefan Behnel wrote: >> Stefan Behnel, 01.05.2012 21:14: >>> 2) Use math.pxd as an override for the math module. I'm not sure yet how >>> that would best be made to work, but it shouldn't be all that complex. It >>> already works (mostly?) for numpy.pxd, for example, although that's done >>> explicitly in user code. >> >> BTW, I think it would be helpful to make the numpy.pxd cimport automatic as >> well whenever someone does "import numpy" and friends, right? > > I'm not sure, it means the user has to have numpy development headers. Hmm, right. What about making it an explicit compile time option then? Something like # cython: override_modules = math,numpy Or should we go for an opt-out? # cython: python_modules = math,numpy Sounds like it would hit the more common case by default. Stefan From stefan_ml at behnel.de Tue May 1 22:02:28 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 01 May 2012 22:02:28 +0200 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: <4FA03E54.1020406@pytables.org> References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> <4FA03E54.1020406@pytables.org> Message-ID: <4FA04154.5020402@behnel.de> Francesc Alted, 01.05.2012 21:49: > On 5/1/12 2:39 PM, mark florisson wrote: >> On 1 May 2012 20:22, Stefan Behnel wrote: >>> Stefan Behnel, 01.05.2012 21:14: >>>> 2) Use math.pxd as an override for the math module. I'm not sure yet how >>>> that would best be made to work, but it shouldn't be all that complex. It >>>> already works (mostly?) for numpy.pxd, for example, although that's done >>>> explicitly in user code. >>> BTW, I think it would be helpful to make the numpy.pxd cimport automatic as >>> well whenever someone does "import numpy" and friends, right? >>> >> I'm not sure, it means the user has to have numpy development headers. > > But if the user is going to compile a NumPy application, it sounds like > strange to me that she should not be required to install the NumPy > development headers, right? Let's say it's not impossible that someone uses NumPy and Cython without any accelerated C level connection between the two, but it's rather unlikely, given that Cython already has all dependencies that this connection would require as well, except for the NumPy header files. 
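(For context, the explicit pairing in question is what user code has to write today; a sketch, with a hypothetical function name:)

import numpy as np    # Python-level module, resolved at runtime
cimport numpy as np   # C-level declarations; this is what needs the NumPy headers

def total(np.ndarray[np.float64_t, ndim=1] a):
    cdef Py_ssize_t i
    cdef double s = 0
    for i in range(a.shape[0]):
        s += a[i]
    return s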
So I would suggest to make the automatic override the default for any module for which a .pxd file with the same fully qualified name is found in the search path, and to require users to explicitly disable this feature for a given module using a module level (or external) compiler directive if they feel like getting slower code (or working around a bug or whatever). Anyway, given that this feature isn't even implemented yet, it may appear a bit premature to discuss these details. Stefan From robertwb at gmail.com Wed May 2 08:56:08 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Tue, 1 May 2012 23:56:08 -0700 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: <4FA04154.5020402@behnel.de> References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> <4FA03E54.1020406@pytables.org> <4FA04154.5020402@behnel.de> Message-ID: On Tue, May 1, 2012 at 1:02 PM, Stefan Behnel wrote: > Francesc Alted, 01.05.2012 21:49: >> On 5/1/12 2:39 PM, mark florisson wrote: >>> On 1 May 2012 20:22, Stefan Behnel wrote: >>>> Stefan Behnel, 01.05.2012 21:14: >>>>> 2) Use math.pxd as an override for the math module. I'm not sure yet how >>>>> that would best be made to work, but it shouldn't be all that complex. It >>>>> already works (mostly?) for numpy.pxd, for example, although that's done >>>>> explicitly in user code. math.pxd would be a bit trickier, as we're trying to shadow python functions with independent c implementations (rather than declaring structure to the single numpy array object and exposing c-level only methods. We'd need to support stuff like double x = ... double y = sin(x) # fast cdef object f = sin # grab the builtin one? but this is by no means insurmountable and could be really useful. >>>> BTW, I think it would be helpful to make the numpy.pxd cimport automatic as >>>> well whenever someone does "import numpy" and friends, right? >>>> >>> I'm not sure, it means the user has to have numpy development headers. >> >> But if the user is going to compile a NumPy application, it sounds like >> strange to me that she should not be required to install the NumPy >> development headers, right? > > Let's say it's not impossible that someone uses NumPy and Cython without > any accelerated C level connection between the two, but it's rather > unlikely, given that Cython already has all dependencies that this > connection would require as well, except for the NumPy header files. > > So I would suggest to make the automatic override the default for any > module for which a .pxd file with the same fully qualified name is found in > the search path, and to require users to explicitly disable this feature > for a given module using a module level (or external) compiler directive if > they feel like getting slower code (or working around a bug or whatever). There is another consideration: this can introduce unnecessary and potentially costly dependencies. For example, in Sage one has sage/rings/integer.pxd. Not everything that imports from this file needs c-level access to the Integer type, and requiring everything that imports from sage.rings.integer to be re-compiled when this file changes would increase the (admittedly already lengthy) re-compile, as well as sucking in a (chain of) un-needed declarations. As Cython becomes more and more common, a similar effect could happen between projects as well. This may be the exception rather than the rule, so perhaps it's not *that* bad to let opt-in be the default, i.e. 
# cython: cimport_on_import = __all__ - Robert From robertwb at gmail.com Wed May 2 09:15:21 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 2 May 2012 00:15:21 -0700 Subject: [Cython] [cython-users] cimport numpy fails with Python 3 semantics In-Reply-To: <4F9E35AF.6050209@behnel.de> References: <15456913.1041.1335634688690.JavaMail.geo-discussion-forums@vbli11> <4F9C3BDC.9040108@behnel.de> <4F9E35AF.6050209@behnel.de> Message-ID: On Sun, Apr 29, 2012 at 11:48 PM, Stefan Behnel wrote: > mark florisson, 28.04.2012 21:57: >> On 28 April 2012 19:50, Stefan Behnel wrote: >>> mark florisson, 28.04.2012 20:33: >>>> I think each module should have its own language level, so I think >>>> that's a bug. I think the rules should be: >>>> >>>> - if passed as command line argument, use that for all cimported >>>> modules, unless they define their own language level through the >>>> directive >>>> - if set as a directive, the language level will apply only to that module >>> >>> That's how it works. We don't run the tests with language level 3 in >>> Jenkins because the majority of the tests is not meant to be run with Py3 >>> semantics. Maybe it's time to add a numpy_cy3 test. >>> >>> If there are more problems than just this (which was a bug in numpy.pxd), >>> we may consider setting language level 2 explicitly in numpy.pxd. >> >> Ah, great. Do we have any documentation for that? > > We do now. ;) > > However, I'm not sure cimported .pxd files should always inherit the > language_level setting. It's somewhat of a grey area because user provided > .pxd files would benefit from it since they likely all use the same > language level as the main module, whereas the Cython shipped (and > otherwise globally installed) .pxd files wouldn't gain anything and could > potentially break. > > I think we may want to keep the current behaviour and set the language > level explicitly in the few shipped .pxd files that are not language level > agnostic (i.e. those that actually contain code). +1, I'm not worried about breaking the ones that ship with Cython, as we can manually specify the language level on those if necessary. This does have implications for automatically cimporting on import when a .pxd is found though, especially if one pulls in .pxd files from another project. - Robert From stefan_ml at behnel.de Wed May 2 09:33:43 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 02 May 2012 09:33:43 +0200 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> <4FA03E54.1020406@pytables.org> <4FA04154.5020402@behnel.de> Message-ID: <4FA0E357.1000303@behnel.de> Robert Bradshaw, 02.05.2012 08:56: > On Tue, May 1, 2012 at 1:02 PM, Stefan Behnel wrote: >> Francesc Alted, 01.05.2012 21:49: >>> On 5/1/12 2:39 PM, mark florisson wrote: >>>> On 1 May 2012 20:22, Stefan Behnel wrote: >>>>> Stefan Behnel, 01.05.2012 21:14: >>>>>> 2) Use math.pxd as an override for the math module. I'm not sure yet how >>>>>> that would best be made to work, but it shouldn't be all that complex. It >>>>>> already works (mostly?) for numpy.pxd, for example, although that's done >>>>>> explicitly in user code. > > math.pxd would be a bit trickier, as we're trying to shadow python > functions with independent c implementations (rather than declaring > the structure of the single numpy array object and exposing c-level-only > methods). We'd need to support stuff like > > double x = ...
> double y = sin(x) # fast > cdef object f = sin # grab the builtin one? > > but this is by no means insurmountable and could be really useful. I already did that for the builtin abs() function. Works nicely so far, although not from a .pxd but declared internally in Builtin.py. It's not currently supported for methods (I tried it for one of the builtin types and it seemed to require more work than I wanted to invest at that point), but I don't think we need that here. Module level functions should totally be enough for math.pxd. >>>>> BTW, I think it would be helpful to make the numpy.pxd cimport automatic as >>>>> well whenever someone does "import numpy" and friends, right? >>>>> >>>> I'm not sure, it means the user has to have numpy development headers. >>> >>> But if the user is going to compile a NumPy application, it sounds like >>> strange to me that she should not be required to install the NumPy >>> development headers, right? >> >> Let's say it's not impossible that someone uses NumPy and Cython without >> any accelerated C level connection between the two, but it's rather >> unlikely, given that Cython already has all dependencies that this >> connection would require as well, except for the NumPy header files. >> >> So I would suggest to make the automatic override the default for any >> module for which a .pxd file with the same fully qualified name is found in >> the search path, and to require users to explicitly disable this feature >> for a given module using a module level (or external) compiler directive if >> they feel like getting slower code (or working around a bug or whatever). > > There is another consideration: this can introduce unnecessary and > potentially costly dependencies. For example, in Sage one has > sage/rings/integer.pxd. Not everything that imports from this file > needs c-level access to the Integer type, and requiring everything > that imports from sage.rings.integer to be re-compiled when this file > changes would increase the (admittedly already lengthy) re-compile, as > well as sucking in a (chain of) un-needed declarations. Ah, yes, I see your point. The .pxd may actually introduce substantially more (and different) dependencies than the already compiled .so file. Just because the import happens in Cython code and not in Python code doesn't mean it should do different things. > As Cython > becomes more and more common, a similar effect could happen between > projects as well. Agreed. A compile time C level dependency is much more fragile and version dependent than a Python level import. I can see a lot of cases where this matters. > This may be the exception rather than the rule, so perhaps it's not > *that* bad to let opt-in be the default, i.e. > > # cython: cimport_on_import = __all__ So, you would allow it to receive either a sequence of explicit module names or "__all__" to enable it by default, right? Sounds like a reasonable directive to me. 
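(For illustration, a minimal sketch of the module-level part such a math.pxd might contain -- an assumption-laden sketch with direct libm calls, double-only signatures, and none of the error handling discussed above; it uses the C-name-specification feature linked earlier to keep the C names out of the way:)

cdef extern from "math.h" nogil:
    double c_sin "sin"(double x)
    double c_cos "cos"(double x)
    double c_sqrt "sqrt"(double x)

cdef inline double sin(double x) nogil: return c_sin(x)
cdef inline double cos(double x) nogil: return c_cos(x)
cdef inline double sqrt(double x) nogil: return c_sqrt(x)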
Stefan From stefan_ml at behnel.de Wed May 2 09:36:55 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 02 May 2012 09:36:55 +0200 Subject: [Cython] [cython-users] cimport numpy fails with Python 3 semantics In-Reply-To: References: <15456913.1041.1335634688690.JavaMail.geo-discussion-forums@vbli11> <4F9C3BDC.9040108@behnel.de> <4F9E35AF.6050209@behnel.de> Message-ID: <4FA0E417.70802@behnel.de> Robert Bradshaw, 02.05.2012 09:15: > On Sun, Apr 29, 2012 at 11:48 PM, Stefan Behnel wrote: >> mark florisson, 28.04.2012 21:57: >>> On 28 April 2012 19:50, Stefan Behnel wrote: >>>> mark florisson, 28.04.2012 20:33: >>>>> I think each module should have its own language level, so I think >>>>> that's a bug. I think the rules should be: >>>>> >>>>> - if passed as command line argument, use that for all cimported >>>>> modules, unless they define their own language level through the >>>>> directive >>>>> - if set as a directive, the language level will apply only to that module >>>> >>>> That's how it works. We don't run the tests with language level 3 in >>>> Jenkins because the majority of the tests is not meant to be run with Py3 >>>> semantics. Maybe it's time to add a numpy_cy3 test. >>>> >>>> If there are more problems than just this (which was a bug in numpy.pxd), >>>> we may consider setting language level 2 explicitly in numpy.pxd. >>> >>> Ah, great. Do we have any documentation for that? >> >> We do now. ;) >> >> However, I'm not sure cimported .pxd files should always inherit the >> language_level setting. It's somewhat of a grey area because user provided >> .pxd files would benefit from it since they likely all use the same >> language level as the main module, whereas the Cython shipped (and >> otherwise globally installed) .pxd files wouldn't gain anything and could >> potentially break. >> >> I think we may want to keep the current behaviour and set the language >> level explicitly in the few shipped .pxd files that are not language level >> agnostic (i.e. those that actually contain code). > > +1, I'm not worried about breaking the ones that ship with Cython, as > we can manually specify the language level on those if necessary. This > does have implications for automatically cimporting on import when a > .pxd is found though, especially if one pulls in .pxd files from > another project. Right. Given that some people have started distributing .pxd files for some C libraries on PyPI, we should advertise it in a visible part of the documentation that authors of .pxd files have to take care to define a language level if they depend on it. Stefan From robertwb at gmail.com Wed May 2 09:59:46 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 2 May 2012 00:59:46 -0700 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: <4FA0E357.1000303@behnel.de> References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> <4FA037ED.90205@behnel.de> <4FA03E54.1020406@pytables.org> <4FA04154.5020402@behnel.de> <4FA0E357.1000303@behnel.de> Message-ID: On Wed, May 2, 2012 at 12:33 AM, Stefan Behnel wrote: > Robert Bradshaw, 02.05.2012 08:56: >> On Tue, May 1, 2012 at 1:02 PM, Stefan Behnel wrote: >>> Francesc Alted, 01.05.2012 21:49: >>>> On 5/1/12 2:39 PM, mark florisson wrote: >>>>> On 1 May 2012 20:22, Stefan Behnel wrote: >>>>>> Stefan Behnel, 01.05.2012 21:14: >>>>>>> 2) Use math.pxd as an override for the math module. I'm not sure yet how >>>>>>> that would best be made to work, but it shouldn't be all that complex. It >>>>>>> already works (mostly?)
for numpy.pxd, for example, although that's done >>>>>>> explicitly in user code. >> >> math.pxd would be a bit trickier, as we're trying to shadow python >> functions with independent c implementations (rather than declaring >> structure to the single numpy array object and exposing c-level only >> methods. We'd need to support stuff like >> >> double x = ... >> double y = sin(x) # fast >> cdef object f = sin # grab the builtin one? >> >> but this is by no means insurmountable and could be really useful. > > I already did that for the builtin abs() function. Works nicely so far, > although not from a .pxd but declared internally in Builtin.py. > > It's not currently supported for methods (I tried it for one of the builtin > types and it seemed to require more work than I wanted to invest at that > point), but I don't think we need that here. Module level functions should > totally be enough for math.pxd. Yep. >>>>>> BTW, I think it would be helpful to make the numpy.pxd cimport automatic as >>>>>> well whenever someone does "import numpy" and friends, right? >>>>>> >>>>> I'm not sure, it means the user has to have numpy development headers. >>>> >>>> But if the user is going to compile a NumPy application, it sounds like >>>> strange to me that she should not be required to install the NumPy >>>> development headers, right? >>> >>> Let's say it's not impossible that someone uses NumPy and Cython without >>> any accelerated C level connection between the two, but it's rather >>> unlikely, given that Cython already has all dependencies that this >>> connection would require as well, except for the NumPy header files. >>> >>> So I would suggest to make the automatic override the default for any >>> module for which a .pxd file with the same fully qualified name is found in >>> the search path, and to require users to explicitly disable this feature >>> for a given module using a module level (or external) compiler directive if >>> they feel like getting slower code (or working around a bug or whatever). >> >> There is another consideration: this can introduce unnecessary and >> potentially costly dependencies. For example, in Sage one has >> sage/rings/integer.pxd. Not everything that imports from this file >> needs c-level access to the Integer type, and requiring everything >> that imports from sage.rings.integer to be re-compiled when this file >> changes would increase the (admittedly already lengthy) re-compile, as >> well as sucking in a (chain of) un-needed declarations. > > Ah, yes, I see your point. The .pxd may actually introduce substantially > more (and different) dependencies than the already compiled .so file. Just > because the import happens in Cython code and not in Python code doesn't > mean it should do different things. > > >> As Cython >> becomes more and more common, a similar effect could happen between >> projects as well. > > Agreed. A compile time C level dependency is much more fragile and version > dependent than a Python level import. I can see a lot of cases where this > matters. > > >> This may be the exception rather than the rule, so perhaps it's not >> *that* bad to let opt-in be the default, i.e. >> >> # cython: cimport_on_import = __all__ > > So, you would allow it to receive either a sequence of explicit module > names or "__all__" to enable it by default, right? Sounds like a reasonable > directive to me. Perhaps the default could be "those that ship with Cython" or even some other hand-picked list. 
(In this case, we'd want users to be able to add to and remove from the default set, e.g. # cython: cimport_on_import = +my_module, -math It'd also be nice to be able to specify a package and all its submodules...) - Robert From stefan_ml at behnel.de Wed May 2 15:49:44 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 02 May 2012 15:49:44 +0200 Subject: [Cython] Conditional import in pure Python mode In-Reply-To: <4FA0361A.4040005@behnel.de> References: <4F9FAB08.2050506@behnel.de> <4FA0361A.4040005@behnel.de> Message-ID: <4FA13B78.5070003@behnel.de> Stefan Behnel, 01.05.2012 21:14: > 1) Write up a math.pxd that contains declarations equivalent to Python's > math module. Note that this may not be entirely trivial because the math > module does some error handling and type special casing under the hood. Having taken a slightly deeper look at this now, I think it would make sense to start with this part. An equivalent implementation will need to do the same kind of error handling, so there will be more than just libm function declarations in the file. Inline functions work in .pxd files (although I'm not sure inlining is really a good idea here). > Some of this may still be required for the C level equivalents, although > the type special casing would better be done by overriding function > signatures using this feature: > > http://docs.cython.org/src/userguide/external_C_code.html#resolving-naming-conflicts-c-name-specifications > > Basically, you would declare two (or more) function signatures under the > same name, but with different C names. Given that most functions work on double values and return double, this won't be an issue. The functions that return integers are an issue, though, because Python can easily handle this under the hood by simply returning arbitrary sized integer objects. A C version cannot safely return C integers without risking overflows. We should leave those out entirely for now and just fall back to the normal Python implementation. Ian, could you give the .pxd file a try? You should be able to test this by importing math, followed by a cimport of your new math module. Stefan From ian.h.bell at gmail.com Thu May 3 03:09:08 2012 From: ian.h.bell at gmail.com (Ian Bell) Date: Thu, 3 May 2012 13:09:08 +1200 Subject: [Cython] PXD file for overriding math functions Message-ID: Ok, I think I am missing something - I am a bit lost in the Cython jargon. Here is where I stand, let me know where I am going wrong... I compiled with cython.py -a test.py, but it isn't working. If I am completely not on the right track, feel free to let me know :) Ian

##### test.py (Pure python file) #####
from math import sin
def f(r):
    s=sin(r)
    return s
r=3.141592654/3.0
print f(r)

#### test.pxd #####
cimport cython
import cython
cimport math_override
@cython.locals(s=cython.double)
cpdef f(double r)

#### math_override.pxd #####
cdef extern from "math.h":
    double sin_d "sin" (double x)
    float sin_f "sin" (float x)
cpdef inline double sin(double x):
    return sin_d(x)
cpdef inline float sin(float x):
    return sin_f(x)
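(One concrete problem in the snippet above: Cython does not allow two cpdef functions to be defined under the same name, so the double/float pair cannot both be called sin. A minimal sketch of a variant that compiles, keeping only the double signature:)

#### math_override.pxd (double-only sketch) #####
cdef extern from "math.h":
    double sin_d "sin" (double x)

cpdef inline double sin(double x):
    return sin_d(x)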
From d.s.seljebotn at astro.uio.no Thu May 3 14:24:32 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 03 May 2012 14:24:32 +0200 Subject: [Cython] CEP1000: Native dispatch through callables In-Reply-To: References: <4F87530F.7050000@astro.uio.no> <4F8D6112.1000906@astro.uio.no> <4F8F33AE.50401@astro.uio.no> <4F8F38F9.7020008@astro.uio.no> <4F8FEF7A.2090501@astro.uio.no> <4F9010B0.5080100@astro.uio.no> Message-ID: <4FA27900.80101@astro.uio.no> I'm afraid I'm going to try to kick this thread alive again. I want us to have something that Travis can implement in numba and "his" portion of SciPy, and also that could be used by NumPy devs. Since the decisions are rather arbitrary, perhaps we can try to quickly get to the "+1" stage (or, depending on how things turn out, a tournament starting with at most one proposal per person). On 04/20/2012 09:30 AM, Robert Bradshaw wrote: > On Thu, Apr 19, 2012 at 6:18 AM, Dag Sverre Seljebotn > wrote: >> On 04/19/2012 01:20 PM, Nathaniel Smith wrote: >>> >>> On Thu, Apr 19, 2012 at 11:56 AM, Dag Sverre Seljebotn >>> wrote: >>>> >>>> I thought of some drawbacks of getfuncptr: >>>> >>>> - Important: Doesn't allow you to actually inspect the supported >>>> signatures, which is needed (or at least convenient) if you want to use >>>> an >>>> FFI library or do some JIT-ing. So an iteration mechanism is still needed >>>> in >>>> addition, meaning the number of things for the object to implement grows >>>> a >>>> bit large. Default implementations help -- OTOH there really wasn't a >>>> major >>>> drawback with the table approach as long as JIT's can just replace it? >>> >>> >>> But this is orthogonal to the table vs. getfuncptr discussion. We're >>> assuming that the table might be extended at runtime, which means you >>> can't use it to determine which signatures are supported. So we need >>> some sort of extra interface for the caller and callee to negotiate a >>> type anyway. (I'm intentionally agnostic about whether it makes more >>> sense for the caller or the callee to be doing the iterating... in >>> general type negotiation could be quite complicated, and I don't think >>> we know enough to get that interface right yet.) >> >> >> Hmm. Right. Let's define an explicit goal for the CEP then. >> >> What I care about is getting the spec right enough such that, e.g., NumPy >> and SciPy, and other (mostly manually written) C extensions with slow >> development pace, can be forward-compatible with whatever crazy things >> Cython or Numba does. >> >> There's 4 cases: >> >> 1) JIT calls JIT (ruled out straight away) >> >> 2) JIT calls static: Say that Numba wants to optimize calls to np.sin etc. >> without special-casing; this seems to require reading a table of static >> signatures >> >> 3) Static calls JIT: This is the case when scipy.integrate routines call a >> Numba callback and Numba generates a specialization for the dtype they >> explicitly need. This calls for getfuncptr (but perhaps in a form which we >> can't quite determine yet?). >> >> 4) Static calls static: Either table or getfuncptr works. >> >> My gut feeling is go for 2) and 4) in this round => table. > > getfuncptr is really simple and flexible, but I'm with you on both of > these two points, and the overhead was not trivial. It's interesting to hear you say the overhead was not trivial (that was my hunch too but I sort of yielded to peer pressure). I think SAGE has some history with this -- isn't one of the reasons for the "cpdef" vs.
"cdef" split that "cpdef" has the cost of a single lookup for the presence of a __dict__ on the object, which was an unacceptable penalty for parts of Sage? That can't have been much more than a 1ns penalty per instance. > Of course we could offer both, i.e. look at the table first, if it's > not there call getfuncptr if it's non-null, then fall back to "slow" > call or error. These are all opt-in depending on how hard you want to > try to optimize things. That's actually exactly what I was envisioning -- in time (with JITs on both ends) the table could act sort of as a cache for commonly used overloads, and getfuncptr would access the others more slowly. > As far as keys vs. interning, I'm also tempted to try to have my cake > and eat it too. Define a space-friendly encoding for signatures and > require interning for anything that doesn't fit into a single > sizeof(void*). The fact that this cutoff would vary for 32 vs 64-bit > would require some care, but could be done with macros in C. If the > signatures produce non-aligned "pointer" values there won't be any > collisions, and this way libraries only have to share in the global > (Python-level?) interning scheme iff they want to expose/use "large" > signatures. That was the approach I described to Nathaniel as having the "worst features of both" -- lack of readable gdb dumps of the keys, and having to define an interning mechanism for use by the 5% cases that don't fit. To sum up hat's been said earlier: The only thing that would blow the key size above 64 bits except very many arguments would be things like classes/interfaces/vtables. But in that case, reasonable-sized keys for the vtables can be computed (whether by interning, cryptographic hashing, or a GUID like Microsoft COM). So I'm still +1 on my proposal; but I would be happy with an intern-based proposal if somebody bothers to flesh it out a bit (I don't quite know how I'd do it and would get lost in PyObject* vs. char* and cross-language state sharing...). My proposal in summary: - Table with variable-sized entries (not getfuncptr, not interning) that can be scanned by the caller in 128-bit increments. - Only use 64 bit pointers, in order to keep table format the same on 32 bit and 64 bit. - Do encoding of the signature strings. Utility functions to work with this (both to scan tables and encode/decode a format string) will be provided as C code by the CEP that can be bundled. Pros: - Table format is not specific to Python world (it makes as much sense to use, e.g., internally in Julia) - No state needs to be shared between packages run-time (they can use the bundled C code in isolation if they wish) - No need for an interning machinery - More easily compatible with multiple interpreter states (?) - Minor performance benefit of table over getfuncptr (intern vs. key didn't matter). [Cue comment that this doesn't matter.] Cons: - Lack of instant low-level debuggability, like in the interned case (a human needs to run a function on the key constant to see what it corresponds to) - Not as extendable as getfuncptr (though currently we don't quite know how we would extend it, and it's easy to add getfuncptr in the future) Notes: - When extended to handle vtable argument types, these still needs to be some interning or crypto-hashing. But that is likely to come up anyway as part of a COM-like queryInterface protocol, and at that point we will be better at making those decisions and design a good interning mechanism. 
Dag From robertwb at gmail.com Thu May 3 22:18:03 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Thu, 3 May 2012 13:18:03 -0700 Subject: [Cython] CEP1000: Native dispatch through callables In-Reply-To: <4FA27900.80101@astro.uio.no> References: <4F87530F.7050000@astro.uio.no> <4F8D6112.1000906@astro.uio.no> <4F8F33AE.50401@astro.uio.no> <4F8F38F9.7020008@astro.uio.no> <4F8FEF7A.2090501@astro.uio.no> <4F9010B0.5080100@astro.uio.no> <4FA27900.80101@astro.uio.no> Message-ID: On Thu, May 3, 2012 at 5:24 AM, Dag Sverre Seljebotn wrote: > I'm afraid I'm going to try to kick this thread alive again. I want us to > have something that Travis can implement in numba and "his" portion of > SciPy, and also that could be used by NumPy devs. That's great, I'd like to get things moving forward on this. > Since the decisions are rather arbitrary, perhaps we can try to quickly get > to the "+1" stage (or, depending on how things turn out, a tournament > starting with at most one proposal per person). > > > On 04/20/2012 09:30 AM, Robert Bradshaw wrote: >> >> On Thu, Apr 19, 2012 at 6:18 AM, Dag Sverre Seljebotn >> wrote: >>> >>> On 04/19/2012 01:20 PM, Nathaniel Smith wrote: >>>> >>>> >>>> On Thu, Apr 19, 2012 at 11:56 AM, Dag Sverre Seljebotn >>>> wrote: >>>>> >>>>> >>>>> I thought of some drawbacks of getfuncptr: >>>>> >>>>> - Important: Doesn't allow you to actually inspect the supported >>>>> signatures, which is needed (or at least convenient) if you want to use >>>>> an >>>>> FFI library or do some JIT-ing. So an iteration mechanism is still >>>>> needed >>>>> in >>>>> addition, meaning the number of things for the object to implement >>>>> grows >>>>> a >>>>> bit large. Default implementations help -- OTOH there really wasn't a >>>>> major >>>>> drawback with the table approach as long as JIT's can just replace it? >>>> >>>> >>>> >>>> But this is orthogonal to the table vs. getfuncptr discussion. We're >>>> assuming that the table might be extended at runtime, which means you >>>> can't use it to determine which signatures are supported. So we need >>>> some sort of extra interface for the caller and callee to negotiate a >>>> type anyway. (I'm intentionally agnostic about whether it makes more >>>> sense for the caller or the callee to be doing the iterating... in >>>> general type negotiation could be quite complicated, and I don't think >>>> we know enough to get that interface right yet.) >>> >>> >>> >>> Hmm. Right. Let's define an explicit goal for the CEP then. >>> >>> What I care about is getting the spec right enough such that, e.g., >>> NumPy >>> and SciPy, and other (mostly manually written) C extensions with slow >>> development pace, can be forward-compatible with whatever crazy things >>> Cython or Numba does. >>> >>> There's 4 cases: >>> >>> 1) JIT calls JIT (ruled out straight away) >>> >>> 2) JIT calls static: Say that Numba wants to optimize calls to np.sin >>> etc. >>> without special-casing; this seems to require reading a table of static >>> signatures >>> >>> 3) Static calls JIT: This is the case when scipy.integrate routines >>> call a >>> Numba callback and Numba generates a specialization for the dtype they >>> explicitly need. This calls for getfuncptr (but perhaps in a form which >>> we >>> can't quite determine yet?). >>> >>> 4) Static calls static: Either table or getfuncptr works. >>> >>> My gut feeling is go for 2) and 4) in this round => table.
>> >> >> getfuncptr is really simple and flexible, but I'm with you on both of >> these to points, and the overhead was not trivial. > > > It's interesting to hear you say the overhead was not trivial (that was my > hunch too but I sort of yielded to peer pressure). I think SAGE has some > history with this -- isn't one of the reasons for the "cpdef" vs. "cdef" > split that "cpdef" has the cost of a single lookup for the presence of a > __dict__ on the object, which was an unacceptable penalty for parts of Sage? > That can't have been much more than a 1ns penalty per instance. It's mostly historical, as a lot of Sage was written before cpdef existed (and people following this pattern after the fact). There are also some cases where cdef is used because the "leaf" classes are often in Python but have no need to override the given method, and an actual dictionary lookup would be required otherwise (e.g. in the coercion model). >> Of course we could offer both, i.e. look at the table first, if it's >> not there call getfuncptr if it's non-null, then fall back to "slow" >> call or error. These are all opt-in depending on how hard you want to >> try to optimize things. > > > That's actually exactly what I was envisioning -- in time (with JITs on both > ends) the table could act sort of as a cache for commonly used overloads, > and getfuncptr would access the others more slowly. OK, then +1 >> As far as keys vs. interning, I'm also tempted to try to have my cake >> and eat it too. Define a space-friendly encoding for signatures and >> require interning for anything that doesn't fit into a single >> sizeof(void*). The fact that this cutoff would vary for 32 vs 64-bit >> would require some care, but could be done with macros in C. If the >> signatures produce non-aligned "pointer" values there won't be any >> collisions, and this way libraries only have to share in the global >> (Python-level?) interning scheme iff they want to expose/use "large" >> signatures. > > > That was the approach I described to Nathaniel as having the "worst features > of both" -- lack of readable gdb dumps of the keys, and having to define an > interning mechanism for use by the 5% cases that don't fit. Yes, it has the best and worst features of both. > To sum up hat's been said earlier: The only thing that would blow the key > size above 64 bits except very many arguments would be things like > classes/interfaces/vtables. But in that case, reasonable-sized keys for the > vtables can be computed (whether by interning, cryptographic hashing, or a > GUID like Microsoft COM). > > So I'm still +1 on my proposal; but I would be happy with an intern-based > proposal if somebody bothers to flesh it out a bit (I don't quite know how > I'd do it and would get lost in PyObject* vs. char* and cross-language state > sharing...). > > My proposal in summary: > > ?- Table with variable-sized entries (not getfuncptr, not interning) that > can be scanned by the caller in 128-bit increments. > > ?- Only use 64 bit pointers, in order to keep table format the same on 32 > bit and 64 bit. > > ?- Do encoding of the signature strings. Utility functions to work with this > (both to scan tables and encode/decode a format string) will be provided as > C code by the CEP that can be bundled. 
>
> Pros:
>
>  - Table format is not specific to Python world (it makes as much sense to
> use, e.g., internally in Julia)
>
>  - No state needs to be shared between packages at run-time (they can use the
> bundled C code in isolation if they wish)
>
>  - No need for an interning machinery
>
>  - More easily compatible with multiple interpreter states (?)
>
>  - Minor performance benefit of table over getfuncptr (intern vs. key didn't
> matter). [Cue comment that this doesn't matter.]
>
> Cons:
>
>  - Lack of instant low-level debuggability, like in the interned case (a
> human needs to run a function on the key constant to see what it corresponds
> to)
>
>  - Not as extendable as getfuncptr (though currently we don't quite know how
> we would extend it, and it's easy to add getfuncptr in the future)
>
> Notes:
>
>  - When extended to handle vtable argument types, there still needs to be
> some interning or crypto-hashing. But that is likely to come up anyway as
> part of a COM-like queryInterface protocol, and at that point we will be
> better at making those decisions and design a good interning mechanism.

+1 to going with this, with the following suggestions for future
interoperability:

1) Even if we don't flesh out getfuncptr at this point, let's leave a slot
in the spec for it which must be set to NULL.

2) Let's define the encoding to emit odd first words, to allow using
(aligned) pointers in some future interning extension without worrying
about collision. This could be used to prevent matching on the 2n+1th
words as well when scanning the table.

- Robert

From ian.h.bell at gmail.com  Sat May  5 11:55:38 2012
From: ian.h.bell at gmail.com (Ian Bell)
Date: Sat, 5 May 2012 21:55:38 +1200
Subject: [Cython] Suggestion of adding working examples to website
Message-ID:

One "feature" that matplotlib (Python's 2D plotting library) has which makes it easy to jump into matplotlib is the huge section of working examples: http://matplotlib.sourceforge.net/examples/index.html and http://matplotlib.sourceforge.net/gallery.html . From this, within a couple of days you can get minimally proficient with matplotlib.

Having been (and continuing to be) a new user of Cython, I have found the learning curve to be very steep. The documentation online is pretty good (though it could use some work in places). Sometimes all it would take would be some working examples and the documentation would be completely clear. I taught myself the use of matplotlib through the old cut&paste and iterate method.

I find that the one thing that is consistently the most challenging about the Cython docs is the lack of distutils setup.py files for more interesting configurations. Without them it requires a certain amount of guessing, playing, and Googling to make sense of how the pieces are supposed to go together. A working examples section could be VERY helpful in this regard. Also, the options for the distutils extensions are not documented at all so far as I can tell. Since the docs were built with Sphinx, it ought to be pretty easy to pull in docstrings if they exist.

Just my 2 cents. I would be happy to work with you all to compile some simple examples for common uses - like the numpy convolve example for instance, and the integration example as well.

Regards,
Ian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From markflorisson88 at gmail.com Sat May 5 13:08:59 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sat, 5 May 2012 12:08:59 +0100 Subject: [Cython] CEP1000: Native dispatch through callables In-Reply-To: <4FA27900.80101@astro.uio.no> References: <4F87530F.7050000@astro.uio.no> <4F8D6112.1000906@astro.uio.no> <4F8F33AE.50401@astro.uio.no> <4F8F38F9.7020008@astro.uio.no> <4F8FEF7A.2090501@astro.uio.no> <4F9010B0.5080100@astro.uio.no> <4FA27900.80101@astro.uio.no> Message-ID: On 3 May 2012 13:24, Dag Sverre Seljebotn wrote: > I'm afraid I'm going to try to kick this thread alive again. I want us to > have something that Travis can implement in numba and "his" portion of > SciPy, and also that could be used by NumPy devs. > > Since the decisions are rather arbitrary, perhaps we can try to quickly get > to the "+1" stage (or, depending on how things turn out, a tournament > starting with at most one proposal per person). > > > On 04/20/2012 09:30 AM, Robert Bradshaw wrote: >> >> On Thu, Apr 19, 2012 at 6:18 AM, Dag Sverre Seljebotn >> ?wrote: >>> >>> On 04/19/2012 01:20 PM, Nathaniel Smith wrote: >>>> >>>> >>>> On Thu, Apr 19, 2012 at 11:56 AM, Dag Sverre Seljebotn >>>> ? ?wrote: >>>>> >>>>> >>>>> I thought of some drawbacks of getfuncptr: >>>>> >>>>> ?- Important: Doesn't allow you to actually inspect the supported >>>>> signatures, which is needed (or at least convenient) if you want to use >>>>> an >>>>> FFI library or do some JIT-ing. So an iteration mechanism is still >>>>> needed >>>>> in >>>>> addition, meaning the number of things for the object to implement >>>>> grows >>>>> a >>>>> bit large. Default implementations help -- OTOH there really wasn't a >>>>> major >>>>> drawback with the table approach as long as JIT's can just replace it? >>>> >>>> >>>> >>>> But this is orthogonal to the table vs. getfuncptr discussion. We're >>>> assuming that the table might be extended at runtime, which means you >>>> can't use it to determine which signatures are supported. So we need >>>> some sort of extra interface for the caller and callee to negotiate a >>>> type anyway. (I'm intentionally agnostic about whether it makes more >>>> sense for the caller or the callee to be doing the iterating... in >>>> general type negotiation could be quite complicated, and I don't think >>>> we know enough to get that interface right yet.) >>> >>> >>> >>> Hmm. Right. Let's define an explicit goal for the CEP then. >>> >>> What I care about at is getting the spec right enough such that, e.g., >>> NumPy >>> and SciPy, and other (mostly manually written) C extensions with slow >>> development pace, can be forward-compatible with whatever crazy things >>> Cython or Numba does. >>> >>> There's 4 cases: >>> >>> ?1) JIT calls JIT (ruled out straight away) >>> >>> ?2) JIT calls static: Say that Numba wants to optimize calls to np.sin >>> etc. >>> without special-casing; this seem to require reading a table of static >>> signatures >>> >>> ?3) Static calls JIT: This is the case when scipy.integrate routines >>> calls a >>> Numba callback and Numba generates a specialization for the dtype they >>> explicitly needs. This calls for getfuncptr (but perhaps in a form which >>> we >>> can't quite determine yet?). >>> >>> ?4) Static calls static: Either table or getfuncptr works. >>> >>> My gut feeling is go for 2) and 4) in this round => ?table. >> >> >> getfuncptr is really simple and flexible, but I'm with you on both of >> these to points, and the overhead was not trivial. 
> > > It's interesting to hear you say the overhead was not trivial (that was my > hunch too but I sort of yielded to peer pressure). I think SAGE has some > history with this -- isn't one of the reasons for the "cpdef" vs. "cdef" > split that "cpdef" has the cost of a single lookup for the presence of a > __dict__ on the object, which was an unacceptable penalty for parts of Sage? > That can't have been much more than a 1ns penalty per instance. > > >> Of course we could offer both, i.e. look at the table first, if it's >> not there call getfuncptr if it's non-null, then fall back to "slow" >> call or error. These are all opt-in depending on how hard you want to >> try to optimize things. > > > That's actually exactly what I was envisioning -- in time (with JITs on both > ends) the table could act sort of as a cache for commonly used overloads, > and getfuncptr would access the others more slowly. > > >> As far as keys vs. interning, I'm also tempted to try to have my cake >> and eat it too. Define a space-friendly encoding for signatures and >> require interning for anything that doesn't fit into a single >> sizeof(void*). The fact that this cutoff would vary for 32 vs 64-bit >> would require some care, but could be done with macros in C. If the >> signatures produce non-aligned "pointer" values there won't be any >> collisions, and this way libraries only have to share in the global >> (Python-level?) interning scheme iff they want to expose/use "large" >> signatures. > > > That was the approach I described to Nathaniel as having the "worst features > of both" -- lack of readable gdb dumps of the keys, and having to define an > interning mechanism for use by the 5% cases that don't fit. > > To sum up hat's been said earlier: The only thing that would blow the key > size above 64 bits except very many arguments would be things like > classes/interfaces/vtables. But in that case, reasonable-sized keys for the > vtables can be computed (whether by interning, cryptographic hashing, or a > GUID like Microsoft COM). > > So I'm still +1 on my proposal; but I would be happy with an intern-based > proposal if somebody bothers to flesh it out a bit (I don't quite know how > I'd do it and would get lost in PyObject* vs. char* and cross-language state > sharing...). > > My proposal in summary: > > ?- Table with variable-sized entries (not getfuncptr, not interning) that > can be scanned by the caller in 128-bit increments. Hm, so the caller knows what kind of key it needs to compare to, so if it has a 64 bits key then it won't need to compare 128 bits (padded with zeroes?). But if it doesn't compare 128 bits, then it means 128 bit keys cannot have 64 bit keys as prefix. Will that be a problem, or would it make sense to make the first entry a pointer pointing to 128 bit keys, and the rest are all 64 bit keys (or even 32 bit keys and two pointers)? e.g. a contiguous list of [128 bit key/pointer list-pointer, 64-bit keys & func pointers, 128 bit keys & func pointers, NULL] Even with a naive encoding scheme you could encode 3 scalar arguments and a return value in 32 bits (e.g. 'dddd'). That might be better on x86? > ?- Only use 64 bit pointers, in order to keep table format the same on 32 > bit and 64 bit. Pointer to the function? I think that would only be harder to use than native pointers? > ?- Do encoding of the signature strings. Utility functions to work with this > (both to scan tables and encode/decode a format string) will be provided as > C code by the CEP that can be bundled. 
> > Pros: > > ?- Table format is not specific to Python world (it makes as much sense to > use, e.g., internally in Julia) > > ?- No state needs to be shared between packages run-time (they can use the > bundled C code in isolation if they wish) > > ?- No need for an interning machinery > > ?- More easily compatible with multiple interpreter states (?) > > ?- Minor performance benefit of table over getfuncptr (intern vs. key didn't > matter). [Cue comment that this doesn't matter.] > > Cons: > > ?- Lack of instant low-level debuggability, like in the interned case (a > human needs to run a function on the key constant to see what it corresponds > to) > > ?- Not as extendable as getfuncptr (though currently we don't quite know how > we would extend it, and it's easy to add getfuncptr in the future) > > Notes: > > ?- When extended to handle vtable argument types, these still needs to be > some interning or crypto-hashing. But that is likely to come up anyway as > part of a COM-like queryInterface protocol, and at that point we will be > better at making those decisions and design a good interning mechanism. > > Dag > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From d.s.seljebotn at astro.uio.no Sat May 5 18:27:11 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sat, 05 May 2012 18:27:11 +0200 Subject: [Cython] CEP1000: Native dispatch through callables In-Reply-To: References: <4F87530F.7050000@astro.uio.no> <4F8D6112.1000906@astro.uio.no> <4F8F33AE.50401@astro.uio.no> <4F8F38F9.7020008@astro.uio.no> <4F8FEF7A.2090501@astro.uio.no> <4F9010B0.5080100@astro.uio.no> <4FA27900.80101@astro.uio.no> Message-ID: <4FA554DF.9040702@astro.uio.no> On 05/05/2012 01:08 PM, mark florisson wrote: > On 3 May 2012 13:24, Dag Sverre Seljebotn wrote: >> I'm afraid I'm going to try to kick this thread alive again. I want us to >> have something that Travis can implement in numba and "his" portion of >> SciPy, and also that could be used by NumPy devs. >> >> Since the decisions are rather arbitrary, perhaps we can try to quickly get >> to the "+1" stage (or, depending on how things turn out, a tournament >> starting with at most one proposal per person). >> >> >> On 04/20/2012 09:30 AM, Robert Bradshaw wrote: >>> >>> On Thu, Apr 19, 2012 at 6:18 AM, Dag Sverre Seljebotn >>> wrote: >>>> >>>> On 04/19/2012 01:20 PM, Nathaniel Smith wrote: >>>>> >>>>> >>>>> On Thu, Apr 19, 2012 at 11:56 AM, Dag Sverre Seljebotn >>>>> wrote: >>>>>> >>>>>> >>>>>> I thought of some drawbacks of getfuncptr: >>>>>> >>>>>> - Important: Doesn't allow you to actually inspect the supported >>>>>> signatures, which is needed (or at least convenient) if you want to use >>>>>> an >>>>>> FFI library or do some JIT-ing. So an iteration mechanism is still >>>>>> needed >>>>>> in >>>>>> addition, meaning the number of things for the object to implement >>>>>> grows >>>>>> a >>>>>> bit large. Default implementations help -- OTOH there really wasn't a >>>>>> major >>>>>> drawback with the table approach as long as JIT's can just replace it? >>>>> >>>>> >>>>> >>>>> But this is orthogonal to the table vs. getfuncptr discussion. We're >>>>> assuming that the table might be extended at runtime, which means you >>>>> can't use it to determine which signatures are supported. So we need >>>>> some sort of extra interface for the caller and callee to negotiate a >>>>> type anyway. 
(I'm intentionally agnostic about whether it makes more >>>>> sense for the caller or the callee to be doing the iterating... in >>>>> general type negotiation could be quite complicated, and I don't think >>>>> we know enough to get that interface right yet.) >>>> >>>> >>>> >>>> Hmm. Right. Let's define an explicit goal for the CEP then. >>>> >>>> What I care about at is getting the spec right enough such that, e.g., >>>> NumPy >>>> and SciPy, and other (mostly manually written) C extensions with slow >>>> development pace, can be forward-compatible with whatever crazy things >>>> Cython or Numba does. >>>> >>>> There's 4 cases: >>>> >>>> 1) JIT calls JIT (ruled out straight away) >>>> >>>> 2) JIT calls static: Say that Numba wants to optimize calls to np.sin >>>> etc. >>>> without special-casing; this seem to require reading a table of static >>>> signatures >>>> >>>> 3) Static calls JIT: This is the case when scipy.integrate routines >>>> calls a >>>> Numba callback and Numba generates a specialization for the dtype they >>>> explicitly needs. This calls for getfuncptr (but perhaps in a form which >>>> we >>>> can't quite determine yet?). >>>> >>>> 4) Static calls static: Either table or getfuncptr works. >>>> >>>> My gut feeling is go for 2) and 4) in this round => table. >>> >>> >>> getfuncptr is really simple and flexible, but I'm with you on both of >>> these to points, and the overhead was not trivial. >> >> >> It's interesting to hear you say the overhead was not trivial (that was my >> hunch too but I sort of yielded to peer pressure). I think SAGE has some >> history with this -- isn't one of the reasons for the "cpdef" vs. "cdef" >> split that "cpdef" has the cost of a single lookup for the presence of a >> __dict__ on the object, which was an unacceptable penalty for parts of Sage? >> That can't have been much more than a 1ns penalty per instance. >> >> >>> Of course we could offer both, i.e. look at the table first, if it's >>> not there call getfuncptr if it's non-null, then fall back to "slow" >>> call or error. These are all opt-in depending on how hard you want to >>> try to optimize things. >> >> >> That's actually exactly what I was envisioning -- in time (with JITs on both >> ends) the table could act sort of as a cache for commonly used overloads, >> and getfuncptr would access the others more slowly. >> >> >>> As far as keys vs. interning, I'm also tempted to try to have my cake >>> and eat it too. Define a space-friendly encoding for signatures and >>> require interning for anything that doesn't fit into a single >>> sizeof(void*). The fact that this cutoff would vary for 32 vs 64-bit >>> would require some care, but could be done with macros in C. If the >>> signatures produce non-aligned "pointer" values there won't be any >>> collisions, and this way libraries only have to share in the global >>> (Python-level?) interning scheme iff they want to expose/use "large" >>> signatures. >> >> >> That was the approach I described to Nathaniel as having the "worst features >> of both" -- lack of readable gdb dumps of the keys, and having to define an >> interning mechanism for use by the 5% cases that don't fit. >> >> To sum up hat's been said earlier: The only thing that would blow the key >> size above 64 bits except very many arguments would be things like >> classes/interfaces/vtables. But in that case, reasonable-sized keys for the >> vtables can be computed (whether by interning, cryptographic hashing, or a >> GUID like Microsoft COM). 
>> >> So I'm still +1 on my proposal; but I would be happy with an intern-based >> proposal if somebody bothers to flesh it out a bit (I don't quite know how >> I'd do it and would get lost in PyObject* vs. char* and cross-language state >> sharing...). >> >> My proposal in summary: >> >> - Table with variable-sized entries (not getfuncptr, not interning) that >> can be scanned by the caller in 128-bit increments. > > Hm, so the caller knows what kind of key it needs to compare to, so if > it has a 64 bits key then it won't need to compare 128 bits (padded > with zeroes?). But if it doesn't compare 128 bits, then it means 128 > bit keys cannot have 64 bit keys as prefix. Will that be a problem, or Did you read the CEP? I also clarified this in a post in response to Nathaniel. The idea is that the scanner doesn't need to branch on the key-length anywhere. This requires a) making each key n*64 bits long where n is odd => function pointers are always at (m*128 + 64) bits from the start for some non-negative integer m, b) insert some protective prefix for every 128 bits in the key. > would it make sense to make the first entry a pointer pointing to 128 > bit keys, and the rest are all 64 bit keys (or even 32 bit keys and > two pointers)? e.g. a contiguous list of [128 bit key/pointer > list-pointer, 64-bit keys& func pointers, 128 bit keys& func > pointers, NULL] I don't really understand this description, but in general I'm sceptical about the pipelining abilities of pointer-chasing code. It may be OK, but it would require a benchmark, and if there's not a reason to have it... > Even with a naive encoding scheme you could encode 3 scalar arguments > and a return value in 32 bits (e.g. 'dddd'). That might be better on > x86? Me and Robert have been assuming some non-ASCII encoding that would allow many more arguments in 64 bits. > >> - Only use 64 bit pointers, in order to keep table format the same on 32 >> bit and 64 bit. > > Pointer to the function? I think that would only be harder to use than > native pointers? That was to make the multiple-of-128-bit-entry idea work without having to require that keys are different between 32 bits and 64 bits platforms. Dag >> - Do encoding of the signature strings. Utility functions to work with this >> (both to scan tables and encode/decode a format string) will be provided as >> C code by the CEP that can be bundled. >> >> Pros: >> >> - Table format is not specific to Python world (it makes as much sense to >> use, e.g., internally in Julia) >> >> - No state needs to be shared between packages run-time (they can use the >> bundled C code in isolation if they wish) >> >> - No need for an interning machinery >> >> - More easily compatible with multiple interpreter states (?) >> >> - Minor performance benefit of table over getfuncptr (intern vs. key didn't >> matter). [Cue comment that this doesn't matter.] >> >> Cons: >> >> - Lack of instant low-level debuggability, like in the interned case (a >> human needs to run a function on the key constant to see what it corresponds >> to) >> >> - Not as extendable as getfuncptr (though currently we don't quite know how >> we would extend it, and it's easy to add getfuncptr in the future) >> >> Notes: >> >> - When extended to handle vtable argument types, these still needs to be >> some interning or crypto-hashing. But that is likely to come up anyway as >> part of a COM-like queryInterface protocol, and at that point we will be >> better at making those decisions and design a good interning mechanism. 
>> >> Dag >> >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Sat May 5 19:13:04 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sat, 5 May 2012 18:13:04 +0100 Subject: [Cython] CEP1000: Native dispatch through callables In-Reply-To: <4FA554DF.9040702@astro.uio.no> References: <4F87530F.7050000@astro.uio.no> <4F8D6112.1000906@astro.uio.no> <4F8F33AE.50401@astro.uio.no> <4F8F38F9.7020008@astro.uio.no> <4F8FEF7A.2090501@astro.uio.no> <4F9010B0.5080100@astro.uio.no> <4FA27900.80101@astro.uio.no> <4FA554DF.9040702@astro.uio.no> Message-ID: On 5 May 2012 17:27, Dag Sverre Seljebotn wrote: > On 05/05/2012 01:08 PM, mark florisson wrote: >> >> On 3 May 2012 13:24, Dag Sverre Seljebotn >> ?wrote: >>> >>> I'm afraid I'm going to try to kick this thread alive again. I want us to >>> have something that Travis can implement in numba and "his" portion of >>> SciPy, and also that could be used by NumPy devs. >>> >>> Since the decisions are rather arbitrary, perhaps we can try to quickly >>> get >>> to the "+1" stage (or, depending on how things turn out, a tournament >>> starting with at most one proposal per person). >>> >>> >>> On 04/20/2012 09:30 AM, Robert Bradshaw wrote: >>>> >>>> >>>> On Thu, Apr 19, 2012 at 6:18 AM, Dag Sverre Seljebotn >>>> ? ?wrote: >>>>> >>>>> >>>>> On 04/19/2012 01:20 PM, Nathaniel Smith wrote: >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Apr 19, 2012 at 11:56 AM, Dag Sverre Seljebotn >>>>>> ? ? ?wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> I thought of some drawbacks of getfuncptr: >>>>>>> >>>>>>> ?- Important: Doesn't allow you to actually inspect the supported >>>>>>> signatures, which is needed (or at least convenient) if you want to >>>>>>> use >>>>>>> an >>>>>>> FFI library or do some JIT-ing. So an iteration mechanism is still >>>>>>> needed >>>>>>> in >>>>>>> addition, meaning the number of things for the object to implement >>>>>>> grows >>>>>>> a >>>>>>> bit large. Default implementations help -- OTOH there really wasn't a >>>>>>> major >>>>>>> drawback with the table approach as long as JIT's can just replace >>>>>>> it? >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> But this is orthogonal to the table vs. getfuncptr discussion. We're >>>>>> assuming that the table might be extended at runtime, which means you >>>>>> can't use it to determine which signatures are supported. So we need >>>>>> some sort of extra interface for the caller and callee to negotiate a >>>>>> type anyway. (I'm intentionally agnostic about whether it makes more >>>>>> sense for the caller or the callee to be doing the iterating... in >>>>>> general type negotiation could be quite complicated, and I don't think >>>>>> we know enough to get that interface right yet.) >>>>> >>>>> >>>>> >>>>> >>>>> Hmm. Right. Let's define an explicit goal for the CEP then. >>>>> >>>>> What I care about at is getting the spec right enough such that, e.g., >>>>> NumPy >>>>> and SciPy, and other (mostly manually written) C extensions with slow >>>>> development pace, can be forward-compatible with whatever crazy things >>>>> Cython or Numba does. 
>>>>> >>>>> There's 4 cases: >>>>> >>>>> ?1) JIT calls JIT (ruled out straight away) >>>>> >>>>> ?2) JIT calls static: Say that Numba wants to optimize calls to np.sin >>>>> etc. >>>>> without special-casing; this seem to require reading a table of static >>>>> signatures >>>>> >>>>> ?3) Static calls JIT: This is the case when scipy.integrate routines >>>>> calls a >>>>> Numba callback and Numba generates a specialization for the dtype they >>>>> explicitly needs. This calls for getfuncptr (but perhaps in a form >>>>> which >>>>> we >>>>> can't quite determine yet?). >>>>> >>>>> ?4) Static calls static: Either table or getfuncptr works. >>>>> >>>>> My gut feeling is go for 2) and 4) in this round => ? ?table. >>>> >>>> >>>> >>>> getfuncptr is really simple and flexible, but I'm with you on both of >>>> these to points, and the overhead was not trivial. >>> >>> >>> >>> It's interesting to hear you say the overhead was not trivial (that was >>> my >>> hunch too but I sort of yielded to peer pressure). I think SAGE has some >>> history with this -- isn't one of the reasons for the "cpdef" vs. "cdef" >>> split that "cpdef" has the cost of a single lookup for the presence of a >>> __dict__ on the object, which was an unacceptable penalty for parts of >>> Sage? >>> That can't have been much more than a 1ns penalty per instance. >>> >>> >>>> Of course we could offer both, i.e. look at the table first, if it's >>>> not there call getfuncptr if it's non-null, then fall back to "slow" >>>> call or error. These are all opt-in depending on how hard you want to >>>> try to optimize things. >>> >>> >>> >>> That's actually exactly what I was envisioning -- in time (with JITs on >>> both >>> ends) the table could act sort of as a cache for commonly used overloads, >>> and getfuncptr would access the others more slowly. >>> >>> >>>> As far as keys vs. interning, I'm also tempted to try to have my cake >>>> and eat it too. Define a space-friendly encoding for signatures and >>>> require interning for anything that doesn't fit into a single >>>> sizeof(void*). The fact that this cutoff would vary for 32 vs 64-bit >>>> would require some care, but could be done with macros in C. If the >>>> signatures produce non-aligned "pointer" values there won't be any >>>> collisions, and this way libraries only have to share in the global >>>> (Python-level?) interning scheme iff they want to expose/use "large" >>>> signatures. >>> >>> >>> >>> That was the approach I described to Nathaniel as having the "worst >>> features >>> of both" -- lack of readable gdb dumps of the keys, and having to define >>> an >>> interning mechanism for use by the 5% cases that don't fit. >>> >>> To sum up hat's been said earlier: The only thing that would blow the key >>> size above 64 bits except very many arguments would be things like >>> classes/interfaces/vtables. But in that case, reasonable-sized keys for >>> the >>> vtables can be computed (whether by interning, cryptographic hashing, or >>> a >>> GUID like Microsoft COM). >>> >>> So I'm still +1 on my proposal; but I would be happy with an intern-based >>> proposal if somebody bothers to flesh it out a bit (I don't quite know >>> how >>> I'd do it and would get lost in PyObject* vs. char* and cross-language >>> state >>> sharing...). >>> >>> My proposal in summary: >>> >>> ?- Table with variable-sized entries (not getfuncptr, not interning) that >>> can be scanned by the caller in 128-bit increments. 
>> >> >> Hm, so the caller knows what kind of key it needs to compare to, so if >> it has a 64 bits key then it won't need to compare 128 bits (padded >> with zeroes?). But if it doesn't compare 128 bits, then it means 128 >> bit keys cannot have 64 bit keys as prefix. Will that be a problem, or > > > Did you read the CEP? I also clarified this in a post in response to > Nathaniel. The idea is that the scanner doesn't need to branch on the > key-length anywhere. This requires a) making each key n*64 bits long where n > is odd => function pointers are always at (m*128 + 64) bits from the start > for some non-negative integer m, b) insert some protective prefix for every > 128 bits in the key. > > Oh sorry, I didn't read the updated CEP, you want arbitrarily sized keys. I assumed you would hash any key larger than 64 bits to a 128 bit key (e.g. md5). For instance if you have a large(r) number of signatures, some of which are complex (greater than 64 bits, so hashed) and some of which are simple, then if you know the signature you need in advance, you can either follow the pointer to the 128 bit keys, or skip the pointer entirely and continue with the 64 bit keys. I suppose the common case is a few signatures, in which case a linear scan is likely faster in the 128 bit case (which is the uncommon case). >> would it make sense to make the first entry a pointer pointing to 128 >> bit keys, and the rest are all 64 bit keys (or even 32 bit keys and >> two pointers)? e.g. a contiguous list of [128 bit key/pointer >> list-pointer, 64-bit keys& ?func pointers, 128 bit keys& ?func >> pointers, NULL] > > > I don't really understand this description, but in general I'm sceptical > about the pipelining abilities of pointer-chasing code. It may be OK, but it > would require a benchmark, and if there's not a reason to have it... > > >> Even with a naive encoding scheme you could encode 3 scalar arguments >> and a return value in 32 bits (e.g. 'dddd'). That might be better on >> x86? > > > Me and Robert have been assuming some non-ASCII encoding that would allow > many more arguments in 64 bits. > Sure. The point was that the worst encoding scheme can already serve the common case. >> >>> ?- Only use 64 bit pointers, in order to keep table format the same on 32 >>> bit and 64 bit. >> >> >> Pointer to the function? I think that would only be harder to use than >> native pointers? > > > That was to make the multiple-of-128-bit-entry idea work without having to > require that keys are different between 32 bits and 64 bits platforms. Right, I got that now (reading the CEP is kind of mandatory :). Thanks. > Dag > > >>> ?- Do encoding of the signature strings. Utility functions to work with >>> this >>> (both to scan tables and encode/decode a format string) will be provided >>> as >>> C code by the CEP that can be bundled. >>> >>> Pros: >>> >>> ?- Table format is not specific to Python world (it makes as much sense >>> to >>> use, e.g., internally in Julia) >>> >>> ?- No state needs to be shared between packages run-time (they can use >>> the >>> bundled C code in isolation if they wish) >>> >>> ?- No need for an interning machinery >>> >>> ?- More easily compatible with multiple interpreter states (?) >>> >>> ?- Minor performance benefit of table over getfuncptr (intern vs. key >>> didn't >>> matter). [Cue comment that this doesn't matter.] 
>>>
>>> Cons:
>>>
>>>  - Lack of instant low-level debuggability, like in the interned case (a
>>> human needs to run a function on the key constant to see what it
>>> corresponds
>>> to)
>>>
>>>  - Not as extendable as getfuncptr (though currently we don't quite know
>>> how
>>> we would extend it, and it's easy to add getfuncptr in the future)
>>>
>>> Notes:
>>>
>>>  - When extended to handle vtable argument types, there still needs to be
>>> some interning or crypto-hashing. But that is likely to come up anyway as
>>> part of a COM-like queryInterface protocol, and at that point we will be
>>> better at making those decisions and design a good interning mechanism.
>>>
>>> Dag
>>>
>>> _______________________________________________
>>> cython-devel mailing list
>>> cython-devel at python.org
>>> http://mail.python.org/mailman/listinfo/cython-devel
>>
>> _______________________________________________
>> cython-devel mailing list
>> cython-devel at python.org
>> http://mail.python.org/mailman/listinfo/cython-devel
>
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

From stefan_ml at behnel.de  Sat May  5 21:50:51 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 05 May 2012 21:50:51 +0200
Subject: [Cython] Python array support (#113)
In-Reply-To:
References:
Message-ID: <4FA5849B.5090004@behnel.de>

> https://github.com/cython/cython/pull/113

This looks ok to me now. There have been objections back when we discussed the initial patch for array.array support, so what do you think about merging this in?

Stefan

From stefan_ml at behnel.de  Sun May  6 07:42:18 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 06 May 2012 07:42:18 +0200
Subject: [Cython] Python array support (#113)
In-Reply-To: <4FA5849B.5090004@behnel.de>
References: <4FA5849B.5090004@behnel.de>
Message-ID: <4FA60F3A.2020101@behnel.de>

Stefan Behnel, 05.05.2012 21:50:
>> https://github.com/cython/cython/pull/113
>
> This looks ok to me now. There have been objections back when we discussed
> the initial patch for array.array support, so what do you think about
> merging this in?

One thing I'm not sure about is how to deal with the header file. It would be nice to not rely on an external dependency that users need to ship with their code. Moving this into our utility code to write it into the C file would remove that need, but we don't currently have a way to trigger utility code insertion from .pxd files explicitly. Should we special case "cimport cpython.array" for this?

Oh, and maybe we should also provide a fused type for the supported array item types to make it easier for users to write generic array code? (Although the mass of types may be overkill for most users...)

Stefan

From robertwb at gmail.com  Sun May  6 09:16:03 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Sun, 6 May 2012 00:16:03 -0700
Subject: [Cython] Python array support (#113)
In-Reply-To: <4FA60F3A.2020101@behnel.de>
References: <4FA5849B.5090004@behnel.de> <4FA60F3A.2020101@behnel.de>
Message-ID:

On Sat, May 5, 2012 at 10:42 PM, Stefan Behnel wrote:
> Stefan Behnel, 05.05.2012 21:50:
>>> https://github.com/cython/cython/pull/113
>>
>> This looks ok to me now. There have been objections back when we discussed
>> the initial patch for array.array support, so what do you think about
>> merging this in?
>
> One thing I'm not sure about is how to deal with the header file.
It would > be nice to not rely on an external dependency that users need to ship with > their code. Moving this into our utility code to write it into the C file > would remove that need, but we don't currently have a way to trigger > utility code insertion from .pxd files explicitly. Should we special case > "cimport cpython.array" for this? That's my biggest concern as well (though I only quickly skimmed through the code). I would be OK with special casing this Python builtin. > Oh, and maybe we should also provide a fused type for the supported array > item types to make it easier for users to write generic array code? > (Although the mass of types may be overkill for most users...) I think it's easy enough for the end user to make a fused type of the specializations they care about. - Robert From markflorisson88 at gmail.com Sun May 6 10:56:17 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 09:56:17 +0100 Subject: [Cython] Python array support (#113) In-Reply-To: <4FA5849B.5090004@behnel.de> References: <4FA5849B.5090004@behnel.de> Message-ID: On 5 May 2012 20:50, Stefan Behnel wrote: >> ? https://github.com/cython/cython/pull/113 > > This looks ok to me now. There have been objections back when we discussed > the initial patch for array.array support, so what do you think about > merging this in? > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Great, I think it can be quite useful for some people. I think only some documentation is missing. Also, very minor complaint, the way it allocates shape, strides and the format string in __getbuffer__ is weird and complicated for no good reason. I think it's better to declare two Py_ssize_t scalars or one-sized arrays as class attributes and one char[2] array, and use those (then you can also get rid of __releasebuffer__). From stefan_ml at behnel.de Sun May 6 11:05:51 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 06 May 2012 11:05:51 +0200 Subject: [Cython] Python array support (#113) In-Reply-To: References: <4FA5849B.5090004@behnel.de> Message-ID: <4FA63EEF.5030604@behnel.de> mark florisson, 06.05.2012 10:56: > On 5 May 2012 20:50, Stefan Behnel wrote: >>> https://github.com/cython/cython/pull/113 >> >> This looks ok to me now. There have been objections back when we discussed >> the initial patch for array.array support, so what do you think about >> merging this in? > > Great, I think it can be quite useful for some people. I think only > some documentation is missing. > > Also, very minor complaint, the way it allocates shape, strides and > the format string in __getbuffer__ is weird and complicated for no > good reason. Maybe the reason is just that it wasn't written by you. ;) I take it that it's best to merge this pull request then, and to further fix it up afterwards. > I think it's better to declare two Py_ssize_t scalars or > one-sized arrays as class attributes and one char[2] array, and use > those (then you can also get rid of __releasebuffer__). Yes, that would be good. Note that itemsize is specifically designed so that it can be pointed to by strides for 1D arrays, and I guess shape can similarly just point to ob_size. 
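For illustration, an exporter along those lines could look roughly like this -- a minimal sketch with a made-up class name, not the actual cpython.array code, using fixed-size class attributes so that __getbuffer__ allocates nothing and no __releasebuffer__ is needed:

from cpython.buffer cimport PyBUF_ND, PyBUF_STRIDES, PyBUF_FORMAT
from cpython.mem cimport PyMem_Malloc, PyMem_Free

cdef class DoubleArray:
    cdef double* data
    cdef Py_ssize_t shape[1]      # fixed-size attributes: nothing to
    cdef Py_ssize_t strides[1]    # allocate per view, nothing to release
    cdef char fmt[2]

    def __cinit__(self, Py_ssize_t n):
        self.data = <double*> PyMem_Malloc(n * sizeof(double))
        if self.data == NULL:
            raise MemoryError()
        self.shape[0] = n
        self.strides[0] = sizeof(double)
        self.fmt[0] = c'd'
        self.fmt[1] = 0

    def __dealloc__(self):
        PyMem_Free(self.data)

    def __getbuffer__(self, Py_buffer* view, int flags):
        view.buf = <void*> self.data
        view.obj = self
        view.len = self.shape[0] * sizeof(double)
        view.readonly = 0
        view.ndim = 1
        view.itemsize = sizeof(double)
        view.suboffsets = NULL
        view.internal = NULL
        if (flags & PyBUF_ND) == PyBUF_ND:
            view.shape = self.shape
        else:
            view.shape = NULL
        if (flags & PyBUF_STRIDES) == PyBUF_STRIDES:
            view.strides = self.strides
        else:
            view.strides = NULL
        if (flags & PyBUF_FORMAT) == PyBUF_FORMAT:
            view.format = self.fmt
        else:
            view.format = NULL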
Stefan From markflorisson88 at gmail.com Sun May 6 11:05:58 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 10:05:58 +0100 Subject: [Cython] Python array support (#113) In-Reply-To: References: <4FA5849B.5090004@behnel.de> Message-ID: On 6 May 2012 09:56, mark florisson wrote: > On 5 May 2012 20:50, Stefan Behnel wrote: >>> ? https://github.com/cython/cython/pull/113 >> >> This looks ok to me now. There have been objections back when we discussed >> the initial patch for array.array support, so what do you think about >> merging this in? >> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > Great, I think it can be quite useful for some people. I think only > some documentation is missing. > > Also, very minor complaint, the way it allocates shape, strides and > the format string in __getbuffer__ is weird and complicated for no > good reason. I think it's better to declare two Py_ssize_t scalars or > one-sized arrays as class attributes and one char[2] array, and use > those (then you can also get rid of __releasebuffer__). Or hm, that might be a problem with their variable nature? Then the shape of the buffer would suddenly change... Maybe malloc is better in __getbuffer__ then. From markflorisson88 at gmail.com Sun May 6 11:07:14 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 10:07:14 +0100 Subject: [Cython] Python array support (#113) In-Reply-To: <4FA63EEF.5030604@behnel.de> References: <4FA5849B.5090004@behnel.de> <4FA63EEF.5030604@behnel.de> Message-ID: On 6 May 2012 10:05, Stefan Behnel wrote: > mark florisson, 06.05.2012 10:56: >> On 5 May 2012 20:50, Stefan Behnel wrote: >>>> ? https://github.com/cython/cython/pull/113 >>> >>> This looks ok to me now. There have been objections back when we discussed >>> the initial patch for array.array support, so what do you think about >>> merging this in? >> >> Great, I think it can be quite useful for some people. I think only >> some documentation is missing. >> >> Also, very minor complaint, the way it allocates shape, strides and >> the format string in __getbuffer__ is weird and complicated for no >> good reason. > > Maybe the reason is just that it wasn't written by you. ;) > > I take it that it's best to merge this pull request then, and to further > fix it up afterwards. > Definitely, +1. >> I think it's better to declare two Py_ssize_t scalars or >> one-sized arrays as class attributes and one char[2] array, and use >> those (then you can also get rid of __releasebuffer__). > > Yes, that would be good. Note that itemsize is specifically designed so > that it can be pointed to by strides for 1D arrays, and I guess shape can > similarly just point to ob_size. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Sun May 6 11:32:24 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 06 May 2012 11:32:24 +0200 Subject: [Cython] Python array support (#113) In-Reply-To: References: <4FA5849B.5090004@behnel.de> Message-ID: <4FA64528.5010305@behnel.de> mark florisson, 06.05.2012 11:05: > On 6 May 2012 09:56, mark florisson wrote: >> On 5 May 2012 20:50, Stefan Behnel wrote: >>>> https://github.com/cython/cython/pull/113 >>> >>> This looks ok to me now. 
There have been objections back when we discussed >>> the initial patch for array.array support, so what do you think about >>> merging this in? >> >> Great, I think it can be quite useful for some people. I think only >> some documentation is missing. >> >> Also, very minor complaint, the way it allocates shape, strides and >> the format string in __getbuffer__ is weird and complicated for no >> good reason. I think it's better to declare two Py_ssize_t scalars or >> one-sized arrays as class attributes and one char[2] array, and use >> those (then you can also get rid of __releasebuffer__). > > Or hm, that might be a problem with their variable nature? Then the > shape of the buffer would suddenly change... I'm fine with saying that any user who changes the size of an array while a buffer view on it is being held (and used) is just plain out of warranty. After all, a realloc() is allowed to move the memory buffer around and may really do it in some cases, so the length is the least of our problems, even if the array doesn't shrink but only grows. I just noticed that the array module supports the buffer interface natively in Python 3. That makes this whole patch somewhat less interesting, because it's essentially just a work-around for a missing feature in Py2. Py3 does the setup like this: view->buf = (void *)self->ob_item; view->obj = (PyObject*)self; Py_INCREF(self); if (view->buf == NULL) view->buf = (void *)emptybuf; view->len = (Py_SIZE(self)) * self->ob_descr->itemsize; view->readonly = 0; view->ndim = 1; view->itemsize = self->ob_descr->itemsize; view->suboffsets = NULL; view->shape = NULL; if ((flags & PyBUF_ND)==PyBUF_ND) { view->shape = &((Py_SIZE(self))); } view->strides = NULL; if ((flags & PyBUF_STRIDES)==PyBUF_STRIDES) view->strides = &(view->itemsize); view->format = NULL; view->internal = NULL; if ((flags & PyBUF_FORMAT) == PyBUF_FORMAT) view->format = self->ob_descr->formats; It also counts the number of exports and prevents resizing while a buffer is being exported. The current .pxd implementation cannot achieve that, which is really unfortunate. ISTM that it should fall through to the native buffer interface if the underlying array supports it. Stefan From markflorisson88 at gmail.com Sun May 6 16:28:43 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 15:28:43 +0100 Subject: [Cython] 0.17 Message-ID: Hey, I think we already have quite a bit of functionality (nearly) ready, after merging some pending pull requests maybe it will be a good time for a 0.17 release? I think it would be good to also document to what extent pypy support works, what works and what doesn't. Stefan, since you added a large majority of the features, would you want to be the release manager? In summary, the following pull requests should likely go in - array.array support (unless further discussion prevents that) - fused types runtime buffer dispatch - newaxis - more? The memoryview documentation should also be reworked a bit. Matthew, are you still willing to have a go at that? Otherwise I can clean up the mess first, some things are no longer true and simply outdated, and then have a second opinion. 
Mark From d.s.seljebotn at astro.uio.no Sun May 6 19:51:56 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 06 May 2012 19:51:56 +0200 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: <4FA6BA3C.3030001@astro.uio.no> On 05/06/2012 04:28 PM, mark florisson wrote: > Hey, > > I think we already have quite a bit of functionality (nearly) ready, > after merging some pending pull requests maybe it will be a good time > for a 0.17 release? I think it would be good to also document to what > extent pypy support works, what works and what doesn't. Stefan, since > you added a large majority of the features, would you want to be the > release manager? > > In summary, the following pull requests should likely go in > - array.array support (unless further discussion prevents that) > - fused types runtime buffer dispatch > - newaxis > - more? Sounds more like a 0.16.1? (Did we have any rules for that -- except the obvious one that breaking backwards compatibility in noticeable ways has to increment the major?) Dag > > The memoryview documentation should also be reworked a bit. Matthew, > are you still willing to have a go at that? Otherwise I can clean up > the mess first, some things are no longer true and simply outdated, > and then have a second opinion. > > Mark > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Sun May 6 20:22:51 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 06 May 2012 20:22:51 +0200 Subject: [Cython] 0.17 In-Reply-To: <4FA6BA3C.3030001@astro.uio.no> References: <4FA6BA3C.3030001@astro.uio.no> Message-ID: <4FA6C17B.1080303@behnel.de> Dag Sverre Seljebotn, 06.05.2012 19:51: > On 05/06/2012 04:28 PM, mark florisson wrote: >> I think we already have quite a bit of functionality (nearly) ready, >> after merging some pending pull requests maybe it will be a good time >> for a 0.17 release? I think it would be good to also document to what >> extent pypy support works, what works and what doesn't. Stefan, since >> you added a large majority of the features, would you want to be the >> release manager? >> >> In summary, the following pull requests should likely go in >> - array.array support (unless further discussion prevents that) >> - fused types runtime buffer dispatch >> - newaxis >> - more? > > > Sounds more like a 0.16.1? (Did we have any rules for that -- except the > obvious one that breaking backwards compatibility in noticeable ways has to > increment the major?) Those are only the pending pull requests, the current feature set in the master branch is way larger than that. I'll start writing up the release notes soon. Stefan From vitja.makarov at gmail.com Sun May 6 20:29:30 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Sun, 6 May 2012 22:29:30 +0400 Subject: [Cython] 0.17 In-Reply-To: <4FA6BA3C.3030001@astro.uio.no> References: <4FA6BA3C.3030001@astro.uio.no> Message-ID: 2012/5/6 Dag Sverre Seljebotn : > On 05/06/2012 04:28 PM, mark florisson wrote: >> >> Hey, >> >> I think we already have quite a bit of functionality (nearly) ready, >> after merging some pending pull requests maybe it will be a good time >> for a 0.17 release? I think it would be good to also document to what >> extent pypy support works, what works and what doesn't. Stefan, since >> you added a large majority of the features, would you want to be the >> release manager? 
>> >> In summary, the following pull requests should likely go in >> ? ? - array.array support (unless further discussion prevents that) >> ? ? - fused types runtime buffer dispatch >> ? ? - newaxis >> ? ? - more? > > > > Sounds more like a 0.16.1? (Did we have any rules for that -- except the > obvious one that breaking backwards compatibility in noticeable ways has to > increment the major?) > > Dag > > >> >> The memoryview documentation should also be reworked a bit. Matthew, >> are you still willing to have a go at that? Otherwise I can clean up >> the mess first, some things are no longer true and simply outdated, >> and then have a second opinion. >> +1, I think that 0.16 has some bugs and we should make a bugfix release. -- vitja. From stefan_ml at behnel.de Sun May 6 20:38:15 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 06 May 2012 20:38:15 +0200 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: <4FA6C517.8080208@behnel.de> mark florisson, 06.05.2012 16:28: > I think we already have quite a bit of functionality (nearly) ready, > after merging some pending pull requests maybe it will be a good time > for a 0.17 release? I think it would be good to also document to what > extent pypy support works, what works and what doesn't. Sure, although it's basically just "what works, works". However, there are certainly things that users must know in order to make their own code work. > Stefan, since > you added a large majority of the features, would you want to be the > release manager? I agree that it would make sense, but I'll be head under water for the next month or so. Maybe it would be better to put out a 0.16.1 with only selected fixes in the meantime? Stefan From markflorisson88 at gmail.com Sun May 6 20:41:04 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 19:41:04 +0100 Subject: [Cython] 0.17 In-Reply-To: References: <4FA6BA3C.3030001@astro.uio.no> Message-ID: On 6 May 2012 19:29, Vitja Makarov wrote: > 2012/5/6 Dag Sverre Seljebotn : >> On 05/06/2012 04:28 PM, mark florisson wrote: >>> >>> Hey, >>> >>> I think we already have quite a bit of functionality (nearly) ready, >>> after merging some pending pull requests maybe it will be a good time >>> for a 0.17 release? I think it would be good to also document to what >>> extent pypy support works, what works and what doesn't. Stefan, since >>> you added a large majority of the features, would you want to be the >>> release manager? >>> >>> In summary, the following pull requests should likely go in >>> ? ? - array.array support (unless further discussion prevents that) >>> ? ? - fused types runtime buffer dispatch >>> ? ? - newaxis >>> ? ? - more? >> >> >> >> Sounds more like a 0.16.1? (Did we have any rules for that -- except the >> obvious one that breaking backwards compatibility in noticeable ways has to >> increment the major?) >> >> Dag >> >> >>> >>> The memoryview documentation should also be reworked a bit. Matthew, >>> are you still willing to have a go at that? Otherwise I can clean up >>> the mess first, some things are no longer true and simply outdated, >>> and then have a second opinion. >>> > > +1, I think that 0.16 has some bugs and we should make a bugfix release. > > -- > vitja. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Stefan bumped the version number to 0.17pre a while back, we had a discussion about this before that. 
I think the features are large enough to warrant a major release. If we do want a bugfix release, we'll probably have to cherrypick the fixes over, that would be fine as well. From markflorisson88 at gmail.com Sun May 6 20:41:43 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 19:41:43 +0100 Subject: [Cython] 0.17 In-Reply-To: <4FA6C517.8080208@behnel.de> References: <4FA6C517.8080208@behnel.de> Message-ID: On 6 May 2012 19:38, Stefan Behnel wrote: > mark florisson, 06.05.2012 16:28: >> I think we already have quite a bit of functionality (nearly) ready, >> after merging some pending pull requests maybe it will be a good time >> for a 0.17 release? I think it would be good to also document to what >> extent pypy support works, what works and what doesn't. > > Sure, although it's basically just "what works, works". However, there are > certainly things that users must know in order to make their own code work. > > >> Stefan, since >> you added a large majority of the features, would you want to be the >> release manager? > > I agree that it would make sense, but I'll be head under water for the next > month or so. > > Maybe it would be better to put out a 0.16.1 with only selected fixes in > the meantime? Ok, if no one else wants to take it up (please do speak up if you do), I could give it another go. > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From matthew.brett at gmail.com Sun May 6 21:41:41 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 6 May 2012 12:41:41 -0700 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: Hi, On Sun, May 6, 2012 at 7:28 AM, mark florisson wrote: > Hey, > > I think we already have quite a bit of functionality (nearly) ready, > after merging some pending pull requests maybe it will be a good time > for a 0.17 release? I think it would be good to also document to what > extent pypy support works, what works and what doesn't. Stefan, since > you added a large majority of the features, would you want to be the > release manager? > > In summary, the following pull requests should likely go in > ? ?- array.array support (unless further discussion prevents that) > ? ?- fused types runtime buffer dispatch > ? ?- newaxis > ? ?- more? > > The memoryview documentation should also be reworked a bit. Matthew, > are you still willing to have a go at that? Otherwise I can clean up > the mess first, some things are no longer true and simply outdated, > and then have a second opinion. Yes, sorry, I have been taken up by releasing my own project. What's the deadline do you think? I have another big release to do for the end of next week, but I might be able to carve out some time, See you, Matthew From markflorisson88 at gmail.com Sun May 6 23:24:17 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 6 May 2012 22:24:17 +0100 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: On 6 May 2012 20:41, Matthew Brett wrote: > Hi, > > On Sun, May 6, 2012 at 7:28 AM, mark florisson > wrote: >> Hey, >> >> I think we already have quite a bit of functionality (nearly) ready, >> after merging some pending pull requests maybe it will be a good time >> for a 0.17 release? I think it would be good to also document to what >> extent pypy support works, what works and what doesn't. Stefan, since >> you added a large majority of the features, would you want to be the >> release manager? 
>> >> In summary, the following pull requests should likely go in >>    - array.array support (unless further discussion prevents that) >>    - fused types runtime buffer dispatch >>    - newaxis >>    - more? >> >> The memoryview documentation should also be reworked a bit. Matthew, >> are you still willing to have a go at that? Otherwise I can clean up >> the mess first, some things are no longer true and simply outdated, >> and then have a second opinion. > > Yes, sorry, I have been taken up by releasing my own project. What's > the deadline do you think? I have another big release to do for the > end of next week, but I might be able to carve out some time, > > See you, > > Matthew Great, I'd say we're probably not going to release anything within the next two weeks, so take your time, there is no hurry really :). From d.s.seljebotn at astro.uio.no Mon May 7 12:40:50 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 12:40:50 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7A618.4000503@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> Message-ID: <4FA7A6B2.5000801@astro.uio.no> [moving to dev list] On 05/07/2012 11:17 AM, Stefan Behnel wrote: > Dag Sverre Seljebotn, 07.05.2012 10:44: >> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>> shaunc, 07.05.2012 07:13: >>>> The following code: >>>> >>>> cdef int foo( double[:] bar ) nogil: >>>> return bar is None >>>> >>>> causes: "Converting to Python object not allowed without gil" >>>> >>>> However, I was under the impression that: "When comparing a value with >>>> None, >>>> keep in mind that, if x is a Python object, x is None and x is not None are >>>> very efficient because they translate directly to C pointer comparisons," >>>> >>>> I guess the problem is that the memoryview is not a python object -- >>>> indeed, this compiles in the form: >>>> >>>> cdef int foo( object bar ) nogil: >>>> >>>> return bar is None >>>> >>>> But this is a bit counterintuitive... do I need to do "with gil" to check >>>> if a memoryview is None? And in a nogil function, I'm not necessarily >>>> guaranteed that I don't have the gil -- what is the best way to ensure I have >>>> the gil? (Is there a "secret system call" or should I use a try block?) >>>> >>>> It would seem more appropriate (IMHO, of course :)) to allow "bar is None" >>>> also when bar is a memoryview.... >>> >>> I wonder why a memory view should be allowed to be None in the first place. >>> Buffer arguments aren't (because they get unpacked on entry), so why should >>> memory views? >> >> At least when I implemented it, buffers get unpacked but the case of a >> None buffer is treated specially, and you're fully allowed (and segfault if >> you [] it). > > Hmm, ok, maybe I just got confused by the code then. > > I think the docs should state that buffer arguments are best used together > with the "not None" declaration then. I use them with "=None" default values all the time... then do a None-check manually. It's really no different from cdef classes. > And I remember that we wanted to change the default settings for extension > type arguments from "or None" to "not None" years ago but never actually > did it. I remember that there was such a debate, but I certainly don't remember that this was the conclusion :-) I didn't agree with that view then and I don't now. I don't remember what Robert's view was...
As far as I can remember (which might be biased towards my personal view), the conclusion was that we left the current semantics in place, relying on better control flow analysis to make None-checks cheaper, and when those are cheap enough, make the nonecheck directive default to True (Java is sort of prior art that this can indeed be done?). Dag From stefan_ml at behnel.de Mon May 7 13:10:56 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 13:10:56 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7A6B2.5000801@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> Message-ID: <4FA7ADC0.40501@behnel.de> Dag Sverre Seljebotn, 07.05.2012 12:40: > moving to dev list Makes sense. > On 05/07/2012 11:17 AM, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 10:44: >>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>> I wonder why a memory view should be allowed to be None in the first >>>> place. >>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>> should memory views? >>> >>> At least when I implemented it, buffers get unpacked but the case of a >>> None buffer is treated specially, and you're fully allowed (and segfault if >>> you [] it). >> >> Hmm, ok, maybe I just got confused by the code then. >> >> I think the docs should state that buffer arguments are best used together >> with the "not None" declaration then. ... which made me realise that that wasn't even supported. I can't believe no-one ever reported that as a bug... https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b It's still not supported for memory views. BTW, is there a reason why we shouldn't allow a "not None" declaration for cdef functions? Obviously, the caller would have to do the check in that case. Hmm, maybe it's not that important, because None checks are best done at entry points from user code, which usually means Python code. It seems like "not None" is not supported on cpdef functions, though. > I use them with "=None" default values all the time... then do a > None-check manually. Interesting. Could you give an example? What's the advantage over letting Cython raise an error for you? And, since you are using it as a default argument, why would someone want to call your code entirely without a buffer argument? > It's really no different from cdef classes. I find it at least a bit more surprising because a buffer unpacking argument is a rather strong hint that you expect something that supports this protocol. The fact that you type your function argument with it hints at the intention to properly unpack it on entry. I'm sure there are lots of users who were or will be surprised when they realise that that doesn't exclude None values. >> And I remember that we wanted to change the default settings for extension >> type arguments from "or None" to "not None" years ago but never actually >> did it. > > I remember that there was such a debate, but I certainly don't remember > that this was the conclusion :-) Maybe not, yes. > I didn't agree with that view then and > I don't now. I don't remember what Robert's view was...
> > As far as I can remember (which might be biased towards my personal > view), the conclusion was that we left the current semantics in place, > relying on better control flow analysis to make None-checks cheaper, and > when those are cheap enough, make the nonecheck directive default to > True At least for buffer arguments, it silently corrupts data or segfaults in the current state of affairs, as you pointed out. Not exactly ideal. That's another reason why I see a difference between the behaviour of extension types and that of buffer arguments. Buffer indexing is also way more performance critical than the average method call or attribute access on a cdef class. > (Java is sort of prior art that this can indeed be done?). Java was designed to have a JIT compiler underneath which handles external parameters, and its compilers are way smarter than Cython. I agree that there is still a lot we can do based on better static analysis, but there will always be limits. Stefan From d.s.seljebotn at astro.uio.no Mon May 7 13:48:18 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 13:48:18 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7ADC0.40501@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> Message-ID: <4FA7B682.5050300@astro.uio.no> On 05/07/2012 01:10 PM, Stefan Behnel wrote: > Dag Sverre Seljebotn, 07.05.2012 12:40: >> moving to dev list > > Makes sense. > >> On 05/07/2012 11:17 AM, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 07.05.2012 10:44: >>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>>> I wonder why a memory view should be allowed to be None in the first >>>>> place. >>>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>>> should memory views? >>>> >>>> ? At least when I implemented it, buffers get unpacked but the case of a >>>> None buffer is treated specially, and you're fully allowed (and segfault if >>>> you [] it). >>> >>> Hmm, ok, maybe I just got confused by the code then. >>> >>> I think the docs should state that buffer arguments are best used together >>> with the "not None" declaration then. > > ... which made me realise that that wasn't even supported. I can't believe > no-one ever reported that as a bug... > > https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b > > It's still not supported for memory views. > > BTW, is there a reason why we shouldn't allow a "not None" declaration for > cdef functions? Obviously, the caller would have to do the check in that > case. Hmm, maybe it's not that important, because None checks are best done > at entry points from user code, which usually means Python code. It seems > like "not None" is not supported on cpdef functions, though. > > >> I use them with "=None" default values all the time... then do a >> None-check manually. > > Interesting. Could you given an example? What's the advantage over letting > Cython raise an error for you? And, since you are using it as a default > argument, why would someone want to call your code entirely without a > buffer argument? 
Here you go:

def foo(np.ndarray[double] a, np.ndarray[double] out=None):
    if out is None:
        out = np.empty_like(a)
    # compute result in out
    return out

The pattern of handing in the memory area to write to is one of the fundamental basics of numerical computing; you often just can't implement an algorithm if the called function returns the result in a newly-allocated array. I can explain why that is in detail, but I'd rather you just trusted the testimony of somebody doing numerical computation... It's just a convenience, but often (in particular when testing) it's incredibly convenient to not have to bother with allocating the output array. Another pattern is:

def do_something(np.ndarray[double] a,
                 np.ndarray[double] sin_of_a=None):
    ...

so if your caller happened to already have computed something, the function uses it, but OTOH the "something" is a function of the inputs and can be computed on the fly. AND, sometimes it can be computed on the fly in ways more efficient than what the caller could have done, because of memory bus issues etc. etc. Both of these can be "fixed" by a) not allowing the convenient shorthand, or b) declaring the argument "object" first and then typing it after the "preamble". So the REAL reason I'm arguing this case is consistency with cdef classes. > > >> >> It's really no different from cdef classes. > > I find it at least a bit more surprising because a buffer unpacking > argument is a rather strong hint that you expect something that supports > this protocol. The fact that you type your function argument with it hints > at the intention to properly unpack it on entry. I'm sure there are lots of > users who were or will be surprised when they realise that that doesn't > exclude None values. Whereas I think there would be more users surprised by the opposite. So there -- we won't know who's right without actually finding some users. And chances are we are both right, since users are different from one another. > > >>> And I remember that we wanted to change the default settings for extension >>> type arguments from "or None" to "not None" years ago but never actually >>> did it. >> >> I remember that there was such a debate, but I certainly don't remember >> that this was the conclusion :-) > > Maybe not, yes. > > >> I didn't agree with that view then and >> I don't now. I don't remember what Robert's view was... >> >> As far as I can remember (which might be biased towards my personal >> view), the conclusion was that we left the current semantics in place, >> relying on better control flow analysis to make None-checks cheaper, and >> when those are cheap enough, make the nonecheck directive default to >> True > > At least for buffer arguments, it silently corrupts data or segfaults in > the current state of affairs, as you pointed out. Not exactly ideal. No different than writing to a field in a cdef class... > > That's another reason why I see a difference between the behaviour of > extension types and that of buffer arguments. Buffer indexing is also way > more performance critical than the average method call or attribute access > on a cdef class. Perhaps, but that's a bit hand-wavy to turn into a principle of language design? "This is performance critical, so therefore we suddenly invert the normal rule"? I just think we should be consistent, not have more special rules for buffers than we need to. The intention all the time was that "np.ndarray[double]" is just a glorified "np.ndarray". People expect it to behave like an optimized "np.ndarray".
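(To make the consistency I mean concrete -- an untested sketch; both declarations compile today and both happily accept None:)

cimport numpy as np

def untyped(np.ndarray a):          # plain extension type argument, may be None
    pass

def typed(np.ndarray[double] a):    # buffer argument, may also be None
    pass

untyped(None)   # fine
typed(None)     # also fine today -- it only blows up once you index 'a'
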
If "np.ndarray" can be None, why can't "np.ndarray[double]"? BTW, with the coming of memoryviews, me and Mark talked about just deprecating the "mytype[...]" meaning buffers, and rather treat it as np.ndarray, array.array etc. being some sort of "template types". That is, we disallow "object[int]" and require some special declarations in the relevant pxd files. >> (Java is sort of prior art that this can indeed be done?). > > Java was designed to have a JIT compiler underneath which handles external > parameters, and its compilers are way smarter than Cython. I agree that > there is still a lot we can do based on better static analysis, but there > will always be limits. Any static analysis will be able to get you to the point of "not None" if the user has a manual test. And the Python way is often to just spell things out rather than brevity; I think an explicit if-test is much more newbie friendly than "not None", "or None", etc. Performance beyond that is rather theoretical for the moment. I agree that for memoryviews that can be passed in acquired-state to cdef functions there is the question of eliminating an extra branch or so, but that is still far-fetched, and I'd rather Mark raise the issue if it comes an issue than the two of us bikeshedding over it. I'll try to make this my last post to this thread, I feel we're slipping into Dag-and-Stefan-endless-thread territory... Dag From d.s.seljebotn at astro.uio.no Mon May 7 13:51:00 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 13:51:00 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7B682.5050300@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> Message-ID: <4FA7B724.5050008@astro.uio.no> On 05/07/2012 01:48 PM, Dag Sverre Seljebotn wrote: > On 05/07/2012 01:10 PM, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 12:40: >>> moving to dev list >> >> Makes sense. >> >>> On 05/07/2012 11:17 AM, Stefan Behnel wrote: >>>> Dag Sverre Seljebotn, 07.05.2012 10:44: >>>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>>>> I wonder why a memory view should be allowed to be None in the first >>>>>> place. >>>>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>>>> should memory views? >>>>> >>>>> ? At least when I implemented it, buffers get unpacked but the case >>>>> of a >>>>> None buffer is treated specially, and you're fully allowed (and >>>>> segfault if >>>>> you [] it). >>>> >>>> Hmm, ok, maybe I just got confused by the code then. >>>> >>>> I think the docs should state that buffer arguments are best used >>>> together >>>> with the "not None" declaration then. >> >> ... which made me realise that that wasn't even supported. I can't >> believe >> no-one ever reported that as a bug... >> >> https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b >> >> >> It's still not supported for memory views. >> >> BTW, is there a reason why we shouldn't allow a "not None" declaration >> for >> cdef functions? Obviously, the caller would have to do the check in that >> case. Hmm, maybe it's not that important, because None checks are best >> done >> at entry points from user code, which usually means Python code. It seems >> like "not None" is not supported on cpdef functions, though. >> >> >>> I use them with "=None" default values all the time... then do a >>> None-check manually. >> >> Interesting. 
Could you give an example? What's the advantage over >> letting >> Cython raise an error for you? And, since you are using it as a default >> argument, why would someone want to call your code entirely without a >> buffer argument? > > Here you go: > > def foo(np.ndarray[double] a, np.ndarray[double] out=None): > if out is None: > out = np.empty_like(a) > # compute result in out > return out > > The pattern of handing in the memory area to write to is one of the > fundamental basics of numerical computing; you often just can't > implement an algorithm if the called function returns the result in a > newly-allocated array. I can explain why that is in detail, but I'd > rather you just trusted the testimony of somebody doing numerical > computation... > > It's just a convenience, but often (in particular when testing) it's > incredibly convenient to not have to bother with allocating the output > array. > > Another pattern is: > > def do_something(np.ndarray[double] a, > np.ndarray[double] sin_of_a=None): > ... > > so if your caller happened to already have computed something, the > function uses it, but OTOH the "something" is a function of the inputs > and can be computed on the fly. AND, sometimes it can be computed on the > fly in ways more efficient than what the caller could have done, because > of memory bus issues etc. etc. > > Both of these can be "fixed" by a) not allowing the convenient > shorthand, or b) declare the argument "object" first and then type it > after the "preamble". > > So the REAL reason I'm arguing this case is consistency with cdef classes. > > > >> >> >>> It's really no different from cdef classes. >> >> I find it at least a bit more surprising because a buffer unpacking >> argument is a rather strong hint that you expect something that supports >> this protocol. The fact that you type your function argument with it >> hints >> at the intention to properly unpack it on entry. I'm sure there are >> lots of >> users who were or will be surprised when they realise that that doesn't >> exclude None values. > > Whereas I think there would be more users surprised by the opposite. > > So there -- we won't know who's right without actually finding some > users. And chances are we are both right, since users are different from > one another. > >> >> >>>> And I remember that we wanted to change the default settings for >>>> extension >>>> type arguments from "or None" to "not None" years ago but never >>>> actually >>>> did it. >>> >>> I remember that there was such a debate, but I certainly don't remember >>> that this was the conclusion :-) >> >> Maybe not, yes. >> >> >>> I didn't agree with that view then and >>> I don't now. I don't remember what Robert's view was... >>> >>> As far as I can remember (which might be biased towards my personal >>> view), the conclusion was that we left the current semantics in place, >>> relying on better control flow analysis to make None-checks cheaper, and >>> when those are cheap enough, make the nonecheck directive default to >>> True >> >> At least for buffer arguments, it silently corrupts data or segfaults in >> the current state of affairs, as you pointed out. Not exactly ideal. > > No different than writing to a field in a cdef class... Also, I believe that in the strided case, the strides are all set to 0, and the data-pointer is NULL, so you will never corrupt data, you will always try to access *NULL and segfault. Though if you put mode='c' and a very high index you'll corrupt data.
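(Spelled out, the corrupting case I mean is roughly this -- an untested sketch, assuming bounds checking is turned off:)

cimport cython
cimport numpy as np

@cython.boundscheck(False)
def f(np.ndarray[double, mode='c'] buf):
    # with buf=None the unpacked data pointer is NULL, so this writes
    # to NULL + 8*1000000 -- an address that may well be mapped
    buf[1000000] = 42.0
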
Dag > >> >> That's another reason why I see a difference between the behaviour of >> extension types and that of buffer arguments. Buffer indexing is also way >> more performance critical than the average method call or attribute >> access >> on a cdef class. > > Perhaps, but that's a bit hand-wavy to turn into a principle of language > design? "This is performance critical, so therefore we suddenly invert > the normal rule"? > > I just think we should be consistent, not have more special rules for > buffers than we need to. > > The intention all the time was that "np.ndarray[double]" is just a > glorified "np.ndarray". People expect it to behave like an optimized > "np.ndarray". If "np.ndarray" can be None, why can't "np.ndarray[double]"? > > BTW, with the coming of memoryviews, me and Mark talked about just > deprecating the "mytype[...]" meaning buffers, and rather treat it as > np.ndarray, array.array etc. being some sort of "template types". That > is, we disallow "object[int]" and require some special declarations in > the relevant pxd files. > >>> (Java is sort of prior art that this can indeed be done?). >> >> Java was designed to have a JIT compiler underneath which handles >> external >> parameters, and its compilers are way smarter than Cython. I agree that >> there is still a lot we can do based on better static analysis, but there >> will always be limits. > > Any static analysis will be able to get you to the point of "not None" > if the user has a manual test. And the Python way is often to just spell > things out rather than brevity; I think an explicit if-test is much more > newbie friendly than "not None", "or None", etc. > > Performance beyond that is rather theoretical for the moment. > > I agree that for memoryviews that can be passed in acquired-state to > cdef functions there is the question of eliminating an extra branch or > so, but that is still far-fetched, and I'd rather Mark raise the issue > if it comes an issue than the two of us bikeshedding over it. > > I'll try to make this my last post to this thread, I feel we're slipping > into Dag-and-Stefan-endless-thread territory... > > Dag From stefan_ml at behnel.de Mon May 7 15:04:18 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 15:04:18 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7B682.5050300@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> Message-ID: <4FA7C852.9020004@behnel.de> Dag Sverre Seljebotn, 07.05.2012 13:48: > On 05/07/2012 01:10 PM, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 12:40: >>> On 05/07/2012 11:17 AM, Stefan Behnel wrote: >>>> Dag Sverre Seljebotn, 07.05.2012 10:44: >>>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>>>> I wonder why a memory view should be allowed to be None in the first >>>>>> place. >>>>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>>>> should memory views? >>>>> >>>>> ? At least when I implemented it, buffers get unpacked but the case of a >>>>> None buffer is treated specially, and you're fully allowed (and >>>>> segfault if you [] it). >>>> >>>> Hmm, ok, maybe I just got confused by the code then. >>>> >>>> I think the docs should state that buffer arguments are best used together >>>> with the "not None" declaration then. >>> >>> I use them with "=None" default values all the time... then do a >>> None-check manually. >> >> Interesting. 
Could you give an example? What's the advantage over letting >> Cython raise an error for you? And, since you are using it as a default >> argument, why would someone want to call your code entirely without a >> buffer argument? > > Here you go: > > def foo(np.ndarray[double] a, np.ndarray[double] out=None): >     if out is None: >         out = np.empty_like(a) Ah, right - output arguments. Hadn't thought of those. Still, since you pass None explicitly as a default argument, this code wouldn't be impacted by disallowing None for buffers by default. That case is already handled specially in the compiler. But a better default would prevent the *first* argument from being None. So, basically, it would do the right thing straight away in your case and generate safer and more efficient code for it, whereas now you have to test 'a' for being None explicitly and Cython won't understand that hint due to insufficient static analysis. At least, since my last commit you can make Cython do the same thing by declaring it "not None". >>> It's really no different from cdef classes. >> >> I find it at least a bit more surprising because a buffer unpacking >> argument is a rather strong hint that you expect something that supports >> this protocol. The fact that you type your function argument with it hints >> at the intention to properly unpack it on entry. I'm sure there are lots of >> users who were or will be surprised when they realise that that doesn't >> exclude None values. > > Whereas I think there would be more users surprised by the opposite. We've had enough complaints from users about None being allowed for typed arguments already to consider it at least a gotcha of the language. The main reason we didn't change this behaviour back then was that it would clearly break user code and we thought we could do without that. That's different from considering it "right" and "good". >>>> And I remember that we wanted to change the default settings for extension >>>> type arguments from "or None" to "not None" years ago but never actually >>>> did it. >>> >>> I remember that there was such a debate, but I certainly don't remember >>> that this was the conclusion :-) >> >> Maybe not, yes. >> >> >>> I didn't agree with that view then and >>> I don't now. I don't remember what Robert's view was... >>> >>> As far as I can remember (which might be biased towards my personal >>> view), the conclusion was that we left the current semantics in place, >>> relying on better control flow analysis to make None-checks cheaper, >>> and >>> when those are cheap enough, make the nonecheck directive default to >>> True >> >> At least for buffer arguments, it silently corrupts data or segfaults in >> the current state of affairs, as you pointed out. Not exactly ideal. > > No different than writing to a field in a cdef class... Hmm, aren't those None checked? At least cdef method calls are AFAIR. I think we should really get back to the habit of making code safe first and fast afterwards. >> That's another reason why I see a difference between the behaviour of >> extension types and that of buffer arguments. Buffer indexing is also way >> more performance critical than the average method call or attribute access >> on a cdef class. > > Perhaps, but that's a bit hand-wavy to turn into a principle of language > design? "This is performance critical, so therefore we suddenly invert the > normal rule"? Since when is the "normal rule" to consider performance so important that we prefer a crash over raising an exception?
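(For the record, the crash is a two-liner -- untested sketch; with the default bounds checking you instead get a confusing IndexError, since the unpacked shape of a None buffer is all zeros:)

cimport cython
cimport numpy as np

@cython.boundscheck(False)
def first(np.ndarray[double] a):
    return a[0]    # first(None) dereferences a NULL data pointer
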
That's the current state of buffer arguments, after all, so we already inverted the "normal rule", IMHO. > I just think we should be consistent, not have more special rules for > buffers than we need to. Agreed. So, would you accept that we add a None check to every buffer indexing access now and try to eliminate them over time (or with user interaction)? > The intention all the time was that "np.ndarray[double]" is just a > glorified "np.ndarray". People expect it to behave like an optimized > "np.ndarray". If "np.ndarray" can be None, why can't "np.ndarray[double]"? Because it uses syntax that is expected to unpack the buffer. If that buffer doesn't exist, I'd expect an error. It's like using interfaces: I want something here that implements the buffer interface. If it doesn't - reject it. Besides, I hope you are aware that your argumentation stands on the (IMHO questionable) fact that "np.ndarray" by itself can be None by default. If np.ndarray should not be allowed to be None by default, why should np.ndarray[double]? That argument works both ways. > BTW, with the coming of memoryviews, me and Mark talked about just > deprecating the "mytype[...]" meaning buffers, and rather treat it as > np.ndarray, array.array etc. being some sort of "template types". That is, > we disallow "object[int]" and require some special declarations in the > relevant pxd files. Hmm, yes, it's unfortunate that we have two different types of syntax now, one that declares the item type before the brackets and one that declares it afterwards. >>> (Java is sort of prior art that this can indeed be done?). >> >> Java was designed to have a JIT compiler underneath which handles external >> parameters, and its compilers are way smarter than Cython. I agree that >> there is still a lot we can do based on better static analysis, but there >> will always be limits. > > Any static analysis will be able to get you to the point of "not None" if > the user has a manual test. Sure. It will at least expect a fatal error at the first operation that won't work with a None value and know that it can't be None afterwards. > And the Python way is often to just spell > things out rather than brevity; I think an explicit if-test is much more > newbie friendly than "not None", "or None", etc. ... with a good default being even more pythonic. ;) Stefan From stefan_ml at behnel.de Mon May 7 16:16:32 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 16:16:32 +0200 Subject: [Cython] buffer syntax vs. memory view syntax (was: Re: checking for "None" in nogil function) In-Reply-To: <4FA7C852.9020004@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> Message-ID: <4FA7D940.5030607@behnel.de> Stefan Behnel, 07.05.2012 15:04: > Dag Sverre Seljebotn, 07.05.2012 13:48: >> BTW, with the coming of memoryviews, me and Mark talked about just >> deprecating the "mytype[...]" meaning buffers, and rather treat it as >> np.ndarray, array.array etc. being some sort of "template types". That is, >> we disallow "object[int]" and require some special declarations in the >> relevant pxd files. > > Hmm, yes, it's unfortunate that we have two different types of syntax now, > one that declares the item type before the brackets and one that declares > it afterwards. I actually think this merits some more discussion. Should we consider the buffer interface syntax deprecated and focus on the memory view syntax?
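(Side by side, for reference:)

cimport numpy as np

def f_buf(np.ndarray[double, ndim=2] a):   # buffer interface syntax
    pass

def f_mv(double[:, :] a):                  # memory view syntax
    pass
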
The words-to-punctuation ratio of the latter may hurt the eyes when encountering it unprepared, but at least it doesn't require two type names, of which the one before the brackets (i.e. "object") is mostly useless. (Although it does reflect the notion that we are dealing with an object here ...) Stefan From vitja.makarov at gmail.com Mon May 7 17:08:09 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Mon, 7 May 2012 19:08:09 +0400 Subject: [Cython] How do you trigger a jenkins build? Message-ID: I've noticed that the old URL hook doesn't work for me anymore. I tried to check "Build when a change is pushed to GitHub" and set "Jenkins Hook URL" to https://sage.math.washington.edu:8091/hudson/github-webhook/ That doesn't work. What is the right way? -- vitja. From d.s.seljebotn at astro.uio.no Mon May 7 17:48:14 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 17:48:14 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7C852.9020004@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> Message-ID: <4FA7EEBE.7060508@astro.uio.no> On 05/07/2012 03:04 PM, Stefan Behnel wrote: > Dag Sverre Seljebotn, 07.05.2012 13:48: >> Here you go: >> >> def foo(np.ndarray[double] a, np.ndarray[double] out=None): >>     if out is None: >>         out = np.empty_like(a) > > Ah, right - output arguments. Hadn't thought of those. > > Still, since you pass None explicitly as a default argument, this code > wouldn't be impacted by disallowing None for buffers by default. That case > is already handled specially in the compiler. But a better default would > prevent the *first* argument from being None. > > So, basically, it would do the right thing straight away in your case and > generate safer and more efficient code for it, whereas now you have to test > 'a' for being None explicitly and Cython won't understand that hint due to > insufficient static analysis. At least, since my last commit you can make > Cython do the same thing by declaring it "not None". Yes, thanks! >>>> It's really no different from cdef classes. >>> >>> I find it at least a bit more surprising because a buffer unpacking >>> argument is a rather strong hint that you expect something that supports >>> this protocol. The fact that you type your function argument with it hints >>> at the intention to properly unpack it on entry. I'm sure there are lots of >>> users who were or will be surprised when they realise that that doesn't >>> exclude None values. >> >> Whereas I think there would be more users surprised by the opposite. > > We've had enough complaints from users about None being allowed for typed > arguments already to consider it at least a gotcha of the language. > > The main reason we didn't change this behaviour back then was that it would > clearly break user code and we thought we could do without that. That's > different from considering it "right" and "good". > > >>>>> And I remember that we wanted to change the default settings for extension >>>>> type arguments from "or None" to "not None" years ago but never actually >>>>> did it. >>>> >>>> I remember that there was such a debate, but I certainly don't remember >>>> that this was the conclusion :-) >>> >>> Maybe not, yes. >>> >>> >>>> I didn't agree with that view then and >>>> I don't now. I don't remember what Robert's view was...
>>>> >>>> As far as I can remember (which might be biased towards my personal >>>> view), the conclusion was that we left the current semantics in place, >>>> relying on better control flow analysis to make None-checks cheaper, and >>>> when those are cheap enough, make the nonecheck directive default to >>>> True >>> >>> At least for buffer arguments, it silently corrupts data or segfaults in >>> the current state of affairs, as you pointed out. Not exactly ideal. >> >> No different than writing to a field in a cdef class... > > Hmm, aren't those None checked? At least cdef method calls are AFAIR. Not at all. That's my whole point -- currently, the rule for None in Cython is "it's your responsibility to never do a native operation on None". I don't like that either, but that's just inherited from Pyrex (and many projects would get speed regressions etc.). I'm not against changing that to "we safely None-check", if done nicely -- it's just that that should be done everywhere at once. In current master (and as far back as I can remember), this code:

cdef class A:
    cdef int field
    cdef int method(self):
        print self.field

def f():
    cdef A a = None
    a.field = 3
    a.method()

Turns into:

    __pyx_v_a = ((struct __pyx_obj_5test2_A *)Py_None);
    __pyx_v_a->field = 3;
    ((struct __pyx_vtabstruct_5test2_A *)
        __pyx_v_a->__pyx_vtab)->method(__pyx_v_a);

> I think we should really get back to the habit of making code safe first > and fast afterwards. Nobody has argued otherwise for some time (since the cdivision thread I believe), this is all about Pyrex legacy. Guess part of the story is that there's lots of performance-sensitive code in SAGE using cdef classes which was written in Pyrex before Cython was around... In fact, the nonecheck directive was written by yours truly! And I argued for making it the default at the time! > Because it uses syntax that is expected to unpack the buffer. If that > buffer doesn't exist, I'd expect an error. It's like using interfaces: I > want something here that implements the buffer interface. If it doesn't - > reject it. > > Besides, I hope you are aware that your argumentation stands on the (IMHO > questionable) fact that "np.ndarray" by itself can be None by default. If > np.ndarray should not be allowed to be None by default, why should > np.ndarray[double]? That argument works both ways. I'm well aware of it... Dag From d.s.seljebotn at astro.uio.no Mon May 7 18:00:20 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 18:00:20 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FA7D940.5030607@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> Message-ID: <4FA7F194.5080008@astro.uio.no> On 05/07/2012 04:16 PM, Stefan Behnel wrote: > Stefan Behnel, 07.05.2012 15:04: >> Dag Sverre Seljebotn, 07.05.2012 13:48: >>> BTW, with the coming of memoryviews, me and Mark talked about just >>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>> np.ndarray, array.array etc. being some sort of "template types". That is, >>> we disallow "object[int]" and require some special declarations in the >>> relevant pxd files. >> >> Hmm, yes, it's unfortunate that we have two different types of syntax now, >> one that declares the item type before the brackets and one that declares >> it afterwards. > > I actually think this merits some more discussion.
Should we consider the > buffer interface syntax deprecated and focus on the memory view syntax? I think that's the very-long-term intention. Then again, it may be too early to really tell yet, we just need to see how the memory views play out in real life and whether they'll be able to replace np.ndarray[double] among real users. We don't want to shove things down users' throats. But the use of the trailing-[] syntax needs some cleaning up. Me and Mark agreed we'd put this proposal forward when we got around to it:

- Deprecate the "object[double]" form, where [dtype] can be stuck on any extension type

- But, do NOT (for the next year at least) deprecate np.ndarray[double], array.array[double], etc.

Basically, there should be a magic flag in extension type declarations saying "I can be a buffer". For one thing, that is sort of needed to open up things for templated cdef classes/fused types cdef classes, if that is ever implemented. The semantic meaning of trailing [] is still sort of like the C++ meaning; that it templates the argument types (except it's lots of special cases in the compiler for various things rather than a Turing-complete template language...) Dag > > The words-to-punctuation ratio of the latter may hurt the eyes when > encountering it unprepared, but at least it doesn't require two type names, > of which the one before the brackets (i.e. "object") is mostly useless. > (Although it does reflect the notion that we are dealing with an object > here ...) > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From d.s.seljebotn at astro.uio.no Mon May 7 18:03:44 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 18:03:44 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FA7F194.5080008@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> Message-ID: <4FA7F260.7010403@astro.uio.no> On 05/07/2012 06:00 PM, Dag Sverre Seljebotn wrote: > On 05/07/2012 04:16 PM, Stefan Behnel wrote: >> Stefan Behnel, 07.05.2012 15:04: >>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>>> np.ndarray, array.array etc. being some sort of "template types". >>>> That is, >>>> we disallow "object[int]" and require some special declarations in the >>>> relevant pxd files. >>> >>> Hmm, yes, it's unfortunate that we have two different types of syntax >>> now, >>> one that declares the item type before the brackets and one that >>> declares >>> it afterwards. >> >> I actually think this merits some more discussion. Should we consider the >> buffer interface syntax deprecated and focus on the memory view syntax? > > I think that's the very-long-term intention. Then again, it may be too > early to really tell yet, we just need to see how the memory views play > out in real life and whether they'll be able to replace > np.ndarray[double] among real users. We don't want to shove things down > users' throats. > > But the use of the trailing-[] syntax needs some cleaning up.
Me and > Mark agreed we'd put this proposal forward when we got around to it: > > - Deprecate the "object[double]" form, where [dtype] can be stuck on any > extension type > > - But, do NOT (for the next year at least) deprecate np.ndarray[double], > array.array[double], etc. Basically, there should be a magic flag in > extension type declarations saying "I can be a buffer". > > For one thing, that is sort of needed to open up things for templated > cdef classes/fused types cdef classes, if that is ever implemented. > > The semantic meaning of trailing [] is still sort of like the C++ > meaning; that it templates the argument types (except it's lots of > special cases in the compiler for various things rather than a > Turing-complete template language...) s/argument types/base type/ Dag > > Dag > >> >> The words-to-punctuation ratio of the latter may hurt the eyes when >> encountering it unprepared, but at least it doesn't require two type >> names, >> of which the one before the brackets (i.e. "object") is mostly useless. >> (Although it does reflect the notion that we are dealing with an object >> here ...) >> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Mon May 7 18:03:43 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:03:43 +0100 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7B724.5050008@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7B724.5050008@astro.uio.no> Message-ID: On 7 May 2012 12:51, Dag Sverre Seljebotn wrote: > On 05/07/2012 01:48 PM, Dag Sverre Seljebotn wrote: >> >> On 05/07/2012 01:10 PM, Stefan Behnel wrote: >>> >>> Dag Sverre Seljebotn, 07.05.2012 12:40: >>>> >>>> moving to dev list >>> >>> >>> Makes sense. >>> >>>> On 05/07/2012 11:17 AM, Stefan Behnel wrote: >>>>> >>>>> Dag Sverre Seljebotn, 07.05.2012 10:44: >>>>>> >>>>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>>>>> >>>>>>> I wonder why a memory view should be allowed to be None in the first >>>>>>> place. >>>>>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>>>>> should memory views? >>>>>> >>>>>> >>>>>> ? At least when I implemented it, buffers get unpacked but the case >>>>>> of a >>>>>> None buffer is treated specially, and you're fully allowed (and >>>>>> segfault if >>>>>> you [] it). >>>>> >>>>> >>>>> Hmm, ok, maybe I just got confused by the code then. >>>>> >>>>> I think the docs should state that buffer arguments are best used >>>>> together >>>>> with the "not None" declaration then. >>> >>> >>> ... which made me realise that that wasn't even supported. I can't >>> believe >>> no-one ever reported that as a bug... >>> >>> >>> https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b >>> >>> >>> It's still not supported for memory views. >>> >>> BTW, is there a reason why we shouldn't allow a "not None" declaration >>> for >>> cdef functions? Obviously, the caller would have to do the check in that >>> case. 
Hmm, maybe it's not that important, because None checks are best >>> done >>> at entry points from user code, which usually means Python code. It seems >>> like "not None" is not supported on cpdef functions, though. >>> >>> >>>> I use them with "=None" default values all the time... then do a >>>> None-check manually. >>> >>> >>> Interesting. Could you given an example? What's the advantage over >>> letting >>> Cython raise an error for you? And, since you are using it as a default >>> argument, why would someone want to call your code entirely without a >>> buffer argument? >> >> >> Here you go: >> >> def foo(np.ndarray[double] a, np.ndarray[double] out=None): >> if out is None: >> out = np.empty_like(a) >> # compute result in out >> return out >> >> The pattern of handing in the memory area to write to is one of the >> fundamental basics of numerical computing; you often just can't >> implement an algorithm if the called function returns the result in a >> newly-allocated array. I can explain why that is in detail, but I'd >> rather you just trusted the testimony of somebody doing numerical >> computation... >> >> It's just a convenience, but often (in particular when testing) it's >> incredibly convenient to not have to bother with allocating the output >> array. >> >> Another pattern is: >> >> def do_something(np.ndarray[double] a, >> np.ndarray[double] sin_of_a=None): >> ... >> >> so if your caller happened to already have computed something, the >> function uses it, but OTOH the "something" is a function of the inputs >> and can be computed on the fly. AND, sometimes it can be computed on the >> fly in ways more efficient than what the caller could have done, because >> of memory bus issues etc. etc. >> >> Both of these can be "fixed" by a) not allowing the convenient >> shorthand, or b) declare the argument "object" first and then type it >> after the "preamble". >> >> So the REAL reason I'm arguing this case is consistency with cdef classes. >> >> >> >>> >>> >>>> It's really no different from cdef classes. >>> >>> >>> I find it at least a bit more surprising because a buffer unpacking >>> argument is a rather strong hint that you expect something that supports >>> this protocol. The fact that you type your function argument with it >>> hints >>> at the intention to properly unpack it on entry. I'm sure there are >>> lots of >>> users who were or will be surprised when they realise that that doesn't >>> exclude None values. >> >> >> Whereas I think there would be more users surprised by the opposite. >> >> So there -- we won't know who's right without actually finding some >> users. And chances are we are both right, since users are different from >> one another. >> >>> >>> >>>>> And I remember that we wanted to change the default settings for >>>>> extension >>>>> type arguments from "or None" to "not None" years ago but never >>>>> actually >>>>> did it. >>>> >>>> >>>> I remember that there was such a debate, but I certainly don't remember >>>> that this was the conclusion :-) >>> >>> >>> Maybe not, yes. >>> >>> >>>> I didn't agree with that view then and >>>> I don't now. I don't remember what Robert's view was... 
>>>> >>>> As far as I can remember (which might be biased towards my personal >>>> view), the conclusion was that we left the current semantics in place, >>>> relying on better control flow analysis to make None-checks cheaper, and >>>> when those are cheap enough, make the nonecheck directive default to >>>> True >>> >>> >>> At least for buffer arguments, it silently corrupts data or segfaults in >>> the current state of affairs, as you pointed out. Not exactly ideal. >> >> >> No different than writing to a field in a cdef class... > > > Also, I believe that in the strided case, the strides are all set to 0, and > the data-pointer is NULL, so you will never corrupt data, you will always > try to access *NULL and segfault. > > Though If you put mode='c' and a very high index you'll corrupt data. > > Dag > If you have boundschecking on, you'll get an out of bounds error, which is pretty weird :) >> >>> >>> That's another reason why I see a difference between the behaviour of >>> extension types and that of buffer arguments. Buffer indexing is also way >>> more performance critical than the average method call or attribute >>> access >>> on a cdef class. >> >> >> Perhaps, but that's a bit hand-wavy to turn into a principle of language >> design? "This is performance critical, so therefore we suddenly invert >> the normal rule"? >> >> I just think we should be consistent, not have more special rules for >> buffers than we need to. >> >> The intention all the time was that "np.ndarray[double]" is just a >> glorified "np.ndarray". People expect it to behave like an optimized >> "np.ndarray". If "np.ndarray" can be None, why can't "np.ndarray[double]"? >> >> BTW, with the coming of memoryviews, me and Mark talked about just >> deprecating the "mytype[...]" meaning buffers, and rather treat it as >> np.ndarray, array.array etc. being some sort of "template types". That >> is, we disallow "object[int]" and require some special declarations in >> the relevant pxd files. >> >>>> (Java is sort of prior art that this can indeed be done?). >>> >>> >>> Java was designed to have a JIT compiler underneath which handles >>> external >>> parameters, and its compilers are way smarter than Cython. I agree that >>> there is still a lot we can do based on better static analysis, but there >>> will always be limits. >> >> >> Any static analysis will be able to get you to the point of "not None" >> if the user has a manual test. And the Python way is often to just spell >> things out rather than brevity; I think an explicit if-test is much more >> newbie friendly than "not None", "or None", etc. >> >> Performance beyond that is rather theoretical for the moment. >> >> I agree that for memoryviews that can be passed in acquired-state to >> cdef functions there is the question of eliminating an extra branch or >> so, but that is still far-fetched, and I'd rather Mark raise the issue >> if it comes an issue than the two of us bikeshedding over it. >> >> I'll try to make this my last post to this thread, I feel we're slipping >> into Dag-and-Stefan-endless-thread territory... 
>> >> Dag > > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From vitja.makarov at gmail.com Mon May 7 18:04:01 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Mon, 7 May 2012 20:04:01 +0400 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: 2012/5/7 mark florisson : > On 6 May 2012 20:41, Matthew Brett wrote: >> Hi, >> >> On Sun, May 6, 2012 at 7:28 AM, mark florisson >> wrote: >>> Hey, >>> >>> I think we already have quite a bit of functionality (nearly) ready, >>> after merging some pending pull requests maybe it will be a good time >>> for a 0.17 release? I think it would be good to also document to what >>> extent pypy support works, what works and what doesn't. Stefan, since >>> you added a large majority of the features, would you want to be the >>> release manager? >>> >>> In summary, the following pull requests should likely go in >>> ? ?- array.array support (unless further discussion prevents that) >>> ? ?- fused types runtime buffer dispatch >>> ? ?- newaxis >>> ? ?- more? >>> >>> The memoryview documentation should also be reworked a bit. Matthew, >>> are you still willing to have a go at that? Otherwise I can clean up >>> the mess first, some things are no longer true and simply outdated, >>> and then have a second opinion. >> >> Yes, sorry, I have been taken up by releasing my own project. What's >> the deadline do you think? ?I have another big release to do for the >> end of next week, but I might be able to carve out some time, >> >> See you, >> >> Matthew > > Great, I'd say we're probably not going to release anything within the > next two weeks, so take your time, there is no hurry really :). Hmm, it seems to me that master is currently broken: https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py27-ext/ -- vitja. From markflorisson88 at gmail.com Mon May 7 18:04:01 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:04:01 +0100 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7ADC0.40501@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> Message-ID: On 7 May 2012 12:10, Stefan Behnel wrote: > Dag Sverre Seljebotn, 07.05.2012 12:40: >> moving to dev list > > Makes sense. > >> On 05/07/2012 11:17 AM, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 07.05.2012 10:44: >>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>>> I wonder why a memory view should be allowed to be None in the first >>>>> place. >>>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>>> should memory views? >>>> >>>> ? At least when I implemented it, buffers get unpacked but the case of a >>>> None buffer is treated specially, and you're fully allowed (and segfault if >>>> you [] it). >>> >>> Hmm, ok, maybe I just got confused by the code then. >>> >>> I think the docs should state that buffer arguments are best used together >>> with the "not None" declaration then. > > ... which made me realise that that wasn't even supported. I can't believe > no-one ever reported that as a bug... > > https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b > > It's still not supported for memory views. Yeah, that was never implemented, but probably should be. > BTW, is there a reason why we shouldn't allow a "not None" declaration for > cdef functions? 
Obviously, the caller would have to do the check in that > case. Why can't the callee just check it? If it's None, just raise an exception like usual? > Hmm, maybe it's not that important, because None checks are best done > at entry points from user code, which usually means Python code. It seems > like "not None" is not supported on cpdef functions, though. > > >> I use them with "=None" default values all the time... then do a >> None-check manually. > > Interesting. Could you given an example? What's the advantage over letting > Cython raise an error for you? And, since you are using it as a default > argument, why would someone want to call your code entirely without a > buffer argument? > > >> It's really no different from cdef classes. > > I find it at least a bit more surprising because a buffer unpacking > argument is a rather strong hint that you expect something that supports > this protocol. The fact that you type your function argument with it hints > at the intention to properly unpack it on entry. I'm sure there are lots of > users who were or will be surprised when they realise that that doesn't > exclude None values. > > >>> And I remember that we wanted to change the default settings for extension >>> type arguments from "or None" to "not None" years ago but never actually >>> did it. >> >> I remember that there was such a debate, but I certainly don't remember >> that this was the conclusion :-) > > Maybe not, yes. > > >> I didn't agree with that view then and >> I don't now. I don't remember what Robert's view was... >> >> As far as I can remember (which might be biased towards my personal >> view), the conclusion was that we left the current semantics in place, >> relying on better control flow analysis to make None-checks cheaper, and >> when those are cheap enough, make the nonecheck directive default to >> True > > At least for buffer arguments, it silently corrupts data or segfaults in > the current state of affairs, as you pointed out. Not exactly ideal. > > That's another reason why I see a difference between the behaviour of > extension types and that of buffer arguments. Buffer indexing is also way > more performance critical than the average method call or attribute access > on a cdef class. > > >> (Java is sort of prior art that this can indeed be done?). > > Java was designed to have a JIT compiler underneath which handles external > parameters, and its compilers are way smarter than Cython. I agree that > there is still a lot we can do based on better static analysis, but there > will always be limits. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From d.s.seljebotn at astro.uio.no Mon May 7 18:07:22 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 18:07:22 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> Message-ID: <4FA7F33A.9020903@astro.uio.no> On 05/07/2012 06:04 PM, mark florisson wrote: > On 7 May 2012 12:10, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 12:40: >>> moving to dev list >> >> Makes sense. 
>> >>> On 05/07/2012 11:17 AM, Stefan Behnel wrote: >>>> Dag Sverre Seljebotn, 07.05.2012 10:44: >>>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote: >>>>>> I wonder why a memory view should be allowed to be None in the first >>>>>> place. >>>>>> Buffer arguments aren't (because they get unpacked on entry), so why >>>>>> should memory views? >>>>> >>>>> At least when I implemented it, buffers get unpacked but the case of a >>>>> None buffer is treated specially, and you're fully allowed (and segfault if >>>>> you [] it). >>>> >>>> Hmm, ok, maybe I just got confused by the code then. >>>> >>>> I think the docs should state that buffer arguments are best used together >>>> with the "not None" declaration then. >> >> ... which made me realise that that wasn't even supported. I can't believe >> no-one ever reported that as a bug... >> >> https://github.com/cython/cython/commit/f2de49fd0ac82a02a070b931bf4d2dab47135d0b >> >> It's still not supported for memory views. > > Yeah, that was never implemented, but probably should be. > >> BTW, is there a reason why we shouldn't allow a "not None" declaration for >> cdef functions? Obviously, the caller would have to do the check in that >> case. > > Why can't the callee just check it? If it's None, just raise an > exception like usual? It's just that there's a lot more potential for rather easy optimization if the caller does it. Dag From stefan_ml at behnel.de Mon May 7 18:12:39 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 18:12:39 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7F33A.9020903@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> Message-ID: <4FA7F477.30701@behnel.de> Dag Sverre Seljebotn, 07.05.2012 18:07: > On 05/07/2012 06:04 PM, mark florisson wrote: >> On 7 May 2012 12:10, Stefan Behnel wrote: >>> BTW, is there a reason why we shouldn't allow a "not None" declaration for >>> cdef functions? Obviously, the caller would have to do the check in that >>> case. >> >> Why can't the callee just check it? If it's None, just raise an >> exception like usual? > > It's just that there's a lot more potential for rather easy optimization if > the caller does it. Exactly. The NoneCheckNode is easy to get rid of at any stage in the pipeline, whereas a hard coded None check has a fixed cost at runtime. Stefan From markflorisson88 at gmail.com Mon May 7 18:12:56 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:12:56 +0100 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7EEBE.7060508@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7EEBE.7060508@astro.uio.no> Message-ID: On 7 May 2012 16:48, Dag Sverre Seljebotn wrote: > On 05/07/2012 03:04 PM, Stefan Behnel wrote: >> >> Dag Sverre Seljebotn, 07.05.2012 13:48: >> >>> Here you go:
>>>
>>> def foo(np.ndarray[double] a, np.ndarray[double] out=None):
>>>     if out is None:
>>>         out = np.empty_like(a)
>>
>> Ah, right - output arguments. Hadn't thought of those. >> >> Still, since you pass None explicitly as a default argument, this code >> wouldn't be impacted by disallowing None for buffers by default. That case >> is already handled specially in the compiler.
But a better default would >> prevent the *first* argument from being None. >> >> So, basically, it would do the right thing straight away in your case and >> generate safer and more efficient code for it, whereas now you have to >> test >> 'a' for being None explicitly and Cython won't understand that hint due to >> insufficient static analysis. At least, since my last commit you can make >> Cython do the same thing by declaring it "not None". > > > Yes, thanks! > > >>>>> It's really no different from cdef classes. >>>> >>>> >>>> I find it at least a bit more surprising because a buffer unpacking >>>> argument is a rather strong hint that you expect something that supports >>>> this protocol. The fact that you type your function argument with it >>>> hints >>>> at the intention to properly unpack it on entry. I'm sure there are lots >>>> of >>>> users who were or will be surprised when they realise that that doesn't >>>> exclude None values. >>> >>> >>> Whereas I think there would be more users surprised by the opposite. >> >> >> We've had enough complaints from users about None being allowed for typed >> arguments already to consider it at least a gotcha of the language. >> >> The main reason we didn't change this behaviour back then was that it >> would >> clearly break user code and we thought we could do without that. That's >> different from considering it "right" and "good". >> >> >>>>>> And I remember that we wanted to change the default settings for >>>>>> extension >>>>>> type arguments from "or None" to "not None" years ago but never >>>>>> actually >>>>>> did it. >>>>> >>>>> >>>>> I remember that there was such a debate, but I certainly don't remember >>>>> that this was the conclusion :-) >>>> >>>> >>>> Maybe not, yes. >>>> >>>> >>>>> I didn't agree with that view then and >>>>> I don't now. I don't remember what Robert's view was... >>>>> >>>>> As far as I can remember (which might be biased towards my personal >>>>> view), the conclusion was that we left the current semantics in place, >>>>> relying on better control flow analysis to make None-checks cheaper, >>>>> and >>>>> when those are cheap enough, make the nonecheck directive default to >>>>> True >>>> >>>> >>>> At least for buffer arguments, it silently corrupts data or segfaults in >>>> the current state of affairs, as you pointed out. Not exactly ideal. >>> >>> >>> No different than writing to a field in a cdef class... >> >> >> Hmm, aren't those None checked? At least cdef method calls are AFAIR. > > > Not at all. That's my whole point -- currently, the rule for None in Cython > is "it's your responsibility to never do a native operation on None". > > I don't like that either, but that's just inherited from Pyrex (and many > projects would get speed regressions etc.). > > I'm not against changing that to "we safely None-check", if done nicely -- > it's just that that should be done everywhere at once. > > In current master (and as far back as I can remember), this code:
>
> cdef class A:
>     cdef int field
>     cdef int method(self):
>         print self.field
> def f():
>     cdef A a = None
>     a.field = 3
>     a.method()
>
> Turns into:
>
>   __pyx_v_a = ((struct __pyx_obj_5test2_A *)Py_None);
>   __pyx_v_a->field = 3;
>   ((struct __pyx_vtabstruct_5test2_A *)
> __pyx_v_a->__pyx_vtab)->method(__pyx_v_a);
>
> >> I think we should really get back to the habit of making code safe first >> and fast afterwards.
> > > Nobody has argued otherwise for some time (since the cdivision thread I > believe), this is all about Pyrex legacy. Guess part of the story is that > there's lots of performance-sensitive code in SAGE using cdef classes which > was written in Pyrex before Cython was around... > > In fact, the nonecheck directive was written by yours truly! And I argued > for making it the default at the time! > > >> Because it uses syntax that is expected to unpack the buffer. If that >> buffer doesn't exist, I'd expect an error. It's like using interfaces: I >> want something here that implements the buffer interface. If it doesn't - >> reject it. >> >> Besides, I hope you are aware that your argumentation stands on the (IMHO >> questionable) fact that "np.ndarray" by itself can be None by default. If >> np.ndarray should not be be allowed to be None by default, why should >> np.ndarray[double]? That argument works in both ways. > > > I'm well aware of it... > > Dag > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel None checking could and should be optimized, it can be done but is a bit tricky. A problem are class attributes, as you can at certain point determine that it's not None, but after any function call etc it can suddenly be None because some code decided to set it to None (maybe weird, but possible). We could do well for local variables of which no address is taken, though. You'd have to recheck after each assignment though. We should probably start implementing single static assignment first, also to implement boundschecking and eliminating some common subexpressions. From markflorisson88 at gmail.com Mon May 7 18:16:42 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:16:42 +0100 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA7F477.30701@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> <4FA7F477.30701@behnel.de> Message-ID: On 7 May 2012 17:12, Stefan Behnel wrote: > Dag Sverre Seljebotn, 07.05.2012 18:07: >> On 05/07/2012 06:04 PM, mark florisson wrote: >>> On 7 May 2012 12:10, Stefan Behnel wrote: >>>> BTW, is there a reason why we shouldn't allow a "not None" declaration for >>>> cdef functions? Obviously, the caller would have to do the check in that >>>> case. >>> >>> Why can't the callee just check it? If it's None, just raise an >>> exception like usual? >> >> It's just that there's a lot more potential for rather easy optimization if >> the caller does it. > > Exactly. The NoneCheckNode is easy to get rid of at any stage in the > pipeline, whereas a hard coded None check has a fixed cost at runtime. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel I see, yes. I expect a pointer comparison to be reasonably insignificant compared to function call overhead, but it would also reduce the code in the instruction cache. If you take the address of the function though, or if you declare it public in a pxd, you probably don't want to do that, as you still want to be safe when called from C. Could do the same trick as in the 'less annotations' CEP though, that would be nice. 
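To make the trade-off being discussed here concrete, a minimal sketch of the two placements of the None check; the function names are invented for illustration, and the comments describe the intended contract rather than what Cython actually emits today:

    cimport numpy as np

    # Callee-side check: a fixed pointer comparison on every call,
    # paid even when the caller could already prove 'a' is not None.
    cdef int callee_checks(np.ndarray a) except -1:
        if a is None:
            raise TypeError("a must not be None")
        return a.ndim

    # Caller-side contract: the body assumes 'a' is valid; each call
    # site would emit the check, so the compiler is free to drop it
    # wherever control flow analysis proves it redundant.
    cdef int caller_checks(np.ndarray a):
        return a.ndim

    def entry_point(np.ndarray a not None):
        # checked once at the Python boundary, then passed on unchecked
        return caller_checks(a)

The caller-side variant is what makes eliminating the NoneCheckNode possible; the callee-side variant is what an external C caller would need to stay safe.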
From markflorisson88 at gmail.com Mon May 7 18:18:09 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:18:09 +0100 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> <4FA7F477.30701@behnel.de> Message-ID: On 7 May 2012 17:16, mark florisson wrote: > On 7 May 2012 17:12, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 18:07: >>> On 05/07/2012 06:04 PM, mark florisson wrote: >>>> On 7 May 2012 12:10, Stefan Behnel wrote: >>>>> BTW, is there a reason why we shouldn't allow a "not None" declaration for >>>>> cdef functions? Obviously, the caller would have to do the check in that >>>>> case. >>>> >>>> Why can't the callee just check it? If it's None, just raise an >>>> exception like usual? >>> >>> It's just that there's a lot more potential for rather easy optimization if >>> the caller does it. >> >> Exactly. The NoneCheckNode is easy to get rid of at any stage in the >> pipeline, whereas a hard coded None check has a fixed cost at runtime. >> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > I see, yes. I expect a pointer comparison to be reasonably > insignificant compared to function call overhead, but it would also > reduce the code in the instruction cache. If you take the address of > the function though, or if you declare it public in a pxd, you > probably don't want to do that, as you still want to be safe when > called from C. Could do the same trick as in the 'less annotations' > CEP though, that would be nice. ... or you could document that 'not None' means the caller cannot pass it in, but that would be weird as you could do it from Cython and get an exception, but not from C :) That would be better specified in the documentation of the function as its contract or whatever. From markflorisson88 at gmail.com Mon May 7 18:19:47 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:19:47 +0100 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: On 7 May 2012 17:04, Vitja Makarov wrote: > 2012/5/7 mark florisson : >> On 6 May 2012 20:41, Matthew Brett wrote: >>> Hi, >>> >>> On Sun, May 6, 2012 at 7:28 AM, mark florisson >>> wrote: >>>> Hey, >>>> >>>> I think we already have quite a bit of functionality (nearly) ready, >>>> after merging some pending pull requests maybe it will be a good time >>>> for a 0.17 release? I think it would be good to also document to what >>>> extent pypy support works, what works and what doesn't. Stefan, since >>>> you added a large majority of the features, would you want to be the >>>> release manager? >>>> >>>> In summary, the following pull requests should likely go in >>>> - array.array support (unless further discussion prevents that) >>>> - fused types runtime buffer dispatch >>>> - newaxis >>>> - more? >>>> >>>> The memoryview documentation should also be reworked a bit. Matthew, >>>> are you still willing to have a go at that? Otherwise I can clean up >>>> the mess first, some things are no longer true and simply outdated, >>>> and then have a second opinion. >>> >>> Yes, sorry, I have been taken up by releasing my own project. What's >>> the deadline do you think?
I have another big release to do for the >>> end of next week, but I might be able to carve out some time, >>> >>> See you, >>> >>> Matthew >> >> Great, I'd say we're probably not going to release anything within the >> next two weeks, so take your time, there is no hurry really :). > > Hmm, it seems to me that master is currently broken: > > https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py27-ext/ > > -- > vitja. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Quite broken, in fact :) It doesn't ever print error messages properly anymore. From d.s.seljebotn at astro.uio.no Mon May 7 18:22:38 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 18:22:38 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> <4FA7F477.30701@behnel.de> Message-ID: <4FA7F6CE.3090005@astro.uio.no> On 05/07/2012 06:18 PM, mark florisson wrote: > On 7 May 2012 17:16, mark florisson wrote: >> On 7 May 2012 17:12, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 07.05.2012 18:07: >>>> On 05/07/2012 06:04 PM, mark florisson wrote: >>>>> On 7 May 2012 12:10, Stefan Behnel wrote: >>>>>> BTW, is there a reason why we shouldn't allow a "not None" declaration for >>>>>> cdef functions? Obviously, the caller would have to do the check in that >>>>>> case. >>>>> >>>>> Why can't the callee just check it? If it's None, just raise an >>>>> exception like usual? >>>> >>>> It's just that there's a lot more potential for rather easy optimization if >>>> the caller does it. >>> >>> Exactly. The NoneCheckNode is easy to get rid of at any stage in the >>> pipeline, whereas a hard coded None check has a fixed cost at runtime. >>> >>> Stefan >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >> >> I see, yes. I expect a pointer comparison to be reasonably >> insignificant compared to function call overhead, but it would also >> reduce the code in the instruction cache. If you take the address of >> the function though, or if you declare it public in a pxd, you >> probably don't want to do that, as you still want to be safe when >> called from C. Could do the same trick as in the 'less annotations' >> CEP though, that would be nice. > > ... or you could document that 'not None' means the caller cannot pass > it in, but that would be weird as you could do it from Cython and get > an exception, but not from C :) That would be better specified in the > documentation of the function as its contract or whatever. We're going to need a "Cython ABI" at some point anyway. "Caller checks for None" goes in the ABI docs. Dag From markflorisson88 at gmail.com Mon May 7 18:28:19 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 17:28:19 +0100 Subject: [Cython] buffer syntax vs.
memory view syntax In-Reply-To: <4FA7F194.5080008@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> Message-ID: On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: > On 05/07/2012 04:16 PM, Stefan Behnel wrote: >> >> Stefan Behnel, 07.05.2012 15:04: >>> >>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>> >>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>>> np.ndarray, array.array etc. being some sort of "template types". That >>>> is, >>>> we disallow "object[int]" and require some special declarations in the >>>> relevant pxd files. >>> >>> >>> Hmm, yes, it's unfortunate that we have two different types of syntax >>> now, >>> one that declares the item type before the brackets and one that declares >>> it afterwards. >> >> >> I actually think this merits some more discussion. Should we consider the >> buffer interface syntax deprecated and focus on the memory view syntax? > > I think that's the very-long-term intention. Then again, it may be too early > to really tell yet, we just need to see how the memory views play out in > real life and whether they'll be able to replace np.ndarray[double] among > real users. We don't want to shove things down users' throats. > > But the use of the trailing-[] syntax needs some cleaning up. Me and Mark > agreed we'd put this proposal forward when we got around to it: > > - Deprecate the "object[double]" form, where [dtype] can be stuck on any > extension type > > - But, do NOT (for the next year at least) deprecate np.ndarray[double], > array.array[double], etc. Basically, there should be a magic flag in > extension type declarations saying "I can be a buffer". > > For one thing, that is sort of needed to open up things for templated cdef > classes/fused types cdef classes, if that is ever implemented. Deprecating is definitely a good start. I think at least if you only allow two types as buffers it will be at least reasonably clear when one is dealing with fused types or buffers. Basically, I think memoryviews should live up to demands of the users, which would mean there would be no reason to keep the buffer syntax. One thing to do is make memoryviews coerce cheaply back to the original objects if wanted (which is likely). Writing np.asarray(mymemview) is kind of annoying. Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If it changes (and I intend to do so), cython modules compiled with different cython versions will become incompatible if they call each other through pxds. Maybe that should be defined as UB... > The semantic meaning of trailing [] is still sort of like the C++ meaning; > that it templates the argument types (except it's lots of special cases in > the compiler for various things rather than a Turing-complete template > language...) > > Dag > >> >> The words-to-punctuation ratio of the latter may hurt the eyes when >> encountering it unprepared, but at least it doesn't require two type >> names, >> of which the one before the brackets (i.e. "object") is mostly useless. >> (Although it does reflect the notion that we are dealing with an object >> here ...)
>> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Mon May 7 18:52:56 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 18:52:56 +0200 Subject: [Cython] checking for "None" in nogil function In-Reply-To: <4FA7EEBE.7060508@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7EEBE.7060508@astro.uio.no> Message-ID: <4FA7FDE8.9050807@behnel.de> Dag Sverre Seljebotn, 07.05.2012 17:48: > On 05/07/2012 03:04 PM, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>> As far as I can remember (which might be biased towards my personal >>>>> view), the conclusion was that we left the current semantics in place, >>>>> relying on better control flow analysis to make None-checks cheaper, and >>>>> when those are cheap enough, make the nonecheck directive default to >>>>> True >>>> >>>> At least for buffer arguments, it silently corrupts data or segfaults in >>>> the current state of affairs, as you pointed out. Not exactly ideal. >>> >>> No different than writing to a field in a cdef class... >> >> Hmm, aren't those None checked? At least cdef method calls are AFAIR. > > Not at all. That's my whole point -- currently, the rule for None in Cython > is "it's your responsibility to never do a native operation on None". > > I don't like that either, but that's just inherited from Pyrex (and many > projects would get speed regressions etc.). > > I'm not against changing that to "we safely None-check", if done nicely -- > it's just that that should be done everywhere at once. I think that gets both of us back on the same track then. :) > In current master (and as far back as I can remember), this code:
>
> cdef class A:
>     cdef int field
>     cdef int method(self):
>         print self.field
> def f():
>     cdef A a = None
>     a.field = 3
>     a.method()
>
> Turns into:
>
>   __pyx_v_a = ((struct __pyx_obj_5test2_A *)Py_None);
>   __pyx_v_a->field = 3;
>   ((struct __pyx_vtabstruct_5test2_A *)
> __pyx_v_a->__pyx_vtab)->method(__pyx_v_a);
Guess I've just been working on the builtins optimiser too long. There, it's obviously not allowed to inject unprotected code like this automatically. It would be fun if we could eventually get to the point where Cython replaces all of the code in f() with an AttributeError, as a combined effort of control flow analysis and dead code removal. A part of that is already there, i.e. Cython would know that 'a' "may be None" in the last two lines and would thus generate a None check with an AttributeError if we allowed it to do that. It wouldn't know that it's always going to be raised, though, so the dead code removal can't strike. I guess that case is just not important enough to implement. BTW, I recently tried to enable None checks in a couple of places and it broke memory views for some reason that I didn't want to investigate. The main problems really seem to be unknown argument values and the lack of proper exception prediction, e.g.
in this case:

  def add_one_2d(int[:,:] buf):
      for x in xrange(buf.shape[0]):
          for y in xrange(buf.shape[1]):
              buf[x,y] += 1

it's statically obvious that only the first access to .shape (outside of all loops) needs a None check and will raise an AttributeError for None, so the check for the second loop can be eliminated as well as the None check on indexing. >> I think we should really get back to the habit of making code safe first >> and fast afterwards. > > Nobody has argued otherwise for some time (since the cdivision thread I > believe), this is all about Pyrex legacy. Guess part of the story is that > there's lots of performance-sensitive code in SAGE using cdef classes which > was written in Pyrex before Cython was around... > > In fact, the nonecheck directive was written by yours truly! And I argued > for making it the default at the time! I've been working on the None checks (and on removing them) repeatedly, although I didn't remember the particular details of discussing the nonecheck directive. Stefan From stefan_ml at behnel.de Mon May 7 19:00:47 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 19:00:47 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> Message-ID: <4FA7FFBF.4010905@behnel.de> mark florisson, 07.05.2012 18:28: > On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>> Stefan Behnel, 07.05.2012 15:04: >>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>>>> np.ndarray, array.array etc. being some sort of "template types". That >>>>> is, >>>>> we disallow "object[int]" and require some special declarations in the >>>>> relevant pxd files. >>>> >>>> Hmm, yes, it's unfortunate that we have two different types of syntax >>>> now, >>>> one that declares the item type before the brackets and one that declares >>>> it afterwards. >>> >>> I actually think this merits some more discussion. Should we consider the >>> buffer interface syntax deprecated and focus on the memory view syntax? >> >> I think that's the very-long-term intention. Then again, it may be too early >> to really tell yet, we just need to see how the memory views play out in >> real life and whether they'll be able to replace np.ndarray[double] among >> real users. We don't want to shove things down users' throats. >> >> But the use of the trailing-[] syntax needs some cleaning up. Me and Mark >> agreed we'd put this proposal forward when we got around to it: >> >> - Deprecate the "object[double]" form, where [dtype] can be stuck on any >> extension type >> >> - But, do NOT (for the next year at least) deprecate np.ndarray[double], >> array.array[double], etc. Basically, there should be a magic flag in >> extension type declarations saying "I can be a buffer". >> >> For one thing, that is sort of needed to open up things for templated cdef >> classes/fused types cdef classes, if that is ever implemented. > > Deprecating is definitely a good start. Then the first step on that road is to rework the documentation so that it pushes users into going for memory views instead of the plain buffer syntax.
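For the docs, the two spellings under discussion are easy to show side by side; a minimal sketch (the function names are invented):

    cimport numpy as np

    # old-style buffer syntax: the item type sits inside the trailing
    # brackets, after the object type
    def scale_buffer(np.ndarray[double, ndim=1] a, double factor):
        for i in range(a.shape[0]):
            a[i] *= factor

    # memory view syntax: the item type comes first, and any object
    # supporting the buffer protocol is accepted, not just np.ndarray
    def scale_memview(double[:] a, double factor):
        for i in range(a.shape[0]):
            a[i] *= factor

Both index at C speed; the memory view version just doesn't tie the argument to one extension type.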
> I think at least if you only > allow two types as buffers it will be at least reasonably clear when > one is dealing with fused types or buffers. > > Basically, I think memoryviews should live up to demands of the users, > which would mean there would be no reason to keep the buffer syntax. > One thing to do is make memoryviews coerce cheaply back to the > original objects if wanted (which is likely). Writting > np.asarray(mymemview) is kind of annoying. ... and also doesn't do the same thing, I believe. > Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If > it changes (and I intend to do so), cython modules compiled with > different cython versions will become incompatible if they call each > other through pxds. Maybe that should be defined as UB... Would there be a way to only use the plain buffer interface for cross module memory view exchange? That could be an acceptable overhead to pay for ABI independence. Stefan From stefan_ml at behnel.de Mon May 7 19:06:17 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 19:06:17 +0200 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> <4FA7F477.30701@behnel.de> Message-ID: <4FA80109.9020201@behnel.de> mark florisson, 07.05.2012 18:18: > On 7 May 2012 17:16, mark florisson wrote: >> On 7 May 2012 17:12, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 07.05.2012 18:07: >>>> On 05/07/2012 06:04 PM, mark florisson wrote: >>>>> On 7 May 2012 12:10, Stefan Behnel wrote: >>>>>> BTW, is there a reason why we shouldn't allow a "not None" declaration for >>>>>> cdef functions? Obviously, the caller would have to do the check in that >>>>>> case. >>>>> >>>>> Why can't the callee just check it? If it's None, just raise an >>>>> exception like usual? >>>> >>>> It's just that there's a lot more potential for rather easy optimization if >>>> the caller does it. >>> >>> Exactly. The NoneCheckNode is easy to get rid of at any stage in the >>> pipeline, whereas a hard coded None check has a fixed cost at runtime. >> >> I see, yes. I expect a pointer comparison to be reasonably >> insignificant compared to function call overhead, but it would also >> reduce the code in the instruction cache. If you take the address of >> the function though, or if you declare it public in a pxd, you >> probably don't want to do that, as you still want to be safe when >> called from C. Could do the same trick as in the 'less annotations' >> CEP though, that would be nice. > > ... or you could document that 'not None' means the caller cannot pass > it in, but that would be weird as you could do it from Cython and get > an exception, but not from C :) That would be better specified in the > documentation of the function as its contract or whatever. "not None" on a cdef function means what all declarations on cdef functions mean: the caller is responsible for doing the appropriate type conversions and checks. If a function accepts an int32 and the caller puts a float32 on the stack, it's not the fault of the callee. The same applies to extension type arguments and None checks. 
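A minimal sketch of that contract (invented names): for plain C arguments the conversion already happens at the call site, and "not None" on a cdef argument would slot into the same place:

    cdef double twice(double x):
        # no validation here; the body trusts its C-level arguments
        return 2 * x

    def call_it(obj):
        # the *caller* converts the Python object to a C double and
        # raises a TypeError on failure, before twice() ever runs
        return twice(obj)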
Stefan From stefan_ml at behnel.de Mon May 7 19:08:40 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 07 May 2012 19:08:40 +0200 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: <4FA80198.8070303@behnel.de> mark florisson, 07.05.2012 18:19: > On 7 May 2012 17:04, Vitja Makarov wrote: >> Hmm, it seems to me that master is currently broken: >> >> https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py27-ext/ >> > Quite broken, in fact :) It doesn't ever print error messages property anymore. Yes, Robert broke the compiler error processing while trying to fix it up for parallel compilation. https://github.com/cython/cython/commit/5d1fddb87fd68991e7fbc79c469273398638b6ff Stefan From markflorisson88 at gmail.com Mon May 7 19:08:52 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 18:08:52 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FA7FFBF.4010905@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <4FA7FFBF.4010905@behnel.de> Message-ID: On 7 May 2012 18:00, Stefan Behnel wrote: > mark florisson, 07.05.2012 18:28: >> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>> Stefan Behnel, 07.05.2012 15:04: >>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>>>>> np.ndarray, array.array etc. being some sort of "template types". That >>>>>> is, >>>>>> we disallow "object[int]" and require some special declarations in the >>>>>> relevant pxd files. >>>>> >>>>> Hmm, yes, it's unfortunate that we have two different types of syntax >>>>> now, >>>>> one that declares the item type before the brackets and one that declares >>>>> it afterwards. >>>> >>>> I actually think this merits some more discussion. Should we consider the >>>> buffer interface syntax deprecated and focus on the memory view syntax? >>> >>> I think that's the very-long-term intention. Then again, it may be too early >>> to really tell yet, we just need to see how the memory views play out in >>> real life and whether they'll be able to replace np.ndarray[double] among >>> real users. We don't want to shove things down users throats. >>> >>> But the use of the trailing-[] syntax needs some cleaning up. Me and Mark >>> agreed we'd put this proposal forward when we got around to it: >>> >>> ?- Deprecate the "object[double]" form, where [dtype] can be stuck on any >>> extension type >>> >>> ?- But, do NOT (for the next year at least) deprecate np.ndarray[double], >>> array.array[double], etc. Basically, there should be a magic flag in >>> extension type declarations saying "I can be a buffer". >>> >>> For one thing, that is sort of needed to open up things for templated cdef >>> classes/fused types cdef classes, if that is ever implemented. >> >> Deprecating is definitely a good start. > > Then the first step on that road is to rework the documentation so that it > pushes users into going for memory views instead of the plain buffer syntax. > Well, memoryviews are not yet entirely bug free (although the next release will aim to fix the problems pointed out by users so far), and they also have some other problems. 
>> I think at least if you only >> allow two types as buffers it will be at least reasonably clear when >> one is dealing with fused types or buffers. >> >> Basically, I think memoryviews should live up to demands of the users, >> which would mean there would be no reason to keep the buffer syntax. >> One thing to do is make memoryviews coerce cheaply back to the >> original objects if wanted (which is likely). Writting >> np.asarray(mymemview) is kind of annoying. > > ... and also doesn't do the same thing, I believe. > > >> Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If >> it changes (and I intend to do so), cython modules compiled with >> different cython versions will become incompatible if they call each >> other through pxds. Maybe that should be defined as UB... > > Would there be a way to only use the plain buffer interface for cross > module memory view exchange? That could be an acceptable overhead to pay > for ABI independence. I want to store extra flags and pointers in there as well, so I don't think that will be enough. It will also be rather annoying and complicate calling code. > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From d.s.seljebotn at astro.uio.no Mon May 7 19:10:42 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 19:10:42 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FA7FFBF.4010905@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <4FA7FFBF.4010905@behnel.de> Message-ID: <4FA80212.1030808@astro.uio.no> On 05/07/2012 07:00 PM, Stefan Behnel wrote: > mark florisson, 07.05.2012 18:28: >> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>> Stefan Behnel, 07.05.2012 15:04: >>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>>>>> np.ndarray, array.array etc. being some sort of "template types". That >>>>>> is, >>>>>> we disallow "object[int]" and require some special declarations in the >>>>>> relevant pxd files. >>>>> >>>>> Hmm, yes, it's unfortunate that we have two different types of syntax >>>>> now, >>>>> one that declares the item type before the brackets and one that declares >>>>> it afterwards. >>>> >>>> I actually think this merits some more discussion. Should we consider the >>>> buffer interface syntax deprecated and focus on the memory view syntax? >>> >>> I think that's the very-long-term intention. Then again, it may be too early >>> to really tell yet, we just need to see how the memory views play out in >>> real life and whether they'll be able to replace np.ndarray[double] among >>> real users. We don't want to shove things down users throats. >>> >>> But the use of the trailing-[] syntax needs some cleaning up. Me and Mark >>> agreed we'd put this proposal forward when we got around to it: >>> >>> - Deprecate the "object[double]" form, where [dtype] can be stuck on any >>> extension type >>> >>> - But, do NOT (for the next year at least) deprecate np.ndarray[double], >>> array.array[double], etc. 
Basically, there should be a magic flag in >>> extension type declarations saying "I can be a buffer". >>> >>> For one thing, that is sort of needed to open up things for templated cdef >>> classes/fused types cdef classes, if that is ever implemented. >> >> Deprecating is definitely a good start. > > Then the first step on that road is to rework the documentation so that it > pushes users into going for memory views instead of the plain buffer syntax. -1, premature. Dag > > >> I think at least if you only >> allow two types as buffers it will be at least reasonably clear when >> one is dealing with fused types or buffers. >> >> Basically, I think memoryviews should live up to demands of the users, >> which would mean there would be no reason to keep the buffer syntax. >> One thing to do is make memoryviews coerce cheaply back to the >> original objects if wanted (which is likely). Writting >> np.asarray(mymemview) is kind of annoying. > > ... and also doesn't do the same thing, I believe. > > >> Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If >> it changes (and I intend to do so), cython modules compiled with >> different cython versions will become incompatible if they call each >> other through pxds. Maybe that should be defined as UB... > > Would there be a way to only use the plain buffer interface for cross > module memory view exchange? That could be an acceptable overhead to pay > for ABI independence. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Mon May 7 19:13:15 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 18:13:15 +0100 Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function In-Reply-To: <4FA80109.9020201@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> <4FA7F477.30701@behnel.de> <4FA80109.9020201@behnel.de> Message-ID: On 7 May 2012 18:06, Stefan Behnel wrote: > mark florisson, 07.05.2012 18:18: >> On 7 May 2012 17:16, mark florisson wrote: >>> On 7 May 2012 17:12, Stefan Behnel wrote: >>>> Dag Sverre Seljebotn, 07.05.2012 18:07: >>>>> On 05/07/2012 06:04 PM, mark florisson wrote: >>>>>> On 7 May 2012 12:10, Stefan Behnel wrote: >>>>>>> BTW, is there a reason why we shouldn't allow a "not None" declaration for >>>>>>> cdef functions? Obviously, the caller would have to do the check in that >>>>>>> case. >>>>>> >>>>>> Why can't the callee just check it? If it's None, just raise an >>>>>> exception like usual? >>>>> >>>>> It's just that there's a lot more potential for rather easy optimization if >>>>> the caller does it. >>>> >>>> Exactly. The NoneCheckNode is easy to get rid of at any stage in the >>>> pipeline, whereas a hard coded None check has a fixed cost at runtime. >>> >>> I see, yes. I expect a pointer comparison to be reasonably >>> insignificant compared to function call overhead, but it would also >>> reduce the code in the instruction cache. If you take the address of >>> the function though, or if you declare it public in a pxd, you >>> probably don't want to do that, as you still want to be safe when >>> called from C. Could do the same trick as in the 'less annotations' >>> CEP though, that would be nice. >> >> ... 
or you could document that 'not None' means the caller cannot pass >> it in, but that would be weird as you could do it from Cython and get >> an exception, but not from C :) That would be better specified in the >> documentation of the function as its contract or whatever. > > "not None" on a cdef function means what all declarations on cdef functions > mean: the caller is responsible for doing the appropriate type conversions > and checks. > > If a function accepts an int32 and the caller puts a float32 on the stack, > it's not the fault of the callee. The same applies to extension type > arguments and None checks. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Well, 'with gil' makes the callee do something. I would personally expect not None to be enforced at least conceptually in the function itself. In any case, I also think it's really not an important issue, as it's likely pretty uncommon to call it from C. If it does break, it will be easy enough to figure out (unless you accidentally corrupt your memory :) So either solution would be fine with me. From markflorisson88 at gmail.com Mon May 7 19:18:08 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 18:18:08 +0100 Subject: [Cython] checking for "None" in nogil function In-Reply-To: <4FA7FDE8.9050807@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7EEBE.7060508@astro.uio.no> <4FA7FDE8.9050807@behnel.de> Message-ID: On 7 May 2012 17:52, Stefan Behnel wrote: > Dag Sverre Seljebotn, 07.05.2012 17:48: >> On 05/07/2012 03:04 PM, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>> As far as I can remember (which might be biased towards my personal >>>>>> view), the conclusion was that we left the current semantics in place, >>>>>> relying on better control flow analysis to make None-checks cheaper, and >>>>>> when those are cheap enough, make the nonecheck directive default to >>>>>> True >>>>> >>>>> At least for buffer arguments, it silently corrupts data or segfaults in >>>>> the current state of affairs, as you pointed out. Not exactly ideal. >>>> >>>> No different than writing to a field in a cdef class... >>> >>> Hmm, aren't those None checked? At least cdef method calls are AFAIR. >> >> Not at all. That's my whole point -- currently, the rule for None in Cython >> is "it's your responsibility to never do a native operation on None". >> >> I don't like that either, but that's just inherited from Pyrex (and many >> projects would get speed regressions etc.). >> >> I'm not against changing that to "we safely None-check", if done nicely -- >> it's just that that should be done everywhere at once. > > I think that gets both of us back on the same track then. :) > > >> In current master (and as far back as I can remember), this code: >> >> cdef class A: >> ? ? cdef int field >> ? ? cdef int method(self): >> ? ? ? ? print self.field >> def f(): >> ? ? cdef A a = None >> ? ? a.field = 3 >> ? ? a.method() >> >> Turns into: >> >> ? __pyx_v_a = ((struct __pyx_obj_5test2_A *)Py_None); >> ? __pyx_v_a->field = 3; >> ? ((struct __pyx_vtabstruct_5test2_A *) >> __pyx_v_a->__pyx_vtab)->method(__pyx_v_a); > > Guess I've just been working on the builtins optimiser too long. 
There, > it's obviously not allowed to inject unprotected code like this automatically. > > It would be fun if we could eventually get to the point where Cython > replaces all of the code in f() with an AttributeError, as a combined > effort of control flow analysis and dead code removal. A part of that is > already there, i.e. Cython would know that 'a' "may be None" in the last > two lines and would thus generate a None check with an AttributeError if we > allowed it to do that. It wouldn't know that it's always going to be > raised, though, so the dead code removal can't strike. I guess that case is > just not important enough to implement. > > BTW, I recently tried to enable None checks in a couple of places and it > broke memory views for some reason that I didn't want to investigate. If you do want to implement it, don't hesitate to ask about any memoryview shenanigans a certain person implemented. > The > main problems really seem to be unknown argument values and the lack of > proper exception prediction, e.g. in this case: > > ?def add_one_2d(int[:,:] buf): > ? ? ?for x in xrange(buf.shape[0]): > ? ? ? ? ?for y in xrange(buf.shape[1]): > ? ? ? ? ? ? ?buf[x,y] += 1 > > it's statically obvious that only the first access to .shape (outside of > all loops) needs a None check and will raise an AttributeError for None, so > the check for the second loop can be eliminated as well as the None check > on indexing. > Yes. This can be generalized to common subexpression elimination, for bounds checking, for nonechecking, even for wraparound. >>> I think we should really get back to the habit of making code safe first >>> and fast afterwards. >> >> Nobody has argued otherwise for some time (since the cdivision thread I >> believe), this is all about Pyrex legacy. Guess part of the story is that >> there's lots of performance-sensitive code in SAGE using cdef classes which >> was written in Pyrex before Cython was around... >> >> In fact, the nonecheck directive was written by yours truly! And I argued >> for making it the default at the time! > > I've been working on the None checks (and on removing them) repeatedly, > although I didn't remember the particular details of discussing the > nonecheck directive. 
> > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Mon May 7 19:20:44 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 18:20:44 +0100 Subject: [Cython] checking for "None" in nogil function In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7EEBE.7060508@astro.uio.no> <4FA7FDE8.9050807@behnel.de> Message-ID: On 7 May 2012 18:18, mark florisson wrote: > On 7 May 2012 17:52, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 07.05.2012 17:48: >>> On 05/07/2012 03:04 PM, Stefan Behnel wrote: >>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>>> As far as I can remember (which might be biased towards my personal >>>>>>> view), the conclusion was that we left the current semantics in place, >>>>>>> relying on better control flow analysis to make None-checks cheaper, and >>>>>>> when those are cheap enough, make the nonecheck directive default to >>>>>>> True >>>>>> >>>>>> At least for buffer arguments, it silently corrupts data or segfaults in >>>>>> the current state of affairs, as you pointed out. Not exactly ideal. >>>>> >>>>> No different than writing to a field in a cdef class... >>>> >>>> Hmm, aren't those None checked? At least cdef method calls are AFAIR. >>> >>> Not at all. That's my whole point -- currently, the rule for None in Cython >>> is "it's your responsibility to never do a native operation on None". >>> >>> I don't like that either, but that's just inherited from Pyrex (and many >>> projects would get speed regressions etc.). >>> >>> I'm not against changing that to "we safely None-check", if done nicely -- >>> it's just that that should be done everywhere at once. >> >> I think that gets both of us back on the same track then. :) >> >> >>> In current master (and as far back as I can remember), this code: >>> >>> cdef class A: >>> ? ? cdef int field >>> ? ? cdef int method(self): >>> ? ? ? ? print self.field >>> def f(): >>> ? ? cdef A a = None >>> ? ? a.field = 3 >>> ? ? a.method() >>> >>> Turns into: >>> >>> ? __pyx_v_a = ((struct __pyx_obj_5test2_A *)Py_None); >>> ? __pyx_v_a->field = 3; >>> ? ((struct __pyx_vtabstruct_5test2_A *) >>> __pyx_v_a->__pyx_vtab)->method(__pyx_v_a); >> >> Guess I've just been working on the builtins optimiser too long. There, >> it's obviously not allowed to inject unprotected code like this automatically. >> >> It would be fun if we could eventually get to the point where Cython >> replaces all of the code in f() with an AttributeError, as a combined >> effort of control flow analysis and dead code removal. A part of that is >> already there, i.e. Cython would know that 'a' "may be None" in the last >> two lines and would thus generate a None check with an AttributeError if we >> allowed it to do that. It wouldn't know that it's always going to be >> raised, though, so the dead code removal can't strike. I guess that case is >> just not important enough to implement. >> >> BTW, I recently tried to enable None checks in a couple of places and it >> broke memory views for some reason that I didn't want to investigate. > > If you do want to implement it, don't hesitate to ask about any > memoryview shenanigans a certain person implemented. 
> >> The >> main problems really seem to be unknown argument values and the lack of >> proper exception prediction, e.g. in this case: >> >> ?def add_one_2d(int[:,:] buf): >> ? ? ?for x in xrange(buf.shape[0]): >> ? ? ? ? ?for y in xrange(buf.shape[1]): >> ? ? ? ? ? ? ?buf[x,y] += 1 >> >> it's statically obvious that only the first access to .shape (outside of >> all loops) needs a None check and will raise an AttributeError for None, so >> the check for the second loop can be eliminated as well as the None check >> on indexing. >> > > Yes. This can be generalized to common subexpression elimination, for > bounds checking, for nonechecking, even for wraparound. Given the awesome control flow we have now, I don't think implementing SSA is very hard at all. From there it's also not too hard to implement these things. Pulling these things out of loops is slightly harder though, given guards etc, so you need two implementations, one with all checks in there, and one without any checks. You take the checking version when your conditions outside the loop don't match, as you need to raise the (potential) exception at the right point. >>>> I think we should really get back to the habit of making code safe first >>>> and fast afterwards. >>> >>> Nobody has argued otherwise for some time (since the cdivision thread I >>> believe), this is all about Pyrex legacy. Guess part of the story is that >>> there's lots of performance-sensitive code in SAGE using cdef classes which >>> was written in Pyrex before Cython was around... >>> >>> In fact, the nonecheck directive was written by yours truly! And I argued >>> for making it the default at the time! >> >> I've been working on the None checks (and on removing them) repeatedly, >> although I didn't remember the particular details of discussing the >> nonecheck directive. >> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel From d.s.seljebotn at astro.uio.no Mon May 7 20:40:50 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 May 2012 20:40:50 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> Message-ID: <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> mark florisson wrote: >On 7 May 2012 17:00, Dag Sverre Seljebotn >wrote: >> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>> >>> Stefan Behnel, 07.05.2012 15:04: >>>> >>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>> >>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it >as >>>>> np.ndarray, array.array etc. being some sort of "template types". >That >>>>> is, >>>>> we disallow "object[int]" and require some special declarations in >the >>>>> relevant pxd files. >>>> >>>> >>>> Hmm, yes, it's unfortunate that we have two different types of >syntax >>>> now, >>>> one that declares the item type before the brackets and one that >declares >>>> it afterwards. >>> >>> >>> I actually think this merits some more discussion. Should we >consider the >>> buffer interface syntax deprecated and focus on the memory view >syntax? >> >> >> I think that's the very-long-term intention. 
Then again, it may be >too early >> to really tell yet, we just need to see how the memory views play out >in >> real life and whether they'll be able to replace np.ndarray[double] >among >> real users. We don't want to shove things down users throats. >> >> But the use of the trailing-[] syntax needs some cleaning up. Me and >Mark >> agreed we'd put this proposal forward when we got around to it: >> >> ?- Deprecate the "object[double]" form, where [dtype] can be stuck on >any >> extension type >> >> ?- But, do NOT (for the next year at least) deprecate >np.ndarray[double], >> array.array[double], etc. Basically, there should be a magic flag in >> extension type declarations saying "I can be a buffer". >> >> For one thing, that is sort of needed to open up things for templated >cdef >> classes/fused types cdef classes, if that is ever implemented. > >Deprecating is definitely a good start. I think at least if you only >allow two types as buffers it will be at least reasonably clear when >one is dealing with fused types or buffers. > >Basically, I think memoryviews should live up to demands of the users, >which would mean there would be no reason to keep the buffer syntax. But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.. >One thing to do is make memoryviews coerce cheaply back to the >original objects if wanted (which is likely). Writting >np.asarray(mymemview) is kind of annoying. > It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to... If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you kick back alive np.ndarray (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). Implementation much the same either way, it is all about how it is presented to the user. Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension) Dag >Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If >it changes (and I intend to do so), cython modules compiled with >different cython versions will become incompatible if they call each >other through pxds. Maybe that should be defined as UB... > >> The semantic meaning of trailing [] is still sort of like the C++ >meaning; >> that it templates the argument types (except it's lots of special >cases in >> the compiler for various things rather than a Turing-complete >template >> language...) >> >> Dag >> >>> >>> The words-to-punctuation ratio of the latter may hurt the eyes when >>> encountering it unprepared, but at least it doesn't require two type >>> names, >>> of which the one before the brackets (i.e. "object") is mostly >useless. >>> (Although it does reflect the notion that we are dealing with an >object >>> here ...) 
>>> >>> Stefan >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >> >> >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel >_______________________________________________ >cython-devel mailing list >cython-devel at python.org >http://mail.python.org/mailman/listinfo/cython-devel -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. From markflorisson88 at gmail.com Mon May 7 23:21:04 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Mon, 7 May 2012 22:21:04 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> Message-ID: On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: > > > mark florisson wrote: > >>On 7 May 2012 17:00, Dag Sverre Seljebotn >>wrote: >>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>> >>>> Stefan Behnel, 07.05.2012 15:04: >>>>> >>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>> >>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it >>as >>>>>> np.ndarray, array.array etc. being some sort of "template types". >>That >>>>>> is, >>>>>> we disallow "object[int]" and require some special declarations in >>the >>>>>> relevant pxd files. >>>>> >>>>> >>>>> Hmm, yes, it's unfortunate that we have two different types of >>syntax >>>>> now, >>>>> one that declares the item type before the brackets and one that >>declares >>>>> it afterwards. >>>> >>>> >>>> I actually think this merits some more discussion. Should we >>consider the >>>> buffer interface syntax deprecated and focus on the memory view >>syntax? >>> >>> >>> I think that's the very-long-term intention. Then again, it may be >>too early >>> to really tell yet, we just need to see how the memory views play out >>in >>> real life and whether they'll be able to replace np.ndarray[double] >>among >>> real users. We don't want to shove things down users throats. >>> >>> But the use of the trailing-[] syntax needs some cleaning up. Me and >>Mark >>> agreed we'd put this proposal forward when we got around to it: >>> >>> ?- Deprecate the "object[double]" form, where [dtype] can be stuck on >>any >>> extension type >>> >>> ?- But, do NOT (for the next year at least) deprecate >>np.ndarray[double], >>> array.array[double], etc. Basically, there should be a magic flag in >>> extension type declarations saying "I can be a buffer". >>> >>> For one thing, that is sort of needed to open up things for templated >>cdef >>> classes/fused types cdef classes, if that is ever implemented. >> >>Deprecating is definitely a good start. I think at least if you only >>allow two types as buffers it will be at least reasonably clear when >>one is dealing with fused types or buffers. >> >>Basically, I think memoryviews should live up to demands of the users, >>which would mean there would be no reason to keep the buffer syntax. > > But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.. 
> >>One thing to do is make memoryviews coerce cheaply back to the >>original objects if wanted (which is likely). Writting >>np.asarray(mymemview) is kind of annoying. >> > > > It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to... Yeah, the idea is as very simple, as you mention, just keep the object around cached, and when you slice construct one lazily. > If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you kick back alive np.ndarray (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). Implementation much the same either way, it is all about how it is presented to the user. You mean the buffer syntax? > Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension) I was thinking you could allow the user to register a callback, and use that to coerce from a memoryview back to an object (given a memoryview object). For numpy this would be np.asarray, and the implementation is allowed to cache the result (which it will). It may be too magicky though... but it will be convenient. The memoryview will act as a subclass, meaning that any of its methods will override methods of the converted object. > Dag > > > >>Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If >>it changes (and I intend to do so), cython modules compiled with >>different cython versions will become incompatible if they call each >>other through pxds. Maybe that should be defined as UB... >> >>> The semantic meaning of trailing [] is still sort of like the C++ >>meaning; >>> that it templates the argument types (except it's lots of special >>cases in >>> the compiler for various things rather than a Turing-complete >>template >>> language...) >>> >>> Dag >>> >>>> >>>> The words-to-punctuation ratio of the latter may hurt the eyes when >>>> encountering it unprepared, but at least it doesn't require two type >>>> names, >>>> of which the one before the brackets (i.e. "object") is mostly >>useless. >>>> (Although it does reflect the notion that we are dealing with an >>object >>>> here ...) >>>> >>>> Stefan >>>> _______________________________________________ >>>> cython-devel mailing list >>>> cython-devel at python.org >>>> http://mail.python.org/mailman/listinfo/cython-devel >>> >>> >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >>_______________________________________________ >>cython-devel mailing list >>cython-devel at python.org >>http://mail.python.org/mailman/listinfo/cython-devel > > -- > Sent from my Android phone with K-9 Mail. Please excuse my brevity. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From robertwb at gmail.com Tue May 8 00:35:29 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Mon, 7 May 2012 15:35:29 -0700 Subject: [Cython] buffer syntax vs. 
memory view syntax In-Reply-To: <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> Message-ID: On Mon, May 7, 2012 at 11:40 AM, Dag Sverre Seljebotn wrote: > > mark florisson wrote: > >>On 7 May 2012 17:00, Dag Sverre Seljebotn >>wrote: >>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>> >>>> Stefan Behnel, 07.05.2012 15:04: >>>>> >>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>> >>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it >>as >>>>>> np.ndarray, array.array etc. being some sort of "template types". >>That >>>>>> is, >>>>>> we disallow "object[int]" and require some special declarations in >>the >>>>>> relevant pxd files. >>>>> >>>>> >>>>> Hmm, yes, it's unfortunate that we have two different types of >>syntax >>>>> now, >>>>> one that declares the item type before the brackets and one that >>declares >>>>> it afterwards. >>>> >>>> >>>> I actually think this merits some more discussion. Should we >>consider the >>>> buffer interface syntax deprecated and focus on the memory view >>syntax? >>> >>> >>> I think that's the very-long-term intention. Then again, it may be >>too early >>> to really tell yet, we just need to see how the memory views play out >>in >>> real life and whether they'll be able to replace np.ndarray[double] >>among >>> real users. We don't want to shove things down users throats. >>> >>> But the use of the trailing-[] syntax needs some cleaning up. Me and >>Mark >>> agreed we'd put this proposal forward when we got around to it: >>> >>> ?- Deprecate the "object[double]" form, where [dtype] can be stuck on >>any >>> extension type >>> >>> ?- But, do NOT (for the next year at least) deprecate >>np.ndarray[double], >>> array.array[double], etc. Basically, there should be a magic flag in >>> extension type declarations saying "I can be a buffer". >>> >>> For one thing, that is sort of needed to open up things for templated >>cdef >>> classes/fused types cdef classes, if that is ever implemented. >> >>Deprecating is definitely a good start. I think at least if you only >>allow two types as buffers it will be at least reasonably clear when >>one is dealing with fused types or buffers. >> >>Basically, I think memoryviews should live up to demands of the users, >>which would mean there would be no reason to keep the buffer syntax. > > But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.. Part of the question here is whether using np.ndarray[...] currently (or will) offer any additional functionality. While we should likely start steering people this direction, especially over object[...], it seems too soon to deprecate the old-style buffer access. >>One thing to do is make memoryviews coerce cheaply back to the >>original objects if wanted (which is likely). Writting >>np.asarray(mymemview) is kind of annoying. >> > > > It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to... 
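To make the coercion in question concrete -- the function below is a sketch of what already works today (the np.asarray() round-trip called annoying above), with an invented name; the trailing comment describes the proposed transparent behaviour, which is not implemented:

    import numpy as np

    def twice(double[:] mv):
        arr = np.asarray(mv)   # explicit trip back to an ndarray
        return arr * 2         # full ndarray API available again

    # proposed (hypothetical): mv.mean(), repr(mv), etc. would dispatch to a
    # cached backing object automatically, with no np.asarray() call needed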
>
> If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you bring np.ndarray back to life (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). The implementation is much the same either way; it is all about how it is presented to the user.
>
> Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension)

I think it's valuable to have a single name refer to both the Python object (on which methods can be called, and a new one might have to be created if there was slicing) and the memory view. In this light, being able to specify that something is both a NumPy array (to use some (overlay optimized?) methods on it) and a memory view (for fast indexing), without having two different variables, can result in much cleaner code (and an easier transition from untyped NumPy).

>> Also, OT (sorry), but I'm kind of worried about the memoryview ABI. If it changes (and I intend to do so), cython modules compiled with different cython versions will become incompatible if they call each other through pxds. Maybe that should be defined as UB...

>>> The semantic meaning of trailing [] is still sort of like the C++ meaning: it templates the argument types (except it's lots of special cases in the compiler for various things rather than a Turing-complete template language...)
>>>
>>> Dag

>>>> The words-to-punctuation ratio of the latter may hurt the eyes when encountering it unprepared, but at least it doesn't require two type names, of which the one before the brackets (i.e. "object") is mostly useless.
>>>> (Although it does reflect the notion that we are dealing with an object here ...)
>>>>
>>>> Stefan

From robertwb at gmail.com Tue May 8 00:42:04 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 7 May 2012 15:42:04 -0700
Subject: [Cython] 0.17
In-Reply-To: <4FA80198.8070303@behnel.de>
References: <4FA80198.8070303@behnel.de>
Message-ID:

On Mon, May 7, 2012 at 10:08 AM, Stefan Behnel wrote:
> mark florisson, 07.05.2012 18:19:
>> On 7 May 2012 17:04, Vitja Makarov wrote:
>>> Hmm, it seems to me that master is currently broken:
>>>
>>> https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=c,PYVERSION=py27-ext/
>>>
>> Quite broken, in fact :) It doesn't ever print error messages properly anymore.
>
> Yes, Robert broke the compiler error processing while trying to fix it up for parallel compilation.
>
> https://github.com/cython/cython/commit/5d1fddb87fd68991e7fbc79c469273398638b6ff

Argh, I made that change at the last minute when I was removing a couple of debug print statements. I say we wait another week or so at least to see if any new bug reports come in, but prep a release to be cut soon, without holding it up for any new features that have not gone in yet.

- Robert

From robertwb at gmail.com Tue May 8 00:52:41 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 7 May 2012 15:52:41 -0700
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To: <4FA7A6B2.5000801@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no>
Message-ID:

On Mon, May 7, 2012 at 3:40 AM, Dag Sverre Seljebotn wrote:
> [moving to dev list]
>
> On 05/07/2012 11:17 AM, Stefan Behnel wrote:
>> Dag Sverre Seljebotn, 07.05.2012 10:44:
>>> On 05/07/2012 07:48 AM, Stefan Behnel wrote:
>>>> shaunc, 07.05.2012 07:13:
>>>>> The following code:
>>>>>
>>>>> cdef int foo( double[:] bar ) nogil:
>>>>>     return bar is None
>>>>>
>>>>> causes: "Converting to Python object not allowed without gil"
>>>>>
>>>>> However, I was under the impression that: "When comparing a value with None, keep in mind that, if x is a Python object, x is None and x is not None are very efficient because they translate directly to C pointer comparisons."
>>>>>
>>>>> I guess the problem is that the memoryview is not a python object -- indeed, this compiles in the form:
>>>>>
>>>>> cdef int foo( object bar ) nogil:
>>>>>     return bar is None
>>>>>
>>>>> But this is a bit counterintuitive... do I need to do "with gil" to check if a memoryview is None? And in a nogil function, I'm not necessarily guaranteed that I don't have the gil -- what is the best way to ensure I have the gil? (Is there a "secret system call" or should I use a try block?)
>>>>>
>>>>> It would seem more appropriate (IMHO, of course :)) to allow "bar is None" also when bar is a memoryview....
>>>>
>>>> I wonder why a memory view should be allowed to be None in the first place. Buffer arguments aren't (because they get unpacked on entry), so why should memory views?
>>>
>>> At least when I implemented it, buffers get unpacked but the case of a None buffer is treated specially, and you're fully allowed (and segfault if you [] it).
>>
>> Hmm, ok, maybe I just got confused by the code then.
>>
>> I think the docs should state that buffer arguments are best used together with the "not None" declaration then.
>
> I use them with "=None" default values all the time... then do a None-check manually.
>
> It's really no different from cdef classes.
>
>> And I remember that we wanted to change the default settings for extension type arguments from "or None" to "not None" years ago but never actually did it.
>
> I remember that there was such a debate, but I certainly don't remember that this was the conclusion :-) I didn't agree with that view then and I don't now. I don't remember what Robert's view was...
>
> As far as I can remember (which might be biased towards my personal view), the conclusion was that we left the current semantics in place, relying on better control flow analysis to make None-checks cheaper, and when those are cheap enough, make the nonecheck directive default to True (Java is sort of prior art that this can indeed be done?).

Yes, that was exactly my point of view.

- Robert

From robertwb at gmail.com Tue May 8 01:13:33 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 7 May 2012 16:13:33 -0700
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To: <4FA7EEBE.7060508@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7EEBE.7060508@astro.uio.no>
Message-ID:

On Mon, May 7, 2012 at 8:48 AM, Dag Sverre Seljebotn wrote:
> On 05/07/2012 03:04 PM, Stefan Behnel wrote:
>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>> Here you go:
>>>
>>> def foo(np.ndarray[double] a, np.ndarray[double] out=None):
>>>     if out is None:
>>>         out = np.empty_like(a)
>>
>> Ah, right - output arguments. Hadn't thought of those.
>>
>> Still, since you pass None explicitly as a default argument, this code wouldn't be impacted by disallowing None for buffers by default. That case is already handled specially in the compiler. But a better default would prevent the *first* argument from being None.
>>
>> So, basically, it would do the right thing straight away in your case and generate safer and more efficient code for it, whereas now you have to test 'a' for being None explicitly and Cython won't understand that hint due to insufficient static analysis. At least, since my last commit you can make Cython do the same thing by declaring it "not None".
>
> Yes, thanks!
>
>>>>> It's really no different from cdef classes.
>>>>
>>>> I find it at least a bit more surprising because a buffer unpacking argument is a rather strong hint that you expect something that supports this protocol. The fact that you type your function argument with it hints at the intention to properly unpack it on entry. I'm sure there are lots of users who were or will be surprised when they realise that that doesn't exclude None values.
>>>
>>> Whereas I think there would be more users surprised by the opposite.
>>
>> We've had enough complaints from users about None being allowed for typed arguments already to consider it at least a gotcha of the language.
>>
>> The main reason we didn't change this behaviour back then was that it would clearly break user code and we thought we could do without that. That's different from considering it "right" and "good".
>>
>>>>>> And I remember that we wanted to change the default settings for extension type arguments from "or None" to "not None" years ago but never actually did it.
>>>>>
>>>>> I remember that there was such a debate, but I certainly don't remember that this was the conclusion :-)
>>>>
>>>> Maybe not, yes.
>>>>
>>>>> I didn't agree with that view then and I don't now. I don't remember what Robert's view was...
>>>>>
>>>>> As far as I can remember (which might be biased towards my personal view), the conclusion was that we left the current semantics in place, relying on better control flow analysis to make None-checks cheaper, and when those are cheap enough, make the nonecheck directive default to True
>>>>
>>>> At least for buffer arguments, it silently corrupts data or segfaults in the current state of affairs, as you pointed out. Not exactly ideal.
>>>
>>> No different than writing to a field in a cdef class...
>>
>> Hmm, aren't those None checked? At least cdef method calls are AFAIR.
>
> Not at all. That's my whole point -- currently, the rule for None in Cython is "it's your responsibility to never do a native operation on None".
>
> I don't like that either, but that's just inherited from Pyrex (and many projects would get speed regressions etc.).
>
> I'm not against changing that to "we safely None-check", if done nicely -- it's just that that should be done everywhere at once.
>
> In current master (and as far back as I can remember), this code:
>
> cdef class A:
>     cdef int field
>     cdef int method(self):
>         print self.field
>
> def f():
>     cdef A a = None
>     a.field = 3
>     a.method()
>
> Turns into:
>
>     __pyx_v_a = ((struct __pyx_obj_5test2_A *)Py_None);
>     __pyx_v_a->field = 3;
>     ((struct __pyx_vtabstruct_5test2_A *)__pyx_v_a->__pyx_vtab)->method(__pyx_v_a);
>
>> I think we should really get back to the habit of making code safe first and fast afterwards.
>
> Nobody has argued otherwise for some time (since the cdivision thread, I believe); this is all about Pyrex legacy. I guess part of the story is that there's lots of performance-sensitive code in SAGE using cdef classes which was written in Pyrex before Cython was around...

I think there's a difference between making a new feature fast instead of safe, and introducing a (significant) performance regression to add safety to existing code.

Also, the proposed change of "or None" is backwards incompatible, and the eventual solution (as far as I understand it) is to switch back to allowing None (for consistency with everywhere else None occurs) once we have cheap None checks in place. We can't get around the fact that cdef classes might be None, due to attributes (which must be initialized to something initially). Doing a None check on every buffer access in a loop falls into the significant-performance-regression category, but ideally we could pull it out of the loop.

- Robert

From greg.ewing at canterbury.ac.nz Tue May 8 02:05:16 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 May 2012 12:05:16 +1200
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To: <4FA7C852.9020004@behnel.de>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de>
Message-ID: <4FA8633C.2040500@canterbury.ac.nz>

Stefan Behnel wrote:
> The main reason we didn't change this behaviour back then was that it would
> clearly break user code and we thought we could do without that. That's
> different from considering it "right" and "good".

I changed the None-checking behaviour in Pyrex because I *wanted* to break user code. Or rather, I didn't think it would be a bad thing to make people revisit their code and think properly about whether they really wanted to allow None in each case.
--
Greg

From robertwb at gmail.com Tue May 8 02:19:58 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 7 May 2012 17:19:58 -0700
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To: <4FA8633C.2040500@canterbury.ac.nz>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA8633C.2040500@canterbury.ac.nz>
Message-ID:

On Mon, May 7, 2012 at 5:05 PM, Greg Ewing wrote:
> Stefan Behnel wrote:
>> The main reason we didn't change this behaviour back then was that it would
>> clearly break user code and we thought we could do without that. That's
>> different from considering it "right" and "good".
>
> I changed the None-checking behaviour in Pyrex because I *wanted*
> to break user code. Or rather, I didn't think it would be a
> bad thing to make people revisit their code and think properly
> about whether they really wanted to allow None in each case.

That's great if you have the time, but revisiting half a million lines of code (e.g. Sage) can be quite expensive... especially as a short-term patch for a better long-term solution (mostly optimized-away None-checks on access).

My bigger issue, and why I don't think this is the right long-term solution, is that

(cp)def foo(ExnClass arg): ...

should behave the same as

(cp)def foo(arg):
    cdef ExnClass a = arg

I think part of the difference is also how strongly the line is drawn between the compiled and un-compiled portions of the program. Cython blurs the line (more) between "called from Cython" and "called from Python," and only the latter doing None-checking is inconsistent too.

- Robert

From stefan_ml at behnel.de Tue May 8 06:41:04 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 06:41:04 +0200
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To: <4FA8633C.2040500@canterbury.ac.nz>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA8633C.2040500@canterbury.ac.nz>
Message-ID: <4FA8A3E0.2010109@behnel.de>

Greg Ewing, 08.05.2012 02:05:
> Stefan Behnel wrote:
>> The main reason we didn't change this behaviour back then was that it would
>> clearly break user code and we thought we could do without that. That's
>> different from considering it "right" and "good".
>
> I changed the None-checking behaviour in Pyrex because I *wanted*
> to break user code. Or rather, I didn't think it would be a
> bad thing to make people revisit their code and think properly
> about whether they really wanted to allow None in each case.

The problem here is that it's not very likely that people specifically tested their code with None values, especially if they didn't carefully think of it already when writing it.

So changing the default to make people think may not result in making them think before their code starts throwing exceptions somewhere in production. And having to revisit a large amount of code at that point may turn out to be rather expensive.

Stefan

From stefan_ml at behnel.de Tue May 8 08:03:49 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 08:03:49 +0200
Subject: [Cython] How do you trigger a jenkins build?
In-Reply-To:
References:
Message-ID: <4FA8B745.6080700@behnel.de>

Vitja Makarov, 07.05.2012 17:08:
> I've noticed that old one URL hook doesn't work for me now.
>
> I tried to check "Build when a change is pushed to GitHub"

That should work.

> and set "Jenkins Hook URL" to
> https://sage.math.washington.edu:8091/hudson/github-webhook/

That isn't configured in Jenkins but in your own GitHub repo as a "post receive URL" (admin->service hooks).

Stefan

From stefan_ml at behnel.de Tue May 8 08:12:18 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 08:12:18 +0200
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To:
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7F33A.9020903@astro.uio.no> <4FA7F477.30701@behnel.de> <4FA80109.9020201@behnel.de>
Message-ID: <4FA8B942.5040306@behnel.de>

mark florisson, 07.05.2012 19:13:
> On 7 May 2012 18:06, Stefan Behnel wrote:
>> mark florisson, 07.05.2012 18:18:
>>> On 7 May 2012 17:16, mark florisson wrote:
>>>> On 7 May 2012 17:12, Stefan Behnel wrote:
>>>>> Dag Sverre Seljebotn, 07.05.2012 18:07:
>>>>>> On 05/07/2012 06:04 PM, mark florisson wrote:
>>>>>>> On 7 May 2012 12:10, Stefan Behnel wrote:
>>>>>>>> BTW, is there a reason why we shouldn't allow a "not None" declaration for cdef functions? Obviously, the caller would have to do the check in that case.
>>>>>>>
>>>>>>> Why can't the callee just check it? If it's None, just raise an exception like usual?
>>>>>>
>>>>>> It's just that there's a lot more potential for rather easy optimization if the caller does it.
>>>>>
>>>>> Exactly. The NoneCheckNode is easy to get rid of at any stage in the pipeline, whereas a hard-coded None check has a fixed cost at runtime.
>>>>
>>>> I see, yes. I expect a pointer comparison to be reasonably insignificant compared to function call overhead, but it would also reduce the code in the instruction cache. If you take the address of the function though, or if you declare it public in a pxd, you probably don't want to do that, as you still want to be safe when called from C. Could do the same trick as in the 'less annotations' CEP though, that would be nice.
>>>
>>> ... or you could document that 'not None' means the caller cannot pass it in, but that would be weird as you could do it from Cython and get an exception, but not from C :) That would be better specified in the documentation of the function as its contract or whatever.
>>
>> "not None" on a cdef function means what all declarations on cdef functions mean: the caller is responsible for doing the appropriate type conversions and checks.
>>
>> If a function accepts an int32 and the caller puts a float32 on the stack, it's not the fault of the callee. The same applies to extension type arguments and None checks.
>
> Well, 'with gil' makes the callee do something.

There are two sides to this one. A "with gil" function can be called from nogil code, so, in a way, the "with gil" declaration is only shorthand for a "nogil" declaration with a "with gil" block inside the function. It's also a historical artefact of the (long) time before we actually had "with gil" blocks, but it's convenient in that it saves a level of indentation that would otherwise uselessly cover the whole function.

> I would personally expect not None to be enforced at least conceptually in the function itself.

As usual: in Python functions, yes, in C functions, no. Vitek's Python wrapper split was a good step toward a better design here, one that reflects this separation of concerns.
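A short sketch of that split, with invented names -- the cdef callee trusts its caller, while the "not None" declaration on the def wrapper enforces the check at the Python boundary:

    cdef class A:
        cdef public int field

    cdef int use(A obj):           # C-level callee: assumes obj is not None
        return obj.field

    def entry(A obj not None):     # Python-level entry: the check lives here
        return use(obj)

Calling entry(None) from Python raises a TypeError before use() ever runs; calling use() directly from C code with a None argument would be the caller's bug, just like passing the wrong numeric type.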
> In any case, I also think it's really not an important issue, > as it's likely pretty uncommon to call it from C. If it does break, it > will be easy enough to figure out (unless you accidentally corrupt > your memory :) So either solution would be fine with me. Good. Stefan From stefan_ml at behnel.de Tue May 8 08:22:55 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 May 2012 08:22:55 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FA80212.1030808@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <4FA7FFBF.4010905@behnel.de> <4FA80212.1030808@astro.uio.no> Message-ID: <4FA8BBBF.5080808@behnel.de> Dag Sverre Seljebotn, 07.05.2012 19:10: > On 05/07/2012 07:00 PM, Stefan Behnel wrote: >> mark florisson, 07.05.2012 18:28: >>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>>> Stefan Behnel, 07.05.2012 15:04: >>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it as >>>>>>> np.ndarray, array.array etc. being some sort of "template types". That >>>>>>> is, >>>>>>> we disallow "object[int]" and require some special declarations in the >>>>>>> relevant pxd files. >>>>>> >>>>>> Hmm, yes, it's unfortunate that we have two different types of syntax >>>>>> now, >>>>>> one that declares the item type before the brackets and one that >>>>>> declares it afterwards. >>>>> >>>>> I actually think this merits some more discussion. Should we consider the >>>>> buffer interface syntax deprecated and focus on the memory view syntax? >>>> >>>> I think that's the very-long-term intention. Then again, it may be too >>>> early >>>> to really tell yet, we just need to see how the memory views play out in >>>> real life and whether they'll be able to replace np.ndarray[double] among >>>> real users. We don't want to shove things down users throats. >>>> >>>> But the use of the trailing-[] syntax needs some cleaning up. Me and Mark >>>> agreed we'd put this proposal forward when we got around to it: >>>> >>>> - Deprecate the "object[double]" form, where [dtype] can be stuck on any >>>> extension type >>>> >>>> - But, do NOT (for the next year at least) deprecate np.ndarray[double], >>>> array.array[double], etc. Basically, there should be a magic flag in >>>> extension type declarations saying "I can be a buffer". >>>> >>>> For one thing, that is sort of needed to open up things for templated cdef >>>> classes/fused types cdef classes, if that is ever implemented. >>> >>> Deprecating is definitely a good start. >> >> Then the first step on that road is to rework the documentation so that it >> pushes users into going for memory views instead of the plain buffer syntax. > > -1, premature. Ok, fine. Then we should at least put them next to each other in the NumPy docs and explain a) what the differences are and b) which one users should choose for use cases X, Y and Z. The docs should also make it clear that using "np.ndarray" is only useful for making code work with CPython < 2.6 (and maybe some other cases where NumPy's C-API is leveraged internally), but that this declaration has the drawback of making code less versatile, e.g. 
because it will *not* work with memoryviews and other kinds of buffers but only with plain NumPy arrays. Currently, it basically tells people that statically typed NumPy arrays are the only way to get things working.

If it's known to be likely that something will become less important or even deprecated at some point in the future, it's best to make users aware by adapting the documentation ASAP, so that less impacted code gets written in the meantime.

Stefan

From russ at perspexis.com Tue May 8 08:25:11 2012
From: russ at perspexis.com (Russell Warren)
Date: Tue, 8 May 2012 02:25:11 -0400
Subject: [Cython] Bug report: enumerate does not accept the "start" argument
Message-ID:

Python's built-in function 'enumerate' has a lesser-known second argument that allows the start value of the enumeration to be set. See the python docs here:
http://docs.python.org/library/functions.html#enumerate

Cython 0.16 doesn't like it, and only allows one argument. Here is a simple file to reproduce the failure:

for i in enumerate("abc", 1):
    print i

And the resulting complaint:

Error compiling Cython file:
------------------------------------------------------------
...
for i in enumerate("abc", 1):
                  ^
------------------------------------------------------------
deploy/_working/_cython_test.pyx:1:18: enumerate() takes at most 1 argument

I have requested a trac login to file bugs like this, but the request is pending (just sent).

From robertwb at gmail.com Tue May 8 08:36:02 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 7 May 2012 23:36:02 -0700
Subject: [Cython] Fwd: Re: [cython-users] checking for "None" in nogil function
In-Reply-To: <4FA8A3E0.2010109@behnel.de>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA8633C.2040500@canterbury.ac.nz> <4FA8A3E0.2010109@behnel.de>
Message-ID:

On Mon, May 7, 2012 at 9:41 PM, Stefan Behnel wrote:
> Greg Ewing, 08.05.2012 02:05:
>> Stefan Behnel wrote:
>>> The main reason we didn't change this behaviour back then was that it would
>>> clearly break user code and we thought we could do without that. That's
>>> different from considering it "right" and "good".
>>
>> I changed the None-checking behaviour in Pyrex because I *wanted*
>> to break user code. Or rather, I didn't think it would be a
>> bad thing to make people revisit their code and think properly
>> about whether they really wanted to allow None in each case.
>
> The problem here is that it's not very likely that people specifically
> tested their code with None values, especially if they didn't carefully
> think of it already when writing it.
>
> So changing the default to make people think may not result in making them
> think before their code starts throwing exceptions somewhere in production.
> And having to revisit a large amount of code at that point may turn out to
> be rather expensive.

There's also the problem of people (including me) who wrote a lot of code that *does* correctly handle the None case which, with this change, would suddenly start (erroneously) throwing exceptions without all being revisited.

- Robert

From vitja.makarov at gmail.com Tue May 8 08:43:33 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Tue, 8 May 2012 10:43:33 +0400
Subject: [Cython] How do you trigger a jenkins build?
In-Reply-To: <4FA8B745.6080700@behnel.de> References: <4FA8B745.6080700@behnel.de> Message-ID: 2012/5/8 Stefan Behnel : > Vitja Makarov, 07.05.2012 17:08: >> I've noticed that old one URL hook doesn't work for me now. >> >> I tried to check "Build when a change is pushed to GitHub" > > That should work. > > >> and set "Jenkins Hook URL" ?to >> https://sage.math.washington.edu:8091/hudson/github-webhook/ > > That isn't configured in Jenkins but in your own GitHub repo as a "post > receive URL" (admin->service hooks). > Thanks! -- vitja. From stefan_ml at behnel.de Tue May 8 09:37:00 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 08 May 2012 09:37:00 +0200 Subject: [Cython] Bug report: enumerate does not accept the "start" argument In-Reply-To: References: Message-ID: <4FA8CD1C.6030803@behnel.de> Russell Warren, 08.05.2012 08:25: > Python's built-in function 'enumerate' has a lesser-known 2nd argument that > allows the start value of the enumeration to be set. See the python docs > here: > http://docs.python.org/library/functions.html#enumerate > > Cython 0.16 doesn't like it, and only allows one argument. > > Here is a simple file to reproduce the failure: > > for i in enumerate("abc", 1): >> print i > > > And the resulting output complaint: > > Error compiling Cython file: >> ------------------------------------------------------------ >> ... >> for i in enumerate("abc", 1): >> ^ >> ------------------------------------------------------------ >> deploy/_working/_cython_test.pyx:1:18: enumerate() takes at most 1 argument Thanks for the report, here is a fix: https://github.com/cython/cython/commit/2e3a306d0b624993d41a02f790725d8b2100e57d > I have requested a trac login to file bugs like this, but the request is > pending (just sent). Please file it anyway (when you get your account) so that we can document in the tracker that it's fixed. Stefan From d.s.seljebotn at astro.uio.no Tue May 8 09:57:44 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 May 2012 09:57:44 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> Message-ID: <4FA8D1F8.5020109@astro.uio.no> On 05/07/2012 11:21 PM, mark florisson wrote: > On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: >> >> >> mark florisson wrote: >> >>> On 7 May 2012 17:00, Dag Sverre Seljebotn >>> wrote: >>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>>> >>>>> Stefan Behnel, 07.05.2012 15:04: >>>>>> >>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>>> >>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it >>> as >>>>>>> np.ndarray, array.array etc. being some sort of "template types". >>> That >>>>>>> is, >>>>>>> we disallow "object[int]" and require some special declarations in >>> the >>>>>>> relevant pxd files. >>>>>> >>>>>> >>>>>> Hmm, yes, it's unfortunate that we have two different types of >>> syntax >>>>>> now, >>>>>> one that declares the item type before the brackets and one that >>> declares >>>>>> it afterwards. >>>>> >>>>> >>>>> I actually think this merits some more discussion. Should we >>> consider the >>>>> buffer interface syntax deprecated and focus on the memory view >>> syntax? 
>>>>
>>>> I think that's the very-long-term intention. Then again, it may be too early to really tell yet; we just need to see how the memory views play out in real life and whether they'll be able to replace np.ndarray[double] among real users. We don't want to shove things down users' throats.
>>>>
>>>> But the use of the trailing-[] syntax needs some cleaning up. Mark and I agreed we'd put this proposal forward when we got around to it:
>>>>
>>>> - Deprecate the "object[double]" form, where [dtype] can be stuck on any extension type
>>>>
>>>> - But, do NOT (for the next year at least) deprecate np.ndarray[double], array.array[double], etc. Basically, there should be a magic flag in extension type declarations saying "I can be a buffer".
>>>>
>>>> For one thing, that is sort of needed to open up things for templated cdef classes/fused types cdef classes, if that is ever implemented.
>>>
>>> Deprecating is definitely a good start. I think at least if you only allow two types as buffers it will be reasonably clear when one is dealing with fused types or buffers.
>>>
>>> Basically, I think memoryviews should live up to the demands of the users, which would mean there would be no reason to keep the buffer syntax.
>>
>> But they are different approaches -- use a different type/API, or just try to speed up parts of NumPy.
>>
>>> One thing to do is make memoryviews coerce cheaply back to the original objects if wanted (which is likely). Writing np.asarray(mymemview) is kind of annoying.
>>
>> It is going to be very confusing to have type(mymemview), repr(mymemview), and so on come out as NumPy arrays, but not have the full API of NumPy. Unless you auto-convert on getattr to...
>
> Yeah, the idea is very simple, as you mention: just keep the object around cached, and when you slice, construct one lazily.
>
>> If you want to eradicate the distinction between the backing array and the memory view and make it transparent, I really suggest you bring np.ndarray back to life (it can exist in some 'unrealized' state with delayed construction after slicing, and so on). The implementation is much the same either way; it is all about how it is presented to the user.
>
> You mean the buffer syntax?
>
>> Something like mymemview.asobject() could work though, and while not much shorter, it would have some polymorphism that np.asarray does not have (based probably on some custom PEP 3118 extension)
>
> I was thinking you could allow the user to register a callback, and use that to coerce from a memoryview back to an object (given a memoryview object). For numpy this would be np.asarray, and the implementation is allowed to cache the result (which it will). It may be too magicky though... but it will be convenient. The memoryview will act as a subclass, meaning that any of its methods will override methods of the converted object.

My point was that this seems *way* too magicky.

Beyond "confusing users" and so on, which is sort of subjective, here's a fundamental problem for you: we're making it very difficult to type-infer memoryviews. Consider:

cdef double[:] x = ...
y = x
print y.shape

Now, because y is not typed, you're semantically throwing in a conversion on line 2, so that line 3 says that you want the attribute access to be invoked on "whatever object x coerced back to". And we have no idea what kind of object that is.
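Spelled out with a runnable variant of that snippet (np.zeros stands in for the elided initializer, purely for illustration):

    import numpy as np

    def demo():
        cdef double[:] x = np.zeros(3)
        y = x           # transparent coercion: y is "some unknown object";
                        # no coercion: y could be inferred as double[:]
        print y.shape   # Python getattr on an unknown type, or a
                        # compile-time memoryview attribute

Either reading happens to print (3,) here; the difference is whether the compiler can know y's type.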
If you don't transparently convert to object, it'd be safe to automatically infer y as a double[:].

On a related note, I've said before that I dislike the notion of

cdef double[:] mview = obj

I'd rather like

cdef double[:] mview = double[:](obj)

I support Robert in that "np.ndarray[double]" is the syntax to use when you want this kind of transparent "be an object when I need to and a memory view when I need to".

Proposal:

1) We NEVER deprecate "np.ndarray[double]"; we commit to keeping that in the language. It means exactly what you would like double[:] to mean, i.e. a variable that is a memoryview when you need it to be and an object otherwise. When you use this type, you bear the consequences of early-binding things that could in theory be overridden.

2) double[:] is for when you want to access data of *any* Python object in a generic way. Raw PEP 3118. In those situations, access to the underlying object is much less useful.

2a) Therefore we require that you do "mview.asobject()" manually; doing "mview.foo()" is a compile-time error.

2b) To drive the point home among users, and to aid type inference and overall language clarity, we REMOVE the auto-acquisition and require that you do

cdef double[:] mview = double[:](obj)

2c) Perhaps: do not even coerce to a Python memoryview and disallow "print mview"; instead require that you do "print mview.asmemoryview()" or "print memoryview(mview)" or somesuch.

(A related proposal that's been up earlier has been that a variable can be annotated with many interfaces; e.g.

cdef A|B|C obj

...and then when you do "obj.method", it is first looked up in C, then B, then A, then Python getattr. Not sure if we want to reopen that can of worms...)

Dag

From stefan_ml at behnel.de Tue May 8 10:18:49 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 10:18:49 +0200
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8D1F8.5020109@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no>
Message-ID: <4FA8D6E9.9090004@behnel.de>

Dag Sverre Seljebotn, 08.05.2012 09:57:
> On 05/07/2012 11:21 PM, mark florisson wrote:
>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote:
>>> mark florisson wrote:
>>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote:
>>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>>> BTW, with the coming of memoryviews, Mark and I talked about just deprecating the "mytype[...]" meaning buffers, and rather treating np.ndarray, array.array etc. as some sort of "template types". That is, we disallow "object[int]" and require some special declarations in the relevant pxd files.
>>>>>>>
>>>>>>> Hmm, yes, it's unfortunate that we have two different types of syntax now, one that declares the item type before the brackets and one that declares it afterwards.
>>>>>> Should we consider the buffer interface syntax deprecated and focus on the memory view syntax?
>>>>>
>>>>>> I think that's the very-long-term intention.
Then again, it may be >>>>> too early >>>>> to really tell yet, we just need to see how the memory views play out >>>>> in >>>>> real life and whether they'll be able to replace np.ndarray[double] >>>>> among real users. We don't want to shove things down users throats. >>>>> >>>>> But the use of the trailing-[] syntax needs some cleaning up. Me and >>>>> Mark agreed we'd put this proposal forward when we got around to it: >>>>> >>>>> - Deprecate the "object[double]" form, where [dtype] can be stuck on >>>>> any extension type >>>>> >>>>> - But, do NOT (for the next year at least) deprecate >>>>> np.ndarray[double], >>>>> array.array[double], etc. Basically, there should be a magic flag in >>>>> extension type declarations saying "I can be a buffer". >>>>> >>>>> For one thing, that is sort of needed to open up things for templated >>>>> cdef classes/fused types cdef classes, if that is ever implemented. >>>> >>>> Deprecating is definitely a good start. I think at least if you only >>>> allow two types as buffers it will be at least reasonably clear when >>>> one is dealing with fused types or buffers. >>>> >>>> Basically, I think memoryviews should live up to demands of the users, >>>> which would mean there would be no reason to keep the buffer syntax. >>> >>> But they are different approaches -- use a different type/API, or just >>> try to speed up parts of NumPy.. >>> >>>> One thing to do is make memoryviews coerce cheaply back to the >>>> original objects if wanted (which is likely). Writting >>>> np.asarray(mymemview) is kind of annoying. >>> >>> It is going to be very confusing to have type(mymemview), >>> repr(mymemview), and so on come out as NumPy arrays, but not have the >>> full API of NumPy. Unless you auto-convert on getattr to... >> >> Yeah, the idea is as very simple, as you mention, just keep the object >> around cached, and when you slice construct one lazily. >> >>> If you want to eradicate the distinction between the backing array and >>> the memory view and make it transparent, I really suggest you kick back >>> alive np.ndarray (it can exist in some 'unrealized' state with delayed >>> construction after slicing, and so on). Implementation much the same >>> either way, it is all about how it is presented to the user. >> >> You mean the buffer syntax? >> >>> Something like mymemview.asobject() could work though, and while not >>> much shorter, it would have some polymorphism that np.asarray does not >>> have (based probably on some custom PEP 3118 extension) >> >> I was thinking you could allow the user to register a callback, and >> use that to coerce from a memoryview back to an object (given a >> memoryview object). For numpy this would be np.asarray, and the >> implementation is allowed to cache the result (which it will). >> It may be too magicky though... but it will be convenient. The >> memoryview will act as a subclass, meaning that any of its methods >> will override methods of the converted object. > > My point was that this seems *way* to magicky. > > Beyond "confusing users" and so on that are sort of subjective, here's a > fundamental problem for you: We're making it very difficult to type-infer > memoryviews. Consider: > > cdef double[:] x = ... > y = x > print y.shape > > Now, because y is not typed, you're semantically throwing in a conversion > on line 2, so that line 3 says that you want the attribute access to be > invoked on "whatever object x coerced back to". And we have no idea what > kind of object that is. 
> > If you don't transparently convert to object, it'd be safe to automatically > infer y as a double[:]. Why can't y be inferred as the type of x due to the assignment? > On a related note, I've said before that I dislike the notion of > > cdef double[:] mview = obj > > I'd rather like > > cdef double[:] mview = double[:](obj) Why? We currently allow cdef char* s = some_py_bytes_string Auto-coercion is a serious part of the language, and I don't see the advantage of requiring the redundancy in the case above. It's clear enough to me what the typed assignment is intended to mean: get me a buffer view on the object, regardless of what it is. > I support Robert in that "np.ndarray[double]" is the syntax to use when you > want this kind of transparent "be an object when I need to and a memory > view when I need to". > > Proposal: > > 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in > the language. It means exactly what you would like double[:] to mean, i.e. > a variable that is memoryview when you need to and an object otherwise. > When you use this type, you bear the consequences of early-binding things > that could in theory be overridden. > > 2) double[:] is for when you want to access data of *any* Python object in > a generic way. Raw PEP 3118. In those situations, access to the underlying > object is much less useful. > > 2a) Therefore we require that you do "mview.asobject()" manually; doing > "mview.foo()" is a compile-time error Sounds good. I think that would clean up the current syntax overlap very nicely. > 2b) To drive the point home among users, and aid type inference and > overall language clarity, we REMOVE the auto-acquisition and require that > you do > > cdef double[:] mview = double[:](obj) I don't see the point, as noted above. Either "obj" is statically typed and the bare assignment becomes a no-op, or it's not typed and the assignment coerces by creating a view. As with all other typed assignments. > 2c) Perhaps: Do not even coerce to a Python memoryview and disallow > "print mview"; instead require that you do "print mview.asmemoryview()" or > "print memoryview(mview)" or somesuch. This seems to depend on 2b. > (A related proposal that's been up earlier has been that a variable can be > annotated with many interfaces; e.g. > > cdef A|B|C obj > > ...and then when you do "obj.method", it is first looked up in C, then B, > then A, then Python getattr. Not sure if we want to reopen that can of > worms...) Different topic - new thread? Stefan From d.s.seljebotn at astro.uio.no Tue May 8 10:31:32 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 May 2012 10:31:32 +0200 Subject: [Cython] buffer syntax vs. 
memory view syntax In-Reply-To: <4FA8D6E9.9090004@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> Message-ID: <4FA8D9E4.1040102@astro.uio.no> On 05/08/2012 10:18 AM, Stefan Behnel wrote: > Dag Sverre Seljebotn, 08.05.2012 09:57: >> On 05/07/2012 11:21 PM, mark florisson wrote: >>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote: >>>> mark florisson wrote: >>>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote: >>>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote: >>>>>>> Stefan Behnel, 07.05.2012 15:04: >>>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48: >>>>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just >>>>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it >>>>>>>>> as np.ndarray, array.array etc. being some sort of "template types". >>>>>>>>> That is, >>>>>>>>> we disallow "object[int]" and require some special declarations in >>>>>>>>> the relevant pxd files. >>>>>>>> >>>>>>>> Hmm, yes, it's unfortunate that we have two different types of >>>>>>>> syntax now, >>>>>>>> one that declares the item type before the brackets and one that >>>>>>>> declares it afterwards. >>>>>>> Should we consider the >>>>>>> buffer interface syntax deprecated and focus on the memory view >>>>>>> syntax? >>>>>> >>>>>> I think that's the very-long-term intention. Then again, it may be >>>>>> too early >>>>>> to really tell yet, we just need to see how the memory views play out >>>>>> in >>>>>> real life and whether they'll be able to replace np.ndarray[double] >>>>>> among real users. We don't want to shove things down users throats. >>>>>> >>>>>> But the use of the trailing-[] syntax needs some cleaning up. Me and >>>>>> Mark agreed we'd put this proposal forward when we got around to it: >>>>>> >>>>>> - Deprecate the "object[double]" form, where [dtype] can be stuck on >>>>>> any extension type >>>>>> >>>>>> - But, do NOT (for the next year at least) deprecate >>>>>> np.ndarray[double], >>>>>> array.array[double], etc. Basically, there should be a magic flag in >>>>>> extension type declarations saying "I can be a buffer". >>>>>> >>>>>> For one thing, that is sort of needed to open up things for templated >>>>>> cdef classes/fused types cdef classes, if that is ever implemented. >>>>> >>>>> Deprecating is definitely a good start. I think at least if you only >>>>> allow two types as buffers it will be at least reasonably clear when >>>>> one is dealing with fused types or buffers. >>>>> >>>>> Basically, I think memoryviews should live up to demands of the users, >>>>> which would mean there would be no reason to keep the buffer syntax. >>>> >>>> But they are different approaches -- use a different type/API, or just >>>> try to speed up parts of NumPy.. >>>> >>>>> One thing to do is make memoryviews coerce cheaply back to the >>>>> original objects if wanted (which is likely). Writting >>>>> np.asarray(mymemview) is kind of annoying. >>>> >>>> It is going to be very confusing to have type(mymemview), >>>> repr(mymemview), and so on come out as NumPy arrays, but not have the >>>> full API of NumPy. Unless you auto-convert on getattr to... 
>>> >>> Yeah, the idea is as very simple, as you mention, just keep the object >>> around cached, and when you slice construct one lazily. >>> >>>> If you want to eradicate the distinction between the backing array and >>>> the memory view and make it transparent, I really suggest you kick back >>>> alive np.ndarray (it can exist in some 'unrealized' state with delayed >>>> construction after slicing, and so on). Implementation much the same >>>> either way, it is all about how it is presented to the user. >>> >>> You mean the buffer syntax? >>> >>>> Something like mymemview.asobject() could work though, and while not >>>> much shorter, it would have some polymorphism that np.asarray does not >>>> have (based probably on some custom PEP 3118 extension) >>> >>> I was thinking you could allow the user to register a callback, and >>> use that to coerce from a memoryview back to an object (given a >>> memoryview object). For numpy this would be np.asarray, and the >>> implementation is allowed to cache the result (which it will). >>> It may be too magicky though... but it will be convenient. The >>> memoryview will act as a subclass, meaning that any of its methods >>> will override methods of the converted object. >> >> My point was that this seems *way* to magicky. >> >> Beyond "confusing users" and so on that are sort of subjective, here's a >> fundamental problem for you: We're making it very difficult to type-infer >> memoryviews. Consider: >> >> cdef double[:] x = ... >> y = x >> print y.shape >> >> Now, because y is not typed, you're semantically throwing in a conversion >> on line 2, so that line 3 says that you want the attribute access to be >> invoked on "whatever object x coerced back to". And we have no idea what >> kind of object that is. >> >> If you don't transparently convert to object, it'd be safe to automatically >> infer y as a double[:]. > > Why can't y be inferred as the type of x due to the assignment? > > >> On a related note, I've said before that I dislike the notion of >> >> cdef double[:] mview = obj >> >> I'd rather like >> >> cdef double[:] mview = double[:](obj) > > Why? We currently allow > > cdef char* s = some_py_bytes_string > > Auto-coercion is a serious part of the language, and I don't see the > advantage of requiring the redundancy in the case above. It's clear enough > to me what the typed assignment is intended to mean: get me a buffer view > on the object, regardless of what it is. Good point. I admit defeat. There's slight difference in that there's more of a 1:1 between a bytes and a char*, whereas there's a many:1 for buffers. But it doesn't seem to matter, since "char*" doesn't coerce back to object automatically. (Though that fact is an argument against letting memoryviews coerce to objects automatically) (Also I happen to not like this part of the language -- I think it's making us be further from Python than we would need to -- but that's not relevant in this thread at all, but rather in some pure Python mode thread.) > > >> I support Robert in that "np.ndarray[double]" is the syntax to use when you >> want this kind of transparent "be an object when I need to and a memory >> view when I need to". >> >> Proposal: >> >> 1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in >> the language. It means exactly what you would like double[:] to mean, i.e. >> a variable that is memoryview when you need to and an object otherwise. 
>> On a related note, I've said before that I dislike the notion of
>>
>> cdef double[:] mview = obj
>>
>> I'd rather like
>>
>> cdef double[:] mview = double[:](obj)
>
> Why? We currently allow
>
>     cdef char* s = some_py_bytes_string
>
> Auto-coercion is a serious part of the language, and I don't see the
> advantage of requiring the redundancy in the case above. It's clear
> enough to me what the typed assignment is intended to mean: get me a
> buffer view on the object, regardless of what it is.

Good point. I admit defeat. There's a slight difference in that there's
more of a 1:1 between a bytes and a char*, whereas there's a many:1 for
buffers. But it doesn't seem to matter, since "char*" doesn't coerce
back to object automatically. (Though that fact is an argument against
letting memoryviews coerce to objects automatically.)

(Also, I happen to not like this part of the language -- I think it's
moving us further from Python than we would need to be -- but that's
not relevant in this thread at all, rather in some pure Python mode
thread.)

>> I support Robert in that "np.ndarray[double]" is the syntax to use
>> when you want this kind of transparent "be an object when I need to
>> and a memory view when I need to".
>>
>> Proposal:
>>
>>  1) We NEVER deprecate "np.ndarray[double]"; we commit to keeping
>> that in the language. It means exactly what you would like double[:]
>> to mean, i.e. a variable that is a memoryview when you need it to be
>> and an object otherwise. When you use this type, you bear the
>> consequences of early-binding things that could in theory be
>> overridden.
>>
>>  2) double[:] is for when you want to access data of *any* Python
>> object in a generic way. Raw PEP 3118. In those situations, access to
>> the underlying object is much less useful.
>>
>>   2a) Therefore we require that you do "mview.asobject()" manually;
>> doing "mview.foo()" is a compile-time error.
>
> Sounds good. I think that would clean up the current syntax overlap
> very nicely.
>
>>   2b) To drive the point home among users, and to aid type inference
>> and overall language clarity, we REMOVE the auto-acquisition and
>> require that you do
>>
>>     cdef double[:] mview = double[:](obj)
>
> I don't see the point, as noted above. Either "obj" is statically
> typed and the bare assignment becomes a no-op, or it's not typed and
> the assignment coerces by creating a view. As with all other typed
> assignments.
>
>>   2c) Perhaps: Do not even coerce to a Python memoryview and disallow
>> "print mview"; instead require that you do
>> "print mview.asmemoryview()" or "print memoryview(mview)" or
>> somesuch.
>
> This seems to depend on 2b.
>
>> (A related proposal that's been up earlier has been that a variable
>> can be annotated with many interfaces; e.g.
>>
>> cdef A|B|C obj
>>
>> ...and then when you do "obj.method", it is first looked up in C,
>> then B, then A, then Python getattr. Not sure if we want to reopen
>> that can of worms...)
>
> Different topic - new thread?

It's very related, since np.ndarray[double] would essentially be
"np.ndarray | double[:]".

Dag

From d.s.seljebotn at astro.uio.no Tue May 8 10:36:18 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 08 May 2012 10:36:18 +0200
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8D6E9.9090004@behnel.de>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de>
Message-ID: <4FA8DB02.2020902@astro.uio.no>

On 05/08/2012 10:18 AM, Stefan Behnel wrote:
> Dag Sverre Seljebotn, 08.05.2012 09:57:
> [...]
>>   2c) Perhaps: Do not even coerce to a Python memoryview and disallow
>> "print mview"; instead require that you do
>> "print mview.asmemoryview()" or "print memoryview(mview)" or
>> somesuch.
>
> This seems to depend on 2b.

This I don't understand. The question of 2c) is the analogue to
auto-coercion of "char*" to bytes; approving 2c) would put memoryviews
in line with char*.

Then again, we could in future auto-coerce char* to a ctypes pointer,
and in that case, coercing a memoryview to an object representing that
memoryview would be OK.

Either way, you would never get back the same object that you coerced
from!

Dag
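The char* analogue in code form -- a round-trip sketch (assumes the
bytes value has no embedded NUL bytes):

    def roundtrip(bytes b):
        cdef char* s = b    # auto-coercion: borrows a pointer into b
        cdef bytes b2 = s   # auto-coercion back: copies into NEW bytes
        return b2 is b      # False -- you never get the original object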
From stefan_ml at behnel.de Tue May 8 10:49:56 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 10:49:56 +0200
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8DB02.2020902@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no>
Message-ID: <4FA8DE34.6050806@behnel.de>

Dag Sverre Seljebotn, 08.05.2012 10:36:
> On 05/08/2012 10:18 AM, Stefan Behnel wrote:
> [...]
> This I don't understand. The question of 2c) is the analogue to
> auto-coercion of "char*" to bytes; approving 2c) would put memoryviews
> in line with char*.
>
> Then again, we could in future auto-coerce char* to a ctypes pointer,
> and in that case, coercing a memoryview to an object representing that
> memoryview would be OK.
>
> Either way, you would never get back the same object that you coerced
> from!

Ah, that's what you meant. I thought you were referring to getting a
memoryview from an object.

I agree that a buffer view shouldn't auto-coerce back to its owner (or
to a Python object in general); that's the whole point of the syntax
cleanup.

In simple cases, buffer.obj would be the thing to talk to, except for
memory views, where only the view knows the mapped memory layout but
the underlying exporter has the methods to deal with the buffer. In
that case, we may really want to leave it to the user to handle this.
I don't think the compiler can do the right thing in all cases, and the
user is really the only one who knows what kind of object should be
used or even instantiated to wrap a buffer. Nothing we can do is
shorter or more clearly readable than np.asarray() or whatever function
a specific library has for this.

So, what about just keeping buffer.obj visible and leaving everything
else to users?

Stefan
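The "leave it to users" pattern, concretely -- a minimal sketch of the
explicit re-wrap being described (assumes NumPy):

    import numpy as np

    def process(double[:] mv):
        arr = np.asarray(mv)  # user explicitly picks the wrapper type
        return arr.sum()      # full NumPy API available on the wrapper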
From markflorisson88 at gmail.com Tue May 8 11:21:04 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 10:21:04 +0100
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8DE34.6050806@behnel.de>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8DE34.6050806@behnel.de>
Message-ID:

On 8 May 2012 09:49, Stefan Behnel wrote:
> Dag Sverre Seljebotn, 08.05.2012 10:36:
> [...]
> So, what about just keeping buffer.obj visible and leaving everything
> else to users?

buffer.base gets you the original object.
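That attribute in use -- a sketch (assumes NumPy; Cython's typed
memoryviews expose the exporting object as .base):

    import numpy as np

    def get_back(double[:] mv):
        return mv.base  # the object the buffer was acquired from

    # a = np.zeros(4); get_back(a) is a  ->  True: the original
    # exporter, not a copy or a wrapper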
From markflorisson88 at gmail.com Tue May 8 11:22:24 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 10:22:24 +0100
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8DB02.2020902@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no>
Message-ID:

On 8 May 2012 09:36, Dag Sverre Seljebotn wrote:
> On 05/08/2012 10:18 AM, Stefan Behnel wrote:
> [...]
> This I don't understand. The question of 2c) is the analogue to
> auto-coercion of "char*" to bytes; approving 2c) would put memoryviews
> in line with char*.
>
> Then again, we could in future auto-coerce char* to a ctypes pointer,
> and in that case, coercing a memoryview to an object representing that
> memoryview would be OK.

Character pointers coerce to strings. Hell, even structs coerce to and
from python dicts, so disallowing the same for memoryviews would just
be inconsistent and inconvenient.

> Either way, you would never get back the same object that you coerced
> from!
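The struct coercion referenced above, for concreteness -- a minimal
sketch (Point and roundtrip are illustrative names):

    cdef struct Point:
        double x
        double y

    def roundtrip(d):
        cdef Point p = d  # dict -> struct (keys must match the fields)
        p.x += 1
        return p          # struct -> dict on returning to Python

    # roundtrip({'x': 1.0, 'y': 2.0})  ->  {'x': 2.0, 'y': 2.0}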
From markflorisson88 at gmail.com Tue May 8 11:24:56 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 10:24:56 +0100
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To:
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no>
Message-ID:

On 8 May 2012 10:22, mark florisson wrote:
> On 8 May 2012 09:36, Dag Sverre Seljebotn wrote:
> [...]
> Character pointers coerce to strings. Hell, even structs coerce to and
> from python dicts, so disallowing the same for memoryviews would just
> be inconsistent and inconvenient.

Also, if you don't allow coercion from python, then it means they also
cannot be used as 'def' function arguments and be called from python.
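What that coercion buys in practice -- a sketch (assumes NumPy; any
object exporting a contiguous double buffer would do):

    import numpy as np

    def total(double[:] data):  # coercion makes this callable from Python
        cdef double s = 0
        cdef Py_ssize_t i
        for i in range(data.shape[0]):
            s += data[i]
        return s

    # total(np.arange(5.0))  ->  10.0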
From markflorisson88 at gmail.com Tue May 8 11:26:00 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 10:26:00 +0100
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8DE34.6050806@behnel.de>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8DE34.6050806@behnel.de>
Message-ID:

On 8 May 2012 09:49, Stefan Behnel wrote:
> Dag Sverre Seljebotn, 08.05.2012 10:36:
> [...]
> So, what about just keeping buffer.obj visible and leaving everything
> else to users?

What about allowing a user callback to trigger when accessing
buffer.obj, whose results may be cached? buffer.base will then still
remain the 'original base object'.
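No such hook exists; purely a sketch of the kind of registration being
floated here (register_coercion and asobject are invented names):

    import numpy as np

    _coercer = None

    def register_coercion(callback):
        # let the user say how a raw view becomes a "real" object
        global _coercer
        _coercer = callback

    def asobject(mview):
        # the result could be cached per view, as proposed above
        return _coercer(mview) if _coercer is not None else mview

    register_coercion(np.asarray)  # what NumPy users would register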
 memory view syntax
In-Reply-To: 
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no>
 <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no>
 <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de>
 <4FA7F194.5080008@astro.uio.no>
 <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com>
 <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de>
 <4FA8DB02.2020902@astro.uio.no>
Message-ID: <4FA8E79A.4040402@astro.uio.no>

On 05/08/2012 11:22 AM, mark florisson wrote:
> On 8 May 2012 09:36, Dag Sverre Seljebotn wrote:
>> On 05/08/2012 10:18 AM, Stefan Behnel wrote:
>>> Dag Sverre Seljebotn, 08.05.2012 09:57:
>>>> On 05/07/2012 11:21 PM, mark florisson wrote:
>>>>> On 7 May 2012 19:40, Dag Sverre Seljebotn wrote:
>>>>>> mark florisson wrote:
>>>>>>> On 7 May 2012 17:00, Dag Sverre Seljebotn wrote:
>>>>>>>> On 05/07/2012 04:16 PM, Stefan Behnel wrote:
>>>>>>>>> Stefan Behnel, 07.05.2012 15:04:
>>>>>>>>>> Dag Sverre Seljebotn, 07.05.2012 13:48:
>>>>>>>>>>> BTW, with the coming of memoryviews, me and Mark talked about just
>>>>>>>>>>> deprecating the "mytype[...]" meaning buffers, and rather treat it
>>>>>>>>>>> as np.ndarray, array.array etc. being some sort of "template
>>>>>>>>>>> types". That is,
>>>>>>>>>>> we disallow "object[int]" and require some special declarations in
>>>>>>>>>>> the relevant pxd files.
>>>>>>>>>>
>>>>>>>>>> Hmm, yes, it's unfortunate that we have two different types of
>>>>>>>>>> syntax now, one that declares the item type before the brackets
>>>>>>>>>> and one that declares it afterwards.
>>>>>>>>>
>>>>>>>>> Should we consider the buffer interface syntax deprecated and focus
>>>>>>>>> on the memory view syntax?
>>>>>>>>
>>>>>>>> I think that's the very-long-term intention. Then again, it may be
>>>>>>>> too early to really tell yet, we just need to see how the memory
>>>>>>>> views play out in real life and whether they'll be able to replace
>>>>>>>> np.ndarray[double] among real users. We don't want to shove things
>>>>>>>> down users' throats.
>>>>>>>>
>>>>>>>> But the use of the trailing-[] syntax needs some cleaning up. Me and
>>>>>>>> Mark agreed we'd put this proposal forward when we got around to it:
>>>>>>>>
>>>>>>>>  - Deprecate the "object[double]" form, where [dtype] can be stuck
>>>>>>>> on any extension type
>>>>>>>>
>>>>>>>>  - But, do NOT (for the next year at least) deprecate
>>>>>>>> np.ndarray[double], array.array[double], etc. Basically, there
>>>>>>>> should be a magic flag in extension type declarations saying
>>>>>>>> "I can be a buffer".
>>>>>>>>
>>>>>>>> For one thing, that is sort of needed to open up things for templated
>>>>>>>> cdef classes/fused types cdef classes, if that is ever implemented.
>>>>>>>
>>>>>>> Deprecating is definitely a good start. I think at least if you only
>>>>>>> allow two types as buffers it will be at least reasonably clear when
>>>>>>> one is dealing with fused types or buffers.
>>>>>>>
>>>>>>> Basically, I think memoryviews should live up to demands of the users,
>>>>>>> which would mean there would be no reason to keep the buffer syntax.
>>>>>>
>>>>>> But they are different approaches -- use a different type/API, or just
>>>>>> try to speed up parts of NumPy..
>>>>>>
>>>>>>> One thing to do is make memoryviews coerce cheaply back to the
>>>>>>> original objects if wanted (which is likely). Writing
>>>>>>> np.asarray(mymemview) is kind of annoying.
>>>>>>
>>>>>> It is going to be very confusing to have type(mymemview),
>>>>>> repr(mymemview), and so on come out as NumPy arrays, but not have the
>>>>>> full API of NumPy. Unless you auto-convert on getattr to...
>>>>>
>>>>> Yeah, the idea is very simple, as you mention: just keep the object
>>>>> around cached, and when you slice construct one lazily.
>>>>>
>>>>>> If you want to eradicate the distinction between the backing array and
>>>>>> the memory view and make it transparent, I really suggest you kick back
>>>>>> alive np.ndarray (it can exist in some 'unrealized' state with delayed
>>>>>> construction after slicing, and so on). Implementation much the same
>>>>>> either way, it is all about how it is presented to the user.
>>>>>
>>>>> You mean the buffer syntax?
>>>>>
>>>>>> Something like mymemview.asobject() could work though, and while not
>>>>>> much shorter, it would have some polymorphism that np.asarray does not
>>>>>> have (based probably on some custom PEP 3118 extension)
>>>>>
>>>>> I was thinking you could allow the user to register a callback, and
>>>>> use that to coerce from a memoryview back to an object (given a
>>>>> memoryview object). For numpy this would be np.asarray, and the
>>>>> implementation is allowed to cache the result (which it will).
>>>>> It may be too magicky though... but it will be convenient. The
>>>>> memoryview will act as a subclass, meaning that any of its methods
>>>>> will override methods of the converted object.
>>>>
>>>> My point was that this seems *way* too magicky.
>>>>
>>>> Beyond "confusing users" and so on that are sort of subjective, here's a
>>>> fundamental problem for you: We're making it very difficult to type-infer
>>>> memoryviews. Consider:
>>>>
>>>> cdef double[:] x = ...
>>>> y = x
>>>> print y.shape
>>>>
>>>> Now, because y is not typed, you're semantically throwing in a conversion
>>>> on line 2, so that line 3 says that you want the attribute access to be
>>>> invoked on "whatever object x coerced back to". And we have no idea what
>>>> kind of object that is.
>>>>
>>>> If you don't transparently convert to object, it'd be safe to
>>>> automatically infer y as a double[:].
>>>
>>> Why can't y be inferred as the type of x due to the assignment?
>>>
>>>> On a related note, I've said before that I dislike the notion of
>>>>
>>>> cdef double[:] mview = obj
>>>>
>>>> I'd rather like
>>>>
>>>> cdef double[:] mview = double[:](obj)
>>>
>>> Why? We currently allow
>>>
>>>     cdef char* s = some_py_bytes_string
>>>
>>> Auto-coercion is a serious part of the language, and I don't see the
>>> advantage of requiring the redundancy in the case above. It's clear enough
>>> to me what the typed assignment is intended to mean: get me a buffer view
>>> on the object, regardless of what it is.
>>>
>>>> I support Robert in that "np.ndarray[double]" is the syntax to use when
>>>> you want this kind of transparent "be an object when I need to and a
>>>> memory view when I need to".
>>>>
>>>> Proposal:
>>>>
>>>>  1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in
>>>> the language. It means exactly what you would like double[:] to mean,
>>>> i.e. a variable that is memoryview when you need to and an object
>>>> otherwise. When you use this type, you bear the consequences of
>>>> early-binding things that could in theory be overridden.
>>>>
>>>>  2) double[:] is for when you want to access data of *any* Python object
>>>> in a generic way. Raw PEP 3118. In those situations, access to the
>>>> underlying object is much less useful.
>>>>
>>>>   2a) Therefore we require that you do "mview.asobject()" manually; doing
>>>> "mview.foo()" is a compile-time error
>>>
>>> Sounds good. I think that would clean up the current syntax overlap very
>>> nicely.
>>>
>>>>   2b) To drive the point home among users, and aid type inference and
>>>> overall language clarity, we REMOVE the auto-acquisition and require that
>>>> you do
>>>>
>>>>      cdef double[:] mview = double[:](obj)
>>>
>>> I don't see the point, as noted above. Either "obj" is statically typed
>>> and the bare assignment becomes a no-op, or it's not typed and the
>>> assignment coerces by creating a view. As with all other typed
>>> assignments.
>>>
>>>>   2c) Perhaps: Do not even coerce to a Python memoryview and disallow
>>>> "print mview"; instead require that you do "print mview.asmemoryview()"
>>>> or "print memoryview(mview)" or somesuch.
>>>
>>> This seems to depend on 2b.
>>
>> This I don't understand. The question of 2c) is the analogue to
>> auto-coercion of "char*" to bytes; approving 2c) would put memoryviews in
>> line with char*.
>>
>> Then again, we could in future auto-coerce char* to a ctypes pointer, and
>> in that case, coercing a memoryview to an object representing that
>> memoryview would be OK.
>
> Character pointers coerce to strings. Hell, even structs coerce to and
> from python dicts, so disallowing the same for memoryviews would just
> be inconsistent and inconvenient.

OK, but even structs don't coerce back to some arbitrary type, it's always
a dict. I don't necessarily oppose coercing memoryviews to some Python
memoryview object (not necessarily the builtin).

I agree that some mview.asobject() triggering a callback defined by some
CEP 1xxx ("cross-language CEP") would be really useful; and that could
form the basis of a new, improved np.ndarray[double] that allows fast
slicing etc. (where that is used automatically whenever needed).

Dag
From d.s.seljebotn at astro.uio.no  Tue May  8 11:47:26 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 08 May 2012 11:47:26 +0200
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8E79A.4040402@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no>
 <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no>
 <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de>
 <4FA7F194.5080008@astro.uio.no>
 <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com>
 <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de>
 <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no>
Message-ID: <4FA8EBAE.3010106@astro.uio.no>

On 05/08/2012 11:30 AM, Dag Sverre Seljebotn wrote:
> [...]
>
> OK, but even structs don't coerce back to some arbitrary type, it's
> always a dict. I don't necessarily oppose coercing memoryviews to some
> Python memoryview object (not necessarily the builtin).
>
> I agree that some mview.asobject() triggering a callback defined by some
> CEP 1xxx ("cross-language CEP") would be really useful; and that could
> form the basis of a new, improved np.ndarray[double] that allows fast
> slicing etc. (where that is used automatically whenever needed).

After some thinking I believe I can see more clearly where Mark is coming
from. To sum up, it's either

A) Keep both np.ndarray[double] and double[:] around, with clearly defined
and separate roles. np.ndarray[double] implementation is revamped to allow
fast slicing etc., based on the double[:] implementation.

B) Deprecate np.ndarray[double] sooner rather than later, but make
double[:] have functionality that is *really* close to what
np.ndarray[double] currently does.
In most cases one should be able to basically replace np.ndarray[double]
with double[:] and the code should continue to work just like before; the
difference is that if you pass in anything other than a NumPy array, it
will likely fail with a runtime AttributeError at some point rather than
fail a PyType_Check.

Between those two I believe it's a matter of design taste, not so much
rational argument, and I don't know where I stand yet. And I'm going to
stop thinking about it until I see what Robert says...

Dag

From stefan_ml at behnel.de  Tue May  8 11:48:51 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 11:48:51 +0200
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: 
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no>
 <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no>
 <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de>
 <4FA7F194.5080008@astro.uio.no>
 <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com>
 <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de>
 <4FA8DB02.2020902@astro.uio.no>
Message-ID: <4FA8EC03.4000201@behnel.de>

mark florisson, 08.05.2012 11:24:
>>>> Dag Sverre Seljebotn, 08.05.2012 09:57:
>>>>>  1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that
>>>>> in the language. It means exactly what you would like double[:] to
>>>>> mean, i.e. a variable that is memoryview when you need to and an
>>>>> object otherwise. When you use this type, you bear the consequences
>>>>> of early-binding things that could in theory be overridden.
>>>>>
>>>>>  2) double[:] is for when you want to access data of *any* Python
>>>>> object in a generic way. Raw PEP 3118. In those situations, access to
>>>>> the underlying object is much less useful.
>>>>>
>>>>>   2a) Therefore we require that you do "mview.asobject()" manually;
>>>>> doing "mview.foo()" is a compile-time error
> [...]
> Character pointers coerce to strings. Hell, even structs coerce to and
> from python dicts, so disallowing the same for memoryviews would just
> be inconsistent and inconvenient.

Two separate things to discuss here: the original exporter and a Python
level wrapper.

As long as wrapping the memoryview in a new object can easily be done by
users, I don't see a reason to provide compiler support for getting at the
exporter. After all, a user may have a memory view that is backed by a
NumPy array but wants to reinterpret it as a PIL image. Just because the
underlying object has a specific object type doesn't mean that's the one
to use for a given use case. If a user requires a specific object
*instead* of a bare memory view, we have the object type buffer syntax
for that.

It's also not necessarily more efficient to access the underlying object
than to create a new one if the underlying exporter has to learn about the
mapped layout first.

Regarding the coercion to Python, I do not see a problem with providing a
general Python view object for memory views that arbitrary Cython memory
views can coerce to. In fact, I consider that a useful feature. The builtin
memoryview type in Python (at least the one in CPython 3.3) should be quite
capable of providing this, although I don't mind what exactly this becomes.

> Also, if you don't allow coercion from python, then it means they also
> cannot be used as 'def' function arguments and be called from python.

Coercion *from* Python is not being questioned. We have syntax for that,
and a Python memory view wrapper can easily be unboxed (even transitively)
through the buffer interface when entering back into Cython.

Stefan
_______________________________________________
cython-devel mailing list
cython-devel at python.org
http://mail.python.org/mailman/listinfo/cython-devel
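That unboxing path is easy to see in code. A small sketch: a def function
with a typed memoryview argument accepts any object exporting a compatible
PEP 3118 buffer, so the same function can be called from Python with a
NumPy array, an array.array, or a Python memoryview wrapper:

    # Sketch: coercion *from* Python through the buffer interface.
    # Any compatible PEP 3118 exporter is accepted automatically.
    def first_item(double[:] mv):
        return mv[0]

    # From Python:
    #   import numpy as np; first_item(np.arange(3.0))          # -> 0.0
    #   from array import array; first_item(array('d', [7.0]))  # -> 7.0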
From markflorisson88 at gmail.com  Tue May  8 12:35:13 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 11:35:13 +0100
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8EBAE.3010106@astro.uio.no>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no>
 <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no>
 <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de>
 <4FA7F194.5080008@astro.uio.no>
 <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com>
 <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de>
 <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no>
 <4FA8EBAE.3010106@astro.uio.no>
Message-ID:

On 8 May 2012 10:47, Dag Sverre Seljebotn wrote:
> [...]
>
> After some thinking I believe I can see more clearly where Mark is coming
> from. To sum up, it's either
>
> A) Keep both np.ndarray[double] and double[:] around, with clearly defined
> and separate roles. np.ndarray[double] implementation is revamped to allow
> fast slicing etc., based on the double[:] implementation.
>
> B) Deprecate np.ndarray[double] sooner rather than later, but make
> double[:] have functionality that is *really* close to what
> np.ndarray[double] currently does. In most cases one should be able to
> basically replace np.ndarray[double] with double[:] and the code should
> continue to work just like before; the difference is that if you pass in
> anything other than a NumPy array, it will likely fail with a runtime
> AttributeError at some point rather than fail a PyType_Check.

That's a good summary. I have a big preference for B here, but I agree
that treating a typed memoryview as both a user object (possibly converted
through callback) and a typed memoryview "subclass" is quite magicky. I
wouldn't particularly mind something concise like 'm.obj'. The
AttributeError would occur as usual, when a Python object doesn't have
the right interface.

> Between those two I believe it's a matter of design taste, not so much
> rational argument, and I don't know where I stand yet. And I'm going to
> stop thinking about it until I see what Robert says...
>
> Dag
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

From markflorisson88 at gmail.com  Tue May  8 12:35:30 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 11:35:30 +0100
Subject: [Cython] buffer syntax vs. memory view syntax
In-Reply-To: <4FA8EC03.4000201@behnel.de>
References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no>
 <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no>
 <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de>
 <4FA7F194.5080008@astro.uio.no>
 <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com>
 <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de>
 <4FA8DB02.2020902@astro.uio.no> <4FA8EC03.4000201@behnel.de>
Message-ID:

On 8 May 2012 10:48, Stefan Behnel wrote:
> mark florisson, 08.05.2012 11:24:
>>>>> Dag Sverre Seljebotn, 08.05.2012 09:57:
>>>>>>  1) We NEVER deprecate "np.ndarray[double]", we commit to keeping
>>>>>> that in the language. [...]
>>>>>>
>>>>>>  2) double[:] is for when you want to access data of *any* Python
>>>>>> object in a generic way. Raw PEP 3118. In those situations, access
>>>>>> to the underlying object is much less useful.
>>>>>>
>>>>>>   2a) Therefore we require that you do "mview.asobject()" manually;
>>>>>> doing "mview.foo()" is a compile-time error
>>> [...]
>>> Character pointers coerce to strings.
>>> Hell, even structs coerce to and
>>> from python dicts, so disallowing the same for memoryviews would just
>>> be inconsistent and inconvenient.
>
> Two separate things to discuss here: the original exporter and a Python
> level wrapper.
>
> As long as wrapping the memoryview in a new object can easily be done by
> users, I don't see a reason to provide compiler support for getting at
> the exporter.

Well, the support is already there :) It's basically to be consistent with
numpy's attributes.

> After all, a user may have a memory view that is backed by a
> NumPy array but wants to reinterpret it as a PIL image. Just because the
> underlying object has a specific object type doesn't mean that's the one
> to use for a given use case. If a user requires a specific object
> *instead* of a bare memory view, we have the object type buffer syntax
> for that.

Which is better deprecated to allow only one way to do things, and to make
fused extension types less confusing.

> It's also not necessarily more efficient to access the underlying object
> than to create a new one if the underlying exporter has to learn about
> the mapped layout first.
>
> Regarding the coercion to Python, I do not see a problem with providing a
> general Python view object for memory views that arbitrary Cython memory
> views can coerce to. In fact, I consider that a useful feature. The
> builtin memoryview type in Python (at least the one in CPython 3.3)
> should be quite capable of providing this, although I don't mind what
> exactly this becomes.

There are two ways to argue this entire problem, one from a theoretical
standpoint and one from a pragmatic one. Theoretically your points are
sound, but in practice 99% of the uses will be numpy arrays, and in 99% of
those uses people will want one back. If one does not allow easy,
compiler-supported conversion, then any numpy operation will go from typed
memoryview slice -> memoryview object -> buffer interface -> some
computation in numpy -> buffer interface -> typed memoryview.

The compiler can help here by maintaining cached views aided by a user
callback. In the case you're not slicing, you can just return the original
object. I'm not sure how to register those callbacks though, as making
them global may interfere between projects. Maybe it should be a module
level thing?

>> Also, if you don't allow coercion from python, then it means they also
>> cannot be used as 'def' function arguments and be called from python.
>
> Coercion *from* Python is not being questioned. We have syntax for that,
> and a Python memory view wrapper can easily be unboxed (even transitively)
> through the buffer interface when entering back into Cython.
>
> Stefan
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel
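The round trip described here is visible in today's code. A sketch,
assuming NumPy: np.asarray() re-wraps the buffer through the PEP 3118
interface without copying, so the object is recreated but the data itself
is shared with the view:

    # Sketch of today's round trip: typed view -> NumPy object -> shared data.
    import numpy as np

    def scale_inplace(double[:] mv):
        arr = np.asarray(mv)   # object rebuilt via the buffer interface, no copy
        arr *= 2.0             # computation happens in NumPy
        # mv already sees the scaled values, since the memory is shared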
From vitja.makarov at gmail.com  Tue May  8 13:27:28 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Tue, 8 May 2012 15:27:28 +0400
Subject: [Cython] callable() optimization
Message-ID:

I've noticed a regression related to the callable() optimization:

https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523

    class C:
        pass
    print callable(C())

It prints True: the optimized version checks the condition
((obj)->ob_type->tp_call != NULL), which is true for both the class and
the instance.

>>> help(callable)
callable(...)
    callable(object) -> bool

    Return whether the object is callable (i.e., some kind of function).
    Note that classes are callable, as are instances with a __call__() method.

--
vitja.

From stefan_ml at behnel.de  Tue May  8 14:24:25 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 14:24:25 +0200
Subject: [Cython] CF based type inference
Message-ID: <4FA91079.5090503@behnel.de>

Hi,

Vitja has rebased the type inference on the control flow, so I wonder if
this will enable us to properly infer this:

  def partial_validity():
    """
    >>> partial_validity()
    ('Python object', 'double', 'str object')
    """
    a = 1.0
    b = a + 2   # definitely double
    a = 'test'
    c = a + 'toast'  # definitely str
    return typeof(a), typeof(b), typeof(c)

I think what is mainly needed for this is that a NameNode with an
undeclared type should not report its own entry as dependency but that of
its own cf_assignments. Would this work?

(Haven't got the time to try it out right now, so I'm dumping it here.)

Stefan

From stefan_ml at behnel.de  Tue May  8 15:20:11 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 08 May 2012 15:20:11 +0200
Subject: [Cython] [cython-users] Confusing behavior of memoryview
 (assigning value to slice fails without any indication)
In-Reply-To: 
References: <11786512.1729.1336392567694.JavaMail.geo-discussion-forums@vbvx4>
Message-ID: <4FA91D8B.4080804@behnel.de>

mark florisson, 08.05.2012 11:11:
> On 7 May 2012 13:09, Maxim wrote:
>> Consider the following code:
>>
>>> # a.pyx:
>>> cdef class Base(object):
>>>     cdef public double[:,:] arr
>>
>>> # b.py:
>>> from a import Base
>>> import numpy as np
>>> class MyClass(Base):
>>>     def __init__(self):
>>>         self.arr = np.zeros((10, 10), dtype=np.float64)
>>>         self.arr[1, :] = 10  # this line will execute correctly,
>>>                              # but won't have any effect
>>>         print self.arr[1,5]
>>
>> Is it possible to somehow warn the user here that assigning value to
>> memoryview slice is not supported? Finding this out after some debugging
>> was a little annoying.
>
> Thanks for the report, that's a silly bug. It works with typed
> memoryviews, but with objects it passes in the wrong ndim, and the
> second dimension is 0, which means it does nothing. It is fixed in the
> cython master branch.
>
> BTW, using 'self.arr' in the python subclass means you don't get your
> numpy array back, but rather a cython memoryview that is far less
> capable.

I think it would be good to cherry pick this kind of fix directly over
into the release branch so that we can start building up our pile of fixes
for 0.16.1 there.

Stefan
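For reference, the behaviour the fix restores looks like this. A sketch,
assuming NumPy: assigning a scalar to a typed memoryview slice broadcasts
it over the slice, also when the view lives on a cdef class attribute:

    # Sketch of the intended semantics: scalar assignment to a typed
    # memoryview slice broadcasts over the whole slice.
    import numpy as np

    cdef class Base:
        cdef public double[:, :] arr

    def demo():
        b = Base()
        b.arr = np.zeros((10, 10), dtype=np.float64)
        b.arr[1, :] = 10.0                # fills the entire second row
        return np.asarray(b.arr)[1, 5]    # -> 10.0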
From vitja.makarov at gmail.com  Tue May  8 15:47:29 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Tue, 8 May 2012 17:47:29 +0400
Subject: [Cython] CF based type inference
In-Reply-To: <4FA91079.5090503@behnel.de>
References: <4FA91079.5090503@behnel.de>
Message-ID:

2012/5/8 Stefan Behnel :
> [...]
>
> I think what is mainly needed for this is that a NameNode with an
> undeclared type should not report its own entry as dependency but that
> of its own cf_assignments. Would this work?

Yeah, that might work. The other way to go is to split entries:

  def partial_validity():
    """
    >>> partial_validity()
    ('str object', 'double', 'str object')
    """
    a_1 = 1.0
    b = a_1 + 2   # definitely double
    a_2 = 'test'
    c = a_2 + 'toast'  # definitely str
    return typeof(a_2), typeof(b), typeof(c)

And this should work better because it allows to infer a_1 as a double
and a_2 as a string.

--
vitja.

From d.s.seljebotn at astro.uio.no  Tue May  8 18:52:24 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 08 May 2012 18:52:24 +0200
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de>
Message-ID: <5ff2d356-9b49-4954-82df-cd972a403f8c@email.android.com>

Vitja Makarov wrote:
> [...]
>
> And this should work better because it allows to infer a_1 as a double
> and a_2 as a string.

+1 (as also Mark has hinted several times). I also happen to like that
typeof returns str rather than object...

I don't think type inferred code has to restrict itself to what you could
do using *only* declarations. To go out on a hyperbole: Reinventing
compiler theory to make things fit better with our current tree and the
Pyrex legacy isn't sustainable forever; at some point we should do things
the standard way and refactor some code if necessary.

Dag

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

From markflorisson88 at gmail.com  Tue May  8 20:36:50 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 19:36:50 +0100
Subject: [Cython] 0.16.1
Message-ID:

Ok, so for the bugfix release 0.16.1 I propose that everyone cherry-picks
over their own fixes into the release branch (at least Stefan, since your
fixes pertain to your newly merged branches and sometimes to the master
branch itself). This branch should not be merged back into master, and any
additional fixes should go into master and be picked over to release.

Some things that should still be fixed:
    - nonechecks for memoryviews
    - memoryview documentation
    - more?

We can then, shortly-ish after, release 0.17 with actual features (and new
bugs, let's call those features too), depending on how many bugs are still
found in 0.16.1.
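To see what is at stake in the entry-splitting proposal above, consider a
case where a single shared entry forces the weaker type (a sketch; the
exact inferred types depend on the rules under discussion, and Robert
raises this loop-variable pattern further down the thread):

    # Sketch: with one entry per name, the later string assignment forces
    # 'i' to be a Python object throughout, so the loop cannot run as a
    # plain C int loop; with split entries, the loop's 'i' could be
    # inferred as a C int independently of the final assignment.
    def count_then_label():
        for i in range(1000000):
            pass
        i = 'done'
        return i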
From markflorisson88 at gmail.com  Tue May  8 20:52:56 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 8 May 2012 19:52:56 +0100
Subject: [Cython] 0.16.1
In-Reply-To: 
References: 
Message-ID:

On 8 May 2012 19:36, mark florisson wrote:
> [...]
>
> We can then, shortly-ish after, release 0.17 with actual features (and
> new bugs, let's call those features too), depending on how many bugs are
> still found in 0.16.1.

TBH, if we're actually close to a major release, the usefulness of a
bugfix release is imho not that great.

From vitja.makarov at gmail.com  Tue May  8 21:04:02 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Tue, 8 May 2012 23:04:02 +0400
Subject: [Cython] 0.16.1
In-Reply-To: 
References: 
Message-ID:

2012/5/8 mark florisson :
> [...]
>
> TBH, if we're actually close to a major release, the usefulness of a
> bugfix release is imho not that great.

There are some fixes to the generators implementation that depend on
"yield from" and can't easily be cherry-picked. So I think you're right
about the 0.17 release. But new features may introduce new bugs and we'll
have to release 0.17.1 soon.

--
vitja.

From robertwb at gmail.com  Wed May  9 00:12:29 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Tue, 8 May 2012 15:12:29 -0700
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de>
Message-ID:

On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote:
> [...]
>
> Yeah, that might work. The other way to go is to split entries:
>
>   def partial_validity():
>     """
>     >>> partial_validity()
>     ('str object', 'double', 'str object')
>     """
>     a_1 = 1.0
>     b = a_1 + 2   # definitely double
>     a_2 = 'test'
>     c = a_2 + 'toast'  # definitely str
>     return typeof(a_2), typeof(b), typeof(c)
>
> And this should work better because it allows to infer a_1 as a double
> and a_2 as a string.

This already works, right? I agree it's nicer in general to split
things up, but not being able to optimize a loop variable because it
was used earlier or later in a different context is a disadvantage of
the current system.

- Robert

From robertwb at gmail.com  Wed May  9 00:16:44 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Tue, 8 May 2012 15:16:44 -0700
Subject: [Cython] 0.16.1
In-Reply-To: 
References: 
Message-ID:

On Tue, May 8, 2012 at 12:04 PM, Vitja Makarov wrote:
> [...]
>
> There are some fixes to the generators implementation that depend on
> "yield from" and can't easily be cherry-picked. So I think you're right
> about the 0.17 release. But new features may introduce new bugs and
> we'll have to release 0.17.1 soon.

If we're looking at doing 0.17 soon, let's just do that. In the future,
we could have a bugfix branch that all bugfixes get checked into,
regularly merged into master, which we could release more often as
x.y.z releases.

- Robert

From stefan_ml at behnel.de  Wed May  9 08:22:07 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 09 May 2012 08:22:07 +0200
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de>
Message-ID: <4FAA0D0F.9090508@behnel.de>

Robert Bradshaw, 09.05.2012 00:12:
> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote:
>> [...]
>>
>> And this should work better because it allows to infer a_1 as a double
>> and a_2 as a string.
>
> This already works, right?
It would work if it was implemented. *wink*

> I agree it's nicer in general to split
> things up, but not being able to optimize a loop variable because it
> was used earlier or later in a different context is a disadvantage of
> the current system.

Absolutely. I was considering entry splitting more of a "soon, maybe not
now" type of thing because it isn't entirely clear to me what needs to be
done. It may not even be all that hard to implement, but I think it's more
than just a local change in the scope implementation because the current
lookup_here() doesn't know what node is asking.

Stefan

From stefan_ml at behnel.de  Wed May  9 08:41:03 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 09 May 2012 08:41:03 +0200
Subject: [Cython] 0.16.1
In-Reply-To: 
References: 
Message-ID: <4FAA117F.1080402@behnel.de>

Robert Bradshaw, 09.05.2012 00:16:
> [...]
>
> If we're looking at doing 0.17 soon, let's just do that.

I think it's close enough to be released. I'll try to get around to
listing the changes in the release notes (and maybe even add a note about
alpha-quality PyPy support to the docs), but I wouldn't mind if someone
else was quicker, at least for a start. ;)

> In the future,
> we could have a bugfix branch that all bugfixes get checked into,
> regularly merged into master, which we could release more often as
> x.y.z releases.

+1. We have the release branch for that, it just hasn't been used much
since the last release.

I also don't mind releasing a 0.16.1 shortly before (or even after) a
0.17. Distributors (e.g. Debian) often try to stick to a given release
series during their support time frame (usually more than a year), so
unless we release fixes, they'll end up cherry-picking or porting their
own fixes, each on their own. Applying at least the obvious fixes to the
release branch and then merging it into the master from there would make
it easier for them.

Stefan
Stefan From stefan_ml at behnel.de Wed May 9 09:19:36 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 09:19:36 +0200 Subject: [Cython] callable() optimization In-Reply-To: References: Message-ID: <4FAA1A88.8020801@behnel.de> Vitja Makarov, 08.05.2012 13:27: > I've noticed regression related to callable() optimization. > > https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 > > class C: > pass > print callable(C()) > > It prints True optimized version checks ((obj)->ob_type->tp_call != > NULL) condition that is True for both class and instance. > > >>> help(callable) > callable(...) > callable(object) -> bool > > Return whether the object is callable (i.e., some kind of function). > Note that classes are callable, as are instances with a __call__() method. Ah, right - old style classes are special cased in Py2. I'll make this a Py3-only optimisation then. Stefan From vitja.makarov at gmail.com Wed May 9 09:43:58 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Wed, 9 May 2012 11:43:58 +0400 Subject: [Cython] callable() optimization In-Reply-To: <4FAA1A88.8020801@behnel.de> References: <4FAA1A88.8020801@behnel.de> Message-ID: 2012/5/9 Stefan Behnel : > Vitja Makarov, 08.05.2012 13:27: >> I've noticed regression related to callable() optimization. >> >> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >> >> class C: >> ? ? pass >> print callable(C()) >> >> It prints True optimized version checks ((obj)->ob_type->tp_call != >> NULL) condition that is True for both class and instance. >> >> >>> help(callable) >> callable(...) >> ? ? callable(object) -> bool >> >> ? ? Return whether the object is callable (i.e., some kind of function). >> ? ? Note that classes are callable, as are instances with a __call__() method. > > Ah, right - old style classes are special cased in Py2. > > I'll make this a Py3-only optimisation then. > I don't see difference between py2 and py3 here: Python 3.2.3 (default, May 3 2012, 15:51:42) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> class Foo: pass ... >>> callable(Foo()) False >>> There is PyCallable_Check() CPython function: int PyCallable_Check(PyObject *x) { if (x == NULL) return 0; if (PyInstance_Check(x)) { PyObject *call = PyObject_GetAttrString(x, "__call__"); if (call == NULL) { PyErr_Clear(); return 0; } /* Could test recursively but don't, for fear of endless recursion if some joker sets self.__call__ = self */ Py_DECREF(call); return 1; } else { return x->ob_type->tp_call != NULL; } } -- vitja. From stefan_ml at behnel.de Wed May 9 09:49:04 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 09:49:04 +0200 Subject: [Cython] 0.16.1 In-Reply-To: <4FAA117F.1080402@behnel.de> References: <4FAA117F.1080402@behnel.de> Message-ID: <4FAA2170.4050808@behnel.de> Stefan Behnel, 09.05.2012 08:41: > Robert Bradshaw, 09.05.2012 00:16: >> If we're looking at doing 0.17 soon, lets just do that. > I think it's close enough to be released. ... although one thing I just noticed is that the "numpy_memoryview" test is still disabled because it lead to crashes in recent Py3.2 releases (and thus most likely also in the latest Py3k). Not sure if it still crashes, but should be checked before going for a release. 
Stefan From vitja.makarov at gmail.com Wed May 9 10:02:20 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Wed, 9 May 2012 12:02:20 +0400 Subject: [Cython] CF based type inference In-Reply-To: <4FAA0D0F.9090508@behnel.de> References: <4FA91079.5090503@behnel.de> <4FAA0D0F.9090508@behnel.de> Message-ID: 2012/5/9 Stefan Behnel : > Robert Bradshaw, 09.05.2012 00:12: >> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote: >>> 2012/5/8 Stefan Behnel: >>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>> this will enable us to properly infer this: >>>> >>>> ?def partial_validity(): >>>> ? ?""" >>>> ? ?>>> partial_validity() >>>> ? ?('Python object', 'double', 'str object') >>>> ? ?""" >>>> ? ?a = 1.0 >>>> ? ?b = a + 2 ? # definitely double >>>> ? ?a = 'test' >>>> ? ?c = a + 'toast' ?# definitely str >>>> ? ?return typeof(a), typeof(b), typeof(c) >>>> >>>> I think, what is mainly needed for this is that a NameNode with an >>>> undeclared type should not report its own entry as dependency but that of >>>> its own cf_assignments. Would this work? >>>> >>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>>> >>> >>> Yeah, that might work. The other way to go is to split entries: >>> >>> ?def partial_validity(): >>> ? """ >>> ? >>> partial_validity() >>> ? ('str object', 'double', 'str object') >>> ? """ >>> ? a_1 = 1.0 >>> ? b = a_1 + 2 ? # definitely double >>> ? a_2 = 'test' >>> ? c = a_2 + 'toast' ?# definitely str >>> ? return typeof(a_2), typeof(b), typeof(c) >>> >>> And this should work better because it allows to infer a_1 as a double >>> and a_2 as a string. >> >> This already works, right? > > It would work if it was implemented. *wink* > > >> I agree it's nicer in general to split >> things up, but not being able to optimize a loop variable because it >> was used earlier or later in a different context is a disadvantage of >> the current system. > > Absolutely. I was considering entry splitting more of a "soon, maybe not > now" type of thing because it isn't entire clear to me what needs to be > done. It may not even be all that hard to implement, but I think it's more > than just a local change in the scope implementation because the current > lookup_here() doesn't know what node is asking. > That could be done the following way: - Before running type inference find independent assignment groups and split entries - Run type infrerence - Join entries of the same type or of PyObject base type - Then change names to private ones "{old_name}.{index}" -- vitja. From stefan_ml at behnel.de Wed May 9 10:02:51 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 10:02:51 +0200 Subject: [Cython] callable() optimization In-Reply-To: References: <4FAA1A88.8020801@behnel.de> Message-ID: <4FAA24AB.5010101@behnel.de> Vitja Makarov, 09.05.2012 09:43: > 2012/5/9 Stefan Behnel : >> Vitja Makarov, 08.05.2012 13:27: >>> I've noticed regression related to callable() optimization. >>> >>> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >>> >>> class C: >>> pass >>> print callable(C()) >>> >>> It prints True optimized version checks ((obj)->ob_type->tp_call != >>> NULL) condition that is True for both class and instance. >>> >>>>>> help(callable) >>> callable(...) >>> callable(object) -> bool >>> >>> Return whether the object is callable (i.e., some kind of function). >>> Note that classes are callable, as are instances with a __call__() method. 
>> >> Ah, right - old style classes are special cased in Py2. >> >> I'll make this a Py3-only optimisation then. >> > > I don't see difference between py2 and py3 here: > > Python 3.2.3 (default, May 3 2012, 15:51:42) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> class Foo: pass > ... > >>> callable(Foo()) > False > >>> > > There is PyCallable_Check() CPython function: > > int > PyCallable_Check(PyObject *x) > { > if (x == NULL) > return 0; > if (PyInstance_Check(x)) { > PyObject *call = PyObject_GetAttrString(x, "__call__"); > if (call == NULL) { > PyErr_Clear(); > return 0; > } > /* Could test recursively but don't, for fear of endless > recursion if some joker sets self.__call__ = self */ > Py_DECREF(call); > return 1; > } > else { > return x->ob_type->tp_call != NULL; > } > } That's the Py2 version. In Py3, it looks as follows, because old-style "instances" no longer exist: """ int PyCallable_Check(PyObject *x) { if (x == NULL) return 0; return x->ob_type->tp_call != NULL; } """ That's what I had initially based my optimisation on. Stefan From vitja.makarov at gmail.com Wed May 9 10:21:54 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Wed, 9 May 2012 12:21:54 +0400 Subject: [Cython] callable() optimization In-Reply-To: <4FAA24AB.5010101@behnel.de> References: <4FAA1A88.8020801@behnel.de> <4FAA24AB.5010101@behnel.de> Message-ID: 2012/5/9 Stefan Behnel : > Vitja Makarov, 09.05.2012 09:43: >> 2012/5/9 Stefan Behnel : >>> Vitja Makarov, 08.05.2012 13:27: >>>> I've noticed regression related to callable() optimization. >>>> >>>> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >>>> >>>> class C: >>>> ? ? pass >>>> print callable(C()) >>>> >>>> It prints True optimized version checks ((obj)->ob_type->tp_call != >>>> NULL) condition that is True for both class and instance. >>>> >>>>>>> help(callable) >>>> callable(...) >>>> ? ? callable(object) -> bool >>>> >>>> ? ? Return whether the object is callable (i.e., some kind of function). >>>> ? ? Note that classes are callable, as are instances with a __call__() method. >>> >>> Ah, right - old style classes are special cased in Py2. >>> >>> I'll make this a Py3-only optimisation then. >>> >> >> I don't see difference between py2 and py3 here: >> >> Python 3.2.3 (default, May ?3 2012, 15:51:42) >> [GCC 4.6.3] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> class Foo: pass >> ... >> >>> callable(Foo()) >> False >> >>> >> >> There is PyCallable_Check() CPython function: >> >> int >> PyCallable_Check(PyObject *x) >> { >> ? ? if (x == NULL) >> ? ? ? ? return 0; >> ? ? if (PyInstance_Check(x)) { >> ? ? ? ? PyObject *call = PyObject_GetAttrString(x, "__call__"); >> ? ? ? ? if (call == NULL) { >> ? ? ? ? ? ? PyErr_Clear(); >> ? ? ? ? ? ? return 0; >> ? ? ? ? } >> ? ? ? ? /* Could test recursively but don't, for fear of endless >> ? ? ? ? ? ?recursion if some joker sets self.__call__ = self */ >> ? ? ? ? Py_DECREF(call); >> ? ? ? ? return 1; >> ? ? } >> ? ? else { >> ? ? ? ? return x->ob_type->tp_call != NULL; >> ? ? } >> } > > That's the Py2 version. In Py3, it looks as follows, because old-style > "instances" no longer exist: > > """ > int > PyCallable_Check(PyObject *x) > { > ? ? ? ?if (x == NULL) > ? ? ? ? ? ? ? ?return 0; > ? ? ? ?return x->ob_type->tp_call != NULL; > } > """ > > That's what I had initially based my optimisation on. > Ok, so why don't you want to use PyCallable_Check() in all cases? 
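For illustration, a version-gated helper could keep the fast path on Py3
and stay correct on Py2 - this is only a sketch of the idea, not Cython's
actual generated code, and the helper name is made up for the example:

    #if PY_MAJOR_VERSION >= 3
      /* Py3: old-style instances are gone, the type slot check is exact */
      #define __Pyx_Callable_Check(obj)  ((obj)->ob_type->tp_call != NULL)
    #else
      /* Py2: PyCallable_Check() handles old-style instances, which only
         resolve __call__ on the instance at call time */
      #define __Pyx_Callable_Check(obj)  PyCallable_Check(obj)
    #endif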
-- vitja. From markflorisson88 at gmail.com Wed May 9 10:28:31 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 09:28:31 +0100 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA0D0F.9090508@behnel.de> Message-ID: On 9 May 2012 09:02, Vitja Makarov wrote: > 2012/5/9 Stefan Behnel : >> Robert Bradshaw, 09.05.2012 00:12: >>> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote: >>>> 2012/5/8 Stefan Behnel: >>>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>>> this will enable us to properly infer this: >>>>> >>>>> ?def partial_validity(): >>>>> ? ?""" >>>>> ? ?>>> partial_validity() >>>>> ? ?('Python object', 'double', 'str object') >>>>> ? ?""" >>>>> ? ?a = 1.0 >>>>> ? ?b = a + 2 ? # definitely double >>>>> ? ?a = 'test' >>>>> ? ?c = a + 'toast' ?# definitely str >>>>> ? ?return typeof(a), typeof(b), typeof(c) >>>>> >>>>> I think, what is mainly needed for this is that a NameNode with an >>>>> undeclared type should not report its own entry as dependency but that of >>>>> its own cf_assignments. Would this work? >>>>> >>>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>>>> >>>> >>>> Yeah, that might work. The other way to go is to split entries: >>>> >>>> ?def partial_validity(): >>>> ? """ >>>> ? >>> partial_validity() >>>> ? ('str object', 'double', 'str object') >>>> ? """ >>>> ? a_1 = 1.0 >>>> ? b = a_1 + 2 ? # definitely double >>>> ? a_2 = 'test' >>>> ? c = a_2 + 'toast' ?# definitely str >>>> ? return typeof(a_2), typeof(b), typeof(c) >>>> >>>> And this should work better because it allows to infer a_1 as a double >>>> and a_2 as a string. >>> >>> This already works, right? >> >> It would work if it was implemented. *wink* >> >> >>> I agree it's nicer in general to split >>> things up, but not being able to optimize a loop variable because it >>> was used earlier or later in a different context is a disadvantage of >>> the current system. >> >> Absolutely. I was considering entry splitting more of a "soon, maybe not >> now" type of thing because it isn't entire clear to me what needs to be >> done. It may not even be all that hard to implement, but I think it's more >> than just a local change in the scope implementation because the current >> lookup_here() doesn't know what node is asking. >> > > That could be done the following way: > ?- Before running type inference find independent assignment groups > and split entries > ?- Run type infrerence > ?- Join entries of the same type or of PyObject base type > ?- Then change names to private ones "{old_name}.{index}" Sounds like a good approach. Do you think it would be useful if a variable can be type inferred at some point, but at no other point in the function, to specialize for both the first type you find and object? i.e. i = 0 while something: use i i = something_not_inferred() and specialize on 'i' being an int? Bonus points maybe :) If these entries are different depending on control flow, it's basically a form of ssa, which is cool. Then optimizations like none-checking, boundschecking, wraparound etc can, for each new variable insert a single check (for bounds checking it depends on the entire expression, but...). The only thing I'm not entirely sure about is this when the user eliminates your check through try/finally or try/except, e.g. try: buf[i] except IndexError: print "no worries" buf[i] Here you basically want a new (virtual) reference of "i". 
Maybe that could just be handled in the optimization transform though, where it invalidates the previous check (especially since there is no assignment here). > -- > vitja. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 9 10:32:00 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 09:32:00 +0100 Subject: [Cython] 0.16.1 In-Reply-To: <4FAA2170.4050808@behnel.de> References: <4FAA117F.1080402@behnel.de> <4FAA2170.4050808@behnel.de> Message-ID: On 9 May 2012 08:49, Stefan Behnel wrote: > Stefan Behnel, 09.05.2012 08:41: >> Robert Bradshaw, 09.05.2012 00:16: >>> If we're looking at doing 0.17 soon, lets just do that. >> I think it's close enough to be released. > > ... although one thing I just noticed is that the "numpy_memoryview" test > is still disabled because it lead to crashes in recent Py3.2 releases (and > thus most likely also in the latest Py3k). Not sure if it still crashes, > but should be checked before going for a release. Hm, all the tests or just one? Was that the problem with gc_refs != 0? That should be fixed now. > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 9 10:33:27 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 09:33:27 +0100 Subject: [Cython] 0.16.1 In-Reply-To: <4FAA117F.1080402@behnel.de> References: <4FAA117F.1080402@behnel.de> Message-ID: On 9 May 2012 07:41, Stefan Behnel wrote: > Robert Bradshaw, 09.05.2012 00:16: >> On Tue, May 8, 2012 at 12:04 PM, Vitja Makarov wrote: >>> 2012/5/8 mark florisson: >>>> On 8 May 2012 19:36, mark florisson wrote: >>>>> Ok, so for the bugfix release 0.16.1 I propose that everyone cherry >>>>> picks over its own fixes into the release branch (at least Stefan, >>>>> since your fixes pertain to your newly merged branches and sometimes >>>>> to the master branch itself). This branch should not be merged back >>>>> into master, and any additional fixes should go into master and be >>>>> picked over to release. >>>>> >>>>> Some things that should still be fixed: >>>>> ? ?- nonechecks for memoryviews >>>>> ? ?- memoryview documentation >>>>> ? ?- more? >>>>> >>>>> We can then shortly-ish after release 0.17 with actual features (and >>>>> new bugs, lets call those features too), depending on how many bugs >>>>> are still found in 0.16.1. >>>> >>>> TBH, if we're actually close to a major release, the usefulness of a >>>> bugfix release is imho not that great. >>> >>> There are some fixes to generators implementation that depend on >>> "yield from" that can't be easily cherry-picked. >>> So I think you're right about 0.17 release. But new features may >>> introduce new bugs and we'll have to release 0.17.1 soon. >> >> If we're looking at doing 0.17 soon, lets just do that. > > I think it's close enough to be released. I'll try to get around to list > the changes in the release notes (and maybe even add a note about alpha > quality PyPy support to the docs), but I wouldn't mind if someone else was > quicker, at least for a start. ;) > > >> In the future, >> we could have a bugfix branch that all bugfixes get checked into, >> regularly merged into master, which we could release more often as >> x.y.z releases. > > +11. 
We have the release branch for that, it just hasn't been used much > since the last release. Yeah, I like it too. It's much easier than cherry-picking stuff over in a large history, where fixes may depend (partially) on features. > I also don't mind releasing a 0.16.1 shortly before (or even after) a 0.17. > Distributors (e.g. Debian) often try to stick to a given release series > during their support time frame (usually more than a year), so unless we > release fixes, they'll end up cherry picking or porting their own fixes, > each on their own. Applying at least the obvious fixes to the release > branch and then merging it into the master from there would make it easier > for them. Debian stable? :) Good point though, I think we can manage that. > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Wed May 9 10:35:59 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 10:35:59 +0200 Subject: [Cython] callable() optimization In-Reply-To: References: <4FAA1A88.8020801@behnel.de> <4FAA24AB.5010101@behnel.de> Message-ID: <4FAA2C6F.5040501@behnel.de> Vitja Makarov, 09.05.2012 10:21: > 2012/5/9 Stefan Behnel : >> Vitja Makarov, 09.05.2012 09:43: >>> 2012/5/9 Stefan Behnel : >>>> Vitja Makarov, 08.05.2012 13:27: >>>>> I've noticed regression related to callable() optimization. >>>>> >>>>> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >>>>> >>>>> class C: >>>>> pass >>>>> print callable(C()) >>>>> >>>>> It prints True optimized version checks ((obj)->ob_type->tp_call != >>>>> NULL) condition that is True for both class and instance. >>>>> >>>>>>>> help(callable) >>>>> callable(...) >>>>> callable(object) -> bool >>>>> >>>>> Return whether the object is callable (i.e., some kind of function). >>>>> Note that classes are callable, as are instances with a __call__() method. >>>> >>>> Ah, right - old style classes are special cased in Py2. >>>> >>>> I'll make this a Py3-only optimisation then. >>>> >>> >>> I don't see difference between py2 and py3 here: >>> >>> Python 3.2.3 (default, May 3 2012, 15:51:42) >>> [GCC 4.6.3] on linux2 >>> Type "help", "copyright", "credits" or "license" for more information. >>>>>> class Foo: pass >>> ... >>>>>> callable(Foo()) >>> False >>>>>> >>> >>> There is PyCallable_Check() CPython function: >>> >>> int >>> PyCallable_Check(PyObject *x) >>> { >>> if (x == NULL) >>> return 0; >>> if (PyInstance_Check(x)) { >>> PyObject *call = PyObject_GetAttrString(x, "__call__"); >>> if (call == NULL) { >>> PyErr_Clear(); >>> return 0; >>> } >>> /* Could test recursively but don't, for fear of endless >>> recursion if some joker sets self.__call__ = self */ >>> Py_DECREF(call); >>> return 1; >>> } >>> else { >>> return x->ob_type->tp_call != NULL; >>> } >>> } >> >> That's the Py2 version. In Py3, it looks as follows, because old-style >> "instances" no longer exist: >> >> """ >> int >> PyCallable_Check(PyObject *x) >> { >> if (x == NULL) >> return 0; >> return x->ob_type->tp_call != NULL; >> } >> """ >> >> That's what I had initially based my optimisation on. > > Ok, so why don't you want to use PyCallable_Check() in all cases? Well, maybe this isn't performance critical enough to merit inlining. Do you think it matters? 
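For reference, a quick Py2 session shows why the plain slot test misfires
for old-style instances (assuming CPython 2.x semantics as discussed above):

    class Old:               # old-style class in Py2
        pass

    class New(object):       # new-style class
        pass

    print callable(Old())    # False: PyCallable_Check() looks up __call__
    print callable(New())    # False: tp_call of the instance's type is NULL

The type of an old-style instance (types.InstanceType) always fills in
tp_call, so an inlined "tp_call != NULL" test would wrongly report True
for Old() here.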
Stefan From markflorisson88 at gmail.com Wed May 9 10:37:58 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 09:37:58 +0100 Subject: [Cython] callable() optimization In-Reply-To: <4FAA24AB.5010101@behnel.de> References: <4FAA1A88.8020801@behnel.de> <4FAA24AB.5010101@behnel.de> Message-ID: On 9 May 2012 09:02, Stefan Behnel wrote: > Vitja Makarov, 09.05.2012 09:43: >> 2012/5/9 Stefan Behnel : >>> Vitja Makarov, 08.05.2012 13:27: >>>> I've noticed regression related to callable() optimization. >>>> >>>> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >>>> >>>> class C: >>>> ? ? pass >>>> print callable(C()) >>>> >>>> It prints True optimized version checks ((obj)->ob_type->tp_call != >>>> NULL) condition that is True for both class and instance. >>>> >>>>>>> help(callable) >>>> callable(...) >>>> ? ? callable(object) -> bool >>>> >>>> ? ? Return whether the object is callable (i.e., some kind of function). >>>> ? ? Note that classes are callable, as are instances with a __call__() method. >>> >>> Ah, right - old style classes are special cased in Py2. >>> >>> I'll make this a Py3-only optimisation then. >>> >> >> I don't see difference between py2 and py3 here: >> >> Python 3.2.3 (default, May ?3 2012, 15:51:42) >> [GCC 4.6.3] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> class Foo: pass >> ... >> >>> callable(Foo()) >> False >> >>> >> >> There is PyCallable_Check() CPython function: >> >> int >> PyCallable_Check(PyObject *x) >> { >> ? ? if (x == NULL) >> ? ? ? ? return 0; >> ? ? if (PyInstance_Check(x)) { >> ? ? ? ? PyObject *call = PyObject_GetAttrString(x, "__call__"); >> ? ? ? ? if (call == NULL) { >> ? ? ? ? ? ? PyErr_Clear(); >> ? ? ? ? ? ? return 0; >> ? ? ? ? } >> ? ? ? ? /* Could test recursively but don't, for fear of endless >> ? ? ? ? ? ?recursion if some joker sets self.__call__ = self */ >> ? ? ? ? Py_DECREF(call); >> ? ? ? ? return 1; >> ? ? } >> ? ? else { >> ? ? ? ? return x->ob_type->tp_call != NULL; >> ? ? } >> } > > That's the Py2 version. In Py3, it looks as follows, because old-style > "instances" no longer exist: > > """ > int > PyCallable_Check(PyObject *x) > { > ? ? ? ?if (x == NULL) > ? ? ? ? ? ? ? ?return 0; > ? ? ? ?return x->ob_type->tp_call != NULL; > } > """ > > That's what I had initially based my optimisation on. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Huh, so __call__ in a user defined new style class could never end up in ob_type.tp_call right? So how can it avoid that dict lookup? From markflorisson88 at gmail.com Wed May 9 10:44:49 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 09:44:49 +0100 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA0D0F.9090508@behnel.de> Message-ID: On 9 May 2012 09:28, mark florisson wrote: > On 9 May 2012 09:02, Vitja Makarov wrote: >> 2012/5/9 Stefan Behnel : >>> Robert Bradshaw, 09.05.2012 00:12: >>>> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote: >>>>> 2012/5/8 Stefan Behnel: >>>>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>>>> this will enable us to properly infer this: >>>>>> >>>>>> ?def partial_validity(): >>>>>> ? ?""" >>>>>> ? ?>>> partial_validity() >>>>>> ? ?('Python object', 'double', 'str object') >>>>>> ? ?""" >>>>>> ? ?a = 1.0 >>>>>> ? ?b = a + 2 ? 
# definitely double >>>>>> ? ?a = 'test' >>>>>> ? ?c = a + 'toast' ?# definitely str >>>>>> ? ?return typeof(a), typeof(b), typeof(c) >>>>>> >>>>>> I think, what is mainly needed for this is that a NameNode with an >>>>>> undeclared type should not report its own entry as dependency but that of >>>>>> its own cf_assignments. Would this work? >>>>>> >>>>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>>>>> >>>>> >>>>> Yeah, that might work. The other way to go is to split entries: >>>>> >>>>> ?def partial_validity(): >>>>> ? """ >>>>> ? >>> partial_validity() >>>>> ? ('str object', 'double', 'str object') >>>>> ? """ >>>>> ? a_1 = 1.0 >>>>> ? b = a_1 + 2 ? # definitely double >>>>> ? a_2 = 'test' >>>>> ? c = a_2 + 'toast' ?# definitely str >>>>> ? return typeof(a_2), typeof(b), typeof(c) >>>>> >>>>> And this should work better because it allows to infer a_1 as a double >>>>> and a_2 as a string. >>>> >>>> This already works, right? >>> >>> It would work if it was implemented. *wink* >>> >>> >>>> I agree it's nicer in general to split >>>> things up, but not being able to optimize a loop variable because it >>>> was used earlier or later in a different context is a disadvantage of >>>> the current system. >>> >>> Absolutely. I was considering entry splitting more of a "soon, maybe not >>> now" type of thing because it isn't entire clear to me what needs to be >>> done. It may not even be all that hard to implement, but I think it's more >>> than just a local change in the scope implementation because the current >>> lookup_here() doesn't know what node is asking. >>> >> >> That could be done the following way: >> ?- Before running type inference find independent assignment groups >> and split entries >> ?- Run type infrerence >> ?- Join entries of the same type or of PyObject base type >> ?- Then change names to private ones "{old_name}.{index}" > > Sounds like a good approach. Do you think it would be useful if a > variable can be type inferred at some point, but at no other point in > the function, to specialize for both the first type you find and > object? i.e. > > i = 0 > while something: > ? ?use i > ? ?i = something_not_inferred() > > and specialize on 'i' being an int? Bonus points maybe :) > > If these entries are different depending on control flow, it's > basically a form of ssa, which is cool. You could reuse entry cnames if you re-encounter the same type though, but it would be nice if they were different, uniquely referencable objects if they originate from different assignment or merge points. > Then optimizations like > none-checking, boundschecking, wraparound etc can, for each new > variable insert a single check (for bounds checking it depends on the > entire expression, but...). The only thing I'm not entirely sure about > is this when the user eliminates your check through try/finally or > try/except, e.g. > > try: > ? ?buf[i] > except IndexError: > ? ?print "no worries" > > buf[i] > > Here you basically want a new (virtual) reference of "i". Maybe that > could just be handled in the optimization transform though, where it > invalidates the previous check (especially since there is no > assignment here). > >> -- >> vitja. 
>> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Wed May 9 10:47:07 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 10:47:07 +0200 Subject: [Cython] callable() optimization In-Reply-To: References: <4FAA1A88.8020801@behnel.de> <4FAA24AB.5010101@behnel.de> Message-ID: <4FAA2F0B.1010406@behnel.de> mark florisson, 09.05.2012 10:37: > On 9 May 2012 09:02, Stefan Behnel wrote: >> Vitja Makarov, 09.05.2012 09:43: >>> 2012/5/9 Stefan Behnel : >>>> Vitja Makarov, 08.05.2012 13:27: >>>>> I've noticed regression related to callable() optimization. >>>>> >>>>> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >>>>> >>>>> class C: >>>>> pass >>>>> print callable(C()) >>>>> >>>>> It prints True optimized version checks ((obj)->ob_type->tp_call != >>>>> NULL) condition that is True for both class and instance. >>>>> >>>>>>>> help(callable) >>>>> callable(...) >>>>> callable(object) -> bool >>>>> >>>>> Return whether the object is callable (i.e., some kind of function). >>>>> Note that classes are callable, as are instances with a __call__() method. >>>> >>>> Ah, right - old style classes are special cased in Py2. >>>> >>>> I'll make this a Py3-only optimisation then. >>>> >>> >>> I don't see difference between py2 and py3 here: >>> >>> Python 3.2.3 (default, May 3 2012, 15:51:42) >>> [GCC 4.6.3] on linux2 >>> Type "help", "copyright", "credits" or "license" for more information. >>>>>> class Foo: pass >>> ... >>>>>> callable(Foo()) >>> False >>>>>> >>> >>> There is PyCallable_Check() CPython function: >>> >>> int >>> PyCallable_Check(PyObject *x) >>> { >>> if (x == NULL) >>> return 0; >>> if (PyInstance_Check(x)) { >>> PyObject *call = PyObject_GetAttrString(x, "__call__"); >>> if (call == NULL) { >>> PyErr_Clear(); >>> return 0; >>> } >>> /* Could test recursively but don't, for fear of endless >>> recursion if some joker sets self.__call__ = self */ >>> Py_DECREF(call); >>> return 1; >>> } >>> else { >>> return x->ob_type->tp_call != NULL; >>> } >>> } >> >> That's the Py2 version. In Py3, it looks as follows, because old-style >> "instances" no longer exist: >> >> """ >> int >> PyCallable_Check(PyObject *x) >> { >> if (x == NULL) >> return 0; >> return x->ob_type->tp_call != NULL; >> } >> """ >> >> That's what I had initially based my optimisation on. > > Huh, so __call__ in a user defined new style class could never end up > in ob_type.tp_call right? Yes it does. CPython special cases these method names. Stefan From vitja.makarov at gmail.com Wed May 9 10:50:24 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Wed, 9 May 2012 12:50:24 +0400 Subject: [Cython] callable() optimization In-Reply-To: <4FAA2C6F.5040501@behnel.de> References: <4FAA1A88.8020801@behnel.de> <4FAA24AB.5010101@behnel.de> <4FAA2C6F.5040501@behnel.de> Message-ID: 2012/5/9 Stefan Behnel : > Vitja Makarov, 09.05.2012 10:21: >> 2012/5/9 Stefan Behnel : >>> Vitja Makarov, 09.05.2012 09:43: >>>> 2012/5/9 Stefan Behnel : >>>>> Vitja Makarov, 08.05.2012 13:27: >>>>>> I've noticed regression related to callable() optimization. >>>>>> >>>>>> https://github.com/cython/cython/commit/a40112b0461eae5ab22fbdd07ae798d4a72ff523 >>>>>> >>>>>> class C: >>>>>> ? ? pass >>>>>> print callable(C()) >>>>>> >>>>>> It prints True optimized version checks ((obj)->ob_type->tp_call != >>>>>> NULL) condition that is True for both class and instance. 
>>>>>> >>>>>>>>> help(callable) >>>>>> callable(...) >>>>>> ? ? callable(object) -> bool >>>>>> >>>>>> ? ? Return whether the object is callable (i.e., some kind of function). >>>>>> ? ? Note that classes are callable, as are instances with a __call__() method. >>>>> >>>>> Ah, right - old style classes are special cased in Py2. >>>>> >>>>> I'll make this a Py3-only optimisation then. >>>>> >>>> >>>> I don't see difference between py2 and py3 here: >>>> >>>> Python 3.2.3 (default, May ?3 2012, 15:51:42) >>>> [GCC 4.6.3] on linux2 >>>> Type "help", "copyright", "credits" or "license" for more information. >>>>>>> class Foo: pass >>>> ... >>>>>>> callable(Foo()) >>>> False >>>>>>> >>>> >>>> There is PyCallable_Check() CPython function: >>>> >>>> int >>>> PyCallable_Check(PyObject *x) >>>> { >>>> ? ? if (x == NULL) >>>> ? ? ? ? return 0; >>>> ? ? if (PyInstance_Check(x)) { >>>> ? ? ? ? PyObject *call = PyObject_GetAttrString(x, "__call__"); >>>> ? ? ? ? if (call == NULL) { >>>> ? ? ? ? ? ? PyErr_Clear(); >>>> ? ? ? ? ? ? return 0; >>>> ? ? ? ? } >>>> ? ? ? ? /* Could test recursively but don't, for fear of endless >>>> ? ? ? ? ? ?recursion if some joker sets self.__call__ = self */ >>>> ? ? ? ? Py_DECREF(call); >>>> ? ? ? ? return 1; >>>> ? ? } >>>> ? ? else { >>>> ? ? ? ? return x->ob_type->tp_call != NULL; >>>> ? ? } >>>> } >>> >>> That's the Py2 version. In Py3, it looks as follows, because old-style >>> "instances" no longer exist: >>> >>> """ >>> int >>> PyCallable_Check(PyObject *x) >>> { >>> ? ? ? ?if (x == NULL) >>> ? ? ? ? ? ? ? ?return 0; >>> ? ? ? ?return x->ob_type->tp_call != NULL; >>> } >>> """ >>> >>> That's what I had initially based my optimisation on. >> >> Ok, so why don't you want to use PyCallable_Check() in all cases? > > Well, maybe this isn't performance critical enough to merit inlining. Do > you think it matters? > Py3k case is quite simple expression so I think it may be inlined. On the other hand it's not often used. -- vitja. From stefan_ml at behnel.de Wed May 9 10:51:27 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 10:51:27 +0200 Subject: [Cython] 0.16.1 In-Reply-To: <4FAA117F.1080402@behnel.de> References: <4FAA117F.1080402@behnel.de> Message-ID: <4FAA300F.80507@behnel.de> Stefan Behnel, 09.05.2012 08:41: > Robert Bradshaw, 09.05.2012 00:16: >> If we're looking at doing 0.17 soon, lets just do that. > > I think it's close enough to be released. I'll try to get around to list > the changes in the release notes (and maybe even add a note about alpha > quality PyPy support to the docs), but I wouldn't mind if someone else was > quicker, at least for a start. ;) Well, here's a start: http://wiki.cython.org/ReleaseNotes-0.17 Please add to it if you see anything missing. Stefan From stefan_ml at behnel.de Wed May 9 10:57:28 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 10:57:28 +0200 Subject: [Cython] 0.16.1 In-Reply-To: <4FAA300F.80507@behnel.de> References: <4FAA117F.1080402@behnel.de> <4FAA300F.80507@behnel.de> Message-ID: <4FAA3178.70805@behnel.de> Stefan Behnel, 09.05.2012 10:51: > Stefan Behnel, 09.05.2012 08:41: >> Robert Bradshaw, 09.05.2012 00:16: >>> If we're looking at doing 0.17 soon, lets just do that. >> >> I think it's close enough to be released. I'll try to get around to list >> the changes in the release notes (and maybe even add a note about alpha >> quality PyPy support to the docs), but I wouldn't mind if someone else was >> quicker, at least for a start. 
;) > > Well, here's a start: > > http://wiki.cython.org/ReleaseNotes-0.17 Oh, and I think this makes it pretty clear that this is a 0.17 and not a 0.16.1. Stefan From stefan_ml at behnel.de Wed May 9 11:13:48 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 11:13:48 +0200 Subject: [Cython] 0.17 In-Reply-To: <4FA6C17B.1080303@behnel.de> References: <4FA6BA3C.3030001@astro.uio.no> <4FA6C17B.1080303@behnel.de> Message-ID: <4FAA354C.1040401@behnel.de> Stefan Behnel, 06.05.2012 20:22: > Dag Sverre Seljebotn, 06.05.2012 19:51: >> On 05/06/2012 04:28 PM, mark florisson wrote: >>> I think we already have quite a bit of functionality (nearly) ready, >>> after merging some pending pull requests maybe it will be a good time >>> for a 0.17 release? I think it would be good to also document to what >>> extent pypy support works, what works and what doesn't. Stefan, since >>> you added a large majority of the features, would you want to be the >>> release manager? >>> >>> In summary, the following pull requests should likely go in >>> - array.array support (unless further discussion prevents that) >>> - fused types runtime buffer dispatch >>> - newaxis >>> - more? >> >> >> Sounds more like a 0.16.1? (Did we have any rules for that -- except the >> obvious one that breaking backwards compatibility in noticeable ways has to >> increment the major?) > > Those are only the pending pull requests, the current feature set in the > master branch is way larger than that. I'll start writing up the release > notes soon. Reviving this thread because it's the proper one to discuss 0.17 (instead of the "0.16.1" thread). So, here are the release notes so far: http://wiki.cython.org/ReleaseNotes-0.17 There are a couple of bugs targeted for 0.17 that have not been closed (or worked on?) yet. Please look through them as well to see if they a) have been fixed, b) will be fixed soon or c) should be postponed. Stefan From stefan_ml at behnel.de Wed May 9 11:34:18 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 11:34:18 +0200 Subject: [Cython] CF based type inference In-Reply-To: <4FA91079.5090503@behnel.de> References: <4FA91079.5090503@behnel.de> Message-ID: <4FAA3A1A.7070008@behnel.de> Stefan Behnel, 08.05.2012 14:24: > Vitja has rebased the type inference on the control flow On a related note, is this fixable now? def test(): x = 1 # inferred as int del x # error: Deletion of non-Python, non-C++ object http://trac.cython.org/cython_trac/ticket/768 It might be enough to infer "object" for names that are being del-ed for now, and to fix "del" The Right Way when we split entries. Stefan From stefan_ml at behnel.de Wed May 9 11:43:26 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 11:43:26 +0200 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: <4FAA3C3E.6020100@behnel.de> mark florisson, 06.05.2012 16:28: > I think we already have quite a bit of functionality (nearly) ready, > after merging some pending pull requests maybe it will be a good time > for a 0.17 release? I think it would be good to also document to what > extent pypy support works, what works and what doesn't. Stefan, since > you added a large majority of the features, would you want to be the > release manager? > > In summary, the following pull requests should likely go in > - array.array support (unless further discussion prevents that) > - fused types runtime buffer dispatch > - newaxis > - more? 
Looks like it was not a good idea to disable the numpy_memoryview tests:

"""
numpy_memoryview.cpp: In function 'PyObject* __pyx_pf_16numpy_memoryview_32test_coerce_to_numpy(PyObject*)':
numpy_memoryview.cpp:15069: error: cannot convert 'td_h_short*' to 'int*' in assignment
numpy_memoryview.cpp:15118: error: cannot convert 'td_h_double*' to 'float*' in assignment
"""

https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests/BACKEND=cpp,PYVERSION=py32-ext/374/consoleFull

Stefan

From vitja.makarov at gmail.com  Wed May  9 14:21:37 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Wed, 9 May 2012 16:21:37 +0400
Subject: [Cython] CF based type inference
In-Reply-To: <4FAA3A1A.7070008@behnel.de>
References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de>
Message-ID: 

2012/5/9 Stefan Behnel :
> Stefan Behnel, 08.05.2012 14:24:
>> Vitja has rebased the type inference on the control flow
>
> On a related note, is this fixable now?
>
>  def test():
>      x = 1    # inferred as int
>      del x    # error: Deletion of non-Python, non-C++ object
>
> http://trac.cython.org/cython_trac/ticket/768
>
> It might be enough to infer "object" for names that are being del-ed for
> now, and to fix "del" The Right Way when we split entries.

Do you mean that `x` should be inferred as "python object" in your example?

Yes, we may add workaround for del case.
Del is represented now by NameDeletion with the same rhs and lhs.

We can add method infer_type() to NameAssignment and use it instead of
Node.infer_type()

-- 
vitja.

From vitja.makarov at gmail.com  Wed May  9 14:39:38 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Wed, 9 May 2012 16:39:38 +0400
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de>
Message-ID: 

2012/5/9 Vitja Makarov :
> 2012/5/9 Stefan Behnel :
>> Stefan Behnel, 08.05.2012 14:24:
>>> Vitja has rebased the type inference on the control flow
>>
>> On a related note, is this fixable now?
>>
>>  def test():
>>      x = 1    # inferred as int
>>      del x    # error: Deletion of non-Python, non-C++ object
>>
>> http://trac.cython.org/cython_trac/ticket/768
>>
>> It might be enough to infer "object" for names that are being del-ed for
>> now, and to fix "del" The Right Way when we split entries.
>
> Do you mean that `x` should be inferred as "python object" in your example?
>
> Yes, we may add workaround for del case.
> Del is represented now by NameDeletion with the same rhs and lhs.
>
> We can add method infer_type() to NameAssignment and use it instead of
> Node.infer_type()

Here I've tried to fix it, now deletion always infers as python_object:

https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7

That may break C++ object deletion

-- 
vitja.

From markflorisson88 at gmail.com  Wed May  9 14:43:17 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Wed, 9 May 2012 13:43:17 +0100
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de>
Message-ID: 

On 9 May 2012 13:39, Vitja Makarov wrote:
> 2012/5/9 Vitja Makarov :
>> 2012/5/9 Stefan Behnel :
>>> Stefan Behnel, 08.05.2012 14:24:
>>>> Vitja has rebased the type inference on the control flow
>>>
>>> On a related note, is this fixable now?
>>>
>>>  def test():
>>>      x = 1    # inferred as int
>>>      del x    # error: Deletion of non-Python, non-C++ object
>>>
>>> http://trac.cython.org/cython_trac/ticket/768
>>>
>>> It might be enough to infer "object" for names that are being del-ed for
>>> now, and to fix "del" The Right Way when we split entries.
>>
>> Do you mean that `x` should be inferred as "python object" in your example?
>>
>> Yes, we may add workaround for del case.
>> Del is represented now by NameDeletion with the same rhs and lhs.
>>
>> We can add method infer_type() to NameAssignment and use it instead of
>> Node.infer_type()
>
> Here I've tried to fix it, now deletion always infers as python_object
>
> https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7
>
> That may break C++ object deletion
>
> --
> vitja.
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

Memoryviews can be deleted as well.

From vitja.makarov at gmail.com  Wed May  9 14:47:59 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Wed, 9 May 2012 16:47:59 +0400
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de>
Message-ID: 

2012/5/9 mark florisson :
> On 9 May 2012 13:39, Vitja Makarov wrote:
>> 2012/5/9 Vitja Makarov :
>>> 2012/5/9 Stefan Behnel :
>>>> Stefan Behnel, 08.05.2012 14:24:
>>>>> Vitja has rebased the type inference on the control flow
>>>>
>>>> On a related note, is this fixable now?
>>>>
>>>>  def test():
>>>>      x = 1    # inferred as int
>>>>      del x    # error: Deletion of non-Python, non-C++ object
>>>>
>>>> http://trac.cython.org/cython_trac/ticket/768
>>>>
>>>> It might be enough to infer "object" for names that are being del-ed for
>>>> now, and to fix "del" The Right Way when we split entries.
>>>
>>> Do you mean that `x` should be inferred as "python object" in your example?
>>>
>>> Yes, we may add workaround for del case.
>>> Del is represented now by NameDeletion with the same rhs and lhs.
>>>
>>> We can add method infer_type() to NameAssignment and use it instead of
>>> Node.infer_type()
>>
>> Here I've tried to fix it, now deletion always infers as python_object
>>
>> https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7
>>
>> That may break C++ object deletion
>
> Memoryviews can be deleted as well.

That code is run for entries with unspecified_type only

-- 
vitja.

From vitja.makarov at gmail.com  Wed May  9 14:58:03 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Wed, 9 May 2012 16:58:03 +0400
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de>
Message-ID: 

2012/5/9 Vitja Makarov :
> 2012/5/9 mark florisson :
>> On 9 May 2012 13:39, Vitja Makarov wrote:
>>> 2012/5/9 Vitja Makarov :
>>>> 2012/5/9 Stefan Behnel :
>>>>> Stefan Behnel, 08.05.2012 14:24:
>>>>>> Vitja has rebased the type inference on the control flow
>>>>>
>>>>> On a related note, is this fixable now?
>>>>>
>>>>>  def test():
>>>>>      x = 1    # inferred as int
>>>>>      del x    # error: Deletion of non-Python, non-C++ object
>>>>>
>>>>> http://trac.cython.org/cython_trac/ticket/768
>>>>>
>>>>> It might be enough to infer "object" for names that are being del-ed for
>>>>> now, and to fix "del" The Right Way when we split entries.
>>>>
>>>> Do you mean that `x` should be inferred as "python object" in your example?
>>>>
>>>> Yes, we may add workaround for del case.
>>>> Del is represented now by NameDeletion with the same rhs and lhs.
>>>>
>>>> We can add method infer_type() to NameAssignment and use it instead of
>>>> Node.infer_type()
>>>
>>> Here I've tried to fix it, now deletion always infers as python_object
>>>
>>> https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7
>>>
>>> That may break C++ object deletion
>>
>> Memoryviews can be deleted as well.
>
> That code is run for entries with unspecified_type only

Yeah, this code doesn't work now:

cdef extern from "foo.h":
    cdef cppclass Foo:
        Foo()

def foo():
    foo = new Foo()
    print typeof(foo)
    del foo

And I'm not sure how to fix it.

-- 
vitja.

From vitja.makarov at gmail.com  Wed May  9 15:16:25 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Wed, 9 May 2012 17:16:25 +0400
Subject: [Cython] CF based type inference
In-Reply-To: 
References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de>
Message-ID: 

2012/5/9 Vitja Makarov :
> 2012/5/9 Vitja Makarov :
>> 2012/5/9 mark florisson :
>>> On 9 May 2012 13:39, Vitja Makarov wrote:
>>>> 2012/5/9 Vitja Makarov :
>>>>> 2012/5/9 Stefan Behnel :
>>>>>> Stefan Behnel, 08.05.2012 14:24:
>>>>>>> Vitja has rebased the type inference on the control flow
>>>>>>
>>>>>> On a related note, is this fixable now?
>>>>>>
>>>>>>  def test():
>>>>>>      x = 1    # inferred as int
>>>>>>      del x    # error: Deletion of non-Python, non-C++ object
>>>>>>
>>>>>> http://trac.cython.org/cython_trac/ticket/768
>>>>>>
>>>>>> It might be enough to infer "object" for names that are being del-ed for
>>>>>> now, and to fix "del" The Right Way when we split entries.
>>>>>
>>>>> Do you mean that `x` should be inferred as "python object" in your example?
>>>>>
>>>>> Yes, we may add workaround for del case.
>>>>> Del is represented now by NameDeletion with the same rhs and lhs.
>>>>>
>>>>> We can add method infer_type() to NameAssignment and use it instead of
>>>>> Node.infer_type()
>>>>
>>>> Here I've tried to fix it, now deletion always infers as python_object
>>>>
>>>> https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7
>>>>
>>>> That may break C++ object deletion
>>>
>>> Memoryviews can be deleted as well.
>>
>> That code is run for entries with unspecified_type only
>
> Yeah, this code doesn't work now:
>
> cdef extern from "foo.h":
>     cdef cppclass Foo:
>         Foo()
>
> def foo():
>     foo = new Foo()
>     print typeof(foo)
>     del foo
>
> And I'm not sure how to fix it.

I've fixed cppclasses:

https://github.com/vitek/cython/commit/f5acf44be0f647bdcbb5a23c8bfbceff48f4414e

About memoryviews:

from cython cimport typeof

def foo(float[::1] a):
    b = a
    #del b
    print typeof(b)
    print typeof(a)

In this example `b` is inferred as 'Python object' and not
`float[::1]`, is that correct?

-- 
vitja.
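Until the inference changes, a minimal workaround sketch - assuming the
0.16 memoryview syntax - is to declare the target explicitly so it keeps
the slice type:

    from cython cimport typeof

    def foo(float[::1] a):
        cdef float[::1] b = a   # explicit declaration keeps the slice type
        print typeof(b)
        print typeof(a)

With the explicit cdef, both variables report float[::1] instead of
falling back to a generic Python object.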
From markflorisson88 at gmail.com Wed May 9 15:18:44 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 14:18:44 +0100 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de> Message-ID: On 9 May 2012 14:16, Vitja Makarov wrote: > 2012/5/9 Vitja Makarov : >> 2012/5/9 Vitja Makarov : >>> 2012/5/9 mark florisson : >>>> On 9 May 2012 13:39, Vitja Makarov wrote: >>>>> 2012/5/9 Vitja Makarov : >>>>>> 2012/5/9 Stefan Behnel : >>>>>>> Stefan Behnel, 08.05.2012 14:24: >>>>>>>> Vitja has rebased the type inference on the control flow >>>>>>> >>>>>>> On a related note, is this fixable now? >>>>>>> >>>>>>> ?def test(): >>>>>>> ? ? ?x = 1 ? ?# inferred as int >>>>>>> ? ? ?del x ? ?# error: Deletion of non-Python, non-C++ object >>>>>>> >>>>>>> http://trac.cython.org/cython_trac/ticket/768 >>>>>>> >>>>>>> It might be enough to infer "object" for names that are being del-ed for >>>>>>> now, and to fix "del" The Right Way when we split entries. >>>>>>> >>>>>> >>>>>> Do you mean that `x` should be inferred as "python object" in your example? >>>>>> >>>>>> Yes, we may add workaround for del case. >>>>>> Del is represented now by NameDeletion with the same rhs and lhs. >>>>>> >>>>>> We can add method infer_type() to NameAssignment and use it instead of >>>>>> Node.infer_type() >>>>>> >>>>>> >>>>> >>>>> Here I've tried to fix it, now deletion always infers as python_object >>>>> >>>>> https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7 >>>>> >>>>> >>>>> That may break C++ object deletion >>>>> >>>> >>>> Memoryviews can be deleted as well. >>> >>> >>> That code is run for entries with unspecified_type only >>> >>> >> >> Yeah, this code doesn't work now: >> >> cdef extern from "foo.h": >> ? ?cdef cppclass Foo: >> ? ? ? ?Foo() >> >> def foo(): >> ? ?foo = new Foo() >> ? ?print typeof(foo) >> ? ?del foo >> >> And I'm not sure how to fix it. > > I've fixed cppclasses: > > https://github.com/vitek/cython/commit/f5acf44be0f647bdcbb5a23c8bfbceff48f4414e > > About memoryviews: > > from cython cimport typeof > > def foo(float[::1] a): > ? ?b = a > ? ?#del b > ? ?print typeof(b) > ? ?print typeof(a) > > > In this example `b` is inferred as 'Python object' and not > `float[::1]`, is that correct? > > -- > vitja. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel That's the current behaviour, but it would be better if it inferred a memoryview slice instead. From stefan_ml at behnel.de Wed May 9 15:27:25 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 15:27:25 +0200 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de> Message-ID: <4FAA70BD.9010102@behnel.de> Vitja Makarov, 09.05.2012 15:16: > 2012/5/9 Vitja Makarov: >>>> On 9 May 2012 13:39, Vitja Makarov wrote: >>>>> 2012/5/9 Vitja Makarov: >>>>>> 2012/5/9 Stefan Behnel: >>>>>>> def test(): >>>>>>> x = 1 # inferred as int >>>>>>> del x # error: Deletion of non-Python, non-C++ object >>>>>>> >>>>>>> http://trac.cython.org/cython_trac/ticket/768 >>>>>>> >>>>>>> It might be enough to infer "object" for names that are being del-ed for >>>>>>> now, and to fix "del" The Right Way when we split entries. >>>>>> >>>>>> Do you mean that `x` should be inferred as "python object" in your example? >>>>>> >>>>>> Yes, we may add workaround for del case. 
>>>>>> Del is represented now by NameDeletion with the same rhs and lhs. >>>>>> >>>>>> We can add method infer_type() to NameAssignment and use it instead of >>>>>> Node.infer_type() Yes, looks ok. >>>>> Here I've tried to fix it, now deletion always infers as python_object >>>>> >>>>> https://github.com/vitek/cython/commit/225c9c60bed6406db46e87da31596e053056f8b7 >>>>> >>>>> That may break C++ object deletion >> >> Yeah, this code doesn't work now: >> >> cdef extern from "foo.h": >> cdef cppclass Foo: >> Foo() >> >> def foo(): >> foo = new Foo() >> print typeof(foo) >> del foo >> >> And I'm not sure how to fix it. > > I've fixed cppclasses: > > https://github.com/vitek/cython/commit/f5acf44be0f647bdcbb5a23c8bfbceff48f4414e Sure, that makes sense. If the type cannot be del-ed, we'll get an error elsewhere - not a concern of type inference. > About memoryviews: > > from cython cimport typeof > > def foo(float[::1] a): > b = a > #del b > print typeof(b) > print typeof(a) > > In this example `b` is inferred as 'Python object' and not > `float[::1]`, is that correct? I think it currently is, but it may no longer be in the future. See the running ML thread about the future of the buffer syntax and the memoryview syntax. If we're up to changing this, it would be good to give it a suitable behaviour right for the next release, so that users don't start relying on the above. Stefan From stefan_ml at behnel.de Wed May 9 15:33:55 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 15:33:55 +0200 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de> Message-ID: <4FAA7243.9000107@behnel.de> mark florisson, 09.05.2012 15:18: > On 9 May 2012 14:16, Vitja Makarov wrote: >> from cython cimport typeof >> >> def foo(float[::1] a): >> b = a >> #del b >> print typeof(b) >> print typeof(a) >> >> >> In this example `b` is inferred as 'Python object' and not >> `float[::1]`, is that correct? >> > That's the current behaviour, but it would be better if it inferred a > memoryview slice instead. +1 Stefan From stefan_ml at behnel.de Wed May 9 17:13:10 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 17:13:10 +0200 Subject: [Cython] CF based type inference In-Reply-To: <5ff2d356-9b49-4954-82df-cd972a403f8c@email.android.com> References: <4FA91079.5090503@behnel.de> <5ff2d356-9b49-4954-82df-cd972a403f8c@email.android.com> Message-ID: <4FAA8986.70409@behnel.de> Dag Sverre Seljebotn, 08.05.2012 18:52: > Vitja Makarov wrote: >> def partial_validity(): >> """ >> >>> partial_validity() >> ('str object', 'double', 'str object') >> """ >> a_1 = 1.0 >> b = a_1 + 2 # definitely double >> a_2 = 'test' >> c = a_2 + 'toast' # definitely str >> return typeof(a_2), typeof(b), typeof(c) >> >> And this should work better because it allows to infer a_1 as a double >> and a_2 as a string. > > +1 (as also Mark has hinted several times). I also happen to like that > typeof returns str rather than object... I don't think type inferred code > has to restrict itself to what you could dousing *only* declarations. > > To go out on a hyperbole: Reinventing compiler theory to make things > fit better with our current tree and the Pyrex legacy isn't sustainable > forever, at some point we should do things the standard way and > refactor some code if necesarry. That's how these things work, though. 
It's basically register allocation and variable renaming mapped to a code translator (rather than a compiler that emits assembly or byte code). Stefan From markflorisson88 at gmail.com Wed May 9 17:20:03 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 16:20:03 +0100 Subject: [Cython] CF based type inference In-Reply-To: <4FAA8986.70409@behnel.de> References: <4FA91079.5090503@behnel.de> <5ff2d356-9b49-4954-82df-cd972a403f8c@email.android.com> <4FAA8986.70409@behnel.de> Message-ID: On 9 May 2012 16:13, Stefan Behnel wrote: > Dag Sverre Seljebotn, 08.05.2012 18:52: >> Vitja Makarov wrote: >>> def partial_validity(): >>> ? """ >>> ? >>> partial_validity() >>> ? ('str object', 'double', 'str object') >>> ? """ >>> ? a_1 = 1.0 >>> ? b = a_1 + 2 ? # definitely double >>> ? a_2 = 'test' >>> ? c = a_2 + 'toast' ?# definitely str >>> ? return typeof(a_2), typeof(b), typeof(c) >>> >>> And this should work better because it allows to infer a_1 as a double >>> and a_2 as a string. >> >> +1 (as also Mark has hinted several times). I also happen to like that >> typeof returns str rather than object... I don't think type inferred code >> has to restrict itself to what you could dousing *only* declarations. >> >> To go out on a hyperbole: Reinventing compiler theory to make things >> fit better with our current tree and the Pyrex legacy isn't sustainable >> forever, at some point we should do things the standard way and >> refactor some code if necesarry. > > That's how these things work, though. It's basically register allocation > and variable renaming mapped to a code translator (rather than a compiler > that emits assembly or byte code). > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel That's not what he was hinting at though. Many of these things we're doing are standard in compiler theory, and inventing our own ad-hoc ways and sloppy algorithms for things like control flow, type inference, variable renaming, bounds check optimizations, none checking optimizations, etc, isn't going to cut it. As we have already seen, standard ways to do control flow have worked out very great due to Vitja's work. From d.s.seljebotn at astro.uio.no Wed May 9 17:58:07 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 May 2012 17:58:07 +0200 Subject: [Cython] CF based type inference In-Reply-To: <4FAA8986.70409@behnel.de> References: <4FA91079.5090503@behnel.de> <5ff2d356-9b49-4954-82df-cd972a403f8c@email.android.com> <4FAA8986.70409@behnel.de> Message-ID: <4FAA940F.5070607@astro.uio.no> On 05/09/2012 05:13 PM, Stefan Behnel wrote: > Dag Sverre Seljebotn, 08.05.2012 18:52: >> Vitja Makarov wrote: >>> def partial_validity(): >>> """ >>> >>> partial_validity() >>> ('str object', 'double', 'str object') >>> """ >>> a_1 = 1.0 >>> b = a_1 + 2 # definitely double >>> a_2 = 'test' >>> c = a_2 + 'toast' # definitely str >>> return typeof(a_2), typeof(b), typeof(c) >>> >>> And this should work better because it allows to infer a_1 as a double >>> and a_2 as a string. >> >> +1 (as also Mark has hinted several times). I also happen to like that >> typeof returns str rather than object... I don't think type inferred code >> has to restrict itself to what you could dousing *only* declarations. 
>>
>> To go out on a hyperbole: Reinventing compiler theory to make things
>> fit better with our current tree and the Pyrex legacy isn't sustainable
>> forever, at some point we should do things the standard way and
>> refactor some code if necessary.
>
> That's how these things work, though. It's basically register allocation
> and variable renaming mapped to a code translator (rather than a compiler
> that emits assembly or byte code).

Yes, to be crystal clear, I was actually hinting at your original proposal
here, and applauding Vitja's counter-proposal as a more standard way of
doing things.

But I regretted posting at all afterwards, I do so little coding on Cython
these days that I shouldn't interfere at this level. I'll try to leave such
rants to Mark in the future :-)

Dag

From vitja.makarov at gmail.com  Wed May  9 18:31:51 2012
From: vitja.makarov at gmail.com (Vitja Makarov)
Date: Wed, 9 May 2012 20:31:51 +0400
Subject: [Cython] Bug in print statement
Message-ID: 

Del statement inference enabled the pyregr.test_descr testcase and it
SIGSEGVs. Here is a minimal example:

import unittest
import sys

class Foo(unittest.TestCase):
    def test_file_fault(self):
        # Testing sys.stdout is changed in getattr...
        test_stdout = sys.stdout
        class StdoutGuard:
            def __getattr__(self, attr):
                test_stdout.write('%d\n' % sys.getrefcount(self))
                sys.stdout = test_stdout #sys.__stdout__
                test_stdout.write('%d\n' % sys.getrefcount(self))
                test_stdout.write('getattr: %r\n' % attr)
                test_stdout.flush()
                raise RuntimeError("Premature access to sys.stdout.%s" % attr)
        sys.stdout = StdoutGuard()
        try:
            print "Oops!"
        except RuntimeError:
            pass
        finally:
            sys.stdout = test_stdout

    def test_getattr_hooks(self):
        pass

from test import test_support
test_support.run_unittest(Foo)

It works in Python and SIGSEGVs in Cython.
It seems to me that the problem is that StdoutGuard() is still used when
its reference counter is zero, since the Python interpreter does
Py_XINCREF() on the file object and __Pyx_Print() doesn't.

-- 
vitja.

From stefan_ml at behnel.de  Wed May  9 18:44:11 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 09 May 2012 18:44:11 +0200
Subject: [Cython] Bug in print statement
In-Reply-To: 
References: 
Message-ID: <4FAA9EDB.4040902@behnel.de>

Vitja Makarov, 09.05.2012 18:31:
> Del statement inference enabled the pyregr.test_descr testcase and it
> SIGSEGVs. Here is a minimal example:
>
> import unittest
> import sys
>
> class Foo(unittest.TestCase):
>     def test_file_fault(self):
>         # Testing sys.stdout is changed in getattr...
>         test_stdout = sys.stdout
>         class StdoutGuard:
>             def __getattr__(self, attr):
>                 test_stdout.write('%d\n' % sys.getrefcount(self))
>                 sys.stdout = test_stdout #sys.__stdout__
>                 test_stdout.write('%d\n' % sys.getrefcount(self))
>                 test_stdout.write('getattr: %r\n' % attr)
>                 test_stdout.flush()
>                 raise RuntimeError("Premature access to sys.stdout.%s" % attr)
>         sys.stdout = StdoutGuard()
>         try:
>             print "Oops!"
>         except RuntimeError:
>             pass
>         finally:
>             sys.stdout = test_stdout
>
>     def test_getattr_hooks(self):
>         pass
>
> from test import test_support
> test_support.run_unittest(Foo)
>
> It works in Python and SIGSEGVs in Cython.
> It seems to me that the problem is that StdoutGuard() is still used when
> its reference counter is zero, since the Python interpreter does
> Py_XINCREF() on the file object and __Pyx_Print() doesn't.

Makes sense to change that, IMHO. An additional INCREF during something as
involved as a print() will not hurt anyone.
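To make that concrete, a minimal sketch of the INCREF fix - an assumption
about the shape of such a helper, not the actual __Pyx_Print()
implementation:

    static int print_one(PyObject *arg) {
        PyObject *f = PySys_GetObject("stdout");  /* borrowed reference */
        int res;
        if (!f) {
            PyErr_SetString(PyExc_RuntimeError, "lost sys.stdout");
            return -1;
        }
        /* own the file while writing: a __getattr__ hook may rebind
           sys.stdout and drop the last reference to the old object */
        Py_INCREF(f);
        res = PyFile_WriteObject(arg, f, Py_PRINT_RAW);
        Py_DECREF(f);
        return res;
    }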
IIRC, I had the same problem with PyPy - guess I should have fixed it back then instead of taking the lazy escape towards using the print() function. Stefan From vitja.makarov at gmail.com Wed May 9 18:57:14 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Wed, 9 May 2012 20:57:14 +0400 Subject: [Cython] Bug in print statement In-Reply-To: <4FAA9EDB.4040902@behnel.de> References: <4FAA9EDB.4040902@behnel.de> Message-ID: 2012/5/9 Stefan Behnel : > Vitja Makarov, 09.05.2012 18:31: >> Del statement inference enabled pyregr.test_descr testcase and it SIGSEGVs. >> Here is minimal example: >> >> import unittest >> import sys >> >> class Foo(unittest.TestCase): >> ? ? def test_file_fault(self): >> ? ? ? ? # Testing sys.stdout is changed in getattr... >> ? ? ? ? test_stdout = sys.stdout >> ? ? ? ? class StdoutGuard: >> ? ? ? ? ? ? def __getattr__(self, attr): >> ? ? ? ? ? ? ? ? test_stdout.write('%d\n' % sys.getrefcount(self)) >> ? ? ? ? ? ? ? ? sys.stdout = ?test_stdout #sys.__stdout__ >> ? ? ? ? ? ? ? ? test_stdout.write('%d\n' % sys.getrefcount(self)) >> ? ? ? ? ? ? ? ? test_stdout.write('getattr: %r\n' % attr) >> ? ? ? ? ? ? ? ? test_stdout.flush() >> ? ? ? ? ? ? ? ? raise RuntimeError("Premature access to sys.stdout.%s" % attr) >> ? ? ? ? sys.stdout = StdoutGuard() >> ? ? ? ? try: >> ? ? ? ? ? ? print "Oops!" >> ? ? ? ? except RuntimeError: >> ? ? ? ? ? ? pass >> ? ? ? ? finally: >> ? ? ? ? ? ? sys.stdout = test_stdout >> >> ? ? def test_getattr_hooks(self): >> ? ? ? ? pass >> >> from test import test_support >> test_support.run_unittest(Foo) >> >> It works in python and sigsegvs in cython. >> It seems to me that the problem is StdoutGuard() is still used when >> its reference counter is zero since Python interpreter does >> Py_XINCREF() for file object and __Pyx_Print() doesn't. > > Makes sense to change that, IMHO. An additional INCREF during something as > involved as a print() will not hurt anyone. > > IIRC, I had the same problem with PyPy - guess I should have fixed it back > then instead of taking the lazy escape towards using the print() function. > I've moved printing function to Utility/ and fixed refcount bug, if jenkins is ok I'm gonna push this commit to master https://github.com/vitek/cython/commit/83eceb31b4ed9afc0fd6d24c9eda5e52d9420535 -- vitja. From robertwb at gmail.com Wed May 9 20:15:07 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 9 May 2012 11:15:07 -0700 Subject: [Cython] CF based type inference In-Reply-To: <4FAA7243.9000107@behnel.de> References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de> <4FAA7243.9000107@behnel.de> Message-ID: On Wed, May 9, 2012 at 6:33 AM, Stefan Behnel wrote: > mark florisson, 09.05.2012 15:18: >> On 9 May 2012 14:16, Vitja Makarov wrote: >>> from cython cimport typeof >>> >>> def foo(float[::1] a): >>> ? ?b = a >>> ? ?#del b >>> ? ?print typeof(b) >>> ? ?print typeof(a) >>> >>> >>> In this example `b` is inferred as 'Python object' and not >>> `float[::1]`, is that correct? >>> >> That's the current behaviour, but it would be better if it inferred a >> memoryview slice instead. > > +1 +1. This looks like it would break inference of extension classes as well. https://github.com/vitek/cython/commit/f5acf44be0f647bdcbb5a23c8bfbceff48f4414e#L0R336 could be changed to check if it's already a py_object_type (or memory view) as a quick fix, but it's not as pure as adding the constraints "can be del'ed" to the type inference engine. 
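As a toy illustration (made-up code, not Cython's actual inference engine), the extra constraint could feed into type unification roughly like this:

    # Hypothetical sketch: unify the types seen at each assignment,
    # then force 'object' if the variable can be del'ed but the
    # inferred type does not support deletion.
    DELETABLE = set(['object', 'float[::1]'])   # illustrative names only

    def infer_type(assignment_types, can_be_deleted):
        inferred = assignment_types[0]
        for t in assignment_types[1:]:
            if t != inferred:
                inferred = 'object'   # no common specialised type
        if can_be_deleted and inferred not in DELETABLE:
            return 'object'
        return inferred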
- Robert From vitja.makarov at gmail.com Wed May 9 20:21:48 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Wed, 9 May 2012 22:21:48 +0400 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de> <4FAA7243.9000107@behnel.de> Message-ID: 2012/5/9 Robert Bradshaw : > On Wed, May 9, 2012 at 6:33 AM, Stefan Behnel wrote: >> mark florisson, 09.05.2012 15:18: >>> On 9 May 2012 14:16, Vitja Makarov wrote: >>>> from cython cimport typeof >>>> >>>> def foo(float[::1] a): >>>> ? ?b = a >>>> ? ?#del b >>>> ? ?print typeof(b) >>>> ? ?print typeof(a) >>>> >>>> >>>> In this example `b` is inferred as 'Python object' and not >>>> `float[::1]`, is that correct? >>>> >>> That's the current behaviour, but it would be better if it inferred a >>> memoryview slice instead. >> >> +1 > > +1. This looks like it would break inference of extension classes as well. > > https://github.com/vitek/cython/commit/f5acf44be0f647bdcbb5a23c8bfbceff48f4414e#L0R336 > > could be changed to check if it's already a py_object_type (or memory > view) as a quick fix, but it's not as pure as adding the constraints > "can be del'ed" to the type inference engine. > Yeah, right. It must be something like this: if not inferred_type.is_pyobject and inferred_type.can_coerce_to_pyobject(scope): -- vitja. From robertwb at gmail.com Wed May 9 20:35:00 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 9 May 2012 11:35:00 -0700 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FA8EC03.4000201@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8EC03.4000201@behnel.de> Message-ID: On Tue, May 8, 2012 at 2:48 AM, Stefan Behnel wrote: > mark florisson, 08.05.2012 11:24: >>>>> Dag Sverre Seljebotn, 08.05.2012 09:57: >>>>>> ?1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in >>>>>> the language. It means exactly what you would like double[:] to mean, >>>>>> i.e. >>>>>> a variable that is memoryview when you need to and an object otherwise. >>>>>> When you use this type, you bear the consequences of early-binding things >>>>>> that could in theory be overridden. >>>>>> >>>>>> ?2) double[:] is for when you want to access data of *any* Python object >>>>>> in a generic way. Raw PEP 3118. In those situations, access to the >>>>>> underlying object is much less useful. >>>>>> >>>>>> ? 2a) Therefore we require that you do "mview.asobject()" manually; doing >>>>>> "mview.foo()" is a compile-time error >>> [...] >>> Character pointers coerce to strings. Hell, even structs coerce to and >>> from python dicts, so disallowing the same for memoryviews would just >>> be inconsistent and inconvenient. > > Two separate things to discuss here: the original exporter and a Python > level wrapper. > > As long as wrapping the memoryview in a new object is can easily be done by > users, I don't see a reason to provide compiler support for getting at the > exporter. After all, a user may have a memory view that is backed by a > NumPy array but wants to reinterpret it as a PIL image. Just because the > underlying object has a specific object type doesn't mean that's the one to > use for a given use case. 
If a user requires a specific object *instead* of > a bare memory view, we have the object type buffer syntax for that. On the other hand, if the object type buffer syntax to be deprecated and replaced by bare memory views, then a user-specified exporter is I think quite important so that, e.g. when slicing NumPy arrays one gets NumPy arrays back. Is slicing the only way in which to get new memoryviews from old? If this is the case, perhaps we could use a Python __getitem__ call with the appropriate slice to create a new underlying object from the original underlying object (only when needed of course). This is assuming that the underlying object supports it. > It's also not necessarily more efficient to access the underlying object > than to create a new one if the underlying exporter has to learn about the > mapped layout first. > > Regarding the coercion to Python, I do not see a problem with providing a > general Python view object for memory views that arbitrary Cython memory > views can coerce to. In fact, I consider that a useful feature. The builtin > memoryview type in Python (at least the one in CPython 3.3) should be quite > capable of providing this, although I don't mind what exactly this becomes. I'd rather not make things global, but for memory views that were created without an underlying object, having a good default (I'd rather not have a global registry) makes a lot of sense. - Robert From markflorisson88 at gmail.com Wed May 9 20:45:48 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 19:45:48 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8EC03.4000201@behnel.de> Message-ID: On 9 May 2012 19:35, Robert Bradshaw wrote: > On Tue, May 8, 2012 at 2:48 AM, Stefan Behnel wrote: >> mark florisson, 08.05.2012 11:24: >>>>>> Dag Sverre Seljebotn, 08.05.2012 09:57: >>>>>>> ?1) We NEVER deprecate "np.ndarray[double]", we commit to keeping that in >>>>>>> the language. It means exactly what you would like double[:] to mean, >>>>>>> i.e. >>>>>>> a variable that is memoryview when you need to and an object otherwise. >>>>>>> When you use this type, you bear the consequences of early-binding things >>>>>>> that could in theory be overridden. >>>>>>> >>>>>>> ?2) double[:] is for when you want to access data of *any* Python object >>>>>>> in a generic way. Raw PEP 3118. In those situations, access to the >>>>>>> underlying object is much less useful. >>>>>>> >>>>>>> ? 2a) Therefore we require that you do "mview.asobject()" manually; doing >>>>>>> "mview.foo()" is a compile-time error >>>> [...] >>>> Character pointers coerce to strings. Hell, even structs coerce to and >>>> from python dicts, so disallowing the same for memoryviews would just >>>> be inconsistent and inconvenient. >> >> Two separate things to discuss here: the original exporter and a Python >> level wrapper. >> >> As long as wrapping the memoryview in a new object is can easily be done by >> users, I don't see a reason to provide compiler support for getting at the >> exporter. 
After all, a user may have a memory view that is backed by a >> NumPy array but wants to reinterpret it as a PIL image. Just because the >> underlying object has a specific object type doesn't mean that's the one to >> use for a given use case. If a user requires a specific object *instead* of >> a bare memory view, we have the object type buffer syntax for that. > > On the other hand, if the object type buffer syntax to be deprecated > and replaced by bare memory views, then a user-specified exporter is I > think quite important so that, e.g. when slicing NumPy arrays one gets > NumPy arrays back. > > Is slicing the only way in which to get new memoryviews from old? If > this is the case, perhaps we could use a Python __getitem__ call with > the appropriate slice to create a new underlying object from the > original underlying object (only when needed of course). This is > assuming that the underlying object supports it. You can also use newaxis indexing or transpose the view, but those are the only ways to change the view I think. I like the idea quite a bit, as the callback has no sane way of getting registered. For newaxes we could pass in None in the right places to __getitem__, as for transpose, the 'T' attribute works for numpy, I don't know about other exposers. >> It's also not necessarily more efficient to access the underlying object >> than to create a new one if the underlying exporter has to learn about the >> mapped layout first. >> >> Regarding the coercion to Python, I do not see a problem with providing a >> general Python view object for memory views that arbitrary Cython memory >> views can coerce to. In fact, I consider that a useful feature. The builtin >> memoryview type in Python (at least the one in CPython 3.3) should be quite >> capable of providing this, although I don't mind what exactly this becomes. > > I'd rather not make things global, but for memory views that were > created without an underlying object, having a good default (I'd > rather not have a global registry) makes a lot of sense. > > - Robert > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Wed May 9 20:55:04 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 09 May 2012 20:55:04 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8EC03.4000201@behnel.de> Message-ID: <4FAABD88.9030103@behnel.de> mark florisson, 09.05.2012 20:45: > You can also use newaxis indexing or transpose the view What is "newaxis indexing"? Stefan From robertwb at gmail.com Wed May 9 20:56:45 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 9 May 2012 11:56:45 -0700 Subject: [Cython] buffer syntax vs. 
memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no> <4FA8EBAE.3010106@astro.uio.no> Message-ID: On Tue, May 8, 2012 at 3:35 AM, mark florisson wrote: > On 8 May 2012 10:47, Dag Sverre Seljebotn wrote: >> >> After some thinking I believe I can see more clearly where Mark is coming >> from. To sum up, it's either >> >> A) Keep both np.ndarray[double] and double[:] around, with clearly defined >> and separate roles. np.ndarray[double] implementation is revamped to allow >> fast slicing etc., based on the double[:] implementation. >> >> B) Deprecate np.ndarray[double] sooner rather than later, but make double[:] >> have functionality that is *really* close to what np.ndarray[double] >> currently does. In most cases one should be able to basically replace >> np.ndarray[double] with double[:] and the code should continue to work just >> like before; difference is that if you pass in anything else than a NumPy >> array, it will likely fail with a runtime AttributeError at some point >> rather than fail a PyType_Check. > > That's a good summary. I have a big preference for B here, but I agree > that treating a typed memoryview as both a user object (possibly > converted through callback) and a typed memoryview "subclass" is quite > magicky. With the talk of overlay modules and go-style interface, being able to specify the type of an object as well as its bufferness could become more interesting than it even is now. The notion of supporting multiple interfaces, e.g. cdef np.ndarray & double[:] my_array could obviate the need for np.ndarray[double]. Until we support something like this, or decide to reject it, I think we need to keep the old-style syntax around. (np.ndarray[double] could even become this intersection type to gain all the new features before we decide on a appropriate syntax). > I wouldn't particularly mind something concise like 'm.obj'. > The AttributeError would be the case as usual, when a python object > doesn't have the right interface. Having to insert the .obj in there does make it more painful to convert existing Python code. - Robert From markflorisson88 at gmail.com Wed May 9 21:03:56 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 20:03:56 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FAABD88.9030103@behnel.de> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8EC03.4000201@behnel.de> <4FAABD88.9030103@behnel.de> Message-ID: On 9 May 2012 19:55, Stefan Behnel wrote: > mark florisson, 09.05.2012 20:45: >> You can also use newaxis indexing or transpose the view > > What is "newaxis indexing"? > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel It's when you introduce a new one-sized dimension. 
E.g. if you have a 1D array with shape (10,), and index it like myarray[None, :], you get a 2D array with shape (1, 10). There is a pending pull request for that (which should make it into 0.17). From markflorisson88 at gmail.com Wed May 9 21:08:07 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 20:08:07 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no> <4FA8EBAE.3010106@astro.uio.no> Message-ID: On 9 May 2012 19:56, Robert Bradshaw wrote: > On Tue, May 8, 2012 at 3:35 AM, mark florisson > wrote: >> On 8 May 2012 10:47, Dag Sverre Seljebotn wrote: >>> >>> After some thinking I believe I can see more clearly where Mark is coming >>> from. To sum up, it's either >>> >>> A) Keep both np.ndarray[double] and double[:] around, with clearly defined >>> and separate roles. np.ndarray[double] implementation is revamped to allow >>> fast slicing etc., based on the double[:] implementation. >>> >>> B) Deprecate np.ndarray[double] sooner rather than later, but make double[:] >>> have functionality that is *really* close to what np.ndarray[double] >>> currently does. In most cases one should be able to basically replace >>> np.ndarray[double] with double[:] and the code should continue to work just >>> like before; difference is that if you pass in anything else than a NumPy >>> array, it will likely fail with a runtime AttributeError at some point >>> rather than fail a PyType_Check. >> >> That's a good summary. I have a big preference for B here, but I agree >> that treating a typed memoryview as both a user object (possibly >> converted through callback) and a typed memoryview "subclass" is quite >> magicky. > > With the talk of overlay modules and go-style interface, being able to > specify the type of an object as well as its bufferness could become > more interesting than it even is now. The notion of supporting > multiple interfaces, e.g. > > cdef np.ndarray & double[:] my_array > > could obviate the need for np.ndarray[double]. Until we support > something like this, or decide to reject it, I think we need to keep > the old-style syntax around. (np.ndarray[double] could even become > this intersection type to gain all the new features before we decide > on a appropriate syntax). It's kind of interesting but also kind of a pain to declare everywhere like that. Buffer syntax should by no means be deprecated in the near future, but at some point it will be better to have one way to do things, whether slightly magicky or more convoluted or not. Also, as Dag mentioned, if we want fused extension types it makes more sense to remove buffer syntax to disambiguate this and avoid context-dependent special casing (e.g. np.ndarray and array.array). >> I wouldn't particularly mind something concise like 'm.obj'. >> The AttributeError would be the case as usual, when a python object >> doesn't have the right interface. > Having to insert the .obj in there does make it more painful to > convert existing Python code. Yes, hence my slight bias towards magicky. But I do fully agree with all opposing arguments that say "too much magic".
I just prefer to be pragmatic here :) > - Robert > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 9 21:09:01 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 9 May 2012 20:09:01 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no> <4FA8EBAE.3010106@astro.uio.no> Message-ID: On 9 May 2012 20:08, mark florisson wrote: > On 9 May 2012 19:56, Robert Bradshaw wrote: >> On Tue, May 8, 2012 at 3:35 AM, mark florisson >> wrote: >>> On 8 May 2012 10:47, Dag Sverre Seljebotn wrote: >>>> >>>> After some thinking I believe I can see more clearly where Mark is coming >>>> from. To sum up, it's either >>>> >>>> A) Keep both np.ndarray[double] and double[:] around, with clearly defined >>>> and separate roles. np.ndarray[double] implementation is revamped to allow >>>> fast slicing etc., based on the double[:] implementation. >>>> >>>> B) Deprecate np.ndarray[double] sooner rather than later, but make double[:] >>>> have functionality that is *really* close to what np.ndarray[double] >>>> currently does. In most cases one should be able to basically replace >>>> np.ndarray[double] with double[:] and the code should continue to work just >>>> like before; difference is that if you pass in anything else than a NumPy >>>> array, it will likely fail with a runtime AttributeError at some point >>>> rather than fail a PyType_Check. >>> >>> That's a good summary. I have a big preference for B here, but I agree >>> that treating a typed memoryview as both a user object (possibly >>> converted through callback) and a typed memoryview "subclass" is quite >>> magicky. >> >> With the talk of overlay modules and go-style interface, being able to >> specify the type of an object as well as its bufferness could become >> more interesting than it even is now. The notion of supporting >> multiple interfaces, e.g. >> >> cdef np.ndarray & double[:] my_array >> >> could obviate the need for np.ndarray[double]. Until we support >> something like this, or decide to reject it, I think we need to keep >> the old-style syntax around. (np.ndarray[double] could even become >> this intersection type to gain all the new features before we decide >> on a appropriate syntax). > > It's kind of interesting but also kind of a pain to declare everywhere > like that. Although I suppose a typedef could help. But then it's harder to see the dtype without lookup up the typedef declaration. Oh well :) > Buffer syntax should by no means deprecated in the near > future, but at some point it will be better to have one way to do > things, whether slightly magicky or more convoluted or not. Also, as > Dag mentioned, if we want fused extension types it makes more sense to > remove buffer syntax to disambiguate this and avoid context-dependent > special casing (e.g. np.ndarray and array.array). > >>> I wouldn't particularly mind something concise like 'm.obj'. 
>>> The AttributeError would be the case as usual, when a python object >>> doesn't have the right interface. >> >> Having to insert the .obj in there does make it more painful to >> convert existing Python code. > > Yes, hence my slight bias towards magicky. But I do fully agree with > all opposing arguments that say "too much magic". I just prefer to be > pragmatic here :) > >> - Robert >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel From robertwb at gmail.com Wed May 9 21:44:44 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 9 May 2012 12:44:44 -0700 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no> <4FA8EBAE.3010106@astro.uio.no> Message-ID: On Wed, May 9, 2012 at 12:09 PM, mark florisson wrote: > On 9 May 2012 20:08, mark florisson wrote: >> On 9 May 2012 19:56, Robert Bradshaw wrote: >>> On Tue, May 8, 2012 at 3:35 AM, mark florisson >>> wrote: >>>> On 8 May 2012 10:47, Dag Sverre Seljebotn wrote: >>>>> >>>>> After some thinking I believe I can see more clearly where Mark is coming >>>>> from. To sum up, it's either >>>>> >>>>> A) Keep both np.ndarray[double] and double[:] around, with clearly defined >>>>> and separate roles. np.ndarray[double] implementation is revamped to allow >>>>> fast slicing etc., based on the double[:] implementation. >>>>> >>>>> B) Deprecate np.ndarray[double] sooner rather than later, but make double[:] >>>>> have functionality that is *really* close to what np.ndarray[double] >>>>> currently does. In most cases one should be able to basically replace >>>>> np.ndarray[double] with double[:] and the code should continue to work just >>>>> like before; difference is that if you pass in anything else than a NumPy >>>>> array, it will likely fail with a runtime AttributeError at some point >>>>> rather than fail a PyType_Check. >>>> >>>> That's a good summary. I have a big preference for B here, but I agree >>>> that treating a typed memoryview as both a user object (possibly >>>> converted through callback) and a typed memoryview "subclass" is quite >>>> magicky. >>> >>> With the talk of overlay modules and go-style interface, being able to >>> specify the type of an object as well as its bufferness could become >>> more interesting than it even is now. The notion of supporting >>> multiple interfaces, e.g. >>> >>> cdef np.ndarray & double[:] my_array >>> >>> could obviate the need for np.ndarray[double]. Until we support >>> something like this, or decide to reject it, I think we need to keep >>> the old-style syntax around. (np.ndarray[double] could even become >>> this intersection type to gain all the new features before we decide >>> on a appropriate syntax). >> >> It's kind of interesting but also kind of a pain to declare everywhere >> like that. > > Although I suppose a typedef could help. But then it's harder to see > the dtype without lookup up the typedef declaration. Oh well :) One would only use this syntax when one wanted to use features from both. >> Yes, hence my slight bias towards magicky. 
But I do fully agree with >> all opposing arguments that say "too much magic". I just prefer to be >> pragmatic here :) Same here. I think part of the magic feel is due to the ambiguity; a concrete and simple declaration of when it acts as an object and when it doesn't could help here. Auto-coercion is well engrained into the Cython language (and one of the big selling points) so I think that's OK. - Robert From stefan_ml at behnel.de Thu May 10 08:45:45 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 10 May 2012 08:45:45 +0200 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> Message-ID: <4FAB6419.3090303@behnel.de> Vitja Makarov, 08.05.2012 15:47: > 2012/5/8 Stefan Behnel: >> Vitja has rebased the type inference on the control flow, so I wonder if >> this will enable us to properly infer this: >> >> def partial_validity(): >> """ >> >>> partial_validity() >> ('Python object', 'double', 'str object') >> """ >> a = 1.0 >> b = a + 2 # definitely double >> a = 'test' >> c = a + 'toast' # definitely str >> return typeof(a), typeof(b), typeof(c) >> >> I think, what is mainly needed for this is that a NameNode with an >> undeclared type should not report its own entry as dependency but that of >> its own cf_assignments. Would this work? >> >> (Haven't got the time to try it out right now, so I'm dumping it here.) > > Yeah, that might work. The other way to go is to split entries: > > def partial_validity(): > """ > >>> partial_validity() > ('str object', 'double', 'str object') > """ > a_1 = 1.0 > b = a_1 + 2 # definitely double > a_2 = 'test' > c = a_2 + 'toast' # definitely str > return typeof(a_2), typeof(b), typeof(c) > > And this should work better because it allows to infer a_1 as a double > and a_2 as a string. How would type checks fit into this? Stupid example: def test(x): if isinstance(x, MyExtType): x.call_c_method() # type known, no None check needed else: x.call_py_method() # type unknown, may be None Would it work to consider a type checking branch an assignment to a new (and differently typed) entry? Stefan From vitja.makarov at gmail.com Thu May 10 09:27:58 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Thu, 10 May 2012 11:27:58 +0400 Subject: [Cython] CF based type inference In-Reply-To: <4FAB6419.3090303@behnel.de> References: <4FA91079.5090503@behnel.de> <4FAB6419.3090303@behnel.de> Message-ID: 2012/5/10 Stefan Behnel : > Vitja Makarov, 08.05.2012 15:47: >> 2012/5/8 Stefan Behnel: >>> Vitja has rebased the type inference on the control flow, so I wonder if >>> this will enable us to properly infer this: >>> >>> ?def partial_validity(): >>> ? ?""" >>> ? ?>>> partial_validity() >>> ? ?('Python object', 'double', 'str object') >>> ? ?""" >>> ? ?a = 1.0 >>> ? ?b = a + 2 ? # definitely double >>> ? ?a = 'test' >>> ? ?c = a + 'toast' ?# definitely str >>> ? ?return typeof(a), typeof(b), typeof(c) >>> >>> I think, what is mainly needed for this is that a NameNode with an >>> undeclared type should not report its own entry as dependency but that of >>> its own cf_assignments. Would this work? >>> >>> (Haven't got the time to try it out right now, so I'm dumping it here.) >> >> Yeah, that might work. The other way to go is to split entries: >> >> ?def partial_validity(): >> ? ?""" >> ? ?>>> partial_validity() >> ? ?('str object', 'double', 'str object') >> ? ?""" >> ? ?a_1 = 1.0 >> ? ?b = a_1 + 2 ? # definitely double >> ? ?a_2 = 'test' >> ? ?c = a_2 + 'toast' ?# definitely str >> ? 
?return typeof(a_2), typeof(b), typeof(c) >> >> And this should work better because it allows to infer a_1 as a double >> and a_2 as a string. > > How would type checks fit into this? Stupid example: > > ? def test(x): > ? ? ? if isinstance(x, MyExtType): > ? ? ? ? ? x.call_c_method() ? ?# type known, no None check needed > ? ? ? else: > ? ? ? ? ? x.call_py_method() ? # type unknown, may be None > > Would it work to consider a type checking branch an assignment to a new > (and differently typed) entry? > No, at least without special handler for this case. Anyway that's not that hard to implement isinstance() condition may mark x as being assigned to MyExtType, e.g.: if isinstance(x, MyExtType): x = x # Fake assignment x.call_c_method() -- vitja. From d.s.seljebotn at astro.uio.no Thu May 10 09:37:51 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 10 May 2012 09:37:51 +0200 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no> <4FA8EBAE.3010106@astro.uio.no> Message-ID: <4FAB704F.5010801@astro.uio.no> On 05/09/2012 09:08 PM, mark florisson wrote: > On 9 May 2012 19:56, Robert Bradshaw wrote: >> On Tue, May 8, 2012 at 3:35 AM, mark florisson >> wrote: >>> On 8 May 2012 10:47, Dag Sverre Seljebotn wrote: >>>> >>>> After some thinking I believe I can see more clearly where Mark is coming >>>> from. To sum up, it's either >>>> >>>> A) Keep both np.ndarray[double] and double[:] around, with clearly defined >>>> and separate roles. np.ndarray[double] implementation is revamped to allow >>>> fast slicing etc., based on the double[:] implementation. >>>> >>>> B) Deprecate np.ndarray[double] sooner rather than later, but make double[:] >>>> have functionality that is *really* close to what np.ndarray[double] >>>> currently does. In most cases one should be able to basically replace >>>> np.ndarray[double] with double[:] and the code should continue to work just >>>> like before; difference is that if you pass in anything else than a NumPy >>>> array, it will likely fail with a runtime AttributeError at some point >>>> rather than fail a PyType_Check. >>> >>> That's a good summary. I have a big preference for B here, but I agree >>> that treating a typed memoryview as both a user object (possibly >>> converted through callback) and a typed memoryview "subclass" is quite >>> magicky. >> >> With the talk of overlay modules and go-style interface, being able to >> specify the type of an object as well as its bufferness could become >> more interesting than it even is now. The notion of supporting >> multiple interfaces, e.g. >> >> cdef np.ndarray& double[:] my_array >> >> could obviate the need for np.ndarray[double]. Until we support >> something like this, or decide to reject it, I think we need to keep >> the old-style syntax around. (np.ndarray[double] could even become >> this intersection type to gain all the new features before we decide >> on a appropriate syntax). > > It's kind of interesting but also kind of a pain to declare everywhere > like that. 
Buffer syntax should by no means be deprecated in the near > future, but at some point it will be better to have one way to do > things, whether slightly magicky or more convoluted or not. Also, as > Dag mentioned, if we want fused extension types it makes more sense to > remove buffer syntax to disambiguate this and avoid context-dependent > special casing (e.g. np.ndarray and array.array). I don't think it hurts to have two ways of doing things if they are sufficiently well-motivated, sufficiently well-defined, and sufficiently different from one another. The original reason I wanted double[:] was to stop tying ourselves to NumPy and not promise to be compatible, because of the polymorphic aspect of NumPy. I think in the future, the Python behaviour of, say, +, in np.ndarray is going to be different from what we have today. You'll have the + fetching data over the network in some cases, or treating NA in special ways (I think there might be over a thousand posts about NA on the NumPy list by now?). In short, lots of stuff can be going on that we can't emulate in Cython. OTOH, perhaps that doesn't matter -- we just raise an exception for the NumPy arrays that we can't deal with, and move on... >>>> I wouldn't particularly mind something concise like 'm.obj'. >>>> The AttributeError would be the case as usual, when a python object >>>> doesn't have the right interface. >>> >>> Having to insert the .obj in there does make it more painful to >>> convert existing Python code. >> >> Yes, hence my slight bias towards magicky. But I do fully agree with >> all opposing arguments that say "too much magic". I just prefer to be >> pragmatic here :) It's a very big decision. I think two or three alternatives are starting to crystallise; but to choose between them I think it calls for a CEP with code examples, and a request for comment on both cython-users and numpy-discussion. Until that happens, avoiding any magic seems like a conservative forward-compatible default. Dag From markflorisson88 at gmail.com Thu May 10 10:34:31 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Thu, 10 May 2012 09:34:31 +0100 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAB6419.3090303@behnel.de> Message-ID: On 10 May 2012 08:27, Vitja Makarov wrote: > 2012/5/10 Stefan Behnel : >> Vitja Makarov, 08.05.2012 15:47: >>> 2012/5/8 Stefan Behnel: >>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>> this will enable us to properly infer this:
>>>>
>>>> def partial_validity():
>>>>     """
>>>>     >>> partial_validity()
>>>>     ('Python object', 'double', 'str object')
>>>>     """
>>>>     a = 1.0
>>>>     b = a + 2   # definitely double
>>>>     a = 'test'
>>>>     c = a + 'toast'  # definitely str
>>>>     return typeof(a), typeof(b), typeof(c)
>>>>
>>>> I think, what is mainly needed for this is that a NameNode with an >>>> undeclared type should not report its own entry as dependency but that of >>>> its own cf_assignments. Would this work? >>>> >>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>> >>> Yeah, that might work. The other way to go is to split entries:
>>>
>>> def partial_validity():
>>>     """
>>>     >>> partial_validity()
>>>     ('str object', 'double', 'str object')
>>>     """
>>>     a_1 = 1.0
>>>     b = a_1 + 2   # definitely double
>>>     a_2 = 'test'
>>>     c = a_2 + 'toast'  # definitely str
>>>     return typeof(a_2), typeof(b), typeof(c)
>>>
>>> And this should work better because it allows to infer a_1 as a double >>> and a_2 as a string. >> >> How would type checks fit into this? Stupid example:
>>
>> def test(x):
>>     if isinstance(x, MyExtType):
>>         x.call_c_method()    # type known, no None check needed
>>     else:
>>         x.call_py_method()   # type unknown, may be None
>>
>> Would it work to consider a type checking branch an assignment to a new >> (and differently typed) entry? >> > > No, at least not without a special handler for this case. > Anyway, that's not that hard to implement: an isinstance() condition may > mark x as being assigned to MyExtType, e.g.:
>
> if isinstance(x, MyExtType):
>     x = x  # Fake assignment
>     x.call_c_method()
>
That would be nice. It might also be useful to do branch pruning before that stage, which may avoid a merge after the branch leading to a different (unknown, i.e. object) type. That could be useful in the face of fused types, where people write generic code triggering only a certain branch depending on the specialization. Bit of a special case maybe :) > > -- > vitja. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Thu May 10 10:44:29 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Thu, 10 May 2012 09:44:29 +0100 Subject: [Cython] buffer syntax vs. memory view syntax In-Reply-To: <4FAB704F.5010801@astro.uio.no> References: <4FA7A618.4000503@astro.uio.no> <4FA7A6B2.5000801@astro.uio.no> <4FA7ADC0.40501@behnel.de> <4FA7B682.5050300@astro.uio.no> <4FA7C852.9020004@behnel.de> <4FA7D940.5030607@behnel.de> <4FA7F194.5080008@astro.uio.no> <95c0afc3-08f4-47d1-8649-7b80f931be54@email.android.com> <4FA8D1F8.5020109@astro.uio.no> <4FA8D6E9.9090004@behnel.de> <4FA8DB02.2020902@astro.uio.no> <4FA8E79A.4040402@astro.uio.no> <4FA8EBAE.3010106@astro.uio.no> <4FAB704F.5010801@astro.uio.no> Message-ID: On 10 May 2012 08:37, Dag Sverre Seljebotn wrote: > On 05/09/2012 09:08 PM, mark florisson wrote: >> On 9 May 2012 19:56, Robert Bradshaw wrote: >>> On Tue, May 8, 2012 at 3:35 AM, mark florisson >>> wrote: >>>> On 8 May 2012 10:47, Dag Sverre Seljebotn >>>> wrote: >>>>> >>>>> After some thinking I believe I can see more clearly where Mark is >>>>> coming >>>>> from. To sum up, it's either >>>>> >>>>> A) Keep both np.ndarray[double] and double[:] around, with clearly >>>>> defined >>>>> and separate roles. np.ndarray[double] implementation is revamped to >>>>> allow >>>>> fast slicing etc., based on the double[:] implementation. >>>>> >>>>> B) Deprecate np.ndarray[double] sooner rather than later, but make >>>>> double[:] >>>>> have functionality that is *really* close to what np.ndarray[double] >>>>> currently does. In most cases one should be able to basically replace >>>>> np.ndarray[double] with double[:] and the code should continue to work >>>>> just >>>>> like before; difference is that if you pass in anything else than a >>>>> NumPy >>>>> array, it will likely fail with a runtime AttributeError at some point >>>>> rather than fail a PyType_Check. >>>> >>>> >>>> That's a good summary. I have a big preference for B here, but I agree >>>> that treating a typed memoryview as both a user object (possibly >>>> converted through callback) and a typed memoryview "subclass" is quite >>>> magicky.
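To make the A/B trade-off concrete, here is a minimal sketch of how the two declaration styles differ in practice (illustrative Cython, assuming the usual NumPy cimport; not meant as a definitive design):

    import numpy as np
    cimport numpy as np

    def scale_buffer(np.ndarray[double] a, double f):
        # buffer syntax: 'a' is a NumPy object *and* a fast buffer,
        # so ndarray methods and operators still work on it
        return a * f

    def scale_memview(double[:] a, double f):
        # memoryview syntax: raw PEP 3118 access to *any* exporter;
        # only element and slice access, no ndarray API on 'a'
        cdef Py_ssize_t i
        for i in range(a.shape[0]):
            a[i] = a[i] * f
        return a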
>>> >>> With the talk of overlay modules and go-style interface, being able to >>> specify the type of an object as well as its bufferness could become >>> more interesting than it even is now. The notion of supporting >>> multiple interfaces, e.g. >>> >>> cdef np.ndarray & double[:] my_array >>> >>> could obviate the need for np.ndarray[double]. Until we support >>> something like this, or decide to reject it, I think we need to keep >>> the old-style syntax around. (np.ndarray[double] could even become >>> this intersection type to gain all the new features before we decide >>> on a appropriate syntax). >> >> It's kind of interesting but also kind of a pain to declare everywhere >> like that. Buffer syntax should by no means be deprecated in the near >> future, but at some point it will be better to have one way to do >> things, whether slightly magicky or more convoluted or not. Also, as >> Dag mentioned, if we want fused extension types it makes more sense to >> remove buffer syntax to disambiguate this and avoid context-dependent >> special casing (e.g. np.ndarray and array.array). > > I don't think it hurts to have two ways of doing things if they are > sufficiently well-motivated, sufficiently well-defined, and sufficiently > different from one another. > > The original reason I wanted double[:] was to stop tying ourselves to NumPy > and not promise to be compatible, because of the polymorphic aspect of > NumPy. I think in the future, the Python behaviour of, say, +, in np.ndarray > is going to be different from what we have today. You'll have the + fetching > data over the network in some cases, or treating NA in special ways (I think > there might be over a thousand posts about NA on the NumPy list by now?). In short, lots > of stuff can be going on that we can't emulate in Cython. > > OTOH, perhaps that doesn't matter -- we just raise an exception for the > NumPy arrays that we can't deal with, and move on... Basically, the only thing that both np.ndarray and memoryviews guarantee is that they operate through the buffer interface, and that they obtain this view at certain points (assignment). Hence, if you decide to resize your array, or swap your axes or whatever, then your object view may no longer be consistent with your buffer. When or if your buffer view changes isn't even defined, but is kind of dictated by the implementation. Hence, if memoryviews overload +, then that + will always be triggered on a typed view. I do suppose that if people rely on type inference getting the type right, things start to get messy. As for NA, maybe they will extend the buffer interface at some point, but on the other hand Python people may feel that it will be too specific of a use case (wild guess). Until then, keep your separate masks around :) Anyway, a valid point. It's hard to see where this is going and how future-proof it is. >>>> I wouldn't particularly mind something concise like 'm.obj'. >>>> The AttributeError would be the case as usual, when a python object >>>> doesn't have the right interface. >>> >>> Having to insert the .obj in there does make it more painful to >>> convert existing Python code. >> >> Yes, hence my slight bias towards magicky. But I do fully agree with >> all opposing arguments that say "too much magic". I just prefer to be >> pragmatic here :) > > > It's a very big decision.
I think two or three alternatives are starting to > crystallise; but to choose between them I think it calls for a CEP with code > examples, and a request for comment on both cython-users and > numpy-discussion. > > Until that happens, avoiding any magic seems like a conservative > forward-compatible default. > > Dag > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From vitja.makarov at gmail.com Thu May 10 11:15:12 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Thu, 10 May 2012 13:15:12 +0400 Subject: [Cython] Bug in print statement In-Reply-To: References: <4FAA9EDB.4040902@behnel.de> Message-ID: 2012/5/9 Vitja Makarov : > 2012/5/9 Stefan Behnel : >> Vitja Makarov, 09.05.2012 18:31: >>> Del statement inference enabled pyregr.test_descr testcase and it SIGSEGVs. >>> Here is minimal example: >>> >>> import unittest >>> import sys >>> >>> class Foo(unittest.TestCase): >>> ? ? def test_file_fault(self): >>> ? ? ? ? # Testing sys.stdout is changed in getattr... >>> ? ? ? ? test_stdout = sys.stdout >>> ? ? ? ? class StdoutGuard: >>> ? ? ? ? ? ? def __getattr__(self, attr): >>> ? ? ? ? ? ? ? ? test_stdout.write('%d\n' % sys.getrefcount(self)) >>> ? ? ? ? ? ? ? ? sys.stdout = ?test_stdout #sys.__stdout__ >>> ? ? ? ? ? ? ? ? test_stdout.write('%d\n' % sys.getrefcount(self)) >>> ? ? ? ? ? ? ? ? test_stdout.write('getattr: %r\n' % attr) >>> ? ? ? ? ? ? ? ? test_stdout.flush() >>> ? ? ? ? ? ? ? ? raise RuntimeError("Premature access to sys.stdout.%s" % attr) >>> ? ? ? ? sys.stdout = StdoutGuard() >>> ? ? ? ? try: >>> ? ? ? ? ? ? print "Oops!" >>> ? ? ? ? except RuntimeError: >>> ? ? ? ? ? ? pass >>> ? ? ? ? finally: >>> ? ? ? ? ? ? sys.stdout = test_stdout >>> >>> ? ? def test_getattr_hooks(self): >>> ? ? ? ? pass >>> >>> from test import test_support >>> test_support.run_unittest(Foo) >>> >>> It works in python and sigsegvs in cython. >>> It seems to me that the problem is StdoutGuard() is still used when >>> its reference counter is zero since Python interpreter does >>> Py_XINCREF() for file object and __Pyx_Print() doesn't. >> >> Makes sense to change that, IMHO. An additional INCREF during something as >> involved as a print() will not hurt anyone. >> >> IIRC, I had the same problem with PyPy - guess I should have fixed it back >> then instead of taking the lazy escape towards using the print() function. >> > > I've moved printing function to Utility/ and fixed refcount bug, if > jenkins is ok I'm gonna push this commit to master > > https://github.com/vitek/cython/commit/83eceb31b4ed9afc0fd6d24c9eda5e52d9420535 > I've pushed fixes to master. -- vitja. From stefan_ml at behnel.de Thu May 10 11:25:14 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 10 May 2012 11:25:14 +0200 Subject: [Cython] pure mode quirk Message-ID: <4FAB897A.4020606@behnel.de> Hi, when declaring a C function in pure mode, you eventually end up with this: @cython.cfunc @cython.returns(cython.bint) @cython.locals(a=cython.int, b=cython.int, c=cython.int) def c_compare(a,b): c = 5 return a == b + c That is very verbose, making it hard to find the name of the actual function. It's also not very intuitive that @cython.locals() is the way to declare arguments. 
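For comparison, the same function in a normal .pyx file would presumably just be:

    cdef bint c_compare(int a, int b):
        cdef int c = 5
        return a == b + c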
I would find it more readable to support this: @cython.cfunc(cython.bint, a=cython.int, b=cython.int) @cython.locals(c=cython.int) def c_compare(a,b): c = 5 return a == b But the problem here is that it conflicts with @cython.cfunc def c_compare(a,b): c = 5 return a == b when executed from Shadow.py. How should the fake decorator know that it is being called with a type as first argument and not with the function it decorates? Legacy, legacy ... An alternative would be this: @cython.cfunc(a=cython.int, b=cython.int, _returns=cython.bint) @cython.locals(c=cython.int) def c_compare(a,b): c = 5 return a == b But that's not clearer than an explicit decorator for the return value. I'm somewhat concerned about the redundancy this introduces with @locals(), which could still be used to declare argument types (even conflicting ones). However, getting rid of the need for a separate @returns() seems worthwhile by itself, so this might provide a compromise: @cython.cfunc(returns=cython.bint) @cython.locals(a=cython.int, b=cython.int, c=cython.int) def c_compare(a,b): c = 5 return a == b + c This would work in Shadow.py because it's easy to distinguish between a positional argument (the decorated function) and a keyword argument ("returns"). It might lead to bugs in user code, though, if they forget to pass the return type as a keyword argument. Maybe just a minor concern, because the decorator doesn't read well without the keyword. What do you think? Is this worth doing something about at all? Stefan From vitja.makarov at gmail.com Thu May 10 11:34:08 2012 From: vitja.makarov at gmail.com (Vitja Makarov) Date: Thu, 10 May 2012 13:34:08 +0400 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA3A1A.7070008@behnel.de> <4FAA7243.9000107@behnel.de> Message-ID: 2012/5/9 Robert Bradshaw : > On Wed, May 9, 2012 at 6:33 AM, Stefan Behnel wrote: >> mark florisson, 09.05.2012 15:18: >>> On 9 May 2012 14:16, Vitja Makarov wrote: >>>> from cython cimport typeof >>>> >>>> def foo(float[::1] a): >>>> ? ?b = a >>>> ? ?#del b >>>> ? ?print typeof(b) >>>> ? ?print typeof(a) >>>> >>>> >>>> In this example `b` is inferred as 'Python object' and not >>>> `float[::1]`, is that correct? >>>> >>> That's the current behaviour, but it would be better if it inferred a >>> memoryview slice instead. >> >> +1 > > +1. This looks like it would break inference of extension classes as well. > > https://github.com/vitek/cython/commit/f5acf44be0f647bdcbb5a23c8bfbceff48f4414e#L0R336 > > could be changed to check if it's already a py_object_type (or memory > view) as a quick fix, but it's not as pure as adding the constraints > "can be del'ed" to the type inference engine. > Here is fixed version: https://github.com/vitek/cython/commit/0f122b6dfb6d0c7932b08cc35cdcc90c3c30257b#L0R334 -- vitja. From markflorisson88 at gmail.com Thu May 10 11:41:16 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Thu, 10 May 2012 10:41:16 +0100 Subject: [Cython] pure mode quirk In-Reply-To: <4FAB897A.4020606@behnel.de> References: <4FAB897A.4020606@behnel.de> Message-ID: On 10 May 2012 10:25, Stefan Behnel wrote: > Hi, > > when declaring a C function in pure mode, you eventually end up with this: > > ? ?@cython.cfunc > ? ?@cython.returns(cython.bint) > ? ?@cython.locals(a=cython.int, b=cython.int, c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b + c > > That is very verbose, making it hard to find the name of the actual > function. 
It's also not very intuitive that @cython.locals() is the way to > declare arguments. > > I would find it more readable to support this: > > ? ?@cython.cfunc(cython.bint, a=cython.int, b=cython.int) > ? ?@cython.locals(c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b > > But the problem here is that it conflicts with > > ? ?@cython.cfunc > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b > > when executed from Shadow.py. How should the fake decorator know that it is > being called with a type as first argument and not with the function it > decorates? Legacy, legacy ... I personally don't care much for pure mode, but it could just do an instance check for a function. You only accept real def functions anyway. > An alternative would be this: > > ? ?@cython.cfunc(a=cython.int, b=cython.int, _returns=cython.bint) > ? ?@cython.locals(c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b > > But that's not clearer than an explicit decorator for the return value. > > I'm somewhat concerned about the redundancy this introduces with @locals(), > which could still be used to declare argument types (even conflicting > ones). However, getting rid of the need for a separate @returns() seems > worthwhile by itself, so this might provide a compromise: > > ? ?@cython.cfunc(returns=cython.bint) > ? ?@cython.locals(a=cython.int, b=cython.int, c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b + c > > This would work in Shadow.py because it's easy to distinguish between a > positional argument (the decorated function) and a keyword argument > ("returns"). It might lead to bugs in user code, though, if they forget to > pass the return type as a keyword argument. Maybe just a minor concern, > because the decorator doesn't read well without the keyword. > > What do you think? Is this worth doing something about at all? > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Thu May 10 13:39:26 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 10 May 2012 13:39:26 +0200 Subject: [Cython] pure mode quirk In-Reply-To: References: <4FAB897A.4020606@behnel.de> Message-ID: <4FABA8EE.7090008@behnel.de> mark florisson, 10.05.2012 11:41: > On 10 May 2012 10:25, Stefan Behnel wrote: >> when declaring a C function in pure mode, you eventually end up with this: >> >> @cython.cfunc >> @cython.returns(cython.bint) >> @cython.locals(a=cython.int, b=cython.int, c=cython.int) >> def c_compare(a,b): >> c = 5 >> return a == b + c >> >> That is very verbose, making it hard to find the name of the actual >> function. It's also not very intuitive that @cython.locals() is the way to >> declare arguments. >> >> I would find it more readable to support this: >> >> @cython.cfunc(cython.bint, a=cython.int, b=cython.int) >> @cython.locals(c=cython.int) >> def c_compare(a,b): >> c = 5 >> return a == b >> >> But the problem here is that it conflicts with >> >> @cython.cfunc >> def c_compare(a,b): >> c = 5 >> return a == b >> >> when executed from Shadow.py. How should the fake decorator know that it is >> being called with a type as first argument and not with the function it >> decorates? Legacy, legacy ... > > I personally don't care much for pure mode, but it could just do an > instance check for a function. You only accept real def functions > anyway. Hmm, maybe, yes. 
IIRC, non-Cython decorators are otherwise forbidden on cdef functions (but also on cpdef functions?), so the case that another decorator replaces the function with something else in between isn't very likely to occur. In any case, the fix would be to change the decorator order to move the Cython decorators right at the function declaration. Not sure if that'd be completely obvious to everyone, but, as I said, not very likely to be a problem ... Stefan From robertwb at gmail.com Thu May 10 19:34:06 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Thu, 10 May 2012 10:34:06 -0700 Subject: [Cython] pure mode quirk In-Reply-To: <4FAB897A.4020606@behnel.de> References: <4FAB897A.4020606@behnel.de> Message-ID: On Thu, May 10, 2012 at 2:25 AM, Stefan Behnel wrote: > Hi, > > when declaring a C function in pure mode, you eventually end up with this: > > ? ?@cython.cfunc > ? ?@cython.returns(cython.bint) > ? ?@cython.locals(a=cython.int, b=cython.int, c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b + c > > That is very verbose, making it hard to find the name of the actual > function. It's also not very intuitive that @cython.locals() is the way to > declare arguments. > > I would find it more readable to support this: > > ? ?@cython.cfunc(cython.bint, a=cython.int, b=cython.int) > ? ?@cython.locals(c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b > > But the problem here is that it conflicts with > > ? ?@cython.cfunc > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b > > when executed from Shadow.py. How should the fake decorator know that it is > being called with a type as first argument and not with the function it > decorates? Legacy, legacy ... > > An alternative would be this: > > ? ?@cython.cfunc(a=cython.int, b=cython.int, _returns=cython.bint) > ? ?@cython.locals(c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b > > But that's not clearer than an explicit decorator for the return value. > > I'm somewhat concerned about the redundancy this introduces with @locals(), > which could still be used to declare argument types (even conflicting > ones). However, getting rid of the need for a separate @returns() seems > worthwhile by itself, so this might provide a compromise: > > ? ?@cython.cfunc(returns=cython.bint) > ? ?@cython.locals(a=cython.int, b=cython.int, c=cython.int) > ? ?def c_compare(a,b): > ? ? ? ?c = 5 > ? ? ? ?return a == b + c > > This would work in Shadow.py because it's easy to distinguish between a > positional argument (the decorated function) and a keyword argument > ("returns"). It might lead to bugs in user code, though, if they forget to > pass the return type as a keyword argument. Maybe just a minor concern, > because the decorator doesn't read well without the keyword. > > What do you think? Is this worth doing something about at all? I didn't implement it this way originally because of the whole called/not ambiguity, but I didn't think about taking a keyword and using that to distinguish. (Testing the type of the input seemed to hackish...) I'm +1 for this, as well as accepting argument types in the cfunc decorator as well. There is a bit of overlap as "returns" has a special meaning (vs. an argument named returns) but I think that's OK, and cython.locals should still work. 
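A sketch of how the no-op decorator in Shadow.py might tell the two call forms apart (hypothetical code, not the current implementation):

    def cfunc(*args, **kwargs):
        if len(args) == 1 and not kwargs and callable(args[0]):
            # bare usage: @cython.cfunc - we received the function itself
            return args[0]
        # parametrised usage: @cython.cfunc(returns=..., a=..., b=...)
        # - ignore the annotations and hand back a pass-through decorator
        def decorator(func):
            return func
        return decorator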
- Robert From markflorisson88 at gmail.com Thu May 10 21:13:33 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Thu, 10 May 2012 20:13:33 +0100 Subject: [Cython] 0.16.1 In-Reply-To: <4FAA2170.4050808@behnel.de> References: <4FAA117F.1080402@behnel.de> <4FAA2170.4050808@behnel.de> Message-ID: On 9 May 2012 08:49, Stefan Behnel wrote: > Stefan Behnel, 09.05.2012 08:41: >> Robert Bradshaw, 09.05.2012 00:16: >>> If we're looking at doing 0.17 soon, let's just do that. >> I think it's close enough to be released. > > ... although one thing I just noticed is that the "numpy_memoryview" test > is still disabled because it led to crashes in recent Py3.2 releases (and > thus most likely also in the latest Py3k). Not sure if it still crashes, > but it should be checked before going for a release. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Hurgh. Disabling tests in bugs.txt is terrible; there should have been a comment in numpy_memoryview saying DISABLED, and the testcase function should have been a noop. Your commit e3838e42c4b6f67f180d06b8cd75566f3380ab95 broke how typedef types are compared, which makes the test get a temporary of the wrong type. Let me try reverting that commit -- what was it needed for? From stefan_ml at behnel.de Thu May 10 22:21:06 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 10 May 2012 22:21:06 +0200 Subject: [Cython] 0.16.1 In-Reply-To: References: <4FAA117F.1080402@behnel.de> <4FAA2170.4050808@behnel.de> Message-ID: <4FAC2332.2040001@behnel.de> mark florisson, 10.05.2012 21:13: > On 9 May 2012 08:49, Stefan Behnel wrote: >> Stefan Behnel, 09.05.2012 08:41: >>> Robert Bradshaw, 09.05.2012 00:16: >>>> If we're looking at doing 0.17 soon, let's just do that. >>> I think it's close enough to be released. >> >> ... although one thing I just noticed is that the "numpy_memoryview" test >> is still disabled because it led to crashes in recent Py3.2 releases (and >> thus most likely also in the latest Py3k). Not sure if it still crashes, >> but it should be checked before going for a release. > > Hurgh. Disabling tests in bugs.txt is terrible; there should have been > a comment in numpy_memoryview saying DISABLED, and the testcase > function should have been a noop. ... or have a release mode in the test runner that barks at disabled tests. > Your commit > e3838e42c4b6f67f180d06b8cd75566f3380ab95 broke how typedef types are > compared, which makes the test get a temporary of the wrong type. Let > me try reverting that commit -- what was it needed for? It was meant to fix the comparison of different char* ctypedefs. However, seeing it in retrospect now, it would definitely break user code to compare ctypedefs by their underlying base type, because it's common for users to be lax about ctypedefs, e.g. for integer types. I think a better (and substantially safer) way to do it would be to use the hash value of the underlying declared type, but to make the equals comparison based on the typedef-ed name. That, plus a special-case treatment for char* compatible types. Thanks for figuring out the problem.
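To make the idea concrete, a rough standalone sketch of that comparison scheme (the class below is a simplified stand-in for Cython's actual CTypedefType, and the special-casing for char* compatible types is left out):

    class CTypedefType(object):
        # simplified stand-in for illustration only
        def __init__(self, typedef_name, typedef_base_type):
            self.typedef_name = typedef_name
            self.typedef_base_type = typedef_base_type

        def __hash__(self):
            # hash via the underlying declared type, so a ctypedef and
            # its base type land in the same hash bucket
            return hash(self.typedef_base_type)

        def __eq__(self, other):
            # ...but compare by the typedef-ed name, so two distinct
            # ctypedefs of the same base type stay distinct
            if isinstance(other, CTypedefType):
                return self.typedef_name == other.typedef_name
            return NotImplemented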
Stefan From markflorisson88 at gmail.com Thu May 10 22:24:54 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Thu, 10 May 2012 21:24:54 +0100 Subject: [Cython] 0.16.1 In-Reply-To: <4FAC2332.2040001@behnel.de> References: <4FAA117F.1080402@behnel.de> <4FAA2170.4050808@behnel.de> <4FAC2332.2040001@behnel.de> Message-ID: On 10 May 2012 21:21, Stefan Behnel wrote: > mark florisson, 10.05.2012 21:13: >> On 9 May 2012 08:49, Stefan Behnel wrote: >>> Stefan Behnel, 09.05.2012 08:41: >>>> Robert Bradshaw, 09.05.2012 00:16: >>>>> If we're looking at doing 0.17 soon, let's just do that. >>>> I think it's close enough to be released. >>> >>> ... although one thing I just noticed is that the "numpy_memoryview" test >>> is still disabled because it led to crashes in recent Py3.2 releases (and >>> thus most likely also in the latest Py3k). Not sure if it still crashes, >>> but it should be checked before going for a release. >> >> Hurgh. Disabling tests in bugs.txt is terrible; there should have been >> a comment in numpy_memoryview saying DISABLED, and the testcase >> function should have been a noop. > > ... or have a release mode in the test runner that barks at disabled tests. > > >> Your commit >> e3838e42c4b6f67f180d06b8cd75566f3380ab95 broke how typedef types are >> compared, which makes the test get a temporary of the wrong type. Let >> me try reverting that commit -- what was it needed for? > > It was meant to fix the comparison of different char* ctypedefs. However, > seeing it in retrospect now, it would definitely break user code to compare > ctypedefs by their underlying base type, because it's common for users to be > lax about ctypedefs, e.g. for integer types. > > I think a better (and substantially safer) way to do it would be to use the > hash value of the underlying declared type, but to make the equals > comparison based on the typedef-ed name. That, plus a special-case > treatment for char* compatible types. Yeah, I was thinking the same thing. I pushed a reverted commit; if you want, you can try out that scheme and see if it works. > Thanks for figuring out the problem. No problem. We learned that disabled tests aren't very good for continuous integration now :) > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Fri May 11 08:38:10 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 11 May 2012 08:38:10 +0200 Subject: [Cython] weird declarations in fused types C code Message-ID: <4FACB3D2.3090606@behnel.de> Hi, while trying to replace the "import sys; if sys.version_info >= (3,0)" in the fused types dispatch code by the more straightforward "if PY_MAJOR_VERSION >= 3" (before I came to think that this particular case only guards useless code that does the wrong thing), I noticed that the code generates a declaration of PyErr_Clear() into the outside environment. When used in cdef classes, this leads to an external method being declared in the class, essentially like this: cdef class MyClass: cdef extern from *: void PyErr_Clear() Surprisingly enough, this actually works. Cython assigns the real C-API function pointer to it during type initialisation and even calls the function directly (instead of going through the vtab) when used. A rather curious feature that I would never have thought of. Anyway, this side effect is obviously a bug in the fused types dispatch, but I don't have a good idea on how to fix it.
I'm sure Mark put some thought into this while trying hard to make it work and just didn't notice the impact on type namespaces. I've put up a pull request to remove the Py3 specialisation code, but this is worth some more consideration. Stefan From markflorisson88 at gmail.com Fri May 11 12:44:55 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Fri, 11 May 2012 11:44:55 +0100 Subject: [Cython] weird declarations in fused types C code In-Reply-To: <4FACB3D2.3090606@behnel.de> References: <4FACB3D2.3090606@behnel.de> Message-ID: On 11 May 2012 07:38, Stefan Behnel wrote: > Hi, > > while trying to replace the "import sys; if sys.version_info >= (3,0)" in > the fused types dispatch code by the more straightforward "if > PY_MAJOR_VERSION >= 3" (before I came to think that this particular case > only guards useless code that does the wrong thing), Yes, you made that plenty clear, sorry for thinking in terms of Python code. For the record, it does do the right thing.
> >> I noticed that the >> code generates a declaration of PyErr_Clear() into the outside environment. >> When used in cdef classes, this leads to an external method being declared >> in the class, essentially like this: >> >> cdef class MyClass: >> cdef extern from *: >> void PyErr_Clear() >> >> Surprisingly enough, this actually works. Cython assigns the real C-API >> function pointer to it during type initialisation and even calls the >> function directly (instead of going through the vtab) when used. A rather >> curious feature that I would never have thought of. > > Yes, normally the parser catches that. > >> Anyway, this side effect is obviously a bug in the fused types dispatch, >> but I don't have a good idea on how to fix it. I'm sure Mark put some >> thought into this while trying hard to make it work and just didn't notice >> the impact on type namespaces. > > I am aware of this behaviour; the thing is that the dispatcher > function needs to be analyzed in the right context in order to > generate an appropriate function or method in case of a cdef class > (which are different from methods in normal classes even when > synthesized). I thought about splitting the declarations from the > actual function, and analyzing that in the module scope. Perhaps with > some name mangling this can avoid names being accidentally available > in user code. I don't recall if I have tried that already, but I'll > give it another try. > >> I've put up a pull request to remove the Py3 specialisation code, but this >> is worth some more consideration. > > Looks good to me, I'll merge it. > >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Fri May 11 12:54:50 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Fri, 11 May 2012 11:54:50 +0100 Subject: [Cython] weird declarations in fused types C code In-Reply-To: References: <4FACB3D2.3090606@behnel.de> Message-ID: On 11 May 2012 11:44, mark florisson wrote: > On 11 May 2012 07:38, Stefan Behnel wrote: >> Hi, >> >> while trying to replace the "import sys; if sys.version_info >= (3,0)" in >> the fused types dispatch code by the more straightforward "if >> PY_MAJOR_VERSION >= 3" (before I came to think that this particular case >> only guards useless code that does the wrong thing), >> >> Yes, you made that plenty clear, sorry for thinking in terms of Python >> code. For the record, it does do the right thing.
I'm sure Mark put some >>> thought into this while trying hard to make it work and just didn't notice >>> the impact on type namespaces. >> >> I am aware of this behaviour; the thing is that the dispatcher >> function needs to be analyzed in the right context in order to >> generate an appropriate function or method in case of a cdef class >> (which are different from methods in normal classes even when >> synthesized). I thought about splitting the declarations from the >> actual function, and analyzing that in the module scope. Perhaps with >> some name mangling this can avoid names being accidentally available >> in user code. I don't recall if I have tried that already, but I'll >> give it another try. > > Ah, I see I already split them; all that is needed is to put it in the > global scope now :) https://github.com/markflorisson88/cython/commit/3500fcd01ce6e68e76fcbabfe009eb273d7972fb >>> I've put up a pull request to remove the Py3 specialisation code, but this >>> is worth some more consideration. >> >> Looks good to me, I'll merge it. >> >>> Stefan >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Fri May 11 13:21:28 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 11 May 2012 13:21:28 +0200 Subject: [Cython] weird declarations in fused types C code In-Reply-To: References: <4FACB3D2.3090606@behnel.de> Message-ID: <4FACF638.6050207@behnel.de> mark florisson, 11.05.2012 13:00: > On 11 May 2012 11:54, mark florisson wrote: >> On 11 May 2012 11:44, mark florisson wrote: >>> On 11 May 2012 07:38, Stefan Behnel wrote: >>>> while trying to replace the "import sys; if sys.version_info >= (3,0)" in >>>> the fused types dispatch code by the more straightforward "if >>>> PY_MAJOR_VERSION >= 3" (before I came to think that this particular case >>>> only guards useless code that does the wrong thing), >>> >>> Yes, you made that plenty clear, sorry for thinking in terms of Python >>> code. For the record, it does do the right thing. >>> >>>> I noticed that the >>>> code generates a declaration of PyErr_Clear() into the outside environment. >>>> When used in cdef classes, this leads to an external method being declared >>>> in the class, essentially like this: >>>> >>>> cdef class MyClass: >>>> cdef extern from *: >>>> void PyErr_Clear() >>>> >>>> Surprisingly enough, this actually works. Cython assigns the real C-API >>>> function pointer to it during type initialisation and even calls the >>>> function directly (instead of going through the vtab) when used. A rather >>>> curious feature that I would never have thought of. >>> >>> Yes, normally the parser catches that. >>> >>>> Anyway, this side effect is obviously a bug in the fused types dispatch, >>>> but I don't have a good idea on how to fix it. I'm sure Mark put some >>>> thought into this while trying hard to make it work and just didn't notice >>>> the impact on type namespaces. >>> >>> I am aware of this behaviour; the thing is that the dispatcher >>> function needs to be analyzed in the right context in order to >>> generate an appropriate function or method in case of a cdef class >>> (which are different from methods in normal classes even when >>> synthesized). I thought about splitting the declarations from the >>> actual function, and analyzing that in the module scope. Perhaps with >>> some name mangling this can avoid names being accidentally available >>> in user code.
I don't recall if I have tried that already, but I'll >>> give it another try. >> >> Ah, I see I already split them; all that is needed is to put it in the >> global scope now :) > > https://github.com/markflorisson88/cython/commit/3500fcd01ce6e68e76fcbabfe009eb273d7972fb Ok, sure, works for me. Stefan From d.s.seljebotn at astro.uio.no Fri May 11 15:25:28 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 11 May 2012 15:25:28 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions Message-ID: <4FAD1348.5010608@astro.uio.no> This comes from a refactor of the work on CEP 1000: A PEP proposal, with a hack for use in current Python versions and in the case of PEP rejection, that allows 3rd party libraries to agree on extensions to PyTypeObject. http://wiki.cython.org/enhancements/cep1001 If this makes it as a PEP, I don't think we need to think about having CEP 1000 accepted as a PEP. Comments? Dag From robertwb at gmail.com Fri May 11 17:48:52 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Fri, 11 May 2012 08:48:52 -0700 Subject: [Cython] CF based type inference In-Reply-To: <4FAA0D0F.9090508@behnel.de> References: <4FA91079.5090503@behnel.de> <4FAA0D0F.9090508@behnel.de> Message-ID: On Tue, May 8, 2012 at 11:22 PM, Stefan Behnel wrote: > Robert Bradshaw, 09.05.2012 00:12: >> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote: >>> 2012/5/8 Stefan Behnel: >>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>> this will enable us to properly infer this: >>>> >>>> def partial_validity(): >>>> """ >>>> >>> partial_validity() >>>> ('Python object', 'double', 'str object') >>>> """ >>>> a = 1.0 >>>> b = a + 2 # definitely double >>>> a = 'test' >>>> c = a + 'toast' # definitely str >>>> return typeof(a), typeof(b), typeof(c) >>>> >>>> I think what is mainly needed for this is that a NameNode with an >>>> undeclared type should not report its own entry as dependency but that of >>>> its own cf_assignments. Would this work? >>>> >>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>>> >>> >>> Yeah, that might work. The other way to go is to split entries: >>> >>> def partial_validity(): >>> """ >>> >>> partial_validity() >>> ('str object', 'double', 'str object') >>> """ >>> a_1 = 1.0 >>> b = a_1 + 2 # definitely double >>> a_2 = 'test' >>> c = a_2 + 'toast' # definitely str >>> return typeof(a_2), typeof(b), typeof(c) >>> >>> And this should work better because it allows us to infer a_1 as a double >>> and a_2 as a string. >> >> This already works, right? > > It would work if it was implemented. *wink* Well, we don't infer str, but that's a separate issue from control flow.
- Robert From stefan_ml at behnel.de Fri May 11 17:56:04 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 11 May 2012 17:56:04 +0200 Subject: [Cython] CF based type inference In-Reply-To: References: <4FA91079.5090503@behnel.de> <4FAA0D0F.9090508@behnel.de> Message-ID: <4FAD3694.1020105@behnel.de> Robert Bradshaw, 11.05.2012 17:48: > On Tue, May 8, 2012 at 11:22 PM, Stefan Behnel wrote: >> Robert Bradshaw, 09.05.2012 00:12: >>> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote: >>>> 2012/5/8 Stefan Behnel: >>>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>>> this will enable us to properly infer this: >>>>> >>>>> def partial_validity(): >>>>> """ >>>>> >>> partial_validity() >>>>> ('Python object', 'double', 'str object') >>>>> """ >>>>> a = 1.0 >>>>> b = a + 2 # definitely double >>>>> a = 'test' >>>>> c = a + 'toast' # definitely str >>>>> return typeof(a), typeof(b), typeof(c) >>>>> >>>>> I think what is mainly needed for this is that a NameNode with an >>>>> undeclared type should not report its own entry as dependency but that of >>>>> its own cf_assignments. Would this work? >>>>> >>>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>>>> >>>> >>>> Yeah, that might work. The other way to go is to split entries: >>>> >>>> def partial_validity(): >>>> """ >>>> >>> partial_validity() >>>> ('str object', 'double', 'str object') >>>> """ >>>> a_1 = 1.0 >>>> b = a_1 + 2 # definitely double >>>> a_2 = 'test' >>>> c = a_2 + 'toast' # definitely str >>>> return typeof(a_2), typeof(b), typeof(c) >>>> >>>> And this should work better because it allows us to infer a_1 as a double >>>> and a_2 as a string. >>> >>> This already works, right? >> >> It would work if it was implemented. *wink* > > Well, we don't infer str Yes we do, there are even some optimisations for str. It's well defined for both Py2 and Py3, just not the same on both, so the final code to use for them is C compile-time dependent. I meant to say that entry splitting isn't implemented. Stefan From robertwb at gmail.com Fri May 11 17:58:34 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Fri, 11 May 2012 08:58:34 -0700 Subject: [Cython] CF based type inference In-Reply-To: <4FAD3694.1020105@behnel.de> References: <4FA91079.5090503@behnel.de> <4FAA0D0F.9090508@behnel.de> <4FAD3694.1020105@behnel.de> Message-ID: On Fri, May 11, 2012 at 8:56 AM, Stefan Behnel wrote: > Robert Bradshaw, 11.05.2012 17:48: >> On Tue, May 8, 2012 at 11:22 PM, Stefan Behnel wrote: >>> Robert Bradshaw, 09.05.2012 00:12: >>>> On Tue, May 8, 2012 at 6:47 AM, Vitja Makarov wrote: >>>>> 2012/5/8 Stefan Behnel: >>>>>> Vitja has rebased the type inference on the control flow, so I wonder if >>>>>> this will enable us to properly infer this: >>>>>> >>>>>> def partial_validity(): >>>>>> """ >>>>>> >>> partial_validity() >>>>>> ('Python object', 'double', 'str object') >>>>>> """ >>>>>> a = 1.0 >>>>>> b = a + 2 # definitely double >>>>>> a = 'test' >>>>>> c = a + 'toast' # definitely str >>>>>> return typeof(a), typeof(b), typeof(c) >>>>>> >>>>>> I think what is mainly needed for this is that a NameNode with an >>>>>> undeclared type should not report its own entry as dependency but that of >>>>>> its own cf_assignments. Would this work? >>>>>> >>>>>> (Haven't got the time to try it out right now, so I'm dumping it here.) >>>>>> >>>>> >>>>> Yeah, that might work. The other way to go is to split entries: >>>>> >>>>> def partial_validity(): >>>>> """ >>>>>
>>> partial_validity() >>>>> ('str object', 'double', 'str object') >>>>> """ >>>>> a_1 = 1.0 >>>>> b = a_1 + 2 # definitely double >>>>> a_2 = 'test' >>>>> c = a_2 + 'toast' # definitely str >>>>> return typeof(a_2), typeof(b), typeof(c) >>>>> >>>>> And this should work better because it allows us to infer a_1 as a double >>>>> and a_2 as a string. >>>> >>>> This already works, right? >>> >>> It would work if it was implemented. *wink* >> >> Well, we don't infer str > > Yes we do, there are even some optimisations for str. It's well defined for > both Py2 and Py3, just not the same on both, so the final code to use for > them is C compile-time dependent. > > I meant to say that entry splitting isn't implemented. Yeah, that isn't implemented yet. - Robert From stefan_ml at behnel.de Sat May 12 09:44:41 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 12 May 2012 09:44:41 +0200 Subject: [Cython] Cython project homepage update Message-ID: <4FAE14E9.3080902@behnel.de> Hi, I added a "power point" list section to the top of the Cython homepage that aims to push readers into the main selling points of Cython. Please take a look and tell me if I missed anything. http://cython.org/ While checking the links, I just noticed that the NumPy wiki section on extending and integrating NumPy points to Cython like this: """ Pyrex: Pyrex lets you write code that mixes Python and C data types any way you want, and compiles it into a C extension for Python. See also Cython. """ http://www.scipy.org/Topical_Software#head-7153b42ac4ea517c7d99ec4f4453555b2302a1f8 I don't have an account there, but it would be worth changing the order of the names. I don't think there are many people who use NumPy together with Pyrex these days. Stefan From njs at pobox.com Sat May 12 20:44:16 2012 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 12 May 2012 19:44:16 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FAD1348.5010608@astro.uio.no> References: <4FAD1348.5010608@astro.uio.no> Message-ID: On Fri, May 11, 2012 at 2:25 PM, Dag Sverre Seljebotn wrote: > This comes from a refactor of the work on CEP 1000: A PEP proposal, with a > hack for use in current Python versions and in the case of PEP rejection, > that allows 3rd party libraries to agree on extensions to PyTypeObject. > > http://wiki.cython.org/enhancements/cep1001 > > If this makes it as a PEP, I don't think we need to think about having CEP > 1000 accepted as a PEP. > > Comments? There should probably be some discussion of memory management for the tpe_data pointers. (I assume it's "guaranteed to be valid for as long as the associated PyTypeObject, and the PyTypeObject is responsible for making sure any necessary cleanup happens if it gets deallocated", but a note to this effect would be good.) What happens if I want to inherit from PyTypeObject (a "metaclass") and also implement this interface? Is it possible? What if I want to inherit from an existing subclass of PyTypeObject and add on this interface? I don't know enough gnarly details about how new-style classes are implemented to tell. Would it make sense to make this memory-layout-equivalent to a PyTypeObject subclass with extra fields?
-- Nathaniel From d.s.seljebotn at astro.uio.no Sun May 13 21:35:35 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 13 May 2012 21:35:35 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> Message-ID: <4FB00D07.7080105@astro.uio.no> On 05/12/2012 08:44 PM, Nathaniel Smith wrote: > On Fri, May 11, 2012 at 2:25 PM, Dag Sverre Seljebotn > wrote: >> This comes from a refactor of the work on CEP 1000: A PEP proposal, with a >> hack for use in current Python versions and in the case of PEP rejection, >> that allows 3rd party libraries to agree on extensions to PyTypeObject. >> >> http://wiki.cython.org/enhancements/cep1001 >> >> If this makes it as a PEP, I don't think we need to think about having CEP >> 1000 accepted as a PEP. >> >> Comments? > > There should probably be some discussion of memory management for the > tpe_data pointers. (I assume it's "guaranteed to be valid for as long > as the associated PyTypeObject, and the PyTypeObject is responsible > for making sure any necessary cleanup happens if it gets deallocated", > but a note to this effect would be good.) > > What happens if I want to inherit from PyTypeObject (a "metaclass") > and also implement this interface? Is it possible? What if I want to > inherit from an existing subclass of PyTypeObject and add on this > interface? I don't know enough gnarly details about how new-style > classes are implemented to tell. Would it make sense to make this > memory-layout-equivalent to a PyTypeObject subclass with extra fields? Hmm. You know what -- this whole thing could probably be a metaclass. Except I think a PyObject_TypeCheck on the type would be a bit more expensive than just checking a flag. I think I like having a flag better... The point of supporting objects with a metaclass is a good one. I don't know enough details either. I wonder if the ob_size field could save us; basically, access extra information at offset (ob_size != 0) ? ob_size : sizeof(PyTypeObject); I think that also flags that the type object is allocated on the heap? But at least it allows a way out if you want to use a metaclass (allocate it on the heap; or perhaps give it a very high refcount). But I didn't check this in detail yet. Dag From d.s.seljebotn at astro.uio.no Sun May 13 21:37:22 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 13 May 2012 21:37:22 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FB00D07.7080105@astro.uio.no> References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> Message-ID: <4FB00D72.7040509@astro.uio.no> On 05/13/2012 09:35 PM, Dag Sverre Seljebotn wrote: > On 05/12/2012 08:44 PM, Nathaniel Smith wrote: >> On Fri, May 11, 2012 at 2:25 PM, Dag Sverre Seljebotn >> wrote: >>> This comes from a refactor of the work on CEP 1000: A PEP proposal, >>> with a >>> hack for use in current Python versions and in the case of PEP >>> rejection, >>> that allows 3rd party libraries to agree on extensions to PyTypeObject. >>> >>> http://wiki.cython.org/enhancements/cep1001 >>> >>> If this makes it as a PEP, I don't think we need to think about >>> having CEP >>> 1000 accepted as a PEP. >>> >>> Comments? >> >> There should probably be some discussion of memory management for the >> tpe_data pointers.
(I assume it's "guaranteed to be valid for as long >> as the associated PyTypeObject, and the PyTypeObject is responsible >> for making sure any necessary cleanup happens if it gets deallocated", >> but a note to this effect would be good.) >> >> What happens if I want to inherit from PyTypeObject (a "metaclass") >> and also implement this interface? Is it possible? What if I want to >> inherit from an existing subclass of PyTypeObject and add on this >> interface? I don't know enough gnarly details about how new-style >> classes are implemented to tell. Would it make sense to make this >> memory-layout-equivalent to a PyTypeObject subclass with extra fields? > > Hmm. You know what -- this whole thing could probably be a metaclass. > Except I think a PyObject_TypeCheck on the type would be a bit more > expensive than just checking a flag. I think I like having a flag better... > > The point of supporting objects with a metaclass is a good one. I don't > know enough details either. I wonder if the ob_size field could save us; > basically, access extra information at offset > > (ob_size != 0) ? ob_size : sizeof(PyTypeObject); Ehrm, presumably the size information must include the size of the extra payload, so this computation must be a bit different... Anyway, thanks for the heads up, this seems to need a bit more work. Input from somebody more familiar with this corner of the CPython API very welcome. Dag > > I think that also flags that the type object is allocated on the heap? But > at least it allows a way out if you want to use a metaclass (allocate it > on the heap; or perhaps give it a very high refcount). > > But I didn't check this in detail yet. > > Dag From stefan_ml at behnel.de Mon May 14 13:34:41 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 May 2012 13:34:41 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FB00D72.7040509@astro.uio.no> References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> Message-ID: <4FB0EDD1.2010206@behnel.de> Dag Sverre Seljebotn, 13.05.2012 21:37: > Anyway, thanks for the heads up, this seems to need a bit more work. Input > from somebody more familiar with this corner of the CPython API very welcome. Wouldn't you consider python-dev an appropriate place to discuss this? Stefan From njs at pobox.com Mon May 14 14:29:06 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 14 May 2012 13:29:06 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FB00D07.7080105@astro.uio.no> References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> Message-ID: On Sun, May 13, 2012 at 8:35 PM, Dag Sverre Seljebotn wrote: > On 05/12/2012 08:44 PM, Nathaniel Smith wrote: >> On Fri, May 11, 2012 at 2:25 PM, Dag Sverre Seljebotn >> wrote: >>> This comes from a refactor of the work on CEP 1000: A PEP proposal, with >>> a >>> hack for use in current Python versions and in the case of PEP rejection, >>> that allows 3rd party libraries to agree on extensions to PyTypeObject. >>> >>> http://wiki.cython.org/enhancements/cep1001 >>> >>> If this makes it as a PEP, I don't think we need to think about having >>> CEP >>> 1000 accepted as a PEP. >>> >>> Comments? >> >> There should probably be some discussion of memory management for the >> tpe_data pointers.
(I assume it's "guaranteed to be valid for as long >> as the associated PyTypeObject, and the PyTypeObject is responsible >> for making sure any necessary cleanup happens if it gets deallocated", >> but a note to this effect would be good.) >> >> What happens if I want to inherit from PyTypeObject (a "metaclass") >> and also implement this interface? Is it possible? What if I want to >> inherit from an existing subclass of PyTypeObject and add on this >> interface? I don't know enough gnarly details about how new-style >> classes are implemented to tell. Would it make sense to make this >> memory-layout-equivalent to a PyTypeObject subclass with extra fields? > > Hmm. You know what -- this whole thing could probably be a metaclass. Well, yes, conceptually, that's exactly what it is -- the question is how and whether it relates to the Python metaclass machinery, since you are speed freaks :-). > Except > I think a PyObject_TypeCheck on the type would be a bit more expensive than > just checking a flag. I think I like having a flag better... A number of existing flags are actually used exactly to make type-checking faster for some key types (Py_TPFLAGS_INT_SUBCLASS, etc.). I guess doing it the same way would put the flag in obj->ob_type->ob_type->tp_flags, though, instead of obj->ob_type->tp_flags. - N From d.s.seljebotn at astro.uio.no Mon May 14 16:23:01 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 14 May 2012 16:23:01 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FB0EDD1.2010206@behnel.de> References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> Message-ID: <4FB11545.8070803@astro.uio.no> On 05/14/2012 01:34 PM, Stefan Behnel wrote: > Dag Sverre Seljebotn, 13.05.2012 21:37: >> Anyway, thanks for the heads up, this seems to need a bit more work. Input >> from somebody more familiar with this corner of the CPython API very welcome. > > Wouldn't you consider python-dev an appropriate place to discuss this? Propose something for a PEP that's primarily useful to Cython without even understanding the full implications myself first? I'd rather try to not annoy people; I figured the time I have the CPython patches ready and tested is the time I ping python-dev... Dag From njs at pobox.com Mon May 14 19:05:55 2012 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 14 May 2012 18:05:55 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FB11545.8070803@astro.uio.no> References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> Message-ID: On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn wrote: > On 05/14/2012 01:34 PM, Stefan Behnel wrote: >> Dag Sverre Seljebotn, 13.05.2012 21:37: >>> Anyway, thanks for the heads up, this seems to need a bit more work. >>> Input >>> from somebody more familiar with this corner of the CPython API very >>> welcome. >> >> Wouldn't you consider python-dev an appropriate place to discuss this? > > Propose something for a PEP that's primarily useful to Cython without even > understanding the full implications myself first? > > I'd rather try to not annoy people; I figured the time I have the CPython > patches ready and tested is the time I ping python-dev... If you want to eventually propose a PEP, you really really really should be talking to them before.
Otherwise you'll get everything worked out just the way you want and they'll be like "what is this? re-do it all totally differently". And they might be wrong, but then you have to reconstruct for them the whole debate and reasoning process and implicit assumptions that you're making and not realizing you need to articulate, so easier to just get all the interested people at the table to begin with. And they might be right, in which case you just wasted however much time digging yourself into a hole and reverse-engineering bits of CPython. Don't propose it as a PEP, just say "hey, we have this problem and these constraints, and we're thinking we could solve them by something like this; but of course that has these limitations, so I dunno. What do you think?" And expect to spend some time figuring out what your requirements actually are (even if you think you know already, see above about implicit assumptions). -- Nathaniel From stefan_ml at behnel.de Mon May 14 19:21:39 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 14 May 2012 19:21:39 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> Message-ID: <4FB13F23.30007@behnel.de> Nathaniel Smith, 14.05.2012 19:05: > On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn wrote: >> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>> Input >>>> from somebody more familiar with this corner of the CPython API very >>>> welcome. >>> >>> Wouldn't you consider python-dev an appropriate place to discuss this? >> >> Propose something for a PEP that's primarily useful to Cython without even >> understanding the full implications myself first? >> >> I'd rather try to not annoy people; I figured the time I have the CPython >> patches ready and tested is the time I ping python-dev... > > If you want to eventually propose a PEP, you really really really > should be talking to them before. Otherwise you'll get everything > worked out just the way you want and they'll be like "what is this? > re-do it all totally differently". And they might be wrong, but then > you have to reconstruct for them the whole debate and reasoning > process and implicit assumptions that you're making and not realizing > you need to articulate, so easier to just get all the interested > people at the table to begin with. And they might be right, in which > case you just wasted however much time digging yourself into a hole > and reverse-engineering bits of CPython. > > Don't propose it as a PEP, just say "hey, we have this problem and > these constraints, and we're thinking we could solve them by something > like this; but of course that has these limitations, so I dunno. What > do you think?" And expect to spend some time figuring out what your > requirements actually are (even if you think you know already, see > above about implicit assumptions). 
+1 Stefan From robertwb at gmail.com Mon May 14 20:01:36 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Mon, 14 May 2012 11:01:36 -0700 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> Message-ID: On Mon, May 14, 2012 at 10:05 AM, Nathaniel Smith wrote: > On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn > wrote: >> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>> >>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>> >>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>> Input >>>> from somebody more familiar with this corner of the CPython API very >>>> welcome. >>> >>> >>> Wouldn't you consider python-dev an appropriate place to discuss this? >> >> >> Propose something for a PEP that's primarily useful to Cython without even >> understanding the full implications myself first? >> >> I'd rather try to not annoy people; I figured the time I have the CPython >> patches ready and tested is the time I ping python-dev... > > If you want to eventually propose a PEP, you really really really > should be talking to them before. Otherwise you'll get everything > worked out just the way you want and they'll be like "what is this? > re-do it all totally differently". And they might be wrong, but then > you have to reconstruct for them the whole debate and reasoning > process and implicit assumptions that you're making and not realizing > you need to articulate, so easier to just get all the interested > people at the table to begin with. And they might be right, in which > case you just wasted however much time digging yourself into a hole > and reverse-engineering bits of CPython. > > Don't propose it as a PEP, just say "hey, we have this problem and > these constraints, and we're thinking we could solve them by something > like this; but of course that has these limitations, so I dunno. What > do you think?" And expect to spend some time figuring out what your > requirements actually are (even if you think you know already, see > above about implicit assumptions). I personally think it's a great idea to bounce ideas around here first before going to python-dev, especially as a PEP wouldn't get in until 3.3 or 3.4 at best, and we want to do something with 2.4+ in the near term. That doesn't preclude presenting the problem and proposed solution on python-dev as well, but the purpose of this thread seems to be to think about it some, including how we're going to support things in the short term, not nail down an exact PEP for Python to accept at face value. I think we're at a point we can ping python-dev now though. To be more futureproof, we'd want an offset to PyExtendedTypeObject rather than assuming it exists at the end of PyTypeObject, but I don't see a good place to store this information, so assuming it's right there based on a bit in the flag seems a reasonable way forward. 
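To sketch what I mean on the consumer side (purely hypothetical code -- the flag bit, the PyExtendedTypeObject layout and the tpe_* names are placeholders for whatever the CEP/PEP would define, not an existing API):

    # speculative Cython sketch of checking the flag and scanning the
    # extension entries; every name below is hypothetical
    cdef extern from *:
        ctypedef struct PyTypeObject:
            unsigned long tp_flags

        ctypedef struct PyTypeObjectExtensionEntry:
            unsigned long tpe_id   # identifies one agreed-upon interface
            void *tpe_data         # interface data, owned by the type object

        ctypedef struct PyExtendedTypeObject:
            PyTypeObject tp_base          # the ordinary type object...
            size_t tpe_count              # ...with the entries right after it
            PyTypeObjectExtensionEntry *tpe_entries

    DEF PY_TPFLAGS_EXTENDED = 1 << 22    # the bit we hope to get reserved

    cdef void *lookup_interface(type cls, unsigned long interface_id):
        # cheap flag test first, then a linear scan of the entries
        cdef PyExtendedTypeObject *ext
        cdef size_t i
        if (<PyTypeObject*><void*>cls).tp_flags & PY_TPFLAGS_EXTENDED:
            ext = <PyExtendedTypeObject*><void*>cls
            for i in range(ext.tpe_count):
                if ext.tpe_entries[i].tpe_id == interface_id:
                    return ext.tpe_entries[i].tpe_data
        return NULL

The offset variant would replace the fixed tp_base prefix with a stored offset from the start of the type object, at the cost of finding a place to keep that offset.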
- Robert From d.s.seljebotn at astro.uio.no Wed May 16 13:20:52 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 16 May 2012 13:20:52 +0200 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> Message-ID: <4FB38D94.9000200@astro.uio.no> On 05/14/2012 08:01 PM, Robert Bradshaw wrote: > On Mon, May 14, 2012 at 10:05 AM, Nathaniel Smith wrote: >> On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn >> wrote: >>> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>>> Input >>>>> from somebody more familiar with this corner of the CPython API very >>>>> welcome. >>>> >>>> Wouldn't you consider python-dev an appropriate place to discuss this? >>> >>> Propose something for a PEP that's primarily useful to Cython without even >>> understanding the full implications myself first? >>> >>> I'd rather try to not annoy people; I figured the time I have the CPython >>> patches ready and tested is the time I ping python-dev... >> >> If you want to eventually propose a PEP, you really really really >> should be talking to them before. Otherwise you'll get everything >> worked out just the way you want and they'll be like "what is this? >> re-do it all totally differently". And they might be wrong, but then >> you have to reconstruct for them the whole debate and reasoning >> process and implicit assumptions that you're making and not realizing >> you need to articulate, so easier to just get all the interested >> people at the table to begin with. And they might be right, in which >> case you just wasted however much time digging yourself into a hole >> and reverse-engineering bits of CPython. >> >> Don't propose it as a PEP, just say "hey, we have this problem and >> these constraints, and we're thinking we could solve them by something >> like this; but of course that has these limitations, so I dunno. What >> do you think?" And expect to spend some time figuring out what your >> requirements actually are (even if you think you know already, see >> above about implicit assumptions). > > I personally think it's a great idea to bounce ideas around here first > before going to python-dev, especially as a PEP wouldn't get in until > 3.3 or 3.4 at best, and we want to do something with 2.4+ in the near > term. That doesn't preclude presenting the problem and proposed > solution on python-dev as well, but the purpose of this thread seems > to be to think about it some, including how we're going to support > things in the short term, not nail down an exact PEP for Python to > accept at face value. I think we're at a point we can ping python-dev > now though. > > To be more futureproof, we'd want an offset to PyExtendedTypeObject > rather than assuming it exists at the end of PyTypeObject, but I don't > see a good place to store this information, so assuming it's right > there based on a bit in the flag seems a reasonable way forward. So I posted on python-dev.
There's a lot of "You don't need to do this"; but here's an idea I got that's inspired by that discussion: We could use tp_getattr (and call it directly), but pass in an interned char* which Python code can never get hold of, and then that could return a void* (cast through a PyObject*, but it would not be one). Another alternative is to somehow handshake on a metaclass implementation; and different Cython modules/NumPy/SciPy etc. would inherit from it. But apart from that handshaking, and having to use a metaclass and make everything more complicated for C implementors of the spec, it gives you a more expensive check than just checking a flag. I like a flag bit much better. I still hope that somebody more understanding comes along, argues our case, and gets bit 22 reserved for our purpose :-) Dag From markflorisson88 at gmail.com Wed May 16 15:03:56 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 14:03:56 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: <4FB38D94.9000200@astro.uio.no> References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> <4FB38D94.9000200@astro.uio.no> Message-ID: On 16 May 2012 12:20, Dag Sverre Seljebotn wrote: > On 05/14/2012 08:01 PM, Robert Bradshaw wrote: >> On Mon, May 14, 2012 at 10:05 AM, Nathaniel Smith wrote: >>> On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn >>> wrote: >>>> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>>>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>>>> Input >>>>>> from somebody more familiar with this corner of the CPython API very >>>>>> welcome. >>>>> >>>>> Wouldn't you consider python-dev an appropriate place to discuss this? >>>> >>>> Propose something for a PEP that's primarily useful to Cython without >>>> even >>>> understanding the full implications myself first? >>>> >>>> I'd rather try to not annoy people; I figured the time I have the >>>> CPython >>>> patches ready and tested is the time I ping python-dev... >>> >>> If you want to eventually propose a PEP, you really really really >>> should be talking to them before. Otherwise you'll get everything >>> worked out just the way you want and they'll be like "what is this? >>> re-do it all totally differently". And they might be wrong, but then >>> you have to reconstruct for them the whole debate and reasoning >>> process and implicit assumptions that you're making and not realizing >>> you need to articulate, so easier to just get all the interested >>> people at the table to begin with. And they might be right, in which >>> case you just wasted however much time digging yourself into a hole >>> and reverse-engineering bits of CPython. >>> >>> Don't propose it as a PEP, just say "hey, we have this problem and >>> these constraints, and we're thinking we could solve them by something >>> like this; but of course that has these limitations, so I dunno. What >>> do you think?" And expect to spend some time figuring out what your >>> requirements actually are (even if you think you know already, see >>> above about implicit assumptions).
>> >> I personally think it's a great idea to bounce ideas around here first >> before going to python-dev, especially as a PEP wouldn't get in until >> 3.3 or 3.4 at best, and we want to do something with 2.4+ in the near >> term. That doesn't preclude presenting the problem and proposed >> solution on python-dev as well, but the purpose of this thread seems >> to be to think about it some, including how we're going to support >> things in the short term, not nail down an exact PEP for Python to >> accept at face value. I think we're at a point we can ping python-dev >> now though. >> >> To be more futureproof, we'd want an offset to PyExtendedTypeObject >> rather than assuming it exists at the end of PyTypeObject, but I don't >> see a good place to store this information, so assuming it's right >> there based on a bit in the flag seems a reasonable way forward. > > So I posted on python-dev. > > There's a lot of "You don't need to do this"; but here's an idea I got > that's inspired by that discussion: We could use tp_getattr (and call it > directly), but pass in an interned char* which Python code can never get > hold of, and then that could return a void* (cast through a PyObject*, but > it would not be one). Would we want to support monkey patching these interfaces? If so, this mechanism would be a bit harder than reallocing a pointer, although I guess a closure chain of tp_getattr functions would work :). But I think we want GIL-less access anyway, right? That means neither approach would work unsynchronized. > Another alternative is to somehow handshake on a metaclass implementation; > and different Cython modules/NumPy/SciPy etc. would inherit from it. But > apart from that handshaking, and having to use a metaclass and make > everything more complicated for C implementors of the spec, it gives you a > more expensive check than just checking a flag. I agree that the flag is much easier; if you have a metaclass, the question is again in which module to store it to get a cross-module working typecheck. On the other hand, if the header file provides an easy way to import the metaclass (monkeypatched on some module or living in its own module), and to allocate type instances given a (statically allocated) type, that would be more future-proof and elegant. I don't think it would be much slower; it's doing 'o->ob_type->tp_flags & MYFLAG' vs 'o->ob_type->ob_type == MYMETA'. I think for the bit flag the interface won't span subclasses, whereas the metaclass approach would allow subclassing but not subclassing of the metaclass itself (unless you instantiate the metaclass through itself, and check the interface against the metaclass, which only means the metaclass of the metaclass isn't subclassable :) (this would also mean a more expensive check)). I think if we implement CEP 1000, we will at that time have a generic way to optimize and hoist none checking/bounds checking etc., which will also allow us to optimize signature matching, which would mean the matching and unpacking isn't as performance critical. JIT compilers could take a similar approach. > I like a flag bit much better.
I still hope that somebody more understanding > comes along, argues our case, and gets bit 22 reserved for our purpose :-) > > Dag > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 16 15:25:17 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 14:25:17 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> <4FB38D94.9000200@astro.uio.no> Message-ID: On 16 May 2012 14:03, mark florisson wrote: > On 16 May 2012 12:20, Dag Sverre Seljebotn wrote: >> On 05/14/2012 08:01 PM, Robert Bradshaw wrote: >>> On Mon, May 14, 2012 at 10:05 AM, Nathaniel Smith wrote: >>>> On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn >>>> wrote: >>>>> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>>>>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>>>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>>>>> Input >>>>>>> from somebody more familiar with this corner of the CPython API very >>>>>>> welcome. >>>>>> >>>>>> Wouldn't you consider python-dev an appropriate place to discuss this? >>>>> >>>>> Propose something for a PEP that's primarily useful to Cython without >>>>> even >>>>> understanding the full implications myself first? >>>>> >>>>> I'd rather try to not annoy people; I figured the time I have the >>>>> CPython >>>>> patches ready and tested is the time I ping python-dev... >>>> >>>> If you want to eventually propose a PEP, you really really really >>>> should be talking to them before. Otherwise you'll get everything >>>> worked out just the way you want and they'll be like "what is this? >>>> re-do it all totally differently". And they might be wrong, but then >>>> you have to reconstruct for them the whole debate and reasoning >>>> process and implicit assumptions that you're making and not realizing >>>> you need to articulate, so easier to just get all the interested >>>> people at the table to begin with. And they might be right, in which >>>> case you just wasted however much time digging yourself into a hole >>>> and reverse-engineering bits of CPython. >>>> >>>> Don't propose it as a PEP, just say "hey, we have this problem and >>>> these constraints, and we're thinking we could solve them by something >>>> like this; but of course that has these limitations, so I dunno. What >>>> do you think?" And expect to spend some time figuring out what your >>>> requirements actually are (even if you think you know already, see >>>> above about implicit assumptions). >>> >>> I personally think it's a great idea to bounce ideas around here first >>> before going to python-dev, especially as a PEP wouldn't get in until >>> 3.3 or 3.4 at best, and we want to do something with 2.4+ in the near >>> term. That doesn't preclude presenting the problem and proposed >>> solution on python-dev as well, but the purpose of this thread seems >>> to be to think about it some, including how we're going to support >>> things in the short term, not nail down an exact PEP for Python to >>> accept at face value. I think we're at a point we can ping python-dev >>> now though.
>>> >>> To be more futureproof, we'd want an offset to PyExtendedTypeObject >>> rather than assuming it exists at the end of PyTypeObject, but I don't >>> see a good place to store this information, so assuming it's right >>> there based on a bit in the flag seems a reasonable way forward. >> >> So I posted on python-dev. >> >> There's a lot of "You don't need to do this"; but here's an idea I got >> that's inspired by that discussion: We could use tp_getattr (and call it >> directly), but pass in an interned char* which Python code can never get >> hold of, and then that could return a void* (cast through a PyObject*, but >> it would not be one). > > Would we want to support monkey patching these interfaces? If so, this > mechanism would be a bit harder than reallocing a pointer, although I > guess a closure chain of tp_getattr functions would work :). But I > think we want GIL-less access anyway, right? That means neither > approach would work unsynchronized. > >> Another alternative is to somehow handshake on a metaclass implementation; >> and different Cython modules/NumPy/SciPy etc. would inherit from it. But >> apart from that handshaking, and having to use a metaclass and make >> everything more complicated for C implementors of the spec, it gives you a >> more expensive check than just checking a flag. > > I agree that the flag is much easier; if you have a metaclass, the > question is again in which module to store it to get a cross-module > working typecheck. On the other hand, if the header file provides an > easy way to import the metaclass (monkeypatched on some module or > living in its own module), and to allocate type instances given a > (statically allocated) type, that would be more future-proof and > elegant. I don't think it would be much slower; it's doing > 'o->ob_type->tp_flags & MYFLAG' vs 'o->ob_type->ob_type == MYMETA'. Sorry, I think you mentioned something related in the CEP; I couldn't find it in the enhancement list (I should have followed your originally posted link :). So that's a good point, there are several issues: - subclasses of the type exposing the interface (I don't think that can be handled through tp_flags) - in the case of a metaclass approach, subclassing the metaclass itself means type instances will no longer match the MYMETA pointer - exposing interfaces on metaclasses (I think this one is listed in the CEP) I suppose the last case could either be disallowed, or one could provide a custom tp_alloc and store (a pointer to) extra information ahead of the object, taking care to precede any GC information. This is pretty hacky though. > I think for the bit flag the interface won't span subclasses, whereas > the metaclass approach would allow subclassing but not subclassing of > the metaclass itself (unless you instantiate the metaclass through > itself, and check the interface against the metaclass, which only > means the metaclass of the metaclass isn't subclassable :) (this would > also mean a more expensive check)). > > I think if we implement CEP 1000, we will at that time have a generic > way to optimize and hoist none checking/bounds checking etc., which > will also allow us to optimize signature matching, which would mean > the matching and unpacking isn't as performance critical. JIT > compilers could take a similar approach. >> I like a flag bit much better.
I still hope that somebody more understanding >> comes along, argues our case, and gets bit 22 reserved for our purpose :-) >> >> Dag >> >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 16 15:26:38 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 14:26:38 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> <4FB38D94.9000200@astro.uio.no> Message-ID: On 16 May 2012 14:25, mark florisson wrote: > On 16 May 2012 14:03, mark florisson wrote: >> On 16 May 2012 12:20, Dag Sverre Seljebotn wrote: >>> On 05/14/2012 08:01 PM, Robert Bradshaw wrote: >>>> >>>> On Mon, May 14, 2012 at 10:05 AM, Nathaniel Smith ?wrote: >>>>> >>>>> On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn >>>>> ?wrote: >>>>>> >>>>>> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>>>>>> >>>>>>> >>>>>>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>>>>>> >>>>>>>> >>>>>>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>>>>>> Input >>>>>>>> from somebody more familiar with this corner of the CPython API very >>>>>>>> welcome. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Wouldn't you consider python-dev an appropriate place to discuss this? >>>>>> >>>>>> >>>>>> >>>>>> Propose something for a PEP that's primarily useful to Cython without >>>>>> even >>>>>> understanding the full implications myself first? >>>>>> >>>>>> I'd rather try to not annoy people; I figured the time I have the >>>>>> CPython >>>>>> patches ready and tested is the time I ping python-dev... >>>>> >>>>> >>>>> If you want to eventually propose a PEP, you really really really >>>>> should be talking to them before. Otherwise you'll get everything >>>>> worked out just the way you want and they'll be like "what is this? >>>>> re-do it all totally differently". And they might be wrong, but then >>>>> you have to reconstruct for them the whole debate and reasoning >>>>> process and implicit assumptions that you're making and not realizing >>>>> you need to articulate, so easier to just get all the interested >>>>> people at the table to begin with. And they might be right, in which >>>>> case you just wasted however much time digging yourself into a hole >>>>> and reverse-engineering bits of CPython. >>>>> >>>>> Don't propose it as a PEP, just say "hey, we have this problem and >>>>> these constraints, and we're thinking we could solve them by something >>>>> like this; but of course that has these limitations, so I dunno. What >>>>> do you think?" And expect to spend some time figuring out what your >>>>> requirements actually are (even if you think you know already, see >>>>> above about implicit assumptions). >>>> >>>> >>>> I personally think it's a great idea to bounce ideas around here first >>>> before going to python-dev, especially as a PEP wouldn't get in until >>>> 3.3 or 3.4 at best, and we want to do something with 2.4+ in the near >>>> term. That doesn't preclude presenting the problem and proposed >>>> solution on python-dev as well, but the purpose of this thread seems >>>> to be to think about it some, including how we're going to support >>>> things in the short term, not nail down an exact PEP for Python to >>>> accept at face value. 
I think we're at a point we can ping python-dev >>>> now though. >>>> >>>> To be more futureproof, we'd want an offset to PyExtendedTypeObject >>>> rather than assuming it exists at the end of PyTypeObject, but I don't >>>> see a good place to store this information, so assuming it's right >>>> there based on a bit in the flag seems a reasonable way forward. >>> >>> >>> So I posted on python-dev. >>> >>> There's a lot of "You don't need to do this"; but here's an idea I got >>> that's inspired by that discussion: We could use tp_getattr (and call it >>> directly), but pass in an interned char* which Python code can never get >>> hold of, and then that could return a void* (casted through a PyObject*, but >>> it would not be). >> >> Would we want to support monkey patching these interfaces? If so, this >> mechanism would be a bit harder than reallocing a pointer, although I >> guess a closure chain of tp_getattr functions would work :). But I >> think we want GIL-less access anyway right, which means neither >> approach would work unsynchronized. >> >>> Another alternative is to somehow handshake on a metaclass implementation; >>> and different Cython modules/NumPy/SciPy etc. would inherit from it. But >>> apart from that handshaking, and having to use a metaclass and make >>> everything more complicated for C implementors of the spec, it gives you a >>> more expensive check than just checking a flag. >> >> I agree that the flag is much easier, if you have a metaclass the >> question is again in which module to store it to get a cross-module >> working typecheck. On the other hand, if the header file provides an >> easy way to import the metaclass (monkeypatched on some module or >> living in its own module), and to allocate type instances given a >> (statically allocated) type, that would be more future-proof and >> elegant. I don't think it would be much slower, it's doing >> 'o->ob_type->tp_flags & MYFLAG' vs 'o->ob_type->ob_type == MYMETA'. > > Sorry, I think you mentioned something in related in the CEP, I > couldn't find it from the enhancement list (I should have followed > your originally posted link :). So that's a good point, there are > several issues: > > ? ?- subclasses of the type exposing the interface (I don't think > that can be handled through tp_flags) > ? ?- in the case of a metaclass approach, subclassing the metaclass > itself means type instances will no longer match the MYMETA pointer (Which can obviously be handled through a full typecheck, but that is more expensive) > ? ?- exposing interfaces on metaclasses (I think this one is listed in the CEP) > > I suppose the last case could either be disallowed, or one could > provide a custom tp_alloc and store (a pointer to) extra information > ahead of the object, taking care to precede any GC information. This > is pretty hacky though. > >> I think for the bit flag the interface won't span subclasses, whereas >> the metaclass approach would allow subclassing but not subclassing of >> the metaclass itself (unless you instantiate the metaclass through >> itself, and check the interface against the metaclass, which only >> means the metaclass of the metaclass isn't subclassable :) (this would >> also mean a more expensive check)). >> >> I think if we implement CEP 1000, we will at that time have a generic >> way to optimize and hoist none checking/bounds checking etc, which >> will also allow us to optimize signature matching, which would mean >> the matching and unpacking isn't as performance critical. 
JIT >> compilers could take a similar approach. >> >>> I like a flag bit much better. I still hope that somebody more understanding >>> comes along, argues our case, and gets bit 22 reserved for our purpose :-) >>> >>> Dag >>> >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 16 15:36:39 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 14:36:39 +0100 Subject: [Cython] CEP 1001 - Custom PyTypeObject extensions In-Reply-To: References: <4FAD1348.5010608@astro.uio.no> <4FB00D07.7080105@astro.uio.no> <4FB00D72.7040509@astro.uio.no> <4FB0EDD1.2010206@behnel.de> <4FB11545.8070803@astro.uio.no> <4FB38D94.9000200@astro.uio.no> Message-ID: On 16 May 2012 14:25, mark florisson wrote: > On 16 May 2012 14:03, mark florisson wrote: >> On 16 May 2012 12:20, Dag Sverre Seljebotn wrote: >>> On 05/14/2012 08:01 PM, Robert Bradshaw wrote: >>>> >>>> On Mon, May 14, 2012 at 10:05 AM, Nathaniel Smith ?wrote: >>>>> >>>>> On Mon, May 14, 2012 at 3:23 PM, Dag Sverre Seljebotn >>>>> ?wrote: >>>>>> >>>>>> On 05/14/2012 01:34 PM, Stefan Behnel wrote: >>>>>>> >>>>>>> >>>>>>> Dag Sverre Seljebotn, 13.05.2012 21:37: >>>>>>>> >>>>>>>> >>>>>>>> Anyway, thanks for the heads up, this seems to need a bit more work. >>>>>>>> Input >>>>>>>> from somebody more familiar with this corner of the CPython API very >>>>>>>> welcome. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Wouldn't you consider python-dev an appropriate place to discuss this? >>>>>> >>>>>> >>>>>> >>>>>> Propose something for a PEP that's primarily useful to Cython without >>>>>> even >>>>>> understanding the full implications myself first? >>>>>> >>>>>> I'd rather try to not annoy people; I figured the time I have the >>>>>> CPython >>>>>> patches ready and tested is the time I ping python-dev... >>>>> >>>>> >>>>> If you want to eventually propose a PEP, you really really really >>>>> should be talking to them before. Otherwise you'll get everything >>>>> worked out just the way you want and they'll be like "what is this? >>>>> re-do it all totally differently". And they might be wrong, but then >>>>> you have to reconstruct for them the whole debate and reasoning >>>>> process and implicit assumptions that you're making and not realizing >>>>> you need to articulate, so easier to just get all the interested >>>>> people at the table to begin with. And they might be right, in which >>>>> case you just wasted however much time digging yourself into a hole >>>>> and reverse-engineering bits of CPython. >>>>> >>>>> Don't propose it as a PEP, just say "hey, we have this problem and >>>>> these constraints, and we're thinking we could solve them by something >>>>> like this; but of course that has these limitations, so I dunno. What >>>>> do you think?" And expect to spend some time figuring out what your >>>>> requirements actually are (even if you think you know already, see >>>>> above about implicit assumptions). >>>> >>>> >>>> I personally think it's a great idea to bounce ideas around here first >>>> before going to python-dev, especially as a PEP wouldn't get in until >>>> 3.3 or 3.4 at best, and we want to do something with 2.4+ in the near >>>> term. 
That doesn't preclude presenting the problem and proposed >>>> solution on python-dev as well, but the purpose of this thread seems >>>> to be to think about it some, including how we're going to support >>>> things in the short term, not nail down an exact PEP for Python to >>>> accept at face value. I think we're at a point we can ping python-dev >>>> now though. >>>> >>>> To be more futureproof, we'd want an offset to PyExtendedTypeObject >>>> rather than assuming it exists at the end of PyTypeObject, but I don't >>>> see a good place to store this information, so assuming it's right >>>> there based on a bit in the flag seems a reasonable way forward. >>> >>> >>> So I posted on python-dev. >>> >>> There's a lot of "You don't need to do this"; but here's an idea I got >>> that's inspired by that discussion: We could use tp_getattr (and call it >>> directly), but pass in an interned char* which Python code can never get >>> hold of, and then that could return a void* (casted through a PyObject*, but >>> it would not be). >> >> Would we want to support monkey patching these interfaces? If so, this >> mechanism would be a bit harder than reallocing a pointer, although I >> guess a closure chain of tp_getattr functions would work :). But I >> think we want GIL-less access anyway right, which means neither >> approach would work unsynchronized. >> >>> Another alternative is to somehow handshake on a metaclass implementation; >>> and different Cython modules/NumPy/SciPy etc. would inherit from it. But >>> apart from that handshaking, and having to use a metaclass and make >>> everything more complicated for C implementors of the spec, it gives you a >>> more expensive check than just checking a flag. >> >> I agree that the flag is much easier, if you have a metaclass the >> question is again in which module to store it to get a cross-module >> working typecheck. On the other hand, if the header file provides an >> easy way to import the metaclass (monkeypatched on some module or >> living in its own module), and to allocate type instances given a >> (statically allocated) type, that would be more future-proof and >> elegant. I don't think it would be much slower, it's doing >> 'o->ob_type->tp_flags & MYFLAG' vs 'o->ob_type->ob_type == MYMETA'. > > Sorry, I think you mentioned something in related in the CEP, I > couldn't find it from the enhancement list (I should have followed > your originally posted link :). So that's a good point, there are > several issues: > > ? ?- subclasses of the type exposing the interface (I don't think > that can be handled through tp_flags) > ? ?- in the case of a metaclass approach, subclassing the metaclass > itself means type instances will no longer match the MYMETA pointer > ? ?- exposing interfaces on metaclasses (I think this one is listed in the CEP) > > I suppose the last case could either be disallowed, or one could > provide a custom tp_alloc and store (a pointer to) extra information > ahead of the object, taking care to precede any GC information. This > is pretty hacky though. Hm, actually if you copy the type by value anyway, you can just modify the tp_basicsize of the copied value... Can you elaborate on the metaclass issue in the CEP? 
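For concreteness, the two checks being weighed in this sub-thread would look roughly like the following C sketch. All names here (MY_EXTENDED_TYPE_FLAG, MyMetaType) are placeholders; in particular, no tp_flags bit has actually been reserved for this:

    #include <Python.h>

    /* Hypothetical reserved bit and hypothetical shared metaclass. */
    #define MY_EXTENDED_TYPE_FLAG (1UL << 22)
    static PyTypeObject MyMetaType;

    /* flag variant: a single bit test on the object's type */
    static int has_interface_flag(PyObject *o)
    {
        return (Py_TYPE(o)->tp_flags & MY_EXTENDED_TYPE_FLAG) != 0;
    }

    /* metaclass variant: compare the type of the object's type */
    static int has_interface_meta(PyObject *o)
    {
        return Py_TYPE(Py_TYPE(o)) == &MyMetaType;
    }

Either test is only a couple of machine instructions; the real difference lies in how each behaves under subclassing, which is what the quoted passage below comes back to.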
>> I think for the bit flag the interface won't span subclasses, whereas >> the metaclass approach would allow subclassing but not subclassing of >> the metaclass itself (unless you instantiate the metaclass through >> itself, and check the interface against the metaclass, which only >> means the metaclass of the metaclass isn't subclassable :) (this would >> also mean a more expensive check)). >> >> I think if we implement CEP 1000, we will at that time have a generic >> way to optimize and hoist none checking/bounds checking etc, which >> will also allow us to optimize signature matching, which would mean >> the matching and unpacking isn't as performance critical. JIT >> compilers could take a similar approach. >> >>> I like a flag bit much better. I still hope that somebody more understanding >>> comes along, argues our case, and gets bit 22 reserved for our purpose :-) >>> >>> Dag >>> >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel From stefan_ml at behnel.de Wed May 16 18:34:39 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 16 May 2012 18:34:39 +0200 Subject: [Cython] [cython] Python array support (#113) In-Reply-To: References: Message-ID: <4FB3D71F.3080602@behnel.de> Andreas van Cranenburgh, 16.05.2012 18:15: > Any news on this? Let me know if there's anything I can do to help inclusion of this patch. Could someone please take over here? https://github.com/cython/cython/pull/113 I haven't merged this yet and won't have the time to do it soonish. What I'd like to see happen is to get the current header file replaced by utility code "somehow". Not sure how that "somehow" is going to work. Basically, if this can be solved, I'd love to have it in for 0.17. Otherwise, well, not ... Stefan From stefan_ml at behnel.de Wed May 16 21:15:58 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 16 May 2012 21:15:58 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <4FB3F2F8.8060207@v.loewis.de> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> Message-ID: <4FB3FCEE.5020405@behnel.de> "Martin v. L?wis", 16.05.2012 20:33: >> Does this use case make sense to everyone? >> >> The reason why we are discussing this on python-dev is that we are looking >> for a general way to expose these C level signatures within the Python >> ecosystem. And Dag's idea was to expose them as part of the type object, >> basically as an addition to the current Python level tp_call() slot. > > The use case makes sense, yet there is also a long-standing solution > already to expose APIs and function pointers: the capsule objects. > > If you want to avoid dictionary lookups on the server side, implement > tp_getattro, comparing addresses of interned strings. I think Martin has a point there. Why not just use a custom attribute on callables that hold a PyCapsule? Whenever we see inside of a Cython implemented function that an object variable that was retrieved from the outside, either as a function argument or as the result of a function call, is being called, we try to unpack a C function pointer from it on all assignments to the variable. If that works, we can scan for a suitable signature (either right away or lazily on first access) and cache that. On each subsequent call through that variable, the cached C function will be used. 
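As a sketch of what that unpacking step could look like, assuming the callable exposes a capsule under some agreed-upon attribute name (both the attribute name and the signature string below are invented for illustration, not an existing protocol):

    #include <Python.h>

    typedef double (*binop_t)(double, double);   /* example C signature */

    static binop_t unpack_native(PyObject *callable)
    {
        binop_t fptr = NULL;
        PyObject *capsule = PyObject_GetAttrString(callable, "_c_call_dd_d");
        if (capsule == NULL) {
            PyErr_Clear();              /* no native variant: use tp_call */
            return NULL;
        }
        if (PyCapsule_CheckExact(capsule)) {
            /* the capsule name doubles as a cheap signature check */
            fptr = (binop_t)PyCapsule_GetPointer(capsule, "dd->d");
            if (fptr == NULL)
                PyErr_Clear();          /* signature mismatch: fall back */
        }
        Py_DECREF(capsule);
        return fptr;
    }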
That means we'd replace Python variables that are being called by multiple local variables, one that holds the object and one for each C function with a different signature that it is being called with. We set the C function variables to NULL when the Python function variable is being assigned to. When the C function variable is NULL on call, we scan for a matching signature and assign it to the variable. When no matching signature can be found, we set it to (void*)-1. Additionally, we allow explicit user casts of Python objects to C function types, which would then try to unpack the C function, raising a TypeError on mismatch. Assignments to callable variables can be expected to occur much less frequently than calls to them, so this will give us a good trade-off in most cases. I don't see why this kind of caching would be any slower inside of loops than what we were discussing so far. Stefan From markflorisson88 at gmail.com Wed May 16 21:49:18 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 20:49:18 +0100 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <4FB3FCEE.5020405@behnel.de> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> Message-ID: On 16 May 2012 20:15, Stefan Behnel wrote: > "Martin v. L?wis", 16.05.2012 20:33: >>> Does this use case make sense to everyone? >>> >>> The reason why we are discussing this on python-dev is that we are looking >>> for a general way to expose these C level signatures within the Python >>> ecosystem. And Dag's idea was to expose them as part of the type object, >>> basically as an addition to the current Python level tp_call() slot. >> >> The use case makes sense, yet there is also a long-standing solution >> already to expose APIs and function pointers: the capsule objects. >> >> If you want to avoid dictionary lookups on the server side, implement >> tp_getattro, comparing addresses of interned strings. > > I think Martin has a point there. Why not just use a custom attribute on > callables that hold a PyCapsule? Whenever we see inside of a Cython > implemented function that an object variable that was retrieved from the > outside, either as a function argument or as the result of a function call, > is being called, we try to unpack a C function pointer from it on all > assignments to the variable. If that works, we can scan for a suitable > signature (either right away or lazily on first access) and cache that. On > each subsequent call through that variable, the cached C function will be used. > > That means we'd replace Python variables that are being called by multiple > local variables, one that holds the object and one for each C function with > a different signature that it is being called with. We set the C function > variables to NULL when the Python function variable is being assigned to. > When the C function variable is NULL on call, we scan for a matching > signature and assign it to the variable. ?When no matching signature can be > found, we set it to (void*)-1. > > Additionally, we allow explicit user casts of Python objects to C function > types, which would then try to unpack the C function, raising a TypeError > on mismatch. > > Assignments to callable variables can be expected to occur much less > frequently than calls to them, so this will give us a good trade-off in > most cases. 
I don't see why this kind of caching would be any slower inside > of loops than what we were discussing so far. > > Stefan > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel This works really well for local variables, but for globals, def methods or callbacks as attributes, this won't work so well, as they may be rebound at any time outside of the module scope. I think in general Cython code could be easily sped up for most cases by provided a really fast dispatch mechanism here. From markflorisson88 at gmail.com Wed May 16 21:54:51 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 20:54:51 +0100 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> Message-ID: On 16 May 2012 20:49, mark florisson wrote: > On 16 May 2012 20:15, Stefan Behnel wrote: >> "Martin v. L?wis", 16.05.2012 20:33: >>>> Does this use case make sense to everyone? >>>> >>>> The reason why we are discussing this on python-dev is that we are looking >>>> for a general way to expose these C level signatures within the Python >>>> ecosystem. And Dag's idea was to expose them as part of the type object, >>>> basically as an addition to the current Python level tp_call() slot. >>> >>> The use case makes sense, yet there is also a long-standing solution >>> already to expose APIs and function pointers: the capsule objects. >>> >>> If you want to avoid dictionary lookups on the server side, implement >>> tp_getattro, comparing addresses of interned strings. >> >> I think Martin has a point there. Why not just use a custom attribute on >> callables that hold a PyCapsule? Whenever we see inside of a Cython >> implemented function that an object variable that was retrieved from the >> outside, either as a function argument or as the result of a function call, >> is being called, we try to unpack a C function pointer from it on all >> assignments to the variable. If that works, we can scan for a suitable >> signature (either right away or lazily on first access) and cache that. On >> each subsequent call through that variable, the cached C function will be used. >> >> That means we'd replace Python variables that are being called by multiple >> local variables, one that holds the object and one for each C function with >> a different signature that it is being called with. We set the C function >> variables to NULL when the Python function variable is being assigned to. >> When the C function variable is NULL on call, we scan for a matching >> signature and assign it to the variable. ?When no matching signature can be >> found, we set it to (void*)-1. >> >> Additionally, we allow explicit user casts of Python objects to C function >> types, which would then try to unpack the C function, raising a TypeError >> on mismatch. >> >> Assignments to callable variables can be expected to occur much less >> frequently than calls to them, so this will give us a good trade-off in >> most cases. I don't see why this kind of caching would be any slower inside >> of loops than what we were discussing so far. 
>> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > This works really well for local variables, but for globals, def > methods or callbacks as attributes, this won't work so well, as they > may be rebound at any time outside of the module scope. I think in > general Cython code could be easily sped up for most cases by provided > a really fast dispatch mechanism here. ... unless we implement the __nomonkey__ (forgot the original name) or final declaration (also allowed in pxd files to declare module attributes final), where you can declare module attributes or class attributes final. I don't recall the outcome of the discussion, but I suppose the advantage of the __nomonkey__ is that it works from Python code as well and you don't have to bother with boring pxds, whereas the advantage of final is that it can work for class attributes. From d.s.seljebotn at astro.uio.no Wed May 16 22:16:26 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 16 May 2012 22:16:26 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> Message-ID: <169bd5091ff40cbd180cb9d807a9606f@ulrik.uio.no> On Wed, 16 May 2012 20:49:18 +0100, mark florisson wrote: > On 16 May 2012 20:15, Stefan Behnel wrote: >> "Martin v. Löwis", 16.05.2012 20:33: >>>> Does this use case make sense to everyone? >>>> >>>> The reason why we are discussing this on python-dev is that we are >>>> looking >>>> for a general way to expose these C level signatures within the >>>> Python >>>> ecosystem. And Dag's idea was to expose them as part of the type >>>> object, >>>> basically as an addition to the current Python level tp_call() >>>> slot. >>> >>> The use case makes sense, yet there is also a long-standing >>> solution >>> already to expose APIs and function pointers: the capsule objects. >>> >>> If you want to avoid dictionary lookups on the server side, >>> implement >>> tp_getattro, comparing addresses of interned strings. >> >> I think Martin has a point there. Why not just use a custom >> attribute on >> callables that hold a PyCapsule? Whenever we see inside of a Cython >> implemented function that an object variable that was retrieved from >> the >> outside, either as a function argument or as the result of a >> function call, >> is being called, we try to unpack a C function pointer from it on >> all >> assignments to the variable. If that works, we can scan for a >> suitable >> signature (either right away or lazily on first access) and cache >> that. On >> each subsequent call through that variable, the cached C function >> will be used. >> >> That means we'd replace Python variables that are being called by >> multiple >> local variables, one that holds the object and one for each C >> function with >> a different signature that it is being called with. We set the C >> function >> variables to NULL when the Python function variable is being >> assigned to. >> When the C function variable is NULL on call, we scan for a matching >> signature and assign it to the variable. When no matching signature >> can be >> found, we set it to (void*)-1.
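Spelled out in C, the cache protocol quoted above (NULL for "not scanned yet", (void*)-1 for "no native version") would behave roughly like this sketch, reusing the hypothetical unpack_native() helper from the earlier sketch:

    typedef double (*binop_t)(double, double);
    #define NO_NATIVE ((binop_t)-1)

    binop_t unpack_native(PyObject *callable);   /* from the earlier sketch */

    static PyObject *func;      /* the Python-level callable variable */
    static binop_t func_dd_d;   /* cache slot for the dd->d signature;
                                   reset to NULL whenever func is assigned */

    static double call_func(double a, double b)
    {
        PyObject *res;
        double r;

        if (func_dd_d == NULL) {             /* first call since rebinding */
            func_dd_d = unpack_native(func);
            if (func_dd_d == NULL)
                func_dd_d = NO_NATIVE;       /* remember the miss */
        }
        if (func_dd_d != NO_NATIVE)
            return func_dd_d(a, b);          /* fast path: direct C call */

        /* slow path: ordinary call through tp_call; real code would
           propagate errors instead of returning -1.0 */
        res = PyObject_CallFunction(func, "dd", a, b);
        r = res ? PyFloat_AsDouble(res) : -1.0;
        Py_XDECREF(res);
        return r;
    }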
>> >> Additionally, we allow explicit user casts of Python objects to C >> function >> types, which would then try to unpack the C function, raising a >> TypeError >> on mismatch. >> >> Assignments to callable variables can be expected to occur much less >> frequently than calls to them, so this will give us a good trade-off >> in >> most cases. I don't see why this kind of caching would be any slower >> inside >> of loops than what we were discussing so far. >> >> Stefan >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > This works really well for local variables, but for globals, def > methods or callbacks as attributes, this won't work so well, as they > may be rebound at any time outside of the module scope. I think in
+1. The python-dev discussion is pretty focused on the world of a manually written C extension. But code generation is an entirely different matter. Python puts in place pretty efficient boundaries against full-program static analysis, so there's really not much we can do.

Here's some of my actual code I have for wrapping a C++ library:

cdef class CallbackEventReceiver(BasicEventReceiver):
    cdef object callback

    def __init__(self, callback):
        self.callback = callback

    cdef dispatch_event(self, ...):
        self.callback(...)

The idea is that you can subclass BasicEventReceiver in Cython for speed, but if you want to use a Python callable then this converter is used. This code is very performance critical. And, the *loop* in question sits deep inside a C++ library. Good luck pre-acquiring the function pointer of self.callback in any useful way. Even if it is not exported by the class, that could be overridden by a subclass. I stress the fact that this is real world code by yours truly (unfortunately not open source, it wraps a closed source library). Yes, you can tell users to be mindful of this and make as much as possible local variables, introduce final modifiers and __nomonkey__ and whatnot, but that's a large price to pay to avoid hacking tp_flags. Dag _______________________________________________ cython-devel mailing list cython-devel at python.org http://mail.python.org/mailman/listinfo/cython-devel From robertwb at gmail.com Wed May 16 22:25:44 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 16 May 2012 13:25:44 -0700 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <4FB3FCEE.5020405@behnel.de> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> Message-ID: On Wed, May 16, 2012 at 12:15 PM, Stefan Behnel wrote: > "Martin v. Löwis", 16.05.2012 20:33: >>> Does this use case make sense to everyone? >>> >>> The reason why we are discussing this on python-dev is that we are looking >>> for a general way to expose these C level signatures within the Python >>> ecosystem. And Dag's idea was to expose them as part of the type object, >>> basically as an addition to the current Python level tp_call() slot. >> >> The use case makes sense, yet there is also a long-standing solution >> already to expose APIs and function pointers: the capsule objects. >> >> If you want to avoid dictionary lookups on the server side, implement >> tp_getattro, comparing addresses of interned strings. > > I think Martin has a point there. Why not just use a custom attribute on > callables that hold a PyCapsule?
Whenever we see inside of a Cython > implemented function that an object variable that was retrieved from the > outside, either as a function argument or as the result of a function call, > is being called, we try to unpack a C function pointer from it on all > assignments to the variable. If that works, we can scan for a suitable > signature (either right away or lazily on first access) and cache that. On > each subsequent call through that variable, the cached C function will be used. > > That means we'd replace Python variables that are being called by multiple > local variables, one that holds the object and one for each C function with > a different signature that it is being called with. We set the C function > variables to NULL when the Python function variable is being assigned to. > When the C function variable is NULL on call, we scan for a matching > signature and assign it to the variable. ?When no matching signature can be > found, we set it to (void*)-1. > > Additionally, we allow explicit user casts of Python objects to C function > types, which would then try to unpack the C function, raising a TypeError > on mismatch. > > Assignments to callable variables can be expected to occur much less > frequently than calls to them, so this will give us a good trade-off in > most cases. I don't see why this kind of caching would be any slower inside > of loops than what we were discussing so far. I like the idea, but that only helps if you're doing multiple calls (e.g. in a loop). Definitely worth implementing in my mind, but orthogonal to making the lookup itself as fast as possible. - Robert From markflorisson88 at gmail.com Wed May 16 22:45:42 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 21:45:42 +0100 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <169bd5091ff40cbd180cb9d807a9606f@ulrik.uio.no> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> <169bd5091ff40cbd180cb9d807a9606f@ulrik.uio.no> Message-ID: On 16 May 2012 21:16, Dag Sverre Seljebotn wrote: > On Wed, 16 May 2012 20:49:18 +0100, mark florisson > wrote: >> >> On 16 May 2012 20:15, Stefan Behnel wrote: >>> >>> "Martin v. L?wis", 16.05.2012 20:33: >>>>> >>>>> Does this use case make sense to everyone? >>>>> >>>>> The reason why we are discussing this on python-dev is that we are >>>>> looking >>>>> for a general way to expose these C level signatures within the Python >>>>> ecosystem. And Dag's idea was to expose them as part of the type >>>>> object, >>>>> basically as an addition to the current Python level tp_call() slot. >>>> >>>> >>>> The use case makes sense, yet there is also a long-standing solution >>>> already to expose APIs and function pointers: the capsule objects. >>>> >>>> If you want to avoid dictionary lookups on the server side, implement >>>> tp_getattro, comparing addresses of interned strings. >>> >>> >>> I think Martin has a point there. Why not just use a custom attribute on >>> callables that hold a PyCapsule? Whenever we see inside of a Cython >>> implemented function that an object variable that was retrieved from the >>> outside, either as a function argument or as the result of a function >>> call, >>> is being called, we try to unpack a C function pointer from it on all >>> assignments to the variable. 
If that works, we can scan for a suitable >>> signature (either right away or lazily on first access) and cache that. >>> On >>> each subsequent call through that variable, the cached C function will be >>> used. >>> >>> That means we'd replace Python variables that are being called by >>> multiple >>> local variables, one that holds the object and one for each C function >>> with >>> a different signature that it is being called with. We set the C function >>> variables to NULL when the Python function variable is being assigned to. >>> When the C function variable is NULL on call, we scan for a matching >>> signature and assign it to the variable. ?When no matching signature can >>> be >>> found, we set it to (void*)-1. >>> >>> Additionally, we allow explicit user casts of Python objects to C >>> function >>> types, which would then try to unpack the C function, raising a TypeError >>> on mismatch. >>> >>> Assignments to callable variables can be expected to occur much less >>> frequently than calls to them, so this will give us a good trade-off in >>> most cases. I don't see why this kind of caching would be any slower >>> inside >>> of loops than what we were discussing so far. >>> >>> Stefan >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >> >> >> This works really well for local variables, but for globals, def >> methods or callbacks as attributes, this won't work so well, as they >> may be rebound at any time outside of the module scope. I think in > > > +1. The python-dev discussion is pretty focused on the world of a manually > written C extension. But code generation is an entirely different matter. > Python puts in place pretty efficient boundaries against full-program static > analysis, so there's really not much we can do. > > Here's some of my actual code I have for wrapping a C++ library: > > cdef class CallbackEventReceiver(BasicEventReceiver): > ? ?cdef object callback > > ? ?def __init__(self, callback): > ? ? ? ?self.callback = callback > > ? ?cdef dispatch_event(self, ...): > ? ? ? ?self.callback(...) > > The idea is that you can subclass BasicEventReceiver in Cython for speed, > but if you want to use a Python callable then this converter is used. > > This code is very performance critical. And, the *loop* in question sits > deep inside a C++ library. > > Good luck pre-acquiring the function pointer of self.callback in any useful > way. Even if it is not exported by the class, that could be overridden by a > subclass. I stress the fact that this is real world code by yours truly > (unfortunately not open source, it wraps a closed source library). > > Yes, you can tell users to be mindful of this and make as much as possible > local variables, introduce final modifiers and __nomonkey__ and whatnot, but > that's a large price to pay to avoid hacking tp_flags. > > Dag > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Definitely. I personally prefer the metaclass approach, but it's an irrelevant detail. If we go the tp_flags route, would we copy all the interface information from the superclass into the subclass directly? I think in any case we need a wrapper around PyType_Ready to inherit the tp_flags bit (which would be automatic with a metaclass). 
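A wrapper of the kind Mark mentions could be as small as the following sketch, since PyType_Ready() will not copy an unrecognized tp_flags bit down from the base type (MY_EXTENDED_TYPE_FLAG is again a stand-in for whatever bit would actually be reserved):

    #include <Python.h>

    #define MY_EXTENDED_TYPE_FLAG (1UL << 22)   /* hypothetical bit */

    static int MyInterfaceType_Ready(PyTypeObject *type)
    {
        if (PyType_Ready(type) < 0)
            return -1;
        /* propagate the interface bit so it survives C-level subclassing */
        if (type->tp_base != NULL &&
            (type->tp_base->tp_flags & MY_EXTENDED_TYPE_FLAG))
            type->tp_flags |= MY_EXTENDED_TYPE_FLAG;
        return 0;
    }

Subclasses created from Python code would not pass through such a wrapper, which is where a metaclass (whose tp_new could do the same propagation) has the edge.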
From markflorisson88 at gmail.com Wed May 16 23:06:59 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 16 May 2012 22:06:59 +0100 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <169bd5091ff40cbd180cb9d807a9606f@ulrik.uio.no> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> <169bd5091ff40cbd180cb9d807a9606f@ulrik.uio.no> Message-ID: On 16 May 2012 21:16, Dag Sverre Seljebotn wrote: > On Wed, 16 May 2012 20:49:18 +0100, mark florisson > wrote: >> >> On 16 May 2012 20:15, Stefan Behnel wrote: >>> >>> "Martin v. L?wis", 16.05.2012 20:33: >>>>> >>>>> Does this use case make sense to everyone? >>>>> >>>>> The reason why we are discussing this on python-dev is that we are >>>>> looking >>>>> for a general way to expose these C level signatures within the Python >>>>> ecosystem. And Dag's idea was to expose them as part of the type >>>>> object, >>>>> basically as an addition to the current Python level tp_call() slot. >>>> >>>> >>>> The use case makes sense, yet there is also a long-standing solution >>>> already to expose APIs and function pointers: the capsule objects. >>>> >>>> If you want to avoid dictionary lookups on the server side, implement >>>> tp_getattro, comparing addresses of interned strings. >>> >>> >>> I think Martin has a point there. Why not just use a custom attribute on >>> callables that hold a PyCapsule? Whenever we see inside of a Cython >>> implemented function that an object variable that was retrieved from the >>> outside, either as a function argument or as the result of a function >>> call, >>> is being called, we try to unpack a C function pointer from it on all >>> assignments to the variable. If that works, we can scan for a suitable >>> signature (either right away or lazily on first access) and cache that. >>> On >>> each subsequent call through that variable, the cached C function will be >>> used. >>> >>> That means we'd replace Python variables that are being called by >>> multiple >>> local variables, one that holds the object and one for each C function >>> with >>> a different signature that it is being called with. We set the C function >>> variables to NULL when the Python function variable is being assigned to. >>> When the C function variable is NULL on call, we scan for a matching >>> signature and assign it to the variable. ?When no matching signature can >>> be >>> found, we set it to (void*)-1. >>> >>> Additionally, we allow explicit user casts of Python objects to C >>> function >>> types, which would then try to unpack the C function, raising a TypeError >>> on mismatch. >>> >>> Assignments to callable variables can be expected to occur much less >>> frequently than calls to them, so this will give us a good trade-off in >>> most cases. I don't see why this kind of caching would be any slower >>> inside >>> of loops than what we were discussing so far. >>> >>> Stefan >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >> >> >> This works really well for local variables, but for globals, def >> methods or callbacks as attributes, this won't work so well, as they >> may be rebound at any time outside of the module scope. I think in > > > +1. The python-dev discussion is pretty focused on the world of a manually > written C extension. 
But code generation is an entirely different matter. > Python puts in place pretty efficient boundaries against full-program static > analysis, so there's really not much we can do. > > Here's some of my actual code I have for wrapping a C++ library: >
> cdef class CallbackEventReceiver(BasicEventReceiver):
>     cdef object callback
>
>     def __init__(self, callback):
>         self.callback = callback
>
>     cdef dispatch_event(self, ...):
>         self.callback(...)
>
> The idea is that you can subclass BasicEventReceiver in Cython for speed, > but if you want to use a Python callable then this converter is used. > > This code is very performance critical. And, the *loop* in question sits > deep inside a C++ library. > > Good luck pre-acquiring the function pointer of self.callback in any useful > way. Even if it is not exported by the class, that could be overridden by a > subclass. I stress the fact that this is real world code by yours truly > (unfortunately not open source, it wraps a closed source library). > > Yes, you can tell users to be mindful of this and make as much as possible > local variables, introduce final modifiers and __nomonkey__ and whatnot, but > that's a large price to pay to avoid hacking tp_flags. > > Dag > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel I suppose for this case it might be faster to check if the world is sane (if the callback or function is still the object you expect it to be) on top of looking at whether the function pointer is unpacked. You don't really want to store that extra information in objects, but for global variables it might be worthwhile (unless you're doing import * :)). So we definitely always need a fast dispatcher, but we may do slightly better in some cases if we care to implement it. I bet no one will care about shaving off those last 2 nanoseconds though :) From stefan_ml at behnel.de Thu May 17 08:09:08 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 May 2012 08:09:08 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> Message-ID: <4FB49604.8070000@behnel.de> mark florisson, 16.05.2012 21:49: > On 16 May 2012 20:15, Stefan Behnel wrote: >> "Martin v. Löwis", 16.05.2012 20:33: >>>> Does this use case make sense to everyone? >>>> >>>> The reason why we are discussing this on python-dev is that we are looking >>>> for a general way to expose these C level signatures within the Python >>>> ecosystem. And Dag's idea was to expose them as part of the type object, >>>> basically as an addition to the current Python level tp_call() slot. >>> >>> The use case makes sense, yet there is also a long-standing solution >>> already to expose APIs and function pointers: the capsule objects. >>> >>> If you want to avoid dictionary lookups on the server side, implement >>> tp_getattro, comparing addresses of interned strings. >> >> I think Martin has a point there. Why not just use a custom attribute on >> callables that hold a PyCapsule?
Whenever we see inside of a Cython >> implemented function that an object variable that was retrieved from the >> outside, either as a function argument or as the result of a function call, >> is being called, we try to unpack a C function pointer from it on all >> assignments to the variable. If that works, we can scan for a suitable >> signature (either right away or lazily on first access) and cache that. On >> each subsequent call through that variable, the cached C function will be used. >> >> That means we'd replace Python variables that are being called by multiple >> local variables, one that holds the object and one for each C function with >> a different signature that it is being called with. We set the C function >> variables to NULL when the Python function variable is being assigned to. >> When the C function variable is NULL on call, we scan for a matching >> signature and assign it to the variable. When no matching signature can be >> found, we set it to (void*)-1. >> >> Additionally, we allow explicit user casts of Python objects to C function >> types, which would then try to unpack the C function, raising a TypeError >> on mismatch. >> >> Assignments to callable variables can be expected to occur much less >> frequently than calls to them, so this will give us a good trade-off in >> most cases. I don't see why this kind of caching would be any slower inside >> of loops than what we were discussing so far. > > This works really well for local variables, but for globals, def > methods or callbacks as attributes, this won't work so well, as they > may be rebound at any time outside of the module scope. Only half true for globals, which can be declared "cdef object", e.g. for imported names. That would allow Cython to see all possible reassignments in a module, which would then apply the above scheme. I don't think def methods are a use case for this because you'd either cpdef them or even cdef them if you want speed. If you want them to be overridable, you'll have to live with the speed penalty that that implies. For object attributes, you have to pay the penalty of a lookup anyway, no way around that. We can't even cache anything here (e.g. with a borrowed reference) because the attribute may be rebound to another object that happens to live at the same address as the previous one. However, if you want speed, you'd do it as in CPython and assign the object to a local variable to pay the lookup of only once. Problem solved. > I think in > general Cython code could be easily sped up for most cases by provided > a really fast dispatch mechanism here. I feel inclined to doubt that by now. Stefan From d.s.seljebotn at astro.uio.no Thu May 17 09:12:17 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 17 May 2012 09:12:17 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <4FB49604.8070000@behnel.de> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> <4FB49604.8070000@behnel.de> Message-ID: Stefan Behnel wrote: >mark florisson, 16.05.2012 21:49: >> On 16 May 2012 20:15, Stefan Behnel wrote: >>> "Martin v. L?wis", 16.05.2012 20:33: >>>>> Does this use case make sense to everyone? >>>>> >>>>> The reason why we are discussing this on python-dev is that we are >looking >>>>> for a general way to expose these C level signatures within the >Python >>>>> ecosystem. 
And Dag's idea was to expose them as part of the type >object, >>>>> basically as an addition to the current Python level tp_call() >slot. >>>> >>>> The use case makes sense, yet there is also a long-standing >solution >>>> already to expose APIs and function pointers: the capsule objects. >>>> >>>> If you want to avoid dictionary lookups on the server side, >implement >>>> tp_getattro, comparing addresses of interned strings. >>> >>> I think Martin has a point there. Why not just use a custom >attribute on >>> callables that hold a PyCapsule? Whenever we see inside of a Cython >>> implemented function that an object variable that was retrieved from >the >>> outside, either as a function argument or as the result of a >function call, >>> is being called, we try to unpack a C function pointer from it on >all >>> assignments to the variable. If that works, we can scan for a >suitable >>> signature (either right away or lazily on first access) and cache >that. On >>> each subsequent call through that variable, the cached C function >will be used. >>> >>> That means we'd replace Python variables that are being called by >multiple >>> local variables, one that holds the object and one for each C >function with >>> a different signature that it is being called with. We set the C >function >>> variables to NULL when the Python function variable is being >assigned to. >>> When the C function variable is NULL on call, we scan for a matching >>> signature and assign it to the variable. When no matching signature >can be >>> found, we set it to (void*)-1. >>> >>> Additionally, we allow explicit user casts of Python objects to C >function >>> types, which would then try to unpack the C function, raising a >TypeError >>> on mismatch. >>> >>> Assignments to callable variables can be expected to occur much less >>> frequently than calls to them, so this will give us a good trade-off >in >>> most cases. I don't see why this kind of caching would be any slower >inside >>> of loops than what we were discussing so far. >> >> This works really well for local variables, but for globals, def >> methods or callbacks as attributes, this won't work so well, as they >> may be rebound at any time outside of the module scope. > >Only half true for globals, which can be declared "cdef object", e.g. >for >imported names. That would allow Cython to see all possible >reassignments >in a module, which would then apply the above scheme. > >I don't think def methods are a use case for this because you'd either >cpdef them or even cdef them if you want speed. If you want them to be >overridable, you'll have to live with the speed penalty that that >implies. > >For object attributes, you have to pay the penalty of a lookup anyway, >no >way around that. We can't even cache anything here (e.g. with a >borrowed >reference) because the attribute may be rebound to another object that >happens to live at the same address as the previous one. However, if >you >want speed, you'd do it as in CPython and assign the object to a local >variable to pay the lookup of only once. Problem solved. 'Problem solved' by pushing the work over to the user? By that line of argument, why not just kill of Cython and require users to write C? Hyperbole aside; do you really believe it is worth dropping a relatively easy optimization just to make the C level code more to the taste of some python-dev posters? Dag > > >> I think in >> general Cython code could be easily sped up for most cases by >provided >> a really fast dispatch mechanism here. 
> >I feel inclined to doubt that by now. > >Stefan >_______________________________________________ >cython-devel mailing list >cython-devel at python.org >http://mail.python.org/mailman/listinfo/cython-devel -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. From stefan_ml at behnel.de Thu May 17 09:36:52 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 17 May 2012 09:36:52 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> <4FB49604.8070000@behnel.de> Message-ID: <4FB4AA94.7060505@behnel.de> Dag Sverre Seljebotn, 17.05.2012 09:12: > Stefan Behnel wrote: >> mark florisson, 16.05.2012 21:49: >>> On 16 May 2012 20:15, Stefan Behnel wrote: >>>> Why not just use a custom attribute on callables that hold a >>>> PyCapsule? Whenever we see inside of a Cython implemented function >>>> that an object variable that was retrieved from the outside, >>>> either as a function argument or as the result of a function call, >>>> is being called, we try to unpack a C function pointer from it on >>>> all assignments to the variable. If that works, we can scan for a >>>> suitable signature (either right away or lazily on first access) >>>> and cache that. On each subsequent call through that variable, >>>> the cached C function will be used. >>>> >>>> That means we'd replace Python variables that are being called by >>>> multiple local variables, one that holds the object and one for each C >>>> function with a different signature that it is being called with. We >>>> set the C function variables to NULL when the Python function variable >>>> is being assigned to. >>>> When the C function variable is NULL on call, we scan for a matching >>>> signature and assign it to the variable. When no matching signature >>>> can be found, we set it to (void*)-1. >>>> >>>> Additionally, we allow explicit user casts of Python objects to C >>>> function types, which would then try to unpack the C function, raising >>>> a TypeError on mismatch. >>>> >>>> Assignments to callable variables can be expected to occur much less >>>> frequently than calls to them, so this will give us a good trade-off >>>> in most cases. I don't see why this kind of caching would be any slower >>>> inside of loops than what we were discussing so far. >>> >>> This works really well for local variables, but for globals, def >>> methods or callbacks as attributes, this won't work so well, as they >>> may be rebound at any time outside of the module scope. >> >> Only half true for globals, which can be declared "cdef object", e.g. >> for imported names. That would allow Cython to see all possible >> reassignments in a module, which would then apply the above scheme. >> >> I don't think def methods are a use case for this because you'd either >> cpdef them or even cdef them if you want speed. If you want them to be >> overridable, you'll have to live with the speed penalty that that >> implies. >> >> For object attributes, you have to pay the penalty of a lookup anyway, >> no way around that. We can't even cache anything here (e.g. with a >> borrowed reference) because the attribute may be rebound to another >> object that happens to live at the same address as the previous one. 
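The address-reuse problem mentioned at this point in the quote is why an identity-guarded cache has to own a reference to the object it guards; as a sketch (again using the hypothetical unpack_native() helper from earlier):

    typedef double (*binop_t)(double, double);
    #define NO_NATIVE ((binop_t)-1)

    binop_t unpack_native(PyObject *callable);   /* from the earlier sketch */

    /* Holding a strong reference keeps the cached object alive, so its
       address cannot be recycled for a different object behind our back. */
    static PyObject *cached_obj;   /* owned reference */
    static binop_t cached_fn;

    static binop_t guarded_lookup(PyObject *current)
    {
        if (current != cached_obj) {             /* rebound: rescan */
            Py_XDECREF(cached_obj);
            Py_INCREF(current);
            cached_obj = current;
            cached_fn = unpack_native(current);
            if (cached_fn == NULL)
                cached_fn = NO_NATIVE;
        }
        return cached_fn;
    }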
>> However, if you want speed, you'd do it as in CPython and assign the >> object to a local variable to pay the lookup of only once. Problem >> solved. > > 'Problem solved' by pushing the work over to the user? By that line > of argument, why not just kill of Cython and require users to write C? What part of the work does the above proposal push to the user? To make it explicit that an object attribute or a global variable is not expected to change during whatever a loop does? Well, yes. If the user knows that, a global cdef or an assignment to a local variable is the easiest, safest, fastest and most obvious way to tell Cython that it should take advantage of it. Why invent yet another declaration for this? > Hyperbole aside; do you really believe it is worth dropping a relatively > easy optimization just to make the C level code more to the taste of > some python-dev posters? I find the above much easier for all sides. It's easier to implement for us and others, it doesn't have any impact on CPython and I also find it easier to understand for users. Besides, I was only responding to Mark's remarks (pun not intended) about the few cases where this may not immediately yield the expected advantage. They are easy to fix, that's all I was saying. In most cases, this simple scheme will do the right thing without any user interaction, and it does not require any changes or future constraints on CPython. So, why not just implement this for now and *then* re-evaluate if we really need more, and if we can really do better? Stefan From markflorisson88 at gmail.com Thu May 17 11:15:15 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Thu, 17 May 2012 10:15:15 +0100 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <4FB49604.8070000@behnel.de> References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de> <4FB49604.8070000@behnel.de> Message-ID: On 17 May 2012 07:09, Stefan Behnel wrote: > mark florisson, 16.05.2012 21:49: >> On 16 May 2012 20:15, Stefan Behnel wrote: >>> "Martin v. L?wis", 16.05.2012 20:33: >>>>> Does this use case make sense to everyone? >>>>> >>>>> The reason why we are discussing this on python-dev is that we are looking >>>>> for a general way to expose these C level signatures within the Python >>>>> ecosystem. And Dag's idea was to expose them as part of the type object, >>>>> basically as an addition to the current Python level tp_call() slot. >>>> >>>> The use case makes sense, yet there is also a long-standing solution >>>> already to expose APIs and function pointers: the capsule objects. >>>> >>>> If you want to avoid dictionary lookups on the server side, implement >>>> tp_getattro, comparing addresses of interned strings. >>> >>> I think Martin has a point there. Why not just use a custom attribute on >>> callables that hold a PyCapsule? Whenever we see inside of a Cython >>> implemented function that an object variable that was retrieved from the >>> outside, either as a function argument or as the result of a function call, >>> is being called, we try to unpack a C function pointer from it on all >>> assignments to the variable. If that works, we can scan for a suitable >>> signature (either right away or lazily on first access) and cache that. On >>> each subsequent call through that variable, the cached C function will be used. 
> Hyperbole aside; do you really believe it is worth dropping a relatively
> easy optimization just to make the C level code more to the taste of
> some python-dev posters?

I find the above much easier for all sides. It's easier to implement for us
and others, it doesn't have any impact on CPython and I also find it easier
to understand for users.

Besides, I was only responding to Mark's remarks (pun not intended) about
the few cases where this may not immediately yield the expected advantage.
They are easy to fix, that's all I was saying. In most cases, this simple
scheme will do the right thing without any user interaction, and it does
not require any changes or future constraints on CPython.

So, why not just implement this for now and *then* re-evaluate if we really
need more, and if we can really do better?

Stefan

From markflorisson88 at gmail.com  Thu May 17 11:15:15 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Thu, 17 May 2012 10:15:15 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FB49604.8070000@behnel.de>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de>
Message-ID:

On 17 May 2012 07:09, Stefan Behnel wrote:
> mark florisson, 16.05.2012 21:49:
>> On 16 May 2012 20:15, Stefan Behnel wrote:
>>> "Martin v. Löwis", 16.05.2012 20:33:
>>> [the capsule discussion and the full function-pointer caching
>>> proposal, quoted verbatim; snipped -- see the message above]
>>
>> This works really well for local variables, but for globals, def
>> methods or callbacks as attributes, this won't work so well, as they
>> may be rebound at any time outside of the module scope.
>
> Only half true for globals, which can be declared "cdef object", e.g. for
> imported names. That would allow Cython to see all possible reassignments
> in a module, which would then apply the above scheme.

I suppose by default they could be properties of a module subclass. That
would also allow faster lookup of globals visible from Python in Cython
space in the same module (but probably slower from outside).

> I don't think def methods are a use case for this because you'd either
> cpdef them or even cdef them if you want speed. If you want them to be
> overridable, you'll have to live with the speed penalty that that implies.

Which means you can no longer pass stuff around as a callback, but you need
to define an interface in Cython and have people pass around objects on
which you call methods. That is often less Pythonic, and it restricts
people to using Cython for all their code. What you want is something that
is fast when given a Cython callable, but which still works when I write my
stuff in Python. Having to inherit from some cdef class and override its
cpdef method just to pass a callback to other code from Python is a chore
and an unnecessary burden.

We need to stop sacrificing our design decisions for speed. Speed should be
obtained through clever compiler or interpreter design, not by telling
users to rewrite their code in a specific way that fits the current
incapabilities of the compiler.

> For object attributes, you have to pay the penalty of a lookup anyway, no
> way around that.

Not in a cdef class. But even in a cdef class any subclass method can
rebind your attribute at any time. We currently have the same problem with
memoryviews, which have to check whether they are initialized for every
access.

> We can't even cache anything here (e.g. with a borrowed reference)
> because the attribute may be rebound to another object that happens to
> live at the same address as the previous one. However, if you want speed,
> you'd do it as in CPython and assign the object to a local variable to
> pay the lookup cost only once. Problem solved.
>
>> I think in general Cython code could be easily sped up for most cases
>> by providing a really fast dispatch mechanism here.
>
> I feel inclined to doubt that by now.
>
> Stefan
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

From markflorisson88 at gmail.com  Thu May 17 11:26:41 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Thu, 17 May 2012 10:26:41 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FB4AA94.7060505@behnel.de>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de> <4FB4AA94.7060505@behnel.de>
Message-ID:

On 17 May 2012 08:36, Stefan Behnel wrote:
> Dag Sverre Seljebotn, 17.05.2012 09:12:
>> [the preceding exchange, including the full caching proposal, quoted
>> verbatim once more; snipped]
>
> So, why not just implement this for now and *then* re-evaluate if we
> really need more, and if we can really do better?
>
> Stefan

Hm, I think we should implement fast dispatch first, and if an additional
optimization with hoisted function pointer unpacking leads to
non-negligible performance gains, we can just implement both. I don't think
python-dev cares much about C-level interfaces, and Martin is right that we
can just do the same thing through metaclasses, which would be portable
across versions and just as fast (probably :).

From stefan_ml at behnel.de  Thu May 17 12:03:34 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 17 May 2012 12:03:34 +0200
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To:
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de>
Message-ID: <4FB4CCF6.6020102@behnel.de>

mark florisson, 17.05.2012 11:15:
> On 17 May 2012 07:09, Stefan Behnel wrote:
>> mark florisson, 16.05.2012 21:49:
>>> On 16 May 2012 20:15, Stefan Behnel wrote:
>>>> "Martin v. Löwis", 16.05.2012 20:33:
>>>>>> Does this use case make sense to everyone?
>>>>>> [...]
>>>>>
>>>>> The use case makes sense, yet there is also a long-standing solution
>>>>> already to expose APIs and function pointers: the capsule objects.
>>>>>
>>>>> If you want to avoid dictionary lookups on the server side, implement
>>>>> tp_getattro, comparing addresses of interned strings.
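In C, that fast path would look roughly like this (a sketch only; the
attribute name and the helper are invented for illustration):

    /* interned once, e.g. at module init:
       interned_sig_str = PyString_InternFromString("_c_signature"); */
    static PyObject *interned_sig_str;

    static PyObject *
    mytype_getattro(PyObject *obj, PyObject *name)
    {
        /* CPython interns attribute names, so a pointer comparison
           replaces the dictionary lookup on the hot path */
        if (name == interned_sig_str)
            return get_signature_capsule(obj);   /* invented helper */
        return PyObject_GenericGetAttr(obj, name);
    }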
>>>> I think Martin has a point there. Why not just use a custom attribute
>>>> on callables that hold a PyCapsule?
>>>> [the rest of the caching proposal, quoted verbatim; snipped]
>>>
>>> This works really well for local variables, but for globals, def
>>> methods or callbacks as attributes, this won't work so well, as they
>>> may be rebound at any time outside of the module scope.
>>
>> Only half true for globals, which can be declared "cdef object", e.g. for
>> imported names. That would allow Cython to see all possible reassignments
>> in a module, which would then apply the above scheme.
>
> I suppose by default they could be properties of a module subclass.
> That would also allow faster lookup of globals visible from Python in
> Cython space in the same module (but probably slower from outside).

Yes, that's another way to do it and yet another nice feature (which we've
already been throwing into the discussions for years and years...)

>> I don't think def methods are a use case for this because you'd either
>> cpdef them or even cdef them if you want speed. If you want them to be
>> overridable, you'll have to live with the speed penalty that that
>> implies.
>
> Which means you can no longer pass stuff around as a callback

Callbacks are not a problem because they behave like any other object
that's being held in a variable (i.e. they are the normal case, not the
exception). I was referring to globally defined def functions which can be
reassigned in the module. That's a problem, but it's mostly the same as
with any global name.

>>> For object attributes, you have to pay the penalty of a lookup anyway,
>>> no way around that.
>>
>> Not in a cdef class.

Sure, also in cdef classes. Nothing keeps me from reassigning to the
attribute of a cdef class. The difference is only that the attribute lookup
is faster for them because it passes through a pointer indirection instead
of a dict lookup. Apart from that, it's entirely the same thing.

No, actually, it's even worse because you can't just hook into the dict (as
Vitja did recently) and check if it has changed.
You actually need to read the attribute again and then look up its C
functions again, because even if the object pointer is the same, it doesn't
mean that the object is the same, unless you keep an owned reference to it
(which you can't without keeping the object alive). So there is even less
of a chance for efficient caching.

Stefan

From stefan_ml at behnel.de  Thu May 17 12:14:02 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 17 May 2012 12:14:02 +0200
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To:
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de> <4FB4AA94.7060505@behnel.de>
Message-ID: <4FB4CF6A.5040102@behnel.de>

mark florisson, 17.05.2012 11:26:
> On 17 May 2012 08:36, Stefan Behnel wrote:
>> [the full exchange about the caching proposal, quoted verbatim once
>> more; snipped]
>
> Hm, I think we should implement fast dispatch first
Sure, the one builds on the other. The question is only how you'd get at
the pointer to the signatures. I say, a PyCapsule in an attribute will do
in most cases.
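Concretely, I'm thinking of something as simple as this (a sketch; the
attribute name and the "d(dd)" signature encoding are made up, and whether
a Cython def function accepts arbitrary attributes today is its own
question):

    from cpython.pycapsule cimport PyCapsule_New

    cdef double c_add(double a, double b):
        return a + b

    def add(a, b):
        return a + b

    # attach the C-level entry point to the Python callable
    add.__c_signatures__ = PyCapsule_New(<void *> c_add, "d(dd)", NULL)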
So, basically, I suggest to implement a fast, cached dispatch on top of a
simple function object attribute first. Then we test and benchmark that,
then we can decide what else we need. Once this infrastructure is
implemented, adapting it to other ways of finding the signature dispatcher
for a given function object will be trivial. And having it available will
allow us to state exactly what the performance advantage of each such
approach is and to make a case why (or why not) we need to change something
outside of Cython in order to get it.

Stefan

From markflorisson88 at gmail.com  Thu May 17 13:30:06 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Thu, 17 May 2012 12:30:06 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FB4CCF6.6020102@behnel.de>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de> <4FB4CCF6.6020102@behnel.de>
Message-ID:

On 17 May 2012 11:03, Stefan Behnel wrote:
> mark florisson, 17.05.2012 11:15:
>> [...]
>
> Callbacks are not a problem because they behave like any other object
> that's being held in a variable (i.e. they are the normal case, not the
> exception). I was referring to globally defined def functions which can
> be reassigned in the module. That's a problem, but it's mostly the same
> as with any global name.

Oh, I see what you were referring to now.
I think Vitja already implemented something like the inline def calls,
although I'm not sure what the status of that is.

>>> For object attributes, you have to pay the penalty of a lookup anyway,
>>> no way around that.
>>
>> Not in a cdef class.
>
> Sure, also in cdef classes. Nothing keeps me from reassigning to the
> attribute of a cdef class. The difference is only that the attribute
> lookup is faster for them because it passes through a pointer indirection
> instead of a dict lookup. Apart from that, it's entirely the same thing.

I guess we're talking about different things again, I thought you meant
dict lookups. Basically, with the default of none-checking disabled, a cdef
class attribute lookup is just a struct attribute reference.

Anyway, my point was that caching pointers is a good idea, but it only
works in limited cases, and we shouldn't limit the programming model of
users to enable fast calls. But if everyone agrees we need fast
dispatching, it seems we're already on the same page :)

> [...]
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

From markflorisson88 at gmail.com  Thu May 17 13:34:36 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Thu, 17 May 2012 12:34:36 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FB4CF6A.5040102@behnel.de>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de> <4FB4AA94.7060505@behnel.de>
 <4FB4CF6A.5040102@behnel.de>
Message-ID:

On 17 May 2012 11:14, Stefan Behnel wrote:
> mark florisson, 17.05.2012 11:26:
>> [the earlier messages, quoted verbatim once more; snipped]
>>
>> Hm, I think we should implement fast dispatch first
>
> Sure, the one builds on the other. The question is only how you'd get at
> the pointer to the signatures. I say, a PyCapsule in an attribute will do
> in most cases.
>
> So, basically, I suggest to implement a fast, cached dispatch on top of a
> simple function object attribute first. Then we test and benchmark that,
> then we can decide what else we need. Once this infrastructure is
> implemented, adapting it to other ways of finding the signature
> dispatcher for a given function object will be trivial. And having it
> available will allow us to state exactly what the performance advantage
> of each such approach is and to make a case why (or why not) we need to
> change something outside of Cython in order to get it.

I guess that's a good idea, although I would suggest typechecking for
CythonFunction and giving it a pointer to a list of signatures (or maybe
make it variable sized and put the signatures directly into the function).
If it works well we can plunge ahead and generalize it for arbitrary types.
(I think in any case we only needed to change things outside of Cython to
standardize stuff across projects, not because we technically need it.)
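Roughly like this, I mean (purely illustrative; the names and the "d(dd)"
encoding are invented for the example):

    # a NULL-terminated signature table carried by the function object
    ctypedef struct CySignature:
        char *encoded      # e.g. "d(dd)" for double (double, double)
        void *func_ptr     # the C entry point implementing that signature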
From stefan_ml at behnel.de  Thu May 17 14:58:43 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 17 May 2012 14:58:43 +0200
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To:
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de> <4FB4AA94.7060505@behnel.de>
 <4FB4CF6A.5040102@behnel.de>
Message-ID: <4FB4F603.4040501@behnel.de>

mark florisson, 17.05.2012 13:34:
> On 17 May 2012 11:14, Stefan Behnel wrote:
>> [the preceding messages, quoted verbatim once more; snipped]
>
> I guess that's a good idea, although I would suggest typechecking for
> CythonFunction and giving it a pointer to a list of signatures (or
> maybe make it variable sized and put the signatures directly into the
> function).

Either of the two will do, I think. There will be a slight performance
difference when the CyFunction comes from another module, but that would
only apply to the lookup.
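That is, the dispatch would first take a cheap exact-type fast path and
only then fall back to the generic attribute (a sketch; 'CythonFunction',
'scan_signatures' and the attribute name are all placeholders):

    # fast path: exact type check, then direct access to the table
    if type(f) is CythonFunction:
        sig = scan_signatures(f, "d(dd)")
    else:
        # generic fallback: look for a capsule on an agreed attribute
        capsule = getattr(f, "__c_signatures__", None)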
I think it's a good idea to start by only supporting CyFunction to get it
working, then add another fallback for a function attribute, then take a
look at other things.

> If it works well we can plunge ahead and generalize it for arbitrary
> types. (I think in any case we only needed to change things outside of
> Cython to standardize stuff across projects, not because we technically
> need it).

Absolutely.

Stefan
From d.s.seljebotn at astro.uio.no  Thu May 17 19:55:21 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Thu, 17 May 2012 19:55:21 +0200
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FB4F603.4040501@behnel.de>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB3F2F8.8060207@v.loewis.de> <4FB3FCEE.5020405@behnel.de>
 <4FB49604.8070000@behnel.de> <4FB4AA94.7060505@behnel.de>
 <4FB4CF6A.5040102@behnel.de> <4FB4F603.4040501@behnel.de>
Message-ID: <4FB53B89.4040501@astro.uio.no>

I don't know where to put this, so I'll put it up top: I myself think this
talk about implementing caching went a bit overboard. Here's a performance
ladder for you:

Alternative A) Focus on fast lookup; go from a 100 ns function call to a
5 ns function call.

Alternative B) Focus on caching a 20 ns lookup; go from 100 ns overhead to
a 2 ns function call (*any* function call through a pointer has a tiny bit
of overhead according to my benchmark).

Alternative C) Early binding at compile time ("cdef extern from ...");
essentially no overhead (if from the C standard library, the compiler
might even inline it).
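(Alternative C, for reference, is just today's early binding, e.g.:)

    # the callee is resolved at compile time; no lookup happens at all
    cdef extern from "math.h":
        double sin(double x)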
Now, is there a need for B)? If you're not happy with a 5 ns function call,
won't you always then go to alternative C)? Who's going to say "I can't
live with a 5 ns penalty, but 2 ns is OK, thank you so much for
implementing function pointer caching"?

To me it seems much simpler to just focus on a fast lookup that will
automatically be fast everywhere, than to keep chasing different cases that
must be cached because the lookup is slow. What we're trying to kill is
that 100-200 ns Python call. I simply don't think caching function pointers
will *ever* be needed, if the lookup can be done fast. And that seems like
a much simpler design. (Or, perhaps if we actually JIT-compile the call
site. But that's a bridge we cross when we get there...)

More follows:

On 05/17/2012 02:58 PM, Stefan Behnel wrote:
> [...]
> I think it's a good idea to start by only supporting CyFunction to get it
> working, then add another fallback for a function attribute, then take a
> look at other things.

The whole reason I brought this up is because Numba and SciPy are going to
need this to talk between themselves. And Travis has a hand in both of
those. One semi-standard will happen there one way or the other; I just
wanted it to be something that is really beneficial to Cython as well, and
allows Cython to be a player here. Obviously, a CyFunction is not what the
manually written C code in SciPy is going to play along with. And if that
happens anyway, why care about CyFunction?

Dag

From d.s.seljebotn at astro.uio.no  Fri May 18 10:30:14 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Fri, 18 May 2012 10:30:14 +0200
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To:
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de>
 <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no>
 <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz>
 <4FB55E24.3090006@astro.uio.no>
Message-ID: <4FB60896.4030702@astro.uio.no>

On 05/18/2012 12:57 AM, Nick Coghlan wrote:
> I think the main things we'd be looking for would be:
> - a clear explanation of why a new metaclass is considered too complex a
> solution
> - what the implications are for classes that have nothing to do with the
> SciPy/NumPy ecosystem
> - how subclassing would behave (both at the class and metaclass level)
>
> Yes, defining a new metaclass for fast signature exchange has its
> challenges - but it means that *our* concerns about maintaining
> consistent behaviour in the default object model and avoiding adverse
> effects on code that doesn't need the new behaviour are addressed
> automatically.
>
> Also, I'd consider a functioning reference implementation using a custom
> metaclass a requirement before we considered modifying type anyway, so I
> think that's the best thing to pursue next rather than a PEP. It also
> has the virtue of letting you choose which Python versions to target and
> iterating at a faster rate than CPython.

This seems right on target. I could make a utility code C header for such
a metaclass, and then the different libraries can all include it and
handshake on which implementation becomes the real one through sys.modules
during module initialization. That way an eventual PEP will only be a
natural incremental step to make things more polished, whether that
happens by making such a metaclass part of the standard library or by
extending PyTypeObject.

Thanks,
Dag

From robertwb at gmail.com  Sun May 20 10:07:42 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Sun, 20 May 2012 01:07:42 -0700
Subject: [Cython] [cython] Python array support (#113)
In-Reply-To: <4FB3D71F.3080602@behnel.de>
References: <4FB3D71F.3080602@behnel.de>
Message-ID:

On Wed, May 16, 2012 at 9:34 AM, Stefan Behnel wrote:
> Andreas van Cranenburgh, 16.05.2012 18:15:
>> Any news on this? Let me know if there's anything I can do to help
>> inclusion of this patch.
>
> Could someone please take over here?
>
> https://github.com/cython/cython/pull/113
>
> I haven't merged this yet and won't have the time to do it soonish. What
> I'd like to see happen is to get the current header file replaced by
> utility code "somehow". Not sure how that "somehow" is going to work.
>
> Basically, if this can be solved, I'd love to have it in for 0.17.
> Otherwise, well, not ...

I've merged this and fixed it up somewhat. I didn't realize anonymous
union members are a GNU (and now C11) extension, which this uses, so
there's that caveat.
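(The construct in question is roughly this, for anyone unfamiliar --
illustrative field names only:)

    /* an anonymous union member: accepted by GNU C and C11,
       but not by strict C89/C99 compilers */
    typedef struct {
        Py_ssize_t length;
        union {
            double *as_doubles;
            long   *as_longs;
        };              /* no member name: fields are accessed directly
                           as obj.as_doubles / obj.as_longs */
    } arraydata;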
I think it's good enough (and useful enough) to go in; we can continue to
improve things from here.

- Robert

From markflorisson88 at gmail.com  Sun May 20 16:03:09 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Sun, 20 May 2012 15:03:09 +0100
Subject: [Cython] gsoc: array expressions
Message-ID:

Hey,

For my gsoc we already have some simple initial ideas, i.e. elementwise
vector expressions (a + b with a and b arrays with arbitrary rank); I
don't think these need any discussion. However, there are a lot of things
that haven't been formally discussed on the mailing list, so here goes.

Frédéric, I am CCing you since you expressed interest on the numpy mailing
list, and I think your insights as a Theano developer can be very helpful
in this discussion.

User Interface
===========
Besides simple array expressions for dense arrays I would like a mechanism
for "custom ufuncs", although to a different extent to what Numpy or Numba
provide. There are several ways in which we could want them, e.g. as typed
(cdef, or external C) functions, as lambdas or Python functions in the
same module, or as general objects (e.g. functions Cython doesn't know
about). To achieve maximum efficiency it will likely be good to allow
sharing these functions in .pxd files. We have 'cdef inline' functions,
but I would prefer annotated def functions where the parameters are
specialized on demand, e.g.

    @elemental
    def add(a, b):
        # elemental functions can have any number of arguments and
        # operate on any compatible dtype
        return a + b

When calling cdef functions or elemental functions with memoryview
arguments, the arguments perform a (broadcasted) elementwise operation.
Alternatively, we can have a parallel.elementwise function which maps the
function elementwise, which would also work for object callables. I prefer
the former, since I think it will read much easier.
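For concreteness, a call would then look something like this (hypothetical
semantics; none of this exists yet):

    # 'add' is the @elemental function above
    def combine(double[:, :] x, double[:, :] y):
        cdef double[:, :] out = add(x, y)   # broadcasted, elementwise
        return out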
Optionally, we could store a pickled graph representation (or even compiled theano function?), and provide it as an optional specialization at runtime (but mapping back correctly to memoryviews where needed, etc). As Numba matures, a numba runtime specialization could optionally be provided. As for the implementation of the C specializations, I currently think we should implement our own, since theano heavily uses the numpy C API, and since its easier to have an optional theano runtime specialization anyway. I propose the following specializations, to be selected at runtime - vectorized contiguous, collapsed and aligned - this function can be called by a strided, inner dimension contiguous specialization - tiled (ndenumerate/nditerate) - tiled vectorized - plain C loops With 'aligned' it is not meant that the data itself should be aligned, but that they are aligned at the same (unaligned) offset. A runtime compiler could probably do much better here and allow for shuffling in the vectorized code for a minimal subset of the operands. Maybe it would be useful to provide a vectorized version where each operand is shuffled and the shuffle arguments are created up front? That might still be faster than non-vectorized... Anyway, the most important part would be tiling and proper memory access patterns. Which specialization is picked depends on a) which flags were passed to Cython, b) the runtime memory layout and c) what macros were defined when the Cython module was compiled. Criticism and constructive discussion welcome :) Cheers, Mark (heading out for lunch now) From markflorisson88 at gmail.com Sun May 20 16:10:53 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 20 May 2012 15:10:53 +0100 Subject: [Cython] gsoc: array expressions In-Reply-To: References: Message-ID: On 20 May 2012 15:03, mark florisson wrote: > Hey, > > For my gsoc we already have some simple initial ideas, i.e. > elementwise vector expressions (a + b with a and b arrays with > arbitrary rank), I don't think these need any discussion. However, > there are a lot of things that haven't been formally discussed on the > mailing list, so here goes. > > Fr?d?ric, I am CCing you since you expressed interest on the numpy > mailing list, and I think your insights as a Theano developer can be > very helpful in this discussion. > > User Interface > =========== > Besides simple array expressions for dense arrays I would like a > mechanism for "custom ufuncs", although to a different extent to what > Numpy or Numba provide. There are several ways in which we could want > them, e.g. as typed functions (cdef, or external C) functions, as > lambas or Python functions in the same module, or as general objects > (e.g. functions Cython doesn't know about). > To achieve maximum efficiency it will likely be good to allow sharing > these functions in .pxd files. We have 'cdef inline' functions, but I > would prefer annotated def functions where the parameters are > specialized on demand, e.g. > > @elemental > def add(a, b): # elemental functions can have any number of arguments > and operate on any compatible dtype > ? ?return a + b > > When calling cdef functions or elemental functions with memoryview > arguments, the arguments perform a (broadcasted) elementwise > operation. Alternatively, we can have a parallel.elementwise function > which maps the function elementwise, which would also work for object > callables. I prefer the former, since I think it will read much > easier. 
> > Secondly, we can have a reduce function (and maybe a scan function), > that reduce (respectively scan) in a specified axis or number of axes. > E.g. > > ? ?parallel.reduce(add, a, b, axis=(0, 2)) > > where the default for axis is "all axes". As for the default value, > this could be perhaps optionally provided to the elemental decorator. > Otherwise, the reducer will have to get the default values from each > dimension that is reduced in, and then skip those values when > reducing. (Of course, the reducer function must be associate and > commutative). Also, a lambda could be passed in instead of an > elementwise or typed cdef function. > > Finally, we would have a parallel.nditer/ndenumerate/nditerate > function, which would iterate over N memoryviews, and provide a > sensible memory access pattern (like numpy.nditer). I'm not sure if it > should provide only the indices, or also the values. e.g. an inplace > elementwise add would read as follows: > > ? ?for i, j, k in parallel.nditerate(A, B): > ? ? ? ?A[i, j, k] += B[i, j, k] > > Implementation > =========== > Fr?d?ric, feel free to correct me at any point here :) > > As for the implementation, I think it will be a good idea to at least > reuse (optionally through command line flags) Theano's optimization > pipeline. I think it would be reasonably easy to build a Theano > expression graph (after fusing the expressions in Cython first), run > the Theano optimizations on that and map back to a Cython AST. > Optionally, we could store a pickled graph representation (or even > compiled theano function?), and provide it as an optional > specialization at runtime (but mapping back correctly to memoryviews > where needed, etc). As Numba matures, a numba runtime specialization > could optionally be provided. > > As for the implementation of the C specializations, I currently think > we should implement our own, since theano heavily uses the numpy C > API, and since its easier to have an optional theano runtime > specialization anyway. I propose the following specializations, to be > selected at runtime > > ? ?- vectorized contiguous, collapsed and aligned > ? ? ? ?- this function can be called by a strided, inner dimension > contiguous specialization > ? ?- tiled (ndenumerate/nditerate) > ? ?- tiled vectorized > ? ?- plain C loops > > With 'aligned' it is not meant that the data itself should be aligned, > but that they are aligned at the same (unaligned) offset. > A runtime compiler could probably do much better here and allow for > shuffling in the vectorized code for a minimal subset of the operands. > Maybe it would be useful to provide a vectorized version where each > operand is shuffled and the shuffle arguments are created up front? > That might still be faster than non-vectorized... Anyway, the most > important part would be tiling and proper memory access patterns. > > Which specialization is picked depends on a) which flags were passed > to Cython, b) the runtime memory layout and c) what macros were > defined when the Cython module was compiled. > > Criticism and constructive discussion welcome :) > > Cheers, > > Mark > (heading out for lunch now) This does not address code reuse yet, I believe Theano does not implement explicit vectorization or tiling for the CPU, correct? From that perspective, it makes more sense to implement this in Theano and reuse the code generation part (although it seems Theano is an even a larger beast than Cython). 
I sincerely believe that the larger part here would be code generation, so if another code backend is targeted (e.g. Numba), it may make more sense for such a project to reuse Theano's optimization pipeline, but not the code generation backend. From pav at iki.fi Sun May 20 22:59:58 2012 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 20 May 2012 22:59:58 +0200 Subject: [Cython] N-d arrays, without a Python object Message-ID: Hi, # OK, but slow cdef double[:,:] a = np.zeros((10, 10), float) # OK, fast (no Python object) cdef double[10] a # OK, but slow, makes Python calls (-> cython.view array) cdef double[10*10] a_ cdef double[:,:] a = (a_) # not allowed cdef double[10,10] a Small N-d work arrays are quite often needed in numerical code, and I'm not aware of a way for conveniently getting them in Cython. Maybe the recently added improved memoryviews could allow for Python-less N-dim arrays? This may be reinveinting a certain language, but such a feature would find immediate practical use. -- Pauli Virtanen From markflorisson88 at gmail.com Sun May 20 23:19:49 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Sun, 20 May 2012 22:19:49 +0100 Subject: [Cython] N-d arrays, without a Python object In-Reply-To: References: Message-ID: On 20 May 2012 21:59, Pauli Virtanen wrote: > Hi, > > > ? ? ? ?# OK, but slow > ? ? ? ?cdef double[:,:] a = np.zeros((10, 10), float) > > ? ? ? ?# OK, fast (no Python object) > ? ? ? ?cdef double[10] a > > ? ? ? ?# OK, but slow, makes Python calls (-> cython.view array) > ? ? ? ?cdef double[10*10] a_ > ? ? ? ?cdef double[:,:] a = (a_) > > ? ? ? ?# not allowed > ? ? ? ?cdef double[10,10] a > > Small N-d work arrays are quite often needed in numerical code, and I'm > not aware of a way for conveniently getting them in Cython. > > Maybe the recently added improved memoryviews could allow for > Python-less N-dim arrays? This may be reinveinting a certain language, > but such a feature would find immediate practical use. > > -- > Pauli Virtanen > > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Hey Pauli, Thanks for the feedback, that's actually really something I wanted as well, along with variable sized C arrays while we're at it. We think we can definitely make this work, probably with a syntax like 'cdef double[:10, :10] myview' for memoryviews. I'm not sure when I'll have the time to implement this, as I'm first going to focus on the gsoc, so I can't promise anything for 0.17. Cheers, Mark From d.s.seljebotn at astro.uio.no Mon May 21 12:34:40 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 21 May 2012 12:34:40 +0200 Subject: [Cython] gsoc: array expressions In-Reply-To: References: Message-ID: <4FBA1A40.2060202@astro.uio.no> On 05/20/2012 04:03 PM, mark florisson wrote: > Hey, > > For my gsoc we already have some simple initial ideas, i.e. > elementwise vector expressions (a + b with a and b arrays with > arbitrary rank), I don't think these need any discussion. However, > there are a lot of things that haven't been formally discussed on the > mailing list, so here goes. > > Fr?d?ric, I am CCing you since you expressed interest on the numpy > mailing list, and I think your insights as a Theano developer can be > very helpful in this discussion. 
From d.s.seljebotn at astro.uio.no Mon May 21 12:34:40 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Mon, 21 May 2012 12:34:40 +0200
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References:
Message-ID: <4FBA1A40.2060202@astro.uio.no>

On 05/20/2012 04:03 PM, mark florisson wrote:
> Hey,
>
> For my gsoc we already have some simple initial ideas, i.e.
> elementwise vector expressions (a + b with a and b arrays with
> arbitrary rank), I don't think these need any discussion. However,
> there are a lot of things that haven't been formally discussed on the
> mailing list, so here goes.
>
> Frédéric, I am CCing you since you expressed interest on the numpy
> mailing list, and I think your insights as a Theano developer can be
> very helpful in this discussion.
>
> User Interface
> ===========
> Besides simple array expressions for dense arrays I would like a
> mechanism for "custom ufuncs", although to a different extent to what
> Numpy or Numba provide. There are several ways in which we could want
> them, e.g. as typed (cdef, or external C) functions, as
> lambdas or Python functions in the same module, or as general objects
> (e.g. functions Cython doesn't know about).
> To achieve maximum efficiency it will likely be good to allow sharing
> these functions in .pxd files. We have 'cdef inline' functions, but I
> would prefer annotated def functions where the parameters are
> specialized on demand, e.g.
>
> @elemental
> def add(a, b):  # elemental functions can have any number of arguments
>                 # and operate on any compatible dtype
>     return a + b
>
> When calling cdef functions or elemental functions with memoryview
> arguments, the arguments perform a (broadcasted) elementwise
> operation. Alternatively, we can have a parallel.elementwise function
> which maps the function elementwise, which would also work for object
> callables. I prefer the former, since I think it will read much
> easier.
>
> Secondly, we can have a reduce function (and maybe a scan function),
> that reduces (respectively scans) in a specified axis or number of axes.
[...]
> (Of course, the reducer function must be associative and
> commutative). Also, a lambda could be passed in instead of an

Only associative, right?

Sounds good to me.

> elementwise or typed cdef function.
>
> Finally, we would have a parallel.nditer/ndenumerate/nditerate
> function, which would iterate over N memoryviews, and provide a
> sensible memory access pattern (like numpy.nditer). I'm not sure if it
> should provide only the indices, or also the values. E.g. an in-place
> elementwise add would read as follows:
>
>     for i, j, k in parallel.nditerate(A, B):
>         A[i, j, k] += B[i, j, k]

I think this sounds good; I guess I don't see a particular reason for
"ndenumerate", I think code like the above is clearer.

It's perhaps worth at least thinking about how to support "for idx in
...", "A[idx[2], Ellipsis] = ...", i.e. an arbitrary number of
dimensions. Not in the first iteration though.

Putting it in "parallel" is nice because prange already has
out-of-order semantics.... But of course, there are performance
benefits even within a single thread because of the out-of-order
aspect. This should at least be a big NOTE box in the documentation.

> Implementation
> ===========
> Frédéric, feel free to correct me at any point here :)
>
> As for the implementation, I think it will be a good idea to at least
> reuse (optionally through command line flags) Theano's optimization
> pipeline. I think it would be reasonably easy to build a Theano
> expression graph (after fusing the expressions in Cython first), run
> the Theano optimizations on that and map back to a Cython AST.
> Optionally, we could store a pickled graph representation (or even a
> compiled theano function?), and provide it as an optional
> specialization at runtime (but mapping back correctly to memoryviews
> where needed, etc). As Numba matures, a numba runtime specialization
> could optionally be provided.

Can you enlighten us a bit about what Theano's optimizations involve?
You mention doing the iteration specializations yourself below, and
also the tiling...

Is it just "scalar" optimizations of the form "x**3 -> x * x * x" and
numeric stabilization like "log(1 + x) -> log1p(x)" that would be
provided by Theano?

If so, such optimizations should be done also for our scalar
computations, not just vector, right?

Or does Theano deal with the memory access patterns?

> As for the implementation of the C specializations, I currently think
> we should implement our own [...]
> With 'aligned' it is not meant that the data itself should be aligned,
> but that the operands are aligned at the same (unaligned) offset.

Sounds good.

About implementing this in Cython, would you simply turn

a[...] = b + c

into

for i, j, k in parallel.nditerate(a, b, c):
    a[i, j, k] = b[i, j, k] + c[i, j, k]

and then focus on optimizing nditerate? That seemed like the logical
path to me, but perhaps that doesn't play nicely with using Theano to
optimize the expression? (Again I need a clearer picture of what this
involves.)

(Yes, AMD and I guess probably Intel too seem to have moved towards
making unaligned MOV as fast as aligned MOV, so no need to worry about
that.)

> A runtime compiler could probably do much better here and allow for
> shuffling in the vectorized code for a minimal subset of the operands.
[...]

Dag
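For readers following the @elemental idea quoted above, a rough
pure-Python stand-in for the intended call behaviour, leaning on
NumPy's broadcasting (elemental itself is proposed, not an existing
Cython API):

    import numpy as np

    def elemental(func):
        # Stand-in only: NumPy operators already broadcast elementwise,
        # so the decorator can pass the function through unchanged.
        # Cython's version would instead compile specialized C loops
        # per argument dtype.
        return func

    @elemental
    def add(a, b):
        return a + b

    a = np.arange(12.0).reshape(3, 4)
    b = np.arange(4.0)          # broadcast against the last axis of a
    print(add(a, b).shape)      # -> (3, 4)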
From markflorisson88 at gmail.com Mon May 21 12:56:10 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 21 May 2012 11:56:10 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To: <4FBA1A40.2060202@astro.uio.no>
References: <4FBA1A40.2060202@astro.uio.no>
Message-ID:

On 21 May 2012 11:34, Dag Sverre Seljebotn wrote:
> On 05/20/2012 04:03 PM, mark florisson wrote:
[...]
>> (Of course, the reducer function must be associative and
>> commutative).
>
> Only associative, right?
>
> Sounds good to me.

Ah, I guess, because we can reduce thread-local results manually in a
specified (elementwise) order (I was thinking of generating
OpenMP-annotated loops, that can be enabled/disabled at the C level,
with an 'if' clause with a sensible lower bound of iterations
required).
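The kind of OpenMP-backed reduction Mark describes is already
expressible with Cython's prange, which also shows why associativity is
the real requirement; a minimal sketch:

    from cython.parallel import prange

    def total(double[:] a):
        cdef double s = 0
        cdef Py_ssize_t i
        # Cython treats the in-place += as a reduction: each thread
        # accumulates a private partial sum, and the partials are
        # combined when the loop ends, in no guaranteed order. That
        # reassociation is only valid because '+' is associative
        # (up to floating-point rounding).
        for i in prange(a.shape[0], nogil=True):
            s += a[i]
        return s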
> It's perhaps worth at least thinking about how to support "for idx in
> ...", "A[idx[2], Ellipsis] = ...", i.e. an arbitrary number of
> dimensions. Not in the first iteration though.

Yeah, definitely.

[...]
> Can you enlighten us a bit about what Theano's optimizations involve?
> You mention doing the iteration specializations yourself below, and
> also the tiling...
>
> Is it just "scalar" optimizations of the form "x**3 -> x * x * x" and
> numeric stabilization like "log(1 + x) -> log1p(x)" that would be
> provided by Theano?

Yes, it does those kinds of things, and it also eliminates common
subexpressions, and it transforms certain expressions to BLAS/LAPACK
functionality. I'm not sure we want that specifically. I'm thinking it
might be more fruitful to start off with a theano-only specialization,
and implement low-level code generation in Theano, and use that from
Cython by either directly dumping in the code, or deferring that to
Theano. At this point I'm not entirely sure.

> If so, such optimizations should be done also for our scalar
> computations, not just vector, right?
>
> Or does Theano deal with the memory access patterns?

I think it does so for the CUDA backend, but not for the C++ backend.
I think we need to discuss this stuff on the theano mailing list.

[...]

From markflorisson88 at gmail.com Mon May 21 13:08:33 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 21 May 2012 12:08:33 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To: <4FBA1A40.2060202@astro.uio.no>
References: <4FBA1A40.2060202@astro.uio.no>
Message-ID:

On 21 May 2012 11:34, Dag Sverre Seljebotn wrote:
> On 05/20/2012 04:03 PM, mark florisson wrote:
[...]
> If so, such optimizations should be done also for our scalar
> computations, not just vector, right?

As for this, I don't think CSE is important for scalar computations,
since if they are objects, you have to go through a Python layer since
you have no idea what the code does, and if they are C objects, the C
compiler will readily optimize that out. You want this for vector
expressions, since the computations may be expensive and the C compiler
may not optimize them. E.g. consider the silly example of
v1 * sum(A) + v2 * sum(A). It's more convenient to write than to
introduce a new variable manually.
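Spelled out with NumPy arrays standing in for the memoryviews (all
names here are illustrative), the hand-hoisted form Mark means is:

    import numpy as np

    A = np.random.rand(200, 200)
    v1 = np.random.rand(200)
    v2 = np.random.rand(200)

    # As written: sum(A) is evaluated twice unless something
    # eliminates the common subexpression.
    r_naive = v1 * A.sum() + v2 * A.sum()

    # With a manual temporary (what CSE would produce): one evaluation.
    s = A.sum()
    r_cse = v1 * s + v2 * s

    assert np.allclose(r_naive, r_cse)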
> About implementing this in Cython, would you simply turn
>
> a[...] = b + c
>
> into
>
> for i, j, k in parallel.nditerate(a, b, c):
>     a[i, j, k] = b[i, j, k] + c[i, j, k]
>
> and then focus on optimizing nditerate? That seemed like the logical
> path to me, but perhaps that doesn't play nicely with using Theano to
> optimize the expression? (Again I need a clearer picture of what this
> involves.)

I don't think so, since I think we want to try explicit vectorization.
I think the iteration and tiling mechanism etc. will be shared by both,
but we wouldn't provide that direct mapping (I think). Although maybe
we could insert a VectorAssignment with VectorScalars... let's see; in
any case both will be optimized in the same way, except that with
vector expressions you know you're always getting the best the compiler
can do.

[...]

From markflorisson88 at gmail.com Mon May 21 13:11:49 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 21 May 2012 12:11:49 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no>
Message-ID:

On 21 May 2012 12:08, mark florisson wrote:
> On 21 May 2012 11:34, Dag Sverre Seljebotn wrote:
>> On 05/20/2012 04:03 PM, mark florisson wrote:
[...]
> I don't think so, since I think we want to try explicit vectorization.
> I think the iteration and tiling mechanism etc. will be shared by
> both, but we wouldn't provide that direct mapping (I think). Although
> maybe we could insert a VectorAssignment with VectorScalars... let's
> see; in any case both will be optimized in the same way, except that
> with vector expressions you know you're always getting the best the
> compiler can do.

(Where of course the nditerate loop striding pattern would be patched
accordingly.)

[...]

From d.s.seljebotn at astro.uio.no Mon May 21 13:14:26 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Mon, 21 May 2012 13:14:26 +0200
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no>
Message-ID: <4FBA2392.2010105@astro.uio.no>

On 05/21/2012 12:56 PM, mark florisson wrote:
> On 21 May 2012 11:34, Dag Sverre Seljebotn wrote:
>> On 05/20/2012 04:03 PM, mark florisson wrote:
[...]
>> Can you enlighten us a bit about what Theano's optimizations
>> involve? You mention doing the iteration specializations yourself
>> below, and also the tiling...
>>
>> Is it just "scalar" optimizations of the form "x**3 -> x * x * x"
>> and numeric stabilization like "log(1 + x) -> log1p(x)" that would
>> be provided by Theano?
>
> Yes, it does those kinds of things, and it also eliminates common
> subexpressions, and it transforms certain expressions to BLAS/LAPACK
> functionality. I'm not sure we want that specifically. I'm thinking
> it might be more fruitful to start off with a theano-only
> specialization, and implement low-level code generation in Theano,
> and use that from Cython by either directly dumping in the code, or
> deferring that to Theano. At this point I'm not entirely sure.

Still, if this is all Theano provides, I question structuring the
project around reusing Theano. It's the sort of thing that is
nice-to-have but not fundamental (like memory access patterns).

Put another way, it sounds like Theano could easily be made an optional
dependency currently.

Another question is of course whether it is better to work on Theano to
implement tiling etc. for the CPU (and even compile all the
specializations and select between them).

You could perhaps even have Theano use PEP 3118 rather than NumPy too.

I guess I should subscribe to the Theano list.

Dag

From markflorisson88 at gmail.com Mon May 21 13:21:33 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 21 May 2012 12:21:33 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To: <4FBA2392.2010105@astro.uio.no>
References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On 21 May 2012 12:14, Dag Sverre Seljebotn wrote:
> On 05/21/2012 12:56 PM, mark florisson wrote:
[...]
> Still, if this is all Theano provides, I question structuring the
> project around reusing Theano. It's the sort of thing that is
> nice-to-have but not fundamental (like memory access patterns).
>
> Put another way, it sounds like Theano could easily be made an
> optional dependency currently.
>
> Another question is of course whether it is better to work on Theano
> to implement tiling etc. for the CPU (and even compile all the
> specializations and select between them).

Indeed, my initial idea was to make it optional, but since Theano would
also give execution on the GPU as well as generate BLAS calls where
appropriate, it makes sense to push for making it optimal for the CPU
as well, memory-access-wise and vectorization-wise.

> You could perhaps even have Theano use PEP 3118 rather than NumPy too.
>
> I guess I should subscribe to the Theano list.

Yeah, I've been digging a bit through the code, I'm not sure yet how
much work that would require, but I think it is non-trivial.

From robertwb at gmail.com Tue May 22 08:11:55 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 21 May 2012 23:11:55 -0700
Subject: [Cython] gsoc: array expressions
In-Reply-To: <4FBA1A40.2060202@astro.uio.no>
References: <4FBA1A40.2060202@astro.uio.no>
Message-ID:

On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn wrote:
> On 05/20/2012 04:03 PM, mark florisson wrote:
[...]
>> Finally, we would have a parallel.nditer/ndenumerate/nditerate
>> function, which would iterate over N memoryviews, and provide a
>> sensible memory access pattern (like numpy.nditer). I'm not sure if
>> it should provide only the indices, or also the values. E.g. an
>> in-place elementwise add would read as follows:
>>
>>     for i, j, k in parallel.nditerate(A, B):
>>         A[i, j, k] += B[i, j, k]
>
> I think this sounds good; I guess I don't see a particular reason for
> "ndenumerate", I think code like the above is clearer.

I'm assuming the index computations would not be re-done in this case
(i.e. there's more magic going on here than looks like at first
glance)? Otherwise there is an advantage to ndenumerate.

- Robert

From d.s.seljebotn at astro.uio.no Tue May 22 08:36:25 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Tue, 22 May 2012 08:36:25 +0200
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no>
Message-ID: <4FBB33E9.1090707@astro.uio.no>

On 05/22/2012 08:11 AM, Robert Bradshaw wrote:
> On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn wrote:
>> On 05/20/2012 04:03 PM, mark florisson wrote:
[...]
>> I think this sounds good; I guess I don't see a particular reason
>> for "ndenumerate", I think code like the above is clearer.
>
> I'm assuming the index computations would not be re-done in this case
> (i.e. there's more magic going on here than looks like at first
> glance)? Otherwise there is an advantage to ndenumerate.

Ideally, there is a lot more magic going on, though I don't know how
far Mark wants to go.

Imagine "nditerate(A, A.T)"; in that case it would have to make many
small tiles so that for each tile being processed, A has a tile in
cache and A.T has another tile in cache (so that one doesn't waste
cache line transfers).

So those array lookups would potentially look up in different memory
buffers, with the strides known at compile time.

Which begs the question: What about this body?

if i < 100:
    continue
else:
    A[i, j, k] += B[i - 100, j, k]

I guess just fall back to a non-tiled version? One could of course do
some shifting of which tiles of B to grab etc., but there's a limit to
how smart one should try to be; one could emit a warning and say that
one should slice and dice the memoryviews into shape before they are
passed to nditerate.

Dag
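A sketch of the tiled access pattern being described, written as plain
Cython for the out = A + A.T case; the tile size and names are
illustrative, and this is hand-written code, not what the compiler
would emit:

    cimport cython

    DEF TILE = 32   # tile edge; two TILE x TILE double tiles fit in L1

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def add_transpose(double[:, :] A, double[:, :] out):
        cdef Py_ssize_t n = A.shape[0]
        cdef Py_ssize_t bi, bj, i, j, imax, jmax
        for bi in range(0, n, TILE):
            imax = bi + TILE
            if imax > n: imax = n
            for bj in range(0, n, TILE):
                jmax = bj + TILE
                if jmax > n: jmax = n
                # Within one tile, A is walked along rows and A.T
                # (i.e. A[j, i]) along columns, but both footprints
                # stay inside a TILE x TILE block, so each fetched
                # cache line is fully reused before eviction.
                for i in range(bi, imax):
                    for j in range(bj, jmax):
                        out[i, j] = A[i, j] + A[j, i]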
From robertwb at gmail.com Tue May 22 08:48:23 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Mon, 21 May 2012 23:48:23 -0700
Subject: [Cython] gsoc: array expressions
In-Reply-To: <4FBB33E9.1090707@astro.uio.no>
References: <4FBA1A40.2060202@astro.uio.no> <4FBB33E9.1090707@astro.uio.no>
Message-ID:

On Mon, May 21, 2012 at 11:36 PM, Dag Sverre Seljebotn wrote:
> On 05/22/2012 08:11 AM, Robert Bradshaw wrote:
[...]
> Ideally, there is a lot more magic going on, though I don't know how
> far Mark wants to go.
>
> Imagine "nditerate(A, A.T)"; in that case it would have to make many
> small tiles so that for each tile being processed, A has a tile in
> cache and A.T has another tile in cache (so that one doesn't waste
> cache line transfers).
>
> So those array lookups would potentially look up in different memory
> buffers, with the strides known at compile time.

Yes, being clever about the order in which to iterate over the indices
is the hard problem to solve here. I was thinking more in terms of the
inner loop iterating over the innermost dimension only to do the
indexing (retrieval and assignment), similar to how the generic NumPy
iterator works.

> Which begs the question: What about this body?
>
> if i < 100:
>     continue
> else:
>     A[i, j, k] += B[i - 100, j, k]
>
> I guess just fall back to a non-tiled version? One could of course do
> some shifting of which tiles of B to grab etc., but there's a limit to
> how smart one should try to be; one could emit a warning and say that
> one should slice and dice the memoryviews into shape before they are
> passed to nditerate.

Linear transformations of the index variables could probably be
handled, but that's certainly not v1 (and not too difficult for the
user to express manually).

- Robert
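The NumPy iterator behaviour Robert refers to can be seen directly:
with the external_loop flag, np.nditer hands out whole innermost runs,
so the stride arithmetic is amortized over each run rather than redone
per element (standard numpy.nditer usage, available since NumPy 1.6):

    import numpy as np

    a = np.arange(12.0).reshape(3, 4)
    b = np.ones((3, 4))

    # x and y arrive as 1-D chunks along the best axis.
    it = np.nditer([a, b], flags=['external_loop'],
                   op_flags=[['readwrite'], ['readonly']])
    for x, y in it:
        x[...] += y     # in-place add, one chunk at a time

    print(a[0])         # -> [ 1.  2.  3.  4.]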
We have 'cdef inline' functions, but I >>>>> would prefer annotated def functions where the parameters are >>>>> specialized on demand, e.g. >>>>> >>>>> @elemental >>>>> def add(a, b): # elemental functions can have any number of arguments >>>>> and operate on any compatible dtype >>>>> return a + b >>>>> >>>>> When calling cdef functions or elemental functions with memoryview >>>>> arguments, the arguments perform a (broadcasted) elementwise >>>>> operation. Alternatively, we can have a parallel.elementwise function >>>>> which maps the function elementwise, which would also work for object >>>>> callables. I prefer the former, since I think it will read much >>>>> easier. >>>>> >>>>> Secondly, we can have a reduce function (and maybe a scan function), >>>>> that reduce (respectively scan) in a specified axis or number of axes. >>>>> E.g. >>>>> >>>>> parallel.reduce(add, a, b, axis=(0, 2)) >>>>> >>>>> where the default for axis is "all axes". As for the default value, >>>>> this could be perhaps optionally provided to the elemental decorator. >>>>> Otherwise, the reducer will have to get the default values from each >>>>> dimension that is reduced in, and then skip those values when >>>>> reducing. (Of course, the reducer function must be associate and >>>>> commutative). Also, a lambda could be passed in instead of an >>>> >>>> >>>> >>>> Only associative, right? >>>> >>>> Sounds good to me. >>>> >>>> >>>>> elementwise or typed cdef function. >>>>> >>>>> Finally, we would have a parallel.nditer/ndenumerate/nditerate >>>>> function, which would iterate over N memoryviews, and provide a >>>>> sensible memory access pattern (like numpy.nditer). I'm not sure if it >>>>> should provide only the indices, or also the values. e.g. an inplace >>>>> elementwise add would read as follows: >>>>> >>>>> for i, j, k in parallel.nditerate(A, B): >>>>> A[i, j, k] += B[i, j, k] >>>> >>>> >>>> >>>> >>>> I think this sounds good; I guess don't see a particular reason for >>>> "ndenumerate", I think code like the above is clearer. >>> >>> >>> I'm assuming the index computations would not be re-done in this case >>> (i.e. there's more magic going on here than looks like at first >>> glance)? Otherwise there is an advantage to ndenumerate. >> >> >> Ideally, there is a lot more magic going on, though I don't know how far >> Mark wants to go. >> >> Imagine "nditerate(A, A.T)", in that case it would have to make many small >> tiles so that for each tile being processed, A has a tile in cache and A.T >> has another tile in cache (so that one doesn't waste cache line transfers). >> >> So those array lookups would potentially look up in different memory >> buffers, with the strides known at compile time. > > Yes, being clever about the order in which to iterate over the indices > is the hard problem to solve here. I was thinking more in terms of the > inner loop iterating over the innermost dimension only to do the > indexing (retrieval and assignment), similar to how the generic NumPy > iterator works. The point isn't only being clever about the *order*...you need "copy-in, copy-out". The point is that the NumPy iterator is not good enough (for out-of-cache situations). Since you grab a cache line (64 bytes) each time from main memory, a plain NumPy broadcasted iterator throws away a lot of memory for "A + A.T", since for ndim>1 there's NO iteration order which isn't bad (for instance, you could iterate in the order of A, and the result would be that for each element of A.T you fetch there is 64 bytes transferred). 
So the solution is to copy A.T block-wise to a temporary scratch space in cache so that you use all the elements in the cache line before throwing it out of cache. In C, I've seen a simple blocking transpose operation be over four times faster than the brute-force transpose for this reason. Dag > >> Which begs the question: What about this body? >> >> if i< 100: >> continue >> else: >> A[i, j, k] += B[i - 100, j, k] >> >> I guess just fall back to a non-tiled version? One could of course do some >> shifting of which tiles of B to grab etc., but there's a limit to how smart >> one should try to be; one could emit a warning and say that one should slice >> and dice the memoryviews into shape before they are passed to nditerate. > > Linear transformations of the index variables could probably be > handled, but that's certainly not v1 (and not too difficult for the > user to express manually). > > - Robert > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From d.s.seljebotn at astro.uio.no Tue May 22 08:57:51 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 22 May 2012 08:57:51 +0200 Subject: [Cython] gsoc: array expressions In-Reply-To: <4FBB38C0.4030005@astro.uio.no> References: <4FBA1A40.2060202@astro.uio.no> <4FBB33E9.1090707@astro.uio.no> <4FBB38C0.4030005@astro.uio.no> Message-ID: <4FBB38EF.9010104@astro.uio.no> On 05/22/2012 08:57 AM, Dag Sverre Seljebotn wrote: > On 05/22/2012 08:48 AM, Robert Bradshaw wrote: >> On Mon, May 21, 2012 at 11:36 PM, Dag Sverre Seljebotn >> wrote: >>> On 05/22/2012 08:11 AM, Robert Bradshaw wrote: >>>> >>>> On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn >>>> wrote: >>>>> >>>>> On 05/20/2012 04:03 PM, mark florisson wrote: >>>>>> >>>>>> >>>>>> Hey, >>>>>> >>>>>> For my gsoc we already have some simple initial ideas, i.e. >>>>>> elementwise vector expressions (a + b with a and b arrays with >>>>>> arbitrary rank), I don't think these need any discussion. However, >>>>>> there are a lot of things that haven't been formally discussed on the >>>>>> mailing list, so here goes. >>>>>> >>>>>> Fr?d?ric, I am CCing you since you expressed interest on the numpy >>>>>> mailing list, and I think your insights as a Theano developer can be >>>>>> very helpful in this discussion. >>>>>> >>>>>> User Interface >>>>>> =========== >>>>>> Besides simple array expressions for dense arrays I would like a >>>>>> mechanism for "custom ufuncs", although to a different extent to what >>>>>> Numpy or Numba provide. There are several ways in which we could want >>>>>> them, e.g. as typed functions (cdef, or external C) functions, as >>>>>> lambas or Python functions in the same module, or as general objects >>>>>> (e.g. functions Cython doesn't know about). >>>>>> To achieve maximum efficiency it will likely be good to allow sharing >>>>>> these functions in .pxd files. We have 'cdef inline' functions, but I >>>>>> would prefer annotated def functions where the parameters are >>>>>> specialized on demand, e.g. >>>>>> >>>>>> @elemental >>>>>> def add(a, b): # elemental functions can have any number of arguments >>>>>> and operate on any compatible dtype >>>>>> return a + b >>>>>> >>>>>> When calling cdef functions or elemental functions with memoryview >>>>>> arguments, the arguments perform a (broadcasted) elementwise >>>>>> operation. 
Alternatively, we can have a parallel.elementwise function >>>>>> which maps the function elementwise, which would also work for object >>>>>> callables. I prefer the former, since I think it will read much >>>>>> easier. >>>>>> >>>>>> Secondly, we can have a reduce function (and maybe a scan function), >>>>>> that reduce (respectively scan) in a specified axis or number of >>>>>> axes. >>>>>> E.g. >>>>>> >>>>>> parallel.reduce(add, a, b, axis=(0, 2)) >>>>>> >>>>>> where the default for axis is "all axes". As for the default value, >>>>>> this could be perhaps optionally provided to the elemental decorator. >>>>>> Otherwise, the reducer will have to get the default values from each >>>>>> dimension that is reduced in, and then skip those values when >>>>>> reducing. (Of course, the reducer function must be associate and >>>>>> commutative). Also, a lambda could be passed in instead of an >>>>> >>>>> >>>>> >>>>> Only associative, right? >>>>> >>>>> Sounds good to me. >>>>> >>>>> >>>>>> elementwise or typed cdef function. >>>>>> >>>>>> Finally, we would have a parallel.nditer/ndenumerate/nditerate >>>>>> function, which would iterate over N memoryviews, and provide a >>>>>> sensible memory access pattern (like numpy.nditer). I'm not sure >>>>>> if it >>>>>> should provide only the indices, or also the values. e.g. an inplace >>>>>> elementwise add would read as follows: >>>>>> >>>>>> for i, j, k in parallel.nditerate(A, B): >>>>>> A[i, j, k] += B[i, j, k] >>>>> >>>>> >>>>> >>>>> >>>>> I think this sounds good; I guess don't see a particular reason for >>>>> "ndenumerate", I think code like the above is clearer. >>>> >>>> >>>> I'm assuming the index computations would not be re-done in this case >>>> (i.e. there's more magic going on here than looks like at first >>>> glance)? Otherwise there is an advantage to ndenumerate. >>> >>> >>> Ideally, there is a lot more magic going on, though I don't know how far >>> Mark wants to go. >>> >>> Imagine "nditerate(A, A.T)", in that case it would have to make many >>> small >>> tiles so that for each tile being processed, A has a tile in cache >>> and A.T >>> has another tile in cache (so that one doesn't waste cache line >>> transfers). >>> >>> So those array lookups would potentially look up in different memory >>> buffers, with the strides known at compile time. >> >> Yes, being clever about the order in which to iterate over the indices >> is the hard problem to solve here. I was thinking more in terms of the >> inner loop iterating over the innermost dimension only to do the >> indexing (retrieval and assignment), similar to how the generic NumPy >> iterator works. > > The point isn't only being clever about the *order*...you need "copy-in, > copy-out". > > The point is that the NumPy iterator is not good enough (for > out-of-cache situations). Since you grab a cache line (64 bytes) each > time from main memory, a plain NumPy broadcasted iterator throws away a > lot of memory for "A + A.T", since for ndim>1 there's NO iteration order > which isn't bad (for instance, you could iterate in the order of A, and > the result would be that for each element of A.T you fetch there is 64 > bytes transferred). I meant, "throws away a lot of memory *bandwidth*". Dag > > So the solution is to copy A.T block-wise to a temporary scratch space > in cache so that you use all the elements in the cache line before > throwing it out of cache. 
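(To spell out the blocked pattern being described, here is a minimal C sketch; the tile size and the function name are illustrative assumptions, tuned in practice to the cache size:)

#include <stddef.h>

#define TILE 32  /* assumption: an input tile plus an output tile fit in L1 */

/* Transpose an n-by-m row-major matrix tile by tile, so that both the
   reads from `in` and the writes to `out` use every element of each
   cache line they pull in before it is evicted. */
static void blocked_transpose(const double *in, double *out,
                              size_t n, size_t m)
{
    size_t i, j, ii, jj, imax, jmax;
    for (i = 0; i < n; i += TILE) {
        for (j = 0; j < m; j += TILE) {
            imax = (i + TILE < n) ? i + TILE : n;
            jmax = (j + TILE < m) ? j + TILE : m;
            for (ii = i; ii < imax; ii++)
                for (jj = j; jj < jmax; jj++)
                    out[jj * n + ii] = in[ii * m + jj];
        }
    }
}

The same copy-in, copy-out pattern generalizes to expressions like A + A.T: bring a tile of the badly-strided operand into scratch space, operate on the tile, and write it out.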
> > In C, I've seen a simple blocking transpose operation be over four times > faster than the brute-force transpose for this reason. > > Dag > >> >>> Which begs the question: What about this body? >>> >>> if i< 100: >>> continue >>> else: >>> A[i, j, k] += B[i - 100, j, k] >>> >>> I guess just fall back to a non-tiled version? One could of course do >>> some >>> shifting of which tiles of B to grab etc., but there's a limit to how >>> smart >>> one should try to be; one could emit a warning and say that one >>> should slice >>> and dice the memoryviews into shape before they are passed to nditerate. >> >> Linear transformations of the index variables could probably be >> handled, but that's certainly not v1 (and not too difficult for the >> user to express manually). >> >> - Robert >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > From robertwb at gmail.com Tue May 22 09:06:15 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Tue, 22 May 2012 00:06:15 -0700 Subject: [Cython] gsoc: array expressions In-Reply-To: <4FBB38C0.4030005@astro.uio.no> References: <4FBA1A40.2060202@astro.uio.no> <4FBB33E9.1090707@astro.uio.no> <4FBB38C0.4030005@astro.uio.no> Message-ID: On Mon, May 21, 2012 at 11:57 PM, Dag Sverre Seljebotn wrote: > On 05/22/2012 08:48 AM, Robert Bradshaw wrote: >> >> On Mon, May 21, 2012 at 11:36 PM, Dag Sverre Seljebotn >> ?wrote: >>> >>> On 05/22/2012 08:11 AM, Robert Bradshaw wrote: >>>> >>>> >>>> On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn >>>> ? ?wrote: >>>>> >>>>> >>>>> On 05/20/2012 04:03 PM, mark florisson wrote: >>>>>> >>>>>> >>>>>> >>>>>> Hey, >>>>>> >>>>>> For my gsoc we already have some simple initial ideas, i.e. >>>>>> elementwise vector expressions (a + b with a and b arrays with >>>>>> arbitrary rank), I don't think these need any discussion. However, >>>>>> there are a lot of things that haven't been formally discussed on the >>>>>> mailing list, so here goes. >>>>>> >>>>>> Fr?d?ric, I am CCing you since you expressed interest on the numpy >>>>>> mailing list, and I think your insights as a Theano developer can be >>>>>> very helpful in this discussion. >>>>>> >>>>>> User Interface >>>>>> =========== >>>>>> Besides simple array expressions for dense arrays I would like a >>>>>> mechanism for "custom ufuncs", although to a different extent to what >>>>>> Numpy or Numba provide. There are several ways in which we could want >>>>>> them, e.g. as typed functions (cdef, or external C) functions, as >>>>>> lambas or Python functions in the same module, or as general objects >>>>>> (e.g. functions Cython doesn't know about). >>>>>> To achieve maximum efficiency it will likely be good to allow sharing >>>>>> these functions in .pxd files. We have 'cdef inline' functions, but I >>>>>> would prefer annotated def functions where the parameters are >>>>>> specialized on demand, e.g. >>>>>> >>>>>> @elemental >>>>>> def add(a, b): # elemental functions can have any number of arguments >>>>>> and operate on any compatible dtype >>>>>> ? ? return a + b >>>>>> >>>>>> When calling cdef functions or elemental functions with memoryview >>>>>> arguments, the arguments perform a (broadcasted) elementwise >>>>>> operation. Alternatively, we can have a parallel.elementwise function >>>>>> which maps the function elementwise, which would also work for object >>>>>> callables. I prefer the former, since I think it will read much >>>>>> easier. 
>>>>>> >>>>>> Secondly, we can have a reduce function (and maybe a scan function), >>>>>> that reduce (respectively scan) in a specified axis or number of axes. >>>>>> E.g. >>>>>> >>>>>> ? ? parallel.reduce(add, a, b, axis=(0, 2)) >>>>>> >>>>>> where the default for axis is "all axes". As for the default value, >>>>>> this could be perhaps optionally provided to the elemental decorator. >>>>>> Otherwise, the reducer will have to get the default values from each >>>>>> dimension that is reduced in, and then skip those values when >>>>>> reducing. (Of course, the reducer function must be associate and >>>>>> commutative). Also, a lambda could be passed in instead of an >>>>> >>>>> >>>>> >>>>> >>>>> Only associative, right? >>>>> >>>>> Sounds good to me. >>>>> >>>>> >>>>>> elementwise or typed cdef function. >>>>>> >>>>>> Finally, we would have a parallel.nditer/ndenumerate/nditerate >>>>>> function, which would iterate over N memoryviews, and provide a >>>>>> sensible memory access pattern (like numpy.nditer). I'm not sure if it >>>>>> should provide only the indices, or also the values. e.g. an inplace >>>>>> elementwise add would read as follows: >>>>>> >>>>>> ? ? for i, j, k in parallel.nditerate(A, B): >>>>>> ? ? ? ? A[i, j, k] += B[i, j, k] >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> I think this sounds good; I guess don't see a particular reason for >>>>> "ndenumerate", I think code like the above is clearer. >>>> >>>> >>>> >>>> I'm assuming the index computations would not be re-done in this case >>>> (i.e. there's more magic going on here than looks like at first >>>> glance)? Otherwise there is an advantage to ndenumerate. >>> >>> >>> >>> Ideally, there is a lot more magic going on, though I don't know how far >>> Mark wants to go. >>> >>> Imagine "nditerate(A, A.T)", in that case it would have to make many >>> small >>> tiles so that for each tile being processed, A has a tile in cache and >>> A.T >>> has another tile in cache (so that one doesn't waste cache line >>> transfers). >>> >>> So those array lookups would potentially look up in different memory >>> buffers, with the strides known at compile time. >> >> >> Yes, being clever about the order in which to iterate over the indices >> is the hard problem to solve here. I was thinking more in terms of the >> inner loop iterating over the innermost dimension only to do the >> indexing (retrieval and assignment), similar to how the generic NumPy >> iterator works. > > > The point isn't only being clever about the *order*...you need "copy-in, > copy-out". > > The point is that the NumPy iterator is not good enough (for out-of-cache > situations). Since you grab a cache line (64 bytes) each time from main > memory, a plain NumPy broadcasted iterator throws away a lot of memory for > "A + A.T", since for ndim>1 there's NO iteration order which isn't bad (for > instance, you could iterate in the order of A, and the result would be that > for each element of A.T you fetch there is 64 bytes transferred). > > So the solution is to copy A.T block-wise to a temporary scratch space in > cache so that you use all the elements in the cache line before throwing it > out of cache. > > In C, I've seen a simple blocking transpose operation be over four times > faster than the brute-force transpose for this reason. Yes, I understand this. Truly element-wise arithmetic with arrays of the same memory layout (or even size) is not that uncommon though, and should be optimized for as well. 
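For that same-layout case the whole loop nest can collapse into one flat run over the data; a rough Cython-level sketch of what such a specialization could look like (illustrative, not actual generated code):

cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def add_inplace_contig(double[::1] a, double[::1] b):
    # Identical C-contiguous layout: no per-dimension index arithmetic,
    # and a loop simple enough for the C compiler to vectorize.
    cdef Py_ssize_t i
    for i in range(a.shape[0]):
        a[i] += b[i]

An N-dimensional operand pair with identical strides can be reinterpreted as 1-D up front and dispatched to a loop like this.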
Fortunately, I feel pretty comfortable sitting back and watching 'cause you've both thought about these issues far more than I and I don't see you both getting it wrong :). - Robert From d.s.seljebotn at astro.uio.no Tue May 22 09:13:17 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 22 May 2012 09:13:17 +0200 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBB33E9.1090707@astro.uio.no> <4FBB38C0.4030005@astro.uio.no> Message-ID: <4FBB3C8D.9080702@astro.uio.no> On 05/22/2012 09:06 AM, Robert Bradshaw wrote: > On Mon, May 21, 2012 at 11:57 PM, Dag Sverre Seljebotn > wrote: >> On 05/22/2012 08:48 AM, Robert Bradshaw wrote: >>> >>> On Mon, May 21, 2012 at 11:36 PM, Dag Sverre Seljebotn >>> wrote: >>>> >>>> On 05/22/2012 08:11 AM, Robert Bradshaw wrote: >>>>> >>>>> >>>>> On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn >>>>> wrote: >>>>>> >>>>>> >>>>>> On 05/20/2012 04:03 PM, mark florisson wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hey, >>>>>>> >>>>>>> For my gsoc we already have some simple initial ideas, i.e. >>>>>>> elementwise vector expressions (a + b with a and b arrays with >>>>>>> arbitrary rank), I don't think these need any discussion. However, >>>>>>> there are a lot of things that haven't been formally discussed on the >>>>>>> mailing list, so here goes. >>>>>>> >>>>>>> Fr?d?ric, I am CCing you since you expressed interest on the numpy >>>>>>> mailing list, and I think your insights as a Theano developer can be >>>>>>> very helpful in this discussion. >>>>>>> >>>>>>> User Interface >>>>>>> =========== >>>>>>> Besides simple array expressions for dense arrays I would like a >>>>>>> mechanism for "custom ufuncs", although to a different extent to what >>>>>>> Numpy or Numba provide. There are several ways in which we could want >>>>>>> them, e.g. as typed functions (cdef, or external C) functions, as >>>>>>> lambas or Python functions in the same module, or as general objects >>>>>>> (e.g. functions Cython doesn't know about). >>>>>>> To achieve maximum efficiency it will likely be good to allow sharing >>>>>>> these functions in .pxd files. We have 'cdef inline' functions, but I >>>>>>> would prefer annotated def functions where the parameters are >>>>>>> specialized on demand, e.g. >>>>>>> >>>>>>> @elemental >>>>>>> def add(a, b): # elemental functions can have any number of arguments >>>>>>> and operate on any compatible dtype >>>>>>> return a + b >>>>>>> >>>>>>> When calling cdef functions or elemental functions with memoryview >>>>>>> arguments, the arguments perform a (broadcasted) elementwise >>>>>>> operation. Alternatively, we can have a parallel.elementwise function >>>>>>> which maps the function elementwise, which would also work for object >>>>>>> callables. I prefer the former, since I think it will read much >>>>>>> easier. >>>>>>> >>>>>>> Secondly, we can have a reduce function (and maybe a scan function), >>>>>>> that reduce (respectively scan) in a specified axis or number of axes. >>>>>>> E.g. >>>>>>> >>>>>>> parallel.reduce(add, a, b, axis=(0, 2)) >>>>>>> >>>>>>> where the default for axis is "all axes". As for the default value, >>>>>>> this could be perhaps optionally provided to the elemental decorator. >>>>>>> Otherwise, the reducer will have to get the default values from each >>>>>>> dimension that is reduced in, and then skip those values when >>>>>>> reducing. (Of course, the reducer function must be associate and >>>>>>> commutative). 
Also, a lambda could be passed in instead of an >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Only associative, right? >>>>>> >>>>>> Sounds good to me. >>>>>> >>>>>> >>>>>>> elementwise or typed cdef function. >>>>>>> >>>>>>> Finally, we would have a parallel.nditer/ndenumerate/nditerate >>>>>>> function, which would iterate over N memoryviews, and provide a >>>>>>> sensible memory access pattern (like numpy.nditer). I'm not sure if it >>>>>>> should provide only the indices, or also the values. e.g. an inplace >>>>>>> elementwise add would read as follows: >>>>>>> >>>>>>> for i, j, k in parallel.nditerate(A, B): >>>>>>> A[i, j, k] += B[i, j, k] >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I think this sounds good; I guess don't see a particular reason for >>>>>> "ndenumerate", I think code like the above is clearer. >>>>> >>>>> >>>>> >>>>> I'm assuming the index computations would not be re-done in this case >>>>> (i.e. there's more magic going on here than looks like at first >>>>> glance)? Otherwise there is an advantage to ndenumerate. >>>> >>>> >>>> >>>> Ideally, there is a lot more magic going on, though I don't know how far >>>> Mark wants to go. >>>> >>>> Imagine "nditerate(A, A.T)", in that case it would have to make many >>>> small >>>> tiles so that for each tile being processed, A has a tile in cache and >>>> A.T >>>> has another tile in cache (so that one doesn't waste cache line >>>> transfers). >>>> >>>> So those array lookups would potentially look up in different memory >>>> buffers, with the strides known at compile time. >>> >>> >>> Yes, being clever about the order in which to iterate over the indices >>> is the hard problem to solve here. I was thinking more in terms of the >>> inner loop iterating over the innermost dimension only to do the >>> indexing (retrieval and assignment), similar to how the generic NumPy >>> iterator works. >> >> >> The point isn't only being clever about the *order*...you need "copy-in, >> copy-out". >> >> The point is that the NumPy iterator is not good enough (for out-of-cache >> situations). Since you grab a cache line (64 bytes) each time from main >> memory, a plain NumPy broadcasted iterator throws away a lot of memory for >> "A + A.T", since for ndim>1 there's NO iteration order which isn't bad (for >> instance, you could iterate in the order of A, and the result would be that >> for each element of A.T you fetch there is 64 bytes transferred). >> >> So the solution is to copy A.T block-wise to a temporary scratch space in >> cache so that you use all the elements in the cache line before throwing it >> out of cache. >> >> In C, I've seen a simple blocking transpose operation be over four times >> faster than the brute-force transpose for this reason. > > Yes, I understand this. Truly element-wise arithmetic with arrays of > the same memory layout (or even size) is not that uncommon though, and > should be optimized for as well. Fortunately, I feel pretty > comfortable sitting back and watching 'cause you've both thought about > these issues far more than I and I don't see you both getting it wrong > :). 
Sorry for being an such an annoying know-it-all, it just seemed from your comment like you didn't know :-) Dag From markflorisson88 at gmail.com Tue May 22 15:08:07 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Tue, 22 May 2012 14:08:07 +0100 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: On 6 May 2012 15:28, mark florisson wrote: > Hey, > > I think we already have quite a bit of functionality (nearly) ready, > after merging some pending pull requests maybe it will be a good time > for a 0.17 release? I think it would be good to also document to what > extent pypy support works, what works and what doesn't. Stefan, since > you added a large majority of the features, would you want to be the > release manager? > > In summary, the following pull requests should likely go in > ? ?- array.array support (unless further discussion prevents that) > ? ?- fused types runtime buffer dispatch > ? ?- newaxis > ? ?- more? > > The memoryview documentation should also be reworked a bit. Matthew, > are you still willing to have a go at that? Otherwise I can clean up > the mess first, some things are no longer true and simply outdated, > and then have a second opinion. > > Mark I think we have enough stuff in to go for a 0.17 release, I have a few more fixes and a refactoring that I'll finish tonight that might be useful to get in as well. Currently Jenkins is yellow though, as the reduce_pickle test fails in Python 3. From markflorisson88 at gmail.com Tue May 22 15:16:44 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Tue, 22 May 2012 14:16:44 +0100 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBB33E9.1090707@astro.uio.no> Message-ID: On 22 May 2012 07:48, Robert Bradshaw wrote: > On Mon, May 21, 2012 at 11:36 PM, Dag Sverre Seljebotn > wrote: >> On 05/22/2012 08:11 AM, Robert Bradshaw wrote: >>> >>> On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn >>> ?wrote: >>>> >>>> On 05/20/2012 04:03 PM, mark florisson wrote: >>>>> >>>>> >>>>> Hey, >>>>> >>>>> For my gsoc we already have some simple initial ideas, i.e. >>>>> elementwise vector expressions (a + b with a and b arrays with >>>>> arbitrary rank), I don't think these need any discussion. However, >>>>> there are a lot of things that haven't been formally discussed on the >>>>> mailing list, so here goes. >>>>> >>>>> Fr?d?ric, I am CCing you since you expressed interest on the numpy >>>>> mailing list, and I think your insights as a Theano developer can be >>>>> very helpful in this discussion. >>>>> >>>>> User Interface >>>>> =========== >>>>> Besides simple array expressions for dense arrays I would like a >>>>> mechanism for "custom ufuncs", although to a different extent to what >>>>> Numpy or Numba provide. There are several ways in which we could want >>>>> them, e.g. as typed functions (cdef, or external C) functions, as >>>>> lambas or Python functions in the same module, or as general objects >>>>> (e.g. functions Cython doesn't know about). >>>>> To achieve maximum efficiency it will likely be good to allow sharing >>>>> these functions in .pxd files. We have 'cdef inline' functions, but I >>>>> would prefer annotated def functions where the parameters are >>>>> specialized on demand, e.g. >>>>> >>>>> @elemental >>>>> def add(a, b): # elemental functions can have any number of arguments >>>>> and operate on any compatible dtype >>>>> ? ? 
return a + b >>>>> >>>>> When calling cdef functions or elemental functions with memoryview >>>>> arguments, the arguments perform a (broadcasted) elementwise >>>>> operation. Alternatively, we can have a parallel.elementwise function >>>>> which maps the function elementwise, which would also work for object >>>>> callables. I prefer the former, since I think it will read much >>>>> easier. >>>>> >>>>> Secondly, we can have a reduce function (and maybe a scan function), >>>>> that reduce (respectively scan) in a specified axis or number of axes. >>>>> E.g. >>>>> >>>>> ? ? parallel.reduce(add, a, b, axis=(0, 2)) >>>>> >>>>> where the default for axis is "all axes". As for the default value, >>>>> this could be perhaps optionally provided to the elemental decorator. >>>>> Otherwise, the reducer will have to get the default values from each >>>>> dimension that is reduced in, and then skip those values when >>>>> reducing. (Of course, the reducer function must be associate and >>>>> commutative). Also, a lambda could be passed in instead of an >>>> >>>> >>>> >>>> Only associative, right? >>>> >>>> Sounds good to me. >>>> >>>> >>>>> elementwise or typed cdef function. >>>>> >>>>> Finally, we would have a parallel.nditer/ndenumerate/nditerate >>>>> function, which would iterate over N memoryviews, and provide a >>>>> sensible memory access pattern (like numpy.nditer). I'm not sure if it >>>>> should provide only the indices, or also the values. e.g. an inplace >>>>> elementwise add would read as follows: >>>>> >>>>> ? ? for i, j, k in parallel.nditerate(A, B): >>>>> ? ? ? ? A[i, j, k] += B[i, j, k] >>>> >>>> >>>> >>>> >>>> I think this sounds good; I guess don't see a particular reason for >>>> "ndenumerate", I think code like the above is clearer. >>> >>> >>> I'm assuming the index computations would not be re-done in this case >>> (i.e. there's more magic going on here than looks like at first >>> glance)? Otherwise there is an advantage to ndenumerate. >> >> >> Ideally, there is a lot more magic going on, though I don't know how far >> Mark wants to go. >> >> Imagine "nditerate(A, A.T)", in that case it would have to make many small >> tiles so that for each tile being processed, A has a tile in cache and A.T >> has another tile in cache (so that one doesn't waste cache line transfers). >> >> So those array lookups would potentially look up in different memory >> buffers, with the strides known at compile time. > > Yes, being clever about the order in which to iterate over the indices > is the hard problem to solve here. I was thinking more in terms of the > inner loop iterating over the innermost dimension only to do the > indexing (retrieval and assignment), similar to how the generic NumPy > iterator works. That's a valid point, but my experience has been that any worthy C compiler will do common subexpression elimination for the outer dimensions and not recompute the offset every time. It actually generated marginally faster code for scalar assignment than a "cascaded pointer assignment", i.e. faster than p0 = data; for (...) { p1 = p0 + i * strides[0] for (...) { p2 = p1 + j * strides[1] ... } } (haven't tried manual strength reduction there though). >> Which begs the question: What about this body? >> >> if i < 100: >> ? ?continue >> else: >> ? ?A[i, j, k] += B[i - 100, j, k] >> >> I guess just fall back to a non-tiled version? 
One could of course do some >> shifting of which tiles of B to grab etc., but there's a limit to how smart >> one should try to be; one could emit a warning and say that one should slice >> and dice the memoryviews into shape before they are passed to nditerate. > Linear transformations of the index variables could probably be > handled, but that's certainly not v1 (and not too difficult for the > user to express manually). > > - Robert > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel
From robertwb at gmail.com Wed May 23 13:31:36 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 23 May 2012 04:31:36 -0700 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: On Tue, May 22, 2012 at 6:08 AM, mark florisson wrote: > On 6 May 2012 15:28, mark florisson wrote: >> Hey, >> >> I think we already have quite a bit of functionality (nearly) ready, >> after merging some pending pull requests maybe it will be a good time >> for a 0.17 release? I think it would be good to also document to what >> extent pypy support works, what works and what doesn't. Stefan, since >> you added a large majority of the features, would you want to be the >> release manager? >> >> In summary, the following pull requests should likely go in >> - array.array support (unless further discussion prevents that) >> - fused types runtime buffer dispatch >> - newaxis >> - more? >> >> The memoryview documentation should also be reworked a bit. Matthew, >> are you still willing to have a go at that? Otherwise I can clean up >> the mess first, some things are no longer true and simply outdated, >> and then have a second opinion. >> >> Mark > > I think we have enough stuff in to go for a 0.17 release, I have a few > more fixes and a refactoring that I'll finish tonight that might be > useful to get in as well. Currently Jenkins is yellow though, as the > reduce_pickle test fails in Python 3. I pushed a fix to the pickle tests. I've got some minor cythonize optimizations I'd like to get in for Sage as well. I'll push when I confirm they don't break anything on jenkins. - Robert
From dewachter.jonathan at gmail.com Wed May 23 18:05:04 2012 From: dewachter.jonathan at gmail.com (Jonathan De Wachter) Date: Wed, 23 May 2012 18:05:04 +0200 Subject: [Cython] bug when using the built-in fused type cython.numeric Message-ID: This is probably a known bug, since the minimal code that reproduces it can be anything using the built-in type cython.numeric: cimport cython cdef class Foo: cdef cython.numeric bar Cython version: 0.16 (last release). I didn't find this bug in the bug trackers, which is why I'm mailing you. By the way, this is the first time I'm using a mailing list, so if I'm doing this wrong, please let me know.
From markflorisson88 at gmail.com Wed May 23 19:03:01 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 23 May 2012 18:03:01 +0100 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: On 23 May 2012 12:31, Robert Bradshaw wrote: > On Tue, May 22, 2012 at 6:08 AM, mark florisson > wrote: >> On 6 May 2012 15:28, mark florisson wrote: >>> Hey, >>> >>> I think we already have quite a bit of functionality (nearly) ready, >>> after merging some pending pull requests maybe it will be a good time >>> for a 0.17 release?
I think it would be good to also document to what >>> extent pypy support works, what works and what doesn't. Stefan, since >>> you added a large majority of the features, would you want to be the >>> release manager? >>> >>> In summary, the following pull requests should likely go in >>> - array.array support (unless further discussion prevents that) >>> - fused types runtime buffer dispatch >>> - newaxis >>> - more? >>> >>> The memoryview documentation should also be reworked a bit. Matthew, >>> are you still willing to have a go at that? Otherwise I can clean up >>> the mess first, some things are no longer true and simply outdated, >>> and then have a second opinion. >>> >>> Mark >> >> I think we have enough stuff in to go for a 0.17 release, I have a few >> more fixes and a refactoring that I'll finish tonight that might be >> useful to get in as well. Currently Jenkins is yellow though, as the >> reduce_pickle test fails in Python 3. > > I pushed a fix to the pickle tests. I've got some minor cythonize > optimizations I'd like to get in for > Sage as well. I'll push when I confirm they don't break anything on > jenkins. > > - Robert > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Great, thanks for fixing it! Ok, let's wait for those things, and we still also need to fix the memoryview documentation, and then we're good to go I think :).
From markflorisson88 at gmail.com Wed May 23 19:01:03 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 23 May 2012 18:01:03 +0100 Subject: [Cython] bug when using the built-in fused type cython.numeric In-Reply-To: References: Message-ID: On 23 May 2012 17:05, Jonathan De Wachter wrote: > This is probably a known bug, since the minimal code that reproduces it > can be anything using the built-in type cython.numeric: > > cimport cython > > cdef class Foo: > cdef cython.numeric bar > > Cython version: 0.16 (last release). > > I didn't find this bug in the bug trackers, which is why I'm mailing you. By > the way, this is the first time I'm using a mailing list, so if I'm doing this > wrong, please let me know. > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel Thanks for the report, I pushed two fixes here: https://github.com/markflorisson88/cython (cb6a62628f4d89096a8b0cdcc4ad66990141f927 and 0b7c152fda53d6664e355bcc74712639c7d9ff5a, for future reference). We were aware of the first problem, but there was a second problem where complex numbers don't have all their utility code declared properly.
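For context on the construct in that report: fused types such as cython.numeric specialize per use, which works in a function signature where each call supplies a concrete type, but a class attribute has nothing to drive the specialization, so the snippet above cannot be specialized as written. A minimal sketch of the supported pattern (illustrative, not taken from the pushed fixes):

cimport cython

def twice(cython.numeric x):
    # One specialization is compiled per concrete numeric type
    # (int, float, double, complex, ...), chosen at call time.
    return 2 * x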
From matthew.brett at gmail.com Wed May 23 21:49:18 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 23 May 2012 15:49:18 -0400 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: Hi, For the promised memoryview doc edits: Sorry - I'm in Cuba - not much internet. I will push something for review by Friday, but please go ahead without me if that's not fast enough. Sorry to be the blocker, Best, Matthew On 5/23/12, mark florisson wrote: > On 23 May 2012 12:31, Robert Bradshaw wrote: >> On Tue, May 22, 2012 at 6:08 AM, mark florisson >> wrote: >>> On 6 May 2012 15:28, mark florisson wrote: >>>> Hey, >>>> >>>> I think we already have quite a bit of functionality (nearly) ready, >>>> after merging some pending pull requests maybe it will be a good time >>>> for a 0.17 release? I think it would be good to also document to what >>>> extent pypy support works, what works and what doesn't. Stefan, since >>>> you added a large majority of the features, would you want to be the >>>> release manager? >>>> >>>> In summary, the following pull requests should likely go in >>>> - array.array support (unless further discussion prevents that) >>>> - fused types runtime buffer dispatch >>>> - newaxis >>>> - more? >>>> >>>> The memoryview documentation should also be reworked a bit. Matthew, >>>> are you still willing to have a go at that? Otherwise I can clean up >>>> the mess first, some things are no longer true and simply outdated, >>>> and then have a second opinion. >>>> >>>> Mark >>> >>> I think we have enough stuff in to go for a 0.17 release, I have a few >>> more fixes and a refactoring that I'll finish tonight that might be >>> useful to get in as well. Currently Jenkins is yellow though, as the >>> reduce_pickle test fails in Python 3. >> >> I pushed a fix to the pickle tests. I've got some minor cythonize >> optimizations I'd like to get in for >> Sage as well. I'll push when I confirm they don't break anything on >> jenkins. >> >> - Robert >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > Great, thanks for fixing it! Ok, let's wait for those things, and we > still also need to fix the memoryview documentation, and then we're > good to go I think :).
From robertwb at gmail.com Wed May 23 22:55:48 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 23 May 2012 13:55:48 -0700 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: Pushed my change. On Wed, May 23, 2012 at 10:03 AM, mark florisson wrote: > On 23 May 2012 12:31, Robert Bradshaw wrote: >> On Tue, May 22, 2012 at 6:08 AM, mark florisson >> wrote: >>> On 6 May 2012 15:28, mark florisson wrote: >>>> Hey, >>>> >>>> I think we already have quite a bit of functionality (nearly) ready, >>>> after merging some pending pull requests maybe it will be a good time >>>> for a 0.17 release? I think it would be good to also document to what >>>> extent pypy support works, what works and what doesn't. Stefan, since >>>> you added a large majority of the features, would you want to be the >>>> release manager? >>>> >>>> In summary, the following pull requests should likely go in >>>> - array.array support (unless further discussion prevents that) >>>> - fused types runtime buffer dispatch >>>> - newaxis >>>> - more? >>>> >>>> The memoryview documentation should also be reworked a bit. Matthew, >>>> are you still willing to have a go at that? Otherwise I can clean up >>>> the mess first, some things are no longer true and simply outdated, >>>> and then have a second opinion. >>>> >>>> Mark >>> >>> I think we have enough stuff in to go for a 0.17 release, I have a few >>> more fixes and a refactoring that I'll finish tonight that might be >>> useful to get in as well. Currently Jenkins is yellow though, as the >>> reduce_pickle test fails in Python 3. >> >> I pushed a fix to the pickle tests. I've got some minor cythonize >> optimizations I'd like to get in for >> Sage as well. I'll push when I confirm they don't break anything on >> jenkins. >> >> - Robert >> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel > > Great, thanks for fixing it! Ok, let's wait for those things, and we > still also need to fix the memoryview documentation, and then we're > good to go I think :).
> _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From markflorisson88 at gmail.com Wed May 23 23:35:04 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Wed, 23 May 2012 22:35:04 +0100 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: Hey Matthew, Seriously, no problem, we're still getting stuff in there, no need to hurry. If you're in Cuba it sounds like you have better stuff to do than improve Cython's documentation :) Not to discourage contributors, but I would really enjoy Cuba here :). I'm working on my gsoc now, so making some time free for me is no problem here. So I propose I just clean up my mess, post back a link to the documentation, and then anyone who is interested can comment on improvements. Mark On 23 May 2012 20:49, Matthew Brett wrote: > Hi, > > For the promised memoryview doc edits: > > Sorry - I'm in Cuba - not much internet. ?I will push something for > review by Friday, but please go ahead without me if that's not fast > enough. ?Sorry to be the blocker, > > Best, > > Matthew > > > On 5/23/12, mark florisson wrote: >> On 23 May 2012 12:31, Robert Bradshaw wrote: >>> On Tue, May 22, 2012 at 6:08 AM, mark florisson >>> wrote: >>>> On 6 May 2012 15:28, mark florisson wrote: >>>>> Hey, >>>>> >>>>> I think we already have quite a bit of functionality (nearly) ready, >>>>> after merging some pending pull requests maybe it will be a good time >>>>> for a 0.17 release? I think it would be good to also document to what >>>>> extent pypy support works, what works and what doesn't. Stefan, since >>>>> you added a large majority of the features, would you want to be the >>>>> release manager? >>>>> >>>>> In summary, the following pull requests should likely go in >>>>> ? ?- array.array support (unless further discussion prevents that) >>>>> ? ?- fused types runtime buffer dispatch >>>>> ? ?- newaxis >>>>> ? ?- more? >>>>> >>>>> The memoryview documentation should also be reworked a bit. Matthew, >>>>> are you still willing to have a go at that? Otherwise I can clean up >>>>> the mess first, some things are no longer true and simply outdated, >>>>> and then have a second opinion. >>>>> >>>>> Mark >>>> >>>> I think we have enough stuff in to go for a 0.17 release, I have a few >>>> more fixes and a refactoring that I'll finish tonight that might be >>>> useful to get in as well. Currently Jenkins is yellow though, as the >>>> reduce_pickle test fails in Python 3. >>> >>> I pushed a fix to the pickle tests. I've got some minor cythonize >>> optimizations I'd like to get in for >>> Sage as well. I'll push when I confirm thy don't break anything on >>> jenkins. >>> >>> - Robert >>> _______________________________________________ >>> cython-devel mailing list >>> cython-devel at python.org >>> http://mail.python.org/mailman/listinfo/cython-devel >> >> Great, thanks for fixing it! Ok, let's wait for those things, and we >> still also need to fix the memoryview documentation, and then we're >> good to go I think :). 
>> _______________________________________________ >> cython-devel mailing list >> cython-devel at python.org >> http://mail.python.org/mailman/listinfo/cython-devel >> > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From robertwb at gmail.com Thu May 24 07:46:20 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Wed, 23 May 2012 22:46:20 -0700 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBB33E9.1090707@astro.uio.no> Message-ID: On Tue, May 22, 2012 at 6:16 AM, mark florisson wrote: > On 22 May 2012 07:48, Robert Bradshaw wrote: >> On Mon, May 21, 2012 at 11:36 PM, Dag Sverre Seljebotn >> wrote: >>> On 05/22/2012 08:11 AM, Robert Bradshaw wrote: >>>> >>>> On Mon, May 21, 2012 at 3:34 AM, Dag Sverre Seljebotn >>>> ?wrote: >>>>> >>>>> On 05/20/2012 04:03 PM, mark florisson wrote: >>>>>> >>>>>> >>>>>> Hey, >>>>>> >>>>>> For my gsoc we already have some simple initial ideas, i.e. >>>>>> elementwise vector expressions (a + b with a and b arrays with >>>>>> arbitrary rank), I don't think these need any discussion. However, >>>>>> there are a lot of things that haven't been formally discussed on the >>>>>> mailing list, so here goes. >>>>>> >>>>>> Fr?d?ric, I am CCing you since you expressed interest on the numpy >>>>>> mailing list, and I think your insights as a Theano developer can be >>>>>> very helpful in this discussion. >>>>>> >>>>>> User Interface >>>>>> =========== >>>>>> Besides simple array expressions for dense arrays I would like a >>>>>> mechanism for "custom ufuncs", although to a different extent to what >>>>>> Numpy or Numba provide. There are several ways in which we could want >>>>>> them, e.g. as typed functions (cdef, or external C) functions, as >>>>>> lambas or Python functions in the same module, or as general objects >>>>>> (e.g. functions Cython doesn't know about). >>>>>> To achieve maximum efficiency it will likely be good to allow sharing >>>>>> these functions in .pxd files. We have 'cdef inline' functions, but I >>>>>> would prefer annotated def functions where the parameters are >>>>>> specialized on demand, e.g. >>>>>> >>>>>> @elemental >>>>>> def add(a, b): # elemental functions can have any number of arguments >>>>>> and operate on any compatible dtype >>>>>> ? ? return a + b >>>>>> >>>>>> When calling cdef functions or elemental functions with memoryview >>>>>> arguments, the arguments perform a (broadcasted) elementwise >>>>>> operation. Alternatively, we can have a parallel.elementwise function >>>>>> which maps the function elementwise, which would also work for object >>>>>> callables. I prefer the former, since I think it will read much >>>>>> easier. >>>>>> >>>>>> Secondly, we can have a reduce function (and maybe a scan function), >>>>>> that reduce (respectively scan) in a specified axis or number of axes. >>>>>> E.g. >>>>>> >>>>>> ? ? parallel.reduce(add, a, b, axis=(0, 2)) >>>>>> >>>>>> where the default for axis is "all axes". As for the default value, >>>>>> this could be perhaps optionally provided to the elemental decorator. >>>>>> Otherwise, the reducer will have to get the default values from each >>>>>> dimension that is reduced in, and then skip those values when >>>>>> reducing. (Of course, the reducer function must be associate and >>>>>> commutative). Also, a lambda could be passed in instead of an >>>>> >>>>> >>>>> >>>>> Only associative, right? 
>>>>> >>>>> Sounds good to me. >>>>> >>>>> >>>>>> elementwise or typed cdef function. >>>>>> >>>>>> Finally, we would have a parallel.nditer/ndenumerate/nditerate >>>>>> function, which would iterate over N memoryviews, and provide a >>>>>> sensible memory access pattern (like numpy.nditer). I'm not sure if it >>>>>> should provide only the indices, or also the values. e.g. an inplace >>>>>> elementwise add would read as follows: >>>>>> >>>>>> ? ? for i, j, k in parallel.nditerate(A, B): >>>>>> ? ? ? ? A[i, j, k] += B[i, j, k] >>>>> >>>>> >>>>> >>>>> >>>>> I think this sounds good; I guess don't see a particular reason for >>>>> "ndenumerate", I think code like the above is clearer. >>>> >>>> >>>> I'm assuming the index computations would not be re-done in this case >>>> (i.e. there's more magic going on here than looks like at first >>>> glance)? Otherwise there is an advantage to ndenumerate. >>> >>> >>> Ideally, there is a lot more magic going on, though I don't know how far >>> Mark wants to go. >>> >>> Imagine "nditerate(A, A.T)", in that case it would have to make many small >>> tiles so that for each tile being processed, A has a tile in cache and A.T >>> has another tile in cache (so that one doesn't waste cache line transfers). >>> >>> So those array lookups would potentially look up in different memory >>> buffers, with the strides known at compile time. >> >> Yes, being clever about the order in which to iterate over the indices >> is the hard problem to solve here. I was thinking more in terms of the >> inner loop iterating over the innermost dimension only to do the >> indexing (retrieval and assignment), similar to how the generic NumPy >> iterator works. > > That's a valid point, but my experience has been that any worthy C > compiler will do common subexpression elimination for the outer > dimensions and not recompute the offset every time. It actually > generated marginally faster code for scalar assignment than a > "cascaded pointer assignment", i.e. faster than > > p0 = data; > for (...) { > ? ?p1 = p0 + i * strides[0] > ? ?for (...) { > ? ? ? ?p2 = p1 + j * strides[1] > ? ? ? ?... > ? ?} > } > > (haven't tried manual strength reduction there though). That's a good point, though "for(p2=p1; p2 < precomputed; p2 += stride1) {...}" is probably a better manual reduction. I concede that compilers are really smart about this kind of stuff these days though (though they might not be able to infer that, for example, strides doesn't change). - Robert From matthew.brett at gmail.com Fri May 25 00:01:10 2012 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 24 May 2012 22:01:10 +0000 Subject: [Cython] 0.17 In-Reply-To: References: Message-ID: Hey, On Wed, May 23, 2012 at 9:35 PM, mark florisson wrote: > Hey Matthew, > > Seriously, no problem, we're still getting stuff in there, no need to > hurry. If you're in Cuba it sounds like you have better stuff to do > than improve Cython's documentation :) Not to discourage contributors, > but I would really enjoy Cuba here :). I'm working on my gsoc now, so > making some time free for me is no problem here. > > So I propose I just clean up my mess, post back a link to the > documentation, and then anyone who is interested can comment on > improvements. Thank you - well - why don't I aim for a push tomorrow, and if that get's derailed by Cuba-related things, then, I bow to your kind waiver, with thanks. See you, Matthew From valmynd at gmail.com Fri May 25 12:53:08 2012 From: valmynd at gmail.com (c.) 
Date: Fri, 25 May 2012 12:53:08 +0200 Subject: [Cython] Metaclasses to generate cdef'ed Classes In-Reply-To: References: Message-ID: <20120525125308.73eeee82f1defa910f7ef7a9@gmail.com> I make excessive use of metaclasses, and I would like to be able to use them with classes that use the cclass annotation. I don't think it is impossible to do, although of course I am aware that there are many issues with a higher priority right now. Here are some ideas of how this could be accomplished: - evaluate cdef classes which use a metaclass before any other classes - create temporary pxd and pyx files for any cdef class that uses a cdef metaclass - re-evaluate those temporary classes whenever anything changes in the original Benefits: - Once the extension module is built, there would be no need to process the creation/manipulation via metaclasses, thus reducing startup time for the application - Have all the benefits of cdef classes, like a lower memory footprint, when there might be hundreds of such classes with lots of data in them - Have all the benefits of Python being both very dynamic, flexible and simple to use What do you think? By the way, thank you all for this great project. C.Wilhelm trac ticket: http://trac.cython.org/cython_trac/ticket/777
From nouiz at nouiz.org Fri May 25 22:57:24 2012 From: nouiz at nouiz.org (Frédéric Bastien) Date: Fri, 25 May 2012 16:57:24 -0400 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no> Message-ID: I just resent this email, as the first copy was rejected by the mailing list, so I have now subscribed to it. Hi, Sorry for the delay, I had some schedule changes. Thanks for adding me. Should I subscribe to cython-dev? How much email is there daily? I couldn't find this in the archives. Feel free to add me in CC again whenever you think it is appropriate. I'll reply here to all the emails at the same time. Do you prefer that I reply to each email individually if this happens again? I'll try to reply faster next time. - About pickling Theano: we currently can't pickle Theano functions. It could be made to work in some cases, but not all, as there are hardware-dependent optimizations in a Theano function. Currently that is mostly CPU vs GPU operations. So if we stay on the CPU we could do some pickling, but we would need to make sure that the Python modules holding the compiled C code are still there when we unpickle, or recompile them. - I think it makes sense to build a Theano graph from the Cython AST, optimize it, and rebuild a Cython AST from the optimized graph. This would allow reusing Theano's optimizations. - It could also make sense to do the code generation in Theano and reuse it in Cython, but this would make the Theano dependency much stronger. I'm not sure you want this. - Another point not raised: Theano needs to know at compile time the dtype, the number of dimensions, and which dimensions are broadcastable for each variable. I think the last one could cause problems, but if you use specialization for the dtype, the same can be done for the broadcastability of the dimensions. - The compyte (GPU ndarray) project does collapsing of dimensions (see the sketch after this email). This is an important optimization on the GPU, as doing the index computation in parallel is costlier. I think on the CPU we could probably collapse just the inner dimensions to make it faster. - Theano doesn't generate intrinsics or assembly, but we assume that g++ will generate vectorized operations for simple loops. Recent versions of gcc/g++ do this. - Our generated code for element-wise operations takes some care about the memory access pattern: we swap dimensions to iterate over the dimension with the smallest strides, but we don't go further. - What do you mean by CSE? Constant optimization? Fred
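The dimension collapsing mentioned above can be sketched in a few lines; for C-ordered data, two adjacent dimensions merge whenever the outer stride equals the inner stride times the inner extent (a sketch under that assumption, not compyte's actual code):

def collapse_dims(shape, strides):
    # Merge adjacent dimensions that are contiguous with respect to each
    # other, so iteration needs fewer, longer loops and less index math.
    new_shape, new_strides = [shape[0]], [strides[0]]
    for extent, stride in zip(shape[1:], strides[1:]):
        if new_strides[-1] == stride * extent:
            new_shape[-1] *= extent  # outer dim steps exactly over inner
            new_strides[-1] = stride
        else:
            new_shape.append(extent)
            new_strides.append(stride)
    return new_shape, new_strides

# A C-contiguous (4, 5) array of doubles collapses to one flat dimension:
assert collapse_dims([4, 5], [40, 8]) == ([20], [8])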
- Our generated code for element-wise operations takes some care with the
memory access pattern: we swap dimensions to iterate on the dimensions
with the smallest strides. But we don't go further.

- What do you mean by CSE? Constant optimization?

Fred

From d.s.seljebotn at astro.uio.no Sun May 27 23:24:44 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Sun, 27 May 2012 23:24:44 +0200
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FB60896.4030702@astro.uio.no>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no>
Message-ID: <4FC29B9C.1000308@astro.uio.no>

On 05/18/2012 10:30 AM, Dag Sverre Seljebotn wrote:
> On 05/18/2012 12:57 AM, Nick Coghlan wrote:
>> I think the main things we'd be looking for would be:
>> - a clear explanation of why a new metaclass is considered too complex a
>> solution
>> - what the implications are for classes that have nothing to do with the
>> SciPy/NumPy ecosystem
>> - how subclassing would behave (both at the class and metaclass level)
>>
>> Yes, defining a new metaclass for fast signature exchange has its
>> challenges - but it means that *our* concerns about maintaining
>> consistent behaviour in the default object model and avoiding adverse
>> effects on code that doesn't need the new behaviour are addressed
>> automatically.
>>
>> Also, I'd consider a functioning reference implementation using a custom
>> metaclass a requirement before we considered modifying type anyway, so I
>> think that's the best thing to pursue next rather than a PEP. It also
>> has the virtue of letting you choose which Python versions to target and
>> iterating at a faster rate than CPython.
>
> This seems right on target. I could make a utility code C header for
> such a metaclass, and then the different libraries can all include it
> and handshake on which implementation becomes the real one through
> sys.modules during module initialization. That way an eventual PEP will
> only be a natural incremental step to make things more polished, whether
> that happens by making such a metaclass part of the standard library or
> by extending PyTypeObject.

So I finally got around to implementing this:

https://github.com/dagss/pyextensibletype

Documentation now in a draft in the NumFOCUS SEP repo, which I believe is
a better place to store cross-project standards like this. (The NumPy
docstring standard will be SEP 100.)

https://github.com/numfocus/sep/blob/master/sep200.rst

Summary:

- No common runtime dependency

- 1 ns overhead per lookup (that's for the custom slot *alone*, no
fast-callable signature matching or similar)

- Slight annoyance: Types that want to use the metaclass must be a
PyHeapExtensibleType, to make the binary layout work with how CPython
makes subclasses from Python scripts

My conclusion: I think the metaclass approach should work really well.
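For those who don't want to dig into the repo, the layout is roughly the
following (simplified sketch; take the exact field and helper names with
a grain of salt and treat the SEP draft as authoritative):

    #include <Python.h>

    /* true if tp's metaclass is the extensible metaclass; definition elided */
    static int PyCustomSlots_Check(PyTypeObject *tp);

    typedef struct {
        unsigned long id;  /* globally coordinated slot id */
        void *data;        /* payload, e.g. a pointer to a function table */
    } PyCustomSlot;

    typedef struct {
        PyHeapTypeObject etp_heaptype;   /* binary-compatible with heap types */
        Py_ssize_t etp_count;            /* number of custom slots */
        PyCustomSlot *etp_custom_slots;  /* usually a static array */
    } PyHeapExtensibleTypeObject;

    /* The "1 ns" lookup: a linear scan over a tiny per-type table. */
    static void *
    PyCustomSlots_Find(PyTypeObject *tp, unsigned long id)
    {
        if (PyCustomSlots_Check(tp)) {
            PyHeapExtensibleTypeObject *etp =
                (PyHeapExtensibleTypeObject *)tp;
            Py_ssize_t i;
            for (i = 0; i < etp->etp_count; i++) {
                if (etp->etp_custom_slots[i].id == id)
                    return etp->etp_custom_slots[i].data;
            }
        }
        return NULL;
    }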
Dag

From njs at pobox.com Mon May 28 00:12:32 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 27 May 2012 23:12:32 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FC29B9C.1000308@astro.uio.no>
References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On Sun, May 27, 2012 at 10:24 PM, Dag Sverre Seljebotn wrote:
> So I finally got around to implementing this:
>
> https://github.com/dagss/pyextensibletype
> [...]
> My conclusion: I think the metaclass approach should work really well.

Few quick comments on skimming the code:

The complicated nested #ifdef for __builtin_expect could be simplified to
  #if defined(__GNUC__) && (__GNUC__ > 2 || __GNUC_MINOR__ > 95)

PyCustomSlots_Check should be called PyCustomSlots_CheckExact, surely?
And given that, how can this code work if someone does subclass this
metaclass?

Stealing a flag bit (but now to indicate this metaclass) would allow
us to make a real PyCustomSlots_Check function that was still fast.
It would also mean that different implementations didn't have to
rendezvous on a single PyExtensibleType_Type, so long as they all used
the same flag bit. That would let us skip monkeying around with
sys.modules.

Speaking of which, surely we should not be using sys.modules for this?
Stashing it in sys itself or something would make more sense, if we're
going to do it at all.

- N

From markflorisson88 at gmail.com Mon May 28 10:54:38 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 09:54:38 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On 27 May 2012 23:12, Nathaniel Smith wrote: [...]
> Few quick comments on skimming the code:
>
> The complicated nested #ifdef for __builtin_expect could be simplified to
>   #if defined(__GNUC__) && (__GNUC__ > 2 || __GNUC_MINOR__ > 95)
>
> PyCustomSlots_Check should be called PyCustomSlots_CheckExact, surely?
> And given that, how can this code work if someone does subclass this
> metaclass?

I think we should provide a wrapper for PyType_Ready, which just copies
the pointer to the table and the count directly into the subclass. If a
user then wishes to add stuff, the user can allocate a new memory region
dynamically, memcpy the base class' stuff in there, and append some
entries.

> Stealing a flag bit (but now to indicate this metaclass) would allow
> us to make a real PyCustomSlots_Check function that was still fast. It
> would also mean that different implementations didn't have to
> rendezvous on a single PyExtensibleType_Type, so long as they all used
> the same flag bit. That would let us skip monkeying around with
> sys.modules.
>
> Speaking of which, surely we should not be using sys.modules for this?
> Stashing it in sys itself or something would make more sense, if we're
> going to do it at all.

I think a module makes sense, if mangled appropriately. A module really
means shared state (even if the only state is the functions and classes).
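To sketch the PyType_Ready wrapper idea above (hypothetical code, none of
these helper names exist in the repo; the memcpy-and-append path is the
part a user would write):

    /* Hypothetical wrapper: inherit the base type's slot table as-is. */
    static int
    PyExtensibleType_ReadySubclass(PyHeapExtensibleTypeObject *sub,
                                   PyHeapExtensibleTypeObject *base)
    {
        sub->etp_count = base->etp_count;
        sub->etp_custom_slots = base->etp_custom_slots;
        return PyType_Ready((PyTypeObject *)sub);
    }

    /* User code that wants to extend rather than just inherit: */
    PyCustomSlot *table = PyMem_Malloc(
        (base->etp_count + n_extra) * sizeof(PyCustomSlot));
    if (table == NULL) { /* handle allocation failure */ }
    memcpy(table, base->etp_custom_slots,
           base->etp_count * sizeof(PyCustomSlot));
    /* ... append n_extra entries after the copied ones ... */
    sub->etp_custom_slots = table;
    sub->etp_count = base->etp_count + n_extra;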
From markflorisson88 at gmail.com Mon May 28 11:13:30 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 10:13:30 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On 28 May 2012 09:54, mark florisson wrote:
> [...]
> I think we should provide a wrapper for PyType_Ready, which just copies
> the pointer to the table and the count directly into the subclass. If a
> user then wishes to add stuff, the user can allocate a new memory region
> dynamically, memcpy the base class' stuff in there, and append some
> entries.

Maybe we should also allow each custom type to set a deallocator, since
they are then heap types which can go out of scope. The metaclass can
then call this deallocator to deallocate the table.
From njs at pobox.com Mon May 28 12:41:37 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 28 May 2012 11:41:37 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On Mon, May 28, 2012 at 10:13 AM, mark florisson wrote: [...]
> Maybe we should also allow each custom type to set a deallocator, since
> they are then heap types which can go out of scope. The metaclass can
> then call this deallocator to deallocate the table.

Custom types are plain old Python objects, they can use tp_dealloc.

- N

From markflorisson88 at gmail.com Mon May 28 12:55:55 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 11:55:55 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On 28 May 2012 11:41, Nathaniel Smith wrote: [...]
>> Maybe we should also allow each custom type to set a deallocator, since
>> they are then heap types which can go out of scope. The metaclass can
>> then call this deallocator to deallocate the table.
>
> Custom types are plain old Python objects, they can use tp_dealloc.

If I set etp_custom_slots to something allocated on the heap, then the
(shared) metaclass would have to deallocate it. The tp_dealloc of the
type itself would be called for its instances (which can be used to
deallocate dynamically allocated memory in the objects if you use a
custom slot "pointer offset").
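In other words, something like this (hypothetical -- the repo's metaclass
has no such hook today, and the ownership flag is made up):

    /* tp_dealloc of the *metaclass*: runs when a type object dies. */
    static void
    extensibletype_dealloc(PyObject *self)
    {
        PyHeapExtensibleTypeObject *etp = (PyHeapExtensibleTypeObject *)self;
        if (etp->etp_slots_on_heap)       /* made-up ownership flag */
            PyMem_Free(etp->etp_custom_slots);
        PyType_Type.tp_dealloc(self);     /* chain to plain 'type' */
    }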
From njs at pobox.com Mon May 28 13:01:25 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 28 May 2012 12:01:25 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On Mon, May 28, 2012 at 11:55 AM, mark florisson wrote: [...]
> If I set etp_custom_slots to something allocated on the heap, then the
> (shared) metaclass would have to deallocate it. The tp_dealloc of the
> type itself would be called for its instances (which can be used to
> deallocate dynamically allocated memory in the objects if you use a
> custom slot "pointer offset").

Oh, I see. Right, the natural way to handle this would be to have each
user define their own metaclass with the behavior they want. Another
argument for supporting multiple metaclasses simultaneously I guess...

- N

From markflorisson88 at gmail.com Mon May 28 13:09:22 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 12:09:22 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On 28 May 2012 12:01, Nathaniel Smith wrote: [...]
> Oh, I see. Right, the natural way to handle this would be to have each
> user define their own metaclass with the behavior they want. Another
> argument for supporting multiple metaclasses simultaneously I guess...

That bludgeons your constant time type check. It's easier to just reserve
an extra slot for a deallocator pointer :) It would probably be set to
NULL in the common case anyway, since you allocate your slots statically.
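Concretely, I mean something like this (illustrative only -- the extra
field is not in the repo's struct, and the name is made up):

    typedef struct {
        PyHeapTypeObject etp_heaptype;
        Py_ssize_t etp_count;
        PyCustomSlot *etp_custom_slots;
        /* Made-up extra slot: frees etp_custom_slots when the type dies;
           NULL for the common, statically allocated case. */
        void (*etp_slots_dealloc)(PyCustomSlot *slots);
    } PyHeapExtensibleTypeObject;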
From njs at pobox.com Mon May 28 13:24:08 2012
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 28 May 2012 12:24:08 +0100
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB366F3.7010208@v.loewis.de> <4FB3784C.9020906@v.loewis.de> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no>
Message-ID:

On Mon, May 28, 2012 at 12:09 PM, mark florisson wrote: [...]
> That bludgeons your constant time type check.

Not if you steal a flag, like the interpreter already does with
Py_TPFLAGS_INT_SUBCLASS, Py_TPFLAGS_STRING_SUBCLASS, etc. I was referring
to that argument I made earlier :-)

> It's easier to just reserve an extra slot for a deallocator pointer :)
> It would probably be set to NULL in the common case anyway, since you
> allocate your slots statically.

-N

From markflorisson88 at gmail.com Mon May 28 14:49:36 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 13:49:36 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On 25 May 2012 21:53, Frédéric Bastien wrote:
> Hi,
>
> Sorry for the delay, I had some schedule change.
>
> Thanks for adding me. Should I subscribe to cython-dev? How much daily
> email is there? I couldn't tell from the archives. Feel free to add me
> in CC again when you think it is appropriate.

There is usually not so much traffic on cython-dev, unless something
comes up that is debated to the death :)

> I'll reply here to all emails at the same time.
> Do you prefer that I reply to each email individually if this happens
> again? I'll try to reply faster next time.

No worries, either way works fine, don't worry too much about protocol
(the only thing to note is that we do bottom posting).

> - About pickling Theano: we currently can't pickle Theano functions. It
> could be made to work in some cases, but not all, as there are
> hardware-dependent optimizations in the Theano function. Currently that
> is mostly CPU vs GPU operations. So if we stay on the CPU we could do
> some pickling, but we should make sure that the C code compiled into
> Python modules is still there when we unpickle, or recompile it.
>
> - I think it makes sense to make a Theano graph from the Cython AST,
> optimize it and rebuild a Cython AST from the optimized graph. This
> would allow using Theano's optimizations.

OK, the important thing is that the graph can be pickled; it should be
pretty straightforward to generate code to build the function again from
the loaded graph.

> - It also makes sense to do the code generation in Theano and reuse it
> in Cython. But this would make the Theano dependency much stronger. I'm
> not sure you want this.
>
> - Another point not raised: Theano needs to know at compile time the
> dtype, the number of dimensions and which dimensions are broadcastable
> for each variable. I think that the last one could cause problems, but
> if you use specialization for the dtype, the same can be done for the
> broadcastability of the dimensions.

Hm, that would lead to kind of an explosion of combinations. I think we
could specialize only on no broadcasting at all (except for operands with
lesser dimensionality).

> - The compyte (GPU ndarray) project does collapsing of dimensions. This
> is an important optimization on the GPU, as doing the index computation
> in parallel is costlier. I think on the CPU we could probably collapse
> just the inner dimensions to make it faster.
>
> - Theano doesn't generate intrinsics or assembly, but we assume that g++
> will generate vectorized operations for simple loops. Recent versions of
> gcc/g++ do this.

Right, the aim is definitely to specialize for contiguous arrays, where
you collapse everything. Specializing statically for anything more would
be unfeasible, and better handled by a runtime compiler I think. For the
C backend, I'll start by generating simple C loops and see if the
compilers vectorize that already.

> - Our generated code for element-wise operations takes some care with
> the memory access pattern: we swap dimensions to iterate on the
> dimensions with the smallest strides. But we don't go further.
>
> - What do you mean by CSE? Constant optimization?

Yes, common subexpression elimination and also hoisting of unchanging
expressions outside the loop.

> Fred

I started a new project, https://github.com/markflorisson88/minivect ,
which currently features a simple C code generator. The specializer and
astbuilder do most of the work of creating the right AST, so the code
generator only has to implement code generation functions for simple
expressions. Depending on how it progresses I will look at incorporating
Theano's optimizations into it and having Theano use it as a C backend
for compatible expressions.
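For a flavor of what the C backend aims to emit for the fully contiguous
specialization, think of something like this (hand-written illustration,
not actual minivect output):

    /* a[i] = b[i] + c[i] with all operands C-contiguous: every dimension
       is collapsed into one flat loop, which gcc/g++ can vectorize. */
    static void
    elementwise_add_contig(double *restrict a, const double *restrict b,
                           const double *restrict c, size_t n)
    {
        size_t i;
        for (i = 0; i < n; i++)
            a[i] = b[i] + c[i];
    }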
From markflorisson88 at gmail.com Mon May 28 14:52:35 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 13:52:35 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On 28 May 2012 13:49, mark florisson wrote: [...]
> I started a new project, https://github.com/markflorisson88/minivect ,
> which currently features a simple C code generator. [...]

I forgot to mention, it's still pretty basic, but it works for simple
arithmetic expressions with non-overlapping (shifted) memory from Cython:
https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1

From markflorisson88 at gmail.com Mon May 28 14:54:33 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Mon, 28 May 2012 13:54:33 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On 28 May 2012 13:52, mark florisson wrote: [...]
> I forgot to mention, it's still pretty basic, but it works for simple
> arithmetic expressions with non-overlapping (shifted) memory from Cython:
> https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1
From d.s.seljebotn at astro.uio.no Mon May 28 17:31:14 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 28 May 2012 17:31:14 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: References: <4FB35ACA.7090908@astro.uio.no> <4FB385F3.7070209@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no> Message-ID: <4FC39A42.4040602@astro.uio.no> On 05/28/2012 01:24 PM, Nathaniel Smith wrote: > On Mon, May 28, 2012 at 12:09 PM, mark florisson > wrote: >> On 28 May 2012 12:01, Nathaniel Smith wrote: >>> On Mon, May 28, 2012 at 11:55 AM, mark florisson >>> wrote: >>>> On 28 May 2012 11:41, Nathaniel Smith wrote: >>>>> On Mon, May 28, 2012 at 10:13 AM, mark florisson >>>>> wrote: >>>>>> On 28 May 2012 09:54, mark florisson wrote: >>>>>>> On 27 May 2012 23:12, Nathaniel Smith wrote: >>>>>>>> On Sun, May 27, 2012 at 10:24 PM, Dag Sverre Seljebotn >>>>>>>> wrote: >>>>>>>>> On 05/18/2012 10:30 AM, Dag Sverre Seljebotn wrote: >>>>>>>>>> >>>>>>>>>> On 05/18/2012 12:57 AM, Nick Coghlan wrote: >>>>>>>>>>> >>>>>>>>>>> I think the main things we'd be looking for would be: >>>>>>>>>>> - a clear explanation of why a new metaclass is considered too complex a >>>>>>>>>>> solution >>>>>>>>>>> - what the implications are for classes that have nothing to do with the >>>>>>>>>>> SciPy/NumPy ecosystem >>>>>>>>>>> - how subclassing would behave (both at the class and metaclass level) >>>>>>>>>>> >>>>>>>>>>> Yes, defining a new metaclass for fast signature exchange has its >>>>>>>>>>> challenges - but it means that *our* concerns about maintaining >>>>>>>>>>> consistent behaviour in the default object model and avoiding adverse >>>>>>>>>>> effects on code that doesn't need the new behaviour are addressed >>>>>>>>>>> automatically. >>>>>>>>>>> >>>>>>>>>>> Also, I'd consider a functioning reference implementation using a custom >>>>>>>>>>> metaclass a requirement before we considered modifying type anyway, so I >>>>>>>>>>> think that's the best thing to pursue next rather than a PEP. It also >>>>>>>>>>> has the virtue of letting you choose which Python versions to target and >>>>>>>>>>> iterating at a faster rate than CPython. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This seems right on target. I could make a utility code C header for >>>>>>>>>> such a metaclass, and then the different libraries can all include it >>>>>>>>>> and handshake on which implementation becomes the real one through >>>>>>>>>> sys.modules during module initialization. That way an eventual PEP will >>>>>>>>>> only be a natural incremental step to make things more polished, whether >>>>>>>>>> that happens by making such a metaclass part of the standard library or >>>>>>>>>> by extending PyTypeObject. >>>>>>>>> >>>>>>>>> >>>>>>>>> So I finally got around to implementing this: >>>>>>>>> >>>>>>>>> https://github.com/dagss/pyextensibletype >>>>>>>>> >>>>>>>>> Documentation now in a draft in the NumFOCUS SEP repo, which I believe is a >>>>>>>>> better place to store cross-project standards like this. (The NumPy >>>>>>>>> docstring standard will be SEP 100). 
>>>>>>>>>
>>>>>>>>> https://github.com/numfocus/sep/blob/master/sep200.rst
>>>>>>>>>
>>>>>>>>> Summary:
>>>>>>>>>
>>>>>>>>> - No common runtime dependency
>>>>>>>>>
>>>>>>>>> - 1 ns overhead per lookup (that's for the custom slot *alone*, no
>>>>>>>>> fast-callable signature matching or similar)
>>>>>>>>>
>>>>>>>>> - Slight annoyance: Types that want to use the metaclass must be a
>>>>>>>>> PyHeapExtensibleType, to make the binary layout work with how CPython
>>>>>>>>> makes subclasses from Python scripts
>>>>>>>>>
>>>>>>>>> My conclusion: I think the metaclass approach should work really well.
>>>>>>>>
>>>>>>>> Few quick comments on skimming the code:
>>>>>>>>
>>>>>>>> The complicated nested #ifdef for __builtin_expect could be simplified to
>>>>>>>> #if defined(__GNUC__) && (__GNUC__ > 2 || __GNUC_MINOR__ > 95)
>>>>>>>>
>>>>>>>> PyCustomSlots_Check should be called PyCustomSlots_CheckExact, surely?
>>>>>>>> And given that, how can this code work if someone does subclass this
>>>>>>>> metaclass?
>>>>>>>
>>>>>>> I think we should provide a wrapper for PyType_Ready, which just
>>>>>>> copies the pointer to the table and the count directly into the
>>>>>>> subclass. If a user then wishes to add stuff, the user can allocate a
>>>>>>> new memory region dynamically, memcpy the base class' stuff in there,
>>>>>>> and append some entries.
>>>>>>
>>>>>> Maybe we should also allow each custom type to set a deallocator,
>>>>>> since they are then heap types which can go out of scope. The
>>>>>> metaclass can then call this deallocator to deallocate the table.
>>>>>
>>>>> Custom types are plain old Python objects, they can use tp_dealloc.
>>>>>
>>>> If I set etp_custom_slots to something allocated on the heap, then the
>>>> (shared) metaclass would have to deallocate it. The tp_dealloc of the
>>>> type itself would be called for its instances (which can be used to
>>>> deallocate dynamically allocated memory in the objects if you use a
>>>> custom slot "pointer offset").
>>>
>>> Oh, I see. Right, the natural way to handle this would be have each
>>> user define their own metaclass with the behavior they want. Another
>>> argument for supporting multiple metaclasses simultaneously I guess...
>>>
>>> - N
>>> _______________________________________________
>>> cython-devel mailing list
>>> cython-devel at python.org
>>> http://mail.python.org/mailman/listinfo/cython-devel
>>
>> That bludgeons your constant time type check.
>
> Not if you steal a flag, like the interpreter already does with
> Py_TPFLAGS_INT_SUBCLASS, Py_TPFLAGS_STRING_SUBCLASS, etc. I was
> referring to that argument I made earlier :-)
>
>> It's easier to just
>> reserve an extra slot for a deallocator pointer :) It would probably
>> be set to NULL in the common case anyway, since you allocate your
>> slots statically.

Subclassing: Note that even if all types have to have a PyHeapTypeObject
structure, they are still statically allocated! So for statically
created subclasses (which should be the majority of the cases), there's
not going to be any deallocator...

I agree that there should be a PyExtensibleType_Ready. To keep
allocating statically I propose that the subclass should leave some room
open for slots from the superclass:

PyCustomSlot subclass_custom_slots[10] = {
    {SLOT_C, foo}, {SLOT_D, BAR}, {0,0}, ...
}

Then, fill in etp_count=2, etp_custom_slots=subclass_custom_slots, and
then call PyExtensibleType_Ready(&Subclass_Type, 10); i.e., the number
of total elements in etp_custom_slots is passed in.
One should always leave more room than one thinks one needs if the
superclass is from another library...

Then, inheritance happens according to the following rules:

 - Slots are inherited from the superclass
 - Slots in the subclass with the same ID overwrite the superclass slot
 - Slots from the superclass are put before slots from the subclass
 - An exception is raised if the number of final slots is larger than
   the limit passed in to PyExtensibleType_Ready.

(Whenever this is not sufficient, you can always manually munge the
table after PyExtensibleType_Ready.)

Question: How to deal with possible flag bits in the ID?

Three approaches:

 a) Forget about the flags-in-ID idea; if you want flags, stick them in
    the data

 b) Embed a separate variable for flags in every PyCustomSlot

 c) Standardize on a *hard* requirement on the bottom 8 bits being
    flags while the top 24 bits indicate incompatible slots; so for the
    purposes of inheritance, 0x12345601 would overwrite 0x12345600.

To me, b) is OK, but the 32 bit ID space is already so ridiculously huge
that c) is a "why not"? -1 on a), it'd be rather tedious if the payload
is an offset to the PyObject*.

Subclassing in heap-allocated types (subclasses Python side): It'd
certainly be nice to completely ignore this for now and require making a
sub-metaclass to support this (e.g., have tp_new parse some
__customslots__ attribute in the class dict).

Hijacking a TP_FLAG: We could make the branch for a direct hit on
metaclass comparison likely(), so that the branch checking tp_base on
the metaclass is unlikely(), which with branch prediction I think makes
it very likely that there's no penalty for allowing sub-metaclasses
(when you don't use them -- when you do, there's a slight penalty).

But here's another great argument in favour of a TP_FLAG bit: Consumers
would then not need to import the metaclass or contain its definition
(which is really only around in case the user imports the consumer
before the provider...). This would make the header file the consumers
need to bundle much lighter. So I think I'm +1.

At any rate, I would like the metaclass rendezvous to keep happening
just because it's less confusing if "extensibletype is extensibletype"
in general.

Anyway, the metaclass checking is a nice fallback if CPython uses all
their flag bits.
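To make the inheritance rules concrete, here is a rough sketch of what
the merge step inside PyExtensibleType_Ready could look like -- the
names, the exact PyCustomSlot layout, and the error strategy are all
provisional, and the flags-in-ID question above is ignored:

#include <string.h>

typedef struct {
    unsigned long id;   /* or a fixed 32-bit type, to be decided */
    void *data;
} PyCustomSlot;

/* Merge superclass slots into the statically allocated subclass table.
   "capacity" is the total size of sub_slots, as passed to
   PyExtensibleType_Ready; returns the new slot count, or -1 if the
   table is too small (the caller would raise an exception). */
static int
merge_custom_slots(PyCustomSlot *sub_slots, int sub_count,
                   const PyCustomSlot *super_slots, int super_count,
                   int capacity)
{
    PyCustomSlot merged[64];  /* scratch; sketch assumes capacity <= 64 */
    int n = 0, i, j;

    if (capacity > 64)
        return -1;

    /* Slots from the superclass are put first... */
    for (i = 0; i < super_count; i++) {
        if (n == capacity) return -1;
        merged[n++] = super_slots[i];
    }
    /* ...then the subclass slots; a matching ID overwrites the
       inherited entry in place, anything else is appended. */
    for (i = 0; i < sub_count; i++) {
        for (j = 0; j < super_count; j++) {
            if (merged[j].id == sub_slots[i].id) {
                merged[j] = sub_slots[i];
                break;
            }
        }
        if (j == super_count) {
            if (n == capacity) return -1;
            merged[n++] = sub_slots[i];
        }
    }
    memcpy(sub_slots, merged, n * sizeof(PyCustomSlot));
    return n;
}

With approach c), the ID comparison above would be done under the ~0xFF
mask rather than by exact equality.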
Dag From d.s.seljebotn at astro.uio.no Mon May 28 17:59:43 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 28 May 2012 17:59:43 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <4FC39A42.4040602@astro.uio.no> References: <4FB35ACA.7090908@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no> <4FC39A42.4040602@astro.uio.no> Message-ID: <90d9723f-7003-48f7-b959-955c22850d2d@email.android.com> Dag Sverre Seljebotn wrote: >On 05/28/2012 01:24 PM, Nathaniel Smith wrote: >> On Mon, May 28, 2012 at 12:09 PM, mark florisson >> wrote: >>> On 28 May 2012 12:01, Nathaniel Smith wrote: >>>> On Mon, May 28, 2012 at 11:55 AM, mark florisson >>>> wrote: >>>>> On 28 May 2012 11:41, Nathaniel Smith wrote: >>>>>> On Mon, May 28, 2012 at 10:13 AM, mark florisson >>>>>> wrote: >>>>>>> On 28 May 2012 09:54, mark florisson >wrote: >>>>>>>> On 27 May 2012 23:12, Nathaniel Smith wrote: >>>>>>>>> On Sun, May 27, 2012 at 10:24 PM, Dag Sverre Seljebotn >>>>>>>>> wrote: >>>>>>>>>> On 05/18/2012 10:30 AM, Dag Sverre Seljebotn wrote: >>>>>>>>>>> >>>>>>>>>>> On 05/18/2012 12:57 AM, Nick Coghlan wrote: >>>>>>>>>>>> >>>>>>>>>>>> I think the main things we'd be looking for would be: >>>>>>>>>>>> - a clear explanation of why a new metaclass is considered >too complex a >>>>>>>>>>>> solution >>>>>>>>>>>> - what the implications are for classes that have nothing >to do with the >>>>>>>>>>>> SciPy/NumPy ecosystem >>>>>>>>>>>> - how subclassing would behave (both at the class and >metaclass level) >>>>>>>>>>>> >>>>>>>>>>>> Yes, defining a new metaclass for fast signature exchange >has its >>>>>>>>>>>> challenges - but it means that *our* concerns about >maintaining >>>>>>>>>>>> consistent behaviour in the default object model and >avoiding adverse >>>>>>>>>>>> effects on code that doesn't need the new behaviour are >addressed >>>>>>>>>>>> automatically. >>>>>>>>>>>> >>>>>>>>>>>> Also, I'd consider a functioning reference implementation >using a custom >>>>>>>>>>>> metaclass a requirement before we considered modifying type >anyway, so I >>>>>>>>>>>> think that's the best thing to pursue next rather than a >PEP. It also >>>>>>>>>>>> has the virtue of letting you choose which Python versions >to target and >>>>>>>>>>>> iterating at a faster rate than CPython. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This seems right on target. I could make a utility code C >header for >>>>>>>>>>> such a metaclass, and then the different libraries can all >include it >>>>>>>>>>> and handshake on which implementation becomes the real one >through >>>>>>>>>>> sys.modules during module initialization. That way an >eventual PEP will >>>>>>>>>>> only be a natural incremental step to make things more >polished, whether >>>>>>>>>>> that happens by making such a metaclass part of the standard >library or >>>>>>>>>>> by extending PyTypeObject. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> So I finally got around to implementing this: >>>>>>>>>> >>>>>>>>>> https://github.com/dagss/pyextensibletype >>>>>>>>>> >>>>>>>>>> Documentation now in a draft in the NumFOCUS SEP repo, which >I believe is a >>>>>>>>>> better place to store cross-project standards like this. (The >NumPy >>>>>>>>>> docstring standard will be SEP 100). 
>>>>>>>>>> >>>>>>>>>> https://github.com/numfocus/sep/blob/master/sep200.rst >>>>>>>>>> >>>>>>>>>> Summary: >>>>>>>>>> >>>>>>>>>> - No common runtime dependency >>>>>>>>>> >>>>>>>>>> - 1 ns overhead per lookup (that's for the custom slot >*alone*, no >>>>>>>>>> fast-callable signature matching or similar) >>>>>>>>>> >>>>>>>>>> - Slight annoyance: Types that want to use the metaclass >must be a >>>>>>>>>> PyHeapExtensibleType, to make the binary layout work with how >CPython makes >>>>>>>>>> subclasses from Python scripts >>>>>>>>>> >>>>>>>>>> My conclusion: I think the metaclass approach should work >really well. >>>>>>>>> >>>>>>>>> Few quick comments on skimming the code: >>>>>>>>> >>>>>>>>> The complicated nested #ifdef for __builtin_expect could be >simplified to >>>>>>>>> #if defined(__GNUC__)&& (__GNUC__> 2 || __GNUC_MINOR__> >95) >>>>>>>>> >>>>>>>>> PyCustomSlots_Check should be called PyCustomSlots_CheckExact, >surely? >>>>>>>>> And given that, how can this code work if someone does >subclass this >>>>>>>>> metaclass? >>>>>>>> >>>>>>>> I think we should provide a wrapper for PyType_Ready, which >just >>>>>>>> copies the pointer to the table and the count directly into the >>>>>>>> subclass. If a user then wishes to add stuff, the user can >allocate a >>>>>>>> new memory region dynamically, memcpy the base class' stuff in >there, >>>>>>>> and append some entries. >>>>>>> >>>>>>> Maybe we should also allow each custom type to set a >deallocator, >>>>>>> since they are then heap types which can go out of scope. The >>>>>>> metaclass can then call this deallocator to deallocate the >table. >>>>>> >>>>>> Custom types are plain old Python objects, they can use >tp_dealloc. >>>>>> >>>>> If I set etp_custom_slots to something allocated on the heap, then >the >>>>> (shared) metaclass would have to deallocate it. The tp_dealloc of >the >>>>> type itself would be called for its instances (which can be used >to >>>>> deallocate dynamically allocated memory in the objects if you use >a >>>>> custom slot "pointer offset"). >>>> >>>> Oh, I see. Right, the natural way to handle this would be have each >>>> user define their own metaclass with the behavior they want. >Another >>>> argument for supporting multiple metaclasses simultaneously I >guess... >>>> >>>> - N >>>> _______________________________________________ >>>> cython-devel mailing list >>>> cython-devel at python.org >>>> http://mail.python.org/mailman/listinfo/cython-devel >>> >>> That bludgeons your constant time type check. >> >> Not if you steal a flag, like the interpreter already does with >> Py_TPFLAGS_INT_SUBCLASS, Py_TPFLAGS_STRING_SUBCLASS, etc. I was >> referring to that argument I made earlier :-) >> >>> It's easier to just >>> reserve an extra slot for a deallocator pointer :) It would probably >>> be set to NULL in the common case anyway, since you allocate your >>> slots statically. > >Subclassing: Note that even if all types has to have a PyHeapTypeObject > >structure, they are still statically allocated! So for statically >created subclasses (which should be the majority of the cases), there's > >not going to be any deallocator... > >I agree that there should be a PyExtensibleType_Ready. To keep >allocating statically I propose that the subclass should leave some >room >open for slots from the superclass: > >PyCustomSlot subclass_custom_slots[10] = { > {SLOT_C, foo}, {SLOT_D, BAR}, {0,0}, ... 
>} > >Then, fill in etp_count=2, etp_custom_slots=subclass_custom_slots, and >then call PyExtensibleType_Ready(&Subclass_Type, 10); i.e., the number >of total elements in etp_custom_slots is passed in. > >One should always leave more room than one thinks one needs if the >superclass is from another library... > >Then, inheritance happens according to the following rules: > > - Slots are inherited from superclass > - Slots in subclass with same ID overwrites superclass > - Slots from superclass are put before slots from subclass > - Exception raised if the number of final slots is larger than the >limit passed in to PyExtensibleType_Ready. > >(Whenever this is not sufficient, you can always manually munge the >table after PyExtensibleType_Ready.) > >Question: How to deal with possible flag bits in the ID? > >Three approaches: > >a) Forget about the flags-in-ID idea; if you want flags, stick them in >the data > > b) Embed a seperate variable for flags in every PyCustomSlot > > c) Standardize on a *hard* requirement on the bottom 8 bits being >flags while the top 24 bits indicate incompatible slots; so for the >purposes of inheritance, 0x12345601 would overwrite 0x12345600. > >To me, b) is OK, but the 32 bit ID space is already so ridiculously >huge >that c) is a "why not"? -1 on a), it'd be rather tedious if the payload > I guess the sane thing to do is make the custom slot (id, flags, data); and have id and flags be 32 bits on all platforms. Otherwise 32 bits are wasted to padding on 64 bit platforms anyway. Is there a type one can safely use everywhere to get a 32 bit unsigned int? Does MSVC support stdint.h? Dag >is an offset to the PyObject*. > >Subclassing in heap-allocated types (subclasses Python side): It'd >certainly be nice to completely ignore this for now and require making >a >sub-metaclass to support this (e.g., have tp_new parse some >__customslots__ attribute in the class dict). > >Hijacking a TP_FLAG: We could make the branch for a direct hit on >metaclass comparison likely(), so that the branch checking tp_base on >the metaclass unlikely(), which with branch prediction I think makes it > >very likely that there's no penalty for allowing sub-metaclasses (when >you don't use them -- when you do, there's a slight penalty). > >But here's another great argument in favour of a TP_FLAG bit: Consumers > >would then not need to import the metaclass or contain its definition >(which is really only around in case the user imports the consumer >before the provider...). This would make the header file the consumers >need to bundle much lighter. So I think I'm +1. > >At any rate, I would like the metaclass rendezvous to keep happening >just because it's less confusing if "extensibletype is extensibletype" >in general. > >Anyway, the metaclass checking is a nice fallback if CPython uses all >their flag bits. > >Dag >_______________________________________________ >cython-devel mailing list >cython-devel at python.org >http://mail.python.org/mailman/listinfo/cython-devel -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. 
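(On the 32-bit question above: MSVC only started shipping stdint.h with
Visual Studio 2010, so older MSVC versions need a fallback; something
like the sketch below should work everywhere. The typedef name is just
for illustration.)

#if defined(_MSC_VER) && _MSC_VER < 1600
    /* Older MSVC has no stdint.h; __int32 is an MSVC built-in type */
    typedef unsigned __int32 customslot_id_t;
#else
    #include <stdint.h>
    typedef uint32_t customslot_id_t;
#endif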
From d.s.seljebotn at astro.uio.no Mon May 28 21:11:24 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Mon, 28 May 2012 21:11:24 +0200
Subject: [Cython] NumFOCUS and continuous integration
In-Reply-To: <84F88118-1C9D-4C4B-B644-36D68BF2B68A@gmail.com>
References: <84F88118-1C9D-4C4B-B644-36D68BF2B68A@gmail.com>
Message-ID: <4FC3CDDC.9010209@astro.uio.no>

There's NumFOCUS money for CI -- also for the work needed to configure
and run it, etc. It would not be only for Cython, but for open source
scientific Python (Cython, NumPy, SciPy etc.). So if anybody would like
to work up to 50% time on that (or perhaps just wants to spend some
weeks setting up a CI system that can be shared by said projects), then
please get in touch with Travis and NumFOCUS (see thread below).

Some context: There's been heavy discussion on the NumFOCUS list on
identifying a "core subset" of packages (Cython+NumPy+SciPy+Pandas+...)
that could be tested and promoted together [1]. I started a discussion
[2] to try to divert the question of CI-ing that core subset from the
usual way ("release managing a software distribution") to instead be
about giving those resources to the open source projects involved, so
that there's no duplicated CI work.

Of course, Cython is in better standing than most already on the CI
front thanks to Stefan and the Sage project, but the fact that the work
is paid for may perhaps make it more appealing.

Full threads:

[1] https://groups.google.com/forum/?fromgroups#!topic/numfocus/MnRzBhmqXqk
[2] https://groups.google.com/forum/?fromgroups#!topic/numfocus/I_kmL4FUGaY

Dag

-------- Original Message --------
Subject: Re: Continuous integration
Date: Mon, 28 May 2012 13:47:52 -0500
From: Travis Oliphant
Reply-To: numfocus at googlegroups.com
To: numfocus at googlegroups.com

On May 28, 2012, at 1:27 PM, Dag Sverre Seljebotn wrote:

> On 05/28/2012 05:55 PM, David wrote:
>>
>> On Sunday, May 27, 2012 3:05:06 AM UTC+9, Dag Sverre Seljebotn wrote:
>>
>> That Other Thread contained some references to CI. So I'm mainly
>> wondering what the current NumFOCUS plans for supporting CI efforts
>> are, if any? Was there a mention of money being available for somebody
>> to work on that?
>>
>> I think of Cython as one vertex in a CI graph:
>>
>> a) Upstream, Cython depends on various versions of NumPy and Python as
>> part of its CI
>>
>> b) Downstream, we rely on building and testing Sage as an extended
>> regression test suite. It's not just that our test suite doesn't have
>> 100% coverage, but also sometimes that we intentionally break backwards
>> compatibility, and fixing up Sage gives a nice indication of the
>> consequences of a change. (I imagine NumPy is in a very similar
>> situation with lots of libraries depending on it and potential for very
>> subtle breakage of backwards compatibility.)
>>
>> Ideally, we'd have LOTS of libraries using Cython in our CI -- mpi4py,
>> Pandas, scikits-learn... BUT, we're running into problems as it is with
>> the CI server hardware keeping up (if there was infinite CI there'd
>> probably be a lot more tests set up).
>>
>> Since most scientific-python libraries have both dependencies and
>> dependees, it seems like there should be some benefit to having the
>> same CI system test all of them. That could conserve both hardware use
>> and administration overhead.
>> And, which I believe is very important, make it easier for small
>> projects like scikits-sparse to participate in automatic CI by simply
>> participating in an environment maintained by the larger projects like
>> Cython and NumPy.
>>
>> I think there's two ways of spinning this:
>>
>> 1) Build "Sciome" in a CI and approach it by "release managing" Sciome
>>
>> 2) Focus on providing "infrastructure for library developers", where
>> there's a relatively big CI graph where each project has a node. I.e.
>> something like "ShiningPanda for scientific Python", but with a
>> critical difference being that each project can use the build artifacts
>> of others; Cython would flip a switch and have all projects depending
>> on Cython being rebuilt to try the new Cython version. This seems
>> complicated, but certainly Jenkins for one seems to support such setups
>> already so it's mainly about hardware and administration...
>>
>> I know I get a lot more excited by 2) than 1), even if it's perhaps
>> mainly the spin put on it rather than a technical difference.
>>
>>
>> I know I prefer 2 as well. I know of at least one attempt of doing
>> something like this (linux-only, though): the build service from open
>> suse (http://www.open-build-service.org/). 4 years ago, it could
>> already do useful bits that are not trivial in any CI I am aware of:
>>
>> - dependencies between packages are known: if you update the build
>> service package numpy, and the build is successful, it will
>> automatically rebuild all the dependent packages
>> - it handles completely isolated build environments through vms
>> - it produces rpm, debian, etc.. that can be easily installed in the
>> supported distributions.
>>
>> My ideal setup would be something like this, but working on windows
>> and mac (and not developed in perl), and with easier set up a la Travis
>> CI. I don't know yet the best way to go there: is it based on existing
>> infrastructure (jenkins, travis, something else), building our own, or
>> a hybrid between the two?
>
> What I imagined was really something very simple (and I put it in too
> convoluted a way). Just have NumFOCUS cash out :-) for one or more
> servers, then give everybody who wants shell access to a shared CI
> instance (running Jenkins or whatever).

This sounds like a great plan. NumFOCUS can provide for this. We will
just need a budget and board approval (but I suspect the board will be
happy to approve this kind of plan).

So, who wants to spec out the machine? There is some money. The problem
is actually a people problem. Who is going to be available to do the
work? So far, I have not been able to find anyone experienced who is
willing to work on this stuff at 50% time. If you are willing, let me
know in a private email or an email to admin at numfocus.org

Best,

-Travis

From robertwb at gmail.com Tue May 29 23:26:35 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Tue, 29 May 2012 14:26:35 -0700
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On Mon, May 28, 2012 at 5:54 AM, mark florisson wrote:
> On 28 May 2012 13:52, mark florisson wrote:
>> On 28 May 2012 13:49, mark florisson wrote:
>>> On 25 May 2012 21:53, Frédéric Bastien wrote:
>>>> Hi,
>>>>
>>>> Sorry for the delay, I had some schedule change.
>>>>
>>>> thanks for adding me. Should I subscribe to cython-dev? How much email
>>>> is there daily? I didn't find this in the archives.
>>>> Feel free to add me in CC again when you think it is appropriate.
>>>
>>> There is usually not so much traffic on cython-dev, unless something
>>> comes up that is debated to the death :)
>>>
>>>> I'll reply here to all emails at the same time. Do you prefer that I
>>>> reply to each email individually if this happens again? I'll try to
>>>> reply faster next time.
>>>
>>> No worries, either way works fine, don't worry too much about protocol
>>> (the only thing to note is that we do bottom posting).
>>>
>>>> - About pickling Theano, we currently can't pickle Theano functions.
>>>> It could be made to work in some cases, but not for all cases as there
>>>> is hardware dependent optimization in the Theano function. Currently
>>>> it is mostly CPU vs GPU operation. So if we stay on the CPU, we could
>>>> do some pickling, but we should make sure that the compiled C code in
>>>> the Python modules is still there when we unpickle, or recompile it.
>>>>
>>>> - I think it makes sense to make a Theano graph from the Cython AST,
>>>> optimize it and rebuild a Cython AST from the optimized graph. This
>>>> would allow using Theano optimizations.
>>>
>>> Ok, the important thing is that the graph can be pickled, it should be
>>> pretty straightforward to generate code to build the function again
>>> from the loaded graph.
>>>
>>>> - It also makes sense to do the code generation in Theano and reuse it
>>>> in Cython. But this would make the Theano dependency much stronger.
>>>> I'm not sure you want this.
>>>>
>>>> - Another point not raised: Theano needs to know at compile time the
>>>> dtype, number of dimensions and which dimensions are broadcastable for
>>>> each variable. I think that the last one could cause problems, but if
>>>> you use specialization for the dtype, the same can be done for the
>>>> broadcastability of a dimension.
>>>
>>> Hm, that would lead to kind of an explosion of combinations. I think
>>> we could specialize only on no broadcasting at all (except for
>>> operands with lesser dimensionality).
>>>
>>>> - The compyte (gpu nd array) project does collapsing of dimensions.
>>>> This is an important optimization on the GPU as doing the index
>>>> computation in parallel is costlier. I think on the CPU we could
>>>> probably do collapsing just of the inner dimensions to make it faster.
>>>>
>>>> - Theano doesn't generate intrinsics or assembly, but we assume that
>>>> g++ will generate vectorized operations for simple loops. Recent
>>>> versions of gcc/g++ do this.
>>>
>>> Right, the aim is definitely to specialize for contiguous arrays,
>>> where you collapse everything. Specializing statically for anything
>>> more would be unfeasible, and better handled by a runtime compiler I
>>> think. For the C backend, I'll start by generating simple C loops and
>>> see if the compilers vectorize that already.
>>>
>>>> - Our generated code for element-wise operations takes a little care
>>>> about the memory access pattern. We swap dimensions to iterate on the
>>>> dimensions with the smallest strides. But we don't go further.
>>>>
>>>> - What do you mean by CSE? Constant optimization?
>>>
>>> Yes, common subexpression elimination and also hoisting of unchanging
>>> expressions outside the loop.
>>>
>>>> Fred
>>>
>>> I started a new project, https://github.com/markflorisson88/minivect ,
>>> which currently features a simple C code generator.
>>> The specializer and astbuilder do most of the work of creating the
>>> right AST, so the code generator only has to implement code generation
>>> functions for simple expressions. Depending on how it progresses I will
>>> look at incorporating Theano's optimizations into it and having Theano
>>> use it as a C backend for compatible expressions.
>>
>> I forgot to mention, it's still pretty basic, but it works for simple
>> arithmetic expressions with non-overlapping (shifted) memory from
>> Cython: https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1
>
> So basically, this project is to be used as a git submodule in Cython,
> and to be shipped directly in the source distribution. Is there any
> objection to that?

I'm not sure this is the best long-term solution (the alternative
would be making it part of Cython or adding a dependency) but I think
that's fine for now. I'm assuming there that the end user doesn't
explicitly reference it, right? It's just an optimization if present.

- Robert

From markflorisson88 at gmail.com Wed May 30 00:22:01 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 29 May 2012 23:22:01 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On 29 May 2012 22:26, Robert Bradshaw wrote:
> On Mon, May 28, 2012 at 5:54 AM, mark florisson wrote:
>> On 28 May 2012 13:52, mark florisson wrote:
>>> On 28 May 2012 13:49, mark florisson wrote:
>>>> On 25 May 2012 21:53, Frédéric Bastien wrote:
>>>>> Hi,
>>>>>
>>>>> Sorry for the delay, I had some schedule change.
>>>>>
>>>>> thanks for adding me. Should I subscribe to cython-dev? How much email
>>>>> is there daily? I didn't find this in the archives. Feel free to add
>>>>> me in CC again when you think it is appropriate.
>>>>
>>>> There is usually not so much traffic on cython-dev, unless something
>>>> comes up that is debated to the death :)
>>>>
>>>>> I'll reply here to all emails at the same time. Do you prefer that I
>>>>> reply to each email individually if this happens again? I'll try to
>>>>> reply faster next time.
>>>>
>>>> No worries, either way works fine, don't worry too much about protocol
>>>> (the only thing to note is that we do bottom posting).
>>>>
>>>>> - About pickling Theano, we currently can't pickle Theano functions.
>>>>> It could be made to work in some cases, but not for all cases as there
>>>>> is hardware dependent optimization in the Theano function. Currently
>>>>> it is mostly CPU vs GPU operation. So if we stay on the CPU, we could
>>>>> do some pickling, but we should make sure that the compiled C code in
>>>>> the Python modules is still there when we unpickle, or recompile it.
>>>>>
>>>>> - I think it makes sense to make a Theano graph from the Cython AST,
>>>>> optimize it and rebuild a Cython AST from the optimized graph. This
>>>>> would allow using Theano optimizations.
>>>>
>>>> Ok, the important thing is that the graph can be pickled, it should be
>>>> pretty straightforward to generate code to build the function again
>>>> from the loaded graph.
>>>>
>>>>> - It also makes sense to do the code generation in Theano and reuse
>>>>> it in Cython. But this would make the Theano dependency much stronger.
>>>>> I'm not sure you want this.
>>>>>
>>>>> - Another point not raised: Theano needs to know at compile time the
>>>>> dtype, number of dimensions and which dimensions are broadcastable for
>>>>> each variable.
>>>>> I think that the last one could cause problems, but if you use
>>>>> specialization for the dtype, the same can be done for the
>>>>> broadcastability of a dimension.
>>>>
>>>> Hm, that would lead to kind of an explosion of combinations. I think
>>>> we could specialize only on no broadcasting at all (except for
>>>> operands with lesser dimensionality).
>>>>
>>>>> - The compyte (gpu nd array) project does collapsing of dimensions.
>>>>> This is an important optimization on the GPU as doing the index
>>>>> computation in parallel is costlier. I think on the CPU we could
>>>>> probably do collapsing just of the inner dimensions to make it faster.
>>>>>
>>>>> - Theano doesn't generate intrinsics or assembly, but we assume that
>>>>> g++ will generate vectorized operations for simple loops. Recent
>>>>> versions of gcc/g++ do this.
>>>>
>>>> Right, the aim is definitely to specialize for contiguous arrays,
>>>> where you collapse everything. Specializing statically for anything
>>>> more would be unfeasible, and better handled by a runtime compiler I
>>>> think. For the C backend, I'll start by generating simple C loops and
>>>> see if the compilers vectorize that already.
>>>>
>>>>> - Our generated code for element-wise operations takes a little care
>>>>> about the memory access pattern. We swap dimensions to iterate on the
>>>>> dimensions with the smallest strides. But we don't go further.
>>>>>
>>>>> - What do you mean by CSE? Constant optimization?
>>>>
>>>> Yes, common subexpression elimination and also hoisting of unchanging
>>>> expressions outside the loop.
>>>>
>>>>> Fred
>>>>
>>>> I started a new project, https://github.com/markflorisson88/minivect ,
>>>> which currently features a simple C code generator. The specializer
>>>> and astbuilder do most of the work of creating the right AST, so the
>>>> code generator only has to implement code generation functions for
>>>> simple expressions. Depending on how it progresses I will look at
>>>> incorporating Theano's optimizations into it and having Theano use it
>>>> as a C backend for compatible expressions.
>>>
>>> I forgot to mention, it's still pretty basic, but it works for simple
>>> arithmetic expressions with non-overlapping (shifted) memory from
>>> Cython: https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1
>>
>> So basically, this project is to be used as a git submodule in Cython,
>> and to be shipped directly in the source distribution. Is there any
>> objection to that?
>
> I'm not sure this is the best long-term solution (the alternative
> would be making it part of Cython or adding a dependency) but I think
> that's fine for now. I'm assuming there that the end user doesn't
> explicitly reference it, right? It's just an optimization if present.
>
> - Robert
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> http://mail.python.org/mailman/listinfo/cython-devel

The only gotcha is that when you check out Cython from github you will
need to update the submodule. It's not really an optimization, it's an
implementation: the Cython AST needs to be mapped onto the new AST, and
the C code is then generated from that. I'm currently working on
interweaving this AST with Cython's AST, to support operations on
objects and complex numbers, as well as provide Cython semantics for
division and such (complex numbers are working now).
If it's all working, it might be possible to create an LLVM backend
with reasonable ease as well for our vector expressions, to provide
optimal just-in-time specializations.

From markflorisson88 at gmail.com Wed May 30 00:23:10 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Tue, 29 May 2012 23:23:10 +0100
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On 29 May 2012 23:22, mark florisson wrote:
> On 29 May 2012 22:26, Robert Bradshaw wrote:
>> On Mon, May 28, 2012 at 5:54 AM, mark florisson wrote:
>>> On 28 May 2012 13:52, mark florisson wrote:
>>>> On 28 May 2012 13:49, mark florisson wrote:
>>>>> On 25 May 2012 21:53, Frédéric Bastien wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Sorry for the delay, I had some schedule change.
>>>>>>
>>>>>> thanks for adding me. Should I subscribe to cython-dev? How much
>>>>>> email is there daily? I didn't find this in the archives. Feel free
>>>>>> to add me in CC again when you think it is appropriate.
>>>>>
>>>>> There is usually not so much traffic on cython-dev, unless something
>>>>> comes up that is debated to the death :)
>>>>>
>>>>>> I'll reply here to all emails at the same time. Do you prefer that I
>>>>>> reply to each email individually if this happens again? I'll try to
>>>>>> reply faster next time.
>>>>>
>>>>> No worries, either way works fine, don't worry too much about
>>>>> protocol (the only thing to note is that we do bottom posting).
>>>>>
>>>>>> - About pickling Theano, we currently can't pickle Theano functions.
>>>>>> It could be made to work in some cases, but not for all cases as
>>>>>> there is hardware dependent optimization in the Theano function.
>>>>>> Currently it is mostly CPU vs GPU operation. So if we stay on the
>>>>>> CPU, we could do some pickling, but we should make sure that the
>>>>>> compiled C code in the Python modules is still there when we
>>>>>> unpickle, or recompile it.
>>>>>>
>>>>>> - I think it makes sense to make a Theano graph from the Cython AST,
>>>>>> optimize it and rebuild a Cython AST from the optimized graph. This
>>>>>> would allow using Theano optimizations.
>>>>>
>>>>> Ok, the important thing is that the graph can be pickled, it should
>>>>> be pretty straightforward to generate code to build the function
>>>>> again from the loaded graph.
>>>>>
>>>>>> - It also makes sense to do the code generation in Theano and reuse
>>>>>> it in Cython. But this would make the Theano dependency much
>>>>>> stronger. I'm not sure you want this.
>>>>>>
>>>>>> - Another point not raised: Theano needs to know at compile time the
>>>>>> dtype, number of dimensions and which dimensions are broadcastable
>>>>>> for each variable. I think that the last one could cause problems,
>>>>>> but if you use specialization for the dtype, the same can be done
>>>>>> for the broadcastability of a dimension.
>>>>>
>>>>> Hm, that would lead to kind of an explosion of combinations. I think
>>>>> we could specialize only on no broadcasting at all (except for
>>>>> operands with lesser dimensionality).
>>>>>
>>>>>> - The compyte (gpu nd array) project does collapsing of dimensions.
>>>>>> This is an important optimization on the GPU as doing the index
>>>>>> computation in parallel is costlier. I think on the CPU we could
>>>>>> probably do collapsing just of the inner dimensions to make it
>>>>>> faster.
>>>>>>
>>>>>> - Theano doesn't generate intrinsics or assembly, but we assume that
>>>>>> g++ will generate vectorized operations for simple loops. Recent
>>>>>> versions of gcc/g++ do this.
>>>>>
>>>>> Right, the aim is definitely to specialize for contiguous arrays,
>>>>> where you collapse everything. Specializing statically for anything
>>>>> more would be unfeasible, and better handled by a runtime compiler I
>>>>> think. For the C backend, I'll start by generating simple C loops and
>>>>> see if the compilers vectorize that already.
>>>>>
>>>>>> - Our generated code for element-wise operations takes a little care
>>>>>> about the memory access pattern. We swap dimensions to iterate on
>>>>>> the dimensions with the smallest strides. But we don't go further.
>>>>>>
>>>>>> - What do you mean by CSE? Constant optimization?
>>>>>
>>>>> Yes, common subexpression elimination and also hoisting of unchanging
>>>>> expressions outside the loop.
>>>>>
>>>>>> Fred
>>>>>
>>>>> I started a new project, https://github.com/markflorisson88/minivect ,
>>>>> which currently features a simple C code generator. The specializer
>>>>> and astbuilder do most of the work of creating the right AST, so the
>>>>> code generator only has to implement code generation functions for
>>>>> simple expressions. Depending on how it progresses I will look at
>>>>> incorporating Theano's optimizations into it and having Theano use it
>>>>> as a C backend for compatible expressions.
>>>>
>>>> I forgot to mention, it's still pretty basic, but it works for simple
>>>> arithmetic expressions with non-overlapping (shifted) memory from
>>>> Cython: https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1
>>>
>>> So basically, this project is to be used as a git submodule in Cython,
>>> and to be shipped directly in the source distribution. Is there any
>>> objection to that?
>>
>> I'm not sure this is the best long-term solution (the alternative
>> would be making it part of Cython or adding a dependency) but I think
>> that's fine for now. I'm assuming there that the end user doesn't
>> explicitly reference it, right? It's just an optimization if present.
>>
>> - Robert
>> _______________________________________________
>> cython-devel mailing list
>> cython-devel at python.org
>> http://mail.python.org/mailman/listinfo/cython-devel
>
> The only gotcha is that when you check out Cython from github you will
> need to update the submodule. It's not really an optimization, it's an
> implementation: the Cython AST needs to be mapped onto the new AST, and
> the C code is then generated from that. I'm currently working on
> interweaving this AST with Cython's AST, to support operations on
> objects and complex numbers, as well as provide Cython semantics for
> division and such (complex numbers are working now).
>
> If it's all working, it might be possible to create an LLVM backend
> with reasonable ease as well for our vector expressions, to provide
> optimal just-in-time specializations.

(Eventually the project itself might support such things, but for now
this is easier, and it may be useful in other situations as well).
From robertwb at gmail.com Wed May 30 00:29:07 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Tue, 29 May 2012 15:29:07 -0700 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no> Message-ID: On Tue, May 29, 2012 at 3:22 PM, mark florisson wrote: >>>>> I started a new project, https://github.com/markflorisson88/minivect , >>>>> which currently features a simple C code generator. The specializer >>>>> and astbuilder do most of the work of creating the right AST, so the >>>>> code generator only has to implement code generation functions for >>>>> simple expressions. Depending on how it progresses I will look at >>>>> incorporating Theano's optimizations into it and having Theano use it >>>>> as a C backend for compatible expressions. >>>> >>>> I forgot to mention, it's still pretty basic, but it works for simple >>>> arithmetic expressions with non-overlapping (shifted) memory from >>>> Cython: https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1 >>> >>> So basically, this project is to be used as a git submodule in Cython, >>> and to be shipped directly in the source distribution. Is there any >>> objection to that? >> >> I'm not sure this is the best long-term solution (the alternative >> would be making it part of Cython or adding a dependency) but I think >> that's fine for now. I'm assuming there that the end user doesn't >> explicitly reference it, right? It's just an optimization if present. >> >> - Robert > > The only gotcha is that when you checkout Cython from github you will > need to update the submodule. Manually, right? How badly do things go wrong if you forget to? > It's not really an optimization, it's an > implementation, that is the Cython AST needs to be mapped onto the new > AST, and then it generates the C code from that. I meant in the sense that the user never refers to this code explicitly, so we have the flexibility of merging it in/splitting it off/moving it around internally without breaking users, right? - Robert From markflorisson88 at gmail.com Wed May 30 00:32:15 2012 From: markflorisson88 at gmail.com (mark florisson) Date: Tue, 29 May 2012 23:32:15 +0100 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no> Message-ID: On 29 May 2012 23:29, Robert Bradshaw wrote: > On Tue, May 29, 2012 at 3:22 PM, mark florisson > wrote: >>>>>> I started a new project, https://github.com/markflorisson88/minivect , >>>>>> which currently features a simple C code generator. The specializer >>>>>> and astbuilder do most of the work of creating the right AST, so the >>>>>> code generator only has to implement code generation functions for >>>>>> simple expressions. Depending on how it progresses I will look at >>>>>> incorporating Theano's optimizations into it and having Theano use it >>>>>> as a C backend for compatible expressions. >>>>> >>>>> I forgot to mention, it's still pretty basic, but it works for simple >>>>> arithmetic expressions with non-overlapping (shifted) memory from >>>>> Cython: https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1 >>>> >>>> So basically, this project is to be used as a git submodule in Cython, >>>> and to be shipped directly in the source distribution. Is there any >>>> objection to that? 
>>> >>> I'm not sure this is the best long-term solution (the alternative >>> would be making it part of Cython or adding a dependency) but I think >>> that's fine for now. I'm assuming there that the end user doesn't >>> explicitly reference it, right? It's just an optimization if present. >>> >>> - Robert >> >> The only gotcha is that when you checkout Cython from github you will >> need to update the submodule. > > Manually, right? How badly do things go wrong if you forget to? Unfortunately, yes, I don't think there's an automatic way with git. The compiler could display an error message, like "did you forget to do ...". >> It's not really an optimization, it's an >> implementation, that is the Cython AST needs to be mapped onto the new >> AST, and then it generates the C code from that. > > I meant in the sense that the user never refers to this code > explicitly, so we have the flexibility of merging it in/splitting it > off/moving it around internally without breaking users, right? Definitely, we can do that if we wish. I don't know how easy merging in new code is if we also modify the Cython version though. > - Robert > _______________________________________________ > cython-devel mailing list > cython-devel at python.org > http://mail.python.org/mailman/listinfo/cython-devel From robertwb at gmail.com Wed May 30 00:43:07 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Tue, 29 May 2012 15:43:07 -0700 Subject: [Cython] gsoc: array expressions In-Reply-To: References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no> Message-ID: On Tue, May 29, 2012 at 3:32 PM, mark florisson wrote: > On 29 May 2012 23:29, Robert Bradshaw wrote: >> On Tue, May 29, 2012 at 3:22 PM, mark florisson >> wrote: >>>>>>> I started a new project, https://github.com/markflorisson88/minivect , >>>>>>> which currently features a simple C code generator. The specializer >>>>>>> and astbuilder do most of the work of creating the right AST, so the >>>>>>> code generator only has to implement code generation functions for >>>>>>> simple expressions. Depending on how it progresses I will look at >>>>>>> incorporating Theano's optimizations into it and having Theano use it >>>>>>> as a C backend for compatible expressions. >>>>>> >>>>>> I forgot to mention, it's still pretty basic, but it works for simple >>>>>> arithmetic expressions with non-overlapping (shifted) memory from >>>>>> Cython: https://github.com/markflorisson88/cython/commit/2c316abdbc1228597bbdf480f737a59213ee9532#L4R1 >>>>> >>>>> So basically, this project is to be used as a git submodule in Cython, >>>>> and to be shipped directly in the source distribution. Is there any >>>>> objection to that? >>>> >>>> I'm not sure this is the best long-term solution (the alternative >>>> would be making it part of Cython or adding a dependency) but I think >>>> that's fine for now. I'm assuming there that the end user doesn't >>>> explicitly reference it, right? It's just an optimization if present. >>>> >>>> - Robert >>> >>> The only gotcha is that when you checkout Cython from github you will >>> need to update the submodule. >> >> Manually, right? How badly do things go wrong if you forget to? > > Unfortunately, yes, I don't think there's an automatic way with git. > The compiler could display an error message, like "did you forget to > do ...". OK. I think we can live with this at this stage of development. 
- Robert

From nouiz at nouiz.org Wed May 30 17:27:27 2012
From: nouiz at nouiz.org (Frédéric Bastien)
Date: Wed, 30 May 2012 11:27:27 -0400
Subject: [Cython] gsoc: array expressions
In-Reply-To:
References: <4FBA1A40.2060202@astro.uio.no> <4FBA2392.2010105@astro.uio.no>
Message-ID:

On Mon, May 28, 2012 at 8:49 AM, mark florisson wrote:
> On 25 May 2012 21:53, Frédéric Bastien wrote:
>> - About pickling Theano, we currently can't pickle Theano functions. It
>> could be made to work in some cases, but not for all cases as there is
>> hardware dependent optimization in the Theano function. Currently it
>> is mostly CPU vs GPU operation. So if we stay on the CPU, we could do
>> some pickling, but we should make sure that the compiled C code in the
>> Python modules is still there when we unpickle, or recompile it.
>>
>> - I think it makes sense to make a Theano graph from the Cython AST,
>> optimize it and rebuild a Cython AST from the optimized graph. This
>> would allow using Theano optimizations.
>
> Ok, the important thing is that the graph can be pickled, it should be
> pretty straightforward to generate code to build the function again
> from the loaded graph.

We can pickle a graph that is not yet compiled, so no problem here.

>> - It also makes sense to do the code generation in Theano and reuse it
>> in Cython. But this would make the Theano dependency much stronger.
>> I'm not sure you want this.
>>
>> - Another point not raised: Theano needs to know at compile time the
>> dtype, number of dimensions and which dimensions are broadcastable for
>> each variable.
>
> Hm, that would lead to kind of an explosion of combinations. I think
> we could specialize only on no broadcasting at all (except for
> operands with lesser dimensionality).

I expect that in normal user scripts they won't use all the
combinations :) So I won't worry about it at first. If there is a need,
we could parametrise Theano ops (especially the Elemwise op) so that
when a dimension is marked as not broadcastable, it also works when it
is broadcast. In the case of Elemwise, it is probably just the
error-checking code that will need to change.

>> - The compyte (gpu nd array) project does collapsing of dimensions.
>> This is an important optimization on the GPU as doing the index
>> computation in parallel is costlier. I think on the CPU we could
>> probably do collapsing just of the inner dimensions to make it faster.
>>
>> - Theano doesn't generate intrinsics or assembly, but we assume that
>> g++ will generate vectorized operations for simple loops. Recent
>> versions of gcc/g++ do this.
>
> Right, the aim is definitely to specialize for contiguous arrays,
> where you collapse everything. Specializing statically for anything
> more would be unfeasible, and better handled by a runtime compiler I
> think. For the C backend, I'll start by generating simple C loops and
> see if the compilers vectorize that already.

I was under the impression you were doing run-time code generation; I
mixed up the ongoing projects. But collapsing the inner dimensions could
still be useful: if you don't write all the loops explicitly, you will
call a function or make a loop over the number of dimensions, and
collapsing reduces that looping. If the inner dimension is small (e.g. a
matrix of shape (10000, 3)) this can be useful. But that is less
important than the default contiguous case.
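To make the collapsing idea concrete, here is a rough C sketch (purely
illustrative; it assumes, for brevity, that all operands share the same
layout and that strides are counted in elements, and the code Theano or
minivect actually generates will look different):

/* Fully contiguous case: collapse (n0, n1) into one flat loop,
   which is the easiest shape for the compiler to vectorize. */
void add2d_contig(double *c, const double *a, const double *b,
                  long n0, long n1)
{
    long i, n = n0 * n1;          /* collapse (n0, n1) -> (n,) */
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Only the inner dimensions are contiguous: keep a strided outer
   loop and collapse the contiguous inner dimensions. */
void add3d_inner_contig(double *c, const double *a, const double *b,
                        long n0, long stride0, long n1, long n2)
{
    long i, j, inner = n1 * n2;   /* collapse the inner dimensions */
    for (i = 0; i < n0; i++) {
        double *ci = c + i * stride0;
        const double *ai = a + i * stride0;
        const double *bi = b + i * stride0;
        for (j = 0; j < inner; j++)
            ci[j] = ai[j] + bi[j];
    }
}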
>> - Our generated code for element-wise operations takes a little care
>> about the memory access pattern. We swap dimensions to iterate on the
>> dimensions with the smallest strides. But we don't go further.
>>
>> - What do you mean by CSE? Constant optimization?
>
> Yes, common subexpression elimination and also hoisting of unchanging
> expressions outside the loop.

Theano does CSE in the merge optimization. As for lifting expressions
outside of loops, we do it for Theano Scan (our loop), but those are not
normal loops. It is much better to use tensor expressions than scan if
possible.

> I started a new project, https://github.com/markflorisson88/minivect ,
> which currently features a simple C code generator. The specializer
> and astbuilder do most of the work of creating the right AST, so the
> code generator only has to implement code generation functions for
> simple expressions. Depending on how it progresses I will look at
> incorporating Theano's optimizations into it and having Theano use it
> as a C backend for compatible expressions.

Great, when you think it is a good time for me to look at it, tell me.
Does it mimic Cython internals? If so, is there documentation about it
that I could look at?

Fred

From robertwb at gmail.com Wed May 30 20:49:40 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Wed, 30 May 2012 11:49:40 -0700
Subject: [Cython] [cython-users] Re: How to use Cython to wrap a C++ template class for use in Python?
In-Reply-To:
References: <29917398.731.1321192554566.JavaMail.geo-discussion-forums@prlm15> <255edfdb-78a0-4b79-8421-a8e03fdc97d4@googlegroups.com>
Message-ID:

On Tue, May 29, 2012 at 5:42 PM, Paul Leopardi wrote:
>
> On Wednesday, 30 May 2012 02:40:47 UTC+10, Chris Barker wrote:
>>
>> Well, the third option is to use your own home-made templates to
>> auto-generate cython code for each type. That's actually not as
>> painful as it sounds -- there are lots of templating systems for
>> python -- Cheetah, for instance, is designed for "any" use, not just
>> html (though I've never used it for Cython).
>>
>> http://www.cheetahtemplate.org/
>>
>> You might want to look at the "bottleneck" project -- they did
>> something like this -- not for calling C++ templates, but the
>> principle is the same.
>>
>> http://berkeleyanalytics.com/bottleneck/
>>
>
> Chris, thanks for your suggestions. I will take a look at Cheetah, but
> I'm afraid it would add extra complication to an already very complicated
> build process for PyClical. What I would like is a way for Cython to make
> instantiations of my C++ template classes visible to Python. Cython
> supports C++ templates, and Cython now has fused types, but as far as I
> can tell, these two template concepts within Cython do not work with each
> other in any meaningful, documented way, let alone support what I am
> proposing. Maybe I am just too late to propose another Google Summer of
> Code project, or perhaps the whole idea is all too hard? Surely there must
> be a wider use-case for this idea than just me and my one library? Should
> I repost to the core developers list?

I think there are several issues with why Cython does not (yet?) have
these capabilities, primarily:

(1) Metaprogramming done Right can be very nice, but done wrong is disastrous,
(2) The AST of Cython is really not that nice to work with, and
(3) If it's just about specifying types, template preprocessing and,
eventually, using JITs may be sufficient, more flexible, and certainly
better than a half-baked solution.
The thread you linked to also has some good discussion.

We actually had a lot of discussion about this issue at the Cython
Days workshop last year, and never hit upon a metaprogramming
framework that seemed Right. Consensus was that until a clean proposal
was put forward, we would focus on making it easy to use your
templating engine of choice and implement fused types, which would
cover 85% or more of the need for metaprogramming (especially tight
loops over numeric types, as opposed to more generic cases where
Python objects and vtables are often good enough).

The fact that fused types don't work with C++ specializations is
certainly a bug, though this still limits us to a fixed number of
instantiations (and separate Python classes) in Python space due to
the nature of C++ templates.

- Robert

From markflorisson88 at gmail.com Wed May 30 21:07:27 2012
From: markflorisson88 at gmail.com (mark florisson)
Date: Wed, 30 May 2012 20:07:27 +0100
Subject: [Cython] [cython-users] Re: How to use Cython to wrap a C++ template class for use in Python?
In-Reply-To:
References: <29917398.731.1321192554566.JavaMail.geo-discussion-forums@prlm15> <255edfdb-78a0-4b79-8421-a8e03fdc97d4@googlegroups.com>
Message-ID:

On 30 May 2012 19:49, Robert Bradshaw wrote:
> On Tue, May 29, 2012 at 5:42 PM, Paul Leopardi wrote:
>>
>> On Wednesday, 30 May 2012 02:40:47 UTC+10, Chris Barker wrote:
>>>
>>> Well, the third option is to use your own home-made templates to
>>> auto-generate cython code for each type. That's actually not as
>>> painful as it sounds -- there are lots of templating systems for
>>> python -- Cheetah, for instance, is designed for "any" use, not just
>>> html (though I've never used it for Cython).
>>>
>>> http://www.cheetahtemplate.org/
>>>
>>> You might want to look at the "bottleneck" project -- they did
>>> something like this -- not for calling C++ templates, but the
>>> principle is the same.
>>>
>>> http://berkeleyanalytics.com/bottleneck/
>>>
>>
>> Chris, thanks for your suggestions. I will take a look at Cheetah, but
>> I'm afraid it would add extra complication to an already very complicated
>> build process for PyClical. What I would like is a way for Cython to make
>> instantiations of my C++ template classes visible to Python. Cython
>> supports C++ templates, and Cython now has fused types, but as far as I
>> can tell, these two template concepts within Cython do not work with each
>> other in any meaningful, documented way, let alone support what I am
>> proposing. Maybe I am just too late to propose another Google Summer of
>> Code project, or perhaps the whole idea is all too hard? Surely there
>> must be a wider use-case for this idea than just me and my one library?
>> Should I repost to the core developers list?
>
> I think there are several issues with why Cython does not (yet?) have
> these capabilities, primarily:
>
> (1) Metaprogramming done Right can be very nice, but done wrong is disastrous,
> (2) The AST of Cython is really not that nice to work with, and
> (3) If it's just about specifying types, template preprocessing and,
> eventually, using JITs may be sufficient, more flexible, and certainly
> better than a half-baked solution.
>
> The thread you linked to also has some good discussion.
>
> We actually had a lot of discussion about this issue at the Cython
> Consensus was that until a clean proposal was put forward, we would focus on making it easy to use your templating engine of choice and on implementing fused types, which would cover 85% or more of the need for metaprogramming (especially tight loops over numeric types, as opposed to more generic cases where Python objects and vtables are often good enough).

Definitely, fused types are far from ideal, even if cdef class fused attributes were supported.

> The fact that fused types don't work with C++ specializations is certainly a bug, though this still limits us to a fixed number of instantiations (and separate Python classes) in Python space due to the nature of C++ templates.
>
> - Robert

If there really is a bug, then please report the use case (anyone who can find one). C++ and fused types *do* work together (unless there really is a bug), but the thing that is not supported (for anything) is fused types for cdef class attributes. One entirely terrible but working way to get around this from Python space is the metaclass hack I posted earlier.

From dave.hirschfeld at gmail.com Thu May 31 12:09:18 2012
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Thu, 31 May 2012 10:09:18 +0000 (UTC)
Subject: [Cython] vcvarsall.bat error on win32 with mingw when using cython_inline
Message-ID:

Hi Cython devs,
I got the "unable to find vcvarsall.bat" error when using cython_inline when trying to compile with mingw. Using Cython normally (creating a setup.py file) worked fine, however.

I was able to fix the problem for me by changing inline.py to parse the config files before building the extension. I've created a pull request with my changes:

https://github.com/cython/cython/pull/129

Apologies if I haven't used the appropriate workflow - I'm fairly new to git and this is my first pull request!

HTH,
Dave

From d.s.seljebotn at astro.uio.no Thu May 31 16:04:12 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Thu, 31 May 2012 16:04:12 +0200
Subject: [Cython] SEP 201 draft: Native callable objects
Message-ID: <4FC77A5C.50009@astro.uio.no>

[Discussion on numfocus at googlegroups.com please]

I've uploaded a draft-state SEP 201 (previously CEP 1000):

https://github.com/numfocus/sep/blob/master/sep201.rst

"""
Many callable objects are simply wrappers around native code. This holds for any Cython function, f2py functions, manually written CPython extensions, Numba, etc.

Obviously, when native code calls other native code, it would be nice to skip the significant cost of boxing and unboxing all the arguments.
"""

The thread about this on the Cython list is almost endless:

http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443

There was a long discussion on the key-comparison vs. interned-string approach. I've written both up in SEP 201 since it was the major point of contention. There were some benchmarks starting here:

http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443

And why provide a table and not a get_function_pointer starting here:

http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443

For those who followed that and don't want to read the entire spec, the aspect of flags is new. How do we avoid duplicating entries/checking against two signatures for cases like a GIL-holding caller wanting to call a nogil function? My take: for key-comparison you can compare under a mask; for interned strings we should have an additional flags field.
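To make that comparison concrete, here is a rough C sketch of the dispatch check under each approach; the struct layouts, flag bits, and names are illustrative assumptions, not definitions taken from the SEP text:

    #include <stddef.h>
    #include <stdint.h>

    /* Approach 1: the signature is encoded into a fixed-size key. Assumed
     * here: the low 8 bits carry flags (e.g. "needs the GIL"), so lookups
     * compare under a mask instead of storing duplicate entries. */
    typedef struct { uint64_t key; void *funcptr; } key_entry_t;

    #define KEY_FLAG_MASK ((uint64_t)0xff)

    static void *lookup_by_key(key_entry_t *tab, int n, uint64_t want)
    {
        int i;
        for (i = 0; i < n; i++)
            if ((tab[i].key & ~KEY_FLAG_MASK) == (want & ~KEY_FLAG_MASK))
                return tab[i].funcptr;
        return NULL;
    }

    /* Approach 2: the signature is an interned string; equal signatures
     * share one pointer, so matching is a pointer compare, and capability
     * bits such as "callable without the GIL" sit in a separate field. */
    typedef struct { const char *sig; uint64_t flags; void *funcptr; } interned_entry_t;

    #define ENTRY_NOGIL ((uint64_t)0x1)

    static void *lookup_by_sig(interned_entry_t *tab, int n,
                               const char *want, uint64_t required)
    {
        int i;
        for (i = 0; i < n; i++)
            if (tab[i].sig == want && (tab[i].flags & required) == required)
                return tab[i].funcptr;
        return NULL;
    }

A caller that holds the GIL would pass required = 0 and so match both GIL-requiring and nogil entries, while a nogil caller would pass ENTRY_NOGIL; that is exactly the deduplication the flags are meant to buy.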
The situation is a bit awkward: The Cython list consensus (well, me and Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP 201. I pushed for that. Still, now that a month has passed, I just think key-comparison is too ugly, and that the interning mechanism shouldn't be *that* hard to code up, probably 500 lines of C code if one just requires the GIL in a first iteration, and that keeping the spec simpler is more important. So I'm tentatively proposing Approach 2. Dag From d.s.seljebotn at astro.uio.no Thu May 31 16:20:18 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 31 May 2012 16:20:18 +0200 Subject: [Cython] [Python-Dev] C-level duck typing In-Reply-To: <90d9723f-7003-48f7-b959-955c22850d2d@email.android.com> References: <4FB35ACA.7090908@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no> <4FC39A42.4040602@astro.uio.no> <90d9723f-7003-48f7-b959-955c22850d2d@email.android.com> Message-ID: <4FC77E22.3030201@astro.uio.no> On 05/28/2012 05:59 PM, Dag Sverre Seljebotn wrote: > > > Dag Sverre Seljebotn wrote: > >> On 05/28/2012 01:24 PM, Nathaniel Smith wrote: >>> On Mon, May 28, 2012 at 12:09 PM, mark florisson >>> wrote: >>>> On 28 May 2012 12:01, Nathaniel Smith wrote: >>>>> On Mon, May 28, 2012 at 11:55 AM, mark florisson >>>>> wrote: >>>>>> On 28 May 2012 11:41, Nathaniel Smith wrote: >>>>>>> On Mon, May 28, 2012 at 10:13 AM, mark florisson >>>>>>> wrote: >>>>>>>> On 28 May 2012 09:54, mark florisson >> wrote: >>>>>>>>> On 27 May 2012 23:12, Nathaniel Smith wrote: >>>>>>>>>> On Sun, May 27, 2012 at 10:24 PM, Dag Sverre Seljebotn >>>>>>>>>> wrote: >>>>>>>>>>> On 05/18/2012 10:30 AM, Dag Sverre Seljebotn wrote: >>>>>>>>>>>> >>>>>>>>>>>> On 05/18/2012 12:57 AM, Nick Coghlan wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> I think the main things we'd be looking for would be: >>>>>>>>>>>>> - a clear explanation of why a new metaclass is considered >> too complex a >>>>>>>>>>>>> solution >>>>>>>>>>>>> - what the implications are for classes that have nothing >> to do with the >>>>>>>>>>>>> SciPy/NumPy ecosystem >>>>>>>>>>>>> - how subclassing would behave (both at the class and >> metaclass level) >>>>>>>>>>>>> >>>>>>>>>>>>> Yes, defining a new metaclass for fast signature exchange >> has its >>>>>>>>>>>>> challenges - but it means that *our* concerns about >> maintaining >>>>>>>>>>>>> consistent behaviour in the default object model and >> avoiding adverse >>>>>>>>>>>>> effects on code that doesn't need the new behaviour are >> addressed >>>>>>>>>>>>> automatically. >>>>>>>>>>>>> >>>>>>>>>>>>> Also, I'd consider a functioning reference implementation >> using a custom >>>>>>>>>>>>> metaclass a requirement before we considered modifying type >> anyway, so I >>>>>>>>>>>>> think that's the best thing to pursue next rather than a >> PEP. It also >>>>>>>>>>>>> has the virtue of letting you choose which Python versions >> to target and >>>>>>>>>>>>> iterating at a faster rate than CPython. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> This seems right on target. I could make a utility code C >> header for >>>>>>>>>>>> such a metaclass, and then the different libraries can all >> include it >>>>>>>>>>>> and handshake on which implementation becomes the real one >> through >>>>>>>>>>>> sys.modules during module initialization. 
That way an >> eventual PEP will >>>>>>>>>>>> only be a natural incremental step to make things more >> polished, whether >>>>>>>>>>>> that happens by making such a metaclass part of the standard >> library or >>>>>>>>>>>> by extending PyTypeObject. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> So I finally got around to implementing this: >>>>>>>>>>> >>>>>>>>>>> https://github.com/dagss/pyextensibletype >>>>>>>>>>> >>>>>>>>>>> Documentation now in a draft in the NumFOCUS SEP repo, which >> I believe is a >>>>>>>>>>> better place to store cross-project standards like this. (The >> NumPy >>>>>>>>>>> docstring standard will be SEP 100). >>>>>>>>>>> >>>>>>>>>>> https://github.com/numfocus/sep/blob/master/sep200.rst >>>>>>>>>>> >>>>>>>>>>> Summary: >>>>>>>>>>> >>>>>>>>>>> - No common runtime dependency >>>>>>>>>>> >>>>>>>>>>> - 1 ns overhead per lookup (that's for the custom slot >> *alone*, no >>>>>>>>>>> fast-callable signature matching or similar) >>>>>>>>>>> >>>>>>>>>>> - Slight annoyance: Types that want to use the metaclass >> must be a >>>>>>>>>>> PyHeapExtensibleType, to make the binary layout work with how >> CPython makes >>>>>>>>>>> subclasses from Python scripts >>>>>>>>>>> >>>>>>>>>>> My conclusion: I think the metaclass approach should work >> really well. >>>>>>>>>> >>>>>>>>>> Few quick comments on skimming the code: >>>>>>>>>> >>>>>>>>>> The complicated nested #ifdef for __builtin_expect could be >> simplified to >>>>>>>>>> #if defined(__GNUC__)&& (__GNUC__> 2 || __GNUC_MINOR__> >> 95) >>>>>>>>>> >>>>>>>>>> PyCustomSlots_Check should be called PyCustomSlots_CheckExact, >> surely? >>>>>>>>>> And given that, how can this code work if someone does >> subclass this >>>>>>>>>> metaclass? >>>>>>>>> >>>>>>>>> I think we should provide a wrapper for PyType_Ready, which >> just >>>>>>>>> copies the pointer to the table and the count directly into the >>>>>>>>> subclass. If a user then wishes to add stuff, the user can >> allocate a >>>>>>>>> new memory region dynamically, memcpy the base class' stuff in >> there, >>>>>>>>> and append some entries. >>>>>>>> >>>>>>>> Maybe we should also allow each custom type to set a >> deallocator, >>>>>>>> since they are then heap types which can go out of scope. The >>>>>>>> metaclass can then call this deallocator to deallocate the >> table. >>>>>>> >>>>>>> Custom types are plain old Python objects, they can use >> tp_dealloc. >>>>>>> >>>>>> If I set etp_custom_slots to something allocated on the heap, then >> the >>>>>> (shared) metaclass would have to deallocate it. The tp_dealloc of >> the >>>>>> type itself would be called for its instances (which can be used >> to >>>>>> deallocate dynamically allocated memory in the objects if you use >> a >>>>>> custom slot "pointer offset"). >>>>> >>>>> Oh, I see. Right, the natural way to handle this would be have each >>>>> user define their own metaclass with the behavior they want. >> Another >>>>> argument for supporting multiple metaclasses simultaneously I >> guess... >>>>> >>>>> - N >>>>> _______________________________________________ >>>>> cython-devel mailing list >>>>> cython-devel at python.org >>>>> http://mail.python.org/mailman/listinfo/cython-devel >>>> >>>> That bludgeons your constant time type check. >>> >>> Not if you steal a flag, like the interpreter already does with >>> Py_TPFLAGS_INT_SUBCLASS, Py_TPFLAGS_STRING_SUBCLASS, etc. 
I was >>> referring to that argument I made earlier :-) >>> >>>> It's easier to just >>>> reserve an extra slot for a deallocator pointer :) It would probably >>>> be set to NULL in the common case anyway, since you allocate your >>>> slots statically. >> >> Subclassing: Note that even if all types has to have a PyHeapTypeObject >> >> structure, they are still statically allocated! So for statically >> created subclasses (which should be the majority of the cases), there's >> >> not going to be any deallocator... >> >> I agree that there should be a PyExtensibleType_Ready. To keep >> allocating statically I propose that the subclass should leave some >> room >> open for slots from the superclass: >> >> PyCustomSlot subclass_custom_slots[10] = { >> {SLOT_C, foo}, {SLOT_D, BAR}, {0,0}, ... >> } >> >> Then, fill in etp_count=2, etp_custom_slots=subclass_custom_slots, and >> then call PyExtensibleType_Ready(&Subclass_Type, 10); i.e., the number >> of total elements in etp_custom_slots is passed in. >> >> One should always leave more room than one thinks one needs if the >> superclass is from another library... >> >> Then, inheritance happens according to the following rules: >> >> - Slots are inherited from superclass >> - Slots in subclass with same ID overwrites superclass >> - Slots from superclass are put before slots from subclass >> - Exception raised if the number of final slots is larger than the >> limit passed in to PyExtensibleType_Ready. >> >> (Whenever this is not sufficient, you can always manually munge the >> table after PyExtensibleType_Ready.) >> >> Question: How to deal with possible flag bits in the ID? >> >> Three approaches: >> >> a) Forget about the flags-in-ID idea; if you want flags, stick them in >> the data >> >> b) Embed a seperate variable for flags in every PyCustomSlot >> >> c) Standardize on a *hard* requirement on the bottom 8 bits being >> flags while the top 24 bits indicate incompatible slots; so for the >> purposes of inheritance, 0x12345601 would overwrite 0x12345600. >> >> To me, b) is OK, but the 32 bit ID space is already so ridiculously >> huge >> that c) is a "why not"? -1 on a), it'd be rather tedious if the payload >> > > > I guess the sane thing to do is make the custom slot (id, flags, data); and have id and flags be 32 bits on all platforms. Otherwise 32 bits are wasted to padding on 64 bit platforms anyway. > SEP updated (to what I hope is the final form): https://groups.google.com/forum/?fromgroups#!topic/numfocus/-XWwLMVgXBQ https://github.com/numfocus/sep/blob/master/sep200.rst https://github.com/dagss/pyextensibletype Changes: - Remove the flags concept; option a) above - Use the tp_flags bit. (I benchmarked walking the type hierarchy, and it doesn't cost if you don't take the branch, but I'm much happier for clients to avoid having to rendezvous on the metaclass, in particular if this is used in the NumPy API). - All manually allocated IDs have the least significant bit set, so that one can also use 2-byte aligned pointers as IDs (e.g., objects representing interfaces or interned strings can be used as slot IDs). 
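For readers who don't want to open the repo, the lookup this final form implies is roughly the following C sketch. The etp_count/etp_custom_slots names follow the thread above, but the stolen tp_flags bit value and the exact layout here are my assumptions, not necessarily what pyextensibletype ships:

    #include <Python.h>
    #include <stdint.h>

    /* Assumed value for the stolen tp_flags bit discussed above. */
    #define MY_TPFLAGS_EXTENSIBLE (1UL << 22)

    typedef struct {
        uintptr_t id;   /* low bit set => manually allocated ID; low bit
                           clear => a 2-byte aligned pointer used as ID */
        void *data;
    } PyCustomSlot;

    typedef struct {
        PyHeapTypeObject base;          /* binary layout matches heap types */
        Py_ssize_t etp_count;
        PyCustomSlot *etp_custom_slots;
    } PyHeapExtensibleTypeObject;

    static void *find_custom_slot(PyTypeObject *tp, uintptr_t slot_id)
    {
        Py_ssize_t i;
        PyHeapExtensibleTypeObject *etp;
        /* The cheap branch: bailing out when the bit is unset is what
           makes the lookup essentially free for ordinary types. */
        if (!(tp->tp_flags & MY_TPFLAGS_EXTENSIBLE))
            return NULL;
        etp = (PyHeapExtensibleTypeObject *)tp;
        for (i = 0; i < etp->etp_count; i++)
            if (etp->etp_custom_slots[i].id == slot_id)
                return etp->etp_custom_slots[i].data;
        return NULL;
    }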
Dag

From robertwb at gmail.com Thu May 31 20:17:34 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 31 May 2012 11:17:34 -0700
Subject: [Cython] vcvarsall.bat error on win32 with mingw when using cython_inline
In-Reply-To: References: Message-ID:

On Thu, May 31, 2012 at 3:09 AM, Dave Hirschfeld wrote:
> Hi Cython devs,
> I got the "unable to find vcvarsall.bat" error when using cython_inline when trying to compile with mingw. Using Cython normally (creating a setup.py file) worked fine, however.
>
> I was able to fix the problem for me by changing inline.py to parse the config files before building the extension. I've created a pull request with my changes:
>
> https://github.com/cython/cython/pull/129
>
> Apologies if I haven't used the appropriate workflow - I'm fairly new to git and this is my first pull request!

Thanks! Yes, this is the way to do it.

- Robert

From robertwb at gmail.com Thu May 31 20:22:18 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 31 May 2012 11:22:18 -0700
Subject: [Cython] [Python-Dev] C-level duck typing
In-Reply-To: <4FC77E22.3030201@astro.uio.no> References: <4FB35ACA.7090908@astro.uio.no> <4FB44065.4010306@canterbury.ac.nz> <4FB469B1.3020804@canterbury.ac.nz> <4FB55E24.3090006@astro.uio.no> <4FB60896.4030702@astro.uio.no> <4FC29B9C.1000308@astro.uio.no> <4FC39A42.4040602@astro.uio.no> <90d9723f-7003-48f7-b959-955c22850d2d@email.android.com> <4FC77E22.3030201@astro.uio.no> Message-ID:

On Thu, May 31, 2012 at 7:20 AM, Dag Sverre Seljebotn wrote:
>
> SEP updated (to what I hope is the final form):
>
> https://groups.google.com/forum/?fromgroups#!topic/numfocus/-XWwLMVgXBQ
> https://github.com/numfocus/sep/blob/master/sep200.rst
> https://github.com/dagss/pyextensibletype

Very nice!

> Changes:
>
> - Remove the flags concept; option a) above
>
> - Use the tp_flags bit. (I benchmarked walking the type hierarchy, and it doesn't cost if you don't take the branch, but I'm much happier for clients to avoid having to rendezvous on the metaclass, in particular if this is used in the NumPy API).
>
> - All manually allocated IDs have the least significant bit set, so that one can also use 2-byte aligned pointers as IDs (e.g., objects representing interfaces or interned strings can be used as slot IDs).

From robertwb at gmail.com Thu May 31 20:50:05 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 31 May 2012 11:50:05 -0700
Subject: [Cython] SEP 201 draft: Native callable objects
In-Reply-To: <4FC77A5C.50009@astro.uio.no> References: <4FC77A5C.50009@astro.uio.no> Message-ID:

On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn wrote:
> [Discussion on numfocus at googlegroups.com please]
>
> I've uploaded a draft-state SEP 201 (previously CEP 1000):
>
> https://github.com/numfocus/sep/blob/master/sep201.rst
>
> """
> Many callable objects are simply wrappers around native code. This holds for any Cython function, f2py functions, manually written CPython extensions, Numba, etc.
>
> Obviously, when native code calls other native code, it would be nice to skip the significant cost of boxing and unboxing all the arguments.
> """
>
> The thread about this on the Cython list is almost endless:
>
> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>
> There was a long discussion on the key-comparison vs. interned-string approach. I've written both up in SEP 201 since it was the major point of contention.
> There were some benchmarks starting here:
>
> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>
> And why provide a table and not a get_function_pointer starting here:
>
> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>
> For those who followed that and don't want to read the entire spec, the aspect of flags is new. How do we avoid duplicating entries/checking against two signatures for cases like a GIL-holding caller wanting to call a nogil function? My take: for key-comparison you can compare under a mask; for interned strings we should have an additional flags field.
>
> The situation is a bit awkward: The Cython list consensus (well, me and Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP 201. I pushed for that.
>
> Still, now that a month has passed, I just think key-comparison is too ugly, and that the interning mechanism shouldn't be *that* hard to code up, probably 500 lines of C code if one just requires the GIL in a first iteration, and that keeping the spec simpler is more important.
>
> So I'm tentatively proposing Approach 2.

I'm still not convinced that a hybrid approach, where signatures below some cutoff are compiled down to keys, isn't worthwhile. This gets around variable-length keys (both the complexity and possible runtime costs for long keys) and allows simple libraries to produce and consume fast callables without participating in the interning mechanism.

It's unclear how to rendezvous on a common interning interface without the GIL/Python, so perhaps requiring the GIL to use it is not too onerous. An alternative is to acquire the GIL in the first/reference implementation (which could allow the interning function pointers to be cached by an external GIL-oblivious JIT, for example). Presumably some other locking mechanism would be required if the GIL is not used, so the overhead would likely not be that great.

- Robert

From robertwb at gmail.com Thu May 31 20:57:38 2012
From: robertwb at gmail.com (Robert Bradshaw)
Date: Thu, 31 May 2012 11:57:38 -0700
Subject: [Cython] SEP 201 draft: Native callable objects
In-Reply-To: References: <4FC77A5C.50009@astro.uio.no> Message-ID:

On this note, a global string interning mechanism is likely to be of interest beyond just native callable objects, so it could be worth separating out into a separate spec.

On Thu, May 31, 2012 at 11:50 AM, Robert Bradshaw wrote:
> On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn wrote:
>> [Discussion on numfocus at googlegroups.com please]
>>
>> I've uploaded a draft-state SEP 201 (previously CEP 1000):
>>
>> https://github.com/numfocus/sep/blob/master/sep201.rst
>>
>> """
>> Many callable objects are simply wrappers around native code. This holds for any Cython function, f2py functions, manually written CPython extensions, Numba, etc.
>>
>> Obviously, when native code calls other native code, it would be nice to skip the significant cost of boxing and unboxing all the arguments.
>> """
>>
>> The thread about this on the Cython list is almost endless:
>>
>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>
>> There was a long discussion on the key-comparison vs. interned-string approach. I've written both up in SEP 201 since it was the major point of contention.
>> There were some benchmarks starting here:
>>
>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>
>> And why provide a table and not a get_function_pointer starting here:
>>
>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>
>> For those who followed that and don't want to read the entire spec, the aspect of flags is new. How do we avoid duplicating entries/checking against two signatures for cases like a GIL-holding caller wanting to call a nogil function? My take: for key-comparison you can compare under a mask; for interned strings we should have an additional flags field.
>>
>> The situation is a bit awkward: The Cython list consensus (well, me and Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP 201. I pushed for that.
>>
>> Still, now that a month has passed, I just think key-comparison is too ugly, and that the interning mechanism shouldn't be *that* hard to code up, probably 500 lines of C code if one just requires the GIL in a first iteration, and that keeping the spec simpler is more important.
>>
>> So I'm tentatively proposing Approach 2.
>
> I'm still not convinced that a hybrid approach, where signatures below some cutoff are compiled down to keys, isn't worthwhile. This gets around variable-length keys (both the complexity and possible runtime costs for long keys) and allows simple libraries to produce and consume fast callables without participating in the interning mechanism.
>
> It's unclear how to rendezvous on a common interning interface without the GIL/Python, so perhaps requiring the GIL to use it is not too onerous. An alternative is to acquire the GIL in the first/reference implementation (which could allow the interning function pointers to be cached by an external GIL-oblivious JIT, for example). Presumably some other locking mechanism would be required if the GIL is not used, so the overhead would likely not be that great.
>
> - Robert

From d.s.seljebotn at astro.uio.no Thu May 31 21:29:40 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Thu, 31 May 2012 21:29:40 +0200
Subject: [Cython] SEP 201 draft: Native callable objects
In-Reply-To: References: <4FC77A5C.50009@astro.uio.no> Message-ID: <4FC7C6A4.3060404@astro.uio.no>

On 05/31/2012 08:50 PM, Robert Bradshaw wrote:
> On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn wrote:
>> [Discussion on numfocus at googlegroups.com please]
>>
>> I've uploaded a draft-state SEP 201 (previously CEP 1000):
>>
>> https://github.com/numfocus/sep/blob/master/sep201.rst
>>
>> """
>> Many callable objects are simply wrappers around native code. This holds for any Cython function, f2py functions, manually written CPython extensions, Numba, etc.
>>
>> Obviously, when native code calls other native code, it would be nice to skip the significant cost of boxing and unboxing all the arguments.
>> """
>>
>> The thread about this on the Cython list is almost endless:
>>
>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>
>> There was a long discussion on the key-comparison vs. interned-string approach. I've written both up in SEP 201 since it was the major point of contention.
>> There were some benchmarks starting here:
>>
>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>
>> And why provide a table and not a get_function_pointer starting here:
>>
>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>
>> For those who followed that and don't want to read the entire spec, the aspect of flags is new. How do we avoid duplicating entries/checking against two signatures for cases like a GIL-holding caller wanting to call a nogil function? My take: for key-comparison you can compare under a mask; for interned strings we should have an additional flags field.
>>
>> The situation is a bit awkward: The Cython list consensus (well, me and Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP 201. I pushed for that.
>>
>> Still, now that a month has passed, I just think key-comparison is too ugly, and that the interning mechanism shouldn't be *that* hard to code up, probably 500 lines of C code if one just requires the GIL in a first iteration, and that keeping the spec simpler is more important.
>>
>> So I'm tentatively proposing Approach 2.
>
> I'm still not convinced that a hybrid approach, where signatures below some cutoff are compiled down to keys, isn't worthwhile. This gets around variable-length keys (both the complexity and possible runtime costs for long keys) and allows simple libraries to produce and consume fast callables without participating in the interning mechanism.

I still think this gives us the "worst of both worlds", all the disadvantages and none of the advantages.

How many simple libraries are there really? Cython on one end, the magnificently complicated NumPy ufuncs on the other? Thinking big, perhaps PyPy and Julia? Cython, PyPy, and Julia would all have to deal with long signatures anyway, and NumPy ufuncs are already complicated, so even more low-level stuff wouldn't hurt.

> It's unclear how to rendezvous on a common interning interface without the GIL/Python, so perhaps requiring the GIL to use it is not too onerous. An alternative is to acquire the GIL in the first/reference implementation (which could allow the interning function pointers to be cached by an external GIL-oblivious JIT, for example). Presumably some other locking mechanism would be required if the GIL is not used, so the overhead would likely not be that great.

Yes. I guess a goal could be to make sure there's no ABI breakage if/when the GIL requirement is lifted.

Since modules can already have a reference to the interner by the time the first module interfacing with a GIL-less world is imported, this is non-trivial, but "every problem can be solved with another level of indirection", and particularly this one.

Good idea on separating out interning as a separate spec; that's definitely useful for interfaces etc. as well down the line. I can get to work on a string interning spec and implementation as SEP 202 in spare minutes over the next month or so, but I won't bother unless SEP 201 will use interning. My role in that depends on Travis' timeline as well, as my ETA is so unpredictable.
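As a rough sanity check on that estimate, a first-iteration interning routine that requires the GIL can lean entirely on one Python dict. This is an illustration of the idea only (Python 3 C-API naming), not the eventual SEP 202 interface:

    #include <Python.h>

    static PyObject *sig_table = NULL;   /* bytes object -> itself */

    /* Return a canonical pointer for the signature string `sig`, so that
       two equal signatures always yield the same pointer and matching
       becomes a pointer compare. Caller must hold the GIL. */
    static const char *intern_signature(const char *sig)
    {
        PyObject *key, *canonical;
        if (sig_table == NULL && (sig_table = PyDict_New()) == NULL)
            return NULL;
        key = PyBytes_FromString(sig);
        if (key == NULL)
            return NULL;
        canonical = PyDict_GetItem(sig_table, key);   /* borrowed ref */
        if (canonical == NULL) {
            if (PyDict_SetItem(sig_table, key, key) < 0) {
                Py_DECREF(key);
                return NULL;
            }
            canonical = key;   /* now kept alive by the table itself */
        }
        Py_DECREF(key);
        return PyBytes_AS_STRING(canonical);
    }

Because the table is never torn down, the returned pointers stay valid for the life of the process; lifting the GIL requirement later then becomes a locking problem around the same rendezvous table rather than an ABI change.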
Dag

From d.s.seljebotn at astro.uio.no Thu May 31 22:06:33 2012
From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn)
Date: Thu, 31 May 2012 22:06:33 +0200
Subject: [Cython] Fwd: Re: SEP 201 draft: Native callable objects
In-Reply-To: <4FC7CF2A.2080809@astro.uio.no> References: <4FC7CF2A.2080809@astro.uio.no> Message-ID: <4FC7CF49.8040905@astro.uio.no>

Forgot to CC this list...

-------- Original Message --------
Subject: Re: [Cython] SEP 201 draft: Native callable objects
Date: Thu, 31 May 2012 22:06:02 +0200
From: Dag Sverre Seljebotn
Reply-To: numfocus at googlegroups.com
To: numfocus at googlegroups.com

On 05/31/2012 09:29 PM, Dag Sverre Seljebotn wrote:
> On 05/31/2012 08:50 PM, Robert Bradshaw wrote:
>> On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn wrote:
>>> [Discussion on numfocus at googlegroups.com please]
>>>
>>> I've uploaded a draft-state SEP 201 (previously CEP 1000):
>>>
>>> https://github.com/numfocus/sep/blob/master/sep201.rst
>>>
>>> """
>>> Many callable objects are simply wrappers around native code. This holds for any Cython function, f2py functions, manually written CPython extensions, Numba, etc.
>>>
>>> Obviously, when native code calls other native code, it would be nice to skip the significant cost of boxing and unboxing all the arguments.
>>> """
>>>
>>> The thread about this on the Cython list is almost endless:
>>>
>>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>>
>>> There was a long discussion on the key-comparison vs. interned-string approach. I've written both up in SEP 201 since it was the major point of contention. There were some benchmarks starting here:
>>>
>>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>>
>>> And why provide a table and not a get_function_pointer starting here:
>>>
>>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443
>>>
>>> For those who followed that and don't want to read the entire spec, the aspect of flags is new. How do we avoid duplicating entries/checking against two signatures for cases like a GIL-holding caller wanting to call a nogil function? My take: for key-comparison you can compare under a mask; for interned strings we should have an additional flags field.
>>>
>>> The situation is a bit awkward: The Cython list consensus (well, me and Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP 201. I pushed for that.
>>>
>>> Still, now that a month has passed, I just think key-comparison is too ugly, and that the interning mechanism shouldn't be *that* hard to code up, probably 500 lines of C code if one just requires the GIL in a first iteration, and that keeping the spec simpler is more important.
>>>
>>> So I'm tentatively proposing Approach 2.
>>
>> I'm still not convinced that a hybrid approach, where signatures below some cutoff are compiled down to keys, isn't worthwhile. This gets around variable-length keys (both the complexity and possible runtime costs for long keys) and allows simple libraries to produce and consume fast callables without participating in the interning mechanism.
>
> I still think this gives us the "worst of both worlds", all the disadvantages and none of the advantages.
So this could just work: typedef struct { union { char *interned_sigptr; char short_sig[8]; } uintptr_t flags; void *funcptr; }; And then flags contains whether the signature is short. I think compiling down a signature that doesn't end with 0x000 is rather complicated even if there's no Huffman. The point is to be able to hand off a char* to a signature parsing routine easily, with no decompilation. Using a flag avoids that but requires a couple more instructions. Pro: Get somewhere without actually implementing interning (that's like a 3-hour job if you require the GIL, a little more to make sure it's forward-compatible) Cons: Is more complicated. Couple of extra assembly instructions but they probably don't matter. Dag > > How many simple libraries are there really? Cython on one end, the > magnificently complicated NumPy ufuncs on the other? Thinking big, > perhaps PyPy and Julia? Cython, PyPy, Julia would all have to deal with > long signatures anyway. And NumPy ufuncs are already complicated so even > more low-level stuff wouldn't hurt. > >> It's unclear how to rendezvous on a common interning interface without >> the GIL/Python, so perhaps requiring the GIL to use it not to onerous. >> An alternative is to acquire the GIL in the first/reference >> implementation (which could allow the interning function pointers to >> be cached by an external GIL-oblivions JIT for example). Presumably >> some other locking mechanism would be required if the GIL is not used, >> so the overhead would likely not be that great. > > Yes. I guess a goal could be to make sure there's no ABI breakage > if/when the GIL requirement is lifted. > > Since modules can already have a reference to the interner by the time > the first module interfacing with a GIL-less world is imported, this is > non-trivial, but "every problem can be solved with another level of > indirection", and particularly this one. > > Good idea on separating out interning as a separate spec; that's > definitely useful for interfaces etc. as well down the line. I can get > to work on a string interning spec and implementation as SEP 202 in > spare minutes over the next month or so, but I won't bother unless SEP > 201 will uses interning. My role in that depends on Travis' timeline as > well, as my ETA is so unpredictable. > > Dag From robertwb at gmail.com Thu May 31 22:13:05 2012 From: robertwb at gmail.com (Robert Bradshaw) Date: Thu, 31 May 2012 13:13:05 -0700 Subject: [Cython] SEP 201 draft: Native callable objects In-Reply-To: <4FC7C6A4.3060404@astro.uio.no> References: <4FC77A5C.50009@astro.uio.no> <4FC7C6A4.3060404@astro.uio.no> Message-ID: On Thu, May 31, 2012 at 12:29 PM, Dag Sverre Seljebotn wrote: > On 05/31/2012 08:50 PM, Robert Bradshaw wrote: >> >> On Thu, May 31, 2012 at 7:04 AM, Dag Sverre Seljebotn >> ?wrote: >>> >>> [Discussion on numfocus at googlegroups.com please] >>> >>> I've uploaded a draft-state SEP 201 (previously CEP 1000): >>> >>> https://github.com/numfocus/sep/blob/master/sep201.rst >>> >>> """ >>> Many callable objects are simply wrappers around native code. This holds >>> for >>> any Cython function, f2py functions, manually written CPython extensions, >>> Numba, etc. >>> >>> Obviously, when native code calls other native code, it would be nice to >>> skip the significant cost of boxing and unboxing all the arguments. 
>>> """ >>> >>> >>> The thread about this on the Cython list is almost endless: >>> >>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443 >>> >>> There was a long discussion on the key-comparison vs. interned-string >>> approach. I've written both up in SEP 201 since it was the major point of >>> contention. There was some benchmarks starting here: >>> >>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443 >>> >>> And why provide a table and not a get_function_pointer starting here: >>> >>> http://thread.gmane.org/gmane.comp.python.cython.devel/13416/focus=13443 >>> >>> For those who followed that and don't want to read the entire spec, the >>> aspect of flags is new. How do we avoid to duplicate entries/check >>> against >>> two signatures for cases like a GIL-holding caller wanting to call a >>> nogil >>> function? My take: For key-comparison you can compare under a mask, for >>> interned-string we should have additional flags field. >>> >>> The situation is a bit awkward: The Cython list consensus (well, me and >>> Robert Bradshaw) decided on what is "Approach 1" (key-comparison) in SEP >>> 201. I pushed for that. >>> >>> Still, now that a month has passed, I just think key-comparison is too >>> ugly, >>> and that the interning mechanism shouldn't be *that* hard to code up, >>> probably 500 lines of C code if one just requires the GIL in a first >>> iteration, and that keeping the spec simpler is more important. >>> >>> So I'm tentatively proposing Approach 2. >> >> >> I'm still not convinced that a hybrid approach, where signatures below >> some cutoff are compiled down to keys, is not a worthwhile approach. >> This gets around variable-length keys (both the complexity and >> possible runtime costs for long keys) and allows simple libraries to >> produce and consume fast callables without participating in the >> interning mechanism. > > I still think this gives us the "worst of both worlds", all the > disadvantages and none of the advantages. It avoids the one of the primary disadvantage of keys, namely the variable length complexity. > How many simple libraries are there really? Cython on one end, the > magnificently complicated NumPy ufuncs on the other? Thinking big, perhaps > PyPy and Julia? Cython, PyPy, Julia would all have to deal with long > signatures anyway. And NumPy ufuncs are already complicated so even more > low-level stuff wouldn't hurt. I was thinking of, for example, a differential equation solver written in C, C++, or Fortran that could take a PyNativeCallableTable* directly, primarily avoiding welding this spec to Python. >> It's unclear how to rendezvous on a common interning interface without >> the GIL/Python, so perhaps requiring the GIL to use it not to onerous. >> An alternative is to acquire the GIL in the first/reference >> implementation (which could allow the interning function pointers to >> be cached by an external GIL-oblivions JIT for example). Presumably >> some other locking mechanism would be required if the GIL is not used, >> so the overhead would likely not be that great. > > > Yes. I guess a goal could be to make sure there's no ABI breakage if/when > the GIL requirement is lifted. > > Since modules can already have a reference to the interner by the time the > first module interfacing with a GIL-less world is imported, this is > non-trivial, but "every problem can be solved with another level of > indirection", and particularly this one. 
> Good idea on separating out interning as a separate spec; that's definitely useful for interfaces etc. as well down the line. I can get to work on a string interning spec and implementation as SEP 202 in spare minutes over the next month or so, but I won't bother unless SEP 201 will use interning. My role in that depends on Travis' timeline as well, as my ETA is so unpredictable.
>
> Dag
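To make the consumer side of all this concrete, here is a sketch of the matching loop a caller could run against the entry layout from Dag's "Wait" message above; the struct name, flag names, and the 8-byte cutoff are illustrative assumptions, not part of the SEP:

    #include <stdint.h>
    #include <string.h>

    typedef struct {
        union {
            char *interned_sigptr;   /* valid when ENTRY_SHORT_SIG is unset */
            char short_sig[8];       /* valid when ENTRY_SHORT_SIG is set */
        } sig;
        uintptr_t flags;
        void *funcptr;
    } entry_t;                        /* name for illustration only */

    #define ENTRY_SHORT_SIG 0x1

    static void *match_signature(entry_t *tab, int n,
                                 const char *wanted_interned)
    {
        int i;
        for (i = 0; i < n; i++) {
            if (tab[i].flags & ENTRY_SHORT_SIG) {
                /* short signatures are stored inline; compare the bytes */
                if (strncmp(tab[i].sig.short_sig, wanted_interned, 8) == 0)
                    return tab[i].funcptr;
            } else if (tab[i].sig.interned_sigptr == wanted_interned) {
                /* interned strings: one pointer compare and we are done */
                return tab[i].funcptr;
            }
        }
        return NULL;
    }

Long signatures cost one pointer compare thanks to interning, while signatures that fit in 8 bytes skip interning entirely, which is the "simple producer" case Robert was after.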