[Cython] RFC: an inline_ function that dumps c/c++ code to the code emitter

Jason Newton nevion at gmail.com
Fri Aug 19 14:19:54 EDT 2016


On Fri, Aug 19, 2016 at 5:36 AM, Robert Bradshaw <robertwb at gmail.com> wrote:

> On Thu, Aug 18, 2016 at 12:05 PM, Jason Newton <nevion at gmail.com> wrote:
> > Accidentally posted to an already-opened tab for the cython-users ML
> > yesterday, moving to here. Following up from a github opened issue here:
> >
> > https://github.com/cython/cython/issues/1440
> >
> > I was hoping we could give a way to drop straight into C/C++ inside of
> > Cython pyx files.
> >
> > Why?
> >
> > -It helps avoid needing to declare every class/function in Cython, a
> > somewhat daunting/impossible task that I think everyone hates.  Have you
> > seen libraries like Eigen or others that use complex template-based
> > techniques?  How about those with tons of member functions to boot, or
> > getting C++ inheritance involved?
>
> I agree that this is a pain, and better tooling should be developed to
> (mostly?) eliminate this (e.g. see the recent thread at
> https://groups.google.com/forum/#!topic/cython-users/c8ChI6jERzY ).
>
> Of course having symbols magically appear (e.g. due to some #include, who
> knows which) has its downsides too, which is why import * is often
> discouraged in Python too.
>
> > -It works around having the Cython compiler know about all of C++'s
> > nuances - as an advanced C++ developer these are painful and it is a 2nd
> > class citizen to Cython's simpler C-support - that's no good.  Just right
> > now I was bitten by yet another template argument bug and it's clear C++
> > template arguments have been kind of dicey since support appeared.
>
> Yes, C++ is an extraordinarily complicated language, and exposing all
> of that would add significant amounts of complexity to Cython itself,
> and perhaps more importantly increase the barrier of entry to reading
> Cython code. One of the guiding principles is that we try to stay as
> close to Python as possible (if you know Python, you can probably read
> Cython, and with a minimal amount of C knowledge start writing it) and
> much of C++ simply isn't Pythonic.
>

Maybe it's off topic, but I dispute the guiding principle of Cython - I was
not able to comprehend Cython before reading tutorials, and this is not my
first time looking at it.  I had a couple of run-ins with it over the last 5
years on projects such as h5py and easily got lost on what was going on
between all the wrapper-wrappers (pure py code wrapping py-invokable code)
and re-declarations.  These projects complied with Cython's current
philosophy to the detriment of clarity, context, and the overall idea of how
code was hooked up.  Perhaps Cython should take the lessons learned since its
inception, and the current state of the c-python userbase, to guide us into a
new philosophy.



>
> Though, as you discovered, there are some basic things like non-type
> template arguments that we would like to have. If there are other
> specific C++ constructs that are commonly used but impossible to
> express in Cython it'd be useful to know.
>

I haven't been able to use Cython enough to learn more of them, because it
currently can't solve my problems.  From prior experience with SWIG et al.,
my outlook is that it is a treacherous path to walk, and unless you brought
llvm/clang into the project for parsing/AST, I'd hold onto that outlook.


>
> > -It would allow single source files - I think this is important for
> > runtime compiled quasi-JIT/AOC fragments, like OpenCL/PyOpenCL/PyCUDA
> > provide
>
> Not quite following what you're saying here.
>

Maybe PyInline would have been a better example off the bat for you, but
what's needed is something a little more flexible that also takes less work.
Compare with PyOpenCL:
https://documen.tician.de/pyopencl/ - check out some examples.  There is a
C runtime API between the contexts hooking things up (this is the OpenCL
runtime part) - it's a pretty similar story to PyCUDA (by the same author,
except that project has to jump out to nvcc and cache kernel compilation,
like the inline function implementation does).  There's no limit to the
number of functions you can declare though, and the OpenCL side is kept
simple - things are generally pretty typesafe/do what you would expect on
dispatch.  PyInline, for comparison, looks like it might lean more on the
Python C-API for its work and may be limited in the number of functions per
snippet it can declare.  I don't expect to be able to work with NumPy
ndarray data easily with it.


>
> > The idea is that Cython glue makes the playing field for extracting data
> > easy, but that once it's extracted to a cdef variable for instance,
> > cython doesn't need to know what happens.  Maybe in a way sort of like
> > the GCC asm extension.  Hopefully simpler variable passing though.
>
> Cython uses "mangled" names (e.g. with a __pyx prefix) to avoid any
> possible conflicts. Specifying what/how to mangle could get as ugly as
> GCC's asm variable passing. And embedded variable declarations, let
> alone control flow statements (especially return, break, ...) could
> get really messy. It obscures analysis Cython can do on the code, such
> as whether variables are used or what values they may take. Little
> code snippets are not always local either, e.g. they often need to
> refer to variables (or headers) referenced elsewhere. And they must
> all be mutually compatible.
>

Like gcc's asm, let's let adults do what they want and let them worry about
the consequences of flow control/stray includes. I'm not even sure how most
of this would be an issue (switch/break/if) if you are properly nesting
pyxd output.  The only thing I think is an issue here is mangled names.  I
haven't yet figured out why (cdef) variable names must be mangled.  Can you
explain?  Maybe we add an option to allow it to be unmangled in their
declaration? C++ has extern "C" for example.
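
To make the variable-passing question concrete, here is a minimal sketch of
one way an opaque-string snippet could refer to mangled names, loosely in the
spirit of GCC's asm operands. Everything in it is an assumption for
illustration: the `$name` placeholder syntax, the `expand_snippet` helper,
and the `__pyx_v_` prefix used in the usage note are hypothetical and are not
an existing Cython feature or API.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: replace each $name placeholder in an opaque C/C++
// snippet with the mangled identifier the code generator chose for it.
std::string expand_snippet(const std::string& snippet,
                           const std::map<std::string, std::string>& mangled) {
    std::string out;
    for (std::size_t i = 0; i < snippet.size();) {
        if (snippet[i] != '$') {
            out += snippet[i++];  // ordinary character, copy through
            continue;
        }
        // Scan the identifier that follows the '$'.
        std::size_t j = i + 1;
        while (j < snippet.size() &&
               (std::isalnum(static_cast<unsigned char>(snippet[j])) ||
                snippet[j] == '_')) {
            ++j;
        }
        const std::string name = snippet.substr(i + 1, j - i - 1);
        const auto it = mangled.find(name);
        // Unknown placeholders are left untouched so that the C/C++
        // compiler reports them instead of this helper guessing.
        out += (it != mangled.end()) ? it->second : snippet.substr(i, j - i);
        i = j;
    }
    return out;
}
```

With a mapping of `n` to `__pyx_v_n` and `total` to `__pyx_v_total`, the
snippet `"$total += $n * 2;"` would expand to
`"__pyx_v_total += __pyx_v_n * 2;"`, so the snippet author never has to
spell the mangled names by hand.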


>
> That's aside from the jarring nature of interleaving Python and C++
> code. Would semicolons be required? How would parsers (including IDEs)
> handle the C++ snippets? Or would they be placed in opaque
> strings/comments (which I'd rather avoid)?
>

Opaque strings.  It's a good and time tested solution to the issue.  I'm
very happy with it in the contexts I use it in.


>
> Feels like you want PyInline or weave.inline, but at compile time.
>

You must realize that almost any other Python-driven way to compile C code
in the spirit these projects do is deprecated/dead.  Cython has absorbed
all the reputation and users that didn't go to pure-c/boost.python -
pybind11 is the new kid on the block there so I'm not including it (I'm of
the opinion that SWIG users stayed unchanged).  Community
belief/QA/designers/google all think of Cython first.  Weave has
effectively closed up its doors and I'm not even sure it had the power to
do what I wanted anyway, because Cython provides a language that eases the
data-extraction/typecasting part of inlining C/C++.


>
> > The alternative to not having this construct is a concrete wall.  I'll
> > have to segment up files for C++ for the rest of time until I get function
> > forms that Cython can handle. At that point I just say screw it and use
> > boost python.  Of course cython does the definitions, data extraction, and
> > compilation so much easier than boost.python.  It would be a shame to not
> > consider this plight C++ developers have been cornered into and you can't
> > say the C++ libraries are broken, it is the strategy Cython is taking that
> > cannot work.
>
> Some C++ libraries are not very amenable to being used outside a C++
> context. They don't expose an "FFI-friendly" interface and calling
> them from other languages is more painful. That doesn't mean they're
> broken, just C++ centric.
>
> The strategy that Cython uses is that you use (as close as possible)
> Python structures and syntax to write your bindings. This is great for
> people who know and understand Python, but for those who would rather
> write their code (including bindings) in C++ there's Boost.Python and
> others (including writing C extensions by hand).
>

Speaking as a multi-year user of and contributor to Boost.Python, it's not
all it's cracked up to be - Cython with inline C/C++ is often a better
approach (and better for scientific code acceleration/glue).
Boost.Python's autoconversion magic is very unpredictable/broken - and this
is required.  I'm interested in pybind11 for large projects but not as much
as in Cython for numeric processing.  Writing C extensions by hand is
usually a waste of time, which is why so many people tried to automate it.

Cython's audience, *I believe*, is not people who only know Python and no
C/C++, but C/C++ users trying to wrap stuff with Cython.  Further, I don't
see how this is detrimental - it's segmented.  For small stuff, the
cognitive burden of finding and going to an external file is far higher,
and that is the way I've seen it end up so far.


> If your library is full of C++isms, another option is to create a C
> wrapper, and expose that to Python (which is what people did before
> there was any C++ support). If exposing these as Python functions,
> with automatic conversion to/from Python types is the primary value
> you're getting from Cython, simply declare these as cpdef functions in
> your extern blocks and wrappers will automatically be created. If your
> C wrappers are simple enough, you might even be able to auto-generate
> the .pxd and .pyx files with some of the tools listed at
> https://github.com/cython/cython/wiki/AutoPxd . Yes, you'd have your
> C++ code in a C++ file and "Python" code in a .pyx file rather than
> interleaving them, but the amount of code you write would be the same.
>

So many nopes to this approach.  I effectively listed it myself above, just
not the autopxd part.  No, I don't want to make C bindings for entire C++
libraries.  I don't buy that this saves anything; it only creates a
ridiculous amount of work again, where the cost (far) exceeds the gains.
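
For a rough idea of what the C-wrapper route quoted above involves, here is
a minimal sketch. The `lib::dot` template is a made-up stand-in for a
template-heavy C++ API, and the `lib_dot_f64` / `lib_dot_i32` names are
hypothetical; the point is only that each template instantiation needs its
own flat, hand-written C entry point before anything can be declared on the
Cython side.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical templated library code (stand-in for a real C++ API).
namespace lib {
template <typename T>
T dot(const std::vector<T>& a, const std::vector<T>& b) {
    assert(a.size() == b.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), T(0));
}
}  // namespace lib

// Hand-written C wrappers: one flat function per instantiation, with
// templates and containers flattened to raw pointers and sizes.
extern "C" {

double lib_dot_f64(const double* a, const double* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    const std::vector<double> va(a, a + n), vb(b, b + n);
    return lib::dot(va, vb);
}

int lib_dot_i32(const int* a, const int* b, std::size_t n) {
    assert(n == 0 || (a != nullptr && b != nullptr));
    const std::vector<int> va(a, a + n), vb(b, b + n);
    return lib::dot(va, vb);
}

}  // extern "C"
```

Every additional element type or overload repeats this pattern, and the
wrappers still have to be declared again in a .pxd file, which is the
duplicated work being objected to here.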

-Jason