[Cython] New function (pointer) syntax.

C Blake cblake at pdos.csail.mit.edu
Thu Nov 6 22:35:36 CET 2014


I think you should just use the C declarator syntax.  Cython already
allows you to say "cdef int *foo[10]".  Declarators aren't bad - just
poorly taught, though I can see some saying those are the same thing.
More below.  I absolutely like the declarator one the most, and the
lambda one second most.  Declarator style makes it far easier to take
C code/.h files using function pointers over to Cython.  So, this
discussion also depends on whether you view Cython as a bridge to
C libs or its own island/bias toward pure Py3.
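
To make the porting point concrete, here is a small C sketch of the kind
of declarations that show up in .h files.  All names (twice, square,
current, table, apply, compose) are made up for illustration.  As far as
I know, the simple pointer case carries over to Cython just by prefixing
"cdef", as in "cdef double (*current)(double)".

```c
#include <stddef.h>

/* Hypothetical helpers, only here so the pointers have targets. */
double twice(double x)  { return 2.0 * x; }
double square(double x) { return x * x; }

/* "current" is a pointer to a function taking double, returning double. */
double (*current)(double) = twice;

/* "table" is an array of 2 such pointers.  [] binds tighter than *,
   which is why the parentheses are needed around *table[2]. */
double (*table[2])(double) = { twice, square };

/* Call the i-th function in the table. */
double apply(size_t i, double x)
{
    return table[i](x);
}

/* Function pointer parameters use the same declarator form inline. */
double compose(double (*f)(double), double (*g)(double), double x)
{
    return f(g(x));
}
```

The declarations read the same way whether they sit at file scope, in a
parameter list, or in a struct, which is what makes copying them over
from a header mostly mechanical.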

One other proposal that might appease Stefan's "Python lexical cues"
where he was missing "def" would be to take the Python3 function
definition header syntax and strip just variable names.  I.e., keep the
":"s
    def foo(x: type1, y: type2, z: type3) -> type0: pass
goes to
    (: type1, : type2, : type3) -> type0 [ident]

I don't think that looks like anything else that might be valid in C or
Python.  It does just what C does - strip variable names from a function
header, but the ":"s may key your brain into a different syntax mode
since they are arguably more rare in C.  (Besides stripping names, C has
some extra parens for precedence/associativity - which admittedly are
tuned to make use-expressions simpler than type-expressions.)  Anyway,
I don't really like my own proposal above.  Just mentioning it for
completeness in case the ":"s help anyone.


Robert wrote:
>I really hope, in the long run, people won't have to manually do these
>declarations.

I agree they'll always be there somehow and also with Stefan's comments
about the entry bar.  So, most people not needing them most of the time
doesn't remove the question.


>I am curious, when you read "cdef int * p" do you parse this as "cdef
>(int*) p" or "cdef int (*p)" 'cause for me no matter how well I know
>it's the latter, I think the former (i.e. I think "I'm declaring a
>variable of type p that's of int pointer type.")
>[..]
>Essentially everyone thinks "cdef type var" even though that's not
>currently the true grammar.
>[..]
>The reason ctypedefs help, and are so commonly used with function
>pointers, is because the existing syntax is just so horrendously bad.
>If there's a clear way to declare a function taking a single float and
>returning a single float, no typedef needed.

No, no, no.  Look, generations of C/C++ programmers have been done a
monumental disservice by textbooks/code/style guides that suggest
"int*  p" is *any less* confusing than spacing "2+3/x" as "2+3 / x".
Early on in my C exposure someone pointed this out and I've never been
confused since.  It's a syntax-semantics confusion.  In the concrete
syntax, * has always been a right-associative dereference operator.  In
this syntax family, the moment any of the operators [], * or () is
introduced, you have to start formatting it and thinking of it as a real
expression, and that formatting should track the syntax, not the
semantics in your head ("pointer to"/indirection-speak or whatever).
Spacing it as if * were left-associative to the type name is misleading
at best.
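
A minimal C11 sketch of why the "int* p" spacing misleads: with two
declarators after one base type, the * attaches only to the first name.
The macro IS_INT_PTR and the two small functions are hypothetical
helpers added just to make the types observable.

```c
#include <assert.h>

/* Spaced as if "int*" were the type of both names. */
int* p, q;

/* _Generic (C11) selects a value based on the type of an expression. */
#define IS_INT_PTR(x) _Generic((x), int *: 1, default: 0)

/* Checked at compile time: only p is a pointer, q is a plain int. */
static_assert(IS_INT_PTR(p) == 1, "p has type int *");
static_assert(IS_INT_PTR(q) == 0, "q has type int, not int *");

/* The same declaration spaced to follow the grammar:
   one base type, then a list of declarators. */
int *r, s;

/* Runtime-visible versions of the same checks. */
int p_is_pointer(void) { return IS_INT_PTR(p); }
int q_is_pointer(void) { return IS_INT_PTR(q); }
```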

If you can only think of type decls in general as (type, var) pairs
syntactically and semantically then *of course* you find typedefs more
clear.  They make pairing more explicit & shrink the expression tree to
be more trivial.  You should *still* space the typedef itself in a way
suggestive of the actual concrete syntax -- "typedef int *p" (or
"ctypedef") just like you shouldn't write "2+3 / x".  You should still
not essentially think of "ctypedef type var" *either*, but rather
"typedef basetype expr".  In short, "essentially everyone" *should* think,
be taught, and have it reinforced by spacing until it "gels" that
"basetype expr" is the syntax that creates type-var bindings
semantically, and perceive "type var" as just one simple case.  Taking
the significance of space in Python/Cython one step further, "int* p"
could even be a hard syntax error, but I'm not seriously proposing that
change.  I really do not think it is "essentially everyone".  You know
better, as you said yourself, but I think you are in conflict with
yourself syntax-semantics wise.
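
As a rough illustration of "typedef basetype expr", a single typedef can
carry several declarators, each of which becomes its own type name.  The
names below (int_scalar, int_ptr, int_vec3, int_fn, negate, demo) are
invented for this sketch.

```c
#include <assert.h>

/* One base type, four declarators, four new type names:
   int_scalar is int, int_ptr is int *, int_vec3 is int[3],
   int_fn is a pointer to a function taking int and returning int. */
typedef int int_scalar, *int_ptr, int_vec3[3], (*int_fn)(int);

/* The array typedef really is three ints. */
static_assert(sizeof(int_vec3) == 3 * sizeof(int), "int_vec3 is int[3]");

/* Hypothetical target for the function pointer type. */
int negate(int x) { return -x; }

int demo(void)
{
    int_scalar n = 7;
    int_ptr p = &n;
    int_vec3 v = { 1, 2, 3 };
    int_fn cb = negate;
    return cb(*p + v[2]);   /* negate(7 + 3) */
}
```

Nobody would write a typedef like this in real code, but it shows that
"typedef type name" is only the simplest case of the grammar.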

Semantically, pointer indirection loads from an address before using,
and sure that can be confusing to new programmers in its own right.
Trying to unravel that confusion with anti-syntax spacing/thought
cascades the wrong way out of the whole situation and contributes
to your blocked assimilation.  *If* the spacing guides you (or, barring
spacing, parens guide you), you quickly get to never forgetting that
types are inside-out/inverse/what-I-get-if specifications.  Note that
this is just as it would be if + and / involved some tricky concept that
was somehow "fixable" by writing "2+3 / x" all the time.  Arithmetic
isn't a binding, so the analogy is hard to complete, but my point is ad
nauseam at this stage (or even sooner! ;-).  Undermine syntax with
contrary whitespace and of course it will seem bad/be harder.  It might
even lock you in to thought patterns that make it really hard to think
about it how you know you "ought" to.
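
The "what-I-get-if" reading can be sketched in C as follows: the
declarator is written exactly like the expression that, when used,
yields the base type.  The names cells, first_cell, second_cell,
handlers and use are hypothetical.

```c
/* Backing storage for the example. */
static int cells[2] = { 10, 20 };

/* Two functions that each take int and return a pointer to int. */
int *first_cell(int offset)  { cells[0] = 10 + offset; return &cells[0]; }
int *second_cell(int offset) { cells[1] = 20 + offset; return &cells[1]; }

/* Read the declaration as a use:
   "if I write *(*handlers[i])(x), I get an int". */
int *(*handlers[2])(int) = { first_cell, second_cell };

/* The use-expression has the same shape as the declarator above. */
int use(int i, int x)
{
    return *(*handlers[i])(x);
}
```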


Anyway, more to the point, Cython has "cdef" not "py3def" or "javadef" or
whatever, after all.  There are even the cdef: blocks where all the decls
and inits look eminently C-like.  Having an alternative ptr syntax from C
only for functions but not arrays seems wrong to me *unless* it's really
part of a general Py3 annotations syntax move.  Stefan was in some
mypy-in-Py3 thread last summer.  If the situation were Cython just
moving that way to be a Py3-with-mypy compiler with some extras for C
integration, that would be a different story to me.  Then trying to make
function pointers seem like definitions sans-names might make more sense.
It would really ease moves to Cython from pure Py3 users used to the type
annotations "for checking" to apply to code generation.  That may be such
a big move from Cython as it is now that it might almost deserve some
"Cython3" fork that could also drop other redundant things (like the
pre-memoryview NumPy integration, which seems not as good as the more
general memviews).

All that being said, using lambda syntax to signify types as well as
values seems an interesting idea, if it would really work.  But we
know C declarators will work for all things C.  I don't see why we
shouldn't just stick to that.  Right now and with declarators you can
tell people you just need to learn C decl syntax and Python.  That's
a VERY simple thing to say (even if it requires some learning work,
and even if it isn't 100% exact).  There is a lot of documentation
out there to learn C decls.

