[Cython] New function (pointer) syntax.

C Blake cblake at pdos.csail.mit.edu
Sun Nov 9 00:19:25 CET 2014


>But I admit it's hard to come up with an objective measure for how
>good a syntax is...if it's natural to you then that's great.

I think those queries you mention will mostly be biased toward beginners,
who are the squeakier wheels, so that's not a very good argument or
metric.  I agree an objective measure of "goodness" or "understanding"
is hard, but I happen to run Gentoo and keep my sources around.  So, I
did a quick grep over .c and .h files in 600 packages on my system..
pretty diverse: no one style guide or style or maintainer..Not even any
very common domains..utilities, libraries, all sorts of stuff.

$ grep '[a-zA-Z0-9_][a-zA-Z0-9_]\*\*[^ ]' `find -type f -name '*.[ch]'` |
    grep -v '/\*\*' | grep -v '\*\*/' | wc -l
3468

$ grep '[a-zA-Z0-9_][a-zA-Z0-9_]  *\*\*[^ ]' `find -type f -name '*.[ch]'` |
    grep -v '/\*\*' | grep -v '\*\*/' | wc -l
68900

In other words, over 95% of the instances spaced the '**' as if they knew
it bound to the token on its right.  ('**' is easier than '*' since the
latter could be multiplies but '**' almost never is).
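
For anyone who wants to repeat the count without a shell, here is a rough Python sketch of the same two patterns.  The names `count_double_star_styles` and `spaced_fraction` and the sample lines are made up for illustration, and it is every bit as approximate as the greps: it counts matching lines, not parsed declarations.

```python
import re

# Same character classes as the two greps: two identifier characters,
# then '**' either directly attached or after one or more spaces,
# followed by a non-space character.
ATTACHED = re.compile(r'[a-zA-Z0-9_][a-zA-Z0-9_]\*\*[^ ]')
SPACED = re.compile(r'[a-zA-Z0-9_][a-zA-Z0-9_] +\*\*[^ ]')


def count_double_star_styles(lines):
    """Return (attached, spaced) counts of matching lines, skipping
    lines that contain '/**' or '**/' (comment delimiters)."""
    attached = spaced = 0
    for line in lines:
        if '/**' in line or '**/' in line:
            continue
        if ATTACHED.search(line):
            attached += 1
        if SPACED.search(line):
            spaced += 1
    return attached, spaced


def spaced_fraction(attached, spaced):
    """Fraction of all matches that put whitespace before '**'."""
    return spaced / (attached + spaced)


# Hypothetical sample lines, not taken from any real package.
sample = [
    "char **argv;",                     # spaced
    "int main(int argc, char **argv)",  # spaced
    "static char**table;",              # attached to the type
    "/** doc comment */",               # skipped as a comment
    "x = (char**)a;",                   # attached, no whitespace at all
]
print(count_double_star_styles(sample))
print(round(spaced_fraction(3468, 68900), 3))
```

Plugging in the two totals above gives 68900 / (3468 + 68900), which is where the "over 95%" comes from.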

Yes, greps are way approximate.  Yes, some real parser would be better,
but this took me only a few minutes.  I visually inspected what they
were catching by |less instead of |wc and both cases seemed to mostly be
catching decl/type uses as intended..less than a few percent error on
that.  If anything, the most glaring question was that the 3468 "type**"
cases were highly polluted with near 50% questionable "no whitespace at all"
instances like (char**)a.  Maintainers who know better might accept
patches and lazily not fix confusing formatting.  So, in a couple ways
that 5% confused is an upper bound in this corpus (under a spacing = no
confusion hypothesis).  And, sure, confused people might format
non-confusingly.  And maybe '**' itself is slightly selecting for
less confused people.

Even so, 95..97.5% to me == "essentially no one" to you by some, let's
say "not totally bonkers" measure suggests that we are just thinking of
highly different populations of people.  Even if you think my methods way
hokey, it's probably at least suggestive "essentially no one" is a far
bigger set than you thought before.  So, I agree/disagree with some other
things you said.  Initializers are an (awfully convenient) aberration,
but your example of teaching is just an example of bad teaching - so what?
A list of vars of one type is easily achieved even thinking as you want
with a typedef, and having both declarators and typedefs gets you
everything you want.  Still, disagreements aside, I give up trying to
convince you of anything beyond that you *just might* have a very
skewed perception of how confused about declarators people coming to
Cython from a C/C++ background, or people who write C/C++ in general, are.
But it seems in this arc anyway you aren't trying to target them or
C code integration coherence or such...Period!  As per...

>I'm hoping we can avoid it 100% :-) for anyone who doesn't have to
>actually interact with C.

So, you're leaning hard on the Cython as a Python compiler direction.
I think Cython in general should probably either be 1) as C-like as
possible or 2) as Python-like as possible.  Given your (I still think
misguided) hatred of C function pointer syntax, i.e. of scenario 1),
there's your probable answer - be as Py-like as possible.  Given that,
for just function types, that seems to mean either:
    A) the "lambda type1, type2: type3" proposal,
    B) what mypy does which is roughly Function[ [type1, type2], type3 ],
or possibly C) what Numba does if that really catches on.
or maybe the (type, type) -> rtype though that seems unpopular here,
but almost surely not that "char*(..)" thing.

In a few years the mypy approach may well be a PEP approved lint/typing
approach and people coming from Python will at least already have maybe
seen it.  In dozens of emails 2..3 months ago Guido was really strongly
promoting mypy, but I think it is in some kind of a-PEP-needs-to-be-
written limbo.  Here is a link to the relevant sub-part for those who
haven't looked at it:

    http://www.mypy-lang.org/tutorial.html#callables
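
To give a feel for the bracketed style in B), here is a minimal sketch.  It assumes the spelling used by the current standard `typing` module, where the callable type is named `Callable` rather than the `Function` mentioned above; `BinaryIntOp`, `apply_op` and `add` are hypothetical names, not anything from mypy or Cython.

```python
from typing import Callable

# A function type taking (int, int) and returning int, written in the
# bracketed "[[argument types], result type]" style.
BinaryIntOp = Callable[[int, int], int]


def apply_op(op: BinaryIntOp, a: int, b: int) -> int:
    """Call op on a and b; the annotation documents op's signature."""
    return op(a, b)


def add(a: int, b: int) -> int:
    return a + b


print(apply_op(add, 2, 3))
```

The annotations are not enforced at run time; a checker such as mypy is what would complain if `apply_op` were handed something with a different signature.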

I actually like A) better, but not so much better it should override
what the parent to one of the two Cython syntax communities goes with.
A) is really easy to describe - "just take the function value structure
but use types instead of variables/expression value".

There are some other styles like pytypedecl or obiwan and such that
might also be worth looking into before you decide.  I haven't looked
at them, but thought I should mention them.

