[Cython] [cython-users] Cython .pxd introspection: listing defined constants

Robert Bradshaw robertwb at math.washington.edu
Sat Feb 19 21:47:41 CET 2011


On Sat, Feb 19, 2011 at 11:22 AM, W. Trevor King <wking at drexel.edu> wrote:

>> > If you try to override anything in a .so compiled module at runtime,
>> > you'd get the same kind of error you currently do trying to rebind a
>> > compiled class' method.
>>
>> That's the desired behavior for statically-bound globals, but
>> implementing it is not so trivial.
>
> It's currently implemented for classes.  Are modules that different
> from classes?
>
>    >>> import types
>    >>> types.ModuleType.__class__.__mro__
>    (<type 'type'>, <type 'object'>)
>
> So you've got your standard __setattr__ to override with an
> error-message generator.  What is the implementation difficulty?

You can't just add a __setattr__ to a module and have it work; it's a
class-level slot. And you don't want to modify all module classes. To
do this you have to subclass the module type itself and insert that
subclass in place during import.
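
A minimal sketch of that approach in plain Python, not Cython's implementation. It assumes Python 3.5 or later, where a module's `__class__` may be reassigned to a `types.ModuleType` subclass, and it swaps the class after the module object exists rather than hooking the import machinery. `FrozenModule`, `freeze` and `_frozen_names` are hypothetical names.

```python
import types


class FrozenModule(types.ModuleType):
    """Module subclass that refuses to rebind a fixed set of names."""

    # Hypothetical class attribute listing the protected names.
    _frozen_names = frozenset()

    def __setattr__(self, name, value):
        if name in type(self)._frozen_names:
            raise AttributeError(
                "%r attribute %r is read-only" % (self.__name__, name))
        super().__setattr__(name, value)


def freeze(module, names):
    """Give `module` its own FrozenModule subclass protecting `names`."""
    cls = type(
        "Frozen_" + module.__name__.replace(".", "_"),
        (FrozenModule,),
        {"_frozen_names": frozenset(names)},
    )
    # Only this one module object changes type; others are untouched.
    module.__class__ = cls
    return module


# Stand-in for a compiled module with one statically bound global.
demo = types.ModuleType("demo")
demo.A = 1
demo.B = 2
freeze(demo, ["A"])
```

Creating one subclass per module keeps the restriction local, so every other module keeps the stock module type.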

> I am also unclear about the distinction between cdef and cpdef
> (perhaps this should be a new thread?).
>
> cpdef means "I'm declaring something with C and Python interfaces.
>  The Python interface is a thin wrapper which can be rebound to an
>  object of any type, leaving the static C interface inaccessible from
>  Python."

No, you can't re-bind cdef class methods. Despite Cython's attempt to
homogenize things for the user, extension classes are quite different
from "normal" Python classes. This is a Python/C API issue.
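
The same difference can be seen without Cython. In this sketch the built-in `list` stands in for a cdef class, on the assumption that both are static extension types; `try_rebind` and `PyClass` are hypothetical names.

```python
def try_rebind(target, name, value):
    """Try setattr; return the exception type name, or None on success."""
    try:
        setattr(target, name, value)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None


class PyClass(object):
    def square(self):
        return 4


results = {
    # Ordinary Python classes and instances accept rebinding.
    "python class": try_rebind(PyClass, "square", lambda self: 0),
    "python instance": try_rebind(PyClass(), "square", lambda: 0),
    # Extension types and their instances reject it.
    "extension type": try_rebind(list, "append", None),
    "extension instance": try_rebind([], "append", None),
}
```

The two Python-level targets accept the assignment, while the extension type and its instance raise instead.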

> cdef [private] means "I'm declaring something that only has a C
>  interface."
> cdef public means "I'm declaring something with C and Python
>  interfaces backed by C data.  Python code can alter the C data."
> cdef readonly means "I'm declaring something with C and Python
>  interfaces backed by C data.  Python code cannot alter the C data."

cdef means "back this by a C variable"

> This seems to be broken in Cython at the module level, since I can
> rebind a cdef-ed class but not a cpdef-ed method:

Yes, we don't currently control the module's set/getattr. And classes
are "public" by default.

>    $ cat square.pyx
>    cdef class A (object):
>        cdef public int value
>
>        cpdef square(self):
>            return self.value**2
>    $ python -c 'import pyximport; pyximport.install(); import square;
>    a = square.A(); a.value = 3; print a.square();
>    a.square = lambda self: self.value'
>    Traceback (most recent call last):
>      File "<string>", line 3, in <module>
>    AttributeError: 'square.A' object attribute 'square' is read-only
>    $ python -c 'import pyximport; pyximport.install(); import square;
>    square.A = object; print square.A; a = square.A(); a.value = 3'
>    <type 'object'>
>    Traceback (most recent call last):
>      File "<string>", line 2, in <module>
>    AttributeError: 'object' object has no attribute 'value'
>
> So the cdef-ed A currently has a rebindable Python interface (acting
> like a hypothetical cpdef-ed class), but its square method is not
> rebindable (acting like a hypothetical `cdef readonly`-ed method).
>
>> >>>> (and where would we do it--on the first import of a cimporting
>> >>>> module?)
>> >>>
>> >>> Compilation is an issue.  I think that .pxd files should be able to be
>> >>> cythoned directly, since then Cython can build any wrappers they
>> >>> request.  If the file has a matching .pyx file, cythoning either one
>> >>> should compile both together, since they'll produce a single Python
>> >>> .so module.
>> >>
>> >> ...
>> >
>> > Under the mantra "explicit is better than implicit", we could have
>> > users add something like
>> >
>> >    cdef module "modname"
>> >
>> > to any .pxd files that should be inflated into Python modules.  .pxd
>> > files without such a tag would receive the current treatment, error on
>> > any cpdef, etc.  The drawback of this approach is that it makes Cython
>> > more complicated, but if both behaviors are reasonable, there's
>> > probably no getting around that.
>>
>> The other drawback is that it subverts the usual filename <-> module
>> name convention that one usually expects.
>
> I've been convinced that the `cimport .pyx file` route is a better way
> to go.
>
> However, the filename <-> module mapping is troublesome for backing
> externally-implemented Python modules (e.g. numpy).  If you wanted to
> write a .pxd file backing numpy.random, how would you go about getting
> your module installed in Cython/Includes/numpy/random.pxd or another
> path that cython would successfully match with `cimport numpy.random`?

Note that extern blocks (by definition) declare where things come from.

> On Sat, Feb 19, 2011 at 10:24:05AM +0100, Stefan Behnel wrote:
>> Robert Bradshaw, 18.02.2011 23:08:
>> > On Thu, Feb 17, 2011 at 8:38 PM, W. Trevor King wrote:
>> >> On Thu, Feb 17, 2011 at 3:53 PM, Robert Bradshaw wrote:
>> >>> On Thu, Feb 17, 2011 at 3:12 PM, W. Trevor King wrote:
>> >>>>>> A side effect of this cpdef change would be that now even bare .pxd
>> >>>>>> files (no matching .pyx) would have a Python presence,
>> >>>>>
>> >>>>> Where would it live? Would we just create this module (in essence,
>> >>>>> acting as if there was an empty .pyx file sitting there as well)? On
>> >>>>> this note, it may be worth pursuing the idea of a "cython helper"
>> >>>>> module where common code and objects could live.
>> >>>>
>> >>>> I'm not sure exactly what you mean by "cython helper", but this sounds
>> >>>> like my 'bare .pyx can create a Python .so module' idea above.
>> >>>
>> >>> I'm thinking of a place to put, e.g. the generator and bind-able
>> >>> function classes, which are now re-implemented in every module that
>> >>> uses them. I think there will be more cases like this in the future
>> >>> rather than less. C-level code could be #included and linked from
>> >>> "global" stores as well. However, that's somewhat tangential.
>>
>> If you generate more than one file from a .pyx, including files that are
>> shared between compiler runs (or even readily built as .so files), you'd
>> quickly end up in dependency hell.
>
> I disagree here, but I like your cimportable .pyx better, so it
> doesn't matter ;).
>
>> >>>>>> Unions don't really have a Python parallel,
>> >>>>>
>> >>>>> They can be a cdef class wrapping the union type.
>> >>>>
>> >>>> But I would think coercion would be difficult.  Unions are usually (in
>> >>>> my limited experience) for "don't worry about the type, just make sure
>> >>>> it fits in X bytes".  How would union->Python conversion work?
>> >>>
>> >>> There would be a wrapping type, e.g.
>> >>>
>> >>> cdef class MyUnion:
>> >>>     cdef union_type value
>>
>> Wouldn't that have to be a pointer to the real thing instead?
>
> Do you mean `cdef union_type *value`?  Why would the above version not
> work?  The union type has a well defined size and a number of well
> defined interpretations, so I don't see the problem.
>
>> >>> with a bunch of setters/getters for the values, just like there are
>> >>> for structs. (In fact the same code would handle structs and unions).
>> >>>
>> >>> This is getting into the wrapper-generator territory, but I'm starting
>> >>> to think for simple things that might be worth it.
>> >>
>> >> I think that if Cython will automatically generate a wrapper for
>> >>
>> >>     cdef public int x
>> >>
>> >> it should generate a wrapper for
>> >>
>> >>     cdef struct X: cdef public int x
>> >
>> > Or
>> >
>> >      cdef public struct X:
>> >          int x
>> >          readonly int y
>> >          private int z
>> >
>> > I would perhaps say that non-Pythonable non-private members in public
>> > structs would be a compile error.
>>
>> +1, keep it safe at the beginning.
>
> -1, keep the code clean and the interface consistent ;).  I think the
> struct syntax should be identical to the class syntax, with the
> exception that you can't bind methods to structs.  That's the only
> real difference between structs and classes, isn't it?

In C++, the only difference between structs and classes is that struct
members are public by default. (Not saying that C++ is always the
model to follow, but it gives precedent). And structs can have
function members, that's how to do OOP in C.

> If safety with a new feature is a concern, a warning like
> "EXPERIMENTAL FEATURE" in the associated docs and compiler output
> should be sufficient.

I think the point of "safe" is to start out with a compiler error, and
we can change our minds later, which is better than trying to make
legal statements illegal in the future.

>> >> There really aren't that many metatypes in C, so it doesn't seem like a
>> >> slippery slope to me.  Maybe I'm just missing something...
>> >>
>> >>>> Ok, I think we're pretty much agreed ;).  I think that the next step
>> >>>> is to start working on implementations of:
>> >>>>
>> >>>> * Stand alone .pxd ->  Python module
>> >>>
>> >>> I'm not sure we're agreed on this one.
>>
>> Same from here. To me, that doesn't make much sense for code that wraps a
>> library. And if it doesn't wrap a library, there isn't much benefit in
>> writing a stand-alone .pxd in the first place. A .pyx is much more explicit
>> and obvious in this case. Especially having some .pxd files that generate
>> .so files and others that don't will make this very ugly.
>>
>> I'd prefer adding support for cimporting from .pyx files instead,
>> potentially with an automated caching generation of corresponding .pxd
>> files (maybe as ".pxdg" files to make them easier to handle for users).
>> However, cyclic dependencies would be tricky to handle automatically then.
>
> That's fine with me, since I just checked and you can do
>    cdef extern from 'somelib.h':
> in .pyx files.
>
> Why would cyclic dependencies be difficult?  The contents of .pxdg
> files should only depend on the paired .pyx file, so there is no need
> to parse another .pyx/.pxd file when generating a .pxdg.  After you've
> created a .pxdg file, you're in the same situation that Cython
> presumably already handles well, with a .pyx/.pxd(g) pair.

Cyclic dependencies might require some special work, but they already do.

>> >>>> * Extending class cdef/cpdef/public/readonly handling to cover enums,
>> >>>>   structs, and possibly unions.
>> >>>
>> >>> This seems like the best first step.
>>
>> +1
>
> Working on it...
>
>> >>>> * I don't know how to handle things like dummy enums (perhaps by
>> >>>>   requiring all cdef-ed enums to be named).
>> >>>
>> >>> All enums in C are named.
>> >>
>> >> But my Cython declaration (exposing a C `#define CONST_A 1`):
>> >>
>> >>     cdef extern from 'mylib.h':
>> >>         enum: CONST_A
>> >>
>> >> is not a named enum.
>> >
>> > Ah, yes. Maybe we require a name (that would only be used in Python space).
>>
>> ... require it for cpdef enums, you mean?
>>
>> OTOH, the "enum: NAME" scheme is ugly by itself. There should be a way to
>> declare external constants correctly. After all, we lose all type
>> information that way. I just saw that in math.pxd things like "M_PI" are
>> declared as plain "enum" instead of "const double" or something. The type
>> inferencer can't do anything with that. It might even draw the completely
>> wrong conclusions.
>
> Something like:
>
>    [cdef|cpdef] extern [public|readonly] <type> <name>
>
> For example:
>
>    cdef extern readonly double M_PI
>
> That would be nice, since the C compiler would (I think) raise an error
> when you try to use an invalid <type> for the macro value.

Const is different than readonly, as readonly specifies the
python-level accessibility.
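
A rough Python analogy, not Cython syntax: a property with no setter is read-only to outside callers, yet the implementation can still change the value behind it, which a true constant would not allow. `Wrapper`, `scale` and `_value` are hypothetical names.

```python
class Wrapper(object):
    def __init__(self, value):
        self._value = value  # stands in for the backing C variable

    @property
    def value(self):
        # No setter is defined, so outside code cannot assign to `value`.
        return self._value

    def scale(self, factor):
        # The implementation itself may still modify the backing data.
        self._value *= factor


w = Wrapper(3)
w.scale(2)
```

After `scale` runs, `w.value` reflects the new value, while `w.value = 1` raises AttributeError.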

- Robert

