[Cython] [cython-users] Cython .pxd introspection: listing defined constants

W. Trevor King wking at drexel.edu
Sat Feb 19 20:22:03 CET 2011


On Fri, Feb 18, 2011 at 02:08:04PM -0800, Robert Bradshaw wrote:
> On Thu, Feb 17, 2011 at 8:38 PM, W. Trevor King <wking at drexel.edu> wrote:
> > On Thu, Feb 17, 2011 at 3:53 PM, Robert Bradshaw wrote:
> >> On Thu, Feb 17, 2011 at 3:12 PM, W. Trevor King wrote:
> >>> On Thu, Feb 17, 2011 at 01:25:10PM -0800, Robert Bradshaw wrote:
> >>>> On Thu, Feb 17, 2011 at 5:29 AM, W. Trevor King wrote:
> >>>> > On Wed, Feb 16, 2011 at 03:55:19PM -0800, Robert Bradshaw wrote:
> >>>> >> On Wed, Feb 16, 2011 at 8:17 AM, W. Trevor King wrote:
> >>>> >> > What I'm missing is a way to bind the ModuleScope namespace to a name
> >>>> >> > in expose.pyx so that commands like `dir(mylib)` and `getattr(mylib,
> >>>> >> > name)` will work in expose.pyx.
> >>>> >>
> >>>> >> You have also hit into the thorny issue that .pxd files are used for
> >>>> >> many things. They may be pure C library declarations with no Python
> >>>> >> module backing, they may be declarations of (externally implemented)
> >>>> >> Python modules (such as numpy.pxd), or they may be declarations for
> >>>> >> Cython-implemented modules.
> >>>> >>
> >>>> >> Here's another idea, what if extern blocks could contain cpdef
> >>>> >> declarations, which would automatically generate a Python-level
> >>>> >> wrappers for the declared members (if possible, otherwise an error)?
> >>>> >
> >>>> > Ah, this sounds good!  Of the three .pxd roles you list above,
> >>>> > external Python modules (e.g. numpy) and Cython-implemented modules
> >>>> > (e.g. matched .pxd/.pyx) both already have a presence in Python-space.
> >>>> > What's missing is a way to give (where possible) declarations of
> >>>> > external C libraries a Python presence.  cpdef fills this hole nicely,
> >>>> > since its whole purpose is to expose Python interfaces to
> >>>> > C-based elements.
> >>>>
> >>>> In the case of external Python modules, I'm not so sure we want to
> >>>> monkey-patch our stuff in
> >>>
> >>> I don't think any of the changes we are suggesting would require
> >>> changes to existing code, so .pxd-s with external implementations
> >>> wouldn't be affected unless they brought the changes upon themselves.
> >>
> >> Say, in numpy.pxd, I have
> >>
> >> cdef extern from "...":
> >>    cpdef struct obscure_internal_struct:
> >>        ...
> >>
> >> Do we add an "obscure_internal_struct" onto the (global) numpy module?
> >> What if it conflicts with a (runtime) name? This is the issue I'm
> >> bringing up.
> >
> > Defining a cpdef *and* a non-matching external implementation should
> > raise a compile-time error.  I agree that there is a useful
> > distinction between external-C-library and external-Python-module .pxd
> > wrappers.  Perhaps your matching blank .py or .pyx file could serve as
> > a marker that the .pxd file should be inflated into its own full
> > fledged python module.  I'm not even sure how you would go about
> > adding attributes to the numpy module.  When/how would the
> > Cython-created attributes get added?
> 
> Yes, this is exactly the issue.

Ah, I'm retracting my agreement on the external-C-library and
external-Python-module .pxd wrappers.  There is no difference in how
their .pxd files should be treated, and I now agree that .pxd files
should not generate .so modules unless they have a paired .py/.pyx
file.

> > If you try to override anything in a .so compiled module at runtime,
> > you'd get the same kind of error you currently do trying to rebind a
> > compiled class' method.
> 
> That's the desired behavior for statically-bound globals, but
> implementing it is not so trivial.

It's currently implemented for classes.  Are modules that different
from classes?

    >>> import types
    >>> types.ModuleType.__class__.__mro__
    (<type 'type'>, <type 'object'>)

So you've got your standard __setattr__ to override with an
error-message generator.  What is the implementation difficulty?
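
A minimal pure-Python sketch of such an override follows.  The names
`ReadOnlyModule`, `frozen`, and `try_rebind` are hypothetical, chosen
only for illustration; nothing here is an existing Cython feature, and
the sketch (written for Python 3) says nothing about how generated C
code would enforce the same rule.

```python
import types


class ReadOnlyModule(types.ModuleType):
    """Hypothetical module type that refuses to rebind selected names."""

    def __init__(self, name, frozen=()):
        super().__init__(name)
        # Store bookkeeping directly in the module dict so that it does
        # not pass through our own __setattr__.
        self.__dict__["_frozen"] = frozenset(frozen)

    def __setattr__(self, name, value):
        frozen = self.__dict__.get("_frozen", frozenset())
        # Allow the first binding, refuse any later rebinding.
        if name in frozen and name in self.__dict__:
            raise AttributeError(
                "%r module attribute %r is read-only" % (self.__name__, name))
        super().__setattr__(name, value)


def try_rebind(module, name, value):
    """Return True if the rebinding succeeded, False if it was refused."""
    try:
        setattr(module, name, value)
    except AttributeError:
        return False
    return True
```

This only guards attribute assignment from outside the module; code
that writes to the module's `__dict__` directly bypasses the check,
which may be part of the implementation difficulty.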

I am also unclear about the distinction between cdef and cpdef
(perhaps this should be a new thread?).

cpdef means "I'm declaring something with C and Python interfaces.
  The Python interface is a thin wrapper which can be rebound to an
  object of any type, leaving the static C interface inaccessible from
  Python."
cdef [private] means "I'm declaring something that only has a C
  interface."
cdef public means "I'm declaring something with C and Python
  interfaces backed by C data.  Python code can alter the C data."
cdef readonly means "I'm declaring something with C and Python
  interfaces backed by C data.  Python code cannot alter the C data."
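
As a rough pure-Python analogy for the private/public/readonly
distinction above (an illustration only: the class `Record` and its
attribute names are hypothetical, and a leading underscore in Python
is a convention rather than the real inaccessibility that a
C-only member has):

```python
class Record(object):
    """Pure-Python analogy for the access levels described above."""

    def __init__(self, hidden, public, readonly):
        self._hidden = hidden      # analogous to `cdef [private]`
        self._public = public      # backing data for `cdef public`
        self._readonly = readonly  # backing data for `cdef readonly`

    @property
    def public(self):
        return self._public

    @public.setter
    def public(self, value):
        # Python code may alter the backing data.
        self._public = int(value)

    @property
    def readonly(self):
        # No setter is defined, so assignment raises AttributeError.
        return self._readonly
```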

This seems to be broken in Cython at the module level, since I can
rebind a cdef-ed class but not a cpdef-ed method:

    $ cat square.pyx
    cdef class A (object):
        cdef public int value
    
        cpdef square(self):
            return self.value**2
    $ python -c 'import pyximport; pyximport.install(); import square;
    a = square.A(); a.value = 3; print a.square();
    a.square = lambda self: self.value'
    Traceback (most recent call last):
      File "<string>", line 3, in <module>
    AttributeError: 'square.A' object attribute 'square' is read-only
    $ python -c 'import pyximport; pyximport.install(); import square;
    square.A = object; print square.A; a = square.A(); a.value = 3'
    <type 'object'>
    Traceback (most recent call last):
      File "<string>", line 2, in <module>
    AttributeError: 'object' object has no attribute 'value'

So the cdef-ed A currently has a rebindable Python interface (acting
like a hypothetical cpdef-ed class), but its square method is not
rebindable (acting like a hypothetical `cdef readonly`-ed method).

> >>>> (and where would we do it--on the first import of a cimporting
> >>>> module?)
> >>>
> >>> Compilation is an issue.  I think that .pxd files should be able to be
> >>> cythoned directly, since then Cython can build any wrappers they
> >>> request.  If the file has a matching .pyx file, cythoning either one
> >>> should compile both together, since they'll produce a single Python
> >>> .so module.
> >>
> >> ...
> >
> > Under the mantra "explicit is better than implicit", we could have
> > users add something like
> >
> >    cdef module "modname"
> >
> > to any .pxd files that should be inflated into Python modules.  .pxd
> > files without such a tag would receive the current treatment, error on
> > any cpdef, etc.  The drawback of this approach is that it makes Cython
> > more complicated, but if both behaviors are reasonable, there's
> > probably no getting around that.
> 
> The other drawback is that it subverts the usual filename <-> module
> name convention that one usually expects.

I've been convinced that the `cimport .pyx file` route is a better way
to go.

However, the filename <-> module mapping is troublesome for backing
externally-implemented Python modules (e.g. numpy).  If you wanted to
write a .pxd file backing numpy.random, how would you go about getting
your module installed in Cython/Includes/numpy/random.pxd or another
path that cython would successfully match with `cimport numpy.random`?
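
To make the mapping concrete, here is a simplified sketch of the
lookup.  It assumes that each dotted component becomes a directory and
that the last component may be either a `.pxd` file or a package with
an `__init__.pxd`; the function name `candidate_pxd_paths` is
hypothetical, and Cython's real search logic may differ in its details.

```python
import posixpath


def candidate_pxd_paths(module_name, search_dirs):
    """List the files a `cimport module_name` could plausibly match.

    Simplified assumption: dotted components map to directories, and
    the final component is either `name.pxd` or `name/__init__.pxd`.
    """
    relative = posixpath.join(*module_name.split("."))
    paths = []
    for directory in search_dirs:
        paths.append(posixpath.join(directory, relative + ".pxd"))
        paths.append(posixpath.join(directory, relative, "__init__.pxd"))
    return paths
```

Under these assumptions, a third-party `numpy/random.pxd` has to land
in one of the searched directories, next to whatever else already
claims the `numpy` name there.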

On Sat, Feb 19, 2011 at 10:24:05AM +0100, Stefan Behnel wrote:
> Robert Bradshaw, 18.02.2011 23:08:
> > On Thu, Feb 17, 2011 at 8:38 PM, W. Trevor King wrote:
> >> On Thu, Feb 17, 2011 at 3:53 PM, Robert Bradshaw wrote:
> >>> On Thu, Feb 17, 2011 at 3:12 PM, W. Trevor King wrote:
> >>>>>> A side effect of this cpdef change would be that now even bare .pxd
> >>>>>> files (no matching .pyx) would have a Python presence,
> >>>>>
> >>>>> Where would it live? Would we just create this module (in essence,
> >>>>> acting as if there was an empty .pyx file sitting there as well)? On
> >>>>> this note, it may be worth pursuing the idea of a "cython helper"
> >>>>> module where common code and objects could live.
> >>>>
> >>>> I'm not sure exactly what you mean by "cython helper", but this sounds
> >>>> like my 'bare .pyx can create a Python .so module' idea above.
> >>>
> >>> I'm thinking of a place to put, e.g. the generator and bind-able
> >>> function classes, which are now re-implemented in every module that
> >>> uses them. I think there will be more cases like this in the future
> >>> rather than less. C-level code could be #included and linked from
> >>> "global" stores as well. However, that's somewhat tangential.
> 
> If you generate more than one file from a .pyx, including files that are 
> shared between compiler runs (or even readily built as .so files), you'd 
> quickly end up in dependency hell.

I disagree here, but I like your cimportable .pyx better, so it
doesn't matter ;).

> >>>>>> Unions don't really have a Python parallel,
> >>>>>
> >>>>> They can be a cdef class wrapping the union type.
> >>>>
> >>>> But I would think coercion would be difficult.  Unions are usually (in
> >>>> my limited experience) for "don't worry about the type, just make sure
> >>>> it fits in X bytes".  How would union->Python conversion work?
> >>>
> >>> There would be a wrapping type, e.g.
> >>>
> >>> cdef class MyUnion:
> >>>     cdef union_type value
> 
> Wouldn't that have to be a pointer to the real thing instead?

Do you mean `cdef union_type *value`?  Why would the above version not
work?  The union type has a well defined size and a number of well
defined interpretations, so I don't see the problem.
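
The same idea can be tried out from plain Python with ctypes.  This is
only an analogy for the proposed cdef class: `UnionType` and its two
members are assumptions standing in for a real C `union_type`, and the
wrapper keeps the union as an ordinary attribute rather than behind an
explicit pointer.

```python
import ctypes


class UnionType(ctypes.Union):
    """Stand-in for a C `union_type`; the two members are assumptions."""
    _fields_ = [("as_int", ctypes.c_int32),
                ("as_float", ctypes.c_float)]


class MyUnion(object):
    """Wrapper holding the union itself, with one property per member."""

    def __init__(self):
        self._value = UnionType()  # fixed, well-defined size

    @property
    def as_int(self):
        return self._value.as_int

    @as_int.setter
    def as_int(self, value):
        self._value.as_int = value

    @property
    def as_float(self):
        return self._value.as_float

    @as_float.setter
    def as_float(self, value):
        self._value.as_float = value

    def size(self):
        """Size in bytes of the wrapped union."""
        return ctypes.sizeof(self._value)
```

Each property is one of the union's interpretations of the same
bytes: writing through `as_float` and reading through `as_int` returns
the raw bit pattern of the float.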

> >>> with a bunch of setters/getters for the values, just like there are
> >>> for structs. (In fact the same code would handle structs and unions).
> >>>
> >>> This is getting into the wrapper-generator territory, but I'm starting
> >>> to think for simple things that might be worth it.
> >>
> >> I think that if Cython will automatically generate a wrapper for
> >>
> >>     cdef public int x
> >>
> >> it should generate a wrapper for
> >>
> >>     cdef struct X: cdef public int x
> >
> > Or
> >
> >      cdef public struct X:
> >          int x
> >          readonly int z
> >          private int z
> >
> > I would perhaps say that non-Pythonable non-private members in public
> > structs would be a compile error.
> 
> +1, keep it safe at the beginning.

-1, keep the code clean and the interface consistent ;).  I think the
struct syntax should be identical to the class syntax, with the
exception that you can't bind methods to structs.  That's the only
real difference between structs and classes, isn't it?

If safety with a new feature is a concern, a warning like
"EXPERIMENTAL FEATURE" in the associated docs and compiler output
should be sufficient.

> >> There really aren't that many metatypes in C, so it doesn't seem like a
> >> slippery slope to me.  Maybe I'm just missing something...
> >>
> >>>> Ok, I think we're pretty much agreed ;).  I think that the next step
> >>>> is to start working on implementations of:
> >>>>
> >>>> * Stand alone .pxd ->  Python module
> >>>
> >>> I'm not sure we're agreed on this one.
> 
> Same from here. To me, that doesn't make much sense for code that wraps a 
> library. And if it doesn't wrap a library, there isn't much benefit in 
> writing a stand-alone .pxd in the first place. A .pyx is much more explicit 
> and obvious in this case. Especially having some .pxd files that generate 
> .so files and others that don't will make this very ugly.
> 
> I'd prefer adding support for cimporting from .pyx files instead, 
> potentially with an automated caching generation of corresponding .pxd 
> files (maybe as ".pxdg" files to make them easier to handle for users). 
> However, cyclic dependencies would be tricky to handle automatically then.

That's fine with me, since I just checked and you can do
    cdef extern from 'somelib.h':
in .pyx files.

Why would cyclic dependencies be difficult?  The contents of .pxdg
files should only depend on the paired .pyx file, so there is no need
to parse another .pyx/.pxd file when generating a .pxdg.  After you've
created a .pxdg file, you're in the same situation that Cython
presumably already handles well, with a .pyx/.pxd(g) pair.
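
A toy sketch of that two-phase argument follows.  Everything in it is
hypothetical: the in-memory `SOURCES`, the function names, and above
all the regular-expression "parser", which assumes that declarations
can be extracted from a .pyx without knowing anything about the types
it cimports.  If that assumption fails, the cycle problem returns.

```python
import re

# Hypothetical in-memory "files": two .pyx modules that cimport each other.
SOURCES = {
    "a": "cimport b\ncpdef int twice(int x):\n    return 2 * x\n",
    "b": "cimport a\ncpdef int thrice(int x):\n    return 3 * x\n",
}

# Top-level cdef/cpdef signatures and single-name cimport lines.
DECLARATION = re.compile(r"^cp?def[ \t]+.*:[ \t]*$", re.MULTILINE)
CIMPORT = re.compile(r"^cimport[ \t]+(\w+)[ \t]*$", re.MULTILINE)


def generate_pxdg(pyx_source):
    """Phase 1: derive declarations from one .pyx, reading no other file."""
    return [match.group(0).rstrip(": \t")
            for match in DECLARATION.finditer(pyx_source)]


def resolve_cimports(sources, pxdg_table):
    """Phase 2: look each cimport up in the already-generated table."""
    return {name: {dep: pxdg_table[dep] for dep in CIMPORT.findall(source)}
            for name, source in sources.items()}


# Phase 1 touches each module in isolation, so the a <-> b cycle is harmless.
PXDG = {name: generate_pxdg(source) for name, source in SOURCES.items()}
RESOLVED = resolve_cimports(SOURCES, PXDG)
```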

> >>>> * Extending class cdef/cpdef/public/readonly handling to cover enums,
> >>>>   structs, and possibly unions.
> >>>
> >>> This seems like the best first step.
> 
> +1

Working on it...

> >>>> * I don't know how to handle things like dummy enums (perhaps by
> >>>>   requiring all cdef-ed enums to be named).
> >>>
> >>> All enums in C are named.
> >>
> >> But my Cython declaration (exposing a C `#define CONST_A 1`):
> >>
> >>     cdef extern from 'mylib.h':
> >>         enum: CONST_A
> >>
> >> is not a named enum.
> >
> > Ah, yes. Maybe we require a name (that would only be used in Python space).
> 
> ... require it for cpdef enums, you mean?
> 
> OTOH, the "enum: NAME" scheme is ugly by itself. There should be a way to 
> declare external constants correctly. After all, we lose all type 
> information that way. I just saw that in math.pxd things like "M_PI" are 
> declared as plain "enum" instead of "const double" or something. The type 
> inferencer can't do anything with that. It might even draw the completely 
> wrong conclusions.

Something like:

    [cdef|cpdef] extern [public|readonly] <type> <name>

For example:

    cdef extern readonly double M_PI

That would be nice, since the C compiler would (I think) raise an error
when you try to use an invalid <type> for the macro value.


