[Numpy-discussion] PR added: frozen dimensions in gufunc signatures

Jaime Fernández del Río jaime.frio at gmail.com
Fri Aug 29 04:55:00 EDT 2014


On Thu, Aug 28, 2014 at 5:40 PM, Nathaniel Smith <njs at pobox.com> wrote:

> Some thoughts:
>
> But, for your computed dimension idea I'm wondering if what we should
> do instead is just let a gufunc provide a C callback that looks at the
> input array dimensions and explicitly says somehow which dimensions it
> wants to treat as the core dimensions and what its output shapes will
> be. There's no rule that we have to extend the signature mini-language
> to be Turing complete, we can just use C :-).
>
> It would be good to have a better motivation for computed gufunc
> dimensions, though. Your "all pairwise cross products" example would
> be *much* better handled by implementing the .outer method for binary
> gufuncs: pairwise_cross(a) == cross.outer(a, a). This would make
> gufuncs more consistent with ufuncs, plus let you do
> all-pairwise-cross-products between two different sets of vectors, plus
> give us all-pairwise-matrix-products for free, etc.
>

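As a point of reference, the all-pairwise cross products in that example can
already be written with plain broadcasting, which is roughly what a gufunc
.outer method would do under the hood (a quick sketch, not part of the PR):

    import numpy as np

    a = np.random.rand(5, 3)   # 5 vectors in R^3
    b = np.random.rand(7, 3)   # 7 vectors in R^3

    # cross.outer(a, b) would pair every vector of a with every vector of b;
    # inserting length-1 axes and letting np.cross broadcast does the same.
    pairwise = np.cross(a[:, np.newaxis, :], b[np.newaxis, :, :])
    print(pairwise.shape)      # (5, 7, 3)
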
The .outer method for binary gufuncs sounds like a good idea. A .reduce for
binary gufuncs that support it (like square matrix multiplication) would also
be nice. But going back to the original question, the pairwise whatevers were
just an example; one could come up with several others, e.g.:

    (m),(n)->($p),($q), with $p = m - n + 1 and $q = n - 1, could be (I think)
    the signature of a polynomial division gufunc.
    (m),(n)->($p), with $p = m - n + 1, could be the signature of a
    convolution or correlation gufunc (see the sketch below).
    (m)->($n), with $n = m / 2, could be some form of downsampling gufunc.

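None of these exist, of course; just to give a feel for the intended
semantics, here is a rough pure-Python emulation of the correlation case (a
sketch only, with illustrative names; a real gufunc would run this loop in C):

    import numpy as np

    def correlate_valid(a, v):
        # Would-be gufunc with signature (m),(n)->($p), $p = m - n + 1:
        # the last axis of each input is a core dimension, and all leading
        # axes broadcast against each other as loop dimensions.
        a, v = np.asarray(a, dtype=float), np.asarray(v, dtype=float)
        m, n = a.shape[-1], v.shape[-1]
        loop = np.broadcast(a[..., 0], v[..., 0]).shape
        a_b = np.broadcast_to(a, loop + (m,))
        v_b = np.broadcast_to(v, loop + (n,))
        out = np.empty(loop + (m - n + 1,))
        for idx in np.ndindex(loop):
            out[idx] = np.correlate(a_b[idx], v_b[idx], mode='valid')
        return out

    print(correlate_valid(np.ones((4, 10)), np.ones(3)).shape)   # (4, 8)
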

> While you're messing around with the gufunc dimension matching logic,
> any chance we can tempt you to implement the "optional dimensions"
> needed to handle '@', solve, etc. elegantly? The rule would be that
> you can write something like
>    (n?,k),(k,m?)->(n?,m?)
> and the ? dimensions are allowed to take on an additional value
> "nothing at all". If there's no dimension available in the input, then
> we act like it was reshaped to add a dimension with shape 1, and then
> in the output we squeeze this dimension out again. I guess the rules
> would be that (1) in the input, you can have ? dimensions at the
> beginning or the end of your shape, but not both at the same time, (2)
> any dimension that has a ? in one place must have it in all places,
> (3) when checking argument conformity, "nothing at all" only matches
> against "nothing at all", not against 1; this is because if we allowed
> (n?,m),(n?,m)->(n?,m) to be applied to two arrays with shapes (5,) and
> (1, 5), then it would be ambiguous whether the output should have
> shape (5,) or (1, 5).
>

I definitely don't mind taking a look at it. I need to think a little more
about the rules to convince myself that there is a consistent set of them
that we can use. I also thought there might be a performance concern: you may
want different implementations when dimensions are missing, rather than
automatically adding a 1 and then removing it. That doesn't seem to be the
case for either `np.dot` or `np.linalg.solve`, so maybe I am being overly
cautious.
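
For what it's worth, `np.dot` already shows the promote-then-squeeze
behaviour for matrix-vector products, including the distinction between a
missing dimension and an explicit length-1 dimension that rule (3) is about:

    import numpy as np

    A = np.ones((3, 4))        # (n, k)
    x = np.ones(4)             # (k,): the m? dimension is absent

    print(np.dot(A, x).shape)                # (3,): the missing m? is squeezed away
    print(np.dot(A, x.reshape(4, 1)).shape)  # (3, 1): an explicit m = 1 is kept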

Thanks for your comments and ideas. I have a feeling there are some nice
features hidden in here, but I can't seem to figure out what they should be
on my own.

Jaime

-- 
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him with his
plans for world domination.