[SciPy-Dev] Implementation choices in scipy.sparse.linalg.LinearOperator

Sun Feb 2 16:45:00 EST 2020

On Fri, Jan 24, 2020 at 6:54 PM matteo ravasi <matteoravasi at gmail.com>
wrote:

> Dear All,
> just trying to revive this as I didn’t get any reply.
>

Hi Matteo, thanks for being persistent! The main reason for the lack of
reply is probably that no one remembers the answer - the code is quite old
(pre-2008), and the git history says Pauli is the only currently active dev
who has done significant work on it in the last years.

> I can see that there is more interest in the community with respect to
> ad-hoc matrix-vector multiplications (aka Linear operators) - see also the
> discussion in Re: [SciPy-Dev] Efficient Toeplitz Matrix-Vector
> Multiplication and related PR.
>
> Hopefully some of the core developers or anyone involved in this area can
> comment here ;)
>
> Thank you!
> MR
>
>
>
> On 8 Jan 2020, at 20:31, matteo ravasi <matteoravasi at gmail.com> wrote:
>
> Dear Scipy Developers,
> some of you may be aware of the PyLops project (
> https://github.com/equinor/pylops), which heavily builds on the
> scipy.sparse.linalg.LinearOperator class of scipy and provides an extensive
> library of general purpose and domain specific linear operators.
>
> Recently a users asked for a functionality that would create the
> equivalent dense matrix representation of an operator (to be mostly used
> for teaching purposes). While implementing it, we came across two internal
> behaviours in the interface.py file that we would like to mention and
> discuss about:
>
> - The method _matmat and _rmatmat as implemented per today requires
> operators to work on (N, 1)/(M, 1) arrays. Whilst the first sentence in the
> Notes of LinearOperator states *"The user-defined matvec() function must
> properly handle the case where v has shape (N,) as well as the (N,1) case*",
> I tend to believe that the first option should be the encouraged one, as I
> do not really see any advantage with carrying over trailing dimensions in
> this case. Moreover, some of the linear solvers in scipy that can be used
> together with linear operator such as lsqr or lsmr, require the RHS b
> vector to have a single dimension (b : (m,) ndarray). Again, although a
> quick test showed that they work also when providing a b vector of shape
> (m, 1), this shows to me that the extra trailing dimension is just a
> nuisance and should be discouraged. Assuming that you agree with this
> argument, an alternative solution that works on (N, ) input is actually
> possible and currently implemented in PyLops for _matmat (see here
> https://github.com/equinor/pylops/blob/master/pylops/LinearOperator.py).
> This avoids having implementations that work for both (N, ) and (N, 1),
> which sometimes requires adding additional checks and squeezes within the
> implementation of _matvec and _rmatvec in situations where trailing
> dimensions in the input vector are not really bringing any benefit.
>
> It seems to be, only based on this description, that we can't just break
backwards compatibility here, however it is likely not hard to add a
squeeze call in the right places (or even one central validation function)
and then make everything else assume (N,).

If it would make your life in PyLops easier (not entirely clear to me that
that's why you are asking), then we could look at the impact of deprecating
(N, 1). If no deprecation is needed, a refactoring seems welcome.

See also https://github.com/scipy/scipy/issues/2473, I remembered that we
had an open issue of merging back PyOperators changes. Now there's PyLops,
which may be in a similar position as PyOperators was.

If you would want to start doing some refactoring/maintenance/improvements
in ways that help PyLops (or just make LinearOperator better), that would
be great - we'd love to have someone take a bit of ownership of it.

>
> - The above mentioned new feature has been currently added as a method to
> the PyLops LinearOperator class (and named todense in a similar fashion to
> scipy.sparse.xxx_matrix.todense methods). Doing so, we realized that every
> time two operators are summed/subtracted/multiplied… (pretty much when one
> of the overloading of scipy.sparse.linalg.LinearOperator is invoked), the
> returned object is not of LinearOperator type anymore but of the type of
> the private class in scipy.sparse.linalg.interface used to perform the
> requested operator (e.g, _SumLinearOperator). I wonder if returning an
> object of a ‘private’ type is actually something recommended to do in scipy
> (or even if this happens anywhere else in scipy).
>
> When we did this in scipy.stats, we also exposed the new classes. It's
normally not good practice to return instances of private classes I'd say.

> Again, in PyLops we need to overload the __add__ method (and similar
> methods) and make it into a thin wrappers of the scipy equivalent,
> returning an object of our parent class LinearOperator. This is to ensure
> that the returned object would maintain the additional properties and
> methods of our Linearoperator class, but also to avoid ‘leaking’ any of the
> private classes to the end users. I wonder if also the scipy API should
> perhaps avoid returning to end users objects of some private classes.
>
> That's really hard to avoid when users or other libraries create
subclasses.

>
> This email is not intended to criticise any of current behaviours in
> scipy.sparse.linalg.LinearOperator, rather to gain an understanding of some
> of the decisions made by the core developers of the
> scipy.sparse.linalg.interface submodule, which may all have their own right
> to be the way they are today :)
>
>
I hope the above helps!

Cheers,
Ralf

> Looking forward to hearing from you.
>
> Best wishes,
> MR
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20200202/bdc2e2a3/attachment-0001.html>