[SciPy-dev] eig, eigh, and symeig in scipy

Tue Oct 28 13:43:34 EDT 2008

On Tue, Oct 28, 2008 at 12:30, Tiziano Zito <opossumnano at gmail.com> wrote:
> dear list,
>
> I have started to work on the integration of symeig in scipy. I am
> unfortunately unavailable for the sprint on november 1st and 2nd,
> but I hope to make up for lost time during next week. This mail is
> mainly directed to robert, who was assigned the task of helping me
> with this issues, but other devs may have something to say
> nonetheless. This is a looong message, be aware!
>
> I've tried to understand the organization of the symmetric
> eigenproblem routines in numpy and scipy, and came up with this:
>
> - numpy.linalg.eigh:
>    - only standard problems ( Ax = lx)
>    - wraps (or uses its own lapack_lite version) EVD routines
>
> - scipy.linalg.eigh:
>    - only standard problems (Ax = lx)
>    - wraps EV routines
>    - signature differs from numpy.linalg.eigh
>
> - symeig:
>    - for standard problems (Ax = lx) wraps EVR routines
>    - for generalized problems (Ax = lBx) wraps GV, GVX, and GVD routines
>    - the routine is chosen depending on some flags and depending on
>      the type of problem
>
> Those, who don't know the corresponding LAPACK routines, can find a
> description here:
> http://www.netlib.org/lapack/lug/node30.html
> and here:
> http://www.netlib.org/lapack/lug/node34.html
>
> Basically, for the standard problems
> - the EV routines are the oldest one (less optimized and less
>  precise)
> - the EVD are faster than the EV but require more memory
> - the EVX offer the possibility to extract just a subset of the
>  eigenvectors/eigenvalues
> - the EVR are the fastest and use less memory than all others
>
> for the generalized problems:
> - the GV routines are the oldest one
> - the GVX offer the possibility to extract just a subset of the
>  eigenvectors/eigenvalues
> - the GVD are faster then GV but require more memory
>
> This is the status quo, now:
> - why are numpy and scipy wrapping different routines?

The usual reasons. History. Limited available time. Possibly limited
understanding of the capabilities of LAPACK. Possibly limited
capabilities of the common version of LAPACK at the time the wrappers
were done.

>  I think numpy should keep on wrapping the EVD routines (unless
>  someone manages to add the EVR routines to lapack_lite.c) and I
>  see no reason for scipy to wrap the old EV routines instead of the
>  EVD or EVR ones.

Agreed.

> - In my opinion, scipy.linalg.eigh should take over the
>  functionality now offered by symeig. Note that this would be
>  consistent with what is already done with linalg.eig: the scipy
>  version adds generalized problem functionality to the standard
>  problem functionality already available in numpy (as a plus eigh
>  would also grow the possibility to request just
>  a subset of the eigenvs).

Agreed.

> - While developing symeig with tested the EVR routines: they are
>  really better, so I see no reason not to use them by default. On
>  top of that, they fail when a matrix is nearly singular, while EV
>  and EVD tend to give wrong results without failing. When an EVR
>  routine fails the user can be advised that linalg.svd may be
>  better for his case.

When were they introduced to LAPACK? We may need to fall back on the
EVD routines if the EVR routines are not present.

> - The pyf description files are (if I am not mistaken) automatically
>  generated. Should I try to let the generator create pyf files for
>  our routines, or do you want to have the little (?) inconsistency
>  of having some hand-made pyf in the distribution?

They are f2py-generated, then edited. You should use f2py to create
descriptions the new subroutines, then add them to the appropriate
.pyf files, then edit them by hand to provide the missing information.

> - In the symeig test suite we use two very useful routines:
>  random_rot and symrand. Their docstrings:
>
>  random_rot:
>    """Return a random rotation matrix, drawn from the Haar distribution
>    (the only uniform distribution on SO(n)).
>    The algorithm is described in the paper
>    Stewart, G.W., "The efficient generation of random orthogonal
>    matrices with an application to condition estimators", SIAM Journal
>    on Numerical Analysis, 17(3), pp. 403-409, 1980.
>    For more information see
>    http://en.wikipedia.org/wiki/Orthogonal_matrix#Randomization"""
>
>  symrand:
>    """Return a random symmetric (Hermitian) matrix.
>
>    If 'dim_or_eigv' is an integer N, return a NxN matrix, with eigenvalues
>        uniformly distributed on (0,1].
>
>    If 'dim_or_eigv' is  1-D real array 'a', return a matrix whose
>                      eigenvalues are sort(a)."""
>
>  Do you think these routines may be a useful addition to scipy, or
>  should I keep them hidden in the tests?

Add them, please. I have needed at least the first one on a few occasions.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco