[SciPy-dev] eig, eigh, and symeig in scipy

Tiziano Zito opossumnano at gmail.com
Tue Oct 28 13:30:57 EDT 2008


dear list, 

I have started to work on the integration of symeig in scipy. I am
unfortunately unavailable for the sprint on november 1st and 2nd,
but I hope to make up for lost time during next week. This mail is
mainly directed to robert, who was assigned the task of helping me
with this issues, but other devs may have something to say
nonetheless. This is a looong message, be aware! 

I've tried to understand the organization of the symmetric 
eigenproblem routines in numpy and scipy, and came up with this:

- numpy.linalg.eigh:
    - only standard problems ( Ax = lx)
    - wraps (or uses its own lapack_lite version) EVD routines

- scipy.linalg.eigh:
    - only standard problems (Ax = lx)
    - wraps EV routines
    - signature differs from numpy.linalg.eigh

- symeig:
    - for standard problems (Ax = lx) wraps EVR routines
    - for generalized problems (Ax = lBx) wraps GV, GVX, and GVD routines
    - the routine is chosen depending on some flags and depending on
      the type of problem

Those, who don't know the corresponding LAPACK routines, can find a
description here:
http://www.netlib.org/lapack/lug/node30.html
and here:
http://www.netlib.org/lapack/lug/node34.html

Basically, for the standard problems 
- the EV routines are the oldest one (less optimized and less
  precise)
- the EVD are faster than the EV but require more memory
- the EVX offer the possibility to extract just a subset of the
  eigenvectors/eigenvalues
- the EVR are the fastest and use less memory than all others

for the generalized problems:
- the GV routines are the oldest one
- the GVX offer the possibility to extract just a subset of the
  eigenvectors/eigenvalues
- the GVD are faster then GV but require more memory

This is the status quo, now:
- why are numpy and scipy wrapping different routines?
  I think numpy should keep on wrapping the EVD routines (unless
  someone manages to add the EVR routines to lapack_lite.c) and I
  see no reason for scipy to wrap the old EV routines instead of the
  EVD or EVR ones.

- In my opinion, scipy.linalg.eigh should take over the
  functionality now offered by symeig. Note that this would be
  consistent with what is already done with linalg.eig: the scipy
  version adds generalized problem functionality to the standard
  problem functionality already available in numpy (as a plus eigh
  would also grow the possibility to request just 
  a subset of the eigenvs).

- While developing symeig with tested the EVR routines: they are
  really better, so I see no reason not to use them by default. On
  top of that, they fail when a matrix is nearly singular, while EV
  and EVD tend to give wrong results without failing. When an EVR
  routine fails the user can be advised that linalg.svd may be
  better for his case.

- The pyf description files are (if I am not mistaken) automatically
  generated. Should I try to let the generator create pyf files for
  our routines, or do you want to have the little (?) inconsistency
  of having some hand-made pyf in the distribution?

- In the symeig test suite we use two very useful routines:
  random_rot and symrand. Their docstrings:
  
  random_rot:
    """Return a random rotation matrix, drawn from the Haar distribution
    (the only uniform distribution on SO(n)).
    The algorithm is described in the paper
    Stewart, G.W., "The efficient generation of random orthogonal
    matrices with an application to condition estimators", SIAM Journal
    on Numerical Analysis, 17(3), pp. 403-409, 1980.
    For more information see
    http://en.wikipedia.org/wiki/Orthogonal_matrix#Randomization"""

  symrand:
    """Return a random symmetric (Hermitian) matrix.
    
    If 'dim_or_eigv' is an integer N, return a NxN matrix, with eigenvalues
        uniformly distributed on (0,1].
        
    If 'dim_or_eigv' is  1-D real array 'a', return a matrix whose
                      eigenvalues are sort(a)."""

  Do you think these routines may be a useful addition to scipy, or
  should I keep them hidden in the tests?

let me know what do you think!

ciao,
tiziano










More information about the SciPy-Dev mailing list