[SciPy-User] Graph connect components and sparse matrices

Gael Varoquaux gael.varoquaux at normalesup.org
Wed Nov 18 11:10:04 EST 2009


On Wed, Nov 18, 2009 at 09:36:26AM -0500, Zachary Pincus wrote:
> Note that at least the non-sparse routines in numpy do this:

>   numpy.linalg.eig(numpy.zeros((5,5)))

> (array([ 0.,  0.,  0.,  0.,  0.]),
>   array([[ 1.,  0.,  0.,  0.,  0.],
>         [ 0.,  1.,  0.,  0.,  0.],
>         [ 0.,  0.,  1.,  0.,  0.],
>         [ 0.,  0.,  0.,  1.,  0.],
>         [ 0.,  0.,  0.,  0.,  1.]]))

Correct, and they do work OK on real-word graphs, but arpack doesn't, and
its easy to see why (more on that below).

> > The problem arises also on non trivial graphs, by the way. The
> > problem is that doing an EVD of the transition of laplace matrix only
> > gives a subspace of the kernel of the laplace matrix. I could
> > probably do a sparse matrix factorization on that, but I see
> > complexity and cost  coming in, and I am trying to avoid that.

> My linear algebra is only tenuous at best, so I don't exactly see why  
> this is a problem. As far as I understand, to find the connected  
> components, first you find the eigenvectors of the laplacian that have  
> an eigenvalue of zero. Then for each node in the graph i, there will  
> be exactly one eigenvector with a non-zero value at position i. The  
> index of this eigenvector is the index of the connected component that  
> i belongs to.

> Is that right? Again, my linear algebra is rusty.

Well, the problem is that if you have several eigen values that have the
same value (0 for the laplacian, or 1 for the transition matrix), there
is an infinity of eigen vectors defined: any combination of eigen vector
corresponding to that eigen value is an eigen vector. What I am looking
for is a set of particular eigen vectors. I suspect that the property
that defines seem is sparsity (in a machine learning sens, rather than a
sparse linear algebra sens): many of their coefficients are 0. There
machine learning algorithms to find a sparse basis from a non-sparse one,
but first of all it starts getting too complex for my liking, second I am
unsure that the sparsity is really the exact property that will give me
the connect components of my graph.

Gaël



More information about the SciPy-User mailing list