[SciPy-User] cs_graph_componets returning wrong results?

Gael Varoquaux gael.varoquaux at normalesup.org
Wed Jan 11 06:01:15 EST 2012


On Wed, Jan 11, 2012 at 07:27:06AM +0100, Gael Varoquaux wrote:
> Now the docstring says that only the upper triangular part of the matrix
> is used. This docstring is clearly wrong in the current state. I'll try
> and fix that.

I have been looking at that problem more closely, and I have pretty much
come to the conclusion that there is no good way of cheaply finding
connected components with non symmetric adjacency matrix.

Technically this boils down to the fact that connect components
algorithms do a graph traversal. With the CSR representation (used by the
algorithm) this graph traversal is cheap only in one direction (jumping
from row to columns). If the adjacency matrix is not symmetric, we are
thus doing a graph traversal on a directed graph, and the connected
component found will be the descendant of the starting nodes.

The cost of building the lookup tables to do traversal in the opposite
direction is pretty much the cost of making the matrix symmetrical. Thus
it seems to me that for such a low level algorithm it is better to simply
tell the user that the input matrix should be symmetric and leave it to
them to comply.

If people agree, I'll just modify the docstring to point this out.

Gaël



More information about the SciPy-User mailing list