[scikit-learn] [semi-supervised learning] Using a pre-existing graph with LabelSpreading API
Delip Rao
deliprao at gmail.com
Thu Dec 1 22:33:33 EST 2016
Hello,
I have an existing graph dataset in the edge format:
node_i node_j weight
The number of nodes is around 3.6M, and the number of edges is around 72M.
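To make the format concrete, here is a minimal sketch of how such an edge list could be read into a SciPy sparse adjacency matrix. It assumes the node ids are 0-based integers, the fields are whitespace-separated, and each undirected edge appears once; `load_edge_list` is a hypothetical helper name, not part of any library, and `np.loadtxt` is used only for brevity (it is likely too slow for 72M lines).

```python
import io

import numpy as np
from scipy import sparse


def load_edge_list(source, n_nodes=None):
    """Read 'node_i node_j weight' lines into a symmetric CSR matrix.

    Hypothetical helper: assumes 0-based integer node ids.
    """
    edges = np.loadtxt(source, dtype=np.float64, ndmin=2)
    rows = edges[:, 0].astype(np.int64)
    cols = edges[:, 1].astype(np.int64)
    weights = edges[:, 2]
    if n_nodes is None:
        n_nodes = int(max(rows.max(), cols.max())) + 1
    # COO is the natural format for (row, col, value) triples;
    # converting to CSR sums any duplicate entries.
    W = sparse.coo_matrix(
        (weights, (rows, cols)), shape=(n_nodes, n_nodes)
    ).tocsr()
    # Symmetrize, assuming each undirected edge is listed only once.
    return W.maximum(W.T)


# Tiny in-memory stand-in for the real edge file.
demo = io.StringIO("0 1 0.5\n1 2 2.0\n0 3 1.0\n")
W = load_edge_list(demo)
```

At the sizes above, the CSR matrix stores one float and one integer index per stored entry (two entries per undirected edge after symmetrizing), so memory use grows with the edge count rather than with the square of the node count.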
I also have some labeled data (around a dozen per class with 16 classes in
total), so overall, a perfect setting for label propagation or its
variants. In particular, I want to try the LabelSpreading implementation
for the regularization. I looked at the documentation and can't find a way
to plug in a pre-computed graph (or adjacency matrix). So two questions:
1. Are there any scaling issues I should be aware of for a dataset of this
size? I can try sparsifying the graph, but would love to learn about any
knobs I should know of.
2. How do I plug in an existing weighted graph with the current API? Happy
to use any undocumented features.
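For reference, this is roughly the computation I have in mind, written as a minimal sketch directly on a SciPy sparse matrix rather than through the scikit-learn estimator. It follows the standard label spreading iteration F <- alpha * S * F + (1 - alpha) * Y with S = D^(-1/2) W D^(-1/2). The function name `spread_labels`, the use of -1 for unlabeled nodes, and the default values for `alpha`, `max_iter`, and `tol` are assumptions for illustration only.

```python
import numpy as np
from scipy import sparse


def spread_labels(W, labels, alpha=0.2, max_iter=30, tol=1e-3):
    """Label spreading on a precomputed sparse affinity matrix W.

    labels: integer array, with -1 marking unlabeled nodes (assumed).
    Returns the predicted class for every node.
    """
    n = W.shape[0]
    classes = np.unique(labels[labels >= 0])

    # One-hot matrix of the known labels; unlabeled rows stay all-zero.
    Y = np.zeros((n, classes.size))
    for k, c in enumerate(classes):
        Y[labels == c, k] = 1.0

    # Symmetric normalization S = D^(-1/2) W D^(-1/2), guarding
    # against isolated nodes with zero degree.
    degree = np.asarray(W.sum(axis=1)).ravel()
    d_inv_sqrt = np.zeros_like(degree, dtype=np.float64)
    nonzero = degree > 0
    d_inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    D = sparse.diags(d_inv_sqrt)
    S = (D @ W @ D).tocsr()

    F = Y.copy()
    for _ in range(max_iter):
        F_next = alpha * (S @ F) + (1.0 - alpha) * Y
        delta = np.abs(F_next - F).sum()
        F = F_next
        if delta < tol:
            break
    return classes[F.argmax(axis=1)]


# Toy chain 0 - 1 - 2 - 3 with a weak link between nodes 1 and 2.
rows = np.array([0, 1, 1, 2, 2, 3])
cols = np.array([1, 0, 2, 1, 3, 2])
vals = np.array([1.0, 1.0, 0.1, 0.1, 1.0, 1.0])
W_demo = sparse.csr_matrix((vals, (rows, cols)), shape=(4, 4))
labels_demo = np.array([0, -1, -1, 1])
pred = spread_labels(W_demo, labels_demo)
```

Each iteration here is one sparse matrix product against a dense n-by-n_classes array, so the cost per iteration scales with the number of stored edges times the number of classes.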
Thanks in advance!
Delip