[scikit-learn] Agglomerative Clustering without knowing number of clusters

Shane Grigsby shane.grigsby at colorado.edu
Thu Jul 6 12:32:57 EDT 2017


This sounds like a problem better suited to either DBSCAN or OPTICS. 
Neither algorithm requires a priori knowledge of the number of 
clusters, and both let you specify a minimum number of points required 
to form a cluster. The OPTICS algorithm will also produce a dendrogram 
that you can cut for subclusters if need be.
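
As a rough illustration of the DBSCAN route, here is a minimal sketch. 
The synthetic data from make_blobs and the eps value are assumptions 
chosen for this toy example, not values from your problem; eps in 
particular has to be tuned for real data. Note also that min_samples 
sets the density needed for a core point, so it is worth checking the 
resulting cluster sizes rather than assuming each one has at least 40 
members.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Synthetic stand-in data (assumption): three well-separated blobs,
# 200 points each.
X, _ = make_blobs(
    n_samples=600,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# min_samples=40 mirrors the "at least 40 points" requirement.
# eps=1.0 is an assumed neighborhood radius that suits this toy data.
db = DBSCAN(eps=1.0, min_samples=40).fit(X)
labels = db.labels_

# The label -1 marks noise points that belong to no cluster.
cluster_ids = sorted(set(labels) - {-1})
sizes = {int(k): int(np.sum(labels == k)) for k in cluster_ids}
n_noise = int(np.sum(labels == -1))

print("clusters found:", len(cluster_ids))
print("cluster sizes:", sizes)
print("noise points:", n_noise)
```

The number of clusters is never passed in; it falls out of the density 
parameters. Any cluster that comes out smaller than you want can be 
treated as noise or merged in a post-processing step.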

DBSCAN is part of the stable release and has been for some time; OPTICS 
is pending as a pull request, but it's stable and you can try it if you 
like:

https://github.com/scikit-learn/scikit-learn/pull/1984

Cheers,
Shane

On 06/30, Ariani A wrote:
>I want to perform agglomerative clustering, but I don't know the number
>of clusters beforehand. I do want every cluster to have at least 40 data
>points in it. How can I apply this to sklearn's agglomerative clustering?
>Should I use a dendrogram and cut it somehow? I have no idea how to relate
>the dendrogram to this or how to cut it. Any help will be appreciated!



-- 
*PhD candidate & Research Assistant*
*Cooperative Institute for Research in Environmental Sciences (CIRES)*
*University of Colorado at Boulder*

