[scikit-learn] Agglomerative clustering problem

Uri Goren uri at goren4u.com
Tue Jul 11 13:54:12 EDT 2017


Take a look at scipy's fcluster function, which cuts a hierarchical
clustering into flat clusters.
If M is a matrix with one feature vector per row, this code snippet should
work.

You need to figure out which metric, linkage algorithm (algo) and distance
threshold work for you:

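    from sklearn.metrics import pairwise_distances
    from scipy.cluster import hierarchy
    from scipy.spatial.distance import squareform
    # Square matrix of pairwise distances between the rows of M
    X = pairwise_distances(M, metric=metric)
    # linkage expects a condensed distance matrix, not a square one
    Z = hierarchy.linkage(squareform(X, checks=False), method=algo)
    # Cut the dendrogram: merges above the distance threshold are undone
    C = hierarchy.fcluster(Z, threshold, criterion="distance")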
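
As for the "at least 40 points per cluster" requirement, fcluster has no
such option as far as I know, but one rough way to get there is to try each
merge height as a threshold and keep the lowest one at which every cluster
is big enough. The sketch below is only an illustration: the synthetic M,
the helper name cut_with_min_size and the choice of average linkage are my
assumptions, not something scipy or sklearn provide.

```python
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from sklearn.metrics import pairwise_distances

# Synthetic stand-in for M (assumption): three Gaussian blobs of 60 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
M = np.vstack([c + rng.normal(scale=0.5, size=(60, 2)) for c in centers])

# Illustrative choices; pick the metric and linkage method that suit your data.
metric, algo, min_size = "euclidean", "average", 40

X = pairwise_distances(M, metric=metric)
Z = hierarchy.linkage(squareform(X, checks=False), method=algo)


def cut_with_min_size(Z, min_size):
    """Hypothetical helper: return (threshold, labels) for the lowest merge
    height at which every flat cluster has at least min_size members, or
    None if no cut satisfies the constraint."""
    # Column 2 of the linkage matrix holds the merge heights.
    for threshold in np.unique(Z[:, 2]):
        labels = hierarchy.fcluster(Z, threshold, criterion="distance")
        # fcluster labels start at 1, so drop the empty 0 bin.
        if np.bincount(labels)[1:].min() >= min_size:
            return threshold, labels
    return None


result = cut_with_min_size(Z, min_size)
if result is not None:
    threshold, labels = result
    print("threshold:", threshold)
    print("cluster sizes:", np.bincount(labels)[1:])
```

Note that this is a heuristic: it only considers cuts at a single height, so
it can end up with fewer, larger clusters than a per-branch pruning of the
dendrogram would give.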

Best,
Uri Goren

On Tue, Jul 11, 2017 at 7:42 PM, Ariani A <b.noushin7 at gmail.com> wrote:

> Hi all,
> I want to perform agglomerative clustering, but I have no idea of the
> number of clusters beforehand. But I want every cluster to have at least
> 40 data points in it. How can I apply this to sklearn's agglomerative
> clustering?
> Should I use dendrogram and cut it somehow? I have no idea how to relate
> dendrogram to this and cutting it out. Any help will be appreciated!
> I have to use agglomerative clustering!
> Thanks,
> -Ariani
>


-- 


*Uri Goren,Software innovator*

*Phone: +972-507-649-650*

*EMail: uri at goren4u.com <uri at goren4u.com>*
*Linkedin: il.linkedin.com/in/ugoren/ <http://il.linkedin.com/in/ugoren/>*