[scikit-learn] clustering on big dataset

Shiheng Duan shiduan at ucdavis.edu
Wed Jan 3 19:04:18 EST 2018


Yes, BIRCH is an efficient method, but we still need to specify either the
number of clusters or the threshold. Is there another way to run hierarchical
clustering on a big dataset? The main problem is the size of the pairwise
distance matrix, which grows quadratically with the number of samples.
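
To make the question concrete, here is a minimal sketch of one possible
workaround, not a definitive recipe: use BIRCH only to compress the data into
subcluster centroids, then build the full hierarchy on those centroids, so the
distance computation involves the centroids rather than every sample. The
synthetic data, threshold=0.5, Ward linkage and the cut at 5 clusters are all
assumptions for illustration. A threshold is still required, but it only sets
the compression granularity, and the tree can be cut at any level afterwards.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances_argmin

# Synthetic stand-in for the real dataset (assumption for illustration).
X, _ = make_blobs(n_samples=5000, centers=5, n_features=2, random_state=0)

# Step 1: compress the samples into subclusters. n_clusters=None skips
# BIRCH's final global clustering step, so no cluster count is needed here.
# threshold=0.5 is an arbitrary illustrative value.
birch = Birch(threshold=0.5, n_clusters=None)
birch.fit(X)
centers = birch.subcluster_centers_

# Step 2: hierarchical clustering on the centroids only. The pairwise
# distances are computed between centroids, not between all samples.
Z = linkage(centers, method="ward")

# Step 3: cut the tree wherever it is useful (here: at most 5 clusters),
# then give each sample the label of its nearest centroid.
center_labels = fcluster(Z, t=5, criterion="maxclust")
nearest = pairwise_distances_argmin(X, centers)
labels = center_labels[nearest]

print(len(centers), "subclusters,", len(np.unique(labels)), "final clusters")
```

The linkage matrix Z holds the whole tree over the centroids, so it can be
re-cut with a different fcluster criterion without refitting BIRCH.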
Thanks.

On Tue, Jan 2, 2018 at 6:02 AM, Olivier Grisel <olivier.grisel at ensta.org>
wrote:

> Have you had a look at BIRCH?
>
> http://scikit-learn.org/stable/modules/clustering.html#birch
>
> --
> Olivier