[scikit-learn] Agglomerative Clustering without knowing number of clusters

Ariani A b.noushin7 at gmail.com
Thu Jul 13 19:03:32 EDT 2017


Dear Shane,
Thanks for your prompt answer.
Do you mean that DBSCAN needs no other parameters? Do I just call the
function, or do I have to modify the code?
P.S. I was not able to find the DBSCAN code on GitHub.
Looking forward to hearing from you.
Best,
-Noushin
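
For reference, here is a minimal sketch of the metric='precomputed' usage
suggested in the reply quoted below. The toy data, the eps value, and the
variable names are illustrative assumptions, not taken from the thread.
DBSCAN's eps and min_samples both have defaults, so the call works without
them, but they usually need tuning to the scale of your distances. Note that
min_samples is the neighborhood size a point needs to count as a core point;
it is not a strict guarantee on the final cluster size.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

# Toy data (assumption): two well-separated blobs of 50 points each.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.3, size=(50, 2)),
])

# Symmetric N x N matrix of pairwise distances. In practice you would
# load your own precomputed matrix here instead.
dist = pairwise_distances(points)

# eps=2.0 is chosen for this toy data only; min_samples=40 mirrors the
# "at least 40 points" requirement from the original question.
model = DBSCAN(eps=2.0, min_samples=40, metric="precomputed")
labels = model.fit_predict(dist)

# Label -1 marks noise points that belong to no cluster.
n_clusters = len(set(labels) - {-1})
print("clusters found:", n_clusters)
```

No changes to the library code are needed; the distance matrix is simply
passed where the feature matrix X would normally go.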

On Thu, Jul 13, 2017 at 5:38 PM, Shane Grigsby <shane.grigsby at colorado.edu>
wrote:

> Hi Ariani,
> Yes, you can use a distance matrix-- I think that what you want is
> metric='precomputed', and then X would be your N by N distance matrix.
> Hope that helps,
> ~Shane
>
>
> On 07/13, Ariani A wrote:
>
>> Dear Shane,
>> Thanks for your answer.
>> Does DBSCAN work with a distance matrix? I have a distance matrix
>> (a symmetric matrix containing pairwise distances). Can you help me? I
>> did not find the DBSCAN code in that link.
>> Best,
>> -Ariani
>>
>> On Thu, Jul 6, 2017 at 12:32 PM, Shane Grigsby <
>> shane.grigsby at colorado.edu>
>> wrote:
>>
>>> This sounds like it may be a problem more amenable to either DBSCAN or
>>> OPTICS. Neither algorithm requires a priori knowledge of the number of
>>> clusters, and both let you specify a minimum number of points required
>>> for cluster membership. The OPTICS algorithm will also produce a
>>> dendrogram that you can cut for subclusters if need be.
>>>
>>> DBSCAN is part of the stable release and has been for some time; OPTICS
>>> is pending as a pull request, but it's stable and you can try it if you
>>> like:
>>>
>>> https://github.com/scikit-learn/scikit-learn/pull/1984
>>>
>>> Cheers,
>>> Shane
>>>
>>>
>>> On 06/30, Ariani A wrote:
>>>
>>>> I want to perform agglomerative clustering, but I do not know the number
>>>> of clusters beforehand. I do want every cluster to have at least 40 data
>>>> points. How can I apply this with sklearn's agglomerative clustering?
>>>> Should I use a dendrogram and cut it somehow? I have no idea how to
>>>> relate the dendrogram to this or how to cut it. Any help will be
>>>> appreciated!
>>>>
>>>>
>>>> _______________________________________________
>>>> scikit-learn mailing list
>>>> scikit-learn at python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>>
>>>
>>> --
>>> *PhD candidate & Research Assistant*
>>> *Cooperative Institute for Research in Environmental Sciences (CIRES)*
>>> *University of Colorado at Boulder*