[scikit-learn] Accessing Clustering Feature Tree in Birch

Sema Atasever s.atasever at gmail.com
Wed Sep 20 07:40:32 EDT 2017


I need this information for a scientific study, and
I think that a function interface would make it easier to obtain.

Thank you for your answer.

On Sat, Sep 16, 2017 at 1:53 PM, Joel Nothman <joel.nothman at gmail.com>
wrote:

> There is no such thing as "the data samples in this cluster". The point of
> Birch being online is that it loses any reference to the individual samples
> that contributed to each node, but stores some statistics on their basis.
> Roman Yurchak has, however, offered a PR where, for the non-online case,
> storage of the indices contributing to each node can be optionally turned
> on: https://github.com/scikit-learn/scikit-learn/pull/8808
>
> As for finding what is contained under any particular node, traversing the
> tree is a fairly basic task from a computer science perspective. Before we
> were to support something to make this much easier, I think we'd need to be
> clear on what kinds of use case we were supporting. What do you hope to do
> with this information, and what would a function interface look like that
> would make this much easier?
>
> Decimals aren't a practical option: the branching factor may be greater
> than 10, the result is a hard structure to inspect, and it is susceptible
> to computational imprecision. You'd be better off with a list of tuples,
> but what about that is not easy enough to do now?
>
>
>
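
For reference, the tree traversal and the "list of tuples" idea from the
quoted message can be sketched roughly as follows. This is only a minimal
sketch under some assumptions: iter_cf_paths is a hypothetical helper name,
not part of scikit-learn, and the attributes it reads (root_, subclusters_,
child_, n_samples_, centroid_) belong to Birch's internal node and
subcluster objects, so they may change between versions. The data is random
and the parameter values are picked only to force a few splits.

```python
import numpy as np
from sklearn.cluster import Birch


def iter_cf_paths(node, path=()):
    """Yield (path, subcluster) for every subcluster beneath `node`.

    `path` is a tuple of child positions starting at the root, so it stays
    unambiguous even when the branching factor exceeds 10.
    (Hypothetical helper; relies on internal attributes of the CF tree.)
    """
    for i, sub in enumerate(node.subclusters_):
        here = path + (i,)
        yield here, sub
        # Non-leaf subclusters point at a child node; leaf ones have None.
        if sub.child_ is not None:
            yield from iter_cf_paths(sub.child_, here)


rng = np.random.RandomState(0)
X = rng.rand(200, 2)  # illustrative data only

# n_clusters=None skips the final global clustering step.
brc = Birch(threshold=0.1, branching_factor=5, n_clusters=None).fit(X)

paths = list(iter_cf_paths(brc.root_))
for path, sub in paths[:5]:
    # Only summary statistics are stored, not the original sample indices.
    print(path, sub.n_samples_, sub.centroid_)
```

Each yielded subcluster carries only aggregate statistics (sample count and
centroid here), which matches the point above that the tree itself keeps no
reference to individual samples.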