[scikit-learn] Long term roadmap and moonshot goals

Andreas Mueller t3kcit at gmail.com
Tue Jul 23 11:52:40 EDT 2019


Can you give an example?

I imagine that just supporting the data structure will not give you any 
speed benefit unless the algorithms are reimplemented to take advantage 
of the problem structure.
Even if the output of logistic regression were a sparse binary 
vector, you'd still need to compute every entry, which would be the slow 
part.
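
A minimal sketch of this point, assuming NumPy and SciPy are available. The
weights W and intercepts b are made-up stand-ins for a set of fitted binary
logistic models, not scikit-learn's implementation. It only illustrates that
the full score matrix is computed before the result can be stored sparsely.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_samples, n_features, n_labels = 200, 50, 30

# Sparse input: about 5% of the entries are non-zero.
X = sparse.random(n_samples, n_features, density=0.05, format="csr",
                  random_state=0)

# Hypothetical parameters standing in for n_labels fitted binary models.
W = rng.normal(size=(n_features, n_labels))
b = np.full(n_labels, -3.0)  # negative intercepts make positive labels rare

# The decision function is evaluated for every (sample, label) pair,
# so this step costs the same however the result is stored afterwards.
scores = np.asarray(X @ W) + b  # dense array, shape (n_samples, n_labels)
entries_computed = scores.size

# Only once all scores exist can the thresholded output be stored sparsely.
Y_sparse = sparse.csr_matrix(scores > 0)

print("entries computed:", entries_computed)
print("entries stored:  ", Y_sparse.nnz)
```

Storing the output sparsely saves memory here, but the number of computed
entries is unchanged. A speed-up would need an algorithm that avoids scoring
most (sample, label) pairs in the first place.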



On 7/23/19 10:36 AM, Piotr Szymański wrote:
> If I could pitch in, it would be lovely, very lovely indeed, if 
> scikit-learn models could:
>
> - operate on sparse data, both input and output by default
> - implement some kind of sparse vector representation (as in 
> https://github.com/scikit-learn/scikit-learn/issues/8908 )
> - perhaps have a unifying numpy.array / scipy.sparse_matrix interface 
> to give people some slack on jumping between [] operator conventions
>
> We would benefit from that strongly in scikit-multilearn, as when a 
> multi-output problem is transformed to a single-output problem based 
> on unique combinations, this representation has to be dense for 
> scikit-learn at the moment. We end up losing some speed there. I'm 
> sure other libraries, e.g. imbalanced-learn or scikit-multiflow, 
> would also see these as a huge thing.
>
> Best,
> Piotr
>
>
>
> On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3kcit at gmail.com> wrote:
>
>     Hi all.
>     At SciPy, Brian Granger raised a good point about their planning
>     for the Jupyter Project, which is the importance of long-term goals.
>
>     I think it's great that we now have a detailed short-term roadmap
>     (https://scikit-learn.org/dev/roadmap.html).
>     Given that we now have about 6(!) full-time people (Oliver, Jeremy,
>     Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I
>     think it's realistic to achieve most of these within a year or two.
>     We have actually made some significant progress already.
>
>     I think now would be a good time to start thinking about a
>     longer-term roadmap, say 3-5 years out.
>     What do we want to achieve? What are realistic goals, and what are
>     moonshot goals?
>     Having a common vision and shared goals might help us with
>     funding, but might also help us with prioritization and motivation.
>
>     What do you think? Do you think this is important and worthwhile?
>     And what should our goals be?
>
>     Best,
>     Andy
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
>
> -- 
> Piotr Szymański
> niedakh at gmail.com
