From pahome.chen at mirlab.org  Tue Jul  2 00:48:11 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Tue, 2 Jul 2019 12:48:11 +0800
Subject: [scikit-learn] What's the principle of partial_fit?
Message-ID: <CAB3eZftu2e=F3aKZA1t3R7WJtkyuj4zXJiNAqOP_uGaEw+FR9g@mail.gmail.com>

I use Birch's partial_fit because the dataset is too large to load into
memory.

So I cluster the data batch by batch. E.g., I have 50000 samples and every
batch contains 1000 samples.

I found that the clustering result is better when each batch also includes
part of the previous batch than when it contains no previous data.

So I want to know how partial_fit works.
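
For reference, the batch-wise workflow described above can be sketched as follows. This is a minimal illustration, assuming synthetic data from make_blobs in place of the real dataset; the threshold and n_clusters values are placeholders, not recommendations. Each partial_fit call inserts the new batch into the CF tree that Birch already holds, so earlier batches remain represented as subcluster summaries rather than as raw samples.

```python
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

# Synthetic stand-in for data that would normally be streamed from disk:
# 5,000 samples, processed in batches of 1,000.
X, _ = make_blobs(n_samples=5000, centers=5, cluster_std=0.5, random_state=0)

# threshold and n_clusters are illustrative values only.
model = Birch(threshold=0.5, n_clusters=5)

for start in range(0, X.shape[0], 1000):
    batch = X[start:start + 1000]
    # Insert the batch into the existing CF tree instead of rebuilding it.
    model.partial_fit(batch)

# Assign a few samples to the clusters found so far.
labels = model.predict(X[:10])
print(len(model.subcluster_centers_), labels)
```

Because only the subcluster summaries are kept between calls, memory use depends on the number of subclusters rather than on the total number of samples seen.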

From olivier.grisel at ensta.org  Wed Jul  3 04:12:46 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Wed, 3 Jul 2019 10:12:46 +0200
Subject: [scikit-learn] New core developer: jeremiedbb
Message-ID: <CAFvE7K51r0hwpmi-7gi9tZ2+M=titEZapcvtdxg-2PMUq0FUQQ@mail.gmail.com>

The core developers of Scikit-learn have recently voted to welcome
Jérémie Du Boisberranger to the team, in recognition of his efforts
and trustworthiness as a contributor. Jérémie works at Inria Saclay
and is supported by the scikit-learn initiative at Fondation Inria and
its partners.

Congratulations and welcome to the team, Jérémie!

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From adrin.jalali at gmail.com  Fri Jul  5 10:02:18 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Fri, 5 Jul 2019 16:02:18 +0200
Subject: [scikit-learn] New core developer: jeremiedbb
In-Reply-To: <CAFvE7K51r0hwpmi-7gi9tZ2+M=titEZapcvtdxg-2PMUq0FUQQ@mail.gmail.com>
References: <CAFvE7K51r0hwpmi-7gi9tZ2+M=titEZapcvtdxg-2PMUq0FUQQ@mail.gmail.com>
Message-ID: <CAEOrW4_5+NTmvicMM7XzrQ5j36k=wXc7xbvswFvq6XV71nTjpw@mail.gmail.com>

woohoo, congrats Jeremie :)

From charujing123 at 163.com  Sun Jul  7 02:03:52 2019
From: charujing123 at 163.com (charujing123)
Date: Sun, 7 Jul 2019 14:03:52 +0800
Subject: [scikit-learn] how to preprocess in the cross_validate
Message-ID: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>

Hi
It's easy to preprocess when I manually split the data into training and test parts. However, how can I preprocess within sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-07


charujing123 

From oliverrausch99 at gmail.com  Sun Jul  7 04:03:08 2019
From: oliverrausch99 at gmail.com (Oliver Rausch)
Date: Sun, 7 Jul 2019 10:03:08 +0200
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
Message-ID: <CAJ4EJTHCBmgAx0cuUhBowHoVzPhf=TZRbeES04gO_n44bhv35A@mail.gmail.com>

Hi Rujing,
The Pipeline [0] from sklearn may be of interest to you.

Best regards,
Oliver
[0] https://scikit-learn.org/stable/modules/compose.html

From pahome.chen at mirlab.org  Mon Jul  8 06:17:36 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Mon, 8 Jul 2019 18:17:36 +0800
Subject: [scikit-learn] Can I pre-calculate parameter threshold of Birch?
Message-ID: <CAB3eZfs1Y43mOc5sc9GgXBB46P+nvY_jq1-fWnXVhxY1BKazKg@mail.gmail.com>

As I understand it, the threshold limits the radius of the sphere that
encloses the points of a subcluster.

When I tune parameters, I don't know what range of threshold values to
search.

Can I pre-calculate the threshold?

From np.dong572 at gmail.com  Mon Jul  8 09:48:20 2019
From: np.dong572 at gmail.com (Naiping Dong)
Date: Mon, 8 Jul 2019 21:48:20 +0800
Subject: [scikit-learn] Variable kernel density estimation
Message-ID: <CABHU7TTJzR=BNagG2YLQBYyUffg90EfPDYhywZcXHx9LXSuRTQ@mail.gmail.com>

How does sklearn perform cross-validation with "GridSearchCV" for bandwidth
selection? It seems that the CV for kernel density estimation differs
from the one used for classification. Does it use least-squares error for
this purpose?

Second, is it possible to use a variable bandwidth for kernel density
estimation, that is, a different bandwidth for each data point?

Thanks.
-- 
Elkan
Department of Chemistry, HKU, HK

From albertthomas88 at gmail.com  Mon Jul  8 10:04:48 2019
From: albertthomas88 at gmail.com (Albert Thomas)
Date: Mon, 8 Jul 2019 16:04:48 +0200
Subject: [scikit-learn] Variable kernel density estimation
In-Reply-To: <CABHU7TTJzR=BNagG2YLQBYyUffg90EfPDYhywZcXHx9LXSuRTQ@mail.gmail.com>
References: <CABHU7TTJzR=BNagG2YLQBYyUffg90EfPDYhywZcXHx9LXSuRTQ@mail.gmail.com>
Message-ID: <CAK6amUMebtLDQYzmK_cb-bfb2MWT5Gt+73y8VnN=2XjdByJ+Eg@mail.gmail.com>

Hi,

The default score used by GridSearchCV is the estimator's own score
method; for KernelDensity it's the total log-likelihood.

As far as I know it is not possible to have different bandwidths.
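
A minimal sketch of that default behaviour, assuming a synthetic 1-D sample and an arbitrary grid of candidate bandwidths (neither comes from the thread): since no scoring argument is passed, each candidate is ranked by KernelDensity.score on the held-out fold.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 1))  # synthetic sample, for illustration only

# Candidate bandwidths are arbitrary; adjust the range to the data scale.
param_grid = {"bandwidth": np.logspace(-1, 1, 20)}

# No `scoring` argument: GridSearchCV falls back to KernelDensity.score,
# the total log-likelihood of the held-out fold under the fitted density.
search = GridSearchCV(KernelDensity(kernel="gaussian"), param_grid, cv=5)
search.fit(X)

best_bandwidth = search.best_params_["bandwidth"]
print(best_bandwidth, search.best_score_)
```

The selected bandwidth is a single value shared by all points, consistent with the remark above that per-point bandwidths are not available.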

Albert

From charujing123 at 163.com  Mon Jul  8 20:48:04 2019
From: charujing123 at 163.com (charujing123)
Date: Tue, 9 Jul 2019 08:48:04 +0800
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To: <CAJ4EJTHCBmgAx0cuUhBowHoVzPhf=TZRbeES04gO_n44bhv35A@mail.gmail.com>
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com><CAJ4EJTHCBmgAx0cuUhBowHoVzPhf=TZRbeES04gO_n44bhv35A@mail.gmail.com>
Message-ID: <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>

Hi Oliver,
Thanks for your kind reply. I read the manual, but I did not find any option in cross_validate that controls how the transformation is fit. fit_transform can be used for preprocessing in a pipeline, but how do I integrate this into sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-09 

charujing123 


From oliverrausch99 at gmail.com  Tue Jul  9 05:00:05 2019
From: oliverrausch99 at gmail.com (Oliver Rausch)
Date: Tue, 9 Jul 2019 11:00:05 +0200
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To: <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
 <CAJ4EJTHCBmgAx0cuUhBowHoVzPhf=TZRbeES04gO_n44bhv35A@mail.gmail.com>
 <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>
Message-ID: <CAJ4EJTH26UuuzK4fuZpHmp30h3dtXg5zeRYEihPjQEJDitMLnw@mail.gmail.com>

Hi Rujing,
You can integrate the preprocessing into the estimator by building a
pipeline with the preprocessing steps first and an estimator at the end.

For example:
make_pipeline(StandardScaler(), SVC())

This pipeline has a support vector classifier at the end. Calling a
method of the pipeline, for example fit(X, y), will first fit the
StandardScaler on X and transform X with it, and then use the
preprocessed X to fit the SVC.

When you use such a pipeline as the estimator in the cross_validate
function, the preprocessing is applied within each cross-validation
split, like you wanted.
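
A short sketch of that usage, assuming the iris dataset as example data and cv=5 as an arbitrary choice (neither is from the thread):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # example data, not from the thread

pipeline = make_pipeline(StandardScaler(), SVC())

# For every split, cross_validate clones the pipeline, fits the scaler and
# the classifier on the training part, and scores on the held-out part.
results = cross_validate(pipeline, X, y, cv=5)
test_scores = results["test_score"]
print(test_scores)
```

Passing the whole pipeline, rather than pre-scaled data, keeps the scaler from seeing the held-out samples when it is fit.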

Let me know if you have more questions.
Oliver

From charujing123 at 163.com  Wed Jul 10 03:53:58 2019
From: charujing123 at 163.com (charujing123)
Date: Wed, 10 Jul 2019 15:53:58 +0800
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To: <CAJ4EJTH26UuuzK4fuZpHmp30h3dtXg5zeRYEihPjQEJDitMLnw@mail.gmail.com>
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
 <CAJ4EJTHCBmgAx0cuUhBowHoVzPhf=TZRbeES04gO_n44bhv35A@mail.gmail.com>
 <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com><CAJ4EJTH26UuuzK4fuZpHmp30h3dtXg5zeRYEihPjQEJDitMLnw@mail.gmail.com>
Message-ID: <4a10b04.1bae.16bdadff096.Coremail.charujing123@163.com>

Hi Oliver 
Take 5-fold cross-validation as an example. In the cross_validate function, would the StandardScaler be fit on the training data, producing a fitted transformation, and would the test data then be transformed with that same fitted transformation? These two steps would be done 5 times.
If I use make_pipeline(StandardScaler(), SVC()), is this right?
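
The procedure described here can be checked with a small sketch. It assumes the iris dataset and an explicit StratifiedKFold splitter so that the manual loop and cross_validate see identical folds; both choices are illustrative and not from the thread.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # example data, not from the thread
cv = StratifiedKFold(n_splits=5)

manual_scores = []
for train_idx, test_idx in cv.split(X, y):
    # Step 1: fit the scaler on the training fold only.
    scaler = StandardScaler().fit(X[train_idx])
    # Step 2: fit the classifier on the scaled training fold.
    clf = SVC().fit(scaler.transform(X[train_idx]), y[train_idx])
    # Step 3: transform the test fold with the training-fold statistics.
    manual_scores.append(clf.score(scaler.transform(X[test_idx]), y[test_idx]))

pipeline_scores = cross_validate(
    make_pipeline(StandardScaler(), SVC()), X, y, cv=cv
)["test_score"]

# Compare the hand-written loop with the pipeline-based scores.
scores_match = np.allclose(manual_scores, pipeline_scores)
print(scores_match)
```

If the two score arrays agree, the pipeline is doing the fit-on-train, transform-test sequence once per fold, as described in the question.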
Thanks.
Rujing

2019-07-10 

charujing123 


From t3kcit at gmail.com  Sun Jul 14 14:43:39 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Sun, 14 Jul 2019 13:43:39 -0500
Subject: [scikit-learn] Long term roadmap and moonshot goals
Message-ID: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>

Hi all.
At SciPy, Brian Granger raised a good point about their planning for the
Jupyter Project, which is the importance of long-term goals.

I think it's great that we now have a detailed short-term roadmap
(https://scikit-learn.org/dev/roadmap.html).
Given that we now have about 6(!) full-time people (Oliver, Jeremy,
Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
it's realistic to achieve most of these within a year or two. We have
actually made some significant progress already.

I think now would be a good time to start thinking about a longer-term
roadmap, say 3-5 years out.
What do we want to achieve? What are realistic goals, and what are
moonshot goals?
Having a common vision and shared goals might help us with funding, but
might also help us with prioritization and motivation.

What do you think? Do you think this is important and worthwhile?
And what should our goals be?

Best,
Andy

From nshervt at gmail.com  Tue Jul 16 12:01:40 2019
From: nshervt at gmail.com (Navid Shervani-Tabar)
Date: Tue, 16 Jul 2019 12:01:40 -0400
Subject: [scikit-learn] Multi-output regressor and sklearn's RFE module
Message-ID: <CAAaLMm3ShDz=2Wu0A=Q7k12S8BTo4boGKtOVYcBazXhjPWPE7w@mail.gmail.com>

Hello,

I have a question regarding sklearn's RFE module. I have a multi-output
regressor and I would like to reduce the dimensionality of the input using
RFE. However, it seems that RFE does not support a multi-dimensional
output. I was wondering whether there is a workaround for this or whether
there is something conceptually wrong with the idea. You can find a minimal
working example at the following link.

https://stackoverflow.com/questions/57060003/multi-output-regressor-and-sklearns-rfe-module

Thanks!
Navid

From niourf at gmail.com  Wed Jul 17 14:49:00 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Wed, 17 Jul 2019 14:49:00 -0400
Subject: [scikit-learn] Monthly meetings between core developers
Message-ID: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>

Hi Everyone,

The scikit-learn team has been expanding significantly lately: we now have
3 FTEs in NY, 1 in Berlin, and 3 (soon 4) in Paris.

To scale efficiently, I think we should try to communicate more.

*I'd like to propose monthly meetings between the core developers.*
This would be the occasion to:

- communicate what everyone is currently working on

- ask for feedback/reviews on some specific PRs

- keep everybody apprised of the latest news/decisions regarding the
project. Discussions often take place in channels that some of us
may miss.

To keep it efficient, maybe we could have a hard time limit, e.g. 30
mins. One person would be in charge of conducting the meeting, and
another one would take notes. We would rotate these roles every month.
While meetings and notes would be public, only core developers or
strongly involved members would be invited to join the discussion.

I understand the time-zone differences and personal schedules will make
it hard to arrange for all of us to be present. I'm not sure how to
handle this equitably. We could have a document that sets the agenda and
a place to take notes. Members who are not able to join can add items
to the agenda with their thoughts before and after the meeting.

Core-devs, WDYT?

Thanks!

Nicolas

From gael.varoquaux at normalesup.org  Wed Jul 17 15:02:27 2019
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 17 Jul 2019 21:02:27 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID: <20190717190227.cke2mrh6wddp4lsl@phare.normalesup.org>

On Wed, Jul 17, 2019 at 02:49:00PM -0400, Nicolas Hug wrote:
> Core-devs, WDYT?

+1.

The real challenge will be to find a time slot! :)

G

From adrin.jalali at gmail.com  Wed Jul 17 15:05:01 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Wed, 17 Jul 2019 21:05:01 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID: <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>

I'm strongly in favor, thanks for bringing it up.

About the time-zone issue, I know some other teams in the same situation
hold their meetings at different times, so that each time it's
inconvenient for different people.

Also, I know some of us are full-time on the project, but to me having
regular meetings is independent of that, and I'd really like to hear from
people like Joel and Hanmin in these meetings too.

Cheers,
Adrin.

From g.lemaitre58 at gmail.com  Wed Jul 17 15:17:15 2019
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Wed, 17 Jul 2019 21:17:15 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
Message-ID: <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>

I am +1. This is a great initiative.

IMO, we could make it really regular (i.e., a specific weekday of a
specific week of the month), with a rotating time (for the time-zone issue).
That way, we could each clear our schedule well in advance instead of
trying to find a date that accommodates everyone.

Just a thought.

Cheers,


-- 
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/

From t3kcit at gmail.com  Wed Jul 17 18:12:51 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 17 Jul 2019 18:12:51 -0400
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
Message-ID: <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>


On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> I am +1. This is a great initiative.
>
> IMO, we could make it really regular (i.e., a specific week-day of a 
> specific week in a month), with a rolling time (for the time-zone issue).
> In this matter, we could maybe clear more in advance our agenda 
> instead of trying to find a date which accommodates everyone.
>
I agree, we could do something like the last Monday of every month and
alternate between two (or three) different time zones.
We have CET (UTC+1), EST (UTC-5), CT (UTC+8), AEDT (UTC+11) so that
seems super easy, right?
(TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all)
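
To make the arithmetic concrete, the "last Monday of the month" rule and the offsets listed above can be sketched as follows. This is only an illustration: the fixed offsets are copied from the message and ignore daylight saving time, the 12:00 UTC slot is an arbitrary example, and the helper names last_monday and local_times are hypothetical.

```python
import calendar
from datetime import date, datetime, timedelta, timezone


def last_monday(year: int, month: int) -> date:
    """Return the date of the last Monday in the given month."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # weekday() is 0 for Monday, so this steps back 0 to 6 days.
    return last_day - timedelta(days=last_day.weekday())


# Fixed offsets copied from the message; daylight saving time is ignored,
# so real local times may differ by an hour.
OFFSETS = {"CET": 1, "EST": -5, "CT": 8, "AEDT": 11}


def local_times(meeting_utc: datetime) -> dict:
    """Map each zone label to the meeting's local wall-clock time."""
    return {
        name: meeting_utc.astimezone(timezone(timedelta(hours=hours)))
        for name, hours in OFFSETS.items()
    }


day = last_monday(2019, 7)
# 12:00 UTC is an arbitrary example slot, not one proposed in the thread.
meeting = datetime(day.year, day.month, day.day, 12, 0, tzinfo=timezone.utc)
for name, local in local_times(meeting).items():
    print(name, local.strftime("%a %H:%M"))
```

With a 16-hour spread between the westernmost and easternmost offsets, any single slot falls early or late for someone, which is why alternating times comes up in the thread.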

I agree that we should be as inclusive as possible, but I also don't
want to create the expectation that some people (not thinking of any
Australian in particular) who already sacrifice a lot of their free time
have to invest even more time to keep up with the rest.

I think the idea of posting write-ups will help us be more inclusive in
that regard.

From olivier.grisel at ensta.org  Thu Jul 18 02:00:50 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 18 Jul 2019 08:00:50 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
 <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
Message-ID: <CAFvE7K7wZeckk_xjYvUwRi0+7TkGiP+QKO8LxUTH3r6njCRX-Q@mail.gmail.com>

+1 for last Monday of each month. How about the duration? 1h max + breakout
in smaller groups on more specific topics if needed?

From adrin.jalali at gmail.com  Thu Jul 18 02:26:37 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Thu, 18 Jul 2019 08:26:37 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <CAFvE7K7wZeckk_xjYvUwRi0+7TkGiP+QKO8LxUTH3r6njCRX-Q@mail.gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
 <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
 <CAFvE7K7wZeckk_xjYvUwRi0+7TkGiP+QKO8LxUTH3r6njCRX-Q@mail.gmail.com>
Message-ID: <CAEOrW49Q6mfEXvd8pdx3EALgsurvzr23CYdfcgkixwDZoSKeSA@mail.gmail.com>

BTW, where was the meeting for last Monday organized? I don't think I knew
it was happening.

From olivier.grisel at ensta.org  Thu Jul 18 02:38:26 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 18 Jul 2019 08:38:26 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <CAEOrW49Q6mfEXvd8pdx3EALgsurvzr23CYdfcgkixwDZoSKeSA@mail.gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
 <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
 <CAFvE7K7wZeckk_xjYvUwRi0+7TkGiP+QKO8LxUTH3r6njCRX-Q@mail.gmail.com>
 <CAEOrW49Q6mfEXvd8pdx3EALgsurvzr23CYdfcgkixwDZoSKeSA@mail.gmail.com>
Message-ID: <CAFvE7K5NwLcr4okLNAA27Je_hf7kxV-te83X0u175-DfaSDfjA@mail.gmail.com>

On Thu, Jul 18, 2019 at 08:29, Adrin <adrin.jalali at gmail.com> wrote:
>
> BTW, where was the meeting for last Monday organized? I don't think I knew it was happening.

I do not understand what you are referring to. My email was about the
organization of future meetings as suggested by Andreas.

From adrin.jalali at gmail.com  Thu Jul 18 02:41:38 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Thu, 18 Jul 2019 08:41:38 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <CAFvE7K5NwLcr4okLNAA27Je_hf7kxV-te83X0u175-DfaSDfjA@mail.gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
 <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
 <CAFvE7K7wZeckk_xjYvUwRi0+7TkGiP+QKO8LxUTH3r6njCRX-Q@mail.gmail.com>
 <CAEOrW49Q6mfEXvd8pdx3EALgsurvzr23CYdfcgkixwDZoSKeSA@mail.gmail.com>
 <CAFvE7K5NwLcr4okLNAA27Je_hf7kxV-te83X0u175-DfaSDfjA@mail.gmail.com>
Message-ID: <CAEOrW4--zWxOQa-V=bdVibsOD6zx=Qp-c+=7jkqp2qSnsWCzig@mail.gmail.com>

Ah sorry, my eyes skipped the "of the month" part of "last Monday of the
month". My bad!

On Thu., Jul. 18, 2019, 08:39 Olivier Grisel, <olivier.grisel at ensta.org>
wrote:

> On Thu, Jul 18, 2019 at 08:29, Adrin <adrin.jalali at gmail.com> wrote:
> >
> > BTW, where was the meeting for last Monday organized? I don't think I
> knew it was happening.
>
> I do not understand what you are referring to. My email was about the
> organization of future meetings as suggested by Andreas.
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From olivier.grisel at ensta.org  Thu Jul 18 04:57:21 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 18 Jul 2019 10:57:21 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
 <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
Message-ID: <CAFvE7K4b3G_VqNuG_SgYL+9HLPvPiwu6WzN=52BjzN7uj8zJ2A@mail.gmail.com>

I just found this planner to give it a try:

https://www.timeanddate.com/worldclock/meetingtime.html?day=29&month=7&year=2019&p1=240&p2=33&p3=37&p4=179&iv=0

(Berlin and Paris are in the same time zone, so I only included one of them.)

It's going to be challenging to find a timeslot for everybody. The
least extreme timeslot for everybody to attend at the same time would
be:

https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=11&min=0&sec=0&p1=240&p2=33&p3=37&p4=179

We could also arrange for a second timeslot later (that would be
Tuesday morning in Australia and China):

https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=21&min=0&sec=0&p1=240&p2=33&p3=37&p4=179

I wouldn't mind doing a meeting around 11pm on Monday evening from
time to time but it would still be very early for Beijing.
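
As a rough illustration of the two slots above, here is a minimal Python
sketch. It assumes the hours in the planner links (11 and 21) are UTC, and
the city list and fixed July 2019 offsets are my own assumptions rather than
something read from the links:

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets assumed for 29 July 2019: Paris/Berlin on CEST,
# New York on EDT, Beijing on China Standard Time, Sydney on AEST.
OFFSETS_HOURS = {"Paris": 2, "New York": -4, "Beijing": 8, "Sydney": 10}


def local_times(slot_utc):
    """Return {city: local datetime} for a timezone-aware UTC slot."""
    return {
        city: slot_utc.astimezone(timezone(timedelta(hours=hours)))
        for city, hours in OFFSETS_HOURS.items()
    }


for hour in (11, 21):
    slot = datetime(2019, 7, 29, hour, 0, tzinfo=timezone.utc)
    print(f"{hour:02d}:00 UTC")
    for city, local in local_times(slot).items():
        # Weekday plus local wall-clock time, e.g. to spot a day rollover.
        print(f"  {city:<9} {local:%a %H:%M}")
```

Fixed offsets keep the sketch dependency-free; a real scheduler would need
a time zone database to handle daylight saving changes across months.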

Just to let you know, I will be off from next Saturday till Monday
August 19 (big summer break :) so don't count on me for the first
meeting if you start the meetings in the meantime.

On Thu, Jul 18, 2019 at 00:15, Andreas Mueller <t3kcit at gmail.com> wrote:
>
>
>
> On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> > I am +1. This is a great initiative.
> >
> > IMO, we could make it really regular (i.e., a specific week-day of a
> > specific week in a month), with a rolling time (for the time-zone issue).
> > In this matter, we could maybe clear more in advance our agenda
> > instead of trying to find a date which accommodates everyone.
> >
> I agree, we could do something like the last Monday every month and
> alternate between two (or three) different time zones.
> We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (UTC+11) so that
> seems super easy, right?
> (TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all)
>
> I agree that we should be as inclusive as possible, but I also don't
> want to create the expectation that some people (not thinking of any
> Australian in particular)
> who already sacrifice a lot of their free time have to invest even more
> time to keep up with the rest.
>
> I think the idea of posting write-ups will help being more inclusive in
> that regard.
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn


-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From alexandre.gramfort at inria.fr  Thu Jul 18 05:13:54 2019
From: alexandre.gramfort at inria.fr (Alexandre Gramfort)
Date: Thu, 18 Jul 2019 11:13:54 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <CAFvE7K4b3G_VqNuG_SgYL+9HLPvPiwu6WzN=52BjzN7uj8zJ2A@mail.gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
 <CAEOrW4_yEaMf8V3oASb+CnhPwjiMhcPNz8PivLFTGFGsXY_mvw@mail.gmail.com>
 <CACDxx9gqY1Hf9KypeWkcjDMbKOyAJbpP91X3TruVdaYWK=jXpg@mail.gmail.com>
 <b086465c-25ad-e3d0-26c2-48cacaaf6316@gmail.com>
 <CAFvE7K4b3G_VqNuG_SgYL+9HLPvPiwu6WzN=52BjzN7uj8zJ2A@mail.gmail.com>
Message-ID: <CADeotZp8Nh5Z_T-f-8uxjaq-iaya10WxWwBqaHpT_Md+Eb_qNQ@mail.gmail.com>

hi,

I kind of like the project boards we used for sprints:

https://github.com/scikit-learn/scikit-learn/projects/11

the outcome of the core devs meeting could be to agree what
should be listed on such a priority board.

my 2c
Alex


On Thu, Jul 18, 2019 at 10:59 AM Olivier Grisel
<olivier.grisel at ensta.org> wrote:
>
> I just found this planner to give it a try:
>
> https://www.timeanddate.com/worldclock/meetingtime.html?day=29&month=7&year=2019&p1=240&p2=33&p3=37&p4=179&iv=0
>
> (Berlin and Paris are in the same time zone, so I only included one of them.)
>
> It's going to be challenging to find a timeslot for everybody. The
> least extreme timeslot for everybody to attend at the same time would
> be:
>
> https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=11&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
>
> We could also arrange for a second timeslot later (that would be
> Tuesday morning in Australia and China):
>
> https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=21&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
>
> I wouldn't mind doing a meeting around 11pm on Monday evening from
> time to time but it would still be very early for Beijing.
>
> Just to let you know, I will be off from next Saturday till Monday
> August 19 (big summer break :) so don't count on me for the first
> meeting if you start the meetings in the meantime.
>
> On Thu, Jul 18, 2019 at 00:15, Andreas Mueller <t3kcit at gmail.com> wrote:
> >
> >
> >
> > On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> > > I am +1. This is a great initiative.
> > >
> > > IMO, we could make it really regular (i.e., a specific week-day of a
> > > specific week in a month), with a rolling time (for the time-zone issue).
> > > In this matter, we could maybe clear more in advance our agenda
> > > instead of trying to find a date which accommodates everyone.
> > >
> > I agree, we could do something like the last Monday every month and
> > alternate between two (or three) different time zones.
> > We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (UTC+11) so that
> > seems super easy, right?
> > (TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all)
> >
> > I agree that we should be as inclusive as possible, but I also don't
> > want to create the expectation that some people (not thinking of any
> > Australian in particular)
> > who already sacrifice a lot of their free time have to invest even more
> > time to keep up with the rest.
> >
> > I think the idea of posting write-ups will help being more inclusive in
> > that regard.
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn at python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
>
>
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

From qinhanmin2005 at sina.com  Thu Jul 18 06:04:51 2019
From: qinhanmin2005 at sina.com (Hanmin Qin)
Date: Thu, 18 Jul 2019 18:04:51 +0800
Subject: [scikit-learn] Monthly meetings between core developers
Message-ID: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>

I don't think it's worthwhile to worry too much about Beijing time if I'm the only person in Beijing. I'm happy to get up early once a month to learn from the great team!
Hanmin Qin
----- Original Message -----
From: Alexandre Gramfort <alexandre.gramfort at inria.fr>
To: Scikit-learn mailing list <scikit-learn at python.org>
Subject: Re: [scikit-learn] Monthly meetings between core developers
Date: 2019-07-18 17:16

hi,
I kind of like the project boards we used for sprints:
https://github.com/scikit-learn/scikit-learn/projects/11
the outcome of the core devs meeting could be to agree what
should be listed on such a priority board.
my 2c
Alex
On Thu, Jul 18, 2019 at 10:59 AM Olivier Grisel
<olivier.grisel at ensta.org> wrote:
>
> I just found this planner to give it a try:
>
> https://www.timeanddate.com/worldclock/meetingtime.html?day=29&month=7&year=2019&p1=240&p2=33&p3=37&p4=179&iv=0
>
> (Berlin and Paris are in the same time zone, so I only included one of them.)
>
> It's going to be challenging to find a timeslot for everybody. The
> least extreme timeslot for everybody to attend at the same time would
> be:
>
> https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=11&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
>
> We could also arrange for a second timeslot later (that would be
> Tuesday morning in Australia and China):
>
> https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=21&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
>
> I wouldn't mind doing a meeting around 11pm on Monday evening from
> time to time but it would still be very early for Beijing.
>
> Just to let you know, I will be off from next Saturday till Monday
> August 19 (big summer break :) so don't count on me for the first
> meeting if you start the meetings in the meantime.
>
> On Thu, Jul 18, 2019 at 00:15, Andreas Mueller <t3kcit at gmail.com> wrote:
> >
> >
> >
> > On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> > > I am +1. This is a great initiative.
> > >
> > > IMO, we could make it really regular (i.e., a specific week-day of a
> > > specific week in a month), with a rolling time (for the time-zone issue).
> > > In this matter, we could maybe clear more in advance our agenda
> > > instead of trying to find a date which accommodates everyone.
> > >
> > I agree, we could do something like the last Monday every month and
> > alternate between two (or three) different time zones.
> > We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (UTC+11) so that
> > seems super easy, right?
> > (TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all)
> >
> > I agree that we should be as inclusive as possible, but I also don't
> > want to create the expectation that some people (not thinking of any
> > Australian in particular)
> > who already sacrifice a lot of their free time have to invest even more
> > time to keep up with the rest.
> >
> > I think the idea of posting write-ups will help being more inclusive in
> > that regard.
> > _______________________________________________
> > scikit-learn mailing list
> > scikit-learn at python.org
> > https://mail.python.org/mailman/listinfo/scikit-learn
>
>
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
_______________________________________________
scikit-learn mailing list
scikit-learn at python.org
https://mail.python.org/mailman/listinfo/scikit-learn

From joel.nothman at gmail.com  Thu Jul 18 19:57:20 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Fri, 19 Jul 2019 09:57:20 +1000
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
Message-ID: <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>

I'm away on a holiday at the moment (in case you hadn't identified my
silence). I'd be keen to join in but might not be able to move schedules
around it. I like the idea of prioritising together, though I'm not sure
how to keep the meetings clipped.

I'm also going to be quite lost on the issue tracker when I return to
civilisation next week... (Are we still on for two patch releases?)

J

From pahome.chen at mirlab.org  Fri Jul 19 05:39:27 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Fri, 19 Jul 2019 17:39:27 +0800
Subject: [scikit-learn] Any machine learning used in storage company?
Message-ID: <CAB3eZftDb7H=ipjD=OPzEEfF5TuXHhyrcfMdrNMUrTqh7mqZOg@mail.gmail.com>

Are there any machine learning applications used by storage companies?

Can anyone briefly describe which applications are used at which companies?

thx

From marmochiaskl at gmail.com  Fri Jul 19 08:46:37 2019
From: marmochiaskl at gmail.com (Chiara Marmo)
Date: Fri, 19 Jul 2019 14:46:37 +0200
Subject: [scikit-learn] Monthly meetings between core developers +
 "Hello World"
In-Reply-To: <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
 <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>
Message-ID: <CAGfF14_7VJfcpgTqd_JPu51oO2bJJTyDfV6Oy9eHVzye=d2pmw@mail.gmail.com>

Dear list,

I'm Chiara. In September I will start working full time for the
Scikit-Learn Consortium at INRIA (France). My background is in Astronomy
and Planetary Science: I've worked there as a Research Engineer for around
15 years, writing code, mining data and managing some projects.

One of my tasks at the Consortium will be to take care of our connection
with the developer community, so let me know if I can help in managing
those monthly meetings in some way.
In the meantime, may I suggest creating a GitHub team for core developers
in the scikit-learn organization? As Alexandre said, team-specific projects
and discussions on GitHub could be a way to efficiently prepare meetings
and prioritize issues.

Thanks for listening,
have a nice day.
Chiara

From milton.pifanos at gmail.com  Mon Jul 22 09:16:35 2019
From: milton.pifanos at gmail.com (Milton Pifano)
Date: Mon, 22 Jul 2019 10:16:35 -0300
Subject: [scikit-learn] Test Sample Size
Message-ID: <CAC2OfOOUk3Eu_9HwD3FSo4Ht4HWSzR+PRS4eJ1q2-xKde7-=4w@mail.gmail.com>

Dear scikit-learn subscribers.

I am working on a multiclass classification project and I have found many
resources about how to deal with an imbalanced dataset for training, but I
have not been able to find any reference on the test dataset size.
Can anyone send some references?

Thanks,
Milton Pifano

From adrin.jalali at gmail.com  Mon Jul 22 09:22:51 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Mon, 22 Jul 2019 15:22:51 +0200
Subject: [scikit-learn] Monthly meetings between core developers +
 "Hello World"
In-Reply-To: <CAGfF14_7VJfcpgTqd_JPu51oO2bJJTyDfV6Oy9eHVzye=d2pmw@mail.gmail.com>
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
 <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>
 <CAGfF14_7VJfcpgTqd_JPu51oO2bJJTyDfV6Oy9eHVzye=d2pmw@mail.gmail.com>
Message-ID: <CAEOrW48LZDTkAd9yX==cxHBuGEbZApg9QWFrLsAmVFXEpdN-xA@mail.gmail.com>

Awesome, excited to have your help around :)

We already have the @core-devs team on GitHub; we can use it more
often and in a more organized way.

On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo <marmochiaskl at gmail.com> wrote:

> Dear list,
>
> I'm Chiara. In September I will start working full time for the
> Scikit-Learn Consortium at INRIA (France). My background is in Astronomy
> and Planetary Science: I've worked there as a Research Engineer for around
> 15 years, writing code, mining data and managing some projects.
>
> One of my tasks at the Consortium will be to take care of our connection
> with the developer community, so let me know if I can help in managing
> those monthly meetings in some way.
> In the meantime, may I suggest creating a GitHub team for core
> developers in the scikit-learn organization? As Alexandre said,
> team-specific projects and discussions on GitHub could be a way to
> efficiently prepare meetings and prioritize issues.
>
> Thanks for listening,
> have a nice day.
> Chiara
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From jbbrown at kuhp.kyoto-u.ac.jp  Mon Jul 22 09:24:26 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Mon, 22 Jul 2019 22:24:26 +0900
Subject: [scikit-learn] Test Sample Size
In-Reply-To: <CAC2OfOOUk3Eu_9HwD3FSo4Ht4HWSzR+PRS4eJ1q2-xKde7-=4w@mail.gmail.com>
References: <CAC2OfOOUk3Eu_9HwD3FSo4Ht4HWSzR+PRS4eJ1q2-xKde7-=4w@mail.gmail.com>
Message-ID: <CAJe_vxD8g=+opwKw5zMdLQYbAr3F-WBm33QtDgKS3iiSEj+ovg@mail.gmail.com>

Dear Milton,

It is just my opinion based on many experiences, but if you want to
stress-test your estimator, make your test set at least as big as, if not
bigger than, the training set.
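
As a rough, library-free sketch of that advice (the label counts below are
made up for illustration, and the helper name is hypothetical), one could
hold out half of every class so that rare classes are still represented in
the test set. In scikit-learn, `train_test_split` with its `test_size` and
`stratify` parameters serves the same purpose:

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.5, seed=0):
    """Split sample indices so each class keeps about the same share in both parts."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Number of samples of this class that go to the test set.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced labels: 80 samples of "a", 16 of "b", 4 of "c".
labels = ["a"] * 80 + ["b"] * 16 + ["c"] * 4
train_idx, test_idx = stratified_split(labels, test_fraction=0.5)
print(len(train_idx), len(test_idx))
```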

Sincerely,
J.B.

On Mon, Jul 22, 2019 at 22:18, Milton Pifano <milton.pifanos at gmail.com> wrote:

> Dear scikit-learn subscribers.
>
> I am working on a multiclass classification project and I have found
> many resources about how to deal with an imbalanced dataset for training,
> but I have not been able to find any reference on the test dataset size.
> Can anyone send some references?
>
> Thanks,
> Milton Pifano
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From adrin.jalali at gmail.com  Mon Jul 22 09:51:12 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Mon, 22 Jul 2019 15:51:12 +0200
Subject: [scikit-learn] Continues monitoring of benchmark performances
Message-ID: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>

Hi,

There is this [page](https://pandas.pydata.org/speed/scikit-learn/)
maintained by some of the pandas maintainers (@TomAugspurger in
particular), and it seems like a really good idea to keep an eye on the
performance of different benchmarks over time, in case a PR
introduces a major performance regression.

However, he doesn't have the bandwidth to maintain it much more, and not
really the hardware. I think it'd be a good idea for us to have that,
wanted to bring it up and see what you think!
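
To make the idea concrete, here is a minimal sketch of a timing-based
regression check in plain Python. It is not how the linked page or any
existing tool works; the function names, the stand-in workload and the 20%
tolerance are all assumptions for illustration:

```python
import timeit


def best_time(func, repeat=5, number=100):
    """Best-of-`repeat` time in seconds for `number` calls to `func`."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


def is_regression(current, baseline, tolerance=0.2):
    """True when `current` is slower than `baseline` by more than `tolerance` (a fraction)."""
    return current > baseline * (1.0 + tolerance)


def workload():
    # Stand-in for a real benchmark, such as fitting an estimator.
    return sorted(range(1000, 0, -1))


baseline_seconds = best_time(workload)  # would normally come from stored history
current_seconds = best_time(workload)   # would be measured on the PR branch
print("regression" if is_regression(current_seconds, baseline_seconds) else "ok")
```

Taking the minimum over several repeats reduces noise from other processes,
which matters when the check runs on shared hardware.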

Cheers,
Adrin.

From t3kcit at gmail.com  Mon Jul 22 09:54:57 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Mon, 22 Jul 2019 09:54:57 -0400
Subject: [scikit-learn] Monthly meetings between core developers +
 "Hello World"
In-Reply-To: <CAEOrW48LZDTkAd9yX==cxHBuGEbZApg9QWFrLsAmVFXEpdN-xA@mail.gmail.com>
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
 <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>
 <CAGfF14_7VJfcpgTqd_JPu51oO2bJJTyDfV6Oy9eHVzye=d2pmw@mail.gmail.com>
 <CAEOrW48LZDTkAd9yX==cxHBuGEbZApg9QWFrLsAmVFXEpdN-xA@mail.gmail.com>
Message-ID: <b57b51be-478d-2098-ee3b-fceaec21d906@gmail.com>


On 7/22/19 9:22 AM, Adrin wrote:
> Awesome, excited to have your help around :)
>
> We already have the @core-devs team on GitHub; we can use it more
> often and in a more organized way.

Why wouldn't we just use the scikit-learn repo projects?


>
> On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo <marmochiaskl at gmail.com> wrote:
>
>     Dear list,
>
>     I'm Chiara. In September I will start working full time for the
>     Scikit-Learn Consortium at INRIA (France). My background is in
>     Astronomy and Planetary Science: I've worked there as a Research
>     Engineer for around 15 years, writing code, mining data and
>     managing some projects.
>
>     One of my tasks at the Consortium will be to take care of our
>     connection with the developer community, so let me know if I can
>     help in managing those monthly meetings in some way.
>     In the meantime, may I suggest creating a GitHub team for core
>     developers in the scikit-learn organization? As Alexandre said,
>     team-specific projects and discussions on GitHub could be a way to
>     efficiently prepare meetings and prioritize issues.
>
>     Thanks for listening,
>     have a nice day.
>     Chiara
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

From adrin.jalali at gmail.com  Mon Jul 22 09:57:37 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Mon, 22 Jul 2019 15:57:37 +0200
Subject: [scikit-learn] Monthly meetings between core developers +
 "Hello World"
In-Reply-To: <b57b51be-478d-2098-ee3b-fceaec21d906@gmail.com>
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
 <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>
 <CAGfF14_7VJfcpgTqd_JPu51oO2bJJTyDfV6Oy9eHVzye=d2pmw@mail.gmail.com>
 <CAEOrW48LZDTkAd9yX==cxHBuGEbZApg9QWFrLsAmVFXEpdN-xA@mail.gmail.com>
 <b57b51be-478d-2098-ee3b-fceaec21d906@gmail.com>
Message-ID: <CAEOrW4-L08RDzKkidBb5r0UXUzRJ9rrjEhGUUPY+ogezQ1E4Wg@mail.gmail.com>

That's kinda what I meant. I didn't mean to limit the access to the project
to @core-devs, I meant they can be pinged.

On Mon, Jul 22, 2019 at 3:56 PM Andreas Mueller <t3kcit at gmail.com> wrote:

>
> On 7/22/19 9:22 AM, Adrin wrote:
>
> Awesome, excited to have your help around :)
>
> We already have the @core-devs team on GitHub; we can use it more
> often and in a more organized way.
>
> Why wouldn't we just use the scikit-learn repo projects?
>
>
>
> On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo <marmochiaskl at gmail.com>
> wrote:
>
>> Dear list,
>>
>> I'm Chiara. In September I will start working full time for the
>> Scikit-Learn Consortium at INRIA (France). My background is in Astronomy
>> and Planetary Science: I've worked there as a Research Engineer for around
>> 15 years, writing code, mining data and managing some projects.
>>
>> One of my tasks at the Consortium will be to take care of our connection
>> with the developer community, so let me know if I can help in managing
>> those monthly meetings in some way.
>> In the meantime, may I suggest creating a GitHub team for core
>> developers in the scikit-learn organization? As Alexandre said,
>> team-specific projects and discussions on GitHub could be a way to
>> efficiently prepare meetings and prioritize issues.
>>
>> Thanks for listening,
>> have a nice day.
>> Chiara
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From tom.augspurger88 at gmail.com  Mon Jul 22 09:59:51 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Mon, 22 Jul 2019 08:59:51 -0500
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>
References: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>
Message-ID: <CAE1aY-=bKtz7eAam-LGTXxHeUHEP_1Ou_pBacrb=wHOOu=BnUg@mail.gmail.com>

Thanks Adrin,

A month or so ago I started running scikit-learn benchmarks, but I had to
disable them since they were taking too long (longer than a day).
I haven't had time to investigate why, but I assume it was an issue with
how I set them up.

Just FYI, I'm planning to include "maintain and improve the benchmark
running tools" as part of the pandas' application for the CZI grant.
All that is in https://github.com/asv-runner (a mix of Ansible, Airflow,
and GitHub bots). If anyone is interested in (possibly) having funding
to work on this, feel free to reach out to me off list and we can discuss
things.

Tom

On Mon, Jul 22, 2019 at 8:53 AM Adrin <adrin.jalali at gmail.com> wrote:

> Hi,
>
> There is this [page](https://pandas.pydata.org/speed/scikit-learn/)
> maintained by some of the pandas maintainers (@TomAugspurger in
> particular), and it seems like a really good idea to keep an eye on the
> performance of different benchmarks over time, in case a PR
> introduces a major performance regression.
>
> However, he doesn't have the bandwidth to maintain it much more, and not
> really the hardware. I think it'd be a good idea for us to have that,
> wanted to bring it up and see what you think!
>
> Cheers,
> Adrin.
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From niourf at gmail.com  Mon Jul 22 10:14:10 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Mon, 22 Jul 2019 10:14:10 -0400
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To: <CAE1aY-=bKtz7eAam-LGTXxHeUHEP_1Ou_pBacrb=wHOOu=BnUg@mail.gmail.com>
References: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>
 <CAE1aY-=bKtz7eAam-LGTXxHeUHEP_1Ou_pBacrb=wHOOu=BnUg@mail.gmail.com>
Message-ID: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>

I agree that having non-regression benchmarks would be very helpful. A
seemingly simple change in Cython code can lead to a drastic performance drop.

I can't find it again, but I think Jérémie has submitted an issue about this?


On 7/22/19 9:59 AM, Tom Augspurger wrote:
> Thanks Adrin,
>
> A month or so ago I started running scikit-learn benchmarks, but I had 
> to disable them since they were taking too long (longer than a day).
> I haven't had time to investigate why, but I assume it was an issue 
> with how I set them up.
>
> Just FYI, I'm planning to include "maintain and improve the benchmark 
> running tools" as part of the pandas' application for the CZI grant.
> All that is in https://github.com/asv-runner (a mix of Ansible, 
> Airflow, and GitHub bots). If anyone is interested in (possibly) 
> having funding
> to work on this, feel free to reach out to me off list and we can 
> discuss things.
>
> Tom
>
> On Mon, Jul 22, 2019 at 8:53 AM Adrin <adrin.jalali at gmail.com> wrote:
>
>     Hi,
>
>     There is this
>     [page](https://pandas.pydata.org/speed/scikit-learn/) maintained
>     by some of the pandas maintainers (@TomAugspurger in particular),
>     and it seems like a really good idea to keep an eye on the
>     performance of different benchmarks over time, in case a PR
>     introduces a major performance regression.
>
>     However, he doesn't have the bandwidth to maintain it much more,
>     and not really the hardware. I think it'd be a good idea for us to
>     have that, wanted to bring it up and see what you think!
>
>     Cheers,
>     Adrin.
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

From joel.nothman at gmail.com  Mon Jul 22 20:12:34 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Tue, 23 Jul 2019 10:12:34 +1000
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
References: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>
 <CAE1aY-=bKtz7eAam-LGTXxHeUHEP_1Ou_pBacrb=wHOOu=BnUg@mail.gmail.com>
 <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
Message-ID: <CAAkaFLXRJ8Rp0yw9ET=Qxsu51iLnDOuQWXz=w4Pmb0rSBKCq-g@mail.gmail.com>

Isn't Jérémie's project at
https://github.com/jeremiedbb/scikit-learn_benchmarks meant to be doing
this? What's its status? How does it relate to Tom's work?

(Can we please take http://scikit-learn.org/ml-benchmarks/ offline?)

On Tue, 23 Jul 2019 at 00:17, Nicolas Hug <niourf at gmail.com> wrote:

> I agree that having non-regression benchmarks would be very helpful. A
> seemingly simple change in Cython code can lead to a drastic performance drop.
>
> I can't find it again, but I think Jérémie has submitted an issue about this?
>
> On 7/22/19 9:59 AM, Tom Augspurger wrote:
>
> Thanks Adrin,
>
> A month or so ago I started running scikit-learn benchmarks, but I had to
> disable them since they were taking too long (longer than a day).
> I haven't had time to investigate why, but I assume it was an issue with
> how I set them up.
>
> Just FYI, I'm planning to include "maintain and improve the benchmark
> running tools" as part of the pandas' application for the CZI grant.
> All that is in https://github.com/asv-runner (a mix of Ansible, Airflow,
> and GitHub bots). If anyone is interested in (possibly) having funding
> to work on this, feel free to reach out to me off list and we can discuss
> things.
>
> Tom
>
> On Mon, Jul 22, 2019 at 8:53 AM Adrin <adrin.jalali at gmail.com> wrote:
>
>> Hi,
>>
>> There is this [page](https://pandas.pydata.org/speed/scikit-learn/)
>> maintained by some of the pandas maintainers (@TomAugspurger in
>> particular), and it seems like a really good idea to keep an eye on the
>> performance of different benchmarks over time, in case a PR
>> introduces a major performance regression.
>>
>> However, he doesn't have the bandwidth to maintain it much more, and not
>> really the hardware. I think it'd be a good idea for us to have that,
>> wanted to bring it up and see what you think!
>>
>> Cheers,
>> Adrin.
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From alexandre.gramfort at inria.fr  Tue Jul 23 02:25:16 2019
From: alexandre.gramfort at inria.fr (Alexandre Gramfort)
Date: Tue, 23 Jul 2019 08:25:16 +0200
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To: <CAAkaFLXRJ8Rp0yw9ET=Qxsu51iLnDOuQWXz=w4Pmb0rSBKCq-g@mail.gmail.com>
References: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>
 <CAE1aY-=bKtz7eAam-LGTXxHeUHEP_1Ou_pBacrb=wHOOu=BnUg@mail.gmail.com>
 <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
 <CAAkaFLXRJ8Rp0yw9ET=Qxsu51iLnDOuQWXz=w4Pmb0rSBKCq-g@mail.gmail.com>
Message-ID: <CADeotZqbPx3CiVyLH7B=6CKw6JY84VkV_+-VEJ3WzZC-aMBPjw@mail.gmail.com>

It's the same project.

Alex

From jeremie.du-boisberranger at inria.fr  Tue Jul 23 04:37:24 2019
From: jeremie.du-boisberranger at inria.fr (Jeremie du Boisberranger)
Date: Tue, 23 Jul 2019 10:37:24 +0200
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To: <CADeotZqbPx3CiVyLH7B=6CKw6JY84VkV_+-VEJ3WzZC-aMBPjw@mail.gmail.com>
References: <CAEOrW486qGcFF76OezNMMJcUKnkw=ASOuoMmzw2qe16MczbaKw@mail.gmail.com>
 <CAE1aY-=bKtz7eAam-LGTXxHeUHEP_1Ou_pBacrb=wHOOu=BnUg@mail.gmail.com>
 <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
 <CAAkaFLXRJ8Rp0yw9ET=Qxsu51iLnDOuQWXz=w4Pmb0rSBKCq-g@mail.gmail.com>
 <CADeotZqbPx3CiVyLH7B=6CKw6JY84VkV_+-VEJ3WzZC-aMBPjw@mail.gmail.com>
Message-ID: <27b2229e-6840-99b2-e5bf-ace5d92017ee@inria.fr>

 > Isn't Jérémie's project at
 > https://github.com/jeremiedbb/scikit-learn_benchmarks meant to be doing
 > this? What's its status? How does it relate to Tom's work?

Yes, it's the same project. Tom kindly agreed to run it alongside other 
projects from the pydata ecosystem.


 > I can't find it back but I think Jérémie has submitted an issue about
 > this?

I didn't submit an issue, but I briefly mentioned it at the Paris sprint in 
February.


 > but I had to disable them since they were taking too long (longer
 > than a day)

Something is definitely going wrong. It should take 1-2 hours (it 
does on my laptop).

Before it was running on Tom's machine, we were considering running it 
on a dedicated machine at INRIA.

Maybe it will be better to do that after all.


 > What's its status?

After seeing it run for a few weeks, I think it still needs some more work, 
in particular a more readable presentation of the results. Some benchmarks 
show large fluctuations; it might be the hardware, or maybe my settings are 
not good.
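
One way to see how large the run-to-run fluctuations are is to time the same fit several times and compare the spread with the best time. The sketch below is only an illustration and is not taken from scikit-learn_benchmarks: the estimator (KMeans), the data size and the helper name `time_fit` are assumptions chosen for the example.

```python
import statistics
import timeit

import numpy as np
from sklearn.cluster import KMeans


def time_fit(n_repeats=5, n_samples=2000, n_features=10, seed=0):
    """Time the same KMeans fit several times (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    X = rng.rand(n_samples, n_features)

    def fit_once():
        KMeans(n_clusters=8, n_init=1, random_state=seed).fit(X)

    # One fit per measurement; repeating exposes run-to-run variation.
    return timeit.repeat(fit_once, number=1, repeat=n_repeats)


timings = time_fit()
best = min(timings)
# Relative spread: standard deviation of the repeats divided by the best time.
spread = statistics.stdev(timings) / best
print(f"best: {best:.4f}s, relative spread: {spread:.1%}")
```

A large relative spread on identical work would suggest that the machine, rather than the benchmarked code, is the source of the noise.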


On 23/07/2019 08:25, Alexandre Gramfort wrote:
> It's the same project.
>
> Alex
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

From adrin.jalali at gmail.com  Tue Jul 23 07:51:43 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Tue, 23 Jul 2019 13:51:43 +0200
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID: <CAEOrW4-8EMpTz51jZDp9EBnpzRyKSZtaL-E8_-nOtiFuchMzvQ@mail.gmail.com>

It may be worth doing a user survey to get a feel for what people care
about; we may or may not take the results into account afterwards.

Here's how Dask is doing it: https://github.com/dask/dask/issues/4748

On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3kcit at gmail.com> wrote:

> Hi all.
> At SciPy, Brian Granger raised a good point about their planning for the
> Jupyter Project, which is the importance of long-term goals.
>
> I think it's great that we now have a detailed short-term roadmap
> (https://scikit-learn.org/dev/roadmap.html).
> Given that we now have about 6(!) full time people (Oliver, Jeremy,
> Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
> it's realistic
> to achieve most of these within a year or two. We have actually made
> some significant progress already.
>
> I think now would be a good time to start thinking about a longer-term
> roadmap, say 3-5 years out.
> What do we want to achieve? What are realistic goals, and what are
> moonshot goals?
> Having a common vision and shared goals might help us with funding, but
> might also help us with prioritization and motivation.
>
> What do you think? Do you think this is important and worth-while?
> And what should our goals be?
>
> Best,
> Andy
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From tom.augspurger88 at gmail.com  Tue Jul 23 11:28:30 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 23 Jul 2019 10:28:30 -0500
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <CAEOrW4-8EMpTz51jZDp9EBnpzRyKSZtaL-E8_-nOtiFuchMzvQ@mail.gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
 <CAEOrW4-8EMpTz51jZDp9EBnpzRyKSZtaL-E8_-nOtiFuchMzvQ@mail.gmail.com>
Message-ID: <CAE1aY-nAKBWXxoAeUbJnnf0QtxF+1vupYsBc8b=5-4xLOA0ADA@mail.gmail.com>

Pandas will be running one soon too:
https://github.com/pandas-dev/pandas/issues/27477

It may be worth coordinating on questions so that we can compare
communities (or combining surveys to reduce "survey-fatigue" somehow?
Haven't thought through this).

Tom

On Tue, Jul 23, 2019 at 6:54 AM Adrin <adrin.jalali at gmail.com> wrote:

> It may be worth doing a user survey to get a feeling of what people care
> about, we may or may not take them into account afterwards.
>
> Here's how Dask is doing it: https://github.com/dask/dask/issues/4748
>
> On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3kcit at gmail.com> wrote:
>
>> Hi all.
>> At SciPy, Brian Granger raised a good point about their planning for the
>> Jupyter Project, which is the importance of long-term goals.
>>
>> I think it's great that we now have a detailed short-term roadmap
>> (https://scikit-learn.org/dev/roadmap.html).
>> Given that we now have about 6(!) full time people (Oliver, Jeremy,
>> Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
>> it's realistic
>> to achieve most of these within a year or two. We have actually made
>> some significant progress already.
>>
>> I think now would be a good time to start thinking about a longer-term
>> roadmap, say 3-5 years out.
>> What do we want to achieve? What are realistic goals, and what are
>> moonshot goals?
>> Having a common vision and shared goals might help us with funding, but
>> might also help us with prioritization and motivation.
>>
>> What do you think? Do you think this is important and worth-while?
>> And what should our goals be?
>>
>> Best,
>> Andy
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From niedakh at gmail.com  Tue Jul 23 11:36:24 2019
From: niedakh at gmail.com (Piotr Szymański)
Date: Tue, 23 Jul 2019 17:36:24 +0200
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID: <CAOMdbFDM3thhjziccmTX0GGphHpz4svqF7fGGDZHd4QQgk5NTw@mail.gmail.com>

If I could pitch in, it would be lovely, very lovely indeed, if
scikit-learn models could:

- operate on sparse data, both input and output by default
- implement some kind of sparse vector representation (as in
https://github.com/scikit-learn/scikit-learn/issues/8908 )
- perhaps have a unifying numpy.array / scipy.sparse_matrix interface to
give people some slack on jumping between [] operator conventions
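
The difference in [] conventions mentioned in the last point can be seen in a small example. This is only a sketch using scipy.sparse.csr_matrix; the variable names are made up for illustration.

```python
import numpy as np
from scipy import sparse

dense = np.array([[0, 1, 0],
                  [2, 0, 3]])
sp = sparse.csr_matrix(dense)

# Indexing one row of a NumPy array drops a dimension ...
dense_row = dense[0]
# ... while a csr_matrix keeps the result two-dimensional.
sparse_row = sp[0]

print(dense_row.shape)   # (3,)
print(sparse_row.shape)  # (1, 3)

# Code that must accept both has to normalise one way or the other, e.g.:
row_as_1d = sparse_row.toarray().ravel()
print(np.array_equal(row_as_1d, dense_row))
```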

We would benefit from that strongly in scikit-multilearn, as when a
multi-output problem is transformed to a single-output problem based on
unique combinations, this representation has to be dense for scikit-learn
at the moment. We end up losing some speed there. I'm sure other libraries,
such as imbalanced-learn or scikit-multiflow, would also see these as a
huge improvement.

Best,
Piotr


On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3kcit at gmail.com> wrote:

> Hi all.
> At SciPy, Brian Granger raised a good point about their planning for the
> Jupyter Project, which is the importance of long-term goals.
>
> I think it's great that we now have a detailed short-term roadmap
> (https://scikit-learn.org/dev/roadmap.html).
> Given that we now have about 6(!) full time people (Oliver, Jeremy,
> Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
> it's realistic
> to achieve most of these within a year or two. We have actually made
> some significant progress already.
>
> I think now would be a good time to start thinking about a longer-term
> roadmap, say 3-5 years out.
> What do we want to achieve? What are realistic goals, and what are
> moonshot goals?
> Having a common vision and shared goals might help us with funding, but
> might also help us with prioritization and motivation.
>
> What do you think? Do you think this is important and worth-while?
> And what should our goals be?
>
> Best,
> Andy
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>


-- 
Piotr Szymański
niedakh at gmail.com

From t3kcit at gmail.com  Tue Jul 23 11:45:38 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 23 Jul 2019 11:45:38 -0400
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <CAE1aY-nAKBWXxoAeUbJnnf0QtxF+1vupYsBc8b=5-4xLOA0ADA@mail.gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
 <CAEOrW4-8EMpTz51jZDp9EBnpzRyKSZtaL-E8_-nOtiFuchMzvQ@mail.gmail.com>
 <CAE1aY-nAKBWXxoAeUbJnnf0QtxF+1vupYsBc8b=5-4xLOA0ADA@mail.gmail.com>
Message-ID: <9b007d3f-33b6-f00b-f2e2-5f523cf36be8@gmail.com>

We had one done in 2013 (wow!).
I'll post the link to the internal mailing list since it could have 
identifying information.
Obviously the answers now would be quite different; I just thought it 
would be interesting to look at it again.

On 7/23/19 10:28 AM, Tom Augspurger wrote:
> Pandas will be running one soon too: 
> https://github.com/pandas-dev/pandas/issues/27477
>
> It may be worth coordinating on questions so that we can compare 
> communities (or combining surveys to reduce "survey-fatigue" somehow? 
> Haven't thought through this).
>
> Tom
>
> On Tue, Jul 23, 2019 at 6:54 AM Adrin <adrin.jalali at gmail.com 
> <mailto:adrin.jalali at gmail.com>> wrote:
>
>     It may be worth doing a user survey to get a feeling of what
>     people care about, we may or may not take them into account
>     afterwards.
>
>     Here's how Dask is doing it: https://github.com/dask/dask/issues/4748
>
>     On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3kcit at gmail.com
>     <mailto:t3kcit at gmail.com>> wrote:
>
>         Hi all.
>         At SciPy, Brian Granger raised a good point about their
>         planning for the
>         Jupyter Project, which is the importance of long-term goals.
>
>         I think it's great that we now have a detailed short-term roadmap
>         (https://scikit-learn.org/dev/roadmap.html).
>         Given that we now have about 6(!) full time people (Oliver,
>         Jeremy,
>         Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO
>         TEAM!!), I think
>         it's realistic
>         to achieve most of these within a year or two. We have
>         actually made
>         some significant progress already.
>
>         I think now would be a good time to start thinking about a
>         longer-term
>         roadmap, say 3-5 years out.
>         What do we want to achieve? What are realistic goals, and what
>         are
>         moonshot goals?
>         Having a common vision and shared goals might help us with
>         funding, but
>         might also help us with prioritization and motivation.
>
>         What do you think? Do you think this is important and worth-while?
>         And what should our goals be?
>
>         Best,
>         Andy
>         _______________________________________________
>         scikit-learn mailing list
>         scikit-learn at python.org <mailto:scikit-learn at python.org>
>         https://mail.python.org/mailman/listinfo/scikit-learn
>
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn


From t3kcit at gmail.com  Tue Jul 23 11:52:40 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 23 Jul 2019 11:52:40 -0400
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <CAOMdbFDM3thhjziccmTX0GGphHpz4svqF7fGGDZHd4QQgk5NTw@mail.gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
 <CAOMdbFDM3thhjziccmTX0GGphHpz4svqF7fGGDZHd4QQgk5NTw@mail.gmail.com>
Message-ID: <5e9ff930-7db6-67a6-2231-f59395f47e68@gmail.com>

Can you give an example?

I imagine that just supporting the data structure will not give you any 
speed benefit unless the algorithms are reimplemented to take advantage 
of the problem structure.
Even if the output of logistic regression were a sparse binary 
vector, you'd still need to compute every entry, which would be the slow 
part.
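
A small sketch of this point, using made-up random data: LogisticRegression accepts a sparse input, but predict still computes one value per sample and returns a dense array. Converting the result to a sparse vector afterwards only changes how it is stored. The data and the target rule below are assumptions for the example.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = sparse.random(200, 50, density=0.05, format="csr", random_state=rng)

# Hypothetical binary target, made up for the example.
row_sums = np.asarray(X.sum(axis=1)).ravel()
y = (row_sums > np.median(row_sums)).astype(int)

clf = LogisticRegression().fit(X, y)

# Every entry of the prediction is computed, and the result is dense.
pred = clf.predict(X)
print(type(pred).__name__, pred.shape)

# Wrapping it in a sparse container happens after the work is done.
pred_sparse = sparse.csr_matrix(pred)
print(pred_sparse.shape, pred_sparse.nnz)
```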


On 7/23/19 10:36 AM, Piotr Szymański wrote:
> If I could pitch in, it would be lovely, very lovely indeed, if 
> scikit-learn models could:
>
> - operate on sparse data, both input and output by default
> - implement some kind of sparse vector representation (as in 
> https://github.com/scikit-learn/scikit-learn/issues/8908 )
> - perhaps have a unifying numpy.array / scipy.sparse_matrix interface 
> to give people some slack on jumping between [] operator conventions
>
> We would benefit from that strongly in scikit-multilearn, as when a 
> multi-output problem is transformed to a single-output problem based 
> on unique combinations, this representation has to be dense for 
> scikit-learn at the moment. We end up losing some speed there. I'm 
> sure other libraries like ex. imbalanced-learn, or scikit-multiflow 
> would also see these as a huge thing.
>
> Best,
> Piotr
>
>
>
> On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller <t3kcit at gmail.com 
> <mailto:t3kcit at gmail.com>> wrote:
>
>     Hi all.
>     At SciPy, Brian Granger raised a good point about their planning
>     for the
>     Jupyter Project, which is the importance of long-term goals.
>
>     I think it's great that we now have a detailed short-term roadmap
>     (https://scikit-learn.org/dev/roadmap.html).
>     Given that we now have about 6(!) full time people (Oliver, Jeremy,
>     Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I
>     think
>     it's realistic
>     to achieve most of these within a year or two. We have actually made
>     some significant progress already.
>
>     I think now would be a good time to start thinking about a
>     longer-term
>     roadmap, say 3-5 years out.
>     What do we want to achieve? What are realistic goals, and what are
>     moonshot goals?
>     Having a common vision and shared goals might help us with
>     funding, but
>     might also help us with prioritization and motivation.
>
>     What do you think? Do you think this is important and worth-while?
>     And what should our goals be?
>
>     Best,
>     Andy
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
>
> -- 
> Piotr Szymański
> niedakh at gmail.com <mailto:niedakh at gmail.com>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn


From gabor.toth at maximilianeum.de  Tue Jul 23 12:25:55 2019
From: gabor.toth at maximilianeum.de (Gabor Toth)
Date: Tue, 23 Jul 2019 09:25:55 -0700
Subject: [scikit-learn] Random Forest without target to measure feature
 importance
Message-ID: <CANKXKxFHsHqDWNccZcVBYOv6bOWNr1rpUurMhyDxUSyKYf6NjA@mail.gmail.com>

Hello,

I would like to use Random Forest classifier to assess the importance of
features (bag-of-words) but I don't have any predefined class labels or any
test data. I have earlier used ExtraTreesClassifier() with fit_transform,
which is not available anymore (see below). I am wondering how I could use
Random Forest now.

clf = ExtraTreesClassifier()

clf.fit_transform(doc_term_matrix,np.empty(doc_term_matrix.shape))
features_importance=np.array(clf.feature_importances_)
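
For reference, the transform step that estimators used to offer now lives in sklearn.feature_selection.SelectFromModel. The sketch below assumes that some labels are available, since tree-based importances are computed relative to a target; it is not a way around the missing class labels. The toy documents and labels are invented for illustration.

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectFromModel

# Toy corpus and labels, made up for the example.
docs = ["cheap pills buy now", "meeting agenda for monday",
        "buy cheap watches now", "agenda and minutes of the meeting"]
labels = [1, 0, 1, 0]

doc_term_matrix = CountVectorizer().fit_transform(docs)

# SelectFromModel fits the forest and exposes the transform step
# that used to be available on the estimator itself.
selector = SelectFromModel(ExtraTreesClassifier(n_estimators=50, random_state=0))
selector.fit(doc_term_matrix, labels)

feature_importance = selector.estimator_.feature_importances_
reduced = selector.transform(doc_term_matrix)
print(doc_term_matrix.shape, "->", reduced.shape)
```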

Thanks,

Gabor

From glennmschultz at me.com  Wed Jul 24 14:24:36 2019
From: glennmschultz at me.com (Glenn Schultz)
Date: Wed, 24 Jul 2019 14:24:36 -0400
Subject: [scikit-learn] question using GridSearchCV
Message-ID: <E3D9A6D4-E016-4863-B8E1-C9E735D96FA8@me.com>

I am using GBClassifier. The code below works if I use the default accuracy, but it fails using roc_auc or roc_auc_score. I have found many examples to work with, but for the life of me I can't get it to work with roc_auc. What am I doing wrong?

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import auc, roc_auc_score

y_train = LoansTrainData['event']
x_train = LoansTrainData[LoansTrainData.columns.drop('event')]

parameters = {'loss': ['deviance'],
              'scoring': ['roc_auc'],
              'learning_rate': [.1, .05]}

searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters, verbose=3, n_jobs=4)
searchCLF.fit(x_train, y_train)

From t3kcit at gmail.com  Wed Jul 24 14:57:01 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 24 Jul 2019 14:57:01 -0400
Subject: [scikit-learn] question using GridSearchCV
In-Reply-To: <E3D9A6D4-E016-4863-B8E1-C9E735D96FA8@me.com>
References: <E3D9A6D4-E016-4863-B8E1-C9E735D96FA8@me.com>
Message-ID: <f917950e-0be9-1a3c-50bf-f4dedf33e07a@gmail.com>

scoring is not a parameter of the estimator, so it does not belong in the parameter grid.
It needs to be passed to GridSearchCV directly:

searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters, verbose=3, n_jobs=4, scoring='roc_auc')
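
A minimal, self-contained sketch of this advice is shown below. It uses synthetic data from make_classification in place of the loan data, and the small grid, n_estimators=20 and cv=3 are assumptions chosen to keep the example fast. The loss is left at its default.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary data standing in for the loan data set.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Only estimator parameters belong in the grid.
parameters = {"learning_rate": [0.1, 0.05]}

search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    parameters,
    scoring="roc_auc",  # the metric is an argument of GridSearchCV itself
    cv=3,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

After fitting, search.best_score_ holds the mean cross-validated ROC AUC of the best parameter setting.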


On 7/24/19 1:24 PM, Glenn Schultz via scikit-learn wrote:
> I am using GBClassifier, the below works if I use the default accuracy but it fails using roc_auc or roc_auc_score.  I have found many examples to work with but for the life of me I can't get it to work with roc_auc.  What am I doing wrong?
>
> from sklearn.ensemble import GradientBoostingClassifier
> from sklearn.model_selection import GridSearchCV
> from sklearn.metrics import auc, roc_auc_score
>
> y_train = LoansTrainData['event']
> x_train = LoansTrainData[LoansTrainData.columns.drop('event')]
>
> parameters = {'loss': ['deviance'],
>               'scoring': ['roc_auc'],
>               'learning_rate': [.1, .05]}
>
> searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters, verbose=3, n_jobs=4)
> searchCLF.fit(x_train, y_train)
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn


From glennmschultz at me.com  Wed Jul 24 15:11:15 2019
From: glennmschultz at me.com (Glenn Schultz)
Date: Wed, 24 Jul 2019 15:11:15 -0400
Subject: [scikit-learn] question using GridSearchCV
In-Reply-To: <f917950e-0be9-1a3c-50bf-f4dedf33e07a@gmail.com>
References: <E3D9A6D4-E016-4863-B8E1-C9E735D96FA8@me.com>
 <f917950e-0be9-1a3c-50bf-f4dedf33e07a@gmail.com>
Message-ID: <B9EDFF94-B339-4B88-9A78-9A2F73638156@me.com>

Thank you for answering ... makes sense now that you point it out.

Sent from my iPhone


> On Jul 24, 2019, at 2:57 PM, Andreas Mueller <t3kcit at gmail.com> wrote:
> 
> scoring is not a parameter.
> It needs to be passed to GridSearchCV
> 
> searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters, verbose=3, n_jobs=4, scoring='roc_auc')
> 
> 
> 
>> On 7/24/19 1:24 PM, Glenn Schultz via scikit-learn wrote:
>> I am using GBClassifier, the below works if I use the default accuracy but it fails using roc_auc or roc_auc_score.  I have found many examples to work with but for the life of me I can't get it to work with roc_auc.  What am I doing wrong?
>> 
>> from sklearn.ensemble import GradientBoostingClassifier
>> from sklearn.model_selection import GridSearchCV
>> from sklearn.metrics import auc, roc_auc_score
>> 
>> y_train = LoansTrainData['event']
>> x_train = LoansTrainData[LoansTrainData.columns.drop('event')]
>> 
>> parameters = {'loss': ['deviance'],
>>               'scoring': ['roc_auc'],
>>               'learning_rate': [.1, .05]}
>> 
>> searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters, verbose=3, n_jobs=4)
>> searchCLF.fit(x_train, y_train)
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
> 
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn


From aiwern at gmail.com  Thu Jul 25 14:16:21 2019
From: aiwern at gmail.com (Ai Wern)
Date: Thu, 25 Jul 2019 14:16:21 -0400
Subject: [scikit-learn] FINAL CALL for Papers: MICCAI 2019 Connectomics in
 NeuroImaging Workshop and Challenge
Message-ID: <CAHUSaBZ70-jy+h3VuzgFsEAkLNvLtRnf0gYgy8VS1suJVFdcwg@mail.gmail.com>

**Apologies for cross posting**

Dear Colleagues,

This is a final call for full-length paper submissions to our *3rd
International Workshop on Connectomics in NeuroImaging (CNI 2019)* and
*Transfer-Learning CNI Challenge 2019*, held in parallel with the 22nd
International Conference on Medical Image Computing and Computer-assisted
Intervention (MICCAI 2019) in Shenzhen, China.

*Our deadline has been extended to Weds 31st July 2019*

**** CNI Workshop Call for Papers ****

Our topics of interest cover (but are not limited to):

(1) New developments in connectome construction from different imaging
modalities;
(2) Development of data driven techniques to identify biomarkers in
connectome data;
(3) Machine learning algorithms and connectome data analysis;
(4) Brain network modeling and formal conceptual models of connectome data;
(5) Evaluation and validation of connectome models.

If you have research that fits into the scope of our workshop detailed on
our website (http://www.brainconnectivity.net/workshop), we encourage you
to submit a paper.


**** CNI Call for Challengers ****

Addressing the issues of generalizability and clinical relevance for
functional connectomes, you can leverage a unique resting-state fMRI
(rsfMRI) dataset of attention deficit hyperactivity disorder (ADHD) and
neurotypical controls (NC) to design a classification framework that can
predict subject diagnosis (ADHD vs. NC) based on brain connectivity data.

In a surprise twist, we will also evaluate the classification performance
on a related clinical population with an ADHD comorbidity. This challenge
will allow us to assess (1) whether the method is extracting functional
connectivity patterns related to ADHD symptomatology, and (2) how much of
this information "transfers" between clinical populations.

Training and validation data are now released:
http://www.brainconnectivity.net/challenge


**** Why submit to the CNI Workshop and Challenge? ****

- Keynote talks by Prof Yong He (Beijing Normal University, China) and Dr.
Fan Zhang (Harvard Medical School, USA);
- Oral presentations and poster sessions to provide you with ample
opportunity for exchanges and discussions;
- Accepted papers will be published in an LNCS proceedings;
- Best Paper and Poster Awards will be presented, and sponsored prizes for
Challenge winners.

**** Important dates for CNI workshop ****

- Submission deadline: July 31st, 2019, 23:59 EST
- Notification of acceptance: August 13th, 2019
- Camera-ready deadline : August 18th, 2019, 23:59 EST
- Submission website: https://cmt3.research.microsoft.com/CNI2019

**** Important dates for CNI Challenge ****

- Submission deadline: August 15th, 2019, 23:59 EST
- Submission website: https://cmt3.research.microsoft.com/CNIChallenge2019
<https://cmt3.research.microsoft.com/CNI2019>

For more information, visit our website at http://www.brainconnectivity.net.


Best wishes,
CNI 2019 Organising Committee

From niourf at gmail.com  Fri Jul 26 14:08:14 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Fri, 26 Jul 2019 14:08:14 -0400
Subject: [scikit-learn] Monthly meetings between core developers +
 "Hello World"
In-Reply-To: <CAEOrW4-L08RDzKkidBb5r0UXUzRJ9rrjEhGUUPY+ogezQ1E4Wg@mail.gmail.com>
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
 <CAAkaFLX60h==ZazLjZat6tv7pSnykYHnY8YiF3+XP35q0vSiBQ@mail.gmail.com>
 <CAGfF14_7VJfcpgTqd_JPu51oO2bJJTyDfV6Oy9eHVzye=d2pmw@mail.gmail.com>
 <CAEOrW48LZDTkAd9yX==cxHBuGEbZApg9QWFrLsAmVFXEpdN-xA@mail.gmail.com>
 <b57b51be-478d-2098-ee3b-fceaec21d906@gmail.com>
 <CAEOrW4-L08RDzKkidBb5r0UXUzRJ9rrjEhGUUPY+ogezQ1E4Wg@mail.gmail.com>
Message-ID: <08716118-a3a8-0131-aeca-f97a8aba3f25@gmail.com>

Thanks everyone for your feedback!

Let's try to have a meeting on Monday 5th August, and then have meetings 
on the last Monday of the month? Next meeting would be on August 26th.

For the time: 
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=8&day=5&hour=13&min=0&sec=0&p1=240&p2=33&p3=37&p4=179. 
This one is convenient for NY and Europe, less so for Sydney / Beijing. 
We can have the next meeting accommodate Joel / Hanmin.

We can use Andy's appear.in : https://appear.in/amueller. I'm happy to 
(try to) "lead" the discussion this first time?


For logistics: I created a new project board 
https://github.com/scikit-learn/scikit-learn/projects/15

I was thinking of having one column per meeting. A few days before the 
meeting, people can write down what they plan to discuss (one note per 
core-dev), so others can prepare. In particular, people that are not 
able to attend can leave details here (let us know in the notes!).

One advantage of these boards is that they're searchable, we have a 
clear history of meetings, and it's easy to reference PRs/issues. This 
is of course only a proposal, we can try it and see whether it works out ;)


@Chiara Welcome!! Thanks for offering to help! It didn't take long so I 
took care of creating the board (also I would have felt bad for making 
you work while you only start in Sep).


Thanks,

Nicolas

On 7/22/19 9:57 AM, Adrin wrote:
> That's kinda what I meant. I didn't mean to limit the access to the 
> project to @core-devs, I meant they can be pinged.
>
> On Mon, Jul 22, 2019 at 3:56 PM Andreas Mueller <t3kcit at gmail.com 
> <mailto:t3kcit at gmail.com>> wrote:
>
>
>     On 7/22/19 9:22 AM, Adrin wrote:
>>     Awesome, excited to have your help around :)
>>
>>     We already have the @core-devs team on github, we can use it more
>>     often/more organized.
>
>     Why wouldn't we just use the scikit-learn repo projects?
>
>
>>
>>     On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo
>>     <marmochiaskl at gmail.com <mailto:marmochiaskl at gmail.com>> wrote:
>>
>>         Dear list,
>>
>>         I'm Chiara, in September I will start to work full time for
>>         the Scikit-Learn Consortium at INRIA (France). My background
>>         is in Astronomy and Planetary Science: I've worked there as a
>>         Research Engineer for around 15 years, writing code, mining
>>         data and managing some projects.
>>
>>         One of my tasks at the Consortium will be to take care of our
>>         connection with the developer community, so let me know if I
>>         can help in managing those monthly meetings in some way.
>>         In the meantime, may I suggest creating a github team for
>>         core developers in the scikit-learn organization? As
>>         Alexandre said, team specific projects and discussions on
>>         github could be a way to efficiently prepare meetings and
>>         prioritize issues.
>>
>>         Thanks for listening,
>>         have a nice day.
>>         Chiara
>>         _______________________________________________
>>         scikit-learn mailing list
>>         scikit-learn at python.org <mailto:scikit-learn at python.org>
>>         https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>
>>     _______________________________________________
>>     scikit-learn mailing list
>>     scikit-learn at python.org  <mailto:scikit-learn at python.org>
>>     https://mail.python.org/mailman/listinfo/scikit-learn
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org <mailto:scikit-learn at python.org>
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

From joel.nothman at gmail.com  Mon Jul 29 22:57:08 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Tue, 30 Jul 2019 12:57:08 +1000
Subject: [scikit-learn] ANN Scikit-learn 0.21.3 and 0.20.4 released
Message-ID: <CAAkaFLUpNK_Dkw51cnRq9Wfx2J-79CHs=ROQm2xW7viFCN2gWQ@mail.gmail.com>

We have released patches to Scikit-learn 0.21 (python >=3.5) and 0.20
(python 2 and 3) series including several bug fixes. See their respective
change logs at
https://scikit-learn.org/dev/whats_new/v0.21.html#version-0-21-3 and
https://scikit-learn.org/dev/whats_new/v0.20.html#version-0-20-4.

Install them from PyPI or conda-forge:
* https://pypi.org/project/scikit-learn/0.21.3/
* https://pypi.org/project/scikit-learn/0.20.4/

Thanks to all who have contributed!

Also, work in progress: Scikit-learn 0.22 is in development with lots of
great new features to be released hopefully towards the end of 2019. See
change logs at https://scikit-learn.org/dev/whats_new/v0.22.html for some
of the things coming your way, and try them out by installing the nightly
build (see
https://scikit-learn.org/dev/developers/advanced_installation.html#installing-nightly-builds
<https://scikit-learn.org/stable/developers/advanced_installation.html#installing-nightly-builds>
).

Happy learning!

the scikit-learn team

From charujing123 at 163.com  Tue Jul 30 21:22:10 2019
From: charujing123 at 163.com (charujing123)
Date: Wed, 31 Jul 2019 09:22:10 +0800
Subject: [scikit-learn] cross-validated MANOVA
Message-ID: <21b11453.1152856.16c459eca56.Coremail.charujing123@163.com>

Dear experts and users,
Does anyone know how to perform cross-validated multivariate analysis of variance (MANOVA)? The method is described in the paper "Searchlight-based multi-voxel pattern analysis of fMRI by cross-validated MANOVA".
Thanks.
Rujing

2019-07-31


charujing123 