From pahome.chen at mirlab.org Tue Jul 2 00:48:11 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Tue, 2 Jul 2019 12:48:11 +0800
Subject: [scikit-learn] What's the principle of partial_fit?
Message-ID:

I work with partial_fit of Birch because the dataset is too large to load into memory, so I cluster the data batch by batch. E.g., I have 50000 samples and every batch contains 1000 samples.

I found that the clustering result is better when each batch also contains part of the previous batch than when it contains no previous data.

So I want to know how partial_fit works.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From olivier.grisel at ensta.org Wed Jul 3 04:12:46 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Wed, 3 Jul 2019 10:12:46 +0200
Subject: [scikit-learn] New core developer: jeremiedbb
Message-ID:

The core developers of Scikit-learn have recently voted to welcome
Jérémie Du Boisberranger to the team, in recognition of his efforts
and trustworthiness as a contributor. Jérémie works at Inria Saclay
and is supported by the scikit-learn initiative at Fondation Inria and
its partners.

Congratulations and welcome to the team Jérémie!

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From adrin.jalali at gmail.com Fri Jul 5 10:02:18 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Fri, 5 Jul 2019 16:02:18 +0200
Subject: [scikit-learn] New core developer: jeremiedbb
In-Reply-To:
References:
Message-ID:

woohoo, congrats Jeremie :)

On Wed, Jul 3, 2019 at 10:14 AM Olivier Grisel wrote:

> The core developers of Scikit-learn have recently voted to welcome
> Jérémie Du Boisberranger to the team, in recognition of his efforts
> and trustworthiness as a contributor. Jérémie works at Inria Saclay
> and is supported by the scikit-learn initiative at Fondation Inria and
> its partners.
>
> Congratulations and welcome to the team Jérémie!
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>

From charujing123 at 163.com Sun Jul 7 02:03:52 2019
From: charujing123 at 163.com (charujing123)
Date: Sun, 7 Jul 2019 14:03:52 +0800
Subject: [scikit-learn] how to preprocess in the cross_validate
Message-ID: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>

Hi
It's easy to preprocess when I used part of data to train and test. However, how to preprocess within the function of sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-07
charujing123

From oliverrausch99 at gmail.com Sun Jul 7 04:03:08 2019
From: oliverrausch99 at gmail.com (Oliver Rausch)
Date: Sun, 7 Jul 2019 10:03:08 +0200
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
Message-ID:

Hi Rujing,
The Pipeline [0] from sklearn may be of interest to you.

Best regards,
Oliver

[0] https://scikit-learn.org/stable/modules/compose.html

On Sun, Jul 7, 2019 at 08:50 charujing123 wrote:

> Hi
> It's easy to preprocess when I used part of data to train and test.
> However, how to preprocess within the function of
> sklearn.model_selection.cross_validate?
> Thanks.
> Rujing
>
> 2019-07-07
> ------------------------------
> charujing123
>

--
Best Regards,
Oliver
From pahome.chen at mirlab.org Mon Jul 8 06:17:36 2019
From: pahome.chen at mirlab.org (lampahome)
Date: Mon, 8 Jul 2019 18:17:36 +0800
Subject: [scikit-learn] Can I pre-calculate parameter threshold of Birch?
Message-ID:

The threshold is determined by the sphere and simulate the points into a sphere.
When I tune parameters, I don't know what range of threshold values to search.

Can I pre-calculate the threshold?

From np.dong572 at gmail.com Mon Jul 8 09:48:20 2019
From: np.dong572 at gmail.com (Naiping Dong)
Date: Mon, 8 Jul 2019 21:48:20 +0800
Subject: [scikit-learn] Variable kernel density estimation
Message-ID:

How does sklearn perform cross-validation with "GridSearchCV" for bandwidth selection? It seems that the CV for kernel density estimation is different from the one used for classification. Does it use least-squares error for this purpose?

Second, is it possible for me to use a variable bandwidth for kernel density estimation, that is, a different bandwidth for each data point?

Thanks.
--
Elkan
Department of Chemistry, HKU, HK

From albertthomas88 at gmail.com Mon Jul 8 10:04:48 2019
From: albertthomas88 at gmail.com (Albert Thomas)
Date: Mon, 8 Jul 2019 16:04:48 +0200
Subject: [scikit-learn] Variable kernel density estimation
In-Reply-To:
References:
Message-ID:

Hi,

The default score used by GridSearchCV is the one of the estimator; for KernelDensity it's the total log likelihood. As far as I know it is not possible to have different bandwidths.

Albert

On Mon 8 Jul 2019 at 15:50, Naiping Dong wrote:

> How does sklearn perform cross-validation with "GridSearchCV" for bandwidth
> selection? It seems that the CV for kernel density estimation is different
> from the one used for classification. Does it use least-squares error for
> this purpose?
>
> Second, is it possible for me to use a variable bandwidth for kernel density
> estimation, that is, a different bandwidth for each data point?
>
> Thanks.
> --
> Elkan
> Department of Chemistry, HKU, HK
>

From charujing123 at 163.com Mon Jul 8 20:48:04 2019
From: charujing123 at 163.com (charujing123)
Date: Tue, 9 Jul 2019 08:48:04 +0800
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To:
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com>
Message-ID: <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>

Hi Oliver,
Thanks for your kind reply. I read the manual, however, I did not find any options in the function of cross_validate to control the fit transformation. The fit_transform could be used for preprocessing in the pipeline, however, how to integrate this into the function of sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-09
charujing123

From: Oliver Rausch
Sent: 2019-07-07 16:03
Subject: Re: [scikit-learn] how to preprocess in the cross_validate
To: "Scikit-learn mailing list"
Cc:

Hi Rujing,
The Pipeline [0] from sklearn may be of interest to you.

Best regards,
Oliver

[0] https://scikit-learn.org/stable/modules/compose.html

On Sun, Jul 7, 2019 at 08:50 charujing123 wrote:

Hi
It's easy to preprocess when I used part of data to train and test. However, how to preprocess within the function of sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-07
charujing123

--
Best Regards,
Oliver
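Editorial sketch for the question above, not part of the original thread: one way to check how preprocessing interacts with cross_validate is to pass a Pipeline as the estimator and inspect the fitted copy from each fold. The synthetic data, the choice of StandardScaler and SVC, and cv=5 are assumptions made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data, purely illustrative (not taken from the thread).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Scaler and classifier wrapped into a single estimator.
pipe = make_pipeline(StandardScaler(), SVC())

# cross_validate clones and refits the whole pipeline on each training split;
# return_estimator=True keeps the fitted pipeline of every fold for inspection.
results = cross_validate(pipe, X, y, cv=5, return_estimator=True)

for fold, fitted in enumerate(results["estimator"]):
    # make_pipeline names each step after its lowercased class name.
    scaler = fitted.named_steps["standardscaler"]
    print(fold, scaler.mean_[:2], results["test_score"][fold])
```

Each fold prints slightly different scaler means, which suggests the scaler is fit on that fold's training part only and then applied to the held-out part when the pipeline is scored.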
From oliverrausch99 at gmail.com Tue Jul 9 05:00:05 2019
From: oliverrausch99 at gmail.com (Oliver Rausch)
Date: Tue, 9 Jul 2019 11:00:05 +0200
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To: <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com> <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>
Message-ID:

Hi Rujing,

You can integrate the preprocessing into the estimator by placing an estimator at the end of the pipeline. For example:

make_pipeline(StandardScaler(), SVC())

This pipeline has a support vector classifier at the end. Calling a function of the pipeline, for example fit(X, y), will first apply the StandardScaler to X, and then use the preprocessed X to fit the SVC.

When you use such an estimator in the cross_validate function, the result is that the preprocessing will be applied during cross validation, like you wanted.

Let me know if you have more questions.

Oliver

On Tue, Jul 9, 2019 at 03:04 charujing123 wrote:

> Hi Oliver,
> Thanks for your kind reply. I read the manual, however, I did not find any
> options in the function of cross_validate to control the fit
> transformation. The fit_transform could be used for preprocessing in the
> pipeline, however, how to integrate this into the function of
> sklearn.model_selection.cross_validate?
> Thanks.
> Rujing
>
> 2019-07-09
> ------------------------------
> charujing123
> ------------------------------
>
> *From:* Oliver Rausch
> *Sent:* 2019-07-07 16:03
> *Subject:* Re: [scikit-learn] how to preprocess in the cross_validate
> *To:* "Scikit-learn mailing list"
> *Cc:*
>
> Hi Rujing,
> The Pipeline [0] from sklearn may be of interest to you.
>
> Best regards,
> Oliver
>
> [0] https://scikit-learn.org/stable/modules/compose.html
>
> On Sun, Jul 7, 2019 at 08:50 charujing123 wrote:
>
>> Hi
>> It's easy to preprocess when I used part of data to train and test.
>> However, how to preprocess within the function of
>> sklearn.model_selection.cross_validate?
>> Thanks.
>> Rujing
>>
>> 2019-07-07
>> ------------------------------
>> charujing123
>>
> --
> Best Regards,
> Oliver
>

--
Best Regards,
Oliver

From charujing123 at 163.com Wed Jul 10 03:53:58 2019
From: charujing123 at 163.com (charujing123)
Date: Wed, 10 Jul 2019 15:53:58 +0800
Subject: [scikit-learn] how to preprocess in the cross_validate
In-Reply-To:
References: <43110a45.10d5c30.16bcb0813c2.Coremail.charujing123@163.com> <78725424.10ccf02.16bd433adef.Coremail.charujing123@163.com>
Message-ID: <4a10b04.1bae.16bdadff096.Coremail.charujing123@163.com>

Hi Oliver,
Take 5-fold cross-validation as an example. In the cross_validate function, would the StandardScaler be fit on the training data, producing a fitted transformation, and would the test data then be transformed with that same fitted transformation? These two steps would be done 5 times. If I use make_pipeline(StandardScaler(), SVC()), is this right?
Thanks.
Rujing

2019-07-10
charujing123

From: Oliver Rausch
Sent: 2019-07-09 17:00
Subject: Re: [scikit-learn] how to preprocess in the cross_validate
To: "Scikit-learn mailing list"
Cc:

Hi Rujing,

You can integrate the preprocessing into the estimator by placing an estimator at the end of the pipeline. For example:

make_pipeline(StandardScaler(), SVC())

This pipeline has a support vector classifier at the end. Calling a function of the pipeline, for example fit(X, y), will first apply the StandardScaler to X, and then use the preprocessed X to fit the SVC.
When you use such an estimator in the cross_validate function, the result is that the preprocessing will be applied during cross validation, like you wanted.

Let me know if you have more questions.

Oliver

On Tue, Jul 9, 2019 at 03:04 charujing123 wrote:

Hi Oliver,
Thanks for your kind reply. I read the manual, however, I did not find any options in the function of cross_validate to control the fit transformation. The fit_transform could be used for preprocessing in the pipeline, however, how to integrate this into the function of sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-09
charujing123

From: Oliver Rausch
Sent: 2019-07-07 16:03
Subject: Re: [scikit-learn] how to preprocess in the cross_validate
To: "Scikit-learn mailing list"
Cc:

Hi Rujing,
The Pipeline [0] from sklearn may be of interest to you.

Best regards,
Oliver

[0] https://scikit-learn.org/stable/modules/compose.html

On Sun, Jul 7, 2019 at 08:50 charujing123 wrote:

Hi
It's easy to preprocess when I used part of data to train and test. However, how to preprocess within the function of sklearn.model_selection.cross_validate?
Thanks.
Rujing

2019-07-07
charujing123

--
Best Regards,
Oliver

--
Best Regards,
Oliver

From t3kcit at gmail.com Sun Jul 14 14:43:39 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Sun, 14 Jul 2019 13:43:39 -0500
Subject: [scikit-learn] Long term roadmap and moonshot goals
Message-ID: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>

Hi all.
At SciPy, Brian Granger raised a good point about their planning for the Jupyter Project, which is the importance of long-term goals.
I think it's great that we now have a detailed short-term roadmap (https://scikit-learn.org/dev/roadmap.html).
Given that we now have about 6(!) full time people (Oliver, Jeremy, Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think it's realistic to achieve most of these within a year or two. We have actually made some significant progress already.

I think now would be a good time to start thinking about a longer-term roadmap, say 3-5 years out.
What do we want to achieve? What are realistic goals, and what are moonshot goals?
Having a common vision and shared goals might help us with funding, but might also help us with prioritization and motivation.

What do you think? Do you think this is important and worthwhile? And what should our goals be?

Best,
Andy

From nshervt at gmail.com Tue Jul 16 12:01:40 2019
From: nshervt at gmail.com (Navid Shervani-Tabar)
Date: Tue, 16 Jul 2019 12:01:40 -0400
Subject: [scikit-learn] Multi-output regressor and sklearn's RFE module
Message-ID:

Hello,

I have a question regarding sklearn's RFE module. I have a multi-output regressor and I would like to reduce the dimensionality of the input using RFE. However, it seems that it is not possible to have multi-dimensional output when using RFE. I was wondering if there is a workaround for this or whether there is something conceptually wrong about it. You can find a minimal working code at the following link.

https://stackoverflow.com/questions/57060003/multi-output-regressor-and-sklearns-rfe-module

Thanks!
Navid

From niourf at gmail.com Wed Jul 17 14:49:00 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Wed, 17 Jul 2019 14:49:00 -0400
Subject: [scikit-learn] Monthly meetings between core developers
Message-ID: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>

Hi Everyone,

The scikit-learn team have been expanding significantly lately: we have now 3 FTEs in NY, 1 in Berlin, and 3 (soon 4) in Paris.
To scale efficiently, I think we should try to communicate more.

*I'd like to propose monthly meetings between the core-developers*. This would be the occasion to:

- communicate what everyone is currently working on

- ask for feedback/reviews on some specific PRs

- keep everybody apprised of the latest news/decisions regarding the project. Some discussions often take place in channels that some of us may miss.

To keep it efficient, maybe we could have a hard time limit, e.g. 30 mins. One person would be in charge of conducting the meeting, and another one would take notes. We would take turns every month. While meetings and notes would be public, only core-developers or strongly-involved members would be invited to join the discussion.

I understand the time-zone differences and personal schedules will make it hard to arrange for all of us to be present. I'm not sure how to handle this equitably. We could have a document that sets the agenda and a place to take notes. For members that are not able to join, they can add items onto the agenda with their thoughts before and after the meeting.

Core-devs, WDYT?

Thanks!
Nicolas

From gael.varoquaux at normalesup.org Wed Jul 17 15:02:27 2019
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 17 Jul 2019 21:02:27 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID: <20190717190227.cke2mrh6wddp4lsl@phare.normalesup.org>

On Wed, Jul 17, 2019 at 02:49:00PM -0400, Nicolas Hug wrote:
> Core-devs, WDYT?

+1. The real challenge will be to find a time slot!
:)

G

From adrin.jalali at gmail.com Wed Jul 17 15:05:01 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Wed, 17 Jul 2019 21:05:01 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

I'm very strongly all in favor, thanks for bringing it up.

About the time-zone issue, I know some other teams in the same situation conduct their meetings at different times, so that each time it's inconvenient for different people.

Also, I know some of us are full-time on the project, but having regular meetings to me is independent of that, and I'd really like to hear in these meetings from people like Joel and Hanmin too.

Cheers,
Adrin.

On Wed, Jul 17, 2019 at 8:52 PM Nicolas Hug wrote:

> Hi Everyone,
>
> The scikit-learn team have been expanding significantly lately: we have
> now 3 FTEs in NY, 1 in Berlin, and 3 (soon 4) in Paris.
>
> To scale efficiently, I think we should try to communicate more.
>
> *I'd like to propose monthly meetings between the core-developers*.
> This would be the occasion to:
>
> - communicate what everyone is currently working on
>
> - ask for feedback/reviews on some specific PRs
>
> - keep everybody apprised of the latest news/decisions regarding the
> project. Some discussions often take place in channels that some of us may
> miss.
>
> To keep it efficient, maybe we could have a hard time limit, e.g. 30 mins.
> One person would be in charge of conducting the meeting, and another one
> would take notes. We would take turns every month. While meetings and
> notes would be public, only core-developers or strongly-involved members
> would be invited to join the discussion.
>
> I understand the time-zone differences and personal schedules will make it
> hard to arrange for all of us to be present. I'm not sure how to handle
> this equitably.
> We could have a document that sets the agenda and a place
> to take notes. For members that are not able to join, they can add items
> onto the agenda with their thoughts before and after the meeting.
>
> Core-devs, WDYT?
>
> Thanks!
> Nicolas
>

From g.lemaitre58 at gmail.com Wed Jul 17 15:17:15 2019
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Wed, 17 Jul 2019 21:17:15 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

I am +1. This is a great initiative.

IMO, we could make it really regular (i.e., a specific week-day of a specific week in a month), with a rolling time (for the time-zone issue).
In this matter, we could maybe clear more in advance our agenda instead of trying to find a date which accommodates everyone.

Just a thought.

Cheers,

On Wed, 17 Jul 2019 at 21:07, Adrin wrote:

> I'm very strongly all in favor, thanks for bringing it up.
>
> About the time-zone issue, I know some other teams in the same
> situation conduct their meetings
> at different times, so that each time it's inconvenient for different
> people.
>
> Also, I know some of us are full-time on the project, but having regular
> meetings to me is independent
> of that, and I'd really like to hear in these meetings from people like
> Joel and Hanmin too.
>
> Cheers,
> Adrin.
>
> On Wed, Jul 17, 2019 at 8:52 PM Nicolas Hug wrote:
>
>> Hi Everyone,
>>
>> The scikit-learn team have been expanding significantly lately: we have
>> now 3 FTEs in NY, 1 in Berlin, and 3 (soon 4) in Paris.
>>
>> To scale efficiently, I think we should try to communicate more.
>>
>> *I'd like to propose monthly meetings between the core-developers*.
>> This would be the occasion to:
>>
>> - communicate what everyone is currently working on
>>
>> - ask for feedback/reviews on some specific PRs
>>
>> - keep everybody apprised of the latest news/decisions regarding the
>> project. Some discussions often take place in channels that some of us may
>> miss.
>>
>> To keep it efficient, maybe we could have a hard time limit, e.g. 30
>> mins. One person would be in charge of conducting the meeting, and another
>> one would take notes. We would take turns every month. While meetings and
>> notes would be public, only core-developers or strongly-involved members
>> would be invited to join the discussion.
>>
>> I understand the time-zone differences and personal schedules will make
>> it hard to arrange for all of us to be present. I'm not sure how to handle
>> this equitably. We could have a document that sets the agenda and a place
>> to take notes. For members that are not able to join, they can add items
>> onto the agenda with their thoughts before and after the meeting.
>>
>> Core-devs, WDYT?
>>
>> Thanks!
>> Nicolas
>>
>

--
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/
From t3kcit at gmail.com Wed Jul 17 18:12:51 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 17 Jul 2019 18:12:51 -0400
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> I am +1. This is a great initiative.
>
> IMO, we could make it really regular (i.e., a specific week-day of a
> specific week in a month), with a rolling time (for the time-zone issue).
> In this matter, we could maybe clear more in advance our agenda
> instead of trying to find a date which accommodates everyone.
>
I agree, we could do something like the last Monday every month and alternate between two (or three) different time zones.
We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (UTC+11) so that seems super easy, right?
(TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all)

I agree that we should be as inclusive as possible, but I also don't want to create the expectation that some people (not thinking of any Australian in particular) who already sacrifice a lot of their free time have to invest even more time to keep up with the rest.

I think the idea of posting write-ups will help being more inclusive in that regard.

From olivier.grisel at ensta.org Thu Jul 18 02:00:50 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 18 Jul 2019 08:00:50 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

+1 for last Monday of each month. How about the duration? 1h max + breakout in smaller groups on more specific topics if needed?
From adrin.jalali at gmail.com Thu Jul 18 02:26:37 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Thu, 18 Jul 2019 08:26:37 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

BTW, where was the meeting for last Monday organized? I don't think I knew it was happening.

On Thu., Jul. 18, 2019, 08:02 Olivier Grisel, wrote:

> +1 for last Monday of each month. How about the duration? 1h max +
> breakout in smaller groups on more specific topics if needed?
>

From olivier.grisel at ensta.org Thu Jul 18 02:38:26 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 18 Jul 2019 08:38:26 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

On Thu, 18 Jul 2019 at 08:29, Adrin wrote:
>
> BTW, where was the meeting for last Monday organized? I don't think I knew it was happening.

I do not understand what you are referring to. My email was about the organization of future meetings as suggested by Andreas.

From adrin.jalali at gmail.com Thu Jul 18 02:41:38 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Thu, 18 Jul 2019 08:41:38 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

Ah sorry, my eyes skipped the "of the month" part of "last Monday of the month". My bad!

On Thu., Jul. 18, 2019, 08:39 Olivier Grisel, wrote:

> On Thu, 18 Jul 2019 at 08:29, Adrin wrote:
> >
> > BTW, where was the meeting for last Monday organized? I don't think I
> > knew it was happening.
>
> I do not understand what you are referring to. My email was about the
> organization of future meetings as suggested by Andreas.
>

From olivier.grisel at ensta.org Thu Jul 18 04:57:21 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Thu, 18 Jul 2019 10:57:21 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

I just found this planner to give it a try:

https://www.timeanddate.com/worldclock/meetingtime.html?day=29&month=7&year=2019&p1=240&p2=33&p3=37&p4=179&iv=0

(Berlin and Paris are on the same timezone so I did not put only Berlin).

It's going to be challenging to find a timeslot for everybody. The least extreme timeslot for everybody to attend at the same time would be:

https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=11&min=0&sec=0&p1=240&p2=33&p3=37&p4=179

We could also arrange for a second timeslot later (that would be Tuesday morning in Australia and China):

https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=21&min=0&sec=0&p1=240&p2=33&p3=37&p4=179

I wouldn't mind doing a meeting around 11pm on Monday evening from time to time but it would still be very early for Beijing.

Just to let you know, I will be off from next Saturday till Monday August 19 (big summer break :) so don't count on me for the first meeting if you start the meetings in the meantime.

On Thu, 18 Jul 2019 at 00:15, Andreas Mueller wrote:
>
> On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> > I am +1. This is a great initiative.
> >
> > IMO, we could make it really regular (i.e., a specific week-day of a
> > specific week in a month), with a rolling time (for the time-zone issue).
> > In this matter, we could maybe clear more in advance our agenda
> > instead of trying to find a date which accommodates everyone.
> >
> I agree, we could do something like the last Monday every month and
> alternate between two (or three) different time zones.
> We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (UTC+11) so that
> seems super easy, right?
> (TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all)
>
> I agree that we should be as inclusive as possible, but I also don't
> want to create the expectation that some people (not thinking of any
> Australian in particular)
> who already sacrifice a lot of their free time have to invest even more
> time to keep up with the rest.
>
> I think the idea of posting write-ups will help being more inclusive in
> that regard.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

From alexandre.gramfort at inria.fr Thu Jul 18 05:13:54 2019
From: alexandre.gramfort at inria.fr (Alexandre Gramfort)
Date: Thu, 18 Jul 2019 11:13:54 +0200
Subject: [scikit-learn] Monthly meetings between core developers
In-Reply-To:
References: <8d06eaf2-8160-556f-4bb5-4dbcdaa05865@gmail.com>
Message-ID:

hi,

I kind of like the project boards we used for sprints:

https://github.com/scikit-learn/scikit-learn/projects/11

the outcome of the core devs meeting could be to agree what should be listed on such a priority board.
my 2c
Alex

On Thu, Jul 18, 2019 at 10:59 AM Olivier Grisel wrote:
>
> I just found this planner to give it a try:
>
> https://www.timeanddate.com/worldclock/meetingtime.html?day=29&month=7&year=2019&p1=240&p2=33&p3=37&p4=179&iv=0
>
> (Berlin and Paris are on the same timezone so I did not put only Berlin).
>
> It's going to be challenging to find a timeslot for everybody. The
> least extreme timeslot for everybody to attend at the same time would
> be:
>
> https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=11&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
>
> We could also arrange for a second timeslot later (that would be
> Tuesday morning in Australia and China):
>
> https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=21&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
>
> I wouldn't mind doing a meeting around 11pm on Monday evening from
> time to time but it would still be very early for Beijing.
>
> Just to let you know, I will be off from next Saturday till Monday
> August 19 (big summer break :) so don't count on me for the first
> meeting if you start the meetings in the meantime.
>
> On Thu, 18 Jul 2019 at 00:15, Andreas Mueller wrote:
> >
> > On 7/17/19 2:17 PM, Guillaume Lemaître wrote:
> > > I am +1. This is a great initiative.
> > >
> > > IMO, we could make it really regular (i.e., a specific week-day of a
> > > specific week in a month), with a rolling time (for the time-zone issue).
> > > In this matter, we could maybe clear more in advance our agenda
> > > instead of trying to find a date which accommodates everyone.
> > >
> > I agree, we could do something like the last Monday every month and
> > alternate between two (or three) different time zones.
> > We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (UTC+11) so that
> > seems super easy, right?
> > (TIL CST can stand for "Central"/US, China, and Cuba!
not confusing at all) > > > > I agree that we should be as inclusive as possible, but I also don't > > want to create the expectation that some people (not thinking of any > > Australian in particular) > > who already sacrifice a lot of their free time have to invest even more > > time to keep up with the rest. > > > > I think the idea of posting write-ups will help being more inclusive in > > that regard. > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > > > > -- > Olivier > http://twitter.com/ogrisel - http://github.com/ogrisel > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From qinhanmin2005 at sina.com Thu Jul 18 06:04:51 2019 From: qinhanmin2005 at sina.com (Hanmin Qin) Date: Thu, 18 Jul 2019 18:04:51 +0800 Subject: [scikit-learn] Monthly meetings between core developers Message-ID: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn> I don't think it's worthwhile to worry too much about Beijing time if I'm the only person in Beijing. I'm happy to get up early once a month to learn from the great team! Hanmin Qin ----- Original Message ----- From: Alexandre Gramfort To: Scikit-learn mailing list Subject: Re: [scikit-learn] Monthly meetings between core developers Date: 2019-07-18 17:16 hi, I kind of like the project boards we used for sprints: https://github.com/scikit-learn/scikit-learn/projects/11 the outcome of the core devs meeting could be to agree what should be listed on such a priority board. my 2c Alex On Thu, Jul 18, 2019 at 10:59 AM Olivier Grisel wrote: > > I just found this planner to give it a try: > > https://www.timeanddate.com/worldclock/meetingtime.html?day=29&month=7&year=2019&p1=240&p2=33&p3=37&p4=179&iv=0 > > (Berlin and Paris are on the same timezone so I did not put only Berlin). 
> > It's going to be challenging to find a timeslot for every body. The > least extreme timeslot for everybody to attend at the same time would > be: > > https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=11&min=0&sec=0&p1=240&p2=33&p3=37&p4=179 > > We could also arrange for a second timeslot later (that would be > Tuesday morning in Australia and China): > > https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=7&day=29&hour=21&min=0&sec=0&p1=240&p2=33&p3=37&p4=179 > > I wouldn't mind doing a meeting around 11pm on Monday evening from > time to time but it would still be very early for Beijing. > > Just to let you know, I will be off from next Saturday till Monday > August 19 (big summer break :) so don't count on my for the first > meeting if you start the meetings in the mean time. > > Le jeu. 18 juil. 2019 ? 00:15, Andreas Mueller a ?crit : > > > > > > > > On 7/17/19 2:17 PM, Guillaume Lema?tre wrote: > > > I am +1. This is a great initiative. > > > > > > IMO, we could make it really regular (i.e., a specific week-day of a > > > specific week in a month), with a rolling time (for the time-zone issue). > > > In this matter, we could maybe clear more in advance our agenda > > > instead of trying to find a date which accommodates everyone. > > > > > I agree, we could do something like the last Monday every month and > > alternate between two (or three) different time zones. > > We have CET (UTC+1), EST (UTC-5), CT (UTC+08), AEDT (USC+11) so that > > seems super easy, right? > > (TIL CST can stand for "Central"/US, China, and Cuba! not confusing at all) > > > > I agree that we should be as inclusive as possible, but I also don't > > want to create the expectation that some people (not thinking of any > > Australian in particular) > > who already sacrifice a lot of their free time have to invest even more > > time to keep up with the rest. 
> > > > I think the idea of posting write-ups will help being more inclusive in > > that regard. > > _______________________________________________ > > scikit-learn mailing list > > scikit-learn at python.org > > https://mail.python.org/mailman/listinfo/scikit-learn > > > > -- > Olivier > http://twitter.com/ogrisel - http://github.com/ogrisel > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn _______________________________________________ scikit-learn mailing list scikit-learn at python.org https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel.nothman at gmail.com Thu Jul 18 19:57:20 2019 From: joel.nothman at gmail.com (Joel Nothman) Date: Fri, 19 Jul 2019 09:57:20 +1000 Subject: [scikit-learn] Monthly meetings between core developers In-Reply-To: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn> References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn> Message-ID: I'm away on a holiday at the moment (in case you hadn't identified my silence). I'd be keen to join in but might not be able to move schedules around it. I like the idea of prioritising together, though I'm not sure how to keep the meetings clipped. I'm also going to be quite lost on the issue tracker when I return to civilisation next week... (Are we still on for two patch releases?) J -------------- next part -------------- An HTML attachment was scrubbed... URL: From pahome.chen at mirlab.org Fri Jul 19 05:39:27 2019 From: pahome.chen at mirlab.org (lampahome) Date: Fri, 19 Jul 2019 17:39:27 +0800 Subject: [scikit-learn] Any machine learning used in storage company? Message-ID: Is there any application used in storage company? Can anyone briefly introduce what application in what company? thx -------------- next part -------------- An HTML attachment was scrubbed... 
From marmochiaskl at gmail.com Fri Jul 19 08:46:37 2019
From: marmochiaskl at gmail.com (Chiara Marmo)
Date: Fri, 19 Jul 2019 14:46:37 +0200
Subject: [scikit-learn] Monthly meetings between core developers + "Hello World"
In-Reply-To:
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
Message-ID:

Dear list,

I'm Chiara. In September I will start working full time for the Scikit-Learn Consortium at INRIA (France). My background is in Astronomy and Planetary Science: I've worked there as a Research Engineer for around 15 years, writing code, mining data and managing some projects.

One of my tasks at the Consortium will be to take care of our connection with the developer community, so let me know if I can help in managing those monthly meetings in some way.
In the meantime, may I suggest creating a GitHub team for core developers in the scikit-learn organization? As Alexandre said, team-specific projects and discussions on GitHub could be a way to efficiently prepare meetings and prioritize issues.

Thanks for listening,
have a nice day.
Chiara

From milton.pifanos at gmail.com Mon Jul 22 09:16:35 2019
From: milton.pifanos at gmail.com (Milton Pifano)
Date: Mon, 22 Jul 2019 10:16:35 -0300
Subject: [scikit-learn] Test Sample Size
Message-ID:

Dear scikit-learn subscribers,

I am working on a multiclass classification project and I have found many resources about how to deal with an imbalanced dataset for training, but I have not been able to find any reference on the test dataset size. Can anyone send some references?

Thanks,
Milton Pifano
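As a concrete reference point for the question above, here is a minimal sketch, assuming synthetic data and an arbitrary 50/50 split, of holding out a stratified test set with scikit-learn's `train_test_split`, so that each class keeps roughly the same proportion in the test set as in the full dataset. The class sizes (900/80/20), the random feature matrix, and `test_size=0.5` are illustrative assumptions, not values from the thread.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)

# Synthetic, imbalanced 3-class labels: 900 / 80 / 20 samples (assumed sizes).
y = np.repeat([0, 1, 2], [900, 80, 20])
# Random features, only needed so that there is something to split.
X = rng.normal(size=(y.size, 5))

# stratify=y asks the splitter to preserve class proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

# Per-class counts show how many minority samples the test set contains.
train_counts = np.bincount(y_train, minlength=3)
test_counts = np.bincount(y_test, minlength=3)
print("train:", train_counts, "test:", test_counts)
```

With strongly imbalanced classes, the number of minority-class samples in the test set is often the limiting factor, so it can help to look at the per-class counts rather than only the overall test size.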
From adrin.jalali at gmail.com Mon Jul 22 09:22:51 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Mon, 22 Jul 2019 15:22:51 +0200
Subject: [scikit-learn] Monthly meetings between core developers + "Hello World"
In-Reply-To:
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
Message-ID:

Awesome, excited to have your help around :)

We already have the @core-devs team on GitHub; we can use it more often and in a more organized way.

On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo wrote:

> Dear list,
>
> I'm Chiara, in September I will start to work full time for the
> Scikit-Learn Consortium at INRIA (France). My background is in Astronomy
> and Planetary Science: I've worked there as a Research Engineer for around
> 15 years, writing code, mining data and managing some projects.
>
> One of my tasks at the Consortium will be to take care of our connection
> with the developer community, so let me know if I can help in managing
> those monthly meetings in some way.
> In the meanwhile, may I suggest to create a github team for core
> developers in the scikit-learn organization? As Alexandre said, team
> specific projects and discussions on github could be a way to efficiently
> prepare meetings and prioritize issues.
>
> Thanks for listening,
> have a nice day.
> Chiara

From jbbrown at kuhp.kyoto-u.ac.jp Mon Jul 22 09:24:26 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Mon, 22 Jul 2019 22:24:26 +0900
Subject: [scikit-learn] Test Sample Size
In-Reply-To:
References:
Message-ID:

Dear Milton,

It is just my opinion based on many experiences, but if you want to stress-test your estimator, make your test set at least as big as, if not bigger than, the training set.

Sincerely,
J.B.

On Mon, Jul 22, 2019 at
22:18 Milton Pifano : > Dear scikit-learn subscribers. > > I am working on a multiclass classificacition project and I have found > many resources about how to deal with an imbalaced dataset for trainning, > bu I have not been able to find any reference on the test dataset size. > Can anyone send some references? > > Thanks, > Milton Pifano > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From adrin.jalali at gmail.com Mon Jul 22 09:51:12 2019 From: adrin.jalali at gmail.com (Adrin) Date: Mon, 22 Jul 2019 15:51:12 +0200 Subject: [scikit-learn] Continues monitoring of benchmark performances Message-ID: Hi, There is this [page](https://pandas.pydata.org/speed/scikit-learn/) maintained by some of the pandas maintainers (@TomAugspurger in particular), and it seems like a really good idea to have an eye on the performance of different benchmarks through time just in case a PR introduces some major drawbacks. However, he doesn't have the bandwidth to maintain it much more, and not really the hardware. I think it'd be a good idea for us to have that, wanted to bring it up and see what you think! Cheers, Adrin. -------------- next part -------------- An HTML attachment was scrubbed... URL: From t3kcit at gmail.com Mon Jul 22 09:54:57 2019 From: t3kcit at gmail.com (Andreas Mueller) Date: Mon, 22 Jul 2019 09:54:57 -0400 Subject: [scikit-learn] Monthly meetings between core developers + "Hello World" In-Reply-To: References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn> Message-ID: On 7/22/19 9:22 AM, Adrin wrote: > Awesome, excited to have your help around :) > > We already have the @core-devs team on github, we can use it more > often/more organized.hi Why wouldn't we just use the scikit-learn repo projects? 
> > On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo > wrote: > > Dear list, > > I'm Chiara, in September I will start to work full time for the > Scikit-Learn Consortium at INRIA (France). My background is in > Astronomy and Planetary Science: I've worked there as a Research > Engineer for around 15 years, writing code, mining data and > managing some project. > > One of my task at the Consortium will be to take care of our > connection with the developer community, so let me know if I can > help in managing those monthly meetings in some way. > In the meanwhile, may I suggest to create a github team for core > developers in the scikit-learn organization? As Alexandre said, > team specific projects and discussions on github could be a way to > efficiently prepare meetings and prioritize issues. > > Thanks for listening, > have a nice day. > Chiara > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... URL: From adrin.jalali at gmail.com Mon Jul 22 09:57:37 2019 From: adrin.jalali at gmail.com (Adrin) Date: Mon, 22 Jul 2019 15:57:37 +0200 Subject: [scikit-learn] Monthly meetings between core developers + "Hello World" In-Reply-To: References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn> Message-ID: That's kinda what I meant. I didn't mean to limit the access to the project to @core-devs, I meant they can be pinged. On Mon, Jul 22, 2019 at 3:56 PM Andreas Mueller wrote: > > On 7/22/19 9:22 AM, Adrin wrote: > > Awesome, excited to have your help around :) > > We already have the @core-devs team on github, we can use it more > often/more organized.hi > > Why wouldn't we just use the scikit-learn repo projects? 
> > > > On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo > wrote: > >> Dear list, >> >> I'm Chiara, in September I will start to work full time for the >> Scikit-Learn Consortium at INRIA (France). My background is in Astronomy >> and Planetary Science: I've worked there as a Research Engineer for around >> 15 years, writing code, mining data and managing some project. >> >> One of my task at the Consortium will be to take care of our connection >> with the developer community, so let me know if I can help in managing >> those monthly meetings in some way. >> In the meanwhile, may I suggest to create a github team for core >> developers in the scikit-learn organization? As Alexandre said, team >> specific projects and discussions on github could be a way to efficiently >> prepare meetings and prioritize issues. >> >> Thanks for listening, >> have a nice day. >> Chiara >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > > _______________________________________________ > scikit-learn mailing listscikit-learn at python.orghttps://mail.python.org/mailman/listinfo/scikit-learn > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.augspurger88 at gmail.com Mon Jul 22 09:59:51 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Mon, 22 Jul 2019 08:59:51 -0500 Subject: [scikit-learn] Continues monitoring of benchmark performances In-Reply-To: References: Message-ID: Thanks Adrin, A month or so ago I started running scikit-learn benchmarks, but I had to disable them since they were taking too long (longer than a day). I haven't had time to investigate why, but I assume it was an issue with how I set them up. 
Just FYI, I'm planning to include "maintain and improve the benchmark running tools" as part of pandas' application for the CZI grant. All that is in https://github.com/asv-runner (a mix of Ansible, Airflow, and GitHub bots). If anyone is interested in (possibly) having funding to work on this, feel free to reach out to me off list and we can discuss things.

Tom

On Mon, Jul 22, 2019 at 8:53 AM Adrin wrote:

> Hi,
>
> There is this [page](https://pandas.pydata.org/speed/scikit-learn/)
> maintained by some of the pandas maintainers (@TomAugspurger in
> particular), and it seems like a really good idea to have an eye on the
> performance of different benchmarks through time just in case a PR
> introduces some major drawbacks.
>
> However, he doesn't have the bandwidth to maintain it much more, and not
> really the hardware. I think it'd be a good idea for us to have that,
> wanted to bring it up and see what you think!
>
> Cheers,
> Adrin.

From niourf at gmail.com Mon Jul 22 10:14:10 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Mon, 22 Jul 2019 10:14:10 -0400
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To:
References:
Message-ID: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>

I agree that having benchmarks for non-regression would be very helpful. A seemingly simple change in Cython code can lead to a drastic performance drop.

I can't find it again, but I think Jérémie has submitted an issue about this?

On 7/22/19 9:59 AM, Tom Augspurger wrote:

> Thanks Adrin,
>
> A month or so ago I started running scikit-learn benchmarks, but I had
> to disable them since they were taking too long (longer than a day).
> I haven't had time to investigate why, but I assume it was an issue
> with how I set them up.
>
> Just FYI, I'm planning to include "maintain and improve the benchmark
> running tools" as part of the pandas' application for the CZI grant.
> All that is in https://github.com/asv-runner (a mix of Ansible,
> Airflow, and GitHub bots). If anyone is interested in (possibly)
> having funding
> to work on this, feel free to reach out to me off list and we can
> discuss things.
>
> Tom
>
> On Mon, Jul 22, 2019 at 8:53 AM Adrin wrote:
>
> Hi,
>
> There is this
> [page](https://pandas.pydata.org/speed/scikit-learn/) maintained
> by some of the pandas maintainers (@TomAugspurger in particular),
> and it seems like a really good idea to have an eye on the
> performance of different benchmarks through time just in case a PR
> introduces some major drawbacks.
>
> However, he doesn't have the bandwidth to maintain it much more,
> and not really the hardware. I think it'd be a good idea for us to
> have that, wanted to bring it up and see what you think!
>
> Cheers,
> Adrin.

From joel.nothman at gmail.com Mon Jul 22 20:12:34 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Tue, 23 Jul 2019 10:12:34 +1000
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
References: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
Message-ID:

Isn't Jérémie's project at https://github.com/jeremiedbb/scikit-learn_benchmarks meant to be doing this? What's its status? How does it relate to Tom's work?
(Can we please take http://scikit-learn.org/ml-benchmarks/ offline?)

On Tue, 23 Jul 2019 at 00:17, Nicolas Hug wrote:

> I agree having benchmarks for non regression would be very helpful. A
> seemingly simple change in Cython code can lead to drastic performance drop.
>
> I can't find it back but I think Jérémie has submitted an issue about this?
>
> On 7/22/19 9:59 AM, Tom Augspurger wrote:
>
> Thanks Adrin,
>
> A month or so ago I started running scikit-learn benchmarks, but I had to
> disable them since they were taking too long (longer than a day).
> I haven't had time to investigate why, but I assume it was an issue with
> how I set them up.
>
> Just FYI, I'm planning to include "maintain and improve the benchmark
> running tools" as part of the pandas' application for the CZI grant.
> All that is in https://github.com/asv-runner (a mix of Ansible, Airflow,
> and GitHub bots). If anyone is interested in (possibly) having funding
> to work on this, feel free to reach out to me off list and we can discuss
> things.
>
> Tom
>
> On Mon, Jul 22, 2019 at 8:53 AM Adrin wrote:
>
>> Hi,
>>
>> There is this [page](https://pandas.pydata.org/speed/scikit-learn/)
>> maintained by some of the pandas maintainers (@TomAugspurger in
>> particular), and it seems like a really good idea to have an eye on the
>> performance of different benchmarks through time just in case a PR
>> introduces some major drawbacks.
>>
>> However, he doesn't have the bandwidth to maintain it much more, and not
>> really the hardware. I think it'd be a good idea for us to have that,
>> wanted to bring it up and see what you think!
>>
>> Cheers,
>> Adrin.

From alexandre.gramfort at inria.fr Tue Jul 23 02:25:16 2019
From: alexandre.gramfort at inria.fr (Alexandre Gramfort)
Date: Tue, 23 Jul 2019 08:25:16 +0200
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To:
References: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
Message-ID:

It's the same project.

Alex

From jeremie.du-boisberranger at inria.fr Tue Jul 23 04:37:24 2019
From: jeremie.du-boisberranger at inria.fr (Jeremie du Boisberranger)
Date: Tue, 23 Jul 2019 10:37:24 +0200
Subject: [scikit-learn] Continues monitoring of benchmark performances
In-Reply-To:
References: <216082b3-39cd-c898-c6cf-236d52d3abaa@gmail.com>
Message-ID: <27b2229e-6840-99b2-e5bf-ace5d92017ee@inria.fr>

> Isn't Jérémie's project at https://github.com/jeremiedbb/scikit-learn_benchmarks meant to be doing this? What's its status? How does it relate to Tom's work?

Yes, it's the same project. Tom kindly agreed to run it alongside other projects from the PyData ecosystem.

> I can't find it back but I think Jérémie has submitted an issue about this?

I didn't submit an issue, but I briefly mentioned it at the Paris sprint in February.

> but I had to disable them since they were taking too long (longer than a day)

Something is definitely wrong. It should take 1-2 hours (it does on my laptop).
Before it was running on Tom's machine, we were considering running it on a dedicated machine at INRIA. Maybe it will be better to do that after all.

> What's its status?

After seeing it run for a few weeks, I think it still needs some more work. The results need a more readable presentation. Some benchmarks show large fluctuations; it might be the hardware, or maybe my settings are not good.

On 23/07/2019 08:25, Alexandre Gramfort wrote:

> it's the same projects.
>
> Alex

From adrin.jalali at gmail.com Tue Jul 23 07:51:43 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Tue, 23 Jul 2019 13:51:43 +0200
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID:

It may be worth doing a user survey to get a feeling for what people care about; we may or may not take the results into account afterwards.

Here's how Dask is doing it: https://github.com/dask/dask/issues/4748

On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller wrote:

> Hi all.
> At SciPy, Brian Granger raised a good point about their planning for the
> Jupyter Project, which is the importance of long-term goals.
>
> I think it's great that we now have a detailed short-term roadmap
> (https://scikit-learn.org/dev/roadmap.html).
> Given that we now have about 6(!) full time people (Oliver, Jeremy,
> Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
> it's realistic
> to achieve most of these within a year or two. We have actually made
> some significant progress already.
>
> I think now would be a good time to start thinking about a longer-term
> roadmap, say 3-5 years out.
> What do we want to achieve? What are realistic goals, and what are
> moonshot goals?
> Having a common vision and shared goals might help us with funding, but
> might also help us with prioritization and motivation.
>
> What do you think? Do you think this is important and worth-while?
> And what should our goals be?
>
> Best,
> Andy

From tom.augspurger88 at gmail.com Tue Jul 23 11:28:30 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 23 Jul 2019 10:28:30 -0500
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To:
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID:

Pandas will be running one soon too: https://github.com/pandas-dev/pandas/issues/27477

It may be worth coordinating on questions so that we can compare communities (or combining surveys to reduce "survey-fatigue" somehow? Haven't thought through this).

Tom

On Tue, Jul 23, 2019 at 6:54 AM Adrin wrote:

> It may be worth doing a user survey to get a feeling of what people care
> about, we may or may not take them into account afterwards.
>
> Here's how Dask is doing it: https://github.com/dask/dask/issues/4748
>
> On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller wrote:
>
>> Hi all.
>> At SciPy, Brian Granger raised a good point about their planning for the
>> Jupyter Project, which is the importance of long-term goals.
>>
>> I think it's great that we now have a detailed short-term roadmap
>> (https://scikit-learn.org/dev/roadmap.html).
>> Given that we now have about 6(!) full time people (Oliver, Jeremy,
>> Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
>> it's realistic
>> to achieve most of these within a year or two. We have actually made
>> some significant progress already.
>>
>> I think now would be a good time to start thinking about a longer-term
>> roadmap, say 3-5 years out.
>> What do we want to achieve? What are realistic goals, and what are
>> moonshot goals?
>> Having a common vision and shared goals might help us with funding, but
>> might also help us with prioritization and motivation.
>>
>> What do you think? Do you think this is important and worth-while?
>> And what should our goals be?
>>
>> Best,
>> Andy

From niedakh at gmail.com Tue Jul 23 11:36:24 2019
From: niedakh at gmail.com (Piotr Szymański)
Date: Tue, 23 Jul 2019 17:36:24 +0200
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID:

If I could pitch in, it would be lovely, very lovely indeed, if scikit-learn models could:

- operate on sparse data, both input and output by default
- implement some kind of sparse vector representation (as in https://github.com/scikit-learn/scikit-learn/issues/8908 )
- perhaps have a unifying numpy.array / scipy.sparse_matrix interface to give people some slack on jumping between [] operator conventions

We would benefit from that strongly in scikit-multilearn, as when a multi-output problem is transformed to a single-output problem based on unique combinations, this representation has to be dense for scikit-learn at the moment. We end up losing some speed there. I'm sure other libraries, e.g.
imbalanced-learn, or scikit-multiflow would also see these as a huge thing.

Best,
Piotr

On Sun, Jul 14, 2019 at 8:44 PM Andreas Mueller wrote:

> Hi all.
> At SciPy, Brian Granger raised a good point about their planning for the
> Jupyter Project, which is the importance of long-term goals.
>
> I think it's great that we now have a detailed short-term roadmap
> (https://scikit-learn.org/dev/roadmap.html).
> Given that we now have about 6(!) full time people (Oliver, Jeremy,
> Guillaume, Nicolas, Thomas, Adrin) on scikit-learn (GO TEAM!!), I think
> it's realistic
> to achieve most of these within a year or two. We have actually made
> some significant progress already.
>
> I think now would be a good time to start thinking about a longer-term
> roadmap, say 3-5 years out.
> What do we want to achieve? What are realistic goals, and what are
> moonshot goals?
> Having a common vision and shared goals might help us with funding, but
> might also help us with prioritization and motivation.
>
> What do you think? Do you think this is important and worth-while?
> And what should our goals be?
>
> Best,
> Andy

--
Piotr Szymański
niedakh at gmail.com

From t3kcit at gmail.com Tue Jul 23 11:45:38 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 23 Jul 2019 11:45:38 -0400
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To:
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID: <9b007d3f-33b6-f00b-f2e2-5f523cf36be8@gmail.com>

We had one done in 2013 (wow!). I'll post the link to the internal mailing list since it could have identifying information. Obviously the answers now would be quite different; I just thought it would be interesting to look at it again.

On 7/23/19 10:28 AM, Tom Augspurger wrote:
> Pandas will be running one soon too:
> https://github.com/pandas-dev/pandas/issues/27477
>
> It may be worth coordinating on questions so that we can compare
> communities (or combining surveys to reduce "survey-fatigue" somehow?
> Haven't thought through this).
>
> Tom
>
> On Tue, Jul 23, 2019 at 6:54 AM Adrin wrote:
>
> It may be worth doing a user survey to get a feeling of what
> people care about, we may or may not take them into account
> afterwards.
>
> Here's how Dask is doing it: https://github.com/dask/dask/issues/4748

From t3kcit at gmail.com Tue Jul 23 11:52:40 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 23 Jul 2019 11:52:40 -0400
Subject: [scikit-learn] Long term roadmap and moonshot goals
In-Reply-To:
References: <4a04ad85-7913-3d5a-5be2-98edb5b9c824@gmail.com>
Message-ID: <5e9ff930-7db6-67a6-2231-f59395f47e68@gmail.com>

Can you give an example? I imagine that just supporting the data structure will not give you any speed benefit unless the algorithms are reimplemented to take advantage of the problem structure. Even if the output of logistic regression would be a sparse binary vector, you'd still need to compute every entry, which would be the slow part.

On 7/23/19 10:36 AM, Piotr Szymański wrote:
> If I could pitch in, it would be lovely, very lovely indeed, if
> scikit-learn models could:
>
> - operate on sparse data, both input and output by default
> - implement some kind of sparse vector representation (as in
>   https://github.com/scikit-learn/scikit-learn/issues/8908 )
> - perhaps have a unifying numpy.array / scipy.sparse_matrix interface
>   to give people some slack on jumping between [] operator conventions
>
> We would benefit from that strongly in scikit-multilearn, as when a
> multi-output problem is transformed to a single-output problem based
> on unique combinations, this representation has to be dense for
> scikit-learn at the moment.
> We end up losing some speed there. I'm sure other libraries, e.g.
> imbalanced-learn or scikit-multiflow, would also see these as a huge
> thing.
>
> Best,
> Piotr
>
> --
> Piotr Szymański
> niedakh at gmail.com
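
A minimal sketch of the transformation Piotr describes in the message above, in which each unique combination of labels becomes one class of a single-output problem. The helper name label_powerset and the toy data are assumptions made for illustration; they are not taken from scikit-multilearn or scikit-learn. The labels start out as a scipy.sparse matrix, but the target that ends up being handed to a scikit-learn classifier is a dense 1-D array.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression


def label_powerset(Y):
    """Map each row of a sparse label-indicator matrix to one class id.

    Hypothetical helper for illustration only; not a scikit-learn API.
    """
    Y = sparse.csr_matrix(Y, copy=True)  # copy so the caller's matrix is untouched
    Y.eliminate_zeros()                  # drop explicitly stored zeros
    Y.sort_indices()                     # same label set -> same index order
    combos = {}                          # label combination -> class id
    y = np.empty(Y.shape[0], dtype=np.int64)
    for i in range(Y.shape[0]):
        # Column indices of the labels that are set in row i.
        key = tuple(Y.indices[Y.indptr[i]:Y.indptr[i + 1]].tolist())
        y[i] = combos.setdefault(key, len(combos))
    return y, combos


# Toy data: 4 samples, 2 features, 3 labels (all values are made up).
X = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]))
Y = sparse.csr_matrix(np.array([[1, 0, 1], [0, 1, 0], [1, 0, 1], [0, 0, 0]]))

y, combos = label_powerset(Y)          # y is a dense 1-D array of class ids

clf = LogisticRegression().fit(X, y)   # sparse X is accepted, y is dense
pred = clf.predict(X)                  # predictions come back as a dense array
```

The loop still visits every row, so a sparse container on its own does not make this step faster, which is in line with Andreas's remark that the algorithms themselves would have to exploit the structure.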

From gabor.toth at maximilianeum.de Tue Jul 23 12:25:55 2019
From: gabor.toth at maximilianeum.de (Gabor Toth)
Date: Tue, 23 Jul 2019 09:25:55 -0700
Subject: [scikit-learn] Random Forest without target to measure feature importance
Message-ID:

Hello,

I would like to use a Random Forest classifier to assess the importance of features (bag-of-words), but I don't have any predefined class labels or any test data. I previously used ExtraTreesClassifier() with fit_transform, which is no longer available (see below). I am wondering how I could use Random Forest now.

    clf = ExtraTreesClassifier()
    clf.fit_transform(doc_term_matrix, np.empty(doc_term_matrix.shape))
    features_importance = np.array(clf.feature_importances_)

Thanks,
Gabor

From glennmschultz at me.com Wed Jul 24 14:24:36 2019
From: glennmschultz at me.com (Glenn Schultz)
Date: Wed, 24 Jul 2019 14:24:36 -0400
Subject: [scikit-learn] question using GridSearchCV
Message-ID:

I am using GradientBoostingClassifier. The code below works if I use the default accuracy, but it fails using roc_auc or roc_auc_score. I have found many examples to work from, but for the life of me I can't get it to work with roc_auc. What am I doing wrong?

    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV
    from sklearn.metrics import auc, roc_auc_score

    y_train = LoansTrainData['event']
    x_train = LoansTrainData[LoansTrainData.columns.drop('event')]

    parameters = {'loss': ['deviance'],
                  'scoring': ['roc_auc'],
                  'learning_rate': [.1, .05]}

    searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters, verbose=3, n_jobs=4)
    searchCLF.fit(x_train, y_train)

From t3kcit at gmail.com Wed Jul 24 14:57:01 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 24 Jul 2019 14:57:01 -0400
Subject: [scikit-learn] question using GridSearchCV
In-Reply-To:
References:
Message-ID:

scoring is not a parameter of the estimator.
It needs to be passed to GridSearchCV itself instead of being listed in the parameter grid:

    searchCLF = GridSearchCV(GradientBoostingClassifier(), parameters,
                             verbose=3, n_jobs=4, scoring='roc_auc')

From glennmschultz at me.com Wed Jul 24 15:11:15 2019
From: glennmschultz at me.com (Glenn Schultz)
Date: Wed, 24 Jul 2019 15:11:15 -0400
Subject: [scikit-learn] question using GridSearchCV
In-Reply-To:
References:
Message-ID:

Thank you for answering ... makes sense now that you point it out.

From aiwern at gmail.com Thu Jul 25 14:16:21 2019
From: aiwern at gmail.com (Ai Wern)
Date: Thu, 25 Jul 2019 14:16:21 -0400
Subject: [scikit-learn] FINAL CALL for Papers: MICCAI 2019 Connectomics in NeuroImaging Workshop and Challenge
Message-ID:

**Apologies for cross posting**

Dear Colleagues,

This is a final call for full-length paper submissions to our *3rd International Workshop on Connectomics in NeuroImaging (CNI 2019)* and *Transfer-Learning CNI Challenge 2019*, held in parallel with the 22nd International Conference on Medical Image Computing and Computer-assisted Intervention (MICCAI 2019) in Shenzhen, China.

*Our deadline has been extended to Weds 31st July 2019*

**** CNI Workshop Call for Papers ****

Our topics of interest cover (but are not limited to):
(1) New developments in connectome construction from different imaging modalities;
(2) Development of data driven techniques to identify biomarkers in connectome data;
(3) Machine learning algorithms and connectome data analysis;
(4) Brain network modeling and formal conceptual models of connectome data;
(5) Evaluation and validation of connectome models.

If you have research that fits into the scope of our workshop detailed on our website (http://www.brainconnectivity.net/workshop), we encourage you to submit a paper.

**** CNI Call for Challengers ****

To address the issues of generalizability and clinical relevance for functional connectomes, you can leverage a unique resting-state fMRI (rsfMRI) dataset of attention deficit hyperactivity disorder (ADHD) and neurotypical controls (NC) to design a classification framework that can predict subject diagnosis (ADHD vs. NC) based on brain connectivity data. In a surprise twist, we will also evaluate the classification performance on a related clinical population with an ADHD comorbidity. This challenge will allow us to assess (1) whether the method is extracting functional connectivity patterns related to ADHD symptomatology, and (2) how much of this information "transfers" between clinical populations.

Training and validation data are now released: http://www.brainconnectivity.net/challenge

**** Why submit to the CNI Workshop and Challenge? ****

- Keynote talks by Prof Yong He (Beijing Normal University, China) and Dr. Fan Zhang (Harvard Medical School, USA);
- Oral presentations and poster sessions to provide you with ample opportunity for exchanges and discussions;
- Accepted papers will be published in LNCS proceedings;
- Best Paper and Poster Awards will be presented, and sponsored prizes for Challenge winners.

**** Important dates for CNI workshop ****

- Submission deadline: July 31st, 2019, 23:59 EST
- Notification of acceptance: August 13th, 2019
- Camera-ready deadline: August 18th, 2019, 23:59 EST
- Submission website: https://cmt3.research.microsoft.com/CNI2019

**** Important dates for CNI Challenge ****

- Submission deadline: August 15th, 2019, 23:59 EST
- Submission website: https://cmt3.research.microsoft.com/CNIChallenge2019

For more information, visit our website at http://www.brainconnectivity.net.

Best wishes,
CNI 2019 Organising Committee

From niourf at gmail.com Fri Jul 26 14:08:14 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Fri, 26 Jul 2019 14:08:14 -0400
Subject: [scikit-learn] Monthly meetings between core developers + "Hello World"
In-Reply-To:
References: <20190718100451.B608918C0090@webmail.sinamail.sina.com.cn>
Message-ID: <08716118-a3a8-0131-aeca-f97a8aba3f25@gmail.com>

Thanks everyone for your feedback!

Let's try to have a meeting on Monday 5th August, and then have meetings on the last Monday of the month? The following meeting would then be on August 26th.

For the time:
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2019&month=8&day=5&hour=13&min=0&sec=0&p1=240&p2=33&p3=37&p4=179
This one is convenient for NY and Europe, less so for Sydney / Beijing. We can have the next meeting accommodate Joel / Hanmin.

We can use Andy's appear.in: https://appear.in/amueller. I'm happy to (try to) "lead" the discussion this first time.

For logistics: I created a new project board, https://github.com/scikit-learn/scikit-learn/projects/15

I was thinking of having one column per meeting. A few days before the meeting, people can write down what they plan to discuss (one note per core-dev), so others can prepare. In particular, people who are not able to attend can leave details here (let us know in the notes!). One advantage of these boards is that they're searchable, we have a clear history of meetings, and it's easy to reference PRs/issues.

This is of course only a proposal, we can try it and see whether it works out ;)

@Chiara Welcome!! Thanks for offering to help! It didn't take long, so I took care of creating the board (also I would have felt bad for making you work while you only start in Sep).

Thanks,
Nicolas

On 7/22/19 9:57 AM, Adrin wrote:
> That's kinda what I meant.
> I didn't mean to limit the access to the project to @core-devs, I meant
> they can be pinged.
>
> On Mon, Jul 22, 2019 at 3:56 PM Andreas Mueller wrote:
>
> On 7/22/19 9:22 AM, Adrin wrote:
>> Awesome, excited to have your help around :)
>>
>> We already have the @core-devs team on github, we can use it more
>> often/more organized.
>
> Why wouldn't we just use the scikit-learn repo projects?
>
>> On Fri, Jul 19, 2019 at 2:48 PM Chiara Marmo wrote:
>>
>> Dear list,
>>
>> I'm Chiara. In September I will start to work full time for
>> the Scikit-Learn Consortium at INRIA (France). My background
>> is in Astronomy and Planetary Science: I've worked there as a
>> Research Engineer for around 15 years, writing code, mining
>> data and managing some projects.
>>
>> One of my tasks at the Consortium will be to take care of our
>> connection with the developer community, so let me know if I
>> can help in managing those monthly meetings in some way.
>> In the meantime, may I suggest creating a github team for
>> core developers in the scikit-learn organization? As
>> Alexandre said, team specific projects and discussions on
>> github could be a way to efficiently prepare meetings and
>> prioritize issues.
>>
>> Thanks for listening,
>> have a nice day.
>> Chiara

From joel.nothman at gmail.com Mon Jul 29 22:57:08 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Tue, 30 Jul 2019 12:57:08 +1000
Subject: [scikit-learn] ANN Scikit-learn 0.21.3 and 0.20.4 released
Message-ID:

We have released patch versions in the Scikit-learn 0.21 (Python >= 3.5) and 0.20 (Python 2 and 3) series, including several bug fixes.

See their respective change logs at
https://scikit-learn.org/dev/whats_new/v0.21.html#version-0-21-3
and
https://scikit-learn.org/dev/whats_new/v0.20.html#version-0-20-4.

Install them from PyPI or conda-forge:

* https://pypi.org/project/scikit-learn/0.21.3/
* https://pypi.org/project/scikit-learn/0.20.4/

Thanks to all who have contributed!

Also, work in progress: Scikit-learn 0.22 is in development, with lots of great new features to be released, hopefully towards the end of 2019. See the change log at https://scikit-learn.org/dev/whats_new/v0.22.html for some of the things coming your way, and try them out by installing the nightly build (see https://scikit-learn.org/dev/developers/advanced_installation.html#installing-nightly-builds ).

Happy learning!

the scikit-learn team

From charujing123 at 163.com Tue Jul 30 21:22:10 2019
From: charujing123 at 163.com (charujing123)
Date: Wed, 31 Jul 2019 09:22:10 +0800
Subject: [scikit-learn] cross-validated MANOVA
Message-ID: <21b11453.1152856.16c459eca56.Coremail.charujing123@163.com>

Dear experts and users,

Does anyone know how to perform cross-validated multivariate analysis of variance? This is the paper that mentions this method: "Searchlight-based multi-voxel pattern analysis of fMRI by cross-validated MANOVA".

Thanks.
Rujing

2019-07-31
charujing123