From olivier.grisel at ensta.org Mon Dec 2 07:57:06 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Mon, 2 Dec 2019 13:57:06 +0100
Subject: [scikit-learn] scikit-learn twitter account
In-Reply-To:
References: <1843A3F8-A6B1-4D6D-A5A4-6A2342A0D770@gmail.com> <20191122162239.k5rlri4qrpjvfma3@phare.normalesup.org>
Message-ID:

Alright, it seems that I can create twitter apps (and generate API
tokens) for the @sklearn_commits account. However,
https://github.com/filearts/tweethook does not work, as it relies on a
third-party webtask.io service that does not accept any new
subscriptions...

I am looking for an alternative way to do this but I am not sure how
to do so...

From niourf at gmail.com Mon Dec 2 09:43:21 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Mon, 2 Dec 2019 09:43:21 -0500
Subject: [scikit-learn] scikit-learn twitter account
In-Reply-To: <20191201031948.ziplxkoof2fxwo7w@phare.normalesup.org>
References: <1843A3F8-A6B1-4D6D-A5A4-6A2342A0D770@gmail.com> <20191122162239.k5rlri4qrpjvfma3@phare.normalesup.org> <45705fd4-994c-5b0b-63dd-ebed45c4549a@gmail.com> <20191201031948.ziplxkoof2fxwo7w@phare.normalesup.org>
Message-ID:

Agreed.

Regarding the release tweet, I think we could link to the highlights
rather than to the what's new.

On 11/30/19 10:19 PM, Gael Varoquaux wrote:
> Sounds good!
>
> As a side note, I hope that the scikit-learn twitter account can be
> something where we "ask for forgiveness rather than permission": the
> consequences of getting something wrong are lighter than when
> incorporating code in the library. Hopefully, this should enable us to
> keep the twitter account active while minimizing the amount of time spent
> on it.
>
> My 2 cents,
>
> Gaël
>
> On Sat, Nov 30, 2019 at 05:33:18PM -0500, Nicolas Hug wrote:
>> Adrin also proposed
>>     Hi there. We've repurposed this account and it will be used for
>>     scikit-learn related announcements. To follow day to day progress on the
>>     repo, please follow @sklearn_commits.
>> Both are fine with me.
>
>> For maximum reach, maybe we could:
>
>> 1. tweet the release announcement from @scikit_learn
>> 2. directly answer with the tweet indicating that we are re-purposing the
>>    account
>> 3. have everyone retweet the first tweet
>> Nicolas
>
>> On 11/25/19 12:23 PM, Olivier Grisel wrote:
>>     I have created the https://twitter.com/sklearn_commits twitter account.
>>     I have applied to make this account a "Twitter Developer" account to
>>     be able to use https://github.com/filearts/tweethook to register it as
>>     a webhook for the main scikit-learn github repo.
>>     Once ready, I will remove the old webhook currently registered on
>>     @scikit_learn account and would like to tweet about the transfer as
>>     drafted here:
>>     https://hackmd.io/@4rHCRgfySZSdd5eMtfUJiA/H1CSpuF2S/edit
>>     Please feel free to let me know if you have any comment / suggestion
>>     about this plan.
>
>
>> _______________________________________________
>> scikit-learn mailing list
>> scikit-learn at python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>

From olivier.grisel at ensta.org Mon Dec 2 10:17:11 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Mon, 2 Dec 2019 16:17:11 +0100
Subject: [scikit-learn] scikit-learn twitter account
In-Reply-To:
References: <1843A3F8-A6B1-4D6D-A5A4-6A2342A0D770@gmail.com> <20191122162239.k5rlri4qrpjvfma3@phare.normalesup.org>
Message-ID:

It might actually be possible to use GitHub Actions with
https://github.com/xorilog/twitter-action for instance. I will give
it a try with a test repo.

--
Olivier

From olivier.grisel at ensta.org Mon Dec 2 13:00:23 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Mon, 2 Dec 2019 19:00:23 +0100
Subject: [scikit-learn] scikit-learn twitter account
In-Reply-To:
References: <1843A3F8-A6B1-4D6D-A5A4-6A2342A0D770@gmail.com> <20191122162239.k5rlri4qrpjvfma3@phare.normalesup.org>
Message-ID:

Alright, I have configured the new GitHub Action for the tweets on
@sklearn_commits:

https://github.com/scikit-learn/scikit-learn/pull/15758

I tested it from my repo and it worked fine (I deleted the test tweet
though).

We can do the switch as soon as this PR is merged.

--
Olivier

From t3kcit at gmail.com Mon Dec 2 15:34:25 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Mon, 2 Dec 2019 15:34:25 -0500
Subject: [scikit-learn] SVM-RFE
In-Reply-To:
References:
Message-ID: <8eeb4fa1-3b57-736b-4cc8-488133f711b8@gmail.com>

RFECV does provide the ranking of features in the ranking_ attribute, and
it provides the cross-validation accuracies for all subsets in
grid_scores_. It doesn't provide the feature weights for all subsets, but
that's something that would be easy to add if it's desired.

On 11/25/19 10:50 AM, Malik Yousef wrote:
> It does not provide access for tracing the step by step feature
> weights and predictive ability. The user provides the n_feature.
>
> Malik
>
> ---------------------------------------------------------------------
> Prof. Malik Yousef (Associate Professor)
> The Head of the Galilee Digital Health Research Center (GDH)
> Zefat Academic College, Department of Information System
> Home Page:
> https://malikyousef.com/
> Google Scholar Profile:
> https://scholar.google.com/citations?user=9UCZ_q4AAAAJ&hl=en&oi=ao
> ---------------------------------------------------------------------
>
>
> On Mon, Nov 25, 2019 at 1:36 PM Brown J.B. via scikit-learn wrote:
>
>     2019年11月23日(土)
>     2:12 Andreas Mueller:
>
>         I think you can also use RFECV directly without doing any
>         wrapping.
>
>>         Your request to do performance checking of the steps of
>>         SVM-RFE is a pretty common task.
>
>
>     Yes, RFECV works well (and I should know as an appreciative
>     long-time user ;-) ), but does it actually provide a mechanism
>     (accessors) for tracing the step by step feature weights and
>     predictive ability as the features are continually reduced?
>     (Or perhaps it's because I'm looking at 0.20.1 and 0.21.2
>     documentation...?)
>
>     J.B.

From olivier.grisel at ensta.org Tue Dec 3 04:25:29 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Tue, 3 Dec 2019 10:25:29 +0100
Subject: [scikit-learn] scikit-learn twitter account
In-Reply-To:
References: <1843A3F8-A6B1-4D6D-A5A4-6A2342A0D770@gmail.com> <20191122162239.k5rlri4qrpjvfma3@phare.normalesup.org>
Message-ID:

Ok the twitter accounts are now switched:

https://twitter.com/scikit_learn/status/1201794032650932224

The notifications for commits pushed to master are live:

https://twitter.com/sklearn_commits

Ready for the release :)

--
Olivier

From adrin.jalali at gmail.com Tue Dec 3 07:50:46 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Tue, 3 Dec 2019 13:50:46 +0100
Subject: [scikit-learn] ANN: scikit-learn 0.22 final release
Message-ID:

We're happy to announce the 0.22 release.
You can read the release highlights under
https://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_0_22_0.html
and the long version of the change log under
https://scikit-learn.org/stable/whats_new/v0.22.html#changes-0-22.

This version supports Python versions 3.5 to 3.8. You can
give it a go using `pip install -U scikit-learn` while
conda and conda-forge binaries are coming.

Regards,
Adrin, on behalf of the scikit-learn maintainer team.

From t3kcit at gmail.com Tue Dec 3 15:48:48 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 3 Dec 2019 15:48:48 -0500
Subject: [scikit-learn] ANN: scikit-learn 0.22 final release
In-Reply-To:
References:
Message-ID:

Awesome! Thank you for all the work on the release! This is a big one!
Are we tweeting with the repurposed twitter account?

Andy

From niourf at gmail.com Tue Dec 3 17:09:12 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Tue, 3 Dec 2019 17:09:12 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
Message-ID: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>

As per our Governance document, changes to API principles are to be
established through an Enhancement Proposal (SLEP) from which any core
developer can call for a vote on its acceptance.

SLEP010: n_features_in_ attribute is up for a vote. Please see
https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep010/proposal.html

This SLEP proposes the introduction of a public n_features_in_
attribute for most estimators.

Core developers are invited to vote on this change until 4 January 2020
by replying to this email thread.

All members of the community are welcome to comment on the proposal on
this mailing list, or to propose minor changes through Issues and Pull
Requests at https://github.com/scikit-learn/enhancement_proposals/.

From t3kcit at gmail.com Tue Dec 3 17:27:35 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Tue, 3 Dec 2019 17:27:35 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID: <9993b65c-56f2-ae35-d2a8-8a71449f50e4@gmail.com>

+1

From adrin.jalali at gmail.com Tue Dec 3 17:40:14 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Tue, 3 Dec 2019 23:40:14 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To: <9993b65c-56f2-ae35-d2a8-8a71449f50e4@gmail.com>
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com> <9993b65c-56f2-ae35-d2a8-8a71449f50e4@gmail.com>
Message-ID:

+1

From godefroi.catherine at gmail.com Tue Dec 3 17:43:05 2019
From: godefroi.catherine at gmail.com (Compte.validation)
Date: Tue, 3 Dec 2019 23:43:05 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com> <9993b65c-56f2-ae35-d2a8-8a71449f50e4@gmail.com>
Message-ID:

STOP. I DON'T WANT TO BE MAILED AGAIN

From niourf at gmail.com Tue Dec 3 17:58:44 2019
From: niourf at gmail.com (Nicolas Hug)
Date: Tue, 3 Dec 2019 17:58:44 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com> <9993b65c-56f2-ae35-d2a8-8a71449f50e4@gmail.com>
Message-ID:

+1

From jbbrown at kuhp.kyoto-u.ac.jp Tue Dec 3 23:02:28 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Wed, 4 Dec 2019 13:02:28 +0900
Subject: [scikit-learn] SVM-RFE
In-Reply-To: <8eeb4fa1-3b57-736b-4cc8-488133f711b8@gmail.com>
References: <8eeb4fa1-3b57-736b-4cc8-488133f711b8@gmail.com>
Message-ID:

2019年12月3日(火) 5:36 Andreas Mueller:

> It does provide the ranking of features in the ranking_ attribute and it
> provides the cross-validation accuracies for all subsets in grid_scores_.
> It doesn't provide the feature weights for all subsets, but that's
> something that would be easy to add if it's desired.
>

I would guess that there is some population of the user base that would
like to track the per-iteration feature weights.

It would appear to me that a straightforward (un-optimized) implementation
would be to place a NaN value for a feature once it is eliminated, so that
a numpy.ndarray can be returned and immediately dumped to matplotlib's
pcolormesh or other visualization routines in various libraries.

Just an idea.

J.B.

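J.B.'s suggestion above can be sketched without touching scikit-learn
internals. What follows is a minimal, hypothetical illustration of the
bookkeeping only: `score_features` and `eliminate_with_history` are names
invented for this note, and the per-feature score (absolute covariance with
the target) is an assumed stand-in for SVM coefficients, not what RFE or
RFECV actually compute. It uses only the Python standard library, so the
result is a list of rows rather than a numpy.ndarray.

```python
import math


def score_features(X, y, active):
    """Stand-in importance: |covariance| of each active column with y.

    Hypothetical helper; a real SVM-RFE would use the absolute
    coefficients of an SVM fitted on the active columns instead.
    """
    n = len(X)
    y_mean = sum(y) / n
    scores = {}
    for j in active:
        col_mean = sum(row[j] for row in X) / n
        cov = sum((row[j] - col_mean) * (yi - y_mean)
                  for row, yi in zip(X, y)) / n
        scores[j] = abs(cov)
    return scores


def eliminate_with_history(X, y, n_features_to_select=1):
    """Drop the weakest feature each round, keeping every round's weights.

    Returns one row per round; eliminated features hold NaN, so the
    table stays rectangular and can be handed to a heat-map routine.
    """
    n_features = len(X[0])
    active = list(range(n_features))
    history = []
    while True:
        scores = score_features(X, y, active)
        # NaN marks features that were removed in an earlier round.
        history.append([scores.get(j, math.nan) for j in range(n_features)])
        if len(active) <= n_features_to_select:
            return history
        weakest = min(active, key=lambda j: scores[j])
        active.remove(weakest)


# Toy data: 4 samples, 3 features.
X = [[0.0, 1.0, 5.0], [1.0, 1.5, 3.0], [2.0, 0.5, 4.0], [3.0, 1.0, 6.0]]
y = [0.0, 1.0, 2.0, 3.0]
history = eliminate_with_history(X, y)
for round_weights in history:
    print(round_weights)
```

Each row of `history` corresponds to one elimination round, so wrapping it in
numpy.array(history) would give the rounds-by-features array described in the
message.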
From gael.varoquaux at normalesup.org Tue Dec 3 23:32:26 2019
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 3 Dec 2019 23:32:26 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com> <9993b65c-56f2-ae35-d2a8-8a71449f50e4@gmail.com>
Message-ID: <20191204043226.n5lwnru7sf7wpibd@phare.normalesup.org>

+1. Great job!

Gaël

--
    Gael Varoquaux
    Research Director, INRIA
    Visiting professor, McGill
    http://gael-varoquaux.info
    http://twitter.com/GaelVaroquaux

From qinhanmin2005 at sina.com Wed Dec 4 02:45:59 2019
From: qinhanmin2005 at sina.com (Hanmin Qin)
Date: Wed, 04 Dec 2019 15:45:59 +0800
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
Message-ID: <20191204074600.1DEEE414009C@webmail.sinamail.sina.com.cn>

+1 (it seems that users are unhappy because we vote on the mailing list?)

Hanmin Qin

From ahowe42 at gmail.com Wed Dec 4 04:19:55 2019
From: ahowe42 at gmail.com (Andrew Howe)
Date: Wed, 4 Dec 2019 09:19:55 +0000
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

Excellent idea.

<~~~~~~~~~~~~~~~~~~~~~~~~~~~>
J.
Andrew Howe, PhD
LinkedIn Profile
ResearchGate Profile
Open Researcher and Contributor ID (ORCID)
Github Profile
Personal Website
I live to learn, so I can learn to live. - me
<~~~~~~~~~~~~~~~~~~~~~~~~~~~>

From trev.stephens at gmail.com Wed Dec 4 04:41:36 2019
From: trev.stephens at gmail.com (Trevor Stephens)
Date: Wed, 4 Dec 2019 20:41:36 +1100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

Not a core contributor, but what's with the "_in" bit? Just go with a
public n_features, which I think a bunch of estimators already have? Feels
clumsy.

From ahowe42 at gmail.com Wed Dec 4 04:43:42 2019
From: ahowe42 at gmail.com (Andrew Howe)
Date: Wed, 4 Dec 2019 09:43:42 +0000
Subject: [scikit-learn] ANN: scikit-learn 0.22 final release
In-Reply-To:
References:
Message-ID:

This is an excellent release with some very cool new features! I'm quite
chuffed about the stacked estimators especially. Great job team!

Scikit-learn is incredibly well-supported and tremendously full-featured.
I have to ask; why is it still in beta?

Andrew

From ahowe42 at gmail.com Wed Dec 4 04:45:38 2019
From: ahowe42 at gmail.com (Andrew Howe)
Date: Wed, 4 Dec 2019 09:45:38 +0000
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

Perhaps because some sklearn objects take R features in (n_features_in)
but then produce S features out (n_features_out), where S != R.

Andrew

From joel.nothman at gmail.com Wed Dec 4 04:51:11 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Wed, 4 Dec 2019 20:51:11 +1100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

We are looking to have n_features_out_ for transformers. This naming makes
the difference explicit.

I would like to see some guidance on how an estimator implementation (e.g.
in scikit-learn-contrib) is advised to maintain compatibility with
Scikit-learn pre- and post-SLEP010.

That is, we want to encourage developers to take advantage of
super()._validate_data(X, y), but we also don't want to force them to set a
minimal Scikit-learn >= 0.23 dependency (or do we?). What's the recommended
way to implement fit and predict in such an implementation?

Is it to
(a) not use _validate_data until the minimal dependency is reached?
(b) implement a patched BaseEstimator in the library which inherits from
    Scikit-learn's BaseEstimator and adds _validate_data?
(c) something else?

Joel

From adrin.jalali at gmail.com Wed Dec 4 04:59:07 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Wed, 4 Dec 2019 10:59:07 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

As far as I remember, one idea is to have a deprecation cycle with a
FutureWarning in check_estimator, to give third-party developers time to
implement the new API.

> > That is, we want to encourage developers to take advantage of > super()._validate_data(X, y), but we also don't want to force them to set a > minimal Scikit-learn >= 0.23 dependency (or do we?). What's the recommended > way to do implement fit and predict in such an implementation? > > Is it to > (a) not use _validate_data until the minimal dependency is reached? > (b) implement a patched BaseEstimator in the library which inherits from > Scikit-learn's BaseEstimator and adds _validate_data? > (c) something else? > > > Joel > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel.nothman at gmail.com Wed Dec 4 05:01:57 2019 From: joel.nothman at gmail.com (Joel Nothman) Date: Wed, 4 Dec 2019 21:01:57 +1100 Subject: [scikit-learn] ANN: scikit-learn 0.22 final release In-Reply-To: References: Message-ID: The stacked estimators was certainly a team effort! I am excited that we've finally got a consistent solution to using approximate nearest neighbors with our neighbors-based learners. Why is it still version <1? Perhaps it shouldn't be. But it can be hard to set aside perfectionism! And there's so much on the roadmap ( https://scikit-learn.org/stable/roadmap.html). But perhaps you've got a point. On Wed, 4 Dec 2019 at 20:45, Andrew Howe wrote: > This is an excellent release with some very cool new features! I'm quite > chuffed about the stacked estimators especially. Great job team! > > Scikit-learn is incredibly well-supported and tremendously full-featured. > I have to ask; why is it still in beta? > > Andrew > > <~~~~~~~~~~~~~~~~~~~~~~~~~~~> > J. Andrew Howe, PhD > LinkedIn Profile > ResearchGate Profile > Open Researcher and Contributor ID (ORCID) > > Github Profile > Personal Website > I live to learn, so I can learn to live. 
- me > <~~~~~~~~~~~~~~~~~~~~~~~~~~~> > > > On Tue, Dec 3, 2019 at 12:53 PM Adrin wrote: > >> We're happy to announce the 0.22 release. You can read >> the release highlights under >> https://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_0_22_0.html >> and the long version of the change log under >> https://scikit-learn.org/stable/whats_new/v0.22.html#changes-0-22. >> >> This version supports Python versions 3.5 to 3.8. You can >> give it a go using `pip install -U scikit-learn` while >> conda and conda forge binaries are coming. >> >> Regards, >> Adrin, on behalf of the scikit-learn maintainer team. >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joel.nothman at gmail.com Wed Dec 4 05:03:56 2019 From: joel.nothman at gmail.com (Joel Nothman) Date: Wed, 4 Dec 2019 21:03:56 +1100 Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute In-Reply-To: References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com> Message-ID: Oh... I remember what we landed up on, actually... we've made _validate_data private so downstream estimators can't technically expect to use it reliably across any versions... -------------- next part -------------- An HTML attachment was scrubbed... URL: From trev.stephens at gmail.com Wed Dec 4 05:05:17 2019 From: trev.stephens at gmail.com (Trevor Stephens) Date: Wed, 4 Dec 2019 21:05:17 +1100 Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute In-Reply-To: References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com> Message-ID: Makes sense Joel, wasn't mentioned in the docs, so was a bit strange. 
Still feels a bit weird but I'm sure I'll adapt_in and thrive_out.

Downstream projectwise, I'm happy to bounce my dependencies up whenever
necessary. Always nice to support old versions of sklearn, but not at the
expense of spaghetti code from my perspective, whatever that's worth.

Might be a bit more prickly for projects still trying to support Py2.x
though?

From ahowe42 at gmail.com Wed Dec 4 05:19:30 2019
From: ahowe42 at gmail.com (Andrew Howe)
Date: Wed, 4 Dec 2019 10:19:30 +0000
Subject: [scikit-learn] Version 0.21! and plot_tree!
In-Reply-To: <428bcef1-fb27-fd98-7d7a-13d33e7fb8c6@gmail.com>
References: <428bcef1-fb27-fd98-7d7a-13d33e7fb8c6@gmail.com>
Message-ID:

Hi Andy

I've been playing around with plot_tree for a while (clearly), and have
some feedback finally. I'm not very concerned with the compactness of the
tree.
However, for large trees, it's not very easy to inspect or traverse.
I think it could be very useful to add the following ways to
*slice-and-dice* the tree:

- plot_tree_subtree_node - plots only the portion of the tree that
  could be accessed by traversing downwards from the specified node
- plot_tree_subtree_class - plots the entire tree, highlighting all
  the traversals that lead to a specific class; other class leaf / branch
  nodes could be shrunk to save space

Also, I have noted on ver 0.21.3 that the rotate argument does not seem to
be working in either jupyter lab or ipython, though this seems like a known
issue.

Andrew

On Thu, May 23, 2019 at 4:24 PM Andreas Mueller wrote:

> Hey Andrew.
> Thanks for saying thanks!
> I share your frustration with export_graphviz, in particular for teaching.
> I feel like plot_tree is not ideal yet, though. In particular the layout
> is not as compact as the graphviz one.
> If you have any feedback or suggestions, I'd be very happy to hear them!
>
> Cheers,
> Andy
>
>
> On 5/23/19 10:39 AM, Andrew Howe wrote:
>
>> I want to say thank you to all the sklearn developers. The breadth and
>> quality of this software is truly breathtaking.
>>
>> Specifically, I want to say thank you very very much for the plot_tree
>> function! I have wasted a lot of effort in the past, on multiple OSes,
>> getting everything to work so I could view the tree.export_graphviz
>> results. Having this new function to plot the trees natively in matplotlib
>> is extremely useful.
>>
>> Thanks again!
>> Andrew

From adrin.jalali at gmail.com Wed Dec 4 05:38:37 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Wed, 4 Dec 2019 11:38:37 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

I really don't think Py2.x should be our concern anymore. The kind of
spaghetti code you mention is somewhat the nature of supporting multiple
versions of a dependency library. We do have similar code in our code base
which deals with different versions of our own dependencies.

From ahowe42 at gmail.com Wed Dec 4 06:19:56 2019
From: ahowe42 at gmail.com (Andrew Howe)
Date: Wed, 4 Dec 2019 11:19:56 +0000
Subject: [scikit-learn] ANN: scikit-learn 0.22 final release
In-Reply-To:
References:
Message-ID:

That is an impressive roadmap, and I certainly applaud the desire for
perfection. That said, I feel that it is past time to bring sklearn out of
beta. Most of what's on the roadmap looks like it would fit quite well into
continuing development of a "stable" package, with no (or at least few)
backwards-compatibility issues.

Just my 2 cents.

Andrew

From t3kcit at gmail.com Wed Dec 4 10:54:53 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 4 Dec 2019 10:54:53 -0500
Subject: [scikit-learn] ANN: scikit-learn 0.22 final release
In-Reply-To:
References:
Message-ID: <71fcf056-1212-b1b8-cc02-53c6171d1656@gmail.com>

Maybe we can discuss this in
https://github.com/scikit-learn/scikit-learn/issues/14386 ?

I think I have come to agree that we should just do 1.0, and if we want to
make any big changes, that should be 2.0.

From t3kcit at gmail.com Wed Dec 4 10:59:33 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 4 Dec 2019 10:59:33 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID: <70286661-16d2-4863-9ff0-0681a5d09394@gmail.com>

I agree / recall that that was what we settled on. So a) but even more
conservative ;)

On 12/4/19 5:03 AM, Joel Nothman wrote:
> Oh... I remember what we landed up on, actually... we've made
> _validate_data private so downstream estimators can't technically
> expect to use it reliably across any versions...
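For readers following the SLEP010 thread, here is a minimal sketch of what option (a) could look like in a third-party estimator: the class sets and checks n_features_in_ by hand instead of calling the private _validate_data helper. It uses plain Python lists and does not import scikit-learn at all; the class name MeanRegressor and the helper _check_n_features are hypothetical and exist only for illustration.

```python
class MeanRegressor:
    """Toy estimator that always predicts the mean of the training targets.

    Hypothetical example: it follows the n_features_in_ convention from
    SLEP010 by hand instead of relying on the private _validate_data.
    """

    def fit(self, X, y):
        X = [list(row) for row in X]
        if not X:
            raise ValueError("X must contain at least one sample")
        n_features = len(X[0])
        if any(len(row) != n_features for row in X):
            raise ValueError("all rows of X must have the same length")
        if len(X) != len(y):
            raise ValueError("X and y must have the same number of samples")
        # Public attribute proposed by SLEP010: number of features seen in fit.
        self.n_features_in_ = n_features
        self.mean_ = sum(y) / len(y)
        return self

    def _check_n_features(self, X):
        # Hypothetical helper: reject input whose width differs from fit time.
        for row in X:
            if len(row) != self.n_features_in_:
                raise ValueError(
                    "X has %d features, but this estimator was fitted with %d"
                    % (len(row), self.n_features_in_)
                )

    def predict(self, X):
        if not hasattr(self, "n_features_in_"):
            raise RuntimeError("call fit before predict")
        X = [list(row) for row in X]
        self._check_n_features(X)
        return [self.mean_ for _ in X]


est = MeanRegressor().fit([[0.0, 1.0], [2.0, 3.0]], [1.0, 3.0])
print(est.n_features_in_)          # number of columns seen during fit
print(est.predict([[5.0, 6.0]]))   # same width as in fit, so this is accepted
```

Because nothing here depends on a particular scikit-learn release, the same pattern would behave identically before and after SLEP010; the trade-off is that the validation logic is duplicated in the downstream project.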
From t3kcit at gmail.com Wed Dec 4 11:05:57 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 4 Dec 2019 11:05:57 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID: <72bb293c-adb1-00e7-8b41-b063cdd704d3@gmail.com>

On 12/4/19 5:05 AM, Trevor Stephens wrote:
> Makes sense Joel, wasn't mentioned in the docs, so was a bit strange.
> Still feels a bit weird but I'm sure I'll adapt_in and thrive_out.
>
Indeed, and as Joel said, we'll have n_features_out_ added soon.
Having both is quite helpful in many situations.
The naming is also meant to be analogous to the future "feature_names_in_"
and "feature_names_out_" attributes.
Right now we have "get_feature_names()", which actually refers to the
output features.
That's a whole lot of new attributes, but after quite a lot of deliberation
that's the solution we came up with, as there were major flaws in all
other proposals.

The SLEP for that is being rewritten right now. There's some conversation in
https://github.com/scikit-learn/enhancement_proposals/pull/18
but the document doesn't reflect the current consensus.

Andy

From t3kcit at gmail.com Wed Dec 4 11:12:20 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 4 Dec 2019 11:12:20 -0500
Subject: [scikit-learn] Version 0.21! and plot_tree!
In-Reply-To:
References: <428bcef1-fb27-fd98-7d7a-13d33e7fb8c6@gmail.com>
Message-ID:

Hey Andrew.
Thanks for your feedback!

On 12/4/19 5:19 AM, Andrew Howe wrote:
> Hi Andy
>
> I've been playing around with plot_tree for a while (clearly), and
> have some feedback finally. I'm not very concerned with the
> compactness of the tree. However, for large trees, it's not very easy
> to inspect or traverse.
> I think it could be very useful to add the
> following ways to /slice-and-dice/ the tree:
>
> * plot_tree_subtree_node - plots only the portion of the tree that
>   could be accessed by traversing downwards from the specified node
> * plot_tree_subtree_class - plots the entire tree, highlighting all
>   the traversals that lead to a specific class, other class leaf /
>   branch nodes could be shrunk to save space
>
Somehow I feel like an interactive visualization would be more useful
for that. Don't you think?
There are tree exploration tools that we could run in jupyter.
If we can do it with just CSS (which is actually reasonably plausible),
then we could ship this with scikit-learn.
That would be completely orthogonal to the matplotlib-based code, though.
I had thought recently about whether it might make sense to do an
HTML-based tree visualization, and thought it might actually be nicer
than the matplotlib one.
Thoughts?

> Also, I have noted on ver 0.21.3 that the rotate argument does not
> seem to be working in either jupyter lab or ipython, though this seems
> like a known issue.
There seems to be a recent issue:
https://github.com/scikit-learn/scikit-learn/issues/15694

Is that a feature you really want? That's just me copying something I
didn't mean to copy, and it's not implemented at all.
We could implement it, but I was leaning towards just deleting it.

Cheers,
Andy

From t3kcit at gmail.com Wed Dec 4 11:12:58 2019
From: t3kcit at gmail.com (Andreas Mueller)
Date: Wed, 4 Dec 2019 11:12:58 -0500
Subject: [scikit-learn] SVM-RFE
In-Reply-To:
References: <8eeb4fa1-3b57-736b-4cc8-488133f711b8@gmail.com>
Message-ID: <8d8d7966-c8d0-3abf-4f10-7ac708f001dc@gmail.com>

PR welcome ;)

On 12/3/19 11:02 PM, Brown J.B. via scikit-learn wrote:
> On Tue, Dec 3, 2019 at 5:36, Andreas Mueller wrote:
>
>> It does provide the ranking of features in the ranking_ attribute
>> and it provides the cross-validation accuracies for all subsets in
>> grid_scores_.
>> It doesn't provide the feature weights for all subsets, but that's
>> something that would be easy to add if it's desired.
>
> I would guess that there is some population of the user base that
> would like to track the per-iteration feature weights.
> It would appear to me that a straightforward (un-optimized)
> implementation would be to place a NaN value for a feature once it is
> eliminated, so that a numpy.ndarray can be returned and immediately
> dumped to matplotlib.pcolormesh or other visualization routines in
> various libraries.
>
> Just an idea.
>
> J.B.

From joel.nothman at gmail.com Wed Dec 4 14:44:36 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Thu, 5 Dec 2019 06:44:36 +1100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

I am +1 for this, but I think we should look at how to make these new
validation methods usable by external developers, ideally supporting
multiple Scikit-learn versions (i.e. we need something in the stable public
or protected API). A simple solution is to make default implementations of
validate_data available as public helpers.

From jbbrown at kuhp.kyoto-u.ac.jp Thu Dec 5 00:14:35 2019
From: jbbrown at kuhp.kyoto-u.ac.jp (Brown J.B.)
Date: Thu, 5 Dec 2019 14:14:35 +0900
Subject: [scikit-learn] SVM-RFE
In-Reply-To: <8d8d7966-c8d0-3abf-4f10-7ac708f001dc@gmail.com>
References: <8eeb4fa1-3b57-736b-4cc8-488133f711b8@gmail.com> <8d8d7966-c8d0-3abf-4f10-7ac708f001dc@gmail.com>
Message-ID:

I certainly am guilty of only commenting in the mailing list and not
engaging more via GitHub! :)
(Much like many of you PIs on this list, the typical
ActualWork-GrantWriting-ReportWriting-InvitedLectures-RealLifeParenting
cycle eats the day away.)

While I've failed previously to get involved after showing interest, let's
see if I can't actually succeed for once.

On Thu, Dec 5, 2019 at 1:14, Andreas Mueller wrote:

> PR welcome ;)
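The idea from this thread of recording per-iteration feature weights, with NaN marking eliminated features, could be sketched roughly as follows. This is not scikit-learn's RFE implementation: the function rfe_weight_history and the stand-in weight function abs_covariance are hypothetical names, and a real SVM-RFE would refit a linear SVM at each step and use its coefficients rather than a covariance.

```python
import math


def abs_covariance(X, y, active):
    """Stand-in for model weights: |cov(feature, y)| for each active feature."""
    n = len(X)
    y_mean = sum(y) / n
    weights = {}
    for j in active:
        col_mean = sum(row[j] for row in X) / n
        cov = sum((row[j] - col_mean) * (yi - y_mean) for row, yi in zip(X, y)) / n
        weights[j] = abs(cov)
    return weights


def rfe_weight_history(X, y, weight_fn=abs_covariance, n_features_to_select=1):
    """Drop the weakest feature each round; return one row of weights per round.

    Eliminated features are recorded as NaN so the result stays rectangular.
    """
    n_features = len(X[0])
    active = list(range(n_features))
    history = []
    while True:
        weights = weight_fn(X, y, active)
        history.append([weights.get(j, math.nan) for j in range(n_features)])
        if len(active) <= n_features_to_select:
            break
        # Remove the feature with the smallest weight in this round.
        active.remove(min(active, key=lambda j: weights[j]))
    return history


X = [[1.0, 0.0, 5.0], [2.0, 0.1, 3.0], [3.0, 0.0, 4.0], [4.0, 0.1, 1.0]]
y = [1.0, 2.0, 3.0, 4.0]
history = rfe_weight_history(X, y)
for row in history:
    print(row)
```

Rows are elimination rounds and columns are features, so the nested list could be converted to an array and handed to a heatmap routine such as pcolormesh, as suggested above.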
URL: From marmochiaskl at gmail.com Thu Dec 5 04:56:54 2019 From: marmochiaskl at gmail.com (Chiara Marmo) Date: Thu, 5 Dec 2019 10:56:54 +0100 Subject: [scikit-learn] Update of 'Upcoming events' on the scikit-learn wiki Message-ID: Dear core-devs, I would like to advertise about our Paris sprint (end of January) on the scikit-learn wiki. If there are no objections, my goal is to make a list of events in the 'Upcoming events' page [1] and link the event pages from there. This will allow to link also events organised by other entities (like WiMLDS) even if pages are not hosted there. Please, let me know if you all are ok with that. Thanks for listening, best Chiara [1] https://github.com/scikit-learn/scikit-learn/wiki/Upcoming-events -------------- next part -------------- An HTML attachment was scrubbed... URL: From ahowe42 at gmail.com Thu Dec 5 06:39:04 2019 From: ahowe42 at gmail.com (Andrew Howe) Date: Thu, 5 Dec 2019 11:39:04 +0000 Subject: [scikit-learn] Version 0.21! and plot_tree! In-Reply-To: References: <428bcef1-fb27-fd98-7d7a-13d33e7fb8c6@gmail.com> Message-ID: Andy To be honest, I've kind of outgrown matplotlib, and do all my viz with plotly. It's much less mature of a package, and the documentation seems to leave much to be desired. However, the interactivity and html/javascript basis is worth that trade-off for me. So yes, ultimately, I'd like to see tree_plot generate tree visualizations with which the user can interact. Same for all sklearn plots, really. The tree rotation is not something I want - I'd never found it useful in other applications. Just figured I would mention it as feedback. I'd go with you on deleting the arg. Andrew <~~~~~~~~~~~~~~~~~~~~~~~~~~~> J. Andrew Howe, PhD LinkedIn Profile ResearchGate Profile Open Researcher and Contributor ID (ORCID) Github Profile Personal Website I live to learn, so I can learn to live. - me <~~~~~~~~~~~~~~~~~~~~~~~~~~~> On Wed, Dec 4, 2019 at 4:14 PM Andreas Mueller wrote: > Hey Andrew. 
> Thanks for your feedback! > > On 12/4/19 5:19 AM, Andrew Howe wrote: > > Hi Andy > > I've been playing around with plot_tree for a while (clearly), and have > some feedback finally. I'm not very concerned with the compactness of the > tree. However, for large trees, it's not very easy to inspect or traverse. > I think it could be very useful to add the following ways to > *slice-and-dice* the tree: > > - plot_tree_subtree_node - plots only the portion of the tree that > could be accessed by traversing downwards from the specified node > - plot_tree_subtree_class - plots the entire tree, highlighting all > the traversals that lead to a specific class, other class leaf / branch > nodes could be shrunk to save space > > Somehow I feel like an interactive visualization would be more useful for > that. Don't you think? > There are tree exploration tools that we could run in jupyter. > If we can do it with just CSS (which is actually reasonably plausible), > then we could ship this with scikit-learn. > That would be completely orthogonal to the matplotlib based code, though. > I had thought about whether it might make sense to do a html based tree > visualization recently and thought it might actually be nicer than the > matplotlib one. > Thoughts? > > Also, I have noted on ver 0.21.3 that the rotate argument does not seem to > be working in either jupyter lab or ipython, though this seems like a known > issue. > > There seems to be a recent issue: > https://github.com/scikit-learn/scikit-learn/issues/15694 > > Is that a feature you really want? That's just me copying something I > didn't mean to copy, and it's not implemented at all. > We could implement it, but I was leaning towards just deleting it. > > Cheers, > Andy > > > Andrew > > <~~~~~~~~~~~~~~~~~~~~~~~~~~~> > J. Andrew Howe, PhD > LinkedIn Profile > ResearchGate Profile > Open Researcher and Contributor ID (ORCID) > > Github Profile > Personal Website > I live to learn, so I can learn to live. 
- me
> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>
> On Thu, May 23, 2019 at 4:24 PM Andreas Mueller wrote:
>
>> Hey Andrew.
>> Thanks for saying thanks!
>> I share your frustration with export_graphviz, in particular for teaching.
>> I feel like plot_tree is not ideal yet, though. In particular the layout
>> is not as compact as the graphviz one.
>> If you have any feedback or suggestions, I'd be very happy to hear them!
>>
>> Cheers,
>> Andy
>>
>> On 5/23/19 10:39 AM, Andrew Howe wrote:
>>
>> I want to say thank you to all the sklearn developers. The breadth and
>> quality of this software is truly breathtaking.
>>
>> Specifically, I want to say thank you very very much for the plot_tree
>> function! I have wasted a lot of effort in the past, on multiple OSes,
>> getting everything to work so I could view the tree.export_graphviz
>> results. Having this new function to plot the trees natively in matplotlib
>> is extremely useful.
>>
>> Thanks again!
>> Andrew
>>
>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>> J. Andrew Howe, PhD
>> LinkedIn Profile
>> ResearchGate Profile
>> Open Researcher and Contributor ID (ORCID)
>> Github Profile
>> Personal Website
>> I live to learn, so I can learn to live. - me
>> <~~~~~~~~~~~~~~~~~~~~~~~~~~~>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
From rth.yurchak at gmail.com Fri Dec 6 07:31:46 2019
From: rth.yurchak at gmail.com (Roman Yurchak)
Date: Fri, 6 Dec 2019 13:31:46 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID: <75c39c63-3093-891b-fbbc-e1f371cf0e03@gmail.com>

On 04/12/2019 20:44, Joel Nothman wrote:
> I am +1 for this, but I think we should look at how to make these new
> validation methods usable by external developers

+1 for the SLEP and for finding a way to make this method usable by external developers, maybe as part of the developer API.

From rth.yurchak at gmail.com Fri Dec 6 08:15:15 2019
From: rth.yurchak at gmail.com (Roman Yurchak)
Date: Fri, 6 Dec 2019 14:15:15 +0100
Subject: [scikit-learn] Update of 'Upcoming events' on the scikit-learn wiki
In-Reply-To:
References:
Message-ID:

Thank you, Chiara!

I think announcing some of the main planned sprints on the mailing list and twitter would be helpful. At the last sprint (in London), contributors were interested in knowing how they could find out when the next sprints would happen, and we didn't have a clear answer then (short of following all discussions on the mailing list).

+1 also to linking on the wiki to scikit-learn sprints organized by other organizations.

--
Roman

From marmochiaskl at gmail.com Mon Dec 9 05:42:11 2019
From: marmochiaskl at gmail.com (Chiara Marmo)
Date: Mon, 9 Dec 2019 11:42:11 +0100
Subject: [scikit-learn] Paris Sprint in January and wiki update
Message-ID:

Dear all,

as promised, I have published a page about the January Sprint in Paris on the scikit-learn wiki [1].
More details will be added, but the link to the registration form and a preliminary programme are already there.

Is that worth a tweet? I mean a scikit_learn tweet...

At the same time, an "Upcoming events" page is now available [2]: I have linked the WiMLDS sprint in Berlin there... this is the only future event I'm aware of...

Is this page worth a tweet too? :)

Thanks for listening,

Best,
Chiara

[1] https://github.com/scikit-learn/scikit-learn/wiki/Paris-scikit-learn-Sprint-of-the-Decade
[2] https://github.com/scikit-learn/scikit-learn/wiki/Upcoming-events

From char at upatras.gr Mon Dec 9 07:01:09 2019
From: char at upatras.gr (Christos Aridas)
Date: Mon, 9 Dec 2019 14:01:09 +0200
Subject: [scikit-learn] ANN: imbalanced-learn 0.6 released
Message-ID:

Hi all,

We're happy to announce the 0.6 (and 0.6.1) release.

imbalanced-learn is a toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced data sets frequently encountered in machine learning and pattern recognition.

In this release several enhancements, bug fixes and maintenance tasks have been introduced.

Notable enhancements:
1) Pandas DataFrame in, Pandas DataFrame out.
2) Speed-up of the resampling procedure in some samplers due to internal vectorization.

The full changelog can be found here: http://imbalanced-learn.org/stable/whats_new.html

The new release of imbalanced-learn is already available via pip and conda!

For more information, examples, and documentation, please visit our website: http://imbalanced-learn.org

Cheers,
Chris, on behalf of the imbalanced-learn team.

From adrin.jalali at gmail.com Mon Dec 9 13:51:44 2019
From: adrin.jalali at gmail.com (Adrin)
Date: Mon, 9 Dec 2019 19:51:44 +0100
Subject: [scikit-learn] Paris Sprint in January and wiki update
In-Reply-To:
References:
Message-ID:

Awesome, thanks. Tweeting about them sounds good to me. But I can't find the threading option in tweetdeck. I see it only in the app or on the main website, and I can't tweet on @scikit-learn's behalf from there.
From bertrand.thirion at inria.fr Sun Dec 15 17:37:12 2019
From: bertrand.thirion at inria.fr (bthirion)
Date: Sun, 15 Dec 2019 23:37:12 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

+1 for me.
Best,
Bertrand

On 03/12/2019 23:09, Nicolas Hug wrote:
> As per our Governance document, changes to API principles are to be
> established through an Enhancement Proposal (SLEP) from which any core
> developer can call for a vote on its acceptance.
>
> SLEP010: n_features_in_ attribute is up for a vote. Please see
> https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep010/proposal.html
>
> This SLEP proposes the introduction of a public n_features_in_
> attribute for most estimators.
>
> Core developers are invited to vote on this change until 4 January
> 2020 by replying to this email thread.
>
> All members of the community are welcome to comment on the proposal on
> this mailing list, or to propose minor changes through Issues and Pull
> Requests at https://github.com/scikit-learn/enhancement_proposals/.

From g.lemaitre58 at gmail.com Mon Dec 16 05:00:32 2019
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Mon, 16 Dec 2019 11:00:32 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

I am +1 as well.

I think that what was proposed by @Joel Nothman should be considered. It seems that there are cases where we know an estimator is not meant to have the attribute (e.g., Vectorizer). I think that it would make sense to have an estimator tag. Thus, the FutureWarning for a third-party library might mention adding n_features_in_ or adding an estimator tag if it does not apply.

--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

From alexandre.gramfort at inria.fr Mon Dec 16 16:17:41 2019
From: alexandre.gramfort at inria.fr (Alexandre Gramfort)
Date: Mon, 16 Dec 2019 22:17:41 +0100
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID:

+1 on SLEP + adding an estimator tag if it does not apply, e.g. text vectorizers etc.

Alex

From dstromberg at grokstream.com Mon Dec 16 20:02:09 2019
From: dstromberg at grokstream.com (Dan Stromberg)
Date: Mon, 16 Dec 2019 17:02:09 -0800
Subject: [scikit-learn] Heisenbug?
Message-ID:

Hi folks.

I'm new to Scikit-learn.

I have a very large Python project that seems to have a heisenbug which is manifesting in scikit-learn code.

Short of constructing an SSCCE, are there any magical techniques I should try for pinning down the precise cause? Like valgrind or something?

An SSCCE will most likely be pretty painful: the project has copious shared, mutable state, and I've already tried a largish test program that calls into the same code path with the error manifesting 0 times in 100.

It's quite possible the root cause will turn out to be some other part of the software stack.

The traceback from pytest looks like:

sequential/test_training.py:101:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../rt/classifier/coach.py:146: in train
    **self.classifier_section
../domain/classifier/factories/classifier_academy.py:115: in create_classifier
    **kwargs)
../domain/classifier/factories/imp/xgb_factory.py:164: in create
    clf_random.fit(X_train, y_train)
../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:722: in fit
    self._run_search(evaluate_candidates)
../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:1515: in _run_search
    random_state=self.random_state))
../../../../.local/lib/python3.6/site-packages/sklearn/model_selection/_search.py:711: in evaluate_candidates
    cv.split(X, y, groups)))
../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:996: in __call__
    self.retrieve()
../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py:899: in retrieve
    self._output.extend(job.get(timeout=self.timeout))
../../../../.local/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py:517: in wrap_future_result
    return future.result(timeout=timeout)
/usr/lib/python3.6/concurrent/futures/_base.py:425: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self =

    def __get_result(self):
        if self._exception:
>           raise self._exception
E           ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

/usr/lib/python3.6/concurrent/futures/_base.py:384: ValueError

The above exception is raised about 12 to 14 times in 100 in full-blown automated testing.

Thanks for the cool software.

From joel.nothman at gmail.com Tue Dec 17 02:19:44 2019
From: joel.nothman at gmail.com (Joel Nothman)
Date: Tue, 17 Dec 2019 18:19:44 +1100
Subject: [scikit-learn] Heisenbug?
In-Reply-To:
References:
Message-ID:

Hi Dan, this kind of error can come from overflow. Are all of your test systems the same architecture?

From olivier.grisel at ensta.org Tue Dec 17 10:14:38 2019
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Tue, 17 Dec 2019 16:14:38 +0100
Subject: [scikit-learn] Paris Sprint in January and wiki update
In-Reply-To:
References:
Message-ID:

Indeed, I do not see the "circle add" button in the tweetdeck UI anymore. But it's OK not to prepare the threads before tweeting the first tweet. We can build the thread progressively by publishing the first tweet and then replying one tweet after the other by hitting the reply button of the last published tweet in the thread.

From dstromberg at grokstream.com Tue Dec 17 10:50:28 2019
From: dstromberg at grokstream.com (Dan Stromberg)
Date: Tue, 17 Dec 2019 07:50:28 -0800
Subject: [scikit-learn] Heisenbug?
In-Reply-To:
References:
Message-ID:

Hi.

Overflow does sound kind of possible. We're sending semi-random values to the test.

I believe our systems are all x86_64, Linux. Some are Ubuntu 16.04, some are Mint 19.2.

I realized on the way to work this morning that I left out some important information; I suspect a heisenbug for 3 reasons:

1) If I try to look at it with print functions, I get a traceback after the prints, but no print output. This happens both when writing to a disk-based file and when printing to stdout.

2) If I try to look at it with pudb (a debugger) via pudb.set_trace(), I get a failure to start pudb.

3) If I create a small test program that sends the same inputs to the function in question, the function works fine.

Thanks.
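
One way to probe the overflow hypothesis raised in this thread, offered only as a minimal sketch and not as a diagnosis of Dan's bug: a value that is finite as float64 can become inf once cast to float32, which is one route to the "value too large for dtype('float32')" message. The helper name report_non_finite and the sample arrays are hypothetical; the sketch uses plain NumPy and does not call scikit-learn's internal checks.

```python
import numpy as np

def report_non_finite(X, dtype=np.float32):
    """Return flat indices of entries that are NaN or inf after casting.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    # Out-of-range float64 values overflow to inf when cast to float32;
    # silence the overflow warning that newer NumPy versions emit.
    with np.errstate(over="ignore"):
        X_cast = np.asarray(X, dtype=np.float64).astype(dtype)
    return np.flatnonzero(~np.isfinite(X_cast))

# Values in [0, 1]: nothing is flagged.
ok = np.array([0.25, 0.5, 0.75])
print(report_non_finite(ok))   # -> []

# 1e39 is finite as float64 but exceeds float32's maximum (about 3.4e38).
bad = np.array([0.5, 1e39, np.nan])
print(report_non_finite(bad))  # -> [1 2]
```

Logging the output of such a check from inside the worker, just before fit is called, could show whether the array seen at failure time really matches the one inspected afterwards.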

From dstromberg at grokstream.com Tue Dec 17 11:08:11 2019
From: dstromberg at grokstream.com (Dan Stromberg)
Date: Tue, 17 Dec 2019 08:08:11 -0800
Subject: [scikit-learn] Heisenbug?
In-Reply-To:
References:
Message-ID:

Here are the inputs to _assert_all_finite() on one specific failed run.
They look finite to me:

X:
array([0.6150936 , 0.24652782, 0.8880004 , 0.2016928 , 0.80948585,
       0.10764928, 0.81631166, 0.25909033, 0.9299345 , 0.10186833,
       0.81581795, 0.21659133, 0.8279047 , 0.11432098, 0.7335735 ,
       0.20154186, 0.85112196, 0.17447269, 0.5934462 , 0.3967309 ,
       0.83702815, 0.35380727, 0.75063705, 0.32200715, 0.85112196,
       0.11191818, 0.6814021 , 0.11622761, 0.851942  , 0.1892652 ,
       0.8554932 , 0.17869748], dtype=float32)

allow_nan:
False

From pavelgus at protonmail.com Wed Dec 18 11:07:34 2019
From: pavelgus at protonmail.com (Pavel G.)
Date: Wed, 18 Dec 2019 16:07:34 +0000
Subject: [scikit-learn] Calculate cohen_kappa_score class wise?
Message-ID:

Hi sklearn developers and users,

The cohen_kappa_score calculates an overall score for all classes.

    from sklearn.metrics import cohen_kappa_score

    y_true = [0, 1, 2, 2, 2]
    y_pred = [0, 0, 2, 2, 1]
    target_names = ['class 0', 'class 1', 'class 2']

    print(cohen_kappa_score(y_true, y_pred))

    0.375

How can I calculate cohen_kappa_score class-wise, like the classification_report does?

I hope someone can help me.

Thank you.
Pavel G.

From dstromberg at grokstream.com Wed Dec 18 19:09:21 2019
From: dstromberg at grokstream.com (Dan Stromberg)
Date: Wed, 18 Dec 2019 16:09:21 -0800
Subject: [scikit-learn] Heisenbug?
In-Reply-To:
References:
Message-ID:

Any (further) suggestions folks?

BTW, when I say pudb fails to start, I mean it raises a traceback while trying to call None.fileno().

In other pieces of (C)Python code I've tried it in, pudb.set_trace() worked nicely.

From thomasjpfan at gmail.com Sat Dec 21 12:35:28 2019
From: thomasjpfan at gmail.com (Thomas J Fan)
Date: Sat, 21 Dec 2019 12:35:28 -0500
Subject: [scikit-learn] Vote on SLEP010: n_features_in_ attribute
In-Reply-To:
References: <4600b19a-c06a-5ed5-0f14-dbf5a0a7cd5b@gmail.com>
Message-ID: <18c5d963-0b7a-45ad-bd6d-0c9146be58b3@Canary>

I am +1. I agree with Joel that we should look into making these methods (or maybe functions) usable by external developers.

Thomas
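
On the class-wise cohen_kappa_score question asked by Pavel G. on 18 Dec 2019: no answer appears in this part of the archive, so the following is only one possible reading, a minimal sketch that treats "class-wise" as one-vs-rest. Each class is binarized in turn (this class vs. everything else) and the ordinary unweighted kappa is computed on the binarized labels. The function names cohen_kappa and per_class_kappa are hypothetical and written in plain Python so the arithmetic is visible; applying sklearn.metrics.cohen_kappa_score to the same binarized labels should give the same numbers.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two equal-length label sequences.

    Illustrative helper; undefined (division by zero) when both sequences
    contain a single identical label.
    """
    n = len(a)
    # Observed agreement: fraction of positions where the labels match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from the two marginal label distributions.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

def per_class_kappa(y_true, y_pred):
    """One-vs-rest kappa: binarize both label lists for each class in turn."""
    classes = sorted(set(y_true) | set(y_pred))
    return {
        c: cohen_kappa([y == c for y in y_true], [y == c for y in y_pred])
        for c in classes
    }

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

print(round(cohen_kappa(y_true, y_pred), 3))  # overall score: 0.375
for c, k in per_class_kappa(y_true, y_pred).items():
    print(f"class {c}: kappa = {k:.3f}")
```

On the example from the question this gives roughly 0.545 for class 0, -0.250 for class 1 and 0.615 for class 2. Note that these one-vs-rest values do not average to the overall 0.375; they answer a different question (agreement on "is it class c or not").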