From joel.nothman at gmail.com Mon Jun 1 10:04:28 2020 From: joel.nothman at gmail.com (Joel Nothman) Date: Tue, 2 Jun 2020 00:04:28 +1000 Subject: [scikit-learn] major league hacking summer internship program In-Reply-To: References: <066301d62dfc$af92cb90$0eb862b0$@gmail.com> <050001d63537$b4fdbe90$1ef93bb0$@gmail.com> Message-ID: I put together that inappropriate-for-purpose list of things with distance metrics when you asked re gblomier! But maybe still not fit for this purpose. On Sat, 30 May 2020 at 00:23, Andreas Mueller wrote: > Thanks folks! That gives us a good start I think! > > Re documentation: honestly I'm not entirely sure if those are good issues > because I'm not sure if we have consensus on what we want to recommend. We can > certainly include these but they require some decisions and a lot of > expertise. Maybe we can discuss further issues either here or on gitter? > > Andy > > > On Fri, May 29, 2020, 09:45 Thomas J Fan wrote: > >> I can commit to reviewing. Diving into their program, it looks like they >> are hiring supervisors through: https://raise.dev/Apply/?ref=mlh which >> is titled "Software Developer Coach". By looking at their >> https://fellowship.mlh.io/students they have about 9 weeks of actual >> contributing. >> >> Given they have an engineer to help, maybe they can work on >> documenting the production aspects: >> >> 1. Roadmap item 19: Documentation and tooling for model lifecycle >> management >> 2. Roadmap item 21: Document good practices to detect temporal >> distribution drift >> >> Regards, >> Thomas >> >> On Thursday, May 28, 2020 at 5:36 PM, Andreas C. Mueller < >> andreasmuellerml at gmail.com> wrote: >> >> Hi Folks. >> >> So this program sounds pretty cool. They preselected some people for an >> ML work group, who will be doing daily standups together and pair >> programming, >> and who might move around between some related projects over the 12 weeks >> of the program. 
>> >> They made sure to get a diverse set of students and they have an engineer >> that will supervise them. >> >> They would probably have 2-3 students working on sklearn. >> >> They don't expect having one big feature but they do expect some guidance >> on what issues to work on. >> >> Also, the program starts on Monday, and they start contributing to OSS >> projects about a week after that. >> >> Ideally we'd tell them if we're in or not before Monday, and have a >> tentative list of issues / projects. >> >> >> >> What do you all think? Also, if we want to do it, who would have cycles >> for some reviewing? >> >> This seems to be well organized and they seem to have put quite some >> thought into it, but we do need to do a little bit of work on our end. >> >> I can try picking some issues but I probably can't commit a lot of >> reviewing time. >> >> >> >> Cheers, >> >> Andy >> >> >> >> >> >> *From:* scikit-learn *On >> Behalf Of *Adrin >> *Sent:* Tuesday, May 19, 2020 3:42 PM >> *To:* Scikit-learn mailing list >> *Subject:* Re: [scikit-learn] major league hacking summer internship >> program >> >> >> >> Sounds pretty cool to me. >> >> >> >> On Tue, May 19, 2020 at 6:45 PM wrote: >> >> Hey Folks. >> >> This program reached out to me: >> >> >> https://news.mlh.io/mlh-fellowship-the-future-of-tech-internships-05-04-2020 >> >> >> >> What do you think? >> >> Sounds like GSOC but with extra mentorship, so it might be a good fit for >> us? >> >> I would say it depends on what level of involvement they require from us. 
>> >> >> >> Best, >> >> Andy >> >> >> >> _______________________________________________ >> scikit-learn mailing list >> scikit-learn at python.org >> https://mail.python.org/mailman/listinfo/scikit-learn >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apusarkar at gmail.com Tue Jun 2 04:02:39 2020 From: apusarkar at gmail.com (Apu Sarkar) Date: Tue, 2 Jun 2020 13:32:39 +0530 Subject: [scikit-learn] Gaussian Process Regression with multiple features Message-ID: Hi, I am trying to predict a target (DeltaNDT) which is dependent on five features (Cu, Ni, P, T, Fluence). Please find attached the ipython notebook and the csv input data. gpr.predict is generating predictions for the train data. However, with test_preds = gpr.predict(X_test) all values test_preds are zeros. Please help me to find the mistake in my approach. Thanks. Apu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: GP_Regression_multivariate.ipynb Type: application/octet-stream Size: 107652 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: japan_baseweld.csv Type: application/octet-stream Size: 3054 bytes Desc: not available URL: From apusarkar at gmail.com Tue Jun 2 06:37:29 2020 From: apusarkar at gmail.com (Apu Sarkar) Date: Tue, 2 Jun 2020 16:07:29 +0530 Subject: [scikit-learn] Gaussian Process Regression with multiple features Message-ID: Hi, I am trying to predict a target (DeltaNDT) which is dependent on five features (Cu, Ni, P, T, Fluence). Please check the ipython notebook and the csv input data: https://github.com/apusarkar/GP-Regression gpr.predict is generating predictions for the train data. However, with test_preds = gpr.predict(X_test) all values test_preds are zeros. Please help me to find the mistake in my approach. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From keldendraduldorji at gmail.com Thu Jun 11 09:12:04 2020 From: keldendraduldorji at gmail.com (Kelden Dorji) Date: Thu, 11 Jun 2020 18:42:04 +0530 Subject: [scikit-learn] Fwd: Question regarding regression models In-Reply-To: References: Message-ID: Hi scikit-learn, I have a question related to regression models. Please find my question in the link below. I am still new to this and would appreciate any help. Thank you and have a nice day! 
https://stackoverflow.com/questions/62325079/issues-with-regression-model-giving-inverse-relationship Kelden Dradul Dorji -------------- next part -------------- An HTML attachment was scrubbed... URL: From seralouk at hotmail.com Thu Jun 11 09:53:08 2020 From: seralouk at hotmail.com (serafim loukas) Date: Thu, 11 Jun 2020 13:53:08 +0000 Subject: [scikit-learn] Question regarding regression models In-Reply-To: References: Message-ID: <54C62B0F-0A9C-4743-B9C7-4F416C589B86@hotmail.com> Hi Kelden, I answered your SO question, but for the record this is what happens: date_index is a scalar, and you call date_index.columns on it, which raises the error. So you just need this:

    import numpy as np

    # date_format and prediction are defined in the original question's code.
    def predict_price(dates, price):
        # Position of the requested date among the columns (a scalar index).
        date_index = np.where(date_format.columns == dates)[0][0]
        x = np.zeros(len(date_format.columns))
        if date_index >= 0:
            x[date_index] = 1
        return prediction.predict([x])[0]

    predict_price('Feb 20, 2018', 1000)

Best, Makis On 11 Jun 2020, at 15:12, Kelden Dorji wrote: Hi scikit-learn, I have a question related to regression models. Please find my question in the link below. I am still new to this and would appreciate any help. Thank you and have a nice day! https://stackoverflow.com/questions/62325079/issues-with-regression-model-giving-inverse-relationship Kelden Dradul Dorji _______________________________________________ scikit-learn mailing list scikit-learn at python.org https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... URL: From char at upatras.gr Mon Jun 15 22:07:01 2020 From: char at upatras.gr (Christos Aridas) Date: Tue, 16 Jun 2020 05:07:01 +0300 Subject: [scikit-learn] ANN: imbalanced-learn 0.7 released Message-ID: Hi all, We're happy to announce the 0.7 version of imbalanced-learn. imbalanced-learn is a toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced data sets frequently encountered in machine learning and data mining. 
This release should be fully compatible with scikit-learn 0.23 The full changelog can be found here: http://imbalanced-learn.org/stable/whats_new.html The new release of imbalanced-learn is already available via pip and conda! For more information, examples, and documentation, please visit our website: http://imbalanced-learn.org Cheers, Chris, on behalf of the imbalanced-learn team. -------------- next part -------------- An HTML attachment was scrubbed... URL: From DELSO at clinic.cat Fri Jun 19 11:39:21 2020 From: DELSO at clinic.cat (DELSO, GASPAR (ICCV)) Date: Fri, 19 Jun 2020 15:39:21 +0000 Subject: [scikit-learn] Permission to publish Message-ID: Hi, We were wondering whether it would be possible to reproduce one figure from your Users' Guide in a publication. Who would be the right contact person for this? Best regards Gaspar Delso From niourf at gmail.com Fri Jun 19 12:40:25 2020 From: niourf at gmail.com (Nicolas Hug) Date: Fri, 19 Jun 2020 12:40:25 -0400 Subject: [scikit-learn] Permission to publish In-Reply-To: References: Message-ID: <48217630-d42e-e134-eb8b-565ae44646ca@gmail.com> Hi Gaspar, The package and the docs are BSD licensed so you're free to use the content in a publication. If you use scikit-learn, please make sure to cite the package https://scikit-learn.org/stable/about.html#citing-scikit-learn Nicolas On 6/19/20 11:39 AM, DELSO, GASPAR (ICCV) wrote: > Hi, > > We were wondering whether it would be possible to reproduce one figure from your Users' Guide in a publication. > Who would be the right contact person for this? 
> > Best regards > Gaspar Delso > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn From roy.pamphile at gmail.com Sat Jun 20 17:07:58 2020 From: roy.pamphile at gmail.com (Pamphile Roy) Date: Sat, 20 Jun 2020 11:07:58 -1000 Subject: [scikit-learn] Sensitivity analysis PR proposal Message-ID: <9BCB26E3-3F3B-4FD8-978B-4C9430DD961B@gmail.com> Hi, Is there a wish to have sensitivity analysis? Currently we can measure the quality of a regressor's output with a metric, but there is nothing on the parameter side. It would be handy to have sensitivity measures of the input parameters on the output. Namely: Sobol' indices (most used). I have some starting code already https://github.com/tupui/otsensitivity and https://github.com/tupui/batman/blob/master/batman/visualization/density.py . Sobol' (first order and total). COSI indices. CUSUNORO. Moment independent measures. Some visualization functions. Let me know if there is interest for a PR and if yes, what shall I include :) Pamphile @tupui -------------- next part -------------- An HTML attachment was scrubbed... URL: From marmochiaskl at gmail.com Mon Jun 22 03:11:35 2020 From: marmochiaskl at gmail.com (Chiara Marmo) Date: Mon, 22 Jun 2020 09:11:35 +0200 Subject: [scikit-learn] scikit-learn monthly meeting June 29th Message-ID: Hi all, The next scikit-learn monthly meeting will take place on Monday June 29th at 12PM UTC: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2020&month=6&day=29&hour=12&min=0&sec=0&p1=240&p2=33&p3=37&p4=179&p5=195 While these meetings are mainly for core-devs to discuss the current topics, we are also happy to welcome non-core devs and other project maintainers. 
Feel free to join, using the following link: https://meet.google.com/xhq-yoga-rtf If you plan to attend and you would like to discuss something specific about your contribution please add your name (or github pseudo) in the " Contributors " section, of the public pad: https://hackmd.io/iJjuPx3QTQ6AqDLiPVmHFg *@core devs, please make sure to update your notes before the week-end.* Best, Chiara -------------- next part -------------- An HTML attachment was scrubbed... URL: From marmochiaskl at gmail.com Fri Jun 26 12:44:05 2020 From: marmochiaskl at gmail.com (Chiara Marmo) Date: Fri, 26 Jun 2020 18:44:05 +0200 Subject: [scikit-learn] scikit-learn monthly meeting June 29th : reminder Message-ID: Hi all, A reminder... :) The next scikit-learn monthly meeting will take place on Monday June 29th at 12PM UTC: https://www.timeanddate.com/worldclock/meetingdetails.html?year=2020&month=6&day=29&hour=12&min=0&sec=0&p1=240&p2=33&p3=37&p4=179&p5=195 While these meetings are mainly for core-devs to discuss the current topics, we are also happy to welcome non-core devs and other project maintainers. Feel free to join, using the following link: https://meet.google.com/xhq-yoga-rtf If you plan to attend and you would like to discuss something specific about your contribution please add your name (or github pseudo) in the " Contributors " section, of the public pad: https://hackmd.io/iJjuPx3QTQ6AqDLiPVmHFg *@core devs, please make sure to update your notes.* Best, Chiara -------------- next part -------------- An HTML attachment was scrubbed... URL: From prashantasaha at montana.edu Sun Jun 28 23:11:29 2020 From: prashantasaha at montana.edu (Saha, Prashanta) Date: Mon, 29 Jun 2020 03:11:29 +0000 Subject: [scikit-learn] Need some information regarding a GaussianNB package Message-ID: Hi , I have been using your GaussianNB package for machine learning modeling by naive bayes algorithm. I have generated train and test set using scikit learn (set1). 
Then I modified that train and test set (set2) in a controlled way for my experiment. The classification should perform similarly (should return same class label for set1 and set2) according to the theory but it is behaving otherwise. Can you give me any insights why is this happening? Thanks, prashanta saha, PhD student, Montana State University,Bozeman,MT. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ross at cgl.ucsf.edu Sun Jun 28 23:49:09 2020 From: ross at cgl.ucsf.edu (Bill Ross) Date: Sun, 28 Jun 2020 20:49:09 -0700 Subject: [scikit-learn] Need some information regarding a GaussianNB package In-Reply-To: References: Message-ID: <721f6530-b229-b06d-7fc3-304698bf9701@cgl.ucsf.edu> I suspect folk would need to know what the 'controlled way' was by which you derived set2 from set1. Bill On 6/28/20 8:11 PM, Saha, Prashanta wrote: > Hi , > I have been using your GaussianNB package for machine learning > modeling by naive bayes algorithm. I have generated train and test set > using scikit learn (set1). Then I modified that train and test set > (set2) in a controlled way for my experiment. The classification > should perform similarly (should return same class label for set1 and > set2) according to the theory but it is behaving otherwise. Can you > give me any insights why is this happening? > > Thanks, > prashanta saha, > PhD student, > Montana State University,Bozeman,MT. > > > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solegalli at protonmail.com Mon Jun 29 04:06:58 2020 From: solegalli at protonmail.com (Sole Galli) Date: Mon, 29 Jun 2020 08:06:58 +0000 Subject: [scikit-learn] climate friendly software licence Message-ID: Hello Scikit-learn team, I've come across this: https://twitter.com/tristanharris/status/1277136696568508418?s=12 Basically, it is an initiative to include in software license a prohibition of use by fossil fuel extractivist companies. I would like to know your views on this? Is this something that you would pick up from Scikit-learn? Are there some legal concerns to be aware of? or anything else that should be considered? Because it sounds quite powerful and straightforward to me. I would be really keen to hear from you. Thanks a lot Sole -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Mon Jun 29 09:40:24 2020 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 29 Jun 2020 15:40:24 +0200 Subject: [scikit-learn] climate friendly software licence In-Reply-To: References: Message-ID: <20200629134024.zgvu7fgrogqsv3pw@phare.normalesup.org> Hi Sole, I personally believe that global warming is the most important threat to our well-being, together with the rise of fascism. However, legal matters are seldom easy (IANAL). It is unclear that those licenses are enforceable. See for instance the discussion from Bruce Perens, who has a huge amount of experience in open source licensing: https://perens.com/2019/10/12/invasion-of-the-ethical-licenses/ (credit to Andy Mueller for digging up this reference). The more common a software license is, the more likely a team is to hold in court, and the less likely a team is to have legal fees to cover (which would kill us, as a project). 
Best, Gaël On Mon, Jun 29, 2020 at 08:06:58AM +0000, Sole Galli via scikit-learn wrote: > Hello Scikit-learn team, > I've come across this: > https://twitter.com/tristanharris/status/1277136696568508418?s=12 > Basically, it is an initiative to include in software license a prohibition of > use by fossil fuel extractivist companies. > I would like to know your views on this? Is this something that you would pick > up from Scikit-learn? > Are there some legal concerns to be aware of? or anything else that should be > considered? > Because it sounds quite powerful and straightforward to me. > I would be really keen to hear from you. > Thanks a lot > Sole > _______________________________________________ > scikit-learn mailing list > scikit-learn at python.org > https://mail.python.org/mailman/listinfo/scikit-learn -- Gael Varoquaux Research Director, INRIA Visiting professor, McGill http://gael-varoquaux.info http://twitter.com/GaelVaroquaux From olivier.grisel at ensta.org Mon Jun 29 09:50:59 2020 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 29 Jun 2020 15:50:59 +0200 Subject: [scikit-learn] climate friendly software licence In-Reply-To: References: Message-ID: Hi Sole, I personally support climate change actions very much and I am convinced climate change is the number 1 challenge of our time. In an attempt to act in a consistent way with that belief, I declined several times to keynote at conferences either organized by the fossil fuel industry or to conferences that would have required me to fly a long distance to give a presentation. However, I don't think software licensing is a right tool to advance this cause. How would we enforce it? What would happen if we don't enforce it? Who is "we", especially when our library is embedded in 3-rd party software product and the end-users are not necessarily aware of all the upstream dependencies? What about gray-cases, e.g. 
a company that does not directly extract fossil fuels per se but works as a consultancy with a majority of customers in the fossil fuel extraction industry? What if a significant part of their consultancy is to help them detect methane leaks in satellite data? How would we audit this? With which resources? How would we get a consensual decision on those gray cases? What about the hypocrisy of using or contributing to software under that license while regularly using fossil fuel powered transportation or working or living in a building heated with fossil fuels? Or buying goods transported this way over long distances? Instead, I would rather encourage everyone to vote for legislators and governments that progressively set bans on the development and commercialization of fossil fuel based technologies and to voice your support for such legislations in public debates. I encourage everybody to look twice before accepting to work for a company involved in fossil fuel extraction one way or another or involved in fossil-fuel intensive activities. -- Olivier From solegalli at protonmail.com Tue Jun 30 04:26:04 2020 From: solegalli at protonmail.com (Sole Galli) Date: Tue, 30 Jun 2020 08:26:04 +0000 Subject: [scikit-learn] climate friendly software licence In-Reply-To: References: Message-ID: Hi Olivier, Gabriel, and further team, Thank you so much for your views. I understand enforcement is an issue. And I don't yet have an answer on if and how the license could be enforced. I also think that this is a second step. First would be making the use of the software illegal. This would de-legitimise these companies from using these packages, which would then hopefully prevent these companies from presenting their destructive work in open source meetings like pydata, or openly hosting tech hub communities where they share the use of this software in an attempt to recruit talent, because now the use of the software is illegal. 
It would also make organisations like NumFocus stop accepting fossil fuel companies as sponsors, as they did in London 2019 and giving them a space to promote their work. Technical people may also ask twice before joining these companies, if now the use of software is not allowed, even at face value. So I think, even if the license can't be enforced, it does have some power. But, as I said, at the moment I know very little of enforcement and whether package developers could get sued for adding this restriction. Yes, there is a lot we can do as individuals to decrease our carbon footprint, some of us do, and certainly we should put the right people in power, but individual effort is not enough and electing politicians happens only every so many years. We need to do more than that, because the climate situation is very precarious and very urgent unfortunately. Art organisations, newspapers, some banks and many pensions are cutting ties with fossil fuel companies. I think tech should take the plunge as well. If this is not the right way, would you have any suggestions? Cheers Sole ------- Original Message ------- On Monday, June 29, 2020 3:50 PM, Olivier Grisel wrote: > Hi Sole, > > I personally support climate change actions very much and I am > convinced climate change is the number 1 challenge of our time. In an > attempt to act in a consistent way with that belief, I declined > several times to keynote at conferences either organized by the fossil > fuel industry or to conferences that would have required me to fly a > long distance to give a presentation. > > However, I don't think software licensing is a right tool to advance this cause. > > How would we enforce it? What would happen if we don't enforce it? Who > is "we", especially when our library is embedded in 3-rd party > software product and the end-users are not necessarily aware of all > the upstream dependencies? > > What about gray-cases, e.g. 
a company that does not directly extract fossil fuels > per se but works as a consultancy with a majority of > customers in the fossil fuel extraction industry? What if a > significant part of their consultancy is to help them detect methane > leaks in satellite data? How would we audit this? With which > resources? How would we get a consensual decision on those gray cases? > > What about the hypocrisy of using or contributing to software under > that license while regularly using fossil fuel powered transportation > or working or living in a building heated with fossil fuels? Or > buying goods transported this way over long distances? > > Instead, I would rather encourage everyone to vote for legislators and > governments that progressively set bans on the development and > commercialization of fossil fuel based technologies and to voice your > support for such legislations in public debates. I encourage everybody > to look twice before accepting to work for a company involved in > fossil fuel extraction one way or another or involved in fossil-fuel > intensive activities. 
> > -- > > Olivier