From jk2k.net at gmail.com  Sat Oct  2 11:25:25 2021
From: jk2k.net at gmail.com (J K)
Date: Sat, 2 Oct 2021 11:25:25 -0400
Subject: [scikit-learn] ROC convex hulls design question
Message-ID: <3CC1A986-E296-430E-86B9-EF9B36E0A3DC@gmail.com>


Dear sklearn mailing list,

I love all the wonderful ways scikit-learn has made good practices in ML more accessible to so many! Thanks for all of that!

I'm wondering if there is a design reason the default behavior for ROC generation (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html) doesn't return the convex hull of the ROC?

In the default ROC computation, the resulting ROCs aren't on their convex hulls (https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.ConvexHull.html) even though points on the convex hulls are achievable performance. So the default ROCs returned are suboptimal. That's a point made in Tom Fawcett's ROC 101 paper (https://www.math.ucdavis.edu/~saito/data/roc/fawcett-roc.pdf) that was cited in the sklearn docs.

He writes: "More generally, a classifier is potentially optimal if and only if it lies on the convex hull of the set of points in ROC space. The convex hull of the set of points in ROC space is called the ROC convex hull (ROCCH) of the corresponding set of classifiers."

Apologies if this is already answered somewhere else. I searched and could only find this apparently abandoned repo: https://github.com/tfawcett/pycost

I've implemented an ROC convex hull myself and found significant improvements in performance estimates just from using it, so I'm wondering whether there was some reason this wasn't implemented as the default.
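
For readers following along, the hull construction discussed above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: `roc_convex_hull` and `trapezoid_auc` are hypothetical helper names (they are not part of scikit-learn and are not the implementation mentioned in this message), and `fpr` / `tpr` are assumed to be sequences such as the first two arrays returned by `sklearn.metrics.roc_curve`. The sample numbers are made up for illustration.

```python
def _cross(o, a, b):
    # z-component of the cross product (a - o) x (b - o);
    # negative means the path o -> a -> b turns clockwise.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(fpr, tpr):
    """Return the upper convex hull of ROC points as a list of (fpr, tpr).

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    # Always include the two trivial classifiers (0, 0) and (1, 1).
    points = sorted(set(zip(fpr, tpr)) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in points:
        # Drop the last hull point while it lies on or below the segment
        # joining hull[-2] and p (the turn is not strictly clockwise).
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as sorted (x, y) pairs."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up ROC points: (0.3, 0.55) lies below the segment joining its
# neighbours, so it is not on the hull.
fpr = [0.0, 0.1, 0.3, 0.5, 1.0]
tpr = [0.0, 0.5, 0.55, 0.9, 1.0]
hull = roc_convex_hull(fpr, tpr)
print(hull)
print(trapezoid_auc(list(zip(fpr, tpr))), trapezoid_auc(hull))
```

Note that hull segments between two thresholds correspond to randomly interpolating between the two thresholded classifiers, as Fawcett's paper describes, and that a hull fitted on the same data it is evaluated on can give an optimistic estimate.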

Thanks,
-johnk-

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scikit-learn/attachments/20211002/3bc359b4/attachment.html>

From gael.varoquaux at normalesup.org  Tue Oct  5 10:09:48 2021
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Tue, 5 Oct 2021 16:09:48 +0200
Subject: [scikit-learn] [TC Vote] Technical Committee vote: line length
In-Reply-To: <482f3b2c-fcff-719b-aa44-6f3c2d4afc0b@gmail.com>
References: <20210726212619.54iy56wbl4sdbe3z@phare.normalesup.org>
 <CAEOrW4-+LpeVeH3NYyLL2NAGWdREomAckXzM=H1h2+yBE1b3kQ@mail.gmail.com>
 <CAFvE7K6FYa3v5=Lxk_uzD6m5FCtdUtJhaMSUNs-_92Y6gXhBPg@mail.gmail.com>
 <482f3b2c-fcff-719b-aa44-6f3c2d4afc0b@gmail.com>
Message-ID: <20211005140948.kjyj35omefzeppcu@phare.normalesup.org>

Hi everyone,

I left on vacation and forgot about this (and did not cast my vote).

The TC has had plenty of time to vote. My own vote is in favor of the
consensus among very active developers.

My count of the votes expressed is the following:

- Keep current 88 characters:

    Olivier Grisel
    Joel Nothman
    Gaël Varoquaux

- Revert to 79 characters:

   Alex Gramfort
   Adrin Jalali

- Answer with no preference expressed:

   Roman Yurchak

So the decision is to use 88 chars, which means no action is needed.

Thank you everyone!

Gaël

On Mon, Aug 02, 2021 at 11:15:48AM +0200, Roman Yurchak wrote:
> I also don't have a strong opinion on this, and generally I'm just happy
> that black migration happened.

> Still with a slight preference for 88 characters as the default.

> On 28/07/2021 18:34, Olivier Grisel wrote:
> > Many very active core devs not represented in the TC voted for 88 and
> > my previous vote for 79 was not that strong. So I feel that I should
> > now vote for 88:

> > Keep current 88 characters:

> > Olivier

> > Revert to 79 characters:

> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

-- 
    Gael Varoquaux
    Research Director, INRIA
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

From olivier.grisel at ensta.org  Wed Oct  6 10:42:14 2021
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Wed, 6 Oct 2021 16:42:14 +0200
Subject: [scikit-learn] scikit-learn office hours on Friday Oct. 8 2021
Message-ID: <CAFvE7K4L_NDog4OWF7V53st4acbhj6JPU_Q856b7Sn9kAh9B7g@mail.gmail.com>

Hi all,

Some of us will be online on the scikit-learn discord this Friday at
15:00 UTC and 20:00 UTC.

First-time and occasional contributors are welcome to join us on
Discord using this invitation link:

https://discord.gg/YBdN45kD

The focus of these office hour sessions is to answer questions about
contributing to scikit-learn. We can also split into break out
audio/text channels and do pair programming or live reviewing of
forgotten pull requests with screen sharing.

We can also try to assist you in crafting minimal reproduction cases
for bug reports to get a higher likelihood of resolution (e.g.
https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

If this experiment is successful, we will probably hold this kind of
office hours on a regular basis.

See you soon on discord!

-- 
Olivier

From g.lemaitre58 at gmail.com  Fri Oct  8 10:21:55 2021
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Fri, 8 Oct 2021 16:21:55 +0200
Subject: [scikit-learn] scikit-learn office hours on Friday Oct. 8 2021
In-Reply-To: <CAFvE7K4L_NDog4OWF7V53st4acbhj6JPU_Q856b7Sn9kAh9B7g@mail.gmail.com>
References: <CAFvE7K4L_NDog4OWF7V53st4acbhj6JPU_Q856b7Sn9kAh9B7g@mail.gmail.com>
Message-ID: <E1F929D5-CA91-4B78-8523-52292B1AEEC0@gmail.com>

I see that Olivier made a small mistake. I will be holding office hours from 18:00 to 19:00 UTC.
So there is no office hour from 19:00 to 20:00 UTC.
Cheers,
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/


From olivier.grisel at ensta.org  Fri Oct  8 10:52:35 2021
From: olivier.grisel at ensta.org (Olivier Grisel)
Date: Fri, 8 Oct 2021 16:52:35 +0200
Subject: [scikit-learn] scikit-learn office hours on Friday Oct. 8 2021
In-Reply-To: <E1F929D5-CA91-4B78-8523-52292B1AEEC0@gmail.com>
References: <CAFvE7K4L_NDog4OWF7V53st4acbhj6JPU_Q856b7Sn9kAh9B7g@mail.gmail.com>
 <E1F929D5-CA91-4B78-8523-52292B1AEEC0@gmail.com>
Message-ID: <CAFvE7K6aXJdfSTrU8L7LU-3M47CC8GqRxM_hhGcVjXY4uhSsGg@mail.gmail.com>

To summarize, the office hours for today are:

- 15:00-16:00 UTC / 17:00-18:00 CEST (this one starts in less than 10min)
- 18:00-19:00 UTC / 20:00-21:00 CEST (with Guillaume)

Sorry for the confusion and see you soon.

-- 
Olivier

From reshama.stat at gmail.com  Mon Oct 11 08:00:00 2021
From: reshama.stat at gmail.com (Reshama Shaikh)
Date: Mon, 11 Oct 2021 08:00:00 -0400
Subject: [scikit-learn] [Data Umbrella] AFME (Africa & Middle East)
 scikit-learn open source sprint (scikit-learn)
In-Reply-To: <CAKPCsugqH-JBYrUH0wn+A0KXCNPur88zG0Fei-+mxoR7T+OC1w@mail.gmail.com>
References: <CAKPCsugqH-JBYrUH0wn+A0KXCNPur88zG0Fei-+mxoR7T+OC1w@mail.gmail.com>
Message-ID: <CAKPCsuga6oTFUQtjKm_iJ49uP61HYjH1CK_YVYQ+-BsLbRBKjw@mail.gmail.com>

Hello,
At this time, we have a few spots open for the upcoming October 23
online scikit-learn sprint organized by Data Umbrella.

If you reside outside of the Africa and Middle East region, you are now
able to apply.
https://afme2021rc.dataumbrella.org/home

Note 1:  we offer a stipend of $10 USD to cover the cost of internet
access, and you can indicate such on your application.

Note 2:  if you need a translator, please indicate so on your application.

Key Notes:
a)  There is a pre-sprint event on Saturday October 16 from 5-6pm EAT.
This pre-sprint event is *optional* and an opportunity to answer any
questions in general and aid in setting up your virtual environment.

b)  Sprint is on *Saturday, October 23 from 5pm to 9pm EAT (East Africa Time)*
on our Discord server.

c)  There is a post-sprint event on Saturday November 23 from 5-6pm EAT.
This post-sprint event is *optional* and an opportunity to ask the core
devs questions on open pull requests.

d)  There are 3-4 hours of pre-work for the sprint.  Here is the
checklist:  https://afme2021rc.dataumbrella.org/about/prep-work

Please feel free to send any questions to me off the mailing list.

Best,
Reshama
Reshama Shaikh
she/her
Blog <https://reshamas.github.io> | Twitter <https://twitter.com/reshamas>
| LinkedIn <https://www.linkedin.com/in/reshamas/> | GitHub
<https://github.com/reshamas>

Data Umbrella <https://www.dataumbrella.org>
NYC PyLadies
<https://meet.meetup.com/wf/click?upn=pEEcc35imY7Cq0tG1vyTt6zEs68RbcMfjPcajNHTKtn9NmwqQbJhe15mAZ1gz2La_s50GiGgQPBz9c9AKCDbbu2LRERFOLQHDZ3rAVGAkUEIFdmeKWgLQ1JD-2FBfVxXpI86J1oyur7RYRzToaqco1fWUx-2FWPOn-2FLCyCICxwu5bjlHJvtSvVekt71L43UiQL8dMjr0HfGP-2FSeiGQFG0QQxzS-2FX5o4Q8Ch-2BHrlA5hsa9VyPXC5FvBn1cNbkmil3SgwH7HWFmXsKFJ7RYrzZR0EwWFIMarRA8-2BTgd8yXJYlfxogk-3D>


On Sat, Sep 25, 2021 at 5:05 PM Reshama Shaikh <reshama.stat at gmail.com>
wrote:

> Hello,
>
> Data Umbrella is organizing a scikit-learn sprint for this October 23,
> with a focus on **Africa and the Middle East**.  This event is free.
>
> A sprint is a 4-hour hands-on hackathon where we work on beginner issues
> in the scikit-learn GitHub repository.  Participants will be paired with
> another person.  There will be core contributors available to answer any
> questions.
>
> Event website is:  https://afme2021rc.dataumbrella.org
> We encourage folks to read the website and then complete the application.
>
> The event can be shared in these ways:
> - Retweet:  https://twitter.com/DataUmbrella/status/1435972074842034184
> - Share post on LinkedIn:
> https://www.linkedin.com/feed/update/urn:li:activity:6841738994305294336/
>
> Please feel free to contact me if you have any questions.
>
> Cheers,
> Reshama Shaikh
> she/her
>
>

From reshama.stat at gmail.com  Mon Oct 11 08:00:00 2021
From: reshama.stat at gmail.com (Reshama Shaikh)
Date: Mon, 11 Oct 2021 08:00:00 -0400
Subject: [scikit-learn] Open Source: sustainability and etiquette
In-Reply-To: <CAKPCsujfCDKrPVBXdzm_T4yWv5Eaa7b5OAnLjyqd6DzSPXanZw@mail.gmail.com>
References: <CAKPCsujRtuY4XN3ck87C7-RKJQHvQuWTy8Fwj_bT+3wMNL9Z4A@mail.gmail.com>
 <CAEOrW4-fk1Vz5EN7MgCyU8kH1-tGV18STfcb6p7D28sRyyqxPg@mail.gmail.com>
 <CAEjJ3fiyzScwSbw4hBc8aOExC5A3ZW7_yr2fo_=x6dJnCsgAgg@mail.gmail.com>
 <CAKPCsujfCDKrPVBXdzm_T4yWv5Eaa7b5OAnLjyqd6DzSPXanZw@mail.gmail.com>
Message-ID: <CAKPCsujoP7StDh79Wa+YvJunQXY=yuHGW08FFm_yXPyFCXm-pQ@mail.gmail.com>

Hello,
Adding another resource to this page entitled "Open Source Sustainability".
[a]

Keynote presentation [b] (30-minute video) by Laura Dabbish, a professor at
Carnegie Mellon University.
Talk title:  *Diversity and inclusion in open source digital infrastructure
projects*

She discusses her research into open source and key takeaways.

[a] https://www.dataumbrella.org/open-source/open-source-sustainability
[b] https://youtu.be/h6OkCbEd1AE

---
Reshama Shaikh
she/her


On Thu, Aug 5, 2021 at 10:38 AM Reshama Shaikh <reshama.stat at gmail.com>
wrote:

> Hello,
> I found the video, it's from 2017. It's by Heather Miller, a professor at
> CMU.  The 40-minute talk is entitled:  The Dramatic Consequences of the
> Open Source Revolution [a]
>
> Brigitta,
> Heather references Nadia Eghbal's book in her talk, which I also added to
> my list.  [b]
>
> Adrin,
> I added CHAOSS to the list as well.  They have a mailing list which I have
> subscribed to.
>
> [a]  https://youtu.be/K4mVuxcimWk
> [b]  https://www.dataumbrella.org/open-source/open-source-sustainability
>
>
> Reshama Shaikh
> she/her
> Blog <https://reshamas.github.io> | Twitter <https://twitter.com/reshamas>
> | LinkedIn <https://www.linkedin.com/in/reshamas/> | GitHub
> <https://github.com/reshamas>
>
> Data Umbrella <https://www.dataumbrella.org>
> NYC PyLadies
> <https://meet.meetup.com/wf/click?upn=pEEcc35imY7Cq0tG1vyTt6zEs68RbcMfjPcajNHTKtn9NmwqQbJhe15mAZ1gz2La_s50GiGgQPBz9c9AKCDbbu2LRERFOLQHDZ3rAVGAkUEIFdmeKWgLQ1JD-2FBfVxXpI86J1oyur7RYRzToaqco1fWUx-2FWPOn-2FLCyCICxwu5bjlHJvtSvVekt71L43UiQL8dMjr0HfGP-2FSeiGQFG0QQxzS-2FX5o4Q8Ch-2BHrlA5hsa9VyPXC5FvBn1cNbkmil3SgwH7HWFmXsKFJ7RYrzZR0EwWFIMarRA8-2BTgd8yXJYlfxogk-3D>
>
>
> On Mon, Apr 19, 2021 at 6:51 PM Brigitta Sipocz <bsipocz at gmail.com> wrote:
>
>> Hi,
>>
>> I've also very much liked Nadia Eghbal's book: Working in public; The
>> making and maintenance of open source software. I haven't yet attended a
>> conference where she was a speaker, but I'm certain there are some relevant
>> recordings on youtube.
>>
>> Cheers,
>>  Brigitta
>>
>>
>> On Mon, 19 Apr 2021 at 06:27, Adrin <adrin.jalali at gmail.com> wrote:
>>
>>> This is a really good initiative Reshama, thanks for sharing.
>>>
>>> Have you seen CHAOSScon talks and activities? They're really good, and
>>> touch on a lot of really good stuff when it comes to open source
>>> communities and sustainability.
>>> Eg.:  https://chaoss.community/chaosscon-2020-eu/
>>>
>>> Cheers,
>>> Adrin
>>>
>>> On Fri, Apr 16, 2021 at 4:26 PM Reshama Shaikh <reshama.stat at gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>> I've seen some excellent resources that have explained open source, its
>>>> sustainability, challenges and *indirectly, the etiquette*.
>>>>
>>>> I am starting to compile the list here [a].
>>>>
>>>> This keynote by Stuart Geiger is a must-watch:  The Invisible Work of
>>>> Maintaining & Sustaining Open Source Software   [b]
>>>>
>>>> There is one more video by Emily someone who was at Microsoft, but is
>>>> now a professor somewhere, and I am trying to track that video down.  I
>>>> think it's from 2017.  I'll add it to the list once I find it.  If anyone
>>>> knows the full name of the speaker, please share.
>>>>
>>>> [a]
>>>> https://www.dataumbrella.org/open-source/open-source-sustainability
>>>>
>>>> [b]
>>>> https://www.youtube.com/watch?v=PM3iltcaIL8
>>>>
>>>> Best,
>>>> Reshama
>>>> ---
>>>> Reshama Shaikh
>>>> she/her
>>>> Blog <https://reshamas.github.io> | Twitter
>>>> <https://twitter.com/reshamas> | LinkedIn
>>>> <https://www.linkedin.com/in/reshamas/> | GitHub
>>>> <https://github.com/reshamas>
>>>>
>>>> Data Umbrella <https://www.dataumbrella.org>
>>>> NYC PyLadies
>>>> <https://meet.meetup.com/wf/click?upn=pEEcc35imY7Cq0tG1vyTt6zEs68RbcMfjPcajNHTKtn9NmwqQbJhe15mAZ1gz2La_s50GiGgQPBz9c9AKCDbbu2LRERFOLQHDZ3rAVGAkUEIFdmeKWgLQ1JD-2FBfVxXpI86J1oyur7RYRzToaqco1fWUx-2FWPOn-2FLCyCICxwu5bjlHJvtSvVekt71L43UiQL8dMjr0HfGP-2FSeiGQFG0QQxzS-2FX5o4Q8Ch-2BHrlA5hsa9VyPXC5FvBn1cNbkmil3SgwH7HWFmXsKFJ7RYrzZR0EwWFIMarRA8-2BTgd8yXJYlfxogk-3D>
>

From gael.varoquaux at normalesup.org  Wed Oct 13 10:40:50 2021
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Wed, 13 Oct 2021 16:40:50 +0200
Subject: [scikit-learn] DirtyData and the SuperVectorizer,
 for non-normalized dataframes
Message-ID: <20211013144050.ssikmlt6gz6u4ijy@phare.normalesup.org>

Dear scikit-learn community,

I would like to announce a new release of dirty-cat, which strives to
facilitate machine learning on non-curated categories: it is robust to
morphological variants, such as typos.

The big new feature, which I think is of interest to many, is the
"SuperVectorizer", which strives to readily vectorize a pandas dataframe:
https://dirty-cat.github.io/stable/auto_examples/01_dirty_categories.html#example-super-vectorizer

Of course, such an object is full of heuristics. We have tuned them
empirically, but we expect more progress in the long term, as we build a
bigger database of dataframes that are difficult to vectorize. We'd love
people to join the adventure; it's been fun so far.

Cheers,

Gaël

-- 
    Gael Varoquaux
    Research Director, INRIA
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux

From acojugo at gmail.com  Thu Oct 14 08:29:23 2021
From: acojugo at gmail.com (Aco Jugo)
Date: Thu, 14 Oct 2021 14:29:23 +0200
Subject: [scikit-learn] (no subject)
Message-ID: <CAP2Ju5MxQqu1Eah81fisGPkOst3ZVCAr3uZHZENjtBru5S4x_w@mail.gmail.com>



From thomasjpfan at gmail.com  Mon Oct 18 12:09:09 2021
From: thomasjpfan at gmail.com (Thomas J. Fan)
Date: Mon, 18 Oct 2021 12:09:09 -0400
Subject: [scikit-learn] scikit-learn monthly developer meeting: Monday
 October 25th 2021
Message-ID: <CAK3g5AYk8JNoQp8BEeTFtRS3Ycohpry8OZ+F=vPqjZvPX=iD+A@mail.gmail.com>

Dear all,

The scikit-learn developer monthly meeting will take place on Monday
October 25th at 1PM UTC.

- Video call link: https://meet.google.com/ews-uszu-djs
- Meeting notes / agenda: https://hackmd.io/0yokz72CTZSny8y3Re648Q
- Local times:
https://www.timeanddate.com/worldclock/meetingdetails.html?year=2021&month=10&day=25&hour=13&min=0&sec=0&p1=1440&p2=240&p3=248&p4=195&p5=179&p6=224

The goal of this meeting is to discuss ongoing development topics for
the project. Everybody is welcome.

As usual, please follow the code of conduct of the project:
https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

Regards,
Thomas

From g.lemaitre58 at gmail.com  Mon Oct 25 07:25:33 2021
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Mon, 25 Oct 2021 13:25:33 +0200
Subject: [scikit-learn] [ANN] scikit-learn 1.0.1 is online!
Message-ID: <CACDxx9iJJ_PW=0Mn_er53AmzX3=QG2fFPR6nNDrnpHuWn28r3g@mail.gmail.com>

scikit-learn 1.0.1 is out on pypi.org and conda-forge!

This is a small maintenance release that fixes a couple of regressions:

https://scikit-learn.org/dev/whats_new/v1.0.html#version-1-0-1

You can upgrade with pip as usual:

pip install -U scikit-learn

The conda-forge builds will be available shortly, which you can then
install using:

conda install -c conda-forge scikit-learn

Thanks again to all the contributors.
On behalf of the scikit-learn maintainer team.
-- 
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/

From thomasjpfan at gmail.com  Thu Oct 28 18:06:10 2021
From: thomasjpfan at gmail.com (Thomas J. Fan)
Date: Thu, 28 Oct 2021 18:06:10 -0400
Subject: [scikit-learn] scikit-learn office hours on Friday Oct. 29 2021
Message-ID: <CAK3g5AY7U6d9Mns0VUrpGnKywH4iNVqDSq55+s6ATSQ+5iHifg@mail.gmail.com>

Hi all,

Some of us will be online on the scikit-learn discord this Friday at
11am ET / 15:00 UTC / 17:00 CEST.

First-time and occasional contributors are welcome to join us on
Discord using this invitation link:
https://discord.gg/YBdN45kD

The focus of these office hour sessions is to answer questions about
contributing to scikit-learn. We can also split into break out
audio/text channels and do pair programming or live reviewing of
forgotten pull requests with screen sharing.

We can also try to assist you in crafting minimal reproduction cases
for bug reports to get a higher likelihood of resolution (e.g.
https://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports).

Please note, our Code of Conduct applies:
https://github.com/scikit-learn/scikit-learn/blob/main/CODE_OF_CONDUCT.md

If this experiment is successful, we will probably hold this kind of
office hours on a regular basis.

See you soon on discord!

--
Thomas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.python.org/pipermail/scikit-learn/attachments/20211028/b16dc400/attachment.html>

From g.lemaitre58 at gmail.com  Sat Oct 30 05:18:41 2021
From: g.lemaitre58 at gmail.com (Guillaume Lemaître)
Date: Sat, 30 Oct 2021 11:18:41 +0200
Subject: [scikit-learn] New core dev: Julien Jerphanion
Message-ID: <A59EFCC7-95E7-4D1B-916E-91D7140FD94F@gmail.com>

The scikit-learn core development team has welcomed a new member, Julien Jerphanion, who has contributed code, reviews, and documentation since this March (aside from occasional contributions in the past).

Congratulations and welcome, Julien!

On behalf of the scikit-learn team
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/


From rth.yurchak at gmail.com  Sat Oct 30 05:56:02 2021
From: rth.yurchak at gmail.com (Roman Yurchak)
Date: Sat, 30 Oct 2021 11:56:02 +0200
Subject: [scikit-learn] New core dev: Julien Jerphanion
In-Reply-To: <A59EFCC7-95E7-4D1B-916E-91D7140FD94F@gmail.com>
References: <A59EFCC7-95E7-4D1B-916E-91D7140FD94F@gmail.com>
Message-ID: <a013ac51-598d-eda1-f7f2-9823f3f83d7f@gmail.com>

Congratulations, Julien, and thanks for all your work!

Roman


From matematica.a3k at gmail.com  Sun Oct 31 13:07:57 2021
From: matematica.a3k at gmail.com (Matemática A3K)
Date: Sun, 31 Oct 2021 14:07:57 -0300
Subject: [scikit-learn] Help interpreting decision function plot
Message-ID: <CA+FDnhJ2Lcx_Le4Q-Y956w1vhH0Wn2bVNeBLufhR7GNB30ft3g@mail.gmail.com>

Hi!

I have been building a tool called django-ai
<https://github.com/math-a3k/django-ai/tree/covid-ht> that integrates
statistical engines, especially scikit-learn, with Django.

With that tool, I have built another, covid-ht
<https://github.com/math-a3k/covid-ht>, which should showcase the power of
those together.

That tool is meant to help health professionals with classification tasks
based on measurements
<https://covid-ht.readthedocs.io/en/latest/beyond_covid19.html>.

The tool is heading to its first release as a technology preview, and in
this process I have faced a release-blocker issue for which I would like to
ask for your help: I can't find a consistent interpretation of the graphs.

The graphs are called "conditional decision functions
<https://covid-ht.readthedocs.io/en/latest/classification/graphing.html>",
where each one is the contour of the decision function of a classifier for
an observation in 2 variables while leaving the others fixed.
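
To make the construction concrete, here is a minimal sketch of how such a conditional slice can be evaluated. It is an illustration under stated assumptions, not the django-ai or covid-ht code: `conditional_grid` is a hypothetical helper name, and `toy_decision` is a made-up stand-in for a fitted classifier's decision function applied to a single observation.

```python
def conditional_grid(decision_function, observation, i, j, xs, ys):
    """Evaluate decision_function on a 2-D slice through `observation`.

    Features i and j sweep over xs and ys; every other feature stays
    fixed at the observation's own value. Hypothetical helper.
    """
    grid = []
    for y in ys:
        row = []
        for x in xs:
            point = list(observation)  # copy, so the observation is untouched
            point[i] = x
            point[j] = y
            row.append(decision_function(point))
        grid.append(row)
    return grid


def toy_decision(point):
    # Made-up stand-in for a classifier: positive means class 1.
    a, b, c = point
    return a + b - c


obs = [1.0, 2.0, 2.5]
# Vary features 0 and 1, keep feature 2 fixed at 2.5.
grid = conditional_grid(toy_decision, obs, 0, 1, xs=[0.0, 1.0, 2.0], ys=[2.0, 3.0])

# The slice passes through the observation, so the grid value at the
# observation's own coordinates should match the direct evaluation.
assert grid[0][1] == toy_decision(obs)
```

One check this suggests: the value plotted at the observation's own coordinates should equal the classifier's output for that observation, and a perturbed observation should be re-evaluated with exactly the same fixed values that were used to draw the slice.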

The graphs show classification regions as expected, but my initial
interpretation seems wrong (commented out
<https://raw.githubusercontent.com/math-a3k/covid-ht/master/docs/classification/graphing.rst>
).

If that explanation was good, I would expect that perturbing one variable
in a direction where the graph shows another class should switch the
classification, as the remaining variables are fixed and that should be the
value that the classifier uses to decide - which is plotted in that plane.

That is not happening, as you may check here
<http://covid-ht.herokuapp.com/> (the classifier being used is a
Histogram-based Gradient Boosting Classification Tree).

Any insight into the situation would be highly appreciated. Thanks in
advance.