[scikit-learn] Fwd: Re: Topic for thesis work on scikit learn
Gaurav Dhingra
gauravdhingra.gxyd at gmail.com
Tue Jan 23 11:09:23 EST 2018
Hi Andreas,
On Tuesday 23 January 2018 09:12 PM, Gaurav Dhingra wrote:
>
> -------- Forwarded Message --------
> Subject: Re: [scikit-learn] Topic for thesis work on scikit learn
> Date: Tue, 23 Jan 2018 10:16:36 -0500
> From: Andreas Mueller <t3kcit at gmail.com>
> To: Gaurav Dhingra <gauravdhingra.gxyd at gmail.com>
>
> Hi Gaurav.
>
> Is your mentor experienced in contributing to sklearn?
>
No, she isn't.
> Will they be able to review your code to the scikit-learn standards?
>
No.
> Have you worked on any other pull requests so far?
>
I've worked on a few. Please have a look at
https://github.com/scikit-learn/scikit-learn/pulls/gxyd; in fact, I expect
that 3 of the open PRs will be merged soon.
> Getting anything into scikit-learn without close collaboration with
> the community is quite tricky.
>
> Having a faster K-means implementation based on recent research in the
> area would be interesting,
> There's also interest in adding Robust PCA, probabilistic inference
> trees, and improving the Latent Dirichlet Allocation code.
>
I tried to look into what the scikit-learn community/devs consider a
priority to have in their code base (instead of looking explicitly
for topics I like). When I looked, I thought of
https://github.com/scikit-learn/scikit-learn/issues/8337 or
https://github.com/scikit-learn/scikit-learn/issues/6557 as possible
topics. But since I'm aware that your unavailability (being busy with
teaching) can be an issue, I simultaneously looked for other
options. I had a conversation with Joel (he was kind enough to PM me),
and this is what he said (only the important part of the conversation):
| Tricky things we’ve been trying to do for years:
| * estimator tags
| * sample props
| * tools for optimising cluster parameters (e.g. #6948)
| sample props == #4497 and associated
| related to clusterer parameters, #6160
| estimator tags relates to #6715
| #6777 looks tricky from an ML perspective.
I'm thinking of choosing
https://github.com/scikit-learn/scikit-learn/pull/6948 (ENH optimal
n_clusters value), i.e. completing that PR. If you have the
availability to review my PRs (if I do open them), then I'd be glad to
work with you on either /Conditional inference trees/ or /adding
post-pruning for decision trees/.
I'm aware that, as Joel put it earlier, /Andreas has escaped into the
teaching world/. Anyway, I don't expect my guide to give me feedback on
scikit-learn code, though she will definitely have theoretical
explanations for my questions. Also, since we can have a
co-guide (apart from the local guide), I would definitely consider
someone from scikit-learn for that role, whether you or maybe
Joel. But even Joel is expected to get back to the academic world.
If things don't work out (if neither you, Joel, nor
someone else from the scikit-learn community is available), it will
take me a little longer, but I'll probably get there eventually.
> You can find issues on any of these in the issue tracker, which also
> has many more feature requests.
>
> Andy
>
>
> On 12/31/2017 05:46 AM, Gaurav Dhingra wrote:
>>
>> Hi Andreas,
>>
>> I think I'll get access to a local mentor from my college, so I think
>> I can rule that issue out. For technical matters, though, I would
>> still /like/ to depend more on feedback from the scikit-learn community,
>> since my aim isn't to make something for my own use but rather
>> something useful for the scikit-learn community,
>> so that it eventually gets merged into master.
>>
>> I'm currently looking for a topic that I can take up. I tried looking
>> into the scikit-learn wiki, but it doesn't mention what I'm looking
>> for (no topic is mentioned). Do you have some topic in mind that
>> could be a useful addition to scikit-learn? Even if you could
>> direct me to appropriate links, I would be happy to look into those.
>>
>>
>> On Wednesday 01 November 2017 01:43 AM, Andreas Mueller wrote:
>>> Hi Gaurav.
>>>
>>> Do you have a local mentor? I think having a mentor that can guide
>>> you during a thesis is very important.
>>> You could get some feedback from the community for a contribution,
>>> but that can be slow,
>>> and is entirely on volunteer basis, so there is no guarantee that
>>> you'll get the necessary feedback in time
>>> to finish your thesis.
>>>
>>> Mentoring a thesis - in particular without knowing you - is a
>>> serious commitment, so I'm not sure someone
>>> from inside the project will want to do this. I saw you already made
>>> a contribution in
>>> https://github.com/scikit-learn/scikit-learn/pull/10005
>>> but that's a very different scope than doing what I expect would be
>>> several months of work.
>>
>> In this regard I've made a few more contributions; here is the
>> link: https://github.com/scikit-learn/scikit-learn/pulls/gxyd. I know
>> none of them is a big contribution. If you think I should work
>> on a big enough PR, can you please suggest an issue in that regard?
>>
>> Thanks.
>>
>>>
>>>
>>> Best,
>>> Andy
>>>
>>> On 10/31/2017 03:31 PM, Gaurav Dhingra wrote:
>>>> Hi everyone,
>>>>
>>>> I am a final year (5th year) undergraduate Applied Mathematics
>>>> student in India. I am thinking of doing my final year thesis by
>>>> doing some work (the coding part) on scikit-learn, so I was wondering
>>>> whether anyone could tell me if there are available topics (not
>>>> necessarily the names of those topics) that I could work on as an
>>>> undergraduate student. I would want to expand upon this in December,
>>>> when my exams will be over. But in the meantime I would like to take
>>>> a step in that direction by just knowing whether there will be
>>>> available topics that I could work on.
>>>>
>>>> It could be the case that the available topics are not so easy for an
>>>> undergraduate; even in that case I would like to do some research
>>>> on the topics first.
>>>>
>>>
>>
>> --
>> Gaurav Dhingra
>> (sent from Thunderbird email client)
>
--
Gaurav Dhingra
(sent from Thunderbird email client)