[scikit-learn] Github project management tools

Nelle Varoquaux nelle.varoquaux at gmail.com
Fri Dec 2 20:04:10 EST 2016


Hello,

This seems like a good moment to say that we will be starting a project at
BIDS next semester to try to extract information from GitHub and classify
PRs into different categories (stalled, updated, needs review).
Stéfan drafted a list of elements he would like to see for
scikit-image, and I have been wanting something similar for
matplotlib.
I've got my hands full right now, but we are more than open to discussing
this with the wider community to see if such a tool would be useful and
which features are of interest.

Here are some examples of elements we'd like to be able to identify and sort:

- Most active pull requests (“hot topics”).
- The ones "I" have commented on.
- PRs that haven’t seen any discussion.
- Stalled PRs.
- New issues without any comments.
- Old PRs that could be merged.
- Recently merged PRs that refer to a ticket but haven’t closed that ticket.
- Duplicate PRs (closing the same ticket).
- Tickets that are being referred to many times.
- Unmergeable PRs (that need to be rebased).
- PRs that passed the majority of tests.
- Issues that external projects refer to.
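
As a rough illustration of the kind of sorting listed above, here is a
minimal Python sketch. It assumes the PR metadata has already been fetched
into simple records; the PullRequest fields, the category names and the
classify/hot_topics helpers are all hypothetical, the PR numbers are made
up, and the 30-day threshold is the one suggested for stalled PRs in the
thread quoted below. It is not an existing tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PullRequest:
    # Hypothetical record; field names are illustrative only and are
    # not meant to mirror the GitHub API schema.
    number: int
    updated_at: datetime
    n_comments: int
    mergeable: bool


def classify(pr, now, stalled_after=timedelta(days=30)):
    """Return one coarse category for a pull request."""
    if now - pr.updated_at >= stalled_after:
        return "stalled"          # no activity for 30 days or more
    if not pr.mergeable:
        return "needs rebase"     # conflicts with the target branch
    if pr.n_comments == 0:
        return "needs review"     # nobody has discussed it yet
    return "updated"


def hot_topics(prs, top=3):
    """Return the most-discussed pull requests, most comments first."""
    return sorted(prs, key=lambda pr: pr.n_comments, reverse=True)[:top]


now = datetime(2016, 12, 2)
prs = [
    PullRequest(1, datetime(2016, 9, 1), 12, True),
    PullRequest(2, datetime(2016, 12, 1), 0, True),
    PullRequest(3, datetime(2016, 11, 30), 4, False),
    PullRequest(4, datetime(2016, 11, 29), 7, True),
]
for pr in prs:
    print(pr.number, classify(pr, now))
```

The order of the checks is a design choice: inactivity wins over
everything else here, so an old PR with conflicts is reported as stalled
rather than as needing a rebase.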

Do you think something like this could be interesting for sklearn?
Also, if you have scripts that do similar things and that you would be
willing to share, we would be very happy to see what already exists
out there.

Cheers,
N

On 2 December 2016 at 16:52, Andy <t3kcit at gmail.com> wrote:
> So did we ever decide on how to prioritize reviews?
> (I was still mentally / notification catching up after 0.18.1)
>
> There are some really important issues to tackle, often with proposed
> solutions, but no reviews!
> It's hard for everybody to keep the big picture in mind with such a full
> issue tracker.
> I think it might be helpful if Joel and I prioritize issues. Obviously that
> will only make
> sense if the other team members check up on it when deciding what to review
> / work on.
>
> Do we want to try to seriously use the project feature?
> https://github.com/scikit-learn/scikit-learn/projects/5
>
> On my monitor I can fit four columns and the "add cards" tab.
> I tried using five columns (separating in-progress and stalled PRs) but then
> I couldn't access the right-most column when
> the "add cards" tab was open.
> The whole interface is a bit awkward but maybe the best we have (for example
> moving something from the bottom
> to the top is easiest by moving it to a different column, then scrolling up,
> then moving it back)
>
> wdyt?
> Andy
>
>
>
> On 09/29/2016 11:05 PM, Joel Nothman wrote:
>
> The spreadsheet seems to have some duplications and presumably some missing
> rows, with apologies. I assume some is due to the github pagination, and
> some may be my error. Not a big enough error to fix up.
>
> On 30 September 2016 at 05:15, Raphael C <drraph at gmail.com> wrote:
>>
>> My apologies, I see it is in the spreadsheet. It would be great to see
>> this work finished for 0.19 if at all possible IMHO.
>>
>> Raphael
>>
>> On 29 September 2016 at 20:12, Raphael C <drraph at gmail.com> wrote:
>> > I hope this isn't out of place but I notice that
>> > https://github.com/scikit-learn/scikit-learn/pull/4899 is not in the
>> > list. It seems like a very worthwhile addition and the PR appears
>> > stalled at present.
>> >
>> > Raphael
>> >
>> > On 29 September 2016 at 15:05, Joel Nothman <joel.nothman at gmail.com>
>> > wrote:
>> >> I agree that being able to identify which PRs are stalled on the
>> >> contributor's part, which on reviewers' part, and since when, would be
>> >> great. I'm not sure we've come up with a way that'll work though.
>> >>
>> >> In terms of backlog, I've wondered if just getting things into a
>> >> spreadsheet
>> >> would help:
>> >>
>> >>
>> >> https://docs.google.com/spreadsheets/d/1LdzNxQbn7A0Ao8zlUBgnvT42929JpAe9958YxKCubjE/edit
>> >>
>> >> What other features of an Issue / PR would be useful to
>> >> sort/filter/pivottable on in a spreadsheet form like this?
>> >>
>> >> (It would be extra nice if we could modify titles and labels within the
>> >> spreadsheet and have them update via the GitHub API, but I'm not sure
>> >> I'll
>> >> get around to making that feature :P)
>> >>
>> >>
>> >> On 29 September 2016 at 23:45, Andreas Mueller <t3kcit at gmail.com>
>> >> wrote:
>> >>>
>> >>> So I made a project for 0.19:
>> >>>
>> >>> https://github.com/scikit-learn/scikit-learn/projects/5
>> >>>
>> >>> The idea would be to drag and drop issues and PRs so that the
>> >>> important
>> >>> ones are at the top.
>> >>> We could also add an "important" column, currently the scrolling is
>> >>> pretty
>> >>> annoying.
>> >>> Thoughts?
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On 09/28/2016 03:29 PM, Nelle Varoquaux wrote:
>> >>>>
>> >>>> On 28 September 2016 at 12:24, Andreas Mueller <t3kcit at gmail.com>
>> >>>> wrote:
>> >>>>>
>> >>>>>
>> >>>>> On 09/28/2016 02:21 PM, Nelle Varoquaux wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>> I think the only ones worth having are the ones that can be dealt
>> >>>>>> with
>> >>>>>> automatically and the ones that will not be used frequently:
>> >>>>>>
>> >>>>>> - stalled after 30 days of inactivity [can be done automatically]
>> >>>>>> - in dispute [I don't expect it to be used often].
>> >>>>>
>> >>>>> I think "in dispute" is actually one of the most common statuses
>> >>>>> among
>> >>>>> PRs.
>> >>>>> Or maybe I have a skewed picture of things.
>> >>>>> Many PRs stalled because it is not clear whether the proposed
>> >>>>> solution
>> >>>>> is a
>> >>>>> good one.
>> >>>>
>> >>>> On the stalled one, sure, but there are a lot of PRs being merged
>> >>>> fairly quickly. So over all, I think it is quite rare. No?
>> >>>>
>> >>>>> It would be great to have some way to get through the backlog of 400
>> >>>>> PRs
>> >>>>> and
>> >>>>> I think tagging them might be useful.
>> >>>>> We rarely reject PRs, we could also revisit that policy.
>> >>>>>
>> >>>>> For the backlog, it's pretty unclear to me how many are waiting for
>> >>>>> reviews,
>> >>>>> how many are waiting for changes,
>> >>>>> and how many are disputed.
>> >>>>> Tagging these might help people who want to review to find things to
>> >>>>> review,
>> >>>>> and people who want to code to pick
>> >>>>> up stalled PRs.
>> >>>>
>> >>>> That sounds like a great use of labels, though all of these need to
>> >>>> be tagged manually.
>> >>>>
>> >>>>> _______________________________________________
>> >>>>> scikit-learn mailing list
>> >>>>> scikit-learn at python.org
>> >>>>> https://mail.python.org/mailman/listinfo/scikit-learn

