[Pandas-dev] Mailing list for Python data analytics ecosystem developers?

Andy Ray Terrel andy.terrel at gmail.com
Thu Jan 3 11:53:06 EST 2019


On Mon, Dec 31, 2018 at 12:07 PM Andy Ray Terrel <andy.terrel at gmail.com>
wrote:

>
> On Mon, Dec 31, 2018 at 11:28 AM Wes McKinney <wesmckinn at gmail.com> wrote:
>
>> As a discussion list intended for project developers, I am not
>> anticipating so much noise that people become disengaged. If we were
>> creating a forum to collect user feedback, that would be a little bit
>> different. I'm more looking to encourage the sharing of more high
>> level project planning, roadmaps and goals, fund raising activities,
>> and other matters related to the health and growth of the major
>> community projects. It would be really useful for each project to
>> state a list of goals for some future horizon (e.g. 1 year).
>>
>> I have observed that some of these cross-project discussions often
>> only happen in person, or on an ad hoc basis on GitHub issues.
>>
>> User feedback can be helpful, but in practice most projects function
>> as "do-ocracies" where opinions are roughly valued proportional to
>> project contributions.
>>
>> It would also be useful to be able to point users to historical
>> discussions amongst project developers when there are questions or
>> concerns. My anecdotal experience is that the lack of visible /
>> centralized cross-project discussions and roadmapping / planning /
>> goal discussion has at times led to user (or developer) confusion
>> about what different groups of developers are trying to accomplish.
>>
>> The concerns raised seem to be mostly about optimizing large-scale
>> communications. Let's first see if there is communication that needs
>> to be optimized. Even if we add additional tools to facilitate
>> communications, I think we still need a mailing list.
>>
>>
> Have we decided which mailing list we desire? I forgot we could also just
> make it dev at pydata.org if we like. In general, I think we should write up
> a governance document on pydata as a whole.
>
>

I've created a list dev at pydata.org if we want to use it.

On the governance front, Leah is putting together a plan for managing
PyData conferences going forward. We also wanted to revamp pydata.org to
reflect more of the development community around the ecosystem, so definitely
send ideas and thoughts.

-- Andy


> - Andy
>
>
>> - Wes
>>
>> On Sat, Dec 29, 2018 at 4:58 PM Matthew Rocklin <mrocklin at gmail.com>
>> wrote:
>> >
>> > > I don't see what is wrong with using e-mail.
>> >
>> > There were some issues raised before:
>> >
>> > I'm slightly concerned that a broad ranging e-mail list that
>> encompasses all of PyData would get noisy.  For example I can imagine
>> lengthy conversations on visualization or probabilistic programming that,
>> while I find them important, I would likely want to take a pass on.  Having a
>> service that includes tags and subscription to those tags may have value.
>> > E-mail list archives tend to collect dust.  If we're doing long-range
>> cross-project coordination then those conversations might have long term
>> value.  We might want to cross reference them, upvote them, subscribe to
>> them, and so on.
>> >
>> > And also some benefits of discourse raised by Nathaniel which might be
>> turned around to be interpreted as concerns with e-mail.
>> >
>> > My impression so far is that discourse takes a bit of adjustment
>> > compared to mailing lists, but it has a lot of valuable features like
>> > multi-quoting, markdown (code blocks, links, ...), solid moderation
>> > tools (e.g. if a discussion diverges you can retroactively split parts
>> > of it out into a new topic), polls (these were incredibly useful for
>> > taking the temperature of the community during the governance
>> > discussions), ability to reply to messages that were posted before you
>> > joined the list, configurable notifications (email me everything /
>> > email me when a new topic is created / email me a summary weekly /
>> > ...), ...
>> >
>> > > It is public, archival, and append-only. GitHub issues are
>> non-archival and comments can be edited or deleted.
>> >
>> > That's certainly true of GitHub issues.  I suspect that it's also true
>> of Discourse (though I'd have to go through the docs to make sure that it
>> wasn't possible to turn it off).  From my perspective the (in)ability to
>> edit or delete comments isn't a big deal.  I'm not particularly concerned
>> with people modifying history in a nefarious way.  Though perhaps my
>> viewpoint here is naive.  I haven't yet run into this issue in our
>> community.
>> >
>> > I think that the biggest benefit to using an e-mail list is that it's a
>> well known technology with a low barrier to adoption.
>> >
>> > I anticipate two likely failure modes for e-mail and discourse
>> respectively:
>> >
>> > E-mail: conversation is too diffuse so that people sign up, get bored
>> listening to things that don't interest them, and then stop notifications.
>> The pydata mailing list ends up being used by small subsets of the
>> community, but not the community as a whole.
>> > Discourse: it's too new/unknown so that no one signs up and it doesn't
>> reach critical mass.  (this seems to be happening with Jupyter's discourse
>> today?)
>> >
>> > There are lots of other pros and cons to each, obviously, but those two
>> outcomes are, I think, the most troublesome.
>> >
>> > On Thu, Dec 27, 2018 at 8:13 AM Wes McKinney <wesmckinn at gmail.com>
>> wrote:
>> >>
>> >> Having dev.pydata.org sounds fine to me.
>> >>
>> >> I don't see what is wrong with using e-mail. It is public, archival,
>> >> and append-only. GitHub issues are non-archival and comments can be
>> >> edited or deleted.
>> >>
>> >> On Thu, Dec 27, 2018 at 6:26 AM Andy Ray Terrel <andy.terrel at gmail.com>
>> wrote:
>> >> >
>> >> > I would recommend we set up a site dev.pydata.org that tells the
>> folks where conversations are happening. While mailing lists are great we
>> might consider just having a github issue tracker set up for cross
>> ecosystem bugs or initiatives. I was planning on decommissioning the
>> numfocus discourse and zulip server as they didn't really have great use.
>> Chris Holdgraf suggested using Topic Box, but user based pricing isn't a
>> great option for open source development.
>> >> >
>> >> > Anywho, both dask and pandas are part of the NumFOCUS projects
>> ecosystem so I'm happy to set anything up for y'all.
>> >> >
>> >> > -- Andy
>> >> >
>> >> > On Wed, Dec 26, 2018 at 10:35 PM Wes McKinney <wesmckinn at gmail.com>
>> wrote:
>> >> >>
>> >> >> @Andy
>> >> >>
>> >> >> pydata at googlegroups.com has 2734 members. Based on recent traffic
>> it
>> >> >> is really a user / Q&A mailing list, not a place for the
>> >> >> maintainers/steering committees of major projects to speak publicly
>> >> >> with one another (where discussions are public, archived,
>> searchable).
>> >> >> I have observed that there are many discussions happening between
>> the
>> >> >> developers of projects on an ad hoc basis and on ad hoc
>> communication
>> >> >> channels (both private and public). Partly there is no obvious place
>> >> >> for, e.g., the developers of pandas and dask to have a public
>> >> >> discussion, that is not necessarily "partisan" to one of those
>> >> >> projects.
>> >> >>
>> >> >> As another example issue, there is not an obvious place to raise
>> >> >> issues. Back in the day I think numpy-discussion or scipy-user used
>> to
>> >> >> partly serve this purpose, but the centers of gravity have shifted.
>> >> >>
>> >> >> - Wes
>> >> >>
>> >> >>
>> >> >> On Wed, Dec 26, 2018 at 9:41 PM Andy Ray Terrel <
>> andy.terrel at gmail.com> wrote:
>> >> >> >
>> >> >> > I'm not completely clear what is being asked for since
>> pydata at googlegroups.com already exists. Since NumFOCUS is promoting the
>> PyData conference and helping build the brand for the ecosystem, I wonder
>> if a home like pydata-dev at numfocus.org would be interesting for folks?
>> >> >> >
>> >> >> > It is also my understanding that there will be a fuller steering
>> committee set up for the conferences next year. I propose we do the same
>> for the technical structure. As is, I manage the website and github repos
>> but there is not much dictating how I manage these.
>> >> >> >
>> >> >> > -- Andy
>> >> >> >
>> >> >> >
>> >> >> > On Wed, Dec 26, 2018 at 6:00 PM Nathaniel Smith <njs at pobox.com>
>> wrote:
>> >> >> >>
>> >> >> >> Other examples of discourse used for dev discussion include:
>> >> >> >>
>> >> >> >> - https://internals.rust-lang.org/ -- main dev forum for rust
>> >> >> >> - https://discuss.python.org/ -- potential replacement for
>> >> >> >> python-{committers,dev,users}, still experimental but where a
>> ton of
>> >> >> >> the python governance discussion happened
>> >> >> >>
>> >> >> >> My impression so far is that discourse takes a bit of adjustment
>> >> >> >> compared to mailing lists, but it has a lot of valuable features
>> like
>> >> >> >> multi-quoting, markdown (code blocks, links, ...), solid
>> moderation
>> >> >> >> tools (e.g. if a discussion diverges you can retroactively split
>> parts
>> >> >> >> of it out into a new topic), polls (these were incredibly useful
>> for
>> >> >> >> taking the temperature of the community during the governance
>> >> >> >> discussions), ability to reply to messages that were posted
>> before you
>> >> >> >> joined the list, configurable notifications (email me everything
>> /
>> >> >> >> email me when a new topic is created / email me a summary weekly
>> /
>> >> >> >> ...), ...
>> >> >> >>
>> >> >> >> -n
>> >> >> >>
>> >> >> >> On Wed, Dec 26, 2018 at 3:41 PM Matthew Rocklin <
>> mrocklin at gmail.com> wrote:
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >>> Copying the mailing list
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Whoops!  E-mail fail on my part.
>> >> >> >> >
>> >> >> >> >>> Discourse is interesting. It seems to be used (at least in
>> PyTorch's
>> >> >> >> >>> case) as more of a modern message board for users than a
>> place for
>> >> >> >> >>> long-form discussions between project developers.
>> >> >> >> >>>
>> >> >> >> >>> IMHO having a cross-project developer mailing list is
>> probably overdue
>> >> >> >> >>> -- I think we can do a better job the next couple of years
>> >> >> >> >>> coordinating (colluding?) with each other. A lot of
>> coordination does
>> >> >> >> >>> of course happen on a private, project-level, or other ad hoc basis.
>> It would
>> >> >> >> >>> help to be able to discuss ecosystem-level problems and
>> possible
>> >> >> >> >>> solutions.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Entirely agreed.  And I think that an e-mail list is the
>> obvious first choice here.
>> >> >> >> >
>> >> >> >> > I'm bringing up discourse as an alternative for
>> consideration.  This is for a couple reasons:
>> >> >> >> >
>> >> >> >> > I'm slightly concerned that a broad ranging e-mail list that
>> encompasses all of PyData would get noisy.  For example I can imagine
>> lengthy conversations on visualization or probabilistic programming that,
>> while I find them important, I would likely want to take a pass on.  Having a
>> service that includes tags and subscription to those tags may have value.
>> >> >> >> > E-mail list archives tend to collect dust.  If we're doing
>> long-range cross-project coordination then those conversations might have
>> long term value.  We might want to cross reference them, upvote them,
>> subscribe to them, and so on.
>> >> >> >> >
>> >> >> >> > With regard to PyTorch's discuss forum in particular, I agree that it
>> is used more as a user forum, which is a different use case than
>> what Wes is proposing here.  I mostly pointed to it so that people could
>> get a sense of what an active system looks like.
>> >> >> >> >
>> >> >> >> > Regardless, I encourage this conversation to happen with a
>> broader set of people.  I believe that other groups are considering these
>> topics as well and may have thoughts beyond those that have been expressed
>> here.  I'm not sure how best to bootstrap this process, other than an
>> e-mail to maybe the NumFOCUS mailing list and perhaps a tweet?
>> >> >> >> >
>> >> >> >> > > There's both a NumFOCUS discourse and zulip, I believe, but
>> neither is particularly active. Whether those should be considered possible
>> starting points or cautionary tales I leave to y'all.
>> >> >> >> >
>> >> >> >> > Yeah, I should also amend my previous statement from "how
>> about discourse?" to "is there anything more appropriate than an e-mail
>> list?".  Discourse is the service around which I've seen the most activity
>> recently but I agree that in our community, it hasn't really taken off that
>> well.
>> >> >> >> >
>> >> >> >> > And just to reiterate, I think that an e-mail list would be
>> great.  Just wanted to throw out some other thoughts.
>> >> >> >> >
>> >> >> >> > Best,
>> >> >> >> > -matt
>> >> >> >> >>>
>> >> >> >> >>>
>> >> >> >> >>> > On Wed, Dec 26, 2018 at 8:47 AM Wes McKinney <
>> wesmckinn at gmail.com> wrote:
>> >> >> >> >>> >>
>> >> >> >> >>> >> I sent a request to postmaster @ python.o to create
>> >> >> >> >>> >> pydata-dev at python.org. We can also use google groups if
>> others prefer
>> >> >> >> >>> >> that
>> >> >> >> >>> >>
>> >> >> >> >>> >> On Tue, Dec 25, 2018 at 3:59 PM Joris Van den Bossche
>> >> >> >> >>> >> <jorisvandenbossche at gmail.com> wrote:
>> >> >> >> >>> >> >
>> >> >> >> >>> >> > Given the growing ecosystem of data tools (in some way
>> related to pandas, but not pandas itself), I am also +1 on such a list. I
>> think that would be welcome, and I am not aware of anything existing.
>> >> >> >> >>> >> >
>> >> >> >> >>> >> > Joris
>> >> >> >> >>> >> >
>> >> >> >> >>> >> > On Tue, Dec 25, 2018 at 02:19 Stephan Hoyer <
>> shoyer at gmail.com> wrote:
>> >> >> >> >>> >> >>
>> >> >> >> >>> >> >> +1 for pydata-dev
>> >> >> >> >>> >> >>
>> >> >> >> >>> >> >> I don't think there's a list quite like this today.
>> >> >> >> >>> >> >>
>> >> >> >> >>> >> >> On Mon, Dec 24, 2018 at 6:11 PM Wes McKinney <
>> wesmckinn at gmail.com> wrote:
>> >> >> >> >>> >> >>>
>> >> >> >> >>> >> >>> I'm talking about public archived communication
>> channels
>> >> >> >> >>> >> >>>
>> >> >> >> >>> >> >>> On Mon, Dec 24, 2018, 7:57 PM William Ayd <
>> william.ayd at icloud.com> wrote:
>> >> >> >> >>> >> >>>>
>> >> >> >> >>> >> >>>> What do you think is missing from the existing
>> PyData conferences? I’ve only been to the one in LA but it seemed to be
>> somewhat in the direction of what you are asking for.
>> >> >> >> >>> >> >>>>
>> >> >> >> >>> >> >>>> > On Dec 24, 2018, at 3:02 PM, Wes McKinney <
>> wesmckinn at gmail.com> wrote:
>> >> >> >> >>> >> >>>> >
>> >> >> >> >>> >> >>>> > hi folks,
>> >> >> >> >>> >> >>>> >
>> >> >> >> >>> >> >>>> > Prompted by some recent discussions I wondered
>> what you all think
>> >> >> >> >>> >> >>>> > would be the best venue to have public discussions
>> that involve other
>> >> >> >> >>> >> >>>> > open source projects that are generally 1 degree
>> of separation away
>> >> >> >> >>> >> >>>> > from pandas. Sort of like "pydata-dev", or
>> something. Is there
>> >> >> >> >>> >> >>>> > something like this already that I just missed?
>> >> >> >> >>> >> >>>> >
>> >> >> >> >>> >> >>>> > As context, I'm trying to travel less and go to
>> fewer conferences the
>> >> >> >> >>> >> >>>> > next couple of years, and spend more time coding
>> and writing, but I
>> >> >> >> >>> >> >>>> > still want to talk with people (asynchronously)
>> about things, and
>> >> >> >> >>> >> >>>> > preferably in public.
>> >> >> >> >>> >> >>>> >
>> >> >> >> >>> >> >>>> > - Wes
>> >> >> >> >>> >> >>>> > _______________________________________________
>> >> >> >> >>> >> >>>> > Pandas-dev mailing list
>> >> >> >> >>> >> >>>> > Pandas-dev at python.org
>> >> >> >> >>> >> >>>> >
>> https://mail.python.org/mailman/listinfo/pandas-dev
>> >> >> >> >>> >> >>>>
>> >> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >> Nathaniel J. Smith -- https://vorpus.org
>>
>