From jorisvandenbossche at gmail.com  Thu Oct  3 16:32:44 2019
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Thu, 3 Oct 2019 22:32:44 +0200
Subject: [Pandas-dev] ROADMAP proposal: Consistent missing value handling
 with new NA scalar
Message-ID: <CALQtMBayajcL7LWKb-1NLQa8CKrz9zmgfp-3rx-vA0SDoP2Q-g@mail.gmail.com>

Hi all,

I would like to propose revisiting missing value handling in pandas. It's
already being discussed on GitHub (
https://github.com/pandas-dev/pandas/issues/28095), but I want to mention
it on the mailing list as well for broader feedback.
A more detailed proposal can be found here:
https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB, and the discussion is
taking place in the GitHub issue above.

A summary of the proposal is to introduce *a new NA value (singleton) for
representing scalar missing values* (instead of np.nan) that can be used
consistently across all data types. This could be achieved under the hood
by using a mask-based approach to store the missing values on the
array/series-level, but the main discussion here is about the user-facing
API: the scalar NA value and the behaviour of NA in several operations.

Motivation for this change:

   - *Consistent user interface.*
   Currently, the value you get back for a missing scalar (eg from scalar
   access s[idx]) depends on the data type (np.nan for many, but pd.NaT for
   datetime-likes). Some types support missing values, others don't. This
   proposal would ensure you get back pd.NA regardless of the dtype.
   - *No "mis-use" of the np.nan floating point value.*
   The NaN value is a specific floating point value, and not necessarily an
   indicator for missing values (although pandas has always used it that way).
   And because we also use it for other dtypes, you get back a float value for
   non-float dtypes, giving misleading dtype information.
   - *A missing value that behaves accordingly.*
   Our current behaviour of missing values is inherited from the np.nan
   behaviour. Other languages that have a NA/NULL value that is distinguished
   from NaN (eg Julia, SQL, R) typically have different behaviour in
   comparison and logical operations. For example, comparison with NA could
   give NA instead of False, and consequently we need to have a boolean dtype
   with NA support. A new NA value opens up the possibility of having such
   behaviour.
   - An "NA" scalar *matches the terminology* that is used throughout
   pandas in functions and argument names (isna, dropna, fillna, skipna, ...).


See the proposal <https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB> for more
details.
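
To make the comparison and logical behaviour described above more concrete,
here is a minimal pure-Python sketch of how such a scalar could behave. The
NAType class and the NA name are hypothetical and only illustrate
three-valued logic in the style of the languages mentioned above; this is
not a pandas implementation, and the actual API is exactly what is up for
discussion.

```python
class NAType:
    """Hypothetical missing-value singleton (illustration only, not pandas)."""

    _instance = None

    def __new__(cls):
        # Always hand back the same object so that `x is NA` works.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "NA"

    # Comparisons propagate NA instead of returning False.
    def __eq__(self, other):
        return self

    def __ne__(self, other):
        return self

    # Defining __eq__ removes the default hash, so restore it explicitly.
    __hash__ = object.__hash__

    def __and__(self, other):
        # False & NA is False whatever NA stands for; otherwise unknown.
        if other is False:
            return False
        if other is True or other is self:
            return self
        return NotImplemented

    __rand__ = __and__

    def __or__(self, other):
        # True | NA is True whatever NA stands for; otherwise unknown.
        if other is True:
            return True
        if other is False or other is self:
            return self
        return NotImplemented

    __ror__ = __or__

    def __bool__(self):
        # An unknown value has no definite truth value.
        raise TypeError("boolean value of NA is ambiguous")


NA = NAType()

nan = float("nan")
print(nan == nan)   # False: NaN comparison follows floating point rules
print(NA == 1)      # NA: the result of the comparison is unknown
print(True | NA)    # True
print(False & NA)   # False
print(True & NA)    # NA
```

Because a comparison returns NA rather than False, a boolean result can hold
three states, which is why the proposal mentions needing a boolean dtype
with NA support.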

This of course has many consequences for the user API of pandas. Initially,
it could therefore be introduced optionally (eg only in the new data types
such as the nullable integer or string dtype).
And given those pervasive changes, many eyes on it are important. *So
feedback on this idea would be greatly appreciated!*

Joris

From sebastian at sipsolutions.net  Thu Oct  3 17:54:19 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 03 Oct 2019 14:54:19 -0700
Subject: [Pandas-dev] =?utf-8?q?Accepting_NEP_29_=E2=80=94_Recommend_Pyth?=
 =?utf-8?q?on_and_Numpy_version_support_as_a_community_policy_standard?=
Message-ID: <cff777d659241d83dad1556172baf25d82b20209.camel@sipsolutions.net>

Hi all,

we propose formally accepting the NumPy enhancement proposal 29:

"Recommend Python and Numpy version support as a community policy
standard"

available at: https://numpy.org/neps/nep-0029-deprecation_policy.html

If there are no objections within a week it may be accepted. This
proposal is a recommendation to the larger ecosystem and thus should
receive attention and acceptance from a wide audience.
However, let's try to keep discussions on the NumPy mailing list.

The most important points from the Abstract and Implementation sections
are:

"This NEP recommends that all projects across the Scientific Python
ecosystem adopt a common 'time window-based' policy for support of
Python and NumPy versions. Standardizing a recommendation for project
support of minimum Python and NumPy versions will improve downstream
project planning. ..."

and:

"We suggest that all projects adopt the following language into their
development guidelines:

This project supports:
  * All minor versions of Python released 42 months prior to the
project, and at minimum the two latest minor versions.
  * All minor versions of numpy released in the 24 months prior to the
project, and at minimum the last three minor versions."

For the full text, please refer to the link above.
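
As an illustration of how the time-window rule quoted above could be
evaluated, here is a small sketch. The function names, the version labels
and the release dates are all made up for the example (they are not real
Python or NumPy releases), and the month arithmetic is deliberately
simplistic; the NEP text itself remains the reference.

```python
from datetime import date


def months_before(day, months):
    """Return the date `months` calendar months before `day`.

    The day of month is clamped to 28 to avoid invalid dates (a
    simplification made for this sketch).
    """
    total = day.year * 12 + (day.month - 1) - months
    return date(total // 12, total % 12 + 1, min(day.day, 28))


def supported_versions(releases, project_release, window_months, minimum):
    """Apply a time-window rule to a {version: release_date} mapping.

    A version is supported if it was released within `window_months` before
    `project_release`, or if it is among the `minimum` latest versions.
    """
    ordered = sorted(releases, key=releases.get)
    released = [v for v in ordered if releases[v] <= project_release]
    cutoff = months_before(project_release, window_months)
    in_window = {v for v in released if releases[v] >= cutoff}
    latest = set(released[-minimum:])
    return [v for v in released if v in in_window or v in latest]


# Hypothetical dependency releases, for illustration only.
releases = {
    "1.0": date(2015, 9, 1),
    "1.1": date(2016, 12, 1),
    "1.2": date(2018, 6, 1),
    "1.3": date(2019, 10, 1),
}
project_release = date(2020, 1, 15)

# 42-month window with at least the two latest versions.
print(supported_versions(releases, project_release, 42, 2))
# ['1.1', '1.2', '1.3']

# 24-month window with at least the three latest versions.
print(supported_versions(releases, project_release, 24, 3))
# ['1.1', '1.2', '1.3']
```

In the second call the window alone would only cover 1.2 and 1.3; the
"at minimum" clause is what keeps 1.1 in the supported set.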

Cheers,

Sebastian

From garcia.marc at gmail.com  Thu Oct  3 21:57:44 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Thu, 3 Oct 2019 22:57:44 -0300
Subject: [Pandas-dev] 
	=?utf-8?q?Accepting_NEP_29_=E2=80=94_Recommend_Pyth?=
	=?utf-8?q?on_and_Numpy_version_support_as_a_community_policy_stand?=
	=?utf-8?q?ard?=
In-Reply-To: <cff777d659241d83dad1556172baf25d82b20209.camel@sipsolutions.net>
References: <cff777d659241d83dad1556172baf25d82b20209.camel@sipsolutions.net>
Message-ID: <CAEk5N5t0L+t-GMvy++11ksxnNQDOjLGLdpiJGndrXF6vxitAFA@mail.gmail.com>

From the pandas side, we created
https://github.com/pandas-dev/pandas/issues/27557 and it seems there is
agreement on accepting the proposal and adhering to it.


From jorisvandenbossche at gmail.com  Mon Oct  7 04:41:50 2019
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Mon, 7 Oct 2019 10:41:50 +0200
Subject: [Pandas-dev] Fwd: [NumFOCUS Projects] Round 3: NumFOCUS Small
 Development Grants CFP is Open
In-Reply-To: <B0589C1A-BCCE-44CF-BE8A-34FE6AE50B3D@icloud.com>
References: <CAFhTXRNSvP9A+rWkVdb+YVZajKDKr0e_ityHfwXS_AQtgERLrg@mail.gmail.com>
 <CAFhTXRPAtu=4GRYiO809zq-oMfOo014MVhdi4t_3eGxMN2gSag@mail.gmail.com>
 <CAFhTXRM3eP4sWqoG=T-RZy8LOr6T1h3tQp7HNiE3RzKbL5kZQQ@mail.gmail.com>
 <CAEk5N5tkAgBBW_-QeijprEo9Oo4xhvLKA9i_oMsy6buAT81f+Q@mail.gmail.com>
 <B0589C1A-BCCE-44CF-BE8A-34FE6AE50B3D@icloud.com>
Message-ID: <CALQtMBbyE_Fq1YePB-TjsrczgOpp01r_TCVi_LBKUVYQ1pgr_w@mail.gmail.com>

Is there any interest in developing one of those ideas into a proposal?
I think a main question is also whether there is someone who actually wants
to do the work if the proposal is accepted (we will probably always find
ideas for things to do ;-))

Joris

On Thu, 19 Sep 2019 at 23:53, William Ayd via Pandas-dev <
pandas-dev at python.org> wrote:

> I think ASV would be the best out of these, because I think it would be
> very useful and also easier to measure completion of versus something like
> "Tighter Arrow Integration" which is a little open ended. I think an ASV
> proposal would be something along the lines of:
>
> - Develop a feedback loop to detect regressions as part of the PR process
> - Standardize test expectations (ex: say how long each test should run,
> define what kind of memory tests we need)
> - Build canned reports on our ASV runner to give a high level overview of
> performance over time (kind of there, but needs some usability polish)
> - Improving existing benchmark suite performance (I think these take a
> very long time to run)
> - Document and update contributing guide on how to test and develop
> benchmarks
>
> Can whittle down remaining points but figured I'd share thoughts for now.
>
> - Will
>
> On Sep 19, 2019, at 3:45 AM, Marc Garcia <garcia.marc at gmail.com> wrote:
>
> The new round of the NumFOCUS small development grants has been announced.
> Deadline for proposals is in a bit more than a month.
>
> I copy here previous ideas for proposals:
>  - Will: A better JSON -> DataFrame parser (I think RapidJSON came up in
> the past)
>  - Will: Tighter Arrow Integration(s)
>  - Will: Various ExtensionArrays (container support comes to mind)
>  - Brock: Improve ASV workflow
>
> Opening the discussion here. I guess to make the proposals more specific,
> it would be good to specify:
> - Summary of the proposal
> - Amount required
> - If applicable, who will be working on the proposal
>
> Probably worth noting that funds do not necessarily need to be for
> development time, but can go to other initiatives like training, events...
>
> I forward the relevant parts of the email from NumFOCUS.
>
> ---------- Forwarded message ---------
>
>
> Hello everyone,
>
> NumFOCUS is pleased to invite proposals from its sponsored and affiliated
> projects for targeted small development grants three times per year. This
> is the third and final call for proposals for 2019.
>
> There are no restrictions on what the funding can be used for: code
> development; documentation work; website updates; workshops and sprints;
> educational, sustainability, and diversity initiatives; or other types of
> projects.
>
> *Yes, you may re-submit a past grant proposal that was previously not
> chosen for funding.*
>
> For a list of all past successful proposals, see our website:
> https://numfocus.org/programs/sustainability#sdg
>
> Only one application may be submitted per project per grant funding cycle.
>
> Available Funding:
>
>    - Up to $5,000 per proposal.
>
> Eligibility:
>
>    - Any NumFOCUS Fiscally Sponsored or Affiliated project may submit one
>    proposal on behalf of the project per grant cycle.
>    - Proposed work must be achievable within calendar year 2019 or the
>    first few months of 2020.
>    - The call is open to applicants from any nationality and can be
>    performed at any university, institute or business worldwide (US export
>    laws permitting).
>
> Round 3 Timeline:
>
>    - *27 Oct 2019: deadline for proposal submissions*
>    - 18 Nov 2019: proposal acceptance notifications
>
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
>
> William Ayd
> william.ayd at icloud.com
>

From garcia.marc at gmail.com  Mon Oct  7 06:03:02 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Mon, 7 Oct 2019 05:03:02 -0500
Subject: [Pandas-dev] Fwd: [NumFOCUS Projects] Round 3: NumFOCUS Small
 Development Grants CFP is Open
In-Reply-To: <CALQtMBbyE_Fq1YePB-TjsrczgOpp01r_TCVi_LBKUVYQ1pgr_w@mail.gmail.com>
References: <CAFhTXRNSvP9A+rWkVdb+YVZajKDKr0e_ityHfwXS_AQtgERLrg@mail.gmail.com>
 <CAFhTXRPAtu=4GRYiO809zq-oMfOo014MVhdi4t_3eGxMN2gSag@mail.gmail.com>
 <CAFhTXRM3eP4sWqoG=T-RZy8LOr6T1h3tQp7HNiE3RzKbL5kZQQ@mail.gmail.com>
 <CAEk5N5tkAgBBW_-QeijprEo9Oo4xhvLKA9i_oMsy6buAT81f+Q@mail.gmail.com>
 <B0589C1A-BCCE-44CF-BE8A-34FE6AE50B3D@icloud.com>
 <CALQtMBbyE_Fq1YePB-TjsrczgOpp01r_TCVi_LBKUVYQ1pgr_w@mail.gmail.com>
Message-ID: <CAEk5N5se64f-K-ORVALayJaAzrCtn5S2+8nU+mRPP-EiuryjRQ@mail.gmail.com>

If there are no other proposals, I'll put one together for the people I'm
mentoring. The idea is for them to work on the docstring efforts, organize
sprints for minorities in their local communities, and lead and coordinate
the effort of fixing the docstrings, while helping increase the diversity
of the project. I'm thinking of something like $2,000 for 4 people/sprints
(to pay for sprint expenses, give out some pandas swag, and provide a small
grant to the people leading them).


From tom.augspurger88 at gmail.com  Tue Oct  8 21:56:05 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 8 Oct 2019 20:56:05 -0500
Subject: [Pandas-dev] Monthly Pandas Developer Meeting
Message-ID: <CAE1aY-=KNd0m6AMppYaZX0YdxEg_qgVdnBc_t0QMoL49DAgYow@mail.gmail.com>

Hi all,

The next regular pandas meeting is tomorrow, October 9th at 12:00 Central
time.

Minutes:
https://docs.google.com/document/u/1/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?ouid=102771015311436394588&usp=docs_home&ths=true

Hangout: https://meet.google.com/hav-rmax-zjx

Tom

From jeffreback at gmail.com  Tue Oct  8 22:05:47 2019
From: jeffreback at gmail.com (Jeff Reback)
Date: Tue, 8 Oct 2019 22:05:47 -0400
Subject: [Pandas-dev] [pydata] Monthly Pandas Developer Meeting
In-Reply-To: <CAE1aY-=KNd0m6AMppYaZX0YdxEg_qgVdnBc_t0QMoL49DAgYow@mail.gmail.com>
References: <CAE1aY-=KNd0m6AMppYaZX0YdxEg_qgVdnBc_t0QMoL49DAgYow@mail.gmail.com>
Message-ID: <88D33C25-D2E2-4386-8B16-B64544D498AC@gmail.com>

tomorrow is a holiday so i won't be around
but pls continue w/o me

From jorisvandenbossche at gmail.com  Thu Oct 10 15:19:38 2019
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Thu, 10 Oct 2019 21:19:38 +0200
Subject: [Pandas-dev] Fwd: [NumFOCUS Projects] Round 3: NumFOCUS Small
 Development Grants CFP is Open
In-Reply-To: <CAEk5N5se64f-K-ORVALayJaAzrCtn5S2+8nU+mRPP-EiuryjRQ@mail.gmail.com>
References: <CAFhTXRNSvP9A+rWkVdb+YVZajKDKr0e_ityHfwXS_AQtgERLrg@mail.gmail.com>
 <CAFhTXRPAtu=4GRYiO809zq-oMfOo014MVhdi4t_3eGxMN2gSag@mail.gmail.com>
 <CAFhTXRM3eP4sWqoG=T-RZy8LOr6T1h3tQp7HNiE3RzKbL5kZQQ@mail.gmail.com>
 <CAEk5N5tkAgBBW_-QeijprEo9Oo4xhvLKA9i_oMsy6buAT81f+Q@mail.gmail.com>
 <B0589C1A-BCCE-44CF-BE8A-34FE6AE50B3D@icloud.com>
 <CALQtMBbyE_Fq1YePB-TjsrczgOpp01r_TCVi_LBKUVYQ1pgr_w@mail.gmail.com>
 <CAEk5N5se64f-K-ORVALayJaAzrCtn5S2+8nU+mRPP-EiuryjRQ@mail.gmail.com>
Message-ID: <CALQtMBYbWkPf04yo7uB8ttFyT9WHtEx7Hvv0tj0b5j_Os3MDCg@mail.gmail.com>

That sounds like a very good idea to me!


From jeffreback at gmail.com  Thu Oct 10 21:13:32 2019
From: jeffreback at gmail.com (Jeff Reback)
Date: Thu, 10 Oct 2019 21:13:32 -0400
Subject: [Pandas-dev] Fwd: [NumFOCUS Projects] Round 3: NumFOCUS Small
 Development Grants CFP is Open
In-Reply-To: <CALQtMBYbWkPf04yo7uB8ttFyT9WHtEx7Hvv0tj0b5j_Os3MDCg@mail.gmail.com>
References: <CAFhTXRNSvP9A+rWkVdb+YVZajKDKr0e_ityHfwXS_AQtgERLrg@mail.gmail.com>
 <CAFhTXRPAtu=4GRYiO809zq-oMfOo014MVhdi4t_3eGxMN2gSag@mail.gmail.com>
 <CAFhTXRM3eP4sWqoG=T-RZy8LOr6T1h3tQp7HNiE3RzKbL5kZQQ@mail.gmail.com>
 <CAEk5N5tkAgBBW_-QeijprEo9Oo4xhvLKA9i_oMsy6buAT81f+Q@mail.gmail.com>
 <B0589C1A-BCCE-44CF-BE8A-34FE6AE50B3D@icloud.com>
 <CALQtMBbyE_Fq1YePB-TjsrczgOpp01r_TCVi_LBKUVYQ1pgr_w@mail.gmail.com>
 <CAEk5N5se64f-K-ORVALayJaAzrCtn5S2+8nU+mRPP-EiuryjRQ@mail.gmail.com>
 <CALQtMBYbWkPf04yo7uB8ttFyT9WHtEx7Hvv0tj0b5j_Os3MDCg@mail.gmail.com>
Message-ID: <D5DB04AC-8913-4D7E-8117-E96EACA44186@gmail.com>

sounds good to me

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191010/b5fe7ded/attachment-0001.html>

From alexbill944 at gmail.com  Thu Oct 10 11:16:44 2019
From: alexbill944 at gmail.com (Alex Bill)
Date: Thu, 10 Oct 2019 15:16:44 +0000
Subject: [Pandas-dev] Donation for GitHub Pandas-Dev
Message-ID: <SalesSequence.9000000961.27448.2187f3bc3.smtp@amsofttech.freshsales.io>


Hello Pandas,

We are a small business impressed by your open source initiative on https://github.com/pandas-dev.

Our management supports different open-source projects under a limited budget on a regular basis. You have made the final list.

We are looking forward to supporting you for the year ahead, either through monthly or quarterly donations, depending on what you prefer.

Please let us know the payment mode that will work for you, so we may proceed.

We are hoping you will accept our humble gesture.

Alex Bill
Blogger | Internet Marketing

e: alexbill944 at gmail.com
w: NamoBOT.com

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191010/6e7e0cce/attachment.html>

From ralf.gommers at gmail.com  Sat Oct 12 11:48:00 2019
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sat, 12 Oct 2019 17:48:00 +0200
Subject: [Pandas-dev] SciPy user documentation survey - please participate
Message-ID: <CABL7CQi0cgri46V1bKoCK94rwzcmqfe8P6TQYJNCLntite1NqQ@mail.gmail.com>

Hi everyone,

Maja, our technical writer for Season of Docs for SciPy, created a survey
specifically about how users use the SciPy docs and what they would like to
see improved. She sent out the link to scipy-dev before and it was shared
on Twitter. We've received some really valuable responses, and would love
to get more. If you're a SciPy user, this is a very easy way to help the
project!

https://docs.google.com/forms/d/e/1FAIpQLSeBAO0UFKDZyKpg2XzRslsLJVHU61ugjc18-2PVEabTQg2_6g/viewform?usp=sf_link

Apologies for the cross-post. We do a survey about once a decade and Maja
is putting in a lot of work creating and analyzing the survey, so it's
worth some more exposure and a couple of minutes of your time!

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191012/8ba2c1e3/attachment.html>

From garcia.marc at gmail.com  Sat Oct 12 16:17:54 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Sat, 12 Oct 2019 15:17:54 -0500
Subject: [Pandas-dev] Donation for GitHub Pandas-Dev
In-Reply-To: <SalesSequence.9000000961.27448.2187f3bc3.smtp@amsofttech.freshsales.io>
References: <SalesSequence.9000000961.27448.2187f3bc3.smtp@amsofttech.freshsales.io>
Message-ID: <CAEk5N5tmcpB6EG8O-J_jTZ1i6jLbVq2mqJFmY1Vz9gy1c2YPUQ@mail.gmail.com>

Hi Alex,

Thanks so much for the donation. Can you please contact info at numfocus.org
to see if there is a better way for the donation than our donations page? I
guess that's the case, since the donation page has a fee.

Cheers!

On Fri, Oct 11, 2019 at 7:05 AM Alex Bill <alexbill944 at gmail.com> wrote:

> Hello Pandas,
>
> We are a small business impressed by your open source initiative on
> https://github.com/pandas-dev.
>
> Our management support different open-source projects under a limited
> budget on a regular basis. You have made the final list.
>
> We are looking forward to supporting you for the year ahead-either through
> monthly or quarterly donations. Depending on what you prefer.
>
> Please let us know the payment mode that will work for you, so we may
> proceed.
>
> We are hoping you will accept our humble gesture.
>
> Alex Bill
> Blogger | Internet Marketing
> e: alexbill944 at gmail.com
> w: NamoBOT.com
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191012/8d59a04b/attachment.html>

From emailformattr at gmail.com  Tue Oct 15 02:21:12 2019
From: emailformattr at gmail.com (Matthew Roeschke)
Date: Mon, 14 Oct 2019 23:21:12 -0700
Subject: [Pandas-dev] [pandas-dev] Proposal to include Numba as a required
 dependency for rolling operations
Message-ID: <CACvdwiMCiV8=yT8oqyGrRRwARFKDZdFMVMtzhA6nxt4ABiwkvQ@mail.gmail.com>

Hello All,

I have been working on a proof of concept
<https://github.com/twosigma/pandas/blob/feature/generalized_window_operations/doc/source/development/rolling_operations_with_numba.rst>
that
implements rolling mean
<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.Rolling.mean.html>
and rolling apply
<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.Rolling.apply.html>
in Numba without changing the current pandas rolling API. Numba is an
attractive substitute for the current Cython implementation because it
offers maintainable, pure Python code and potential performance improvements.
Stemming from the proof of concept, I'd like to propose adding Numba as a
required dependency in pandas 1.0 and having rolling mean dispatch to Numba.
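
As a rough illustration of the kind of explicit loop that Numba is designed
to compile, a fixed-window rolling mean can be sketched in plain Python as
below. This is only a minimal sketch and not the proof-of-concept code: the
function name rolling_mean is hypothetical, it assumes there are no missing
values, and it only emits a value once a full window is available. It is
left undecorated so that it runs without Numba installed.

```python
import math


def rolling_mean(values, window):
    """Fixed-window rolling mean using a running sum (hypothetical helper).

    Positions that do not yet have a full window are filled with NaN.
    Assumes `values` contains no missing values.
    """
    if window < 1:
        raise ValueError("window must be at least 1")

    out = [math.nan] * len(values)
    running = 0.0
    for i, value in enumerate(values):
        # Add the newest observation to the running sum.
        running += value
        # Drop the observation that just left the window.
        if i >= window:
            running -= values[i - window]
        # Emit a mean only once the window is full.
        if i >= window - 1:
            out[i] = running / window
    return out


result = rolling_mean([1.0, 2.0, 3.0, 4.0, 5.0], 3)
print(result)
```

A variant of this loop that operates on NumPy arrays is the sort of function
one would typically decorate with numba.njit.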

Please see the associated GitHub issue
<https://github.com/pandas-dev/pandas/issues/28987> for further details and
discussion.

Thanks,
Matt
-- 
Matthew Roeschke <http://mroeschke.github.io>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191014/b5fe5139/attachment.html>

From m at maximilianroos.com  Tue Oct 15 13:29:01 2019
From: m at maximilianroos.com (Maximilian Roos)
Date: Tue, 15 Oct 2019 13:29:01 -0400
Subject: [Pandas-dev] Proposal to include Numba as a required dependency
 for rolling operations
In-Reply-To: <mailman.27.1571155202.7340.pandas-dev@python.org>
References: <mailman.27.1571155202.7340.pandas-dev@python.org>
Message-ID: <CAFeL3qFfMjaphzRP2gEZEaSbviZ_hw-pCkiTsjPbtXtu2XVWVg@mail.gmail.com>

Hi Matthew,

That looks very exciting.

Have you seen Numbagg? That already has some rolling algos implemented over
n-dimensions, and pandas could easily take those over 1-2 dimensions. It's
authored by Stephan Hoyer.

I added an exponential moving average so we could have an n-dimensional
version in xarray, and basically copied the pandas implementation into
numba-compatible code (and added attribution to pandas).
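
For readers unfamiliar with the algorithm, here is a minimal one-dimensional
sketch of an exponentially weighted moving average written as a plain loop.
It is not the Numbagg or pandas code: the name ewma is hypothetical, missing
values are not handled, and the weighting (1 - alpha) ** i over past
observations is assumed to match what pandas documents for ewm with
adjust=True.

```python
def ewma(values, alpha):
    """Exponentially weighted mean with weights (1 - alpha) ** i.

    Hypothetical helper: the i-th most recent observation gets weight
    (1 - alpha) ** i, and each output is the weighted sum divided by
    the sum of the weights seen so far.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    decay = 1.0 - alpha
    numer = 0.0  # running weighted sum of observations
    denom = 0.0  # running sum of weights
    out = []
    for x in values:
        numer = x + decay * numer
        denom = 1.0 + decay * denom
        out.append(numer / denom)
    return out


smoothed = ewma([1.0, 2.0, 3.0], alpha=0.5)
print(smoothed)
```

Keeping the numerator and denominator as two running scalars is what makes
the loop a single pass, which is the shape of code that translates easily
into a compiled kernel.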

Your implementation looks very interesting and flexible. To the extent
we can share implementations of the underlying algos, that would be great.

Max


On Tue, 15 Oct 2019 at 12:00, <pandas-dev-request at python.org> wrote:

>
> ---------- Forwarded message ----------
> From: Matthew Roeschke <emailformattr at gmail.com>
> To: pandas-dev <pandas-dev at python.org>
> Cc:
> Bcc:
> Date: Mon, 14 Oct 2019 23:21:12 -0700
> Subject: [Pandas-dev] [pandas-dev] Proposal to include Numba as a required
> dependency for rolling operations
> Hello All,
>
> I have been working on a proof of concept
> <https://github.com/twosigma/pandas/blob/feature/generalized_window_operations/doc/source/development/rolling_operations_with_numba.rst> that
> implements rolling mean
> <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.Rolling.mean.html>
> and rolling apply
> <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.window.Rolling.apply.html>
> in Numba without changing the current pandas rolling API. Numba is an
> attractive substitute for the current Cython implementation because it
> offers maintainable, pure Python code and potential performance improvements.
> Stemming from the proof of concept, I'd like to propose adding Numba as a
> required dependency in pandas 1.0 and having rolling mean dispatch to Numba.
>
> Please see the associated GitHub issue
> <https://github.com/pandas-dev/pandas/issues/28987> for further details
> and discussion.
>
> Thanks,
> Matt
> --
> Matthew Roeschke <http://mroeschke.github.io>
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191015/51e64b31/attachment.html>

From tom.augspurger88 at gmail.com  Sat Oct 19 10:15:56 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Sat, 19 Oct 2019 09:15:56 -0500
Subject: [Pandas-dev] ANN: Pandas 0.25.2 released!
Message-ID: <CAE1aY-=z45H1K95D0xjk6+icYjHcYtPWb61+Q9dFxJ=Tm-a+_Q@mail.gmail.com>

This is a minor bug-fix release in the 0.25.x series and includes some
regression fixes and bug fixes. We recommend that all users upgrade to this
version.

This is the first pandas release to support Python 3.8.

See the full whatsnew
<https://pandas.pydata.org/pandas-docs/version/0.25/whatsnew/v0.25.2.html>
for a list of all the changes.

The release can be installed with conda from the defaults and conda-forge
channels:

conda install pandas

Or via PyPI:

python3 -m pip install --upgrade pandas

Please report any issues with the release on the pandas issue tracker
<https://github.com/pandas-dev/pandas/issues>.
Note that there was an issue with building the PDF documentation. It will
be uploaded later.
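
To confirm that an environment actually picked up the new release, the
installed version string can be compared against 0.25.2. The following is a
minimal sketch, assuming plain X.Y.Z version strings without rc or dev
suffixes; the helper names parse_version and is_at_least are hypothetical.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_at_least(installed, required):
    """Return True if `installed` is the same as or newer than `required`."""
    return parse_version(installed) >= parse_version(required)


# Comparing tuples of ints avoids the pitfalls of comparing raw strings,
# where "0.25.10" would sort before "0.25.2".
print(is_at_least("0.25.1", "0.25.2"))
print(is_at_least("0.25.2", "0.25.2"))
```

In an environment with pandas installed, the first argument would be
pandas.__version__.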

- Tom
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191019/4ccff167/attachment.html>

From tom.augspurger88 at gmail.com  Sun Oct 20 13:08:30 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Sun, 20 Oct 2019 12:08:30 -0500
Subject: [Pandas-dev] [pydata] ANN: Pandas 0.25.2 released!
In-Reply-To: <CAF+o+SHJeikWQJLW+h_eRYfFd97fZb2JSF_g5M1_Fq+zu9t7gw@mail.gmail.com>
References: <CAF+o+SHJeikWQJLW+h_eRYfFd97fZb2JSF_g5M1_Fq+zu9t7gw@mail.gmail.com>
Message-ID: <9F26CC13-CB03-4F4F-8591-F830F260BFB0@gmail.com>

It'll be available in defaults soon.

> On Oct 20, 2019, at 09:40, Zoran Ljubišić <zljubisic at gmail.com> wrote:
> 
> conda install pandas
> sees only 0.25.1 version (miniconda).
> 
> Regards.
> 
>> On Sat, Oct 19, 2019 at 4:16 PM Tom Augspurger <tom.augspurger88 at gmail.com> wrote:
>> This is a minor bug-fix release in the 0.25.x series and includes some regression fixes
>> and bug fixes. We recommend that all users upgrade to this version.
>> 
>> This is the first pandas release to support Python 3.8
>> 
>> See the full whatsnew for a list of all the changes.
>> 
>> The release can be installed with conda from the defaults and conda-forge channels:
>> 
>> conda install pandas
>> Or via PyPI:
>> 
>> python3 -m pip install --upgrade pandas
>> Please report any issues with the release on the pandas issue tracker.
>> 
>> Note that there was an issue with building the PDF documentation. It will be uploaded later.
>> 
>> - Tom
>> -- 
>> You received this message because you are subscribed to the Google Groups "PyData" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to pydata+unsubscribe at googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/pydata/CAE1aY-%3Dz45H1K95D0xjk6%2BicYjHcYtPWb61%2BQ9dFxJ%3DTm-a%2B_Q%40mail.gmail.com.
> 
> -- 
> You received this message because you are subscribed to the Google Groups "PyData" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pydata+unsubscribe at googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/pydata/CAF%2Bo%2BSHJeikWQJLW%2Bh_eRYfFd97fZb2JSF_g5M1_Fq%2Bzu9t7gw%40mail.gmail.com.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191020/dd856780/attachment.html>

From zljubisic at gmail.com  Sun Oct 20 10:39:45 2019
From: zljubisic at gmail.com (Zoran Ljubišić)
Date: Sun, 20 Oct 2019 16:39:45 +0200
Subject: [Pandas-dev] [pydata] ANN: Pandas 0.25.2 released!
In-Reply-To: <CAE1aY-=z45H1K95D0xjk6+icYjHcYtPWb61+Q9dFxJ=Tm-a+_Q@mail.gmail.com>
References: <CAE1aY-=z45H1K95D0xjk6+icYjHcYtPWb61+Q9dFxJ=Tm-a+_Q@mail.gmail.com>
Message-ID: <CAF+o+SHJeikWQJLW+h_eRYfFd97fZb2JSF_g5M1_Fq+zu9t7gw@mail.gmail.com>

conda install pandas

sees only 0.25.1 version (miniconda).

Regards.

On Sat, Oct 19, 2019 at 4:16 PM Tom Augspurger <tom.augspurger88 at gmail.com>
wrote:

> This is a minor bug-fix release in the 0.25.x series and includes some
> regression fixes
> and bug fixes. We recommend that all users upgrade to this version.
>
> This is the first pandas release to support Python 3.8
>
> See the full whatsnew
> <https://pandas.pydata.org/pandas-docs/version/0.25/whatsnew/v0.25.2.html>
> for a list of all the changes.
>
> The release can be installed with conda from the defaults and conda-forge
> channels:
>
> conda install pandas
>
> Or via PyPI:
>
> python3 -m pip install --upgrade pandas
>
> Please report any issues with the release on the pandas issue tracker
> <https://github.com/pandas-dev/pandas/issues>.
> Note that there was an issue with building the PDF documentation. It will
> be uploaded later.
>
> - Tom
>
> --
> You received this message because you are subscribed to the Google Groups
> "PyData" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to pydata+unsubscribe at googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pydata/CAE1aY-%3Dz45H1K95D0xjk6%2BicYjHcYtPWb61%2BQ9dFxJ%3DTm-a%2B_Q%40mail.gmail.com
> <https://groups.google.com/d/msgid/pydata/CAE1aY-%3Dz45H1K95D0xjk6%2BicYjHcYtPWb61%2BQ9dFxJ%3DTm-a%2B_Q%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191020/05b1bdb9/attachment.html>

From jbrockmendel at gmail.com  Mon Oct 21 11:51:52 2019
From: jbrockmendel at gmail.com (Brock Mendel)
Date: Mon, 21 Oct 2019 08:51:52 -0700
Subject: [Pandas-dev] GH Issue Labels
Message-ID: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>

After doing some issue triage on older issues, I'd like to solicit
community opinions on what labels are useful.  In particular:

1) Are the "Difficulty Advanced/Intermediate", "Effort High/Low/Medium",
"Prio-high/low/medium" labels useful for anyone?  "Prio High" I understand,
but the others are mostly noise to me.
2) "Style" appears ambiguous.  Some people (me) think it refers to coding
style, but it also gets applied to issues involving the Styler
class.  Is there a canonical answer?
3) Any objections to removing labels with 0 open Issues/PRs?
4) Issues involving pd.eval don't have a clear home.  Suggestions?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191021/d6696b39/attachment.html>

From emailformattr at gmail.com  Mon Oct 21 12:06:32 2019
From: emailformattr at gmail.com (Matthew Roeschke)
Date: Mon, 21 Oct 2019 09:06:32 -0700
Subject: [Pandas-dev] GH Issue Labels
In-Reply-To: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>
References: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>
Message-ID: <CACvdwiPQQDPZZQ-cV_gbH59vS4DA1H6JmGSSBi_KtXYWGkmWbw@mail.gmail.com>

1) Agreed with your conclusion.
3) No objections where applicable. For example, I would still keep the
"OSX" and "Python 3.6" tags.
2 + 4) I was thinking it might be useful to have more specific tags for
pandas methods (like we do for "Apply"). Though it would increase the
number of tags, it would help us and issue writers triage related/duplicate
issues instead of using the less-than-perfect Github filter bar.

On Mon, Oct 21, 2019 at 8:52 AM Brock Mendel <jbrockmendel at gmail.com> wrote:

> After doing some issue triage on older issues, I'd like to solicit
> community opinions on what labels are useful.  In particular:
>
> 1) Are the "Difficulty Advanced/Intermediate", "Effort High/Low/Medium",
> "Prio-high/low/medium" labels useful for anyone?  "Prio High" I understand,
> but the others are mostly noise to me.
> 2) "Style" appears ambiguous.  Some people (me) think it refers to coding
> style, but it also gets applied to issues involving the Styler
> class.  Is there a canonical answer?
> 3) Any objections to removing labels with 0 open Issues/PRs?
> 4) Issues involving pd.eval don't have a clear home.  Suggestions?
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>


-- 
Matthew Roeschke <http://mroeschke.github.io>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191021/76ae3a09/attachment.html>

From william.ayd at icloud.com  Mon Oct 21 12:16:48 2019
From: william.ayd at icloud.com (William Ayd)
Date: Mon, 21 Oct 2019 09:16:48 -0700
Subject: [Pandas-dev] GH Issue Labels
In-Reply-To: <CACvdwiPQQDPZZQ-cV_gbH59vS4DA1H6JmGSSBi_KtXYWGkmWbw@mail.gmail.com>
References: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>
 <CACvdwiPQQDPZZQ-cV_gbH59vS4DA1H6JmGSSBi_KtXYWGkmWbw@mail.gmail.com>
Message-ID: <A96D484C-0701-42A5-875D-B0B5954D83B4@icloud.com>

1) I rarely find these useful. The only difficulty label I've found useful is the entirely separate "good first issue", so I would think we can certainly move low difficulty / effort tags
2) I would be OK with relabeling this as "Coding Style" and moving the unrelated ones to an IO label to disambiguate
3) No objections but maybe not a high priority either. The way we operate now the vast majority of our issues will never be read (maybe a bot would be useful here)
4) Maybe a "Tokenizer" tag for pd.eval? Could cover pd.query as well

> On Oct 21, 2019, at 9:06 AM, Matthew Roeschke <emailformattr at gmail.com> wrote:
> 
> 1) Agreed with your conclusion.
> 3) No objections where applicable. For example, I would still keep the "OSX" and "Python 3.6" tags.
> 2 + 4) I was thinking it might be useful to have more specific tags for pandas methods (like we do for "Apply"). Though it would increase the number of tags, it would help us and issue writers triage related/duplicate issues instead of using the less-than-perfect Github filter bar.
> 
> On Mon, Oct 21, 2019 at 8:52 AM Brock Mendel <jbrockmendel at gmail.com <mailto:jbrockmendel at gmail.com>> wrote:
> After doing some issue triage on older issues, I'd like to solicit community opinions on what labels are useful.  In particular:
> 
> 1) Are the "Difficulty Advanced/Intermediate", "Effort High/Low/Medium", "Prio-high/low/medium" labels useful for anyone?  "Prio High" I understand, but the others are mostly noise to me.
> 2) "Style" appears ambiguous.  Some people (me) think it refers to coding style, but it also gets applied to issues involving the Styler class.  Is there a canonical answer?
> 3) Any objections to removing labels with 0 open Issues/PRs?
> 4) Issues involving pd.eval don't have a clear home.  Suggestions?
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org <mailto:Pandas-dev at python.org>
> https://mail.python.org/mailman/listinfo/pandas-dev <https://mail.python.org/mailman/listinfo/pandas-dev>
> 
> 
> -- 
> Matthew Roeschke <http://mroeschke.github.io/>
> 
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev

William Ayd
william.ayd at icloud.com


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191021/bbe9903e/attachment.html>

From tom.augspurger88 at gmail.com  Mon Oct 21 13:09:16 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Mon, 21 Oct 2019 12:09:16 -0500
Subject: [Pandas-dev] GH Issue Labels
In-Reply-To: <A96D484C-0701-42A5-875D-B0B5954D83B4@icloud.com>
References: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>
 <CACvdwiPQQDPZZQ-cV_gbH59vS4DA1H6JmGSSBi_KtXYWGkmWbw@mail.gmail.com>
 <A96D484C-0701-42A5-875D-B0B5954D83B4@icloud.com>
Message-ID: <CAE1aY-m-ueqeSvhT6VPREP8Vg131MMVZmH09_wJMa76Rq4hQ_g@mail.gmail.com>

2. Style does indeed refer to coding style. Wouldn't be opposed to a Styler
label, but it's also covered by HTML formatting.
4. "Expressions" might also work, given
pandas/core/computation/expressions.py

On Mon, Oct 21, 2019 at 11:17 AM William Ayd via Pandas-dev <
pandas-dev at python.org> wrote:

> 1) I rarely find these useful. The only difficulty label I've found useful
> is the entirely separate "good first issue", so I would think we can
> certainly move low difficulty / effort tags
> 2) I would be OK with relabeling this as "Coding Style" and moving the
> unrelated ones to an IO label to disambiguate
> 3) No objections but maybe not a high priority either. The way we operate
> now the vast majority of our issues will never be read (maybe a bot would
> be useful here)
> 4) Maybe a "Tokenizer" tag for pd.eval? Could cover pd.query as well
>
> On Oct 21, 2019, at 9:06 AM, Matthew Roeschke <emailformattr at gmail.com>
> wrote:
>
> 1) Agreed with your conclusion.
> 3) No objections where applicable. For example, I would still keep the
> "OSX" and "Python 3.6" tags.
> 2 + 4) I was thinking it might be useful to have more specific tags for
> pandas methods (like we do for "Apply"). Though it would increase the
> number of tags, it would help us and issue writers triage related/duplicate
> issues instead of using the less-than-perfect Github filter bar.
>
> On Mon, Oct 21, 2019 at 8:52 AM Brock Mendel <jbrockmendel at gmail.com>
> wrote:
>
>> After doing some issue triage on older issues, I'd like to solicit
>> community opinions on what labels are useful.  In particular:
>>
>> 1) Are the "Difficulty Advanced/Intermediate", "Effort High/Low/Medium",
>> "Prio-high/low/medium" labels useful for anyone?  "Prio High" I understand,
>> but the others are mostly noise to me.
>> 2) "Style" appears ambiguous.  Some people (me) think it refers to coding
>> style, but it also gets applied to issues involving the Styler
>> class.  Is there a canonical answer?
>> 3) Any objections to removing labels with 0 open Issues/PRs?
>> 4) Issues involving pd.eval don't have a clear home.  Suggestions?
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>
>
>
> --
> Matthew Roeschke <http://mroeschke.github.io/>
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
>
> William Ayd
> william.ayd at icloud.com
>
>
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191021/9af39345/attachment-0001.html>

From garcia.marc at gmail.com  Mon Oct 21 13:44:48 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Mon, 21 Oct 2019 12:44:48 -0500
Subject: [Pandas-dev] GH Issue Labels
In-Reply-To: <CAE1aY-m-ueqeSvhT6VPREP8Vg131MMVZmH09_wJMa76Rq4hQ_g@mail.gmail.com>
References: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>
 <CACvdwiPQQDPZZQ-cV_gbH59vS4DA1H6JmGSSBi_KtXYWGkmWbw@mail.gmail.com>
 <A96D484C-0701-42A5-875D-B0B5954D83B4@icloud.com>
 <CAE1aY-m-ueqeSvhT6VPREP8Vg131MMVZmH09_wJMa76Rq4hQ_g@mail.gmail.com>
Message-ID: <CAEk5N5uu75wvkLP=U7SpeygM97r9uNt3+yozOfGp4MvRtq0qgg@mail.gmail.com>

I'm fine removing everything you mention in 1) and 3). I usually label with
the effort low for good first issues, but I think good first issue implies
that anyway, so happy to get rid of it too.

On Mon, 21 Oct 2019, 12:09 Tom Augspurger, <tom.augspurger88 at gmail.com>
wrote:

> 2. Style does indeed refer to coding style. Wouldn't be opposed to a
> Styler label, but it's also covered by HTML formatting.
> 4. "Expressions" might also work, given
> pandas/core/computation/expressions.py
>
> On Mon, Oct 21, 2019 at 11:17 AM William Ayd via Pandas-dev <
> pandas-dev at python.org> wrote:
>
>> 1) I rarely find these useful. The only difficulty label I've found
>> useful is the entirely separate "good first issue", so I would think we can
>> certainly move low difficulty / effort tags
>> 2) I would be OK with relabeling this as "Coding Style" and moving the
>> unrelated ones to an IO label to disambiguate
>> 3) No objections but maybe not a high priority either. The way we operate
>> now the vast majority of our issues will never be read (maybe a bot would
>> be useful here)
>> 4) Maybe a "Tokenizer" tag for pd.eval? Could cover pd.query as well
>>
>> On Oct 21, 2019, at 9:06 AM, Matthew Roeschke <emailformattr at gmail.com>
>> wrote:
>>
>> 1) Agreed with your conclusion.
>> 3) No objections where applicable. For example, I would still keep the
>> "OSX" and "Python 3.6" tags.
>> 2 + 4) I was thinking it might be useful to have more specific tags for
>> pandas methods (like we do for "Apply"). Though it would increase the
>> number of tags, it would help us and issue writers triage related/duplicate
>> issues instead of using the less-than-perfect Github filter bar.
>>
>> On Mon, Oct 21, 2019 at 8:52 AM Brock Mendel <jbrockmendel at gmail.com>
>> wrote:
>>
>>> After doing some issue triage on older issues, I'd like to solicit
>>> community opinions on what labels are useful.  In particular:
>>>
>>> 1) Are the "Difficulty Advanced/Intermediate", "Effort High/Low/Medium",
>>> "Prio-high/low/medium" labels useful for anyone?  "Prio High" I understand,
>>> but the others are mostly noise to me.
>>> 2) "Style" appears ambiguous.  Some people (me) think it refers to
>>> coding style, but it also gets applied to issues involving the
>>> Styler class.  Is there a canonical answer?
>>> 3) Any objections to removing labels with 0 open Issues/PRs?
>>> 4) Issues involving pd.eval don't have a clear home.  Suggestions?
>>> _______________________________________________
>>> Pandas-dev mailing list
>>> Pandas-dev at python.org
>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>
>>
>>
>> --
>> Matthew Roeschke <http://mroeschke.github.io/>
>>
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>
>>
>> William Ayd
>> william.ayd at icloud.com
>>
>>
>>
>> _______________________________________________
>> Pandas-dev mailing list
>> Pandas-dev at python.org
>> https://mail.python.org/mailman/listinfo/pandas-dev
>>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/pandas-dev/attachments/20191021/d15150d9/attachment.html>

From jeffreback at gmail.com  Mon Oct 21 13:55:51 2019
From: jeffreback at gmail.com (Jeff Reback)
Date: Mon, 21 Oct 2019 13:55:51 -0400
Subject: [Pandas-dev] GH Issue Labels
In-Reply-To: <CAEk5N5uu75wvkLP=U7SpeygM97r9uNt3+yozOfGp4MvRtq0qgg@mail.gmail.com>
References: <CAKf8g9RExi8xt1=S3Ebuse5x-9qGyrse9ZoaAysEppSaPDskUw@mail.gmail.com>
 <CACvdwiPQQDPZZQ-cV_gbH59vS4DA1H6JmGSSBi_KtXYWGkmWbw@mail.gmail.com>
 <A96D484C-0701-42A5-875D-B0B5954D83B4@icloud.com>
 <CAE1aY-m-ueqeSvhT6VPREP8Vg131MMVZmH09_wJMa76Rq4hQ_g@mail.gmail.com>
 <CAEk5N5uu75wvkLP=U7SpeygM97r9uNt3+yozOfGp4MvRtq0qgg@mail.gmail.com>
Message-ID: <66CE1456-F887-427B-BF2D-AC34BE2BB33F@gmail.com>

since i created the priority and difficulty ones :)

ok with removing them (except good first issue)

they were a good idea at the time!

> On Oct 21, 2019, at 1:44 PM, Marc Garcia <garcia.marc at gmail.com> wrote:
> 
> I'm fine removing everything you mention in 1) and 3). I usually label with the effort low for good first issues, but I think good first issue implies that anyway, so happy to get rid of it too.
> 
>> On Mon, 21 Oct 2019, 12:09 Tom Augspurger, <tom.augspurger88 at gmail.com> wrote:
>> 2. Style does indeed refer to coding style. Wouldn't be opposed to a Styler label, but it's also covered by HTML formatting.
>> 4. "Expressions" might also work, given pandas/core/computation/expressions.py
>> 
>>> On Mon, Oct 21, 2019 at 11:17 AM William Ayd via Pandas-dev <pandas-dev at python.org> wrote:
>>> 1) I rarely find these useful. The only difficulty label I've found useful is the entirely separate "good first issue", so I would think we can certainly move low difficulty / effort tags
>>> 2) I would be OK with relabeling this as "Coding Style" and moving the unrelated ones to an IO label to disambiguate
>>> 3) No objections but maybe not a high priority either. The way we operate now the vast majority of our issues will never be read (maybe a bot would be useful here)
>>> 4) Maybe a "Tokenizer" tag for pd.eval? Could cover pd.query as well
>>> 
>>>> On Oct 21, 2019, at 9:06 AM, Matthew Roeschke <emailformattr at gmail.com> wrote:
>>>> 
>>>> 1) Agreed with your conclusion.
>>>> 3) No objections where applicable. For example, I would still keep the "OSX" and "Python 3.6" tags.
>>>> 2 + 4) I was thinking it might be useful to have more specific tags for pandas methods (like we do for "Apply"). Though it would increase the number of tags, it would help us and issue writers triage related/duplicate issues instead of using the less-than-perfect Github filter bar.
>>>> 
>>>>> On Mon, Oct 21, 2019 at 8:52 AM Brock Mendel <jbrockmendel at gmail.com> wrote:
>>>>> After doing some issue triage on older issues, I'd like to solicit community opinions on what labels are useful.  In particular:
>>>>> 
>>>>> 1) Are the "Difficulty Advanced/Intermediate", "Effort High/Low/Medium", "Prio-high/low/medium" labels useful for anyone?  "Prio High" I understand, but the others are mostly noise to me.
>>>>> 2) "Style" appears ambiguous.  Some people (me) think it refers to coding style, but it also gets applied to issues involving the Styler class.  Is there a canonical answer?
>>>>> 3) Any objections to removing labels with 0 open Issues/PRs?
>>>>> 4) Issues involving pd.eval don't have a clear home.  Suggestions?
>>>> 
>>>> 
>>>> -- 
>>>> Matthew Roeschke
>>>> 
>>> 
>>> William Ayd
>>> william.ayd at icloud.com
>>> 
>>> 
>>> 

From garcia.marc at gmail.com  Tue Oct 29 12:23:49 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Tue, 29 Oct 2019 11:23:49 -0500
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
Message-ID: <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>

Andy, could you experiment on having multiple projects in a single
discourse? I saw the PyData one was activated some time ago.

If it doesn't look feasible, as I suspect, let me know so I can move forward
with discussing what to have in the pandas one.

Cheers!

On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com> wrote:

> Discourse has private categories, we already have a private "Maintainers"
> one, that only admins can see and use. And there are other permissions
> levels that can be used. For example, we can have a private category for
> the members of the code of conduct committee... I just need to check if we
> can associate email addresses to those groups, so when someone emails to
> coc at pandas.io the messages are posted in that private group. But if we
> can set up that as we need, I think we should be able to replace all those
> and centralize everything in Discourse.
>
> I'm skeptical on being able to set up a global Discourse for all the
> ecosystem, where things are easy to find, based on how Discourse works and
> the tests I did. I'd move forward with our own for now if nobody is able to
> set that up.
>
> Andy, I got the pandas account approved in minutes. I see that we can have
> a custom domain, so you can use the pandas and see if we can manage to have
> multiple projects in a way we like, and if we do we just change the domain
> to discuss.pydata.org (or whatever). You're already an admin, feel free
> to experiment and change the set up as you need.
>
> Maarten, not sure I understand your point. Not a fan of Discourse so far,
> but I think having the user and the devs discussions in a single place
> makes it easier to find the information, and I think Discourse interface
> also makes it easier to find compared to mailman, or google groups.
> Regardless of gitter (there are no important discussions or decision making
> there I think), would you prefer to stay with mailman and google groups
> over Discourse? Or what you think would be the ideal or best option?
>
> Thanks!
>
> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
> jorisvandenbossche at gmail.com> wrote:
>
>> What do other people think about starting to use discourse for pandas?
>> (and about sharing it with other projects or having our own?)
>>
>> --
>>
>> On the existing lists: I don't think discourse would replace the core
>> devs list (that is intentionally private). And IMO also not gitter
>> (discourse is not a real-time chat).
>>
>> Joris
>>
>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com> wrote:
>>
>>> For what I've seen I'd say that Discourse can be configured to interact
>>> with a category like a distribution list (subscribe and have an email
>>> address to send messages there). Not sure, but for the settings I've seen
>>> should be possible.
>>>
>>> Personally I think it should replace all the existing lists:
>>> - pydata google group
>>> - pandas-dev (this)
>>> - core devs list
>>>
>>> I'm also ok to get rid of gitter once we move to discourse (also ok to
>>> keep it if people find it useful, but I rarely use it).
>>>
>>> I created an issue for this discussion some time ago:
>>> https://github.com/pandas-dev/pandas/issues/27903
>>>
>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>> tom.augspurger88 at gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org> wrote:
>>>>
>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the other
>>>>> one.
>>>>>
>>>>> For some discussion from numpy you can see here
>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>
>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy, Scipy
>>>>> have all told me "I don't want to run it ourselves but be part of a larger
>>>>> one"
>>>>>
>>>>> I bet we can figure out how to organize it.
>>>>>
>>>>> I just put in an application to get pydata.discourse.org.
>>>>>
>>>>> -- Andy
>>>>>
>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>
>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>> discussion from the website hosting I think, regardless whether OVH might
>>>>>> also host discourse)
>>>>>>
>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>> indeed, that sounds as it needs some kind of hierarchical category /
>>>>>> tagging.
>>>>>>
>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>> discourse, but *if* we do that, we should think about how that fits
>>>>>> with / replaces / adds to /... some of the other communication channels
>>>>>> (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>
>>>>>
>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>> Possibly gitter as well.
>>>>
>>>>
>>>>> Joris
>>>>>>
>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I'm fine with that conceptually, but I think Discourse will make
>>>>>>> things quite tricky to find things then.
>>>>>>>
>>>>>>> We already got our discourse approved, if you want to join it and
>>>>>>> experiment with the settings. But it's the first thing I tried, and after
>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>> clear separation between pandas/users pandas/contributors discussions.
>>>>>>>
>>>>>>> May be I just couldn't find the settings, let me know if you manage
>>>>>>> to get a multi-project set up that makes sense.
>>>>>>>
>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>
>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and other
>>>>>>>> PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>
>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <garcia.marc at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I don't know much about discourse, but why do we want to self-host
>>>>>>>>> it? Seems like Discourse does it for free for open source projects:
>>>>>>>>> https://free.discourse.group/ And I don't think we want another
>>>>>>>>> system to maintain. Am I missing something?
>>>>>>>>>
>>>>>>>>> I applied for https://pandas.discourse.group, so we can give it a
>>>>>>>>> try. We should have it approved and working in couple of days.
>>>>>>>>>
>>>>>>>>> For what I saw, Discourse has one level of categories, so I guess
>>>>>>>>> we want one per project, so we can have categories for "Users",
>>>>>>>>> "Contributors", "Ecosystem"... or something similar. I guess if we have a
>>>>>>>>> single Discourse for NumFOCUS, every project will be a category, and it'll
>>>>>>>>> be difficult to group conversations.
>>>>>>>>>
>>>>>>>>> If anyone already has experience with Discourse and disagrees with
>>>>>>>>> my guesses, please let me know.
>>>>>>>>>
>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <andy at numfocus.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>
>>>>>>>>>> NumPy wants me to help host a discourse for them, maybe OVH would
>>>>>>>>>> be a good place to do that as well, (although I would be more inclined if
>>>>>>>>>> it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>
>>>>>>>>>> -- Andy
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sounds good w.r.t crediting OVH on those pages.
>>>>>>>>>>>
>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I now
>>>>>>>>>>> notice is currently broken for pandas), the only thing on the webserver is a
>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>
>>>>>>>>>>> Tom
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> An update on the new website infrastructure. We need to finish
>>>>>>>>>>>> discussing the details, but OVH is happy to provide the hosting for the
>>>>>>>>>>>> pandas infrastructure we need.
>>>>>>>>>>>>
>>>>>>>>>>>> My initial idea is to credit them in the page with the rest of
>>>>>>>>>>>> the sponsors in the new website:
>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>
>>>>>>>>>>>> What I'd like to ask is:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. For the production website and docs (static content only,
>>>>>>>>>>>> for the traffic we need):
>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks, builds, CI
>>>>>>>>>>>> stuff (temporary publish the docs for every PR,...):
>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch tutorials
>>>>>>>>>>>> on Binder...): https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>
>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the set up
>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>> it'll be easy to estimate how big is the cluster we need beforehand, but I
>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>> grow.
>>>>>>>>>>>>
>>>>>>>>>>>> OVH gave us a 200 euros voucher to experiment with the
>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Somehow related to the work on the new website (
>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've been
>>>>>>>>>>>>> discussing with the Binder team, and looks like should be quite easy soon
>>>>>>>>>>>>> (with a Sphinx extension) to make all the documentation pages runnable with
>>>>>>>>>>>>> Binder, directly from the website (without opening the page as a Jupyter in
>>>>>>>>>>>>> mybinder).
>>>>>>>>>>>>>
>>>>>>>>>>>>> While they are very happy with the idea of having this in
>>>>>>>>>>>>> pandas, it's uncertain if the current infrastructure Binder has got, is
>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider) on
>>>>>>>>>>>>> whether they'd be happy to provide a dedicated BinderHub specific to pandas
>>>>>>>>>>>>> (or may be we can have one for all NumFOCUS projects). We'll see how it
>>>>>>>>>>>>> goes, but wanted to let you know, so you're updated, and in case anyone is
>>>>>>>>>>>>> interested in participating in the discussions. Of course before any
>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server for
>>>>>>>>>>>>> the website, and one for development stuff. Specifically for the dev docs
>>>>>>>>>>>>> (including rendered docs of every PR) and the GitHub app that will generate
>>>>>>>>>>>>> them. I guess it should be very easy to find a sponsor for these two
>>>>>>>>>>>>> servers (in exchange of a small note in the footer of the website, or
>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know if you have any comment, want to be involved or
>>>>>>>>>>>>> whatever.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Andy R. Terrel, PhD
>>>>>>>>>> President
>>>>>>>>>> NumFOCUS
>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>
>>>>>>>
>>>>>> --
>>>>> Andy R. Terrel, PhD
>>>>> President
>>>>> NumFOCUS
>>>>> andy at numfocus.org
>>>>>

From andy at numfocus.org  Tue Oct 29 12:51:13 2019
From: andy at numfocus.org (Andy Ray Terrel)
Date: Tue, 29 Oct 2019 09:51:13 -0700
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
 <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
Message-ID: <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>

Sorry I've been traveling.

I have https://pydata.discourse.group set up.
I can send out invites.

I guess as you have pointed out, we can set up categories for each project,
e.g. dask-users, pandas-users, pandas-dev, but maybe not exactly what you
want.

Happy to invite anyone to the discourse instance before we open it up to
the wild

-- Andy

On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia <garcia.marc at gmail.com> wrote:

> Andy, could you experiment on having multiple projects in a single
> discourse? I saw the PyData one was activated some time ago.
>
> If it doesn't look feasible, as I suspect, let me know so I can move forward
> with discussing what to have in the pandas one.
>
> Cheers!
>
> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com> wrote:
>
>> Discourse has private categories, we already have a private "Maintainers"
>> one, that only admins can see and use. And there are other permissions
>> levels that can be used. For example, we can have a private category for
>> the members of the code of conduct committee... I just need to check if we
>> can associate email addresses to those groups, so when someone emails to
>> coc at pandas.io the messages are posted in that private group. But if we
>> can set up that as we need, I think we should be able to replace all those
>> and centralize everything in Discourse.
>>
>> I'm skeptical on being able to set up a global Discourse for all the
>> ecosystem, where things are easy to find, based on how Discourse works and
>> the tests I did. I'd move forward with our own for now if nobody is able to
>> set that up.
>>
>> Andy, I got the pandas account approved in minutes. I see that we can
>> have a custom domain, so you can use the pandas and see if we can manage to
>> have multiple projects in a way we like, and if we do we just change the
>> domain to discuss.pydata.org (or whatever). You're already an admin,
>> feel free to experiment and change the set up as you need.
>>
>> Maarten, not sure I understand your point. Not a fan of Discourse so far,
>> but I think having the user and the devs discussions in a single place
>> makes it easier to find the information, and I think Discourse interface
>> also makes it easier to find compared to mailman, or google groups.
>> Regardless of gitter (there are no important discussions or decision making
>> there I think), would you prefer to stay with mailman and google groups
>> over Discourse? Or what you think would be the ideal or best option?
>>
>> Thanks!
>>
>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>> jorisvandenbossche at gmail.com> wrote:
>>
>>> What do other people think about starting to use discourse for pandas?
>>> (and about sharing it with other projects or having our own?)
>>>
>>> --
>>>
>>> On the existing lists: I don't think discourse would replace the core
>>> devs list (that is intentionally private). And IMO also not gitter
>>> (discourse is not a real-time chat).
>>>
>>> Joris
>>>
>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com> wrote:
>>>
>>>> For what I've seen I'd say that Discourse can be configured to interact
>>>> with a category like a distribution list (subscribe and have an email
>>>> address to send messages there). Not sure, but for the settings I've seen
>>>> should be possible.
>>>>
>>>> Personally I think it should replace all the existing lists:
>>>> - pydata google group
>>>> - pandas-dev (this)
>>>> - core devs list
>>>>
>>>> I'm also ok to get rid of gitter once we move to discourse (also ok to
>>>> keep it if people find it useful, but I rarely use it).
>>>>
>>>> I created an issue for this discussion some time ago:
>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>
>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>> tom.augspurger88 at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org> wrote:
>>>>>
>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the other
>>>>>> one.
>>>>>>
>>>>>> For some discussion from numpy you can see here
>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>
>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy, Scipy
>>>>>> have all told me "I don't want to run it ourselves but be part of a larger
>>>>>> one"
>>>>>>
>>>>>> I bet we can figure out how to organize it.
>>>>>>
>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>
>>>>>> -- Andy
>>>>>>
>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>
>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>> discussion from the website hosting I think, regardless whether OVH might
>>>>>>> also host discourse)
>>>>>>>
>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>> indeed, that sounds as it needs some kind of hierarchical category /
>>>>>>> tagging.
>>>>>>>
>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>> discourse, but *if* we do that, we should think about how that fits
>>>>>>> with / replaces / adds to /... some of the other communication channels
>>>>>>> (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>
>>>>>>
>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>>> Possibly gitter as well.
>>>>>
>>>>>
>>>>>> Joris
>>>>>>>
>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm fine with that conceptually, but I think Discourse will make
>>>>>>>> things quite tricky to find things then.
>>>>>>>>
>>>>>>>> We already got our discourse approved, if you want to join it and
>>>>>>>> experiment with the settings. But it's the first thing I tried, and after
>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>> clear separation between pandas/users pandas/contributors discussions.
>>>>>>>>
>>>>>>>> May be I just couldn't find the settings, let me know if you manage
>>>>>>>> to get a multi-project set up that makes sense.
>>>>>>>>
>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and other
>>>>>>>>> PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>
>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <garcia.marc at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>> projects: https://free.discourse.group/ And I don't think we
>>>>>>>>>> want another system to maintain. Am I missing something?
>>>>>>>>>>
>>>>>>>>>> I applied for https://pandas.discourse.group, so we can give it
>>>>>>>>>> a try. We should have it approved and working in couple of days.
>>>>>>>>>>
>>>>>>>>>> For what I saw, Discourse has one level of categories, so I guess
>>>>>>>>>> we want one per project, so we can have categories for "Users",
>>>>>>>>>> "Contributors", "Ecosystem"... or something similar. I guess if we have a
>>>>>>>>>> single Discourse for NumFOCUS, every project will be a category, and it'll
>>>>>>>>>> be difficult to group conversations.
>>>>>>>>>>
>>>>>>>>>> If anyone already has experience with Discourse and disagrees
>>>>>>>>>> with my guesses, please let me know.
>>>>>>>>>>
>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <andy at numfocus.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>
>>>>>>>>>>> NumPy wants me to help host a discourse for them, maybe OVH
>>>>>>>>>>> would be a good place to do that as well, (although I would be more
>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>
>>>>>>>>>>> -- Andy
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sounds good w.r.t crediting OVH on those pages.
>>>>>>>>>>>>
>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I now
>>>>>>>>>>>> notice is currently broken for pandas), the only thing on the webserver is a
>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>
>>>>>>>>>>>> Tom
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> An update on the new website infrastructure. We need to finish
>>>>>>>>>>>>> discussing the details, but OVH is happy to provide the hosting for the
>>>>>>>>>>>>> pandas infrastructure we need.
>>>>>>>>>>>>>
>>>>>>>>>>>>> My initial idea is to credit them in the page with the rest of
>>>>>>>>>>>>> the sponsors in the new website:
>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I'd like to ask is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. For the production website and docs (static content only,
>>>>>>>>>>>>> for the traffic we need):
>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks, builds,
>>>>>>>>>>>>> CI stuff (temporary publish the docs for every PR,...):
>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch tutorials
>>>>>>>>>>>>> on Binder...): https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the set up
>>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>> it'll be easy to estimate how big is the cluster we need beforehand, but I
>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>
>>>>>>>>>>>>> OVH gave us a 200 euros voucher to experiment with the
>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Somehow related to the work on the new website (
>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've been
>>>>>>>>>>>>>> discussing with the Binder team, and looks like should be quite easy soon
>>>>>>>>>>>>>> (with a Sphinx extension) to make all the documentation pages runnable with
>>>>>>>>>>>>>> Binder, directly from the website (without opening the page as a Jupyter in
>>>>>>>>>>>>>> mybinder).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While they are very happy with the idea of having this in
>>>>>>>>>>>>>> pandas, it's uncertain if the current infrastructure Binder has got, is
>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider) on
>>>>>>>>>>>>>> whether they'd be happy to provide a dedicated BinderHub specific to pandas
>>>>>>>>>>>>>> (or may be we can have one for all NumFOCUS projects). We'll see how it
>>>>>>>>>>>>>> goes, but wanted to let you know, so you're updated, and in case anyone is
>>>>>>>>>>>>>> interested in participating in the discussions. Of course before any
>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server for
>>>>>>>>>>>>>> the website, and one for development stuff. Specifically for the dev docs
>>>>>>>>>>>>>> (including rendered docs of every PR) and the GitHub app that will generate
>>>>>>>>>>>>>> them. I guess it should be very easy to find a sponsor for these two
>>>>>>>>>>>>>> servers (in exchange of a small note in the footer of the website, or
>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved or
>>>>>>>>>>>>>> whatever.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Andy R. Terrel, PhD
>>>>>>>>>>> President
>>>>>>>>>>> NumFOCUS
>>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>>


-- 
Andy R. Terrel, PhD
President, NumFOCUS
andy at numfocus.org

From garcia.marc at gmail.com  Tue Oct 29 13:06:55 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Tue, 29 Oct 2019 12:06:55 -0500
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
 <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
 <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>
Message-ID: <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>

I personally don't see the value of having a common discourse for all the
projects, where the top level is a list of possibly 100 items and pandas
has a few groups lost in there, with no more structure than that, as opposed
to having a discourse per project.

Single login is the only advantage I can see, and from what I've seen this
can also be achieved with separate groups.

Tom, Joris, I think you were the ones who preferred having a common
discourse. Does it still sound like the best option, given the limitations?

On Tue, Oct 29, 2019 at 11:51 AM Andy Ray Terrel <andy at numfocus.org> wrote:

> Sorry I've been traveling.
>
> I have https://pydata.discourse.group set
> up. I can send out invites.
>
> I guess, as you have pointed out, we can set up categories for each
> project, e.g. dask-users, pandas-users, pandas-dev, but maybe that's not
> exactly what you want.
>
> Happy to invite anyone to the discourse instance before we open it up to
> the wild.
>
> -- Andy
>
> On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia <garcia.marc at gmail.com> wrote:
>
>> Andy, could you experiment with having multiple projects in a single
>> discourse? I saw the PyData one was activated some time ago.
>>
>> If it doesn't look feasible, as I suspect, let me know so I can move forward
>> with discussing what to have in the pandas one.
>>
>> Cheers!
>>
>> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com>
>> wrote:
>>
>>> Discourse has private categories; we already have a private
>>> "Maintainers" one that only admins can see and use. And there are other
>>> permission levels that can be used. For example, we can have a private
>>> category for the members of the code of conduct committee... I just need
>>> to check if we can associate email addresses with those groups, so that when
>>> someone emails coc at pandas.io the messages are posted in that private
>>> group. But if we can set that up as we need, I think we should be able to
>>> replace all those and centralize everything in Discourse.
>>>
>>> I'm skeptical about being able to set up a global Discourse for the whole
>>> ecosystem where things are easy to find, based on how Discourse works and
>>> the tests I did. I'd move forward with our own for now if nobody is able to
>>> set that up.
>>>
>>> Andy, I got the pandas account approved in minutes. I see that we can
>>> have a custom domain, so you can use the pandas one and see if we can manage
>>> to have multiple projects in a way we like, and if we do we just change the
>>> domain to discuss.pydata.org (or whatever). You're already an admin,
>>> feel free to experiment and change the setup as you need.
>>>
>>> Maarten, not sure I understand your point. I'm not a fan of Discourse so
>>> far, but I think having the user and dev discussions in a single place
>>> makes it easier to find information, and I think the Discourse interface
>>> also makes things easier to find compared to mailman or google groups.
>>> Leaving gitter aside (there are no important discussions or decision making
>>> there I think), would you prefer to stay with mailman and google groups
>>> over Discourse? Or what do you think would be the ideal or best option?
>>>
>>> Thanks!
>>>
>>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>>> jorisvandenbossche at gmail.com> wrote:
>>>
>>>> What do other people think about starting to use discourse for pandas?
>>>> (and about sharing it with other projects or having our own?)
>>>>
>>>> --
>>>>
>>>> On the existing lists: I don't think discourse would replace the core
>>>> devs list (that is intentionally private). And IMO also not gitter
>>>> (discourse is not a real-time chat).
>>>>
>>>> Joris
>>>>
>>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com>
>>>> wrote:
>>>>
>>>>> From what I've seen I'd say that Discourse can be configured to
>>>>> interact with a category like a distribution list (subscribe and have an
>>>>> email address to send messages there). Not sure, but from the settings I've
>>>>> seen it should be possible.
>>>>>
>>>>> Personally I think it should replace all the existing lists:
>>>>> - pydata google group
>>>>> - pandas-dev (this)
>>>>> - core devs list
>>>>>
>>>>> I'm also ok to get rid of gitter once we move to discourse (also ok to
>>>>> keep it if people find it useful, but I rarely use it).
>>>>>
>>>>> I created an issue for this discussion some time ago:
>>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>>
>>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the other
>>>>>>> one.
>>>>>>>
>>>>>>> For some discussion from numpy you can see here
>>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>>
>>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy,
>>>>>>> Scipy have all told me "I don't want to run it ourselves but be part of a
>>>>>>> larger one"
>>>>>>>
>>>>>>> I bet we can figure out how to organize it.
>>>>>>>
>>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>>
>>>>>>> -- Andy
>>>>>>>
>>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>
>>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>>> discussion from the website hosting I think, regardless of whether OVH
>>>>>>>> might also host discourse)
>>>>>>>>
>>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>>> indeed, that sounds like it needs some kind of hierarchical category /
>>>>>>>> tagging.
>>>>>>>>
>>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>>> discourse, but *if* we do that, we should think about how that
>>>>>>>> fits with / replaces / adds to /... some of the other communication
>>>>>>>> channels (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>>
>>>>>>>
>>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>>>> Possibly gitter as well.
>>>>>>
>>>>>>
>>>>>>> Joris
>>>>>>>>
>>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'm fine with that conceptually, but I think Discourse will make
>>>>>>>>> things quite tricky to find then.
>>>>>>>>>
>>>>>>>>> We already got our discourse approved, if you want to join it and
>>>>>>>>> experiment with the settings. But it's the first thing I tried, and after
>>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>>> clear separation between pandas/users and pandas/contributors discussions.
>>>>>>>>>
>>>>>>>>> Maybe I just couldn't find the settings; let me know if you
>>>>>>>>> manage to get a multi-project setup that makes sense.
>>>>>>>>>
>>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and other
>>>>>>>>>> PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <
>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>>> projects: https://free.discourse.group/ And I don't think we
>>>>>>>>>>> want another system to maintain. Am I missing something?
>>>>>>>>>>>
>>>>>>>>>>> I applied for https://pandas.discourse.group, so we can give it
>>>>>>>>>>> a try. We should have it approved and working in a couple of days.
>>>>>>>>>>>
>>>>>>>>>>> From what I saw, Discourse has one level of categories, so I
>>>>>>>>>>> guess we want one Discourse per project, so we can have categories for
>>>>>>>>>>> "Users", "Contributors", "Ecosystem"... or something similar. I guess if
>>>>>>>>>>> we have a single Discourse for NumFOCUS, every project will be a category,
>>>>>>>>>>> and it'll be difficult to group conversations.
>>>>>>>>>>>
>>>>>>>>>>> If anyone already has experience with Discourse and disagrees
>>>>>>>>>>> with my guesses, please let me know.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <andy at numfocus.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>>
>>>>>>>>>>>> NumPy wants me to help host a discourse for them; maybe OVH
>>>>>>>>>>>> would be a good place to do that as well (although I would be more
>>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>>
>>>>>>>>>>>> -- Andy
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds good w.r.t. crediting OVH on those pages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I now
>>>>>>>>>>>>> notice is currently broken for pandas), the only thing on the webserver is a
>>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> An update on the new website infrastructure. We need to
>>>>>>>>>>>>>> finish discussing the details, but OVH is happy to provide the hosting for
>>>>>>>>>>>>>> the pandas infrastructure we need.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My initial idea is to credit them on the page with the rest
>>>>>>>>>>>>>> of the sponsors on the new website:
>>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What I'd like to ask for is:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. For the production website and docs (static content only,
>>>>>>>>>>>>>> for the traffic we need):
>>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks, builds,
>>>>>>>>>>>>>> CI stuff (temporarily publishing the docs for every PR,...):
>>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch tutorials
>>>>>>>>>>>>>> on Binder...): https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the setup
>>>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>>> it'll be easy to estimate beforehand how big a cluster we need, but I
>>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> OVH gave us a 200 euro voucher to experiment with the
>>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Somewhat related to the work on the new website (
>>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've been
>>>>>>>>>>>>>>> discussing with the Binder team, and it looks like it should soon be quite
>>>>>>>>>>>>>>> easy (with a Sphinx extension) to make all the documentation pages runnable
>>>>>>>>>>>>>>> with Binder, directly from the website (without opening the page as a
>>>>>>>>>>>>>>> Jupyter notebook in mybinder).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> While they are very happy with the idea of having this in
>>>>>>>>>>>>>>> pandas, it's uncertain whether Binder's current infrastructure is
>>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider)
>>>>>>>>>>>>>>> whether they'd be happy to provide a dedicated BinderHub specific to pandas
>>>>>>>>>>>>>>> (or maybe we can have one for all NumFOCUS projects). We'll see how it
>>>>>>>>>>>>>>> goes, but I wanted to let you know, so you're updated, and in case anyone is
>>>>>>>>>>>>>>> interested in participating in the discussions. Of course, before any
>>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server
>>>>>>>>>>>>>>> for the website, and one for development stuff, specifically for the dev
>>>>>>>>>>>>>>> docs (including rendered docs of every PR) and the GitHub app that will
>>>>>>>>>>>>>>> generate them. I guess it should be very easy to find a sponsor for these
>>>>>>>>>>>>>>> two servers (in exchange for a small note in the footer of the website, or
>>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved or
>>>>>>>>>>>>>>> whatever.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Pandas-dev mailing list
>>>>>>>>>>>>>> Pandas-dev at python.org
>>>>>>>>>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Andy R. Terrel, PhD
>>>>>>>>>>>> President
>>>>>>>>>>>> NumFOCUS
>>>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>>>
>
>
> --
> Andy R. Terrel, PhD
> President, NumFOCUS
> andy at numfocus.org
>

From andy at numfocus.org  Tue Oct 29 13:08:35 2019
From: andy at numfocus.org (Andy Ray Terrel)
Date: Tue, 29 Oct 2019 10:08:35 -0700
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
 <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
 <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>
 <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>
Message-ID: <CA+WonSRDvkF+VYiLK+ag1iqP+X0504p9vnBFtdbA0GexjCurvQ@mail.gmail.com>

I think the value many see is for cross-project issues, but maybe those
are few and far between.

On Tue, Oct 29, 2019 at 10:07 AM Marc Garcia <garcia.marc at gmail.com> wrote:

> I personally don't see the value of having a common discourse for all the
> projects, where the top level is a list of possibly 100 items and pandas
> has a few groups lost in there, with no more structure than that, as opposed
> to having a discourse per project.
>
> Single login is the only advantage I can see, and from what I've seen this
> can also be achieved with separate groups.
>
> Tom, Joris, I think you were the ones who preferred having a common
> discourse. Does it still sound like the best option, given the limitations?
>
> On Tue, Oct 29, 2019 at 11:51 AM Andy Ray Terrel <andy at numfocus.org>
> wrote:
>
>> Sorry I've been traveling.
>>
>> I have https://pydata.discourse.group set
>> up. I can send out invites.
>>
>> I guess, as you have pointed out, we can set up categories for each
>> project, e.g. dask-users, pandas-users, pandas-dev, but maybe that's not
>> exactly what you want.
>>
>> Happy to invite anyone to the discourse instance before we open it up to
>> the wild.
>>
>> -- Andy
>>
>> On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia <garcia.marc at gmail.com>
>> wrote:
>>
>>> Andy, could you experiment with having multiple projects in a single
>>> discourse? I saw the PyData one was activated some time ago.
>>>
>>> If it doesn't look feasible, as I suspect, let me know so I can move forward
>>> with discussing what to have in the pandas one.
>>>
>>> Cheers!
>>>
>>> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com>
>>> wrote:
>>>
>>>> Discourse has private categories; we already have a private
>>>> "Maintainers" one that only admins can see and use. And there are other
>>>> permission levels that can be used. For example, we can have a private
>>>> category for the members of the code of conduct committee... I just need
>>>> to check if we can associate email addresses with those groups, so that when
>>>> someone emails coc at pandas.io the messages are posted in that
>>>> private group. But if we can set that up as we need, I think we should be
>>>> able to replace all those and centralize everything in Discourse.
>>>>
>>>> I'm skeptical about being able to set up a global Discourse for the whole
>>>> ecosystem where things are easy to find, based on how Discourse works and
>>>> the tests I did. I'd move forward with our own for now if nobody is able to
>>>> set that up.
>>>>
>>>> Andy, I got the pandas account approved in minutes. I see that we can
>>>> have a custom domain, so you can use the pandas one and see if we can manage
>>>> to have multiple projects in a way we like, and if we do we just change the
>>>> domain to discuss.pydata.org (or whatever). You're already an admin,
>>>> feel free to experiment and change the setup as you need.
>>>>
>>>> Maarten, not sure I understand your point. I'm not a fan of Discourse so
>>>> far, but I think having the user and dev discussions in a single place
>>>> makes it easier to find information, and I think the Discourse interface
>>>> also makes things easier to find compared to mailman or google groups.
>>>> Leaving gitter aside (there are no important discussions or decision making
>>>> there I think), would you prefer to stay with mailman and google groups
>>>> over Discourse? Or what do you think would be the ideal or best option?
>>>>
>>>> Thanks!
>>>>
>>>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>>>> jorisvandenbossche at gmail.com> wrote:
>>>>
>>>>> What do other people think about starting to use discourse for pandas?
>>>>> (and about sharing it with other projects or having our own?)
>>>>>
>>>>> --
>>>>>
>>>>> On the existing lists: I don't think discourse would replace the core
>>>>> devs list (that is intentionally private). And IMO also not gitter
>>>>> (discourse is not a real-time chat).
>>>>>
>>>>> Joris
>>>>>
>>>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> From what I've seen I'd say that Discourse can be configured to
>>>>>> interact with a category like a distribution list (subscribe and have an
>>>>>> email address to send messages there). Not sure, but from the settings I've
>>>>>> seen it should be possible.
>>>>>>
>>>>>> Personally I think it should replace all the existing lists:
>>>>>> - pydata google group
>>>>>> - pandas-dev (this)
>>>>>> - core devs list
>>>>>>
>>>>>> I'm also ok to get rid of gitter once we move to discourse (also ok
>>>>>> to keep it if people find it useful, but I rarely use it).
>>>>>>
>>>>>> I created an issue for this discussion some time ago:
>>>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>>>
>>>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the
>>>>>>>> other one.
>>>>>>>>
>>>>>>>> For some discussion from numpy you can see here
>>>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>>>
>>>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy,
>>>>>>>> Scipy have all told me "I don't want to run it ourselves but be part of a
>>>>>>>> larger one"
>>>>>>>>
>>>>>>>> I bet we can figure out how to organize it.
>>>>>>>>
>>>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>>>
>>>>>>>> -- Andy
>>>>>>>>
>>>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>>>> discussion from the website hosting I think, regardless of whether OVH
>>>>>>>>> might also host discourse)
>>>>>>>>>
>>>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>>>> indeed, that sounds like it needs some kind of hierarchical category /
>>>>>>>>> tagging.
>>>>>>>>>
>>>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>>>> discourse, but *if* we do that, we should think about how that
>>>>>>>>> fits with / replaces / adds to /... some of the other communication
>>>>>>>>> channels (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>>>
>>>>>>>>
>>>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>>>>> Possibly gitter as well.
>>>>>>>
>>>>>>>
>>>>>>>> Joris
>>>>>>>>>
>>>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'm fine with that conceptually, but I think Discourse will make
>>>>>>>>>> things quite tricky to find then.
>>>>>>>>>>
>>>>>>>>>> We already got our discourse approved, if you want to join it and
>>>>>>>>>> experiment with the settings. But it's the first thing I tried, and after
>>>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>>>> clear separation between pandas/users and pandas/contributors discussions.
>>>>>>>>>>
>>>>>>>>>> Maybe I just couldn't find the settings; let me know if you
>>>>>>>>>> manage to get a multi-project setup that makes sense.
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and other
>>>>>>>>>>> PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <
>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>>>> projects: https://free.discourse.group/ And I don't think we
>>>>>>>>>>>> want another system to maintain. Am I missing something?
>>>>>>>>>>>>
>>>>>>>>>>>> I applied for https://pandas.discourse.group, so we can give
>>>>>>>>>>>> it a try. We should have it approved and working in a couple of days.
>>>>>>>>>>>>
>>>>>>>>>>>> From what I saw, Discourse has one level of categories, so I
>>>>>>>>>>>> guess we want one Discourse per project, so we can have categories for
>>>>>>>>>>>> "Users", "Contributors", "Ecosystem"... or something similar. I guess if
>>>>>>>>>>>> we have a single Discourse for NumFOCUS, every project will be a category,
>>>>>>>>>>>> and it'll be difficult to group conversations.
>>>>>>>>>>>>
>>>>>>>>>>>> If anyone already has experience with Discourse and disagrees
>>>>>>>>>>>> with my guesses, please let me know.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <andy at numfocus.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> NumPy wants me to help host a discourse for them; maybe OVH
>>>>>>>>>>>>> would be a good place to do that as well (although I would be more
>>>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Andy
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds good w.r.t. crediting OVH on those pages.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I now
>>>>>>>>>>>>>> notice is currently broken for pandas), the only thing on the webserver is a
>>>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> An update on the new website infrastructure. We need to
>>>>>>>>>>>>>>> finish discussing the details, but OVH is happy to provide the hosting for
>>>>>>>>>>>>>>> the pandas infrastructure we need.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My initial idea is to credit them on the page with the rest
>>>>>>>>>>>>>>> of the sponsors on the new website:
>>>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What I'd like to ask for is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. For the production website and docs (static content only,
>>>>>>>>>>>>>>> for the traffic we need):
>>>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks, builds,
>>>>>>>>>>>>>>> CI stuff (temporarily publishing the docs for every PR,...):
>>>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch
>>>>>>>>>>>>>>> tutorials on Binder...):
>>>>>>>>>>>>>>> https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the setup
>>>>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>>>> it'll be easy to estimate beforehand how big a cluster we need, but I
>>>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OVH gave us a 200 euro voucher to experiment with the
>>>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Somehow related to the work on the new website (
>>>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've
>>>>>>>>>>>>>>>> been discussing with the Binder team, and looks like should be quite easy
>>>>>>>>>>>>>>>> soon (with a Sphinx extension) to make all the documentation pages runnable
>>>>>>>>>>>>>>>> with Binder, directly from the website (without opening the page as a
>>>>>>>>>>>>>>>> Jupyter in mybinder).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> While they are very happy with the idea of having this is
>>>>>>>>>>>>>>>> pandas, it's uncertain if the current infrastructure Binder has got, is
>>>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider) on
>>>>>>>>>>>>>>>> whether they'd be happy to provide a dedicated BinderHub specific to pandas
>>>>>>>>>>>>>>>> (or may be we can have one for all NumFOCUS projects). We'll see how it
>>>>>>>>>>>>>>>> goes, but wanted to let you know, so you're updated, and in case anyone is
>>>>>>>>>>>>>>>> interested in participating in the discussions. Of course before any
>>>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server
>>>>>>>>>>>>>>>> for the website, and one for development stuff. Specifically for the dev
>>>>>>>>>>>>>>>> docs (including rendered docs of every PR) and the GitHub app that will
>>>>>>>>>>>>>>>> generate them. I guess it should be very easy to find a sponsor for these
>>>>>>>>>>>>>>>> two servers (in exchange of a small note in the footer of the website, or
>>>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved or
>>>>>>>>>>>>>>>> whatever.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Pandas-dev mailing list
>>>>>>>>>>>>>>> Pandas-dev at python.org
>>>>>>>>>>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Andy R. Terrel, PhD
>>>>>>>>>>>>> President
>>>>>>>>>>>>> NumFOCUS
>>>>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>> Andy R. Terrel, PhD
>>>>>>>> President
>>>>>>>> NumFOCUS
>>>>>>>> andy at numfocus.org
>>>>>>>>
>>>>>
>>>
>>
>>
>> --
>> Andy R. Terrel, PhD
>> President, NumFOCUS
>> andy at numfocus.org
>>
>


-- 
Andy R. Terrel, PhD
President, NumFOCUS
andy at numfocus.org

From tom.augspurger88 at gmail.com  Tue Oct 29 13:09:08 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 29 Oct 2019 12:09:08 -0500
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
 <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
 <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>
 <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>
Message-ID: <CAE1aY-m=B2maXu5+NG5_TMSJwHby19t4b3qubbChdOrzF=xqNg@mail.gmail.com>

I don't have a strong opinion here. Happy to go with what works best.

On Tue, Oct 29, 2019 at 12:07 PM Marc Garcia <garcia.marc at gmail.com> wrote:

> I personally don't see the value of having a common discourse for all the
> projects, where the top-level is a list of possibly 100 items, where pandas
> has few groups lost there, and not more structure than that, as opposed to
> have a discourse per project.
>
> Single-login is the only advantage I can see, and this can also be
> achieved with separate groups, from what I've seen.
>
> Tom, Joris, I think you were the ones who preferred having a common
> discourse. Does it still sound like the best option, given the limitations?
>
> On Tue, Oct 29, 2019 at 11:51 AM Andy Ray Terrel <andy at numfocus.org>
> wrote:
>
>> Sorry I've been traveling.
>>
>> I have https://pydata.discourse.group set
>> up. I can send out invites.
>>
>> I guess as you have pointed out, we can set up categories for each
>> project, e.g. dask-users, pandas-users, pandas-dev, but maybe not exactly
>> what you want.
>>
>> Happy to invite anyone to the discourse instance before we open it up to
>> the wild
>>
>> -- Andy
>>
>> On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia <garcia.marc at gmail.com>
>> wrote:
>>
>>> Andy, could you experiment on having multiple projects in a single
>>> discourse? I saw the PyData one was activated some time ago.
>>>
>>> If it doesn't look feasible as I think, let me know so I'll move forward
>>> discussing what to have in the pandas one.
>>>
>>> Cheers!
>>>
>>> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com>
>>> wrote:
>>>
>>>> Discourse has private categories, we already have a private
>>>> "Maintainers" one, that only admins can see and use. And there are other
>>>> permissions levels that can be used. For example, we can have a private
>>>> category for the members of the code of conduct committee... I just need
>>>> to check if we can associate email addresses to those groups, so when
>>>> someone emails to coc at pandas.io the messages are posted in that
>>>> private group. But if we can set up that as we need, I think we should be
>>>> able to replace all those and centralize everything in Discourse.
>>>>
>>>> I'm skeptical on being able to set up a global Discourse for all the
>>>> ecosystem, where things are easy to find, based on how Discourse works and
>>>> the tests I did. I'd move forward with our own for now if nobody is able to
>>>> set that up.
>>>>
>>>> Andy, I got the pandas account approved in minutes. I see that we can
>>>> have a custom domain, so you can use the pandas and see if we can manage to
>>>> have multiple projects in a way we like, and if we do we just change the
>>>> domain to discuss.pydata.org (or whatever). You're already an admin,
>>>> feel free to experiment and change the set up as you need.
>>>>
>>>> Maarten, not sure I understand your point. Not a fan of Discourse so
>>>> far, but I think having the user and the devs discussions in a single place
>>>> makes it easier to find the information, and I think Discourse interface
>>>> also makes it easier to find compared to mailman, or google groups.
>>>> Regardless of gitter (there are no important discussions or decision making
>>>> there I think), would you prefer to stay with mailman and google groups
>>>> over Discourse? Or what you think would be the ideal or best option?
>>>>
>>>> Thanks!
>>>>
>>>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>>>> jorisvandenbossche at gmail.com> wrote:
>>>>
>>>>> What do other people think about starting to use discourse for pandas?
>>>>> (and about sharing it with other projects or having our own?)
>>>>>
>>>>> --
>>>>>
>>>>> On the existing lists: I don't think discourse would replace the core
>>>>> devs list (that is intentionally private). And IMO also not gitter
>>>>> (discourse is not a real-time chat).
>>>>>
>>>>> Joris
>>>>>
>>>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> From what I've seen, I'd say that Discourse can be configured to
>>>>>> interact with a category like a distribution list (subscribe and have an
>>>>>> email address to send messages there). Not sure, but from the settings I've
>>>>>> seen it should be possible.
>>>>>>
>>>>>> Personally I think it should replace all the existing lists:
>>>>>> - pydata google group
>>>>>> - pandas-dev (this)
>>>>>> - core devs list
>>>>>>
>>>>>> I'm also ok to get rid of gitter once we move to discourse (also ok
>>>>>> to keep it if people find it useful, but I rarely use it).
>>>>>>
>>>>>> I created an issue for this discussion some time ago:
>>>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>>>
>>>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the
>>>>>>>> other one.
>>>>>>>>
>>>>>>>> For some discussion from numpy you can see here
>>>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>>>
>>>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy,
>>>>>>>> Scipy have all told me "I don't want to run it ourselves but be part of a
>>>>>>>> larger one"
>>>>>>>>
>>>>>>>> I bet we can figure out how to organize it.
>>>>>>>>
>>>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>>>
>>>>>>>> -- Andy
>>>>>>>>
>>>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>>>> discussion from the website hosting I think, regardless whether OVH might
>>>>>>>>> also host discourse)
>>>>>>>>>
>>>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>>>> indeed, that sounds like it needs some kind of hierarchical category /
>>>>>>>>> tagging.
>>>>>>>>>
>>>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>>>> discourse, but *if* we do that, we should think about how that
>>>>>>>>> fits with / replaces / adds to /... some of the other communication
>>>>>>>>> channels (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>>>
>>>>>>>>
>>>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>>>>> Possibly gitter as well.
>>>>>>>
>>>>>>>
>>>>>>>> Joris
>>>>>>>>>
>>>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'm fine with that conceptually, but I think Discourse will make
>>>>>>>>>> things quite tricky to find then.
>>>>>>>>>>
>>>>>>>>>> We already got our discourse approved, if you want to join it and
>>>>>>>>>> experiment with the settings. But it's the first thing I tried, and after
>>>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>>>> clear separation between pandas/users pandas/contributors discussions.
>>>>>>>>>>
>>>>>>>>>> Maybe I just couldn't find the settings, let me know if you
>>>>>>>>>> manage to get a multi-project set up that makes sense.
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and other
>>>>>>>>>>> PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <
>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>>>> projects: https://free.discourse.group/ And I don't think we
>>>>>>>>>>>> want another system to maintain. Am I missing something?
>>>>>>>>>>>>
>>>>>>>>>>>> I applied for https://pandas.discourse.group, so we can give
>>>>>>>>>>>> it a try. We should have it approved and working in a couple of days.
>>>>>>>>>>>>
>>>>>>>>>>>> From what I saw, Discourse has one level of categories, so I
>>>>>>>>>>>> guess we want one per project, so we can have categories for "Users",
>>>>>>>>>>>> "Contributors", "Ecosystem"... or something similar. I guess if we have a
>>>>>>>>>>>> single Discourse for NumFOCUS, every project will be a category, and it'll
>>>>>>>>>>>> be difficult to group conversations.
>>>>>>>>>>>>
>>>>>>>>>>>> If anyone already has experience with Discourse and disagrees
>>>>>>>>>>>> with my guesses, please let me know.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <andy at numfocus.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> NumPy wants me to help host a discourse for them, maybe OVH
>>>>>>>>>>>>> would be a good place to do that as well, (although I would be more
>>>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Andy
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds good w.r.t crediting OVH on those pages.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I now
>>>>>>>>>>>>>> notice is currently broken for pandas), the only thing on the webserver is a
>>>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> An update on the new website infrastructure. We need to
>>>>>>>>>>>>>>> finish discussing the details, but OVH is happy to provide the hosting for
>>>>>>>>>>>>>>> the pandas infrastructure we need.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My initial idea is to credit them in the page with the rest
>>>>>>>>>>>>>>> of the sponsors in the new website:
>>>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What I'd like to ask is:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. For the production website and docs (static content only,
>>>>>>>>>>>>>>> for the traffic we need):
>>>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks, builds,
>>>>>>>>>>>>>>> CI stuff (temporarily publish the docs for every PR,...):
>>>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch
>>>>>>>>>>>>>>> tutorials on Binder...):
>>>>>>>>>>>>>>> https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the set up
>>>>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>>>> it'll be easy to estimate how big is the cluster we need beforehand, but I
>>>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OVH gave us a 200 euros voucher to experiment with the
>>>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Somehow related to the work on the new website (
>>>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've
>>>>>>>>>>>>>>>> been discussing with the Binder team, and it looks like it should be quite easy
>>>>>>>>>>>>>>>> soon (with a Sphinx extension) to make all the documentation pages runnable
>>>>>>>>>>>>>>>> with Binder, directly from the website (without opening the page as a
>>>>>>>>>>>>>>>> Jupyter in mybinder).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> While they are very happy with the idea of having this in
>>>>>>>>>>>>>>>> pandas, it's uncertain whether Binder's current infrastructure is
>>>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider) on
>>>>>>>>>>>>>>>> whether they'd be happy to provide a dedicated BinderHub specific to pandas
>>>>>>>>>>>>>>>> (or maybe we can have one for all NumFOCUS projects). We'll see how it
>>>>>>>>>>>>>>>> goes, but wanted to let you know, so you're updated, and in case anyone is
>>>>>>>>>>>>>>>> interested in participating in the discussions. Of course before any
>>>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server
>>>>>>>>>>>>>>>> for the website, and one for development stuff. Specifically for the dev
>>>>>>>>>>>>>>>> docs (including rendered docs of every PR) and the GitHub app that will
>>>>>>>>>>>>>>>> generate them. I guess it should be very easy to find a sponsor for these
>>>>>>>>>>>>>>>> two servers (in exchange of a small note in the footer of the website, or
>>>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved or
>>>>>>>>>>>>>>>> whatever.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Andy R. Terrel, PhD
>>>>>>>>>>>>> President
>>>>>>>>>>>>> NumFOCUS
>>>>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>> Andy R. Terrel, PhD
>>>>>>>> President
>>>>>>>> NumFOCUS
>>>>>>>> andy at numfocus.org
>>>>>>>>
>>>>>
>>>
>>
>>
>> --
>> Andy R. Terrel, PhD
>> President, NumFOCUS
>> andy at numfocus.org
>>
>

From garcia.marc at gmail.com  Tue Oct 29 13:21:32 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Tue, 29 Oct 2019 12:21:32 -0500
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CA+WonSRDvkF+VYiLK+ag1iqP+X0504p9vnBFtdbA0GexjCurvQ@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
 <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
 <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>
 <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>
 <CA+WonSRDvkF+VYiLK+ag1iqP+X0504p9vnBFtdbA0GexjCurvQ@mail.gmail.com>
Message-ID: <CAEk5N5uMnD5Y4RDpzTq1a-v_in9_HWbjKYppPHWyOdrGLW+pMg@mail.gmail.com>

That's a good point. I guess it doesn't make a big difference in terms of
organization of the threads, as a discussion on something dask-pandas will
still need to be in one of the categories (pandas-dev or dask-dev). But
being able to tag people from other projects could be useful.

But I still think that having separate discourse instances will make our
lives easier. Feels like a huge mess to have all projects in the same
instance with the navigation of discourse.

On Tue, Oct 29, 2019 at 12:09 PM Andy Ray Terrel <andy at numfocus.org> wrote:

> I think the value many have is for cross project issues, but maybe those
> are few and far between.
>
> On Tue, Oct 29, 2019 at 10:07 AM Marc Garcia <garcia.marc at gmail.com>
> wrote:
>
>> I personally don't see the value of having a common discourse for all the
>> projects, where the top-level is a list of possibly 100 items, where pandas
>> has few groups lost there, and not more structure than that, as opposed to
>> have a discourse per project.
>>
>> Single-login is the only advantage I can see, and this can also be
>> achieved with separate groups, from what I've seen.
>>
>> Tom, Joris, I think you were the ones who preferred having a common
>> discourse. Does it still sound like the best option, given the limitations?
>>
>> On Tue, Oct 29, 2019 at 11:51 AM Andy Ray Terrel <andy at numfocus.org>
>> wrote:
>>
>>> Sorry I've been traveling.
>>>
>>> I have https://pydata.discourse.group set
>>> up. I can send out invites.
>>>
>>> I guess as you have pointed out, we can set up categories for each
>>> project, e.g. dask-users, pandas-users, pandas-dev, but maybe not exactly
>>> what you want.
>>>
>>> Happy to invite anyone to the discourse instance before we open it up to
>>> the wild
>>>
>>> -- Andy
>>>
>>> On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia <garcia.marc at gmail.com>
>>> wrote:
>>>
>>>> Andy, could you experiment on having multiple projects in a single
>>>> discourse? I saw the PyData one was activated some time ago.
>>>>
>>>> If it doesn't look feasible as I think, let me know so I'll move
>>>> forward discussing what to have in the pandas one.
>>>>
>>>> Cheers!
>>>>
>>>> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com>
>>>> wrote:
>>>>
>>>>> Discourse has private categories, we already have a private
>>>>> "Maintainers" one, that only admins can see and use. And there are other
>>>>> permissions levels that can be used. For example, we can have a private
>>>>> category for the members of the code of conduct committee... I just need
>>>>> to check if we can associate email addresses to those groups, so when
>>>>> someone emails to coc at pandas.io the messages are posted in that
>>>>> private group. But if we can set up that as we need, I think we should be
>>>>> able to replace all those and centralize everything in Discourse.
>>>>>
>>>>> I'm skeptical on being able to set up a global Discourse for all the
>>>>> ecosystem, where things are easy to find, based on how Discourse works and
>>>>> the tests I did. I'd move forward with our own for now if nobody is able to
>>>>> set that up.
>>>>>
>>>>> Andy, I got the pandas account approved in minutes. I see that we can
>>>>> have a custom domain, so you can use the pandas and see if we can manage to
>>>>> have multiple projects in a way we like, and if we do we just change the
>>>>> domain to discuss.pydata.org (or whatever). You're already an admin,
>>>>> feel free to experiment and change the set up as you need.
>>>>>
>>>>> Maarten, not sure I understand your point. Not a fan of Discourse so
>>>>> far, but I think having the user and the devs discussions in a single place
>>>>> makes it easier to find the information, and I think Discourse interface
>>>>> also makes it easier to find compared to mailman, or google groups.
>>>>> Regardless of gitter (there are no important discussions or decision making
>>>>> there I think), would you prefer to stay with mailman and google groups
>>>>> over Discourse? Or what you think would be the ideal or best option?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>
>>>>>> What do other people think about starting to use discourse for pandas?
>>>>>> (and about sharing it with other projects or having our own?)
>>>>>>
>>>>>> --
>>>>>>
>>>>>> On the existing lists: I don't think discourse would replace the core
>>>>>> devs list (that is intentionally private). And IMO also not gitter
>>>>>> (discourse is not a real-time chat).
>>>>>>
>>>>>> Joris
>>>>>>
>>>>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> From what I've seen, I'd say that Discourse can be configured to
>>>>>>> interact with a category like a distribution list (subscribe and have an
>>>>>>> email address to send messages there). Not sure, but from the settings I've
>>>>>>> seen it should be possible.
>>>>>>>
>>>>>>> Personally I think it should replace all the existing lists:
>>>>>>> - pydata google group
>>>>>>> - pandas-dev (this)
>>>>>>> - core devs list
>>>>>>>
>>>>>>> I'm also ok to get rid of gitter once we move to discourse (also ok
>>>>>>> to keep it if people find it useful, but I rarely use it).
>>>>>>>
>>>>>>> I created an issue for this discussion some time ago:
>>>>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>>>>
>>>>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the
>>>>>>>>> other one.
>>>>>>>>>
>>>>>>>>> For some discussion from numpy you can see here
>>>>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>>>>
>>>>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy,
>>>>>>>>> Scipy have all told me "I don't want to run it ourselves but be part of a
>>>>>>>>> larger one"
>>>>>>>>>
>>>>>>>>> I bet we can figure out how to organize it.
>>>>>>>>>
>>>>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>>>>
>>>>>>>>> -- Andy
>>>>>>>>>
>>>>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>>>>> discussion from the website hosting I think, regardless whether OVH might
>>>>>>>>>> also host discourse)
>>>>>>>>>>
>>>>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>>>>> indeed, that sounds like it needs some kind of hierarchical category /
>>>>>>>>>> tagging.
>>>>>>>>>>
>>>>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>>>>> discourse, but *if* we do that, we should think about how that
>>>>>>>>>> fits with / replaces / adds to /... some of the other communication
>>>>>>>>>> channels (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>>>>>> Possibly gitter as well.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Joris
>>>>>>>>>>
>>>>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm fine with that conceptually, but I think Discourse will make
>>>>>>>>>>> things quite tricky to find then.
>>>>>>>>>>>
>>>>>>>>>>> We already got our discourse approved, if you want to join it and
>>>>>>>>>>> experiment with the settings. But it's the first thing I tried, and after
>>>>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>>>>> clear separation between pandas/users pandas/contributors discussions.
>>>>>>>>>>>
>>>>>>>>>>> Maybe I just couldn't find the settings, let me know if you
>>>>>>>>>>> manage to get a multi-project set up that makes sense.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and
>>>>>>>>>>>> other PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <
>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>>>>> projects: https://free.discourse.group/ And I don't think we
>>>>>>>>>>>>> want another system to maintain. Am I missing something?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I applied for https://pandas.discourse.group, so we can give
>>>>>>>>>>>>> it a try. We should have it approved and working in a couple of days.
>>>>>>>>>>>>>
>>>>>>>>>>>>> From what I saw, Discourse has one level of categories, so I
>>>>>>>>>>>>> guess we want one per project, so we can have categories for "Users",
>>>>>>>>>>>>> "Contributors", "Ecosystem"... or something similar. I guess if we have a
>>>>>>>>>>>>> single Discourse for NumFOCUS, every project will be a category, and it'll
>>>>>>>>>>>>> be difficult to group conversations.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If anyone already has experience with Discourse and disagrees
>>>>>>>>>>>>> with my guesses, please let me know.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <andy at numfocus.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> NumPy wants me to help host a discourse for them, maybe OVH
>>>>>>>>>>>>>> would be a good place to do that as well, (although I would be more
>>>>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Andy
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds good w.r.t crediting OVH on those pages.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I now
>>>>>>>>>>>>>>> notice is currently broken for pandas), the only thing on the webserver is a
>>>>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> An update on the new website infrastructure. We need to
>>>>>>>>>>>>>>>> finish discussing the details, but OVH is happy to provide the hosting for
>>>>>>>>>>>>>>>> the pandas infrastructure we need.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My initial idea is to credit them in the page with the rest
>>>>>>>>>>>>>>>> of the sponsors in the new website:
>>>>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What I'd like to ask is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. For the production website and docs (static content
>>>>>>>>>>>>>>>> only, for the traffic we need):
>>>>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks,
>>>>>>>>>>>>>>>> builds, CI stuff (temporarily publish the docs for every PR,...):
>>>>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch
>>>>>>>>>>>>>>>> tutorials on Binder...):
>>>>>>>>>>>>>>>> https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the set up
>>>>>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>>>>> it'll be easy to estimate how big is the cluster we need beforehand, but I
>>>>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> OVH gave us a 200 euros voucher to experiment with the
>>>>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Somewhat related to the work on the new website (
>>>>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've
>>>>>>>>>>>>>>>>> been discussing with the Binder team, and it looks like it should be quite easy
>>>>>>>>>>>>>>>>> soon (with a Sphinx extension) to make all the documentation pages runnable
>>>>>>>>>>>>>>>>> with Binder, directly from the website (without opening the page as a
>>>>>>>>>>>>>>>>> Jupyter in mybinder).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> While they are very happy with the idea of having this in
>>>>>>>>>>>>>>>>> pandas, it's uncertain whether Binder's current infrastructure is
>>>>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider) on
>>>>>>>>>>>>>>>>> whether they'd be happy to provide a dedicated BinderHub specific to pandas
>>>>>>>>>>>>>>>>> (or maybe we can have one for all NumFOCUS projects). We'll see how it
>>>>>>>>>>>>>>>>> goes, but wanted to let you know, so you're updated, and in case anyone is
>>>>>>>>>>>>>>>>> interested in participating in the discussions. Of course before any
>>>>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server
>>>>>>>>>>>>>>>>> for the website, and one for development stuff. Specifically for the dev
>>>>>>>>>>>>>>>>> docs (including rendered docs of every PR) and the GitHub app that will
>>>>>>>>>>>>>>>>> generate them. I guess it should be very easy to find a sponsor for these
>>>>>>>>>>>>>>>>> two servers (in exchange for a small note in the footer of the website, or
>>>>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved
>>>>>>>>>>>>>>>>> or whatever.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> Pandas-dev mailing list
>>>>>>>>>>>>>>>> Pandas-dev at python.org
>>>>>>>>>>>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Andy R. Terrel, PhD
>>>>>>>>>>>>>> President
>>>>>>>>>>>>>> NumFOCUS
>>>>>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>>>>>

From andy at numfocus.org  Tue Oct 29 13:44:47 2019
From: andy at numfocus.org (Andy Ray Terrel)
Date: Tue, 29 Oct 2019 10:44:47 -0700
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To: <CAEk5N5uMnD5Y4RDpzTq1a-v_in9_HWbjKYppPHWyOdrGLW+pMg@mail.gmail.com>
References: <CALQtMBZc=jMLPFmN4=dF9MG4u24ggVKnj-npE565TmdmVbjLOQ@mail.gmail.com>
 <CA+uMjAtfqzgnLEbSGZPFSnz_12RrVv26cYz9EqVTSMUArTeasw@mail.gmail.com>
 <CAE1aY-=U=FJZWrqcsjvaw-Q14bjCUOhbojX9uJ5kqG2o7M3W-Q@mail.gmail.com>
 <CAEk5N5vTGJAeWbYh3JOAdsbEvb1Agan2roZPyVGAeFvSjf6XqQ@mail.gmail.com>
 <CALQtMBY4uG6kxdZvqVMw6Ou_Zs3WTyD=MevJamZmCBKrwvMxMA@mail.gmail.com>
 <CAEk5N5srS3uBED+Grb3Ya=coOQ8mLdhqGMC8mEJKaswc4Ho-dg@mail.gmail.com>
 <CAEk5N5swt=EGDa8YLZt5ScQm_7woyMYzB3b8OqEttp9nfx1u5w@mail.gmail.com>
 <CA+WonSS7ML4J4DyOTNGv-LN01Wv90DvOG0tOe15KN-edcEz_rA@mail.gmail.com>
 <CAEk5N5sa9ZrzJyR5Q_d=KATagSQuPRgrfMBiUGXRX2qaAYyo4A@mail.gmail.com>
 <CA+WonSRDvkF+VYiLK+ag1iqP+X0504p9vnBFtdbA0GexjCurvQ@mail.gmail.com>
 <CAEk5N5uMnD5Y4RDpzTq1a-v_in9_HWbjKYppPHWyOdrGLW+pMg@mail.gmail.com>
Message-ID: <CA+WonSSpAfEzrLfxCX3LPXvqD4DK8uUUZmxVD2cvbTgR=4567Q@mail.gmail.com>

Okay, well, I can go bug the heads of all the PyData projects, but the
confusion comes when a user doesn't know where to post. Having lots of
Discourse sites seems likely to confuse users and to create more work for
the maintainers who curate the community discussion.


On Tue, Oct 29, 2019 at 10:21 AM Marc Garcia <garcia.marc at gmail.com> wrote:

> That's a good point. I guess it doesn't make a big difference in terms of
> organization of the threads, as a discussion on something dask-pandas will
> still need to be in one of the categories (pandas-dev or dask-dev). But
> being able to tag people from other projects could be useful.
>
> But I still think that having separate discourse instances will make our
> lives easier. Feels like a huge mess to have all projects in the same
> instance with the navigation of discourse.
>
> On Tue, Oct 29, 2019 at 12:09 PM Andy Ray Terrel <andy at numfocus.org>
> wrote:
>
>> I think the value many have is for cross project issues, but maybe those
>> are few and far between.
>>
>> On Tue, Oct 29, 2019 at 10:07 AM Marc Garcia <garcia.marc at gmail.com>
>> wrote:
>>
>>> I personally don't see the value of having a common discourse for all
>>> the projects, where the top level is a list of possibly 100 items, with
>>> pandas having a few groups lost in there and no more structure than that, as
>>> opposed to having a discourse per project.
>>>
>>> Single login is the only advantage I can see, and this can also be
>>> achieved with separate groups, from what I've seen.
>>>
>>> Tom, Joris, I think you were the ones who preferred having a common
>>> discourse. Does it still sound like the best option, given the limitations?
>>>
>>> On Tue, Oct 29, 2019 at 11:51 AM Andy Ray Terrel <andy at numfocus.org>
>>> wrote:
>>>
>>>> Sorry I've been traveling.
>>>>
>>>> I have https://pydata.discourse.group set
>>>> up. I can send out invites.
>>>>
>>>> I guess as you have pointed out, we can set up categories for each
>>>> project, e.g. dask-users, pandas-users, pandas-dev, but maybe not exactly
>>>> what you want.
>>>>
>>>> Happy to invite anyone to the discourse instance before we open it up
>>>> to the wild
>>>>
>>>> -- Andy
>>>>
>>>> On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia <garcia.marc at gmail.com>
>>>> wrote:
>>>>
>>>>> Andy, could you experiment on having multiple projects in a single
>>>>> discourse? I saw the PyData one was activated some time ago.
>>>>>
>>>>> If it doesn't look feasible as I think, let me know so I'll move
>>>>> forward discussing what to have in the pandas one.
>>>>>
>>>>> Cheers!
>>>>>
>>>>> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia <garcia.marc at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Discourse has private categories, we already have a private
>>>>>> "Maintainers" one, that only admins can see and use. And there are other
>>>>>> permissions levels that can be used. For example, we can have a private
>>>>>> category for the members of the code of conduct committee... I just need
>>>>>> to check if we can associate email addresses to those groups, so when
>>>>>> someone emails to coc at pandas.io the messages are posted in that
>>>>>> private group. But if we can set up that as we need, I think we should be
>>>>>> able to replace all those and centralize everything in Discourse.
>>>>>>
>>>>>> I'm skeptical on being able to set up a global Discourse for all the
>>>>>> ecosystem, where things are easy to find, based on how Discourse works and
>>>>>> the tests I did. I'd move forward with our own for now if nobody is able to
>>>>>> set that up.
>>>>>>
>>>>>> Andy, I got the pandas account approved in minutes. I see that we can
>>>>>> have a custom domain, so you can use the pandas and see if we can manage to
>>>>>> have multiple projects in a way we like, and if we do we just change the
>>>>>> domain to discuss.pydata.org (or whatever). You're already an admin,
>>>>>> feel free to experiment and change the set up as you need.
>>>>>>
>>>>>> Maarten, not sure I understand your point. Not a fan of Discourse so
>>>>>> far, but I think having the user and the devs discussions in a single place
>>>>>> makes it easier to find the information, and I think Discourse interface
>>>>>> also makes it easier to find compared to mailman, or google groups.
>>>>>> Regardless of gitter (there are no important discussions or decision making
>>>>>> there I think), would you prefer to stay with mailman and google groups
>>>>>> over Discourse? Or what you think would be the ideal or best option?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>
>>>>>>> What do other people think about starting to use discourse for
>>>>>>> pandas?
>>>>>>> (and about sharing it with other projects or having our own?)
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> On the existing lists: I don't think discourse would replace the
>>>>>>> core devs list (that is intentionally private). And IMO also not gitter
>>>>>>> (discourse is not a real-time chat).
>>>>>>>
>>>>>>> Joris
>>>>>>>
>>>>>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia <garcia.marc at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> From what I've seen, I'd say that Discourse can be configured to
>>>>>>>> interact with a category like a distribution list (subscribe and have an
>>>>>>>> email address to send messages there). I'm not sure, but from the settings
>>>>>>>> I've seen it should be possible.
>>>>>>>>
>>>>>>>> Personally I think it should replace all the existing lists:
>>>>>>>> - pydata google group
>>>>>>>> - pandas-dev (this)
>>>>>>>> - core devs list
>>>>>>>>
>>>>>>>> I'm also ok to get rid of gitter once we move to discourse (also ok
>>>>>>>> to keep it if people find it useful, but I rarely use it).
>>>>>>>>
>>>>>>>> I created an issue for this discussion some time ago:
>>>>>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>>>>>
>>>>>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel <andy at numfocus.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the
>>>>>>>>>> other one.
>>>>>>>>>>
>>>>>>>>>> For some discussion from numpy you can see here
>>>>>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>>>>>
>>>>>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy,
>>>>>>>>>> Scipy have all told me "I don't want to run it ourselves but be part of a
>>>>>>>>>> larger one"
>>>>>>>>>>
>>>>>>>>>> I bet we can figure out how to organize it.
>>>>>>>>>>
>>>>>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>>>>>
>>>>>>>>>> -- Andy
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>>>>>> discussion from the website hosting I think, regardless whether OVH might
>>>>>>>>>>> also host discourse)
>>>>>>>>>>>
>>>>>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>>>>>> indeed, that sounds as it needs some kind of hierarchical category /
>>>>>>>>>>> tagging.
>>>>>>>>>>>
>>>>>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>>>>>> discourse, but *if* we do that, we should think about how that
>>>>>>>>>>> fits with / replaces / adds to /... some of the other communication
>>>>>>>>>>> channels (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with it.
>>>>>>>>> Possibly gitter as well.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Joris
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <garcia.marc at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'm fine with that conceptually, but I think Discourse will
>>>>>>>>>>>> make things quite tricky to find then.
>>>>>>>>>>>>
>>>>>>>>>>>> We already got our discourse approved, if you want to join it
>>>>>>>>>>>> and experiment with the settings. But it's the first thing I tried, and after
>>>>>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>>>>>> clear separation between pandas/users pandas/contributors discussions.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe I just couldn't find the settings, let me know if you
>>>>>>>>>>>> manage to get a multi-project set up that makes sense.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and
>>>>>>>>>>>>> other PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <
>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>>>>>> projects: https://free.discourse.group/ And I don't think we
>>>>>>>>>>>>>> want another system to maintain. Am I missing something?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I applied for https://pandas.discourse.group, so we can give
>>>>>>>>>>>>>> it a try. We should have it approved and working in a couple of days.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> From what I saw, Discourse has one level of categories, so I
>>>>>>>>>>>>>> guess we want one per project, so we can have categories for "Users",
>>>>>>>>>>>>>> "Contributors", "Ecosystem"... or something similar. I guess if we have a
>>>>>>>>>>>>>> single Discourse for NumFOCUS, every project will be a category, and it'll
>>>>>>>>>>>>>> be difficult to group conversations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If anyone already has experience with Discourse and disagrees
>>>>>>>>>>>>>> with my guesses, please let me know.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <
>>>>>>>>>>>>>> andy at numfocus.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> NumPy wants me to help host a discourse for them, maybe OVH
>>>>>>>>>>>>>>> would be a good place to do that as well, (although I would be more
>>>>>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Andy
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds good w.r.t crediting OVH on those pages.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I
>>>>>>>>>>>>>>>> now notice is currently broken for pandas), the only thing on the webserver
>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> An update on the new website infrastructure. We need to
>>>>>>>>>>>>>>>>> finish discussing the details, but OVH is happy to provide the hosting for
>>>>>>>>>>>>>>>>> the pandas infrastructure we need.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My initial idea is to credit them in the page with the
>>>>>>>>>>>>>>>>> rest of the sponsors in the new website:
>>>>>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What I'd like to ask is:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. For the production website and docs (static content
>>>>>>>>>>>>>>>>> only, for the traffic we need):
>>>>>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. For our tools and processes, like the benchmarks,
>>>>>>>>>>>>>>>>> builds, CI stuff (temporarily publish the docs for every PR,...):
>>>>>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch
>>>>>>>>>>>>>>>>> tutorials on Binder...):
>>>>>>>>>>>>>>>>> https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the set up
>>>>>>>>>>>>>>>>> (which is great, because I don't know much about Binder myself, and I'm not
>>>>>>>>>>>>>>>>> sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>>>>>> it'll be easy to estimate how big is the cluster we need beforehand, but I
>>>>>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> OVH gave us a 200 euros voucher to experiment with the
>>>>>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Somewhat related to the work on the new website (
>>>>>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've
>>>>>>>>>>>>>>>>>> been discussing with the Binder team, and it looks like it should be quite easy
>>>>>>>>>>>>>>>>>> soon (with a Sphinx extension) to make all the documentation pages runnable
>>>>>>>>>>>>>>>>>> with Binder, directly from the website (without opening the page as a
>>>>>>>>>>>>>>>>>> Jupyter in mybinder).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> While they are very happy with the idea of having this in
>>>>>>>>>>>>>>>>>> pandas, it's uncertain whether Binder's current infrastructure is
>>>>>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider)
>>>>>>>>>>>>>>>>>> on whether they'd be happy to provide a dedicated BinderHub specific to
>>>>>>>>>>>>>>>>>> pandas (or maybe we can have one for all NumFOCUS projects). We'll see how
>>>>>>>>>>>>>>>>>> it goes, but wanted to let you know, so you're updated, and in case anyone
>>>>>>>>>>>>>>>>>> is interested in participating in the discussions. Of course before any
>>>>>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a server
>>>>>>>>>>>>>>>>>> for the website, and one for development stuff. Specifically for the dev
>>>>>>>>>>>>>>>>>> docs (including rendered docs of every PR) and the GitHub app that will
>>>>>>>>>>>>>>>>>> generate them. I guess it should be very easy to find a sponsor for these
>>>>>>>>>>>>>>>>>> two servers (in exchange for a small note in the footer of the website, or
>>>>>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved
>>>>>>>>>>>>>>>>>> or whatever.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>>>


-- 
Andy R. Terrel, PhD
President, NumFOCUS
andy at numfocus.org