From emailformattr at gmail.com Tue Jun 4 01:58:59 2019 From: emailformattr at gmail.com (Matthew Roeschke) Date: Mon, 3 Jun 2019 22:58:59 -0700 Subject: [Pandas-dev] Pandas dev sprint June 27-30 @ Nashville In-Reply-To: References: Message-ID: Wes, any tips where in Nashville we should book accommodations or generally where the co-working space would be? I'm looking to book my accommodations fairly soon. Thanks. On Thu, May 30, 2019 at 10:57 AM Wes McKinney wrote: > @Joris if you have a complete headcount for the meeting please let me > know so I can work on securing a space for us to work. Just to confirm > it's the 27th through the 30th inclusive, so 4 full days of workspace > required? > > Thanks > > On Sat, May 25, 2019 at 3:44 PM Chang She wrote: > > > > Oh this was more me just carving time out as a forcing function for > myself. I'll be +13 hours ahead, so certainly not expecting video > conferences. Code reviews would be appreciated. > > > > As for the read_parquet item, I will either attach to an existing issue > or open a new one so discussion can happen on GitHub. > > > > Thanks. > > > > On Saturday, May 25, 2019, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >> > >> You are certainly welcome to sprint those days as well. I just can't > promise that we can do much to improve remote participation such as > video meetings (but there should be a bunch of core devs active, which > might give faster feedback on PRs if needed). > >> > >> On Tue, 14 May 2019 at 22:59, Chang She wrote: > >>> > >>> I'll be out of the country during those dates. Can I still join in > remotely? > >>> > >>> Here's what I'd be interested in working on if there's appetite for > these to be part of pandas and friends: > >>> > >>> 1. A stale PR on Series.explode I haven't had any time to finish up ( > https://github.com/pandas-dev/pandas/pull/24366).
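For context when reading this thread later: the Series.explode functionality from that PR did eventually ship in pandas 0.25. A minimal sketch of the released behavior (not the PR's internals), assuming pandas >= 0.25:

```python
import pandas as pd

# Series.explode turns each element of a list-like entry into its own
# row, repeating the original index label; empty list-likes become NaN.
s = pd.Series([[1, 2], [3], []])
exploded = s.explode()
# exploded has values 1, 2, 3, NaN with index 0, 0, 1, 2
```

Scalar entries are passed through unchanged, and the result has object dtype.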
> >> > >> I think there is still certainly interest in such a function (I have > regularly needed something like that myself). > >> > >>> > >>> 2. Open sourcing an improvement to the pandas-redshift connector that > speeds up the ingestion of medium amounts of data using a combination of > unload + read_csv + multiprocessing. > >> > >> > >> That reminds me: it might be good to mention the pandas-redshift > packages somewhere in the docs (in the ecosystem page or in the sql docs). > >> > >>> > >>> 3. A minor improvement to allow read_parquet to work with globs > directly. This makes it a lot easier for pandas to read parquet generated > by Spark. > >>> > >> Do you know if there is already an open issue about this? > >> > >> Best, > >> Joris > >> > >>> > >>> > >>> > >>> On Tue, May 14, 2019 at 4:50 AM Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >>>> > >>>> Dear all, > >>>> > >>>> We are planning to do a pandas sprint at the end of June in Nashville > (Tennessee, USA): June 27-30. We will be meeting with some of the core devs > (so not a sprint to jump-start newcomers in this case), but sending this to > the mailing list to invite other pandas (or related libraries) contributors. > >>>> The exact planning of the sprint still needs to be discussed, but we > will probably be hacking and discussing on pandas, extension arrays, next > versions of pandas, etc. > >>>> > >>>> So if you are interested, let me know! We want to keep the > number of participants somewhat limited, and also need to plan the location > and funding, so please state your interest before May 30. > >>>> If you would like to participate, but are not sure if you would fit at > such a sprint, don't hesitate to mail me personally.
> >>>> > >>>> Best, > >>>> Joris > >>>> > >>>> > >>>> _______________________________________________ > >>>> Pandas-dev mailing list > >>>> Pandas-dev at python.org > >>>> https://mail.python.org/mailman/listinfo/pandas-dev > > > > _______________________________________________ > > Pandas-dev mailing list > > Pandas-dev at python.org > > https://mail.python.org/mailman/listinfo/pandas-dev > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -- Matthew Roeschke -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Tue Jun 4 08:50:35 2019 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 4 Jun 2019 07:50:35 -0500 Subject: [Pandas-dev] Pandas dev sprint June 27-30 @ Nashville In-Reply-To: References: Message-ID: Yes, I'd recommend within reach of East Nashville or Downtown Nashville. The earlier you can book accommodations the better! I don't have an accurate headcount yet so I haven't been able to arrange the location to meet yet. On Tue, Jun 4, 2019 at 12:59 AM Matthew Roeschke wrote: > > Wes, any tips where in Nashville we should book accommodations or generally where the co-working space would be? I'm looking to book my accommodations fairly soon. > > Thanks. > > On Thu, May 30, 2019 at 10:57 AM Wes McKinney wrote: >> >> @Joris if you have a complete headcount for the meeting please let me >> know so I can work on securing a space for us to work. Just to confirm >> it's the 27th through the 30th inclusive, so 4 full days of workspace >> required? >> >> Thanks >> >> On Sat, May 25, 2019 at 3:44 PM Chang She wrote: >> > >> > Oh this was more me just carving time out as a forcing function for myself. I'll be +13 hours ahead, so certainly not expecting video conferences. Code reviews would be appreciated.
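As a side note for anyone wanting the glob behavior discussed in this thread before it lands in read_parquet itself: a small helper can expand the pattern and concatenate the per-file frames. This is only a hedged sketch, not the proposed patch; the helper name and its `reader` parameter are invented for illustration. It is demonstrated with `pd.read_csv` so it runs without a parquet engine, but passing `pd.read_parquet` covers the Spark part-file case:

```python
import glob
import os
import tempfile

import pandas as pd


def read_glob(pattern, reader=pd.read_csv, **kwargs):
    """Expand a glob pattern and concatenate the matching files.

    `reader` can be any per-file loader; with a parquet engine
    installed, pass pd.read_parquet to read e.g. a directory of
    part files written by Spark.
    """
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return pd.concat((reader(p, **kwargs) for p in paths), ignore_index=True)


# Demo with CSV part files so the sketch runs without a parquet engine.
tmpdir = tempfile.mkdtemp()
for i, body in enumerate(["a,b\n1,2\n", "a,b\n3,4\n"]):
    with open(os.path.join(tmpdir, f"part-{i}.csv"), "w") as f:
        f.write(body)

df = read_glob(os.path.join(tmpdir, "part-*.csv"))
```

`read_glob("data/part-*.parquet", reader=pd.read_parquet)` would then read a Spark output directory, assuming pyarrow or fastparquet is installed.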
>> > >> > As for the read_parquet item, I will either attach to an existing issue or open a new one so discussion can happen on github. >> > >> > Thanks. >> > >> > On Saturday, May 25, 2019, Joris Van den Bossche wrote: >> >> >> >> You are certainly welcome to sprint those days as well. I only can't promise that we can do any things to improve remote participation such as video meetings (but there should be a bunch of core devs active, which might give faster feedback on PRs if needed). >> >> >> >> Op di 14 mei 2019 om 22:59 schreef Chang She : >> >>> >> >>> I'll be out of the country during those dates. Can I still join in remotely? >> >>> >> >>> Here's what I'd be interested in working on if there's appetite for these to be part of pandas and friends: >> >>> >> >>> 1. A stale PR on Series.explode I haven't had any time to finish up (https://github.com/pandas-dev/pandas/pull/24366). >> >> >> >> >> >> I think there is still certainly interest in such a function (I have regularly needed something like that myself). >> >> >> >>> >> >>> 2. Open sourcing an improvement to the pandas-redshift connector that speeds up the ingestion of medium amounts of data using a combination of unload + read_csv + multiprocessing. >> >> >> >> >> >> That reminds me: it might be good to mention the pandas-redshift packages somewhere in the docs (in the ecosystem page or in the sql docs). >> >> >> >>> >> >>> 3. A minor improvement to allow read_parquet to work with globs directly. This makes it a lot easier for pandas to read parquet generated by Spark. >> >>> >> >> Do you know if there is already an open issue about this? >> >> >> >> Best, >> >> Joris >> >> >> >>> >> >>> >> >>> >> >>> On Tue, May 14, 2019 at 4:50 AM Joris Van den Bossche wrote: >> >>>> >> >>>> Dear all, >> >>>> >> >>>> We are planning to do a pandas sprint end of June in Nashville (Tennessee, USA): June 27-30. 
We will be meeting with some of the core devs (so not a sprint to jump-start newcomers in this case), but sending this to the mailing list to invite other pandas (or related libraries) contributors. >> >>>> The exact planning of the sprint still needs to be discussed, but we will probably be hacking and discussing on pandas, extension arrays, next versions of pandas, etc. >> >>>> >> >>>> So if you are interested, let me know something! We want to keep the number of participants somewhat limited, and also need to plan the location and funding, so please state your interest before May 30. >> >>>> If you would like to participate, but not sure if you would fit at such a sprint, don't hesitate to mail me personally. >> >>>> >> >>>> Best, >> >>>> Joris >> >>>> >> >>>> >> >>>> _______________________________________________ >> >>>> Pandas-dev mailing list >> >>>> Pandas-dev at python.org >> >>>> https://mail.python.org/mailman/listinfo/pandas-dev >> > >> > _______________________________________________ >> > Pandas-dev mailing list >> > Pandas-dev at python.org >> > https://mail.python.org/mailman/listinfo/pandas-dev >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev > > > > -- > Matthew Roeschke > From jorisvandenbossche at gmail.com Fri Jun 7 12:53:24 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Fri, 7 Jun 2019 18:53:24 +0200 Subject: [Pandas-dev] Tidelift Message-ID: Hi all, We discussed this on the last dev chat, but putting it on the mailing list for those who were not present: we are planning to contact Tidelift to enter into a sponsor agreement for Pandas. The idea is to follow what NumPy (and recently also Scipy) did to have an agreement between Tidelift and NumFOCUS instead of an individual maintainer (see their announcement mail: https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html). 
A blog post with an overview of Tidelift: https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift. We haven't yet discussed what to do specifically with those funds; that is still to be decided. Cheers, Joris -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.ayd at icloud.com Sat Jun 8 16:54:08 2019 From: william.ayd at icloud.com (William Ayd) Date: Sat, 8 Jun 2019 16:54:08 -0400 Subject: [Pandas-dev] Tidelift In-Reply-To: References: Message-ID: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> What is the minimum amount we are asking for? The $1,000 a month for NumPy seems rather low, and I thought previous emails had something in the range of $3k a month. I don't think we necessarily need, or would be that much improved by, $12k per year, so I would rather aim higher if we are going to do this. > On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche wrote: > > Hi all, > > We discussed this on the last dev chat, but putting it on the mailing list for those who were not present: we are planning to contact Tidelift to enter into a sponsor agreement for Pandas. > > The idea is to follow what NumPy (and recently also Scipy) did to have an agreement between Tidelift and NumFOCUS instead of an individual maintainer (see their announcement mail: https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html ). > Blog with overview about Tidelift: https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift . > > We didn't discuss yet what to do specifically with those funds, that should still be discussed in the future. > > Cheers, > Joris > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev -------------- next part -------------- An HTML attachment was scrubbed...
URL: From andres.gutierrez.arcia at gmail.com Mon Jun 10 13:23:07 2019 From: andres.gutierrez.arcia at gmail.com (Andrés Gutiérrez) Date: Mon, 10 Jun 2019 11:23:07 -0600 Subject: [Pandas-dev] Panel on 0.25 Message-ID: Hi, what are the plans for Panel objects for the next release, pandas 0.25? Are you going to keep it or definitely drop it? If so, are you going to move it to a separate Python package? Thanks! -- *| Andrés Gutiérrez Arcia* *| andres.gutierrez.arcia at gmail.com * | https://github.com/andrsGutirrz | https://www.linkedin.com/in/andres-gutierrez-arcia -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Mon Jun 10 14:01:31 2019 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 10 Jun 2019 13:01:31 -0500 Subject: [Pandas-dev] Pandas dev sprint June 27-30 @ Nashville In-Reply-To: References: Message-ID: To confirm, the location for the sprints is in Downtown Nashville near the state capitol building. Specific details about the location will be shared non-publicly with sprint participants. On Tue, Jun 4, 2019 at 7:50 AM Wes McKinney wrote: > > Yes, I'd recommend within reach of East Nashville or Downtown > Nashville. The earlier you can book accommodations the better! > > I don't have an accurate headcount yet so I haven't been able to > arrange the location to meet yet. > > On Tue, Jun 4, 2019 at 12:59 AM Matthew Roeschke > wrote: > > > > Wes, any tips where in Nashville we should book accommodations or > generally where the co-working space would be? I'm looking to book my > accommodations fairly soon. > > > > Thanks. > > > > On Thu, May 30, 2019 at 10:57 AM Wes McKinney wrote: > >> > >> @Joris if you have a complete headcount for the meeting please let me > >> know so I can work on securing a space for us to work. Just to confirm > >> it's the 27th through the 30th inclusive, so 4 full days of workspace > >> required?
> >> > >> Thanks > >> > >> On Sat, May 25, 2019 at 3:44 PM Chang She wrote: > >> > > >> > Oh this was more me just carving time out as a forcing function for myself. I?ll be +13 hour ahead so certainly not expecting video conferences. Code reviews would be appreciated. > >> > > >> > As for the read_parquet item, I will either attach to an existing issue or open a new one so discussion can happen on github. > >> > > >> > Thanks. > >> > > >> > On Saturday, May 25, 2019, Joris Van den Bossche wrote: > >> >> > >> >> You are certainly welcome to sprint those days as well. I only can't promise that we can do any things to improve remote participation such as video meetings (but there should be a bunch of core devs active, which might give faster feedback on PRs if needed). > >> >> > >> >> Op di 14 mei 2019 om 22:59 schreef Chang She : > >> >>> > >> >>> I'll be out of the country during those dates. Can I still join in remotely? > >> >>> > >> >>> Here's what I'd be interested in working on if there's appetite for these to be part of pandas and friends: > >> >>> > >> >>> 1. A stale PR on Series.explode I haven't had any time to finish up (https://github.com/pandas-dev/pandas/pull/24366). > >> >> > >> >> > >> >> I think there is still certainly interest in such a function (I have regularly needed something like that myself). > >> >> > >> >>> > >> >>> 2. Open sourcing an improvement to the pandas-redshift connector that speeds up the ingestion of medium amounts of data using a combination of unload + read_csv + multiprocessing. > >> >> > >> >> > >> >> That reminds me: it might be good to mention the pandas-redshift packages somewhere in the docs (in the ecosystem page or in the sql docs). > >> >> > >> >>> > >> >>> 3. A minor improvement to allow read_parquet to work with globs directly. This makes it a lot easier for pandas to read parquet generated by Spark. > >> >>> > >> >> Do you know if there is already an open issue about this? 
> >> >> > >> >> Best, > >> >> Joris > >> >> > >> >>> > >> >>> > >> >>> > >> >>> On Tue, May 14, 2019 at 4:50 AM Joris Van den Bossche wrote: > >> >>>> > >> >>>> Dear all, > >> >>>> > >> >>>> We are planning to do a pandas sprint end of June in Nashville (Tennessee, USA): June 27-30. We will be meeting with some of the core devs (so not a sprint to jump-start newcomers in this case), but sending this to the mailing list to invite other pandas (or related libraries) contributors. > >> >>>> The exact planning of the sprint still needs to be discussed, but we will probably be hacking and discussing on pandas, extension arrays, next versions of pandas, etc. > >> >>>> > >> >>>> So if you are interested, let me know something! We want to keep the number of participants somewhat limited, and also need to plan the location and funding, so please state your interest before May 30. > >> >>>> If you would like to participate, but not sure if you would fit at such a sprint, don't hesitate to mail me personally. > >> >>>> > >> >>>> Best, > >> >>>> Joris > >> >>>> > >> >>>> > >> >>>> _______________________________________________ > >> >>>> Pandas-dev mailing list > >> >>>> Pandas-dev at python.org > >> >>>> https://mail.python.org/mailman/listinfo/pandas-dev > >> > > >> > _______________________________________________ > >> > Pandas-dev mailing list > >> > Pandas-dev at python.org > >> > https://mail.python.org/mailman/listinfo/pandas-dev > >> _______________________________________________ > >> Pandas-dev mailing list > >> Pandas-dev at python.org > >> https://mail.python.org/mailman/listinfo/pandas-dev > > > > > > > > -- > > Matthew Roeschke > > From tom.augspurger88 at gmail.com Mon Jun 10 14:03:15 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Mon, 10 Jun 2019 13:03:15 -0500 Subject: [Pandas-dev] Panel on 0.25 In-Reply-To: References: Message-ID: Panel will be removed in 0.25 (it's gone on master). No plans to move the functionality elsewhere. 
I'd recommend pinning to 0.24.x if you need Panel. On Mon, Jun 10, 2019 at 12:52 PM Andrés Gutiérrez < andres.gutierrez.arcia at gmail.com> wrote: > Hi, > > what are the plans for Panel objects for the next release, pandas 0.25? > Are you going to keep it or definitely drop it? If so, are you going to > move it to a separate Python package? > > Thanks! > > -- > *| Andrés Gutiérrez Arcia* > *| andres.gutierrez.arcia at gmail.com * > | https://github.com/andrsGutirrz > | https://www.linkedin.com/in/andres-gutierrez-arcia > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Mon Jun 10 14:08:08 2019 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 10 Jun 2019 13:08:08 -0500 Subject: [Pandas-dev] Panel on 0.25 In-Reply-To: References: Message-ID: Note that if there are community members who wish to fork the Panel code and maintain it as a separate project, you are free to do so. The consensus of the core team has been for some time to no longer invest energy in maintaining this code. If you were going to go down this road, it might be worth questioning whether you'd be better off contributing to xarray instead, or building a composite data structure that uses xarray under the hood. On Mon, Jun 10, 2019 at 1:03 PM Tom Augspurger wrote: > > Panel will be removed in 0.25 (it's gone on master). > > No plans to move the functionality elsewhere. I'd recommend pinning to 0.24.x if you need Panel. > > On Mon, Jun 10, 2019 at 12:52 PM Andrés Gutiérrez wrote: >> >> Hi, >> >> what are the plans for Panel objects for the next release, pandas 0.25? >> Are you going to keep it or definitely drop it? If so, are you going to move it to a separate Python package? >> >> Thanks!
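One common Panel replacement, which also comes up later in this thread, is a DataFrame with a MultiIndex: concatenate one 2-D frame per former Panel "item" with keys. A hedged sketch of that pattern, with invented field/ticker names (xarray's Dataset is the other natural target):

```python
import pandas as pd

# Panel-like data: one DataFrame per "item" (names are made up here).
frames = {
    "open": pd.DataFrame([[0.0, 1.0], [2.0, 3.0]],
                         index=["2019-01-01", "2019-01-02"],
                         columns=["AAPL", "MSFT"]),
    "close": pd.DataFrame([[4.0, 5.0], [6.0, 7.0]],
                          index=["2019-01-01", "2019-01-02"],
                          columns=["AAPL", "MSFT"]),
}

# Concatenate into one DataFrame with a (field, date) MultiIndex.
panel_like = pd.concat(frames)
panel_like.index.names = ["field", "date"]

# Selecting one field recovers a 2-D frame, like panel["open"] used to.
open_frame = panel_like.xs("open", level="field")
```

The same data can also be kept with dates as the outer level, or moved to `xarray.Dataset` if true N-dimensional labeled selection is needed.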
>> >> -- >> | Andrés Gutiérrez Arcia >> | andres.gutierrez.arcia at gmail.com >> | https://github.com/andrsGutirrz >> | https://www.linkedin.com/in/andres-gutierrez-arcia >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev From andres.gutierrez.arcia at gmail.com Mon Jun 10 14:26:12 2019 From: andres.gutierrez.arcia at gmail.com (Andrés Gutiérrez) Date: Mon, 10 Jun 2019 12:26:12 -0600 Subject: [Pandas-dev] Panel on 0.25 In-Reply-To: References: Message-ID: Thank you guys! I was asking because I'm dealing with legacy code that uses Panel, and there are a lot of warnings! I'm using a MultiIndex DataFrame to avoid Panel! Thanks for your help! On Mon, 10 Jun 2019 at 12:08, Wes McKinney (wesmckinn at gmail.com) wrote: > Note that if there are community members who wish to fork the Panel > code and maintain it as a separate project, you are free to do so. The > consensus of the core team has been for some time to no longer invest > energy in maintaining this code. > > If you were going to go down this road, it might be worth questioning > whether you'd be better off contributing to xarray instead, or > building a composite data structure that uses xarray under the hood > > On Mon, Jun 10, 2019 at 1:03 PM Tom Augspurger > wrote: > > > > Panel will be removed in 0.25 (it's gone on master). > > > > No plans to move the functionality elsewhere. I'd recommend pinning to > 0.24.x if you need Panel. > > > > On Mon, Jun 10, 2019 at 12:52 PM Andrés Gutiérrez < > andres.gutierrez.arcia at gmail.com> wrote: > >> > >> Hi, > >> > >> what are the plans for Panel objects for the next release, pandas 0.25? > >> Are you going to keep it or definitely drop it?
If so, are you going to > move it to a separate Python package? > >> > >> Thanks! > >> > >> -- > >> | Andrés Gutiérrez Arcia > >> | andres.gutierrez.arcia at gmail.com > >> | https://github.com/andrsGutirrz > >> | https://www.linkedin.com/in/andres-gutierrez-arcia > >> _______________________________________________ > >> Pandas-dev mailing list > >> Pandas-dev at python.org > >> https://mail.python.org/mailman/listinfo/pandas-dev > > > > _______________________________________________ > > Pandas-dev mailing list > > Pandas-dev at python.org > > https://mail.python.org/mailman/listinfo/pandas-dev > -- *| Andrés Gutiérrez Arcia* *| Student, UNA-CR* *| andres.gutierrez.arcia at gmail.com * | https://github.com/andrsGutirrz | https://www.linkedin.com/in/andres-gutierrez-arcia *| 61688613* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbrockmendel at gmail.com Mon Jun 10 19:56:13 2019 From: jbrockmendel at gmail.com (Brock Mendel) Date: Mon, 10 Jun 2019 18:56:13 -0500 Subject: [Pandas-dev] Pandas dev sprint June 27-30 @ Nashville In-Reply-To: References: Message-ID: Is there an agenda document somewhere? On Mon, Jun 10, 2019 at 1:02 PM Wes McKinney wrote: > To confirm, the location for the sprints is in Downtown Nashville near > the state capitol building. Specific details about the location will > be shared non-publicly with sprint participants > > On Tue, Jun 4, 2019 at 7:50 AM Wes McKinney wrote: > > > > Yes, I'd recommend within reach of East Nashville or Downtown > > Nashville. The earlier you can book accommodations the better! > > > > I don't have an accurate headcount yet so I haven't been able to > > arrange the location to meet yet. > > > > On Tue, Jun 4, 2019 at 12:59 AM Matthew Roeschke > > wrote: > > > > > > Wes, any tips where in Nashville we should book accommodations or > generally where the co-working space would be? I'm looking to book my > accommodations fairly soon. > > > > > > Thanks.
> > > > > > On Thu, May 30, 2019 at 10:57 AM Wes McKinney > wrote: > > >> > > >> @Joris if you have a complete headcount for the meeting please let me > > >> know so I can work on securing a space for us to work. Just to confirm > > >> it's the 27th through the 30th inclusive, so 4 full days of workspace > > >> required? > > >> > > >> Thanks > > >> > > >> On Sat, May 25, 2019 at 3:44 PM Chang She wrote: > > >> > > > >> > Oh this was more me just carving time out as a forcing function for > myself. I?ll be +13 hour ahead so certainly not expecting video > conferences. Code reviews would be appreciated. > > >> > > > >> > As for the read_parquet item, I will either attach to an existing > issue or open a new one so discussion can happen on github. > > >> > > > >> > Thanks. > > >> > > > >> > On Saturday, May 25, 2019, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > > >> >> > > >> >> You are certainly welcome to sprint those days as well. I only > can't promise that we can do any things to improve remote participation > such as video meetings (but there should be a bunch of core devs active, > which might give faster feedback on PRs if needed). > > >> >> > > >> >> Op di 14 mei 2019 om 22:59 schreef Chang She : > > >> >>> > > >> >>> I'll be out of the country during those dates. Can I still join > in remotely? > > >> >>> > > >> >>> Here's what I'd be interested in working on if there's appetite > for these to be part of pandas and friends: > > >> >>> > > >> >>> 1. A stale PR on Series.explode I haven't had any time to finish > up (https://github.com/pandas-dev/pandas/pull/24366). > > >> >> > > >> >> > > >> >> I think there is still certainly interest in such a function (I > have regularly needed something like that myself). > > >> >> > > >> >>> > > >> >>> 2. Open sourcing an improvement to the pandas-redshift connector > that speeds up the ingestion of medium amounts of data using a combination > of unload + read_csv + multiprocessing. 
> > >> >> > > >> >> > > >> >> That reminds me: it might be good to mention the pandas-redshift > packages somewhere in the docs (in the ecosystem page or in the sql docs). > > >> >> > > >> >>> > > >> >>> 3. A minor improvement to allow read_parquet to work with globs > directly. This makes it a lot easier for pandas to read parquet generated > by Spark. > > >> >>> > > >> >> Do you know if there is already an open issue about this? > > >> >> > > >> >> Best, > > >> >> Joris > > >> >> > > >> >>> > > >> >>> > > >> >>> > > >> >>> On Tue, May 14, 2019 at 4:50 AM Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > > >> >>>> > > >> >>>> Dear all, > > >> >>>> > > >> >>>> We are planning to do a pandas sprint end of June in Nashville > (Tennessee, USA): June 27-30. We will be meeting with some of the core devs > (so not a sprint to jump-start newcomers in this case), but sending this to > the mailing list to invite other pandas (or related libraries) contributors. > > >> >>>> The exact planning of the sprint still needs to be discussed, > but we will probably be hacking and discussing on pandas, extension arrays, > next versions of pandas, etc. > > >> >>>> > > >> >>>> So if you are interested, let me know something! We want to keep > the number of participants somewhat limited, and also need to plan the > location and funding, so please state your interest before May 30. > > >> >>>> If you would like to participate, but not sure if you would fit > at such a sprint, don't hesitate to mail me personally. 
> > >> >>>> > > >> >>>> Best, > > >> >>>> Joris > > >> >>>> > > >> >>>> > > >> >>>> _______________________________________________ > > >> >>>> Pandas-dev mailing list > > >> >>>> Pandas-dev at python.org > > >> >>>> https://mail.python.org/mailman/listinfo/pandas-dev > > >> > > > >> > _______________________________________________ > > >> > Pandas-dev mailing list > > >> > Pandas-dev at python.org > > >> > https://mail.python.org/mailman/listinfo/pandas-dev > > >> _______________________________________________ > > >> Pandas-dev mailing list > > >> Pandas-dev at python.org > > >> https://mail.python.org/mailman/listinfo/pandas-dev > > > > > > > > > > > > -- > > > Matthew Roeschke > > > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Tue Jun 11 04:15:30 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Tue, 11 Jun 2019 10:15:30 +0200 Subject: [Pandas-dev] Tidelift In-Reply-To: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: The current page about pandas ( https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a month (but I am not fully sure whether this is what is already available from their current subscribers, or whether it is a prospect). On Sat, 8 Jun 2019 at 22:54, William Ayd wrote: > What is the minimum amount we are asking for? The $1,000 a month for NumPy > seems rather low and I thought previous emails had something in the range > of $3k a month.
> > I don't think we necessarily need or would be that much improved by $12k > per year so would rather aim higher if we are going to do this > > On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > > Hi all, > > We discussed this on the last dev chat, but putting it on the mailing list > for those who were not present: we are planning to contact Tidelift to > enter into a sponsor agreement for Pandas. > > The idea is to follow what NumPy (and recently also Scipy) did to have an > agreement between Tidelift and NumFOCUS instead of an individual maintainer > (see their announcement mail: > https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html > ). > Blog with overview about Tidelift: > https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift > . > > We didn't discuss yet what to do specifically with those funds, that > should still be discussed in the future. > > Cheers, > Joris > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Jun 11 04:44:38 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 11 Jun 2019 10:44:38 +0200 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > The current page about pandas ( > https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a > month (but I am not fully sure this is what is already available from their > current subscribers, or if it is a prospect). > It's not just a prospect, that's what you should/will get. NumPy and SciPy get the listed amounts too. Agreed that the NumPy amount is not that much.
The amount gets determined automatically; it's some combination of customer interest, dependency analysis and size of the API surface. The current amounts are: NumPy: $1000 SciPy: $2500 Pandas: $3000 Matplotlib: n.a. Scikit-learn: $1500 Scikit-image: $50 Statsmodels: $50 So there's an element of randomness, but the results are not completely surprising I think. The four libraries that get order thousands of dollars are the ones that large corporations are going to have the highest interest in. Cheers, Ralf > > Op za 8 jun. 2019 om 22:54 schreef William Ayd : > >> What is the minimum amount we are asking for? The $1,000 a month for >> NumPy seems rather low and I thought previous emails had something in the >> range of $3k a month. >> >> I don?t think we necessarily need or would be that much improved by $12k >> per year so would rather aim higher if we are going to do this >> >> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche < >> jorisvandenbossche at gmail.com> wrote: >> >> Hi all, >> >> We discussed this on the last dev chat, but putting it on the mailing >> list for those who were not present: we are planning to contact Tidelift to >> enter into a sponsor agreement for Pandas. >> >> The idea is to follow what NumPy (and recently also Scipy) did to have an >> agreement between Tidelift and NumFOCUS instead of an individual maintainer >> (see their announcement mail: >> https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html >> ). >> Blog with overview about Tidelift: https://blog.tidelift >> .com/how-to-start-earning-money-for-your-open-source-project-with- >> tidelift. >> >> We didn't discuss yet what to do specifically with those funds, that >> should still be discussed in the future. 
From william.ayd at icloud.com Tue Jun 11 08:57:49 2019
From: william.ayd at icloud.com (William Ayd)
Date: Tue, 11 Jun 2019 08:57:49 -0400
Subject: [Pandas-dev] Tidelift
In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com>
Message-ID:

Just some counterpoints to consider:

- $3,000 a month isn't really that much, and if it's just a number that a
well-funded company chose for us, chances are they are benefiting from it
way more than we are
- There is no such thing as free money; we have to consider how to account
for and actually manage it (perhaps mitigated somewhat by NumFOCUS)
- Advertising and ties to a corporate sponsorship may weaken the brand of
pandas; at that point we may lose some credibility as open source
volunteers
- We don't (AFAIK) have a plan on how to spend or allocate it

Not totally against it, but perhaps the last point above is the main
sticking one. Do we have any idea how much we'd actually pocket out of the
$3k they offer us and subsequently what we would do with it? Cover travel
expenses? Support PyData conferences? Scholarships?

- Will

> On Jun 11, 2019, at 4:44 AM, Ralf Gommers wrote:
>
> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche wrote:
>
>> The current page about pandas (https://tidelift.com/lifter/search/pypi/pandas)
>> mentions $3,000 a month (but I am not fully sure whether this is what is
>> already available from their current subscribers, or if it is a prospect).
>
> It's not just a prospect; that's what you should/will get. NumPy and SciPy
> get the listed amounts too.
From tom.augspurger88 at gmail.com Tue Jun 11 09:03:15 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 11 Jun 2019 08:03:15 -0500
Subject: [Pandas-dev] Tidelift
In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com>
Message-ID:

On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev <
pandas-dev at python.org> wrote:

> Just some counterpoints to consider:
>
> - $3,000 a month isn't really that much, and if it's just a number that a
> well-funded company chose for us, chances are they are benefiting from it
> way more than we are
> - There is no such thing as free money; we have to consider how to account
> for and actually manage it (perhaps mitigated somewhat by NumFOCUS)
>
Perhaps Ralf can share how this has gone for NumPy. I imagine it's not
too much work on their end, thanks to NumFOCUS.

> - Advertising and ties to a corporate sponsorship may weaken the brand of
> pandas; at that point we may lose some credibility as open source
> volunteers
>
Anecdotally, I don't think that's how the community views Tidelift. My
perception (from Twitter, blogs / comments) is that it's been well received.

> - We don't (AFAIK) have a plan on how to spend or allocate it
>
> Not totally against it, but perhaps the last point above is the main
> sticking one. Do we have any idea how much we'd actually pocket out of the
> $3k they offer us and subsequently what we would do with it? Cover travel
> expenses? Support PyData conferences? Scholarships?
Agreed that we should set a purpose for this money (though I have no
objection to collecting it while we set that dedicated purpose).
From ralf.gommers at gmail.com Tue Jun 11 09:32:40 2019
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 11 Jun 2019 15:32:40 +0200
Subject: [Pandas-dev] Tidelift
In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com>
Message-ID:

On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger wrote:

> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev <
> pandas-dev at python.org> wrote:
>
>> Just some counterpoints to consider:
>>
>> - $3,000 a month isn't really that much, and if it's just a number that
>> a well-funded company chose for us, chances are they are benefiting from
>> it way more than we are
>>
"It's not really that much" is something I don't agree with.
It doesn't employ someone, but it's enough to pay for things like developer
meetups, hiring an extra GSoC student if a good one happens to come along,
paying a web dev for a full redesign of the project website, etc. Each of
those things is in the $5,000 - $15,000 range, and it's _very_ nice to be
able to do them without having to look for funding first.

Tidelift is a small (now ~25 employees) company, by the way, and they have
a real understanding of the open source sustainability problem and seem
dedicated to helping fix it.

>> - There is no such thing as free money; we have to consider how to
>> account for and actually manage it (perhaps mitigated somewhat by NumFOCUS)
>>
> Perhaps Ralf can share how this has gone for NumPy. I imagine it's not
> too much work on their end, thanks to NumFOCUS.
>
NumFOCUS handles receiving the money and the associated admin. As the
project, you'll be responsible for the setup and ongoing tasks. For NumPy
and SciPy I have done those tasks. It's a fairly minimal amount of work:
https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed. The
main one was dealing with GitHub not recognizing our license, and you don't
have that issue for Pandas (it's reported correctly as BSD-3 in the UI at
https://github.com/pandas-dev/pandas).

So it's probably a day of work for one person to get familiar with the
interface, check dependencies, release streams, paste in release notes,
etc., and then ongoing maybe one or a couple of hours a month. So far it's
been a much more effective way of spending time than, for example, grant
writing.

>> - Advertising and ties to a corporate sponsorship may weaken the brand
>> of pandas; at that point we may lose some credibility as open source
>> volunteers
>>
> Anecdotally, I don't think that's how the community views Tidelift. My
> perception (from Twitter, blogs / comments) is that it's been well
> received.
>
Agree, the feedback I've seen is all quite positive.
>> - We don't (AFAIK) have a plan on how to spend or allocate it
>>
>> Not totally against it, but perhaps the last point above is the main
>> sticking one. Do we have any idea how much we'd actually pocket out of
>> the $3k they offer us and subsequently what we would do with it? Cover
>> travel expenses? Support PyData conferences? Scholarships?
>>
> Agreed that we should set a purpose for this money (though I have no
> objection to collecting it while we set that dedicated purpose).
>
For NumPy and SciPy we haven't earmarked the funds yet. It's nice to build
up a buffer first. One thing I'm thinking of is that we're participating in
Google Season of Docs and are getting more high-quality applicants than
Google will accept, so we could pay one or two tech writers from the funds.
Our website and high-level docs (tutorial, restructuring of all docs to
guide users better) sure could use it. :)

My abstract advice would be: pay for things that require money (like a dev
meeting) or that don't get done for free. Don't pay for writing code unless
the case is extremely compelling, because that'll be a drop in the bucket.

Cheers,
Ralf
From william.ayd at icloud.com Tue Jun 11 09:37:36 2019
From: william.ayd at icloud.com (William Ayd)
Date: Tue, 11 Jun 2019 09:37:36 -0400
Subject: [Pandas-dev] Tidelift
In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com>
Message-ID:

Great, thanks for the clarification, Tom and Ralf. All points make sense,
and it sounds like Tidelift is doing good things then. I'm on board if
everyone else is.

- Will
From jorisvandenbossche at gmail.com Tue Jun 11 09:41:40 2019
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Tue, 11 Jun 2019 15:41:40 +0200
Subject: [Pandas-dev] Tidelift
In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com>
Message-ID:

On Tue, Jun 11, 2019 at 15:31, Ralf Gommers wrote:

>> Anecdotally, I don't think that's how the community views Tidelift. My
>> perception (from Twitter, blogs / comments) is that it's been well
>> received.
>>
> Agree, the feedback I've seen is all quite positive.
>
Additionally, I don't think there is any "advertisement" involved, at least
not in the classical sense of adding ads for third-party companies in a
sidebar on our website for which we get money. Of course we will need to
mention Tidelift in some way, e.g. in our sponsors / institutional partners
section, but we already do that for some other companies as well (that
employ core devs).

>>> - We don't (AFAIK) have a plan on how to spend or allocate it
>>>
>> Agreed that we should set a purpose for this money (though I have no
>> objection to collecting it while we set that dedicated purpose).
>>
Indeed, we need to discuss this, but I don't think we already need to know
*exactly* what we want to do with it before setting up a contract with
Tidelift. It's good to already start discussing it now, but maybe in a
separate thread?
From wesmckinn at gmail.com Tue Jun 11 10:09:23 2019
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 11 Jun 2019 09:09:23 -0500
Subject: [Pandas-dev] Tidelift
In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com>
Message-ID:

Personally, I would recommend putting most of the money in your own
pockets. The whole idea of Tidelift (as I understand it) is for the
individuals doing work that is of importance to project users (to whom
Tidelift is providing indemnification and "insurance" against defects) to
get paid for their labor. So I think the most honest way to use the money
is to put it in your respective bank accounts. If you're getting a little
bit of money to spend on yourself, doesn't that make doing the maintenance
work a bit less thankless?

If you don't pay yourselves, I think it actually "breaks" Tidelift's pitch
to customers, which is that open source projects need to have a higher
fraction of compensated maintenance and support work than they do now.

How you allocate the money to each other is something you can debate
privately.

On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche wrote:
Each of those things is in the $5,000 - $15,000 range, and it's _very_ nice to be able to do them without having to look for funding first.
>>
>> Tidelift is a small (now ~25 employees) company, by the way, and they have a real understanding of the open source sustainability issues and seem dedicated to helping fix them.
>>
>>>> - There is no such thing as free money; we have to consider how to account for and actually manage it (perhaps mitigated somewhat by NumFOCUS)
>>>
>>> Perhaps Ralf can share how this has gone for NumPy. I imagine it's not too much work on their end, thanks to NumFOCUS.
>>
>> NumFOCUS handles receiving the money and the associated admin. As the project, you'll be responsible for the setup and ongoing tasks. For NumPy and SciPy I have done those tasks. It's a fairly minimal amount of work: https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed. The main one was dealing with GitHub not recognizing our license, and you don't have that issue for Pandas (it's reported correctly as BSD-3 in the UI at https://github.com/pandas-dev/pandas).
>>
>> So it's probably a day of work for one person, to get familiar with the interface, check dependencies and release streams, paste in release notes, etc. And then ongoing maybe one or a couple of hours a month. So far it's been a much more effective way of spending time than, for example, grant writing.
>>
>>>> - Advertising and ties to a corporate sponsorship may weaken the brand of pandas; at that point we may lose some credibility as open source volunteers
>>>
>>> Anecdotally, I don't think that's how the community views Tidelift. My perception (from Twitter, blogs / comments) is that it's been well received.
>>
>> Agree, the feedback I've seen is all quite positive.
>
> Additionally, I don't think there is any "advertisement" involved, at least not in the classical sense of adding ads for third-party companies in a sidebar of our website for which we get money.
Of course we will need to mention Tidelift in some way, e.g. in our sponsors / institutional partners section, but we already do that for some other companies as well (that employ core devs).
>
>>>> - We don't (AFAIK) have a plan on how to spend or allocate it
>>>>
>>>> Not totally against it, but perhaps the last point above is the main sticking one. Do we have any idea how much we'd actually pocket out of the $3k they offer us, and subsequently what we would do with it? Cover travel expenses? Support PyData conferences? Scholarships?
>>>
>>> Agreed that we should set a purpose for this money (though I have no objection to collecting while we set that dedicated purpose).
>
> Indeed we need to discuss this, but I don't think we need to know *exactly* what we want to do with it before setting up a contract with Tidelift. It's good to already start discussing it now, but maybe in a separate thread?
>
>> For NumPy and SciPy we haven't earmarked the funds yet. It's nice to build up a buffer first. One thing I'm thinking of is that we're participating in Google Season of Docs, and are getting more high quality applicants than Google will accept. So we could pay one or two tech writers from the funds. Our website and high level docs (tutorial, restructuring of all docs to guide users better) sure could use it :)
>>
>> My abstract advice would be: pay for things that require money (like a dev meeting) or don't get done for free. Don't pay for writing code unless the case is extremely compelling, because that'll be a drop in the bucket.
>>
>> Cheers,
>> Ralf
>
>>>> - Will
>>>>
>>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers wrote:
>>>>
>>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche wrote:
>>>>>
>>>>> The current page about pandas (https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a month (but I am not fully sure whether this is what is already available from their current subscribers, or if it is a prospect).
>>>>
>>>> It's not just a prospect; that's what you should/will get. NumPy and SciPy get the listed amounts too.
>>>>
>>>> Agreed that the NumPy amount is not that much. The amount gets determined automatically; it's some combination of customer interest, dependency analysis, and size of the API surface.
>>>>
>>>> The current amounts are:
>>>> NumPy: $1000
>>>> SciPy: $2500
>>>> Pandas: $3000
>>>> Matplotlib: n.a.
>>>> Scikit-learn: $1500
>>>> Scikit-image: $50
>>>> Statsmodels: $50
>>>>
>>>> So there's an element of randomness, but the results are not completely surprising, I think. The four libraries that get on the order of thousands of dollars are the ones that large corporations are going to have the highest interest in.
>>>>
>>>> Cheers,
>>>> Ralf
>>>>
>>>>> On Sat, Jun 8, 2019 at 22:54, William Ayd wrote:
>>>>>>
>>>>>> What is the minimum amount we are asking for? The $1,000 a month for NumPy seems rather low, and I thought previous emails had something in the range of $3k a month.
>>>>>>
>>>>>> I don't think we necessarily need or would be that much improved by $12k per year, so would rather aim higher if we are going to do this.
>>>>>>
>>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche wrote:
>>>>>> [...]

_______________________________________________
Pandas-dev mailing list
Pandas-dev at python.org
https://mail.python.org/mailman/listinfo/pandas-dev

From wesmckinn at gmail.com Tue Jun 11 10:12:20 2019 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 11 Jun 2019 09:12:20 -0500 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID:

> How you allocate the money to each other is something you can debate privately

On this, I'm sure that you could set up a lightweight virtual "timesheet" so you can put yourselves "on the clock" when you're doing project maintenance work (there are many of these online, I just read about
https://www.clockspot.com/ recently) to make time reporting a bit more accurate

On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney wrote:
> [...]

_______________________________________________
Pandas-dev mailing list
Pandas-dev at python.org
https://mail.python.org/mailman/listinfo/pandas-dev

From andy.terrel at gmail.com Tue Jun 11 10:56:04 2019 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Tue, 11 Jun 2019 09:56:04 -0500 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID:

While the original lifter agreement was an individual contract, in our negotiations with Tidelift, NumFOCUS has explicitly sought a model that allows the project to split the money how they prefer.
This was always Tidelift's intention; it was just faster and easier to scale by focusing on paying individuals.

I do like the idea of paying for maintenance work; I would recommend we set up folks as contractors with NumFOCUS rather than just pocketing money. It will give a lot more legal protection. Then, if some folks don't want to take the cash, they can donate their time and be recognized as in-kind donations, which might have some tax deductions.

It is something I would volunteer to help manage, in order to learn how other projects might use the same techniques.

-- Andy

On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney wrote:
> [...]

_______________________________________________
Pandas-dev mailing list
Pandas-dev at python.org
https://mail.python.org/mailman/listinfo/pandas-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From william.ayd at icloud.com Tue Jun 11 11:54:40 2019 From: william.ayd at icloud.com (William Ayd) Date: Tue, 11 Jun 2019 11:54:40 -0400 Subject: [Pandas-dev] Pandas dev sprint June 27-30 @ Nashville In-Reply-To: References: Message-ID:

Doesn't sound like it, so I added a new section to our dev notes on Google Drive if it helps: https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit#

> On Jun 10, 2019, at 7:56 PM, Brock Mendel wrote:
>
> Is there an agenda document somewhere?
>
> On Mon, Jun 10, 2019 at 1:02 PM Wes McKinney wrote:
> To confirm, the location for the sprints is in Downtown Nashville near
> the state capitol building.
> Specific details about the location will be shared non-publicly with sprint participants.
>
> On Tue, Jun 4, 2019 at 7:50 AM Wes McKinney wrote:
> > Yes, I'd recommend within reach of East Nashville or Downtown
> > Nashville. The earlier you can book accommodations the better!
> >
> > I don't have an accurate headcount yet, so I haven't been able to
> > arrange the location yet.
> >
> > On Tue, Jun 4, 2019 at 12:59 AM Matthew Roeschke wrote:
> > > [...]
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com Tue Jun 11 12:18:19 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 11 Jun 2019 18:18:19 +0200 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID:

On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel wrote:
> While the original lifter agreement was an individual contract, in our
> negotiations with Tidelift, NumFOCUS has explicitly sought a model that
> allows the project to split the money how they prefer. This was always
> Tidelift's intention; it was just faster and easier to scale by focusing on
> paying individuals.

+1, the project deciding for themselves is the intent and a good principle.
> I do like the idea of paying for maintenance work, I would recommend we set > up folks as contractors with NumFOCUS rather than just pocketing money. It > will give a lot more legal protection. Then if some folks don't want to > take the cash, they can donate their time and be recognized as in-kind > donations, which might have some tax deductions. > Keep in mind that this has a lot of potential issues. Examples: 1. Who decides who gets paid, and how? The pandas repo has 1500+ contributors. Lots of potential for friction over small amounts of $. 2. Many people have employment contracts, those typically forbid contracting on the side. So inherently unfair to distribute only to those who are in a position to accept the money. 3. You're now introducing lots of extra paperwork and admin, both directly and indirectly (who wants to deal with the extra complications when filing your taxes?). 4. It may create other weird social dynamics. E.g. if money is now directly coupled to a commit bit, that makes the "who do we give commit rights and when" a potentially more loaded question. And, dividing it into N chunks, the funding becomes nice beer money and a thank you for volunteering. Could be exactly what you'd prefer as a team. But that's imho more in line with the current version of Patreon or GitHub Sponsors rather than with what Tidelift is aiming for. I'd like the idea of "paying for maintenance" if there were enough money to employ people. But realistically, that will take many years. The Tidelift slogan on this is unrealistic for a project like Pandas where maintenance effort is many FTEs; it's perhaps feasible for your typical Javascript library that's popular but small enough for one person maintaining it part-time. > It is something I would volunteer to help manage in order to learn how > other projects might use the same techniques. 
> > -- Andy > > On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney wrote: >> >> > How you allocate the money to each other is something you can debate >> privately >> >> On this, I'm sure that you could set up a lightweight virtual >> "timesheet" so you can put yourselves "on the clock" when you're doing >> project maintenance work (there are many of these online, I just read >> about https://www.clockspot.com/ recently) to make time reporting a >> bit more accurate >> >> On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney wrote: >> > >> > Personally, I would recommend putting most of the money in your own >> > pockets. The whole idea of Tidelift (as I understand it) is for the >> > individuals doing work that is of importance to project users (to whom >> > Tidelift is providing indemnification and "insurance" against defects) >> > Actually that's only partially true. Tidelift is paying for very specific things that allow them to do aggregated reporting on licensing, dependencies, security vulnerabilities, release streams & release docs, etc. - basically the stuff that helps large corporations do due diligence and management of a large software stack. It is explicitly out of scope to work on bugs or enhancements in the NumFOCUS-Tidelift agreement (and working on particular technical items was never their intention). So "insurance against defects" isn't part of this, except in a very abstract sense of making the project healthier and therefore reducing the risk of it being abandoned or a lot more buggy on the many-year time scale. Cheers, Ralf > to get paid for their labor. So I think the most honest way to use the >> > money is to put it in your respective bank accounts. If you're getting >> > a little bit of money to spend on yourself, doesn't that make doing >> > the maintenance work a bit less thankless? 
If you don't pay >> > yourselves, I think it actually "breaks" Tidelift's pitch to customers >> > which is that open source projects need to have a higher fraction of >> > compensated maintenance and support work than they do now. >> > >> > How you allocate the money to each other is something you can debate >> privately >> > >> > On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche >> > wrote: >> > > >> > > >> > > >> > > On Tue, Jun 11, 2019 at 15:31, Ralf Gommers < >> ralf.gommers at gmail.com> wrote: >> > >> >> > >> >> > >> >> > >> On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger < >> tom.augspurger88 at gmail.com> wrote: >> > >>> >> > >>> >> > >>> >> > >>> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev < >> pandas-dev at python.org> wrote: >> > >>>> >> > >>>> Just some counterpoints to consider: >> > >>>> >> > >>>> - $ 3,000 a month isn't really that much, and if it's just a >> number that a well-funded company chose for us chances are they are >> benefiting from it way more than we are >> > >> >> > >> >> > >> "it's not really that much" is something I don't agree with. It >> doesn't employ someone, but it's enough to pay for things like developer >> meetups, hiring an extra GSoC student if a good one happens to come along, >> paying a web dev for a full redesign of the project website, etc. Each of >> those things is in the $5,000 - $15,000 range, and it's _very_ nice to be >> able to do them without having to look for funding first. >> > >> >> > >> Tidelift is a small (now ~25 employees) company by the way, and they >> have a real understanding of the open source sustainability issues and seem >> dedicated to helping fix it. >> > >> >> > >>>> - There is no such thing as free money; we have to consider how to >> account for and actually manage it (perhaps mitigated somewhat by NumFocus) >> > >>> >> > >>> >> > >>> Perhaps Ralf can share how this has gone for NumPy. I imagine it's >> not too much work on their end, thanks to NumFOCUS. 
>> > >> >> > >> NumFOCUS handles receiving the money and associated admin. As the >> project you'll be responsible for the setup and ongoing tasks. For NumPy >> and SciPy I have done those tasks. It's a fairly minimal amount of work: >> https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed. The >> main one was dealing with GitHub not recognizing our license, and you don't >> have that issue for Pandas (it's reported correctly as BSD-3 in the UI at >> https://github.com/pandas-dev/pandas). >> > >> >> > >> So it's probably a day of work for one person, to get familiar with >> the interface, check dependencies, release streams, paste in release notes, >> etc. And then ongoing maybe one or a couple of hours a month. So far it's >> been a much more effective way of spending time than, for example, grant >> writing. >> > >> >> > >>> >> > >>>> >> > >>>> - Advertising and ties to a corporate sponsorship may weaken the >> brand of pandas; at that point we may lose some credibility as open >> source volunteers >> > >>> >> > >>> >> > >>> Anecdotally, I don't think that's how the community views Tidelift. >> My perception (from Twitter, blogs / comments) is that it's been well >> received. >> > >> >> > >> >> > >> Agree, the feedback I've seen is all quite positive. >> > > >> > > >> > > Additionally, I don't think there is any "advertisement" involved, at >> least not in the classical sense of adding ads for third-party companies >> in a sidebar to our website for which we get money. Of course we will need >> to mention Tidelift in some way, e.g. in our sponsors / institutional >> partners section, but we already do that for some other companies as well >> (that employ core devs). >> > > >> > >> >> > >> >> > >>> >> > >>>> >> > >>>> - We don't (AFAIK) have a plan on how to spend or allocate it >> > >>>> >> > >>>> Not totally against it but perhaps the last point above is the >> main sticking one. 
Do we have any idea how much we'd actually pocket out of >> the $ 3k they offer us and subsequently what we would do with it? Cover >> travel expenses? Support PyData conferences? Scholarships? >> > >>> >> > >>> >> > >>> Agreed that we should set a purpose for this money (though, I have >> no objection to collecting while we set that dedicated purpose). >> > >> >> > >> >> > > Indeed we need to discuss this, but I don't think we already need to >> know *exactly* what we want to do with it before setting up a contract with >> Tidelift. It's good to already start discussing it now, but maybe in >> a separate thread? >> > > >> > >> >> > >> For NumPy and SciPy we haven't earmarked the funds yet. It's nice to >> build up a buffer first. One thing I'm thinking of is that we're >> participating in Google Season of Docs, and are getting more high quality >> applicants than Google will accept. So we could pay one or two tech writers >> from the funds. Our website and high level docs (tutorial, restructuring of >> all docs to guide users better) sure could use it :) >> > >> >> > >> My abstract advice would be: pay for things that require money (like >> a dev meeting) or don't get done for free. Don't pay for writing code >> unless the case is extremely compelling, because that'll be a drop in the >> bucket. >> > >> >> > >> Cheers, >> > >> Ralf >> > >> >> > >> >> > >>> >> > >>>> >> > >>>> - Will >> > >>>> >> > >>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers >> wrote: >> > >>>> >> > >>>> >> > >>>> >> > >>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche < >> jorisvandenbossche at gmail.com> wrote: >> > >>>>> >> > >>>>> The current page about pandas ( >> https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a >> month (but I am not fully sure this is what is already available from their >> current subscribers, or if it is a prospect). >> > >>>> >> > >>>> >> > >>>> It's not just a prospect, that's what you should/will get. 
NumPy >> and SciPy get the listed amounts too. >> > >>>> >> > >>>> Agreed that the NumPy amount is not that much. The amount gets >> determined automatically; it's some combination of customer interest, >> dependency analysis and size of the API surface. >> > >>>> >> > >>>> The current amounts are: >> > >>>> NumPy: $1000 >> > >>>> SciPy: $2500 >> > >>>> Pandas: $3000 >> > >>>> Matplotlib: n.a. >> > >>>> Scikit-learn: $1500 >> > >>>> Scikit-image: $50 >> > >>>> Statsmodels: $50 >> > >>>> >> > >>>> So there's an element of randomness, but the results are not >> completely surprising I think. The four libraries that get on the order of thousands >> of dollars are the ones that large corporations are going to have the >> highest interest in. >> > >>>> >> > >>>> Cheers, >> > >>>> Ralf >> > >>>> >> > >>>>> >> > >>>>> >> > >>>>> On Sat, Jun 8, 2019 at 22:54, William Ayd < >> william.ayd at icloud.com> wrote: >> > >>>>>> >> > >>>>>> What is the minimum amount we are asking for? The $1,000 a month >> for NumPy seems rather low and I thought previous emails had something in >> the range of $3k a month. >> > >>>>>> >> > >>>>>> I don't think we necessarily need or would be that much improved >> by $12k per year so would rather aim higher if we are going to do this >> > >>>>>> >> > >>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche < >> jorisvandenbossche at gmail.com> wrote: >> > >>>>>> >> > >>>>>> Hi all, >> > >>>>>> >> > >>>>>> We discussed this on the last dev chat, but putting it on the >> mailing list for those who were not present: we are planning to contact >> Tidelift to enter into a sponsor agreement for Pandas. >> > >>>>>> >> > >>>>>> The idea is to follow what NumPy (and recently also Scipy) did >> to have an agreement between Tidelift and NumFOCUS instead of an individual >> maintainer (see their announcement mail: >> https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html >> ). 
>> > >>>>>> Blog with overview about Tidelift: >> https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift >> . >> > >>>>>> >> > >>>>>> We didn't discuss yet what to do specifically with those funds, >> that should still be discussed in the future. >> > >>>>>> >> > >>>>>> Cheers, >> > >>>>>> Joris >> > >>>>>> _______________________________________________ >> > >>>>>> Pandas-dev mailing list >> > >>>>>> Pandas-dev at python.org >> > >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev >> > >>>>>> >> > >>>>>> >> > >>>>> _______________________________________________ >> > >>>>> Pandas-dev mailing list >> > >>>>> Pandas-dev at python.org >> > >>>>> https://mail.python.org/mailman/listinfo/pandas-dev >> > >>>> >> > >>>> >> > >>>> _______________________________________________ >> > >>>> Pandas-dev mailing list >> > >>>> Pandas-dev at python.org >> > >>>> https://mail.python.org/mailman/listinfo/pandas-dev >> > >> >> > >> _______________________________________________ >> > >> Pandas-dev mailing list >> > >> Pandas-dev at python.org >> > >> https://mail.python.org/mailman/listinfo/pandas-dev >> > > >> > > _______________________________________________ >> > > Pandas-dev mailing list >> > > Pandas-dev at python.org >> > > https://mail.python.org/mailman/listinfo/pandas-dev >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wesmckinn at gmail.com Tue Jun 11 13:50:47 2019 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 11 Jun 2019 12:50:47 -0500 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: hi, On Tue, Jun 11, 2019 at 11:16 AM Ralf Gommers wrote: > > > > On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel wrote: >> >> While the original lifter agreement was an individual contract, in our negotiations with Tidelift, NumFOCUS has explicitly sought a model that allows the project to split the money how they prefer. This was always Tidelift's intention, it was just faster and easier to scale to focus on paying individuals. > > > +1 the project decides for themselves is the intent and a good principle. > >> >> I do like the idea of paying for maintenance work, I would recommend we set up folks as contractors with NumFOCUS rather than just pocketing money. It will give a lot more legal protection. Then if some folks don't want to take the cash, they can donate their time and be recognized as in-kind donations, which might have some tax deductions. > > > Keep in mind that this has a lot of potential issues. Examples: > 1. Who decides who gets paid, and how? The pandas repo has 1500+ contributors. Lots of potential for friction over small amounts of $. More or less the _entire_ point of Tidelift is to incentivize people to do more maintenance work. I think it's worth at least attempting to use this money for its intended economic purpose. The maintainers are, as a first approximation, the ~10-15 active core members listed on https://github.com/pandas-dev/pandas-governance IMHO those are the people that should get paid (going forward) -- if contributors are more motivated to become core team members / maintainers as a result of the Tidelift money, then it has had the desired outcome. > 2. Many people have employment contracts, those typically forbid contracting on the side. 
So inherently unfair to distribute only to those who are in a position to accept the money. This is true -- at least Jeff and maybe others fall into this category. In such cases their "cut" of the maintenance funds can go into the communal fund to pay for other stuff > 3. You're now introducing lots of extra paperwork and admin, both directly and indirectly (who wants to deal with the extra complications when filing your taxes?). Hopefully we're talking just a 1099 from NumFOCUS with a single number to type in, but I'm the wrong person to judge since my taxes are more complicated than most people's =) > 4. It may create other weird social dynamics. E.g. if money is now directly coupled to a commit bit, that makes the "who do we give commit rights and when" a potentially more loaded question. I think this is where the honest self-reporting of time spent comes in. The goal is to increase the average number of maintainer hours per month/year. It's sort of like a crypto-mining pool, but for open source software maintenance =) Obviously maintainers are accountable to the rest of the core team to behave with integrity (professionalism, honesty, etc.) or they can be voted to be removed if they are found to be dishonest. > > And, dividing it into N chunks, the funding becomes nice beer money and a thank you for volunteering. Could be exactly what you'd prefer as a team. But that's imho more in line with the current version of Patreon or GitHub Sponsors rather than with what Tidelift is aiming for. > > I'd like the idea of "paying for maintenance" if there were enough money to employ people. But realistically, that will take many years. The Tidelift slogan on this is unrealistic for a project like Pandas where maintenance effort is many FTEs; it's perhaps feasible for your typical Javascript library that's popular but small enough for one person maintaining it part-time. 
> >> It is something I would volunteer to help manage in order to learn how other projects might use the same techniques. >> >> -- Andy >> >> On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney wrote: >>> >>> > How you allocate the money to each other is something you can debate privately >>> >>> On this, I'm sure that you could set up a lightweight virtual >>> "timesheet" so you can put yourselves "on the clock" when you're doing >>> project maintenance work (there are many of these online, I just read >>> about https://www.clockspot.com/ recently) to make time reporting a >>> bit more accurate >>> >>> On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney wrote: >>> > >>> > Personally, I would recommend putting most of the money in your own >>> > pockets. The whole idea of Tidelift (as I understand it) is for the >>> > individuals doing work that is of importance to project users (to whom >>> > Tidelift is providing indemnification and "insurance" against defects) > > Actually that's only partially true. Tidelift is paying for very specific things that allow them to do aggregated reporting on licensing, dependencies, security vulnerabilities, release streams & release docs, etc. - basically the stuff that helps large corporations do due diligence and management of a large software stack. > > It is explicitly out of scope to work on bugs or enhancements in the NumFOCUS-Tidelift agreement (and working on particular technical items was never their intention). So "insurance against defects" isn't part of this, except in a very abstract sense of making the project healthier and therefore reducing the risk of it being abandoned or a lot more buggy on the many-year time scale. > > Cheers, > Ralf > > >>> > to get paid for their labor. So I think the most honest way to use the >>> > money is to put it in your respective bank accounts. If you're getting >>> > a little bit of money to spend on yourself, doesn't that make doing >>> > the maintenance work a bit less thankless? 
If you don't pay >>> > yourselves, I think it actually "breaks" Tidelift's pitch to customers >>> > which is that open source projects need to have a higher fraction of >>> > compensated maintenance and support work than they do now. >>> > >>> > How you allocate the money to each other is something you can debate privately >>> > >>> > On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche >>> > wrote: >>> > > >>> > > >>> > > >>> > > On Tue, Jun 11, 2019 at 15:31, Ralf Gommers wrote: >>> > >> >>> > >> >>> > >> >>> > >> On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger wrote: >>> > >>> >>> > >>> >>> > >>> >>> > >>> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev wrote: >>> > >>>> >>> > >>>> Just some counterpoints to consider: >>> > >>>> >>> > >>>> - $ 3,000 a month isn't really that much, and if it's just a number that a well-funded company chose for us chances are they are benefiting from it way more than we are >>> > >> >>> > >> >>> > >> "it's not really that much" is something I don't agree with. It doesn't employ someone, but it's enough to pay for things like developer meetups, hiring an extra GSoC student if a good one happens to come along, paying a web dev for a full redesign of the project website, etc. Each of those things is in the $5,000 - $15,000 range, and it's _very_ nice to be able to do them without having to look for funding first. >>> > >> >>> > >> Tidelift is a small (now ~25 employees) company by the way, and they have a real understanding of the open source sustainability issues and seem dedicated to helping fix it. >>> > >> >>> > >>>> - There is no such thing as free money; we have to consider how to account for and actually manage it (perhaps mitigated somewhat by NumFocus) >>> > >>> >>> > >>> >>> > >>> Perhaps Ralf can share how this has gone for NumPy. I imagine it's not too much work on their end, thanks to NumFOCUS. >>> > >> >>> > >> >>> > >> NumFOCUS handles receiving the money and associated admin. 
As the project you'll be responsible for the setup and ongoing tasks. For NumPy and SciPy I have done those tasks. It's a fairly minimal amount of work: https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed. The main one was dealing with GitHub not recognizing our license, and you don't have that issue for Pandas (it's reported correctly as BSD-3 in the UI at https://github.com/pandas-dev/pandas). >>> > >> >>> > >> So it's probably a day of work for one person, to get familiar with the interface, check dependencies, release streams, paste in release notes, etc. And then ongoing maybe one or a couple of hours a month. So far it's been a much more effective way of spending time than, for example, grant writing. >>> > >> >>> > >>> >>> > >>>> >>> > >>>> - Advertising and ties to a corporate sponsorship may weaken the brand of pandas; at that point we may lose some credibility as open source volunteers >>> > >>> >>> > >>> >>> > >>> Anecdotally, I don't think that's how the community views Tidelift. My perception (from Twitter, blogs / comments) is that it's been well received. >>> > >> >>> > >> >>> > >> Agree, the feedback I've seen is all quite positive. >>> > > >>> > > >>> > > Additionally, I don't think there is any "advertisement" involved, at least not in the classical sense of adding ads for third-party companies in a sidebar to our website for which we get money. Of course we will need to mention Tidelift in some way, e.g. in our sponsors / institutional partners section, but we already do that for some other companies as well (that employ core devs). >>> > > >>> > >> >>> > >> >>> > >>> >>> > >>>> >>> > >>>> - We don't (AFAIK) have a plan on how to spend or allocate it >>> > >>>> >>> > >>>> Not totally against it but perhaps the last point above is the main sticking one. Do we have any idea how much we'd actually pocket out of the $ 3k they offer us and subsequently what we would do with it? Cover travel expenses? Support PyData conferences? 
Scholarships? >>> > >>> >>> > >>> >>> > >>> Agreed that we should set a purpose for this money (though, I have no objection to collecting while we set that dedicated purpose). >>> > >> >>> > >> >>> > > Indeed we need to discuss this, but I don't think we already need to know *exactly* what we want to do with it before setting up a contract with Tidelift. It's good to already start discussing it now, but maybe in a separate thread? >>> > > >>> > >> >>> > >> For NumPy and SciPy we haven't earmarked the funds yet. It's nice to build up a buffer first. One thing I'm thinking of is that we're participating in Google Season of Docs, and are getting more high quality applicants than Google will accept. So we could pay one or two tech writers from the funds. Our website and high level docs (tutorial, restructuring of all docs to guide users better) sure could use it :) >>> > >> >>> > >> My abstract advice would be: pay for things that require money (like a dev meeting) or don't get done for free. Don't pay for writing code unless the case is extremely compelling, because that'll be a drop in the bucket. >>> > >> >>> > >> Cheers, >>> > >> Ralf >>> > >> >>> > >> >>> > >>> >>> > >>>> >>> > >>>> - Will >>> > >>>> >>> > >>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers wrote: >>> > >>>> >>> > >>>> >>> > >>>> >>> > >>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche wrote: >>> > >>>>> >>> > >>>>> The current page about pandas (https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a month (but I am not fully sure this is what is already available from their current subscribers, or if it is a prospect). >>> > >>>> >>> > >>>> >>> > >>>> It's not just a prospect, that's what you should/will get. NumPy and SciPy get the listed amounts too. >>> > >>>> >>> > >>>> Agreed that the NumPy amount is not that much. The amount gets determined automatically; it's some combination of customer interest, dependency analysis and size of the API surface. 
>>> > >>>> >>> > >>>> The current amounts are: >>> > >>>> NumPy: $1000 >>> > >>>> SciPy: $2500 >>> > >>>> Pandas: $3000 >>> > >>>> Matplotlib: n.a. >>> > >>>> Scikit-learn: $1500 >>> > >>>> Scikit-image: $50 >>> > >>>> Statsmodels: $50 >>> > >>>> >>> > >>>> So there's an element of randomness, but the results are not completely surprising I think. The four libraries that get on the order of thousands of dollars are the ones that large corporations are going to have the highest interest in. >>> > >>>> >>> > >>>> Cheers, >>> > >>>> Ralf >>> > >>>> >>> > >>>>> >>> > >>>>> >>> > >>>>> On Sat, Jun 8, 2019 at 22:54, William Ayd wrote: >>> > >>>>>> >>> > >>>>>> What is the minimum amount we are asking for? The $1,000 a month for NumPy seems rather low and I thought previous emails had something in the range of $3k a month. >>> > >>>>>> >>> > >>>>>> I don't think we necessarily need or would be that much improved by $12k per year so would rather aim higher if we are going to do this >>> > >>>>>> >>> > >>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche wrote: >>> > >>>>>> >>> > >>>>>> Hi all, >>> > >>>>>> >>> > >>>>>> We discussed this on the last dev chat, but putting it on the mailing list for those who were not present: we are planning to contact Tidelift to enter into a sponsor agreement for Pandas. >>> > >>>>>> >>> > >>>>>> The idea is to follow what NumPy (and recently also Scipy) did to have an agreement between Tidelift and NumFOCUS instead of an individual maintainer (see their announcement mail: https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html). >>> > >>>>>> Blog with overview about Tidelift: https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift. >>> > >>>>>> >>> > >>>>>> We didn't discuss yet what to do specifically with those funds, that should still be discussed in the future. 
>>> > >>>>>> >>> > >>>>>> Cheers, >>> > >>>>>> Joris >>> > >>>>>> _______________________________________________ >>> > >>>>>> Pandas-dev mailing list >>> > >>>>>> Pandas-dev at python.org >>> > >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev >>> > >>>>>> >>> > >>>>>> >>> > >>>>> _______________________________________________ >>> > >>>>> Pandas-dev mailing list >>> > >>>>> Pandas-dev at python.org >>> > >>>>> https://mail.python.org/mailman/listinfo/pandas-dev >>> > >>>> >>> > >>>> >>> > >>>> _______________________________________________ >>> > >>>> Pandas-dev mailing list >>> > >>>> Pandas-dev at python.org >>> > >>>> https://mail.python.org/mailman/listinfo/pandas-dev >>> > >> >>> > >> _______________________________________________ >>> > >> Pandas-dev mailing list >>> > >> Pandas-dev at python.org >>> > >> https://mail.python.org/mailman/listinfo/pandas-dev >>> > > >>> > > _______________________________________________ >>> > > Pandas-dev mailing list >>> > > Pandas-dev at python.org >>> > > https://mail.python.org/mailman/listinfo/pandas-dev >>> _______________________________________________ >>> Pandas-dev mailing list >>> Pandas-dev at python.org >>> https://mail.python.org/mailman/listinfo/pandas-dev From ralf.gommers at gmail.com Tue Jun 11 15:19:13 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 11 Jun 2019 21:19:13 +0200 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: On Tue, Jun 11, 2019 at 7:51 PM Wes McKinney wrote: > hi, > > On Tue, Jun 11, 2019 at 11:16 AM Ralf Gommers > wrote: > > > > > > > > On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel > wrote: > >> > >> While the original lifter agreement was an individual contract, in our > negotiations with Tidelift, NumFOCUS has explicitly sought a model that > allows the project to split the money how they prefer. 
This was always > Tidelift's intention, it was just faster and easier to scale to focus on > paying individuals. > > > > +1 the project decides for themselves is the intent and a good principle. > > > >> > >> I do like the idea of paying for maintenance work, I would recommend we > set up folks as contractors with NumFOCUS rather than just pocketing money. > It will give a lot more legal protection. Then if some folks don't want to > take the cash, they can donate their time and be recognized as in-kind > donations, which might have some tax deductions. > > > > > > Keep in mind that this has a lot of potential issues. Examples: > > 1. Who decides who gets paid, and how? The pandas repo has 1500+ > contributors. Lots of potential for friction over small amounts of $. > > More or less the _entire_ point of Tidelift is to incentivize people > to do more maintenance work. I think it's worth at least attempting to > use this money for its intended economic purpose. > You do realize that the other topics I suggested using money for are also maintenance work, right? Even if your sole goal is "get more maintenance work done", spending funds with purpose is likely to be more effective than just putting it in your own pockets. > The maintainers are, as a first approximation, the ~10-15 active core > members listed on > > https://github.com/pandas-dev/pandas-governance > > IMHO those are the people that should get paid (going forward) -- if > contributors are more motivated to become core team members / > maintainers as a result of the Tidelift money, then it has had the > desired outcome. > > > 2. Many people have employment contracts, those typically forbid > contracting on the side. So inherently unfair to distribute only to those > who are in a position to accept the money. > > This is true -- at least Jeff and maybe others fall into this > category. 
In such cases their "cut" of the maintenance funds can go > into the communal fund to pay for other stuff > That does not really seem fair. There must be better options. E.g. for anyone in such a position, you could use the money to pay for travel and hotel if they go to a dev meeting or conference. People can't accept money as income in many cases, but everyone is usually able to get cost reimbursements or accept a free ticket. And you don't have to pay income tax over that, so the $ goes a lot further. > > 3. You're now introducing lots of extra paperwork and admin, both > directly and indirectly (who wants to deal with the extra complications > when filing your taxes?). > > Hopefully we're talking just a 1099 from NumFOCUS with a single number > to type in, but I'm the wrong person to judge since my taxes are more > complicated than most people's =) > There's a lot more to it than typing in a single number if you go, e.g., from 1 to 2 sources of income. Also, NumFOCUS won't be able to give any kind of paperwork for people outside the US. It'll all be up to them to do correct tax reporting/withholding. --- tl;dr this is a complicated topic, it's worth thinking about and making informed choices that maximize the benefits and minimize the costs rather than a simple "just put it in your pockets". Now I'm not a Pandas dev, I just helped with getting the original NumFOCUS-Tidelift agreement in place and wanted to share my experiences with Tidelift. So I'll bow out here. Cheers, Ralf > > 4. It may create other weird social dynamics. E.g. if money is now > directly coupled to a commit bit, that makes the "who do we give commit > rights and when" a potentially more loaded question. > > I think this is where the honest self-reporting of time spent comes > in. The goal is to increase the average number of maintainer hours per > month/year. 
It's sort of like a crypto-mining pool, but for open > source software maintenance =) Obviously maintainers are accountable > to the rest of the core team to behave with integrity > (professionalism, honesty, etc.) or they can be voted to be removed if > they are found to be dishonest. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy.terrel at gmail.com Tue Jun 11 15:25:45 2019 From: andy.terrel at gmail.com (Andy Ray Terrel) Date: Tue, 11 Jun 2019 14:25:45 -0500 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: On Tue, Jun 11, 2019 at 12:51 PM Wes McKinney wrote: > hi, > > On Tue, Jun 11, 2019 at 11:16 AM Ralf Gommers > wrote: > > > > > > > > On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel > wrote: > >> > >> While the original lifter agreement was an individual contract, in our > negotiations with Tidelift, NumFOCUS has explicitly sought a model that > allows the project to split the money how they prefer. This was always > Tidelift's intention, it was just faster and easier to scale to focus on > paying individuals. > > > > > > +1 the project decides for themselves is the intent and a good principle. > > > >> > >> I do like the idea of paying for maintence work, I would recommend we > set up folks as contractors with NumFOCUS rather than just pocketing money. > It will give a lot more legal protection. Then if some folks don't want to > take the cash you they can donate their time and be recognized as in-kind > donations, which might have some tax deductions. > > > > > > Keep in mind that this has a lot of potential issues. Examples: > > 1. Who decides who gets paid, and how? The pandas repo has 1500+ > contributors. Lots of potential for friction over small amount of $. > > More or less the _entire_ point of Tidelift is to incentivize people > to do more maintenance work. 
I think it's worth at least attempting to > use this money for its intended economic purpose. > > The maintainers are, as a first approximation, the ~10-15 active core > members listed on > > https://github.com/pandas-dev/pandas-governance > > IMHO those are the people that should get paid (going forward) -- if > contributors are more motivated to become core team members / > maintainers as a result of the Tidelift money, then it has had the > desired outcome. > I would suggest leaving the decision to the project core team with the project Numfocus committee to be the overseer of the implementation. > > > 2. Many people have employment contracts, those typically forbid > contracting on the side. So inherently unfair to distribute only to those > who are in a position to accept the money. > > This is true -- at least Jeff and maybe others fall into this > category. In such cases their "cut" of the maintenance funds can go > into the communal fund to pay for other stuff > > Yes such accommodation will need to be worked out. > > 3. You're now introducing lots of extra paperwork and admin, both > directly and indirectly (who wants to deal with the extra complications > when filing your taxes?). > > Hopefully we're talking just a 1099 from NumFOCUS with a single number > to type in, but I'm the wrong person to judge since my taxes are more > complicated than most people's =) > Generally it is done that way for US based folks and for folks out of the US we tend to let them handle their own taxes. We would need to work that out. Additionally, as in all dealings with businesses, we do the extra paperwork for the other benefits such as limiting the liability of a maintainer. > > > 4. It may create other weird social dynamics. E.g. if money is now > directly coupled to a commit bit, that makes the "who do we give commit > rights and when" a potentially more loaded question. > > I think this is where the honest self-reporting of time spent comes > in. 
The goal is to increase the average number of maintainer hours per > month/year. It's sort of like a crypto-mining pool, but for open > source software maintenance =) Obviously maintainers are accountable > to the rest of the core team to behave with integrity > (professionalism, honesty, etc.) or they can be voted to be removed if > they are found to be dishonest. > > > > And, dividing it into N chunks, the funding becomes nice beer money and a > thank you for volunteering. Could be exactly what you'd prefer as a team. > But that's imho more in line with the current version of Patreon or GitHub > Sponsors rather than with what Tidelift is aiming for. > > > > I'd like the idea of "paying for maintenance" if there were enough money > to employ people. But realistically, that will take many years. The > Tidelift slogan on this is unrealistic for a project like Pandas where > maintenance effort is many FTEs; it's perhaps feasible for your typical > Javascript library that's popular but small enough for one person > maintaining it part-time. > > > >> > >> It is something I would volunteer to help manage in order to learn how > other projects might use the same techniques. > >> > >> -- Andy > >> > >> On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney > wrote: > >>> > >>> > How you allocate the money to each other is something you can debate > privately > >>> > >>> On this, I'm sure that you could set up a lightweight virtual > >>> "timesheet" so you can put yourselves "on the clock" when you're doing > >>> project maintenance work (there are many of these online, I just read > >>> about https://www.clockspot.com/ recently) to make time reporting a > >>> bit more accurate > >>> > >>> On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney > wrote: > >>> > > >>> > Personally, I would recommend putting most of the money in your own > >>> > pockets.
The whole idea of Tidelift (as I understand it) is for the > >>> > individuals doing work that is of importance to project users (to > whom > >>> > Tidelift is providing indemnification and "insurance" against > defects) > > > > > > Actually that's only partially true. Tidelift is paying for very > specific things, that allow them to do aggregated reporting on licensing, > dependencies, security vulnerabilities, release streams & release docs, > etc. - basically the stuff that helps large corporations do due diligence > and management of a large software stack. > > > > It is explicitly out of scope to work on bugs or enhancements in the > NumFOCUS-Tidelift agreement (and working on particular technical items was > never their intention). So "insurance against defects" isn't part of this, > except in a very abstract sense of making the project healthier and > therefore reducing the risk of it being abandoned or a lot more buggy on > the many-year time scale. > > > > Cheers, > > Ralf > > > > > >>> > to get paid for their labor. So I think the most honest way to use > the > >>> > money is to put it in your respective bank accounts. If you've > getting > >>> > a little bit of money to spend on yourself, doesn't that make doing > >>> > the maintenance work a bit less thankless? If you don't pay > >>> > yourselves, I think it actually "breaks" Tidelift's pitch to > customers > >>> > which is that open source projects need to have a higher fraction of > >>> > compensated maintenance and support work than they do now. > >>> > > >>> > How you allocate the money to each other is something you can debate > privately > >>> > > >>> > On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche > >>> > wrote: > >>> > > > >>> > > > >>> > > > >>> > > Op di 11 jun. 
2019 om 15:31 schreef Ralf Gommers < ralf.gommers at gmail.com>: > >>> > >> > >>> > >> > >>> > >> On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger < tom.augspurger88 at gmail.com> wrote: > >>> > >>> > >>> > >>> > >>> > >>> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev < pandas-dev at python.org> wrote: > >>> > >>>> > >>> > >>>> Just some counterpoints to consider: > >>> > >>>> > >>> > >>>> - $ 3,000 a month isn't really that much, and if it's just a > number that a well-funded company chose for us chances are they are > benefiting from it way more than we are > >>> > >> > >>> > >> > >>> > >> "it's not really that much" is something I don't agree with. It > doesn't employ someone, but it's enough to pay for things like developer > meetups, hiring an extra GSoC student if a good one happens to come along, > paying a web dev for a full redesign of the project website, etc. Each of > those things is in the $5,000 - $15,000 range, and it's _very_ nice to be > able to do them without having to look for funding first. > >>> > >> > >>> > >> Tidelift is a small (now ~25 employees) company by the way, and > they have a real understanding of the open source sustainability issues and > seem dedicated to helping fix it. > >>> > >> > >>> > >>>> - There is no such thing as free money; we have to consider how > to account for and actually manage it (perhaps mitigated somewhat by > NumFocus) > >>> > >>> > >>> > >>> > >>> > >>> Perhaps Ralf can share how this has gone for NumPy. I imagine > it's not too much work on their end, thanks to NumFOCUS. > >>> > >> > >>> > >> > >>> > >> NumFOCUS handles receiving the money and associated admin. As the > project you'll be responsible for the setup and ongoing tasks. For NumPy > and SciPy I have done those tasks. It's a fairly minimal amount of work: > https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed.
The > main one was dealing with GitHub not recognizing our license, and you don't > have that issue for Pandas (it's reported correctly as BSD-3 in the UI at > https://github.com/pandas-dev/pandas). > >>> > >> > >>> > >> So it's probably a day of work for one person, to get familiar > with the interface, check dependencies, release streams, paste in release > notes, etc. And then ongoing maybe one or a couple of hours a month. So far > it's been a much more effective way of spending time than, for example, > grant writing. > >>> > >> > >>> > >>> > >>> > >>>> > >>> > >>>> - Advertising and ties to a corporate sponsorship may weaken > the brand of pandas; at that point we may lose some credibility as open > source volunteers > >>> > >>> > >>> > >>> > >>> > >>> Anecdotally, I don't think that's how the community views > Tidelift. My perception (from Twitter, blogs / comments) is that it's been > well received. > >>> > >> > >>> > >> > >>> > >> Agree, the feedback I've seen is all quite positive. > >>> > > > >>> > > > >>> > > Additionally, I don't think there is any "advertisement" involved, > at least not in the classical sense of adding ads for third-party > companies in a sidebar to our website for which we get money. Of course we > will need to mention Tidelift in some way, e.g. in our sponsors / > institutional partners section, but we already do that for some other > companies as well (that employ core devs). > >>> > > > >>> > >> > >>> > >> > >>> > >>> > >>> > >>>> > >>> > >>>> - We don't (AFAIK) have a plan on how to spend or allocate it > >>> > >>>> > >>> > >>>> Not totally against it but perhaps the last point above is the > main sticking one. Do we have any idea how much we'd actually pocket out of > the $ 3k they offer us and subsequently what we would do with it? Cover > travel expenses? Support PyData conferences? Scholarships?
> >>> > >>> > >>> > >>> Agreed that we should set a purpose for this money (though, I > have no objection to collecting while we set that dedicated purpose). > >>> > >> > >>> > >> > >>> > > Indeed we need to discuss this, but I don't think we already need > to know *exactly* what we want to do with it before setting up a contract > with Tidelift. It's good for me to already start discussing it now, but > maybe in a separate thread? > >>> > > > >>> > >> > >>> > >> For NumPy and SciPy we haven't earmarked the funds yet. It's nice > to build up a buffer first. One thing I'm thinking of is that we're > participating in Google Season of Docs, and are getting more high quality > applicants than Google will accept. So we could pay one or two tech writers > from the funds. Our website and high level docs (tutorial, restructuring of > all docs to guide users better) sure could use it:) > >>> > >> > >>> > >> My abstract advice would be: pay for things that require money > (like a dev meeting) or don't get done for free. Don't pay for writing code > unless the case is extremely compelling, because that'll be a drop in the > bucket. > >>> > >> > >>> > >> Cheers, > >>> > >> Ralf > >>> > >> > >>> > >> > >>> > >>> > >>> > >>>> > >>> > >>>> - Will > >>> > >>>> > >>> > >>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers < > ralf.gommers at gmail.com> wrote: > >>> > >>>> > >>> > >>>> > >>> > >>>> > >>> > >>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >>> > >>>>> > >>> > >>>>> The current page about pandas ( > https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a > month (but I am not fully sure this is what is already available from their > current subscribers, or if it is a prospect). > >>> > >>>> > >>> > >>>> > >>> > >>>> It's not just a prospect, that's what you should/will get. > NumPy and SciPy get the listed amounts too. > >>> > >>>> > >>> > >>>> Agreed that the NumPy amount is not that much.
The amount gets > determined automatically; it's some combination of customer interest, > dependency analysis and size of the API surface. > >>> > >>>> > >>> > >>>> The current amounts are: > >>> > >>>> NumPy: $1000 > >>> > >>>> SciPy: $2500 > >>> > >>>> Pandas: $3000 > >>> > >>>> Matplotlib: n.a. > >>> > >>>> Scikit-learn: $1500 > >>> > >>>> Scikit-image: $50 > >>> > >>>> Statsmodels: $50 > >>> > >>>> > >>> > >>>> So there's an element of randomness, but the results are not > completely surprising I think. The four libraries that get order thousands > of dollars are the ones that large corporations are going to have the > highest interest in. > >>> > >>>> > >>> > >>>> Cheers, > >>> > >>>> Ralf > >>> > >>>> > >>> > >>>>> > >>> > >>>>> > >>> > >>>>> Op za 8 jun. 2019 om 22:54 schreef William Ayd < > william.ayd at icloud.com>: > >>> > >>>>>> > >>> > >>>>>> What is the minimum amount we are asking for? The $1,000 a > month for NumPy seems rather low and I thought previous emails had > something in the range of $3k a month. > >>> > >>>>>> > >>> > >>>>>> I don?t think we necessarily need or would be that much > improved by $12k per year so would rather aim higher if we are going to do > this > >>> > >>>>>> > >>> > >>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >>> > >>>>>> > >>> > >>>>>> Hi all, > >>> > >>>>>> > >>> > >>>>>> We discussed this on the last dev chat, but putting it on the > mailing list for those who were not present: we are planning to contact > Tidelift to enter into a sponsor agreement for Pandas. > >>> > >>>>>> > >>> > >>>>>> The idea is to follow what NumPy (and recently also Scipy) > did to have an agreement between Tidelift and NumFOCUS instead of an > individual maintainer (see their announcement mail: > https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html > ). 
> >>> > >>>>>> Blog with overview about Tidelift: > https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift > . > >>> > >>>>>> > >>> > >>>>>> We didn't discuss yet what to do specifically with those > funds, that should still be discussed in the future. > >>> > >>>>>> > >>> > >>>>>> Cheers, > >>> > >>>>>> Joris > >>> > >>>>>> _______________________________________________ > >>> > >>>>>> Pandas-dev mailing list > >>> > >>>>>> Pandas-dev at python.org > >>> > >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev > >>> > >>>>>> > >>> > >>>>>> > >>> > >>>>> _______________________________________________ > >>> > >>>>> Pandas-dev mailing list > >>> > >>>>> Pandas-dev at python.org > >>> > >>>>> https://mail.python.org/mailman/listinfo/pandas-dev > >>> > >>>> > >>> > >>>> > >>> > >>>> _______________________________________________ > >>> > >>>> Pandas-dev mailing list > >>> > >>>> Pandas-dev at python.org > >>> > >>>> https://mail.python.org/mailman/listinfo/pandas-dev > >>> > >> > >>> > >> _______________________________________________ > >>> > >> Pandas-dev mailing list > >>> > >> Pandas-dev at python.org > >>> > >> https://mail.python.org/mailman/listinfo/pandas-dev > >>> > > > >>> > > _______________________________________________ > >>> > > Pandas-dev mailing list > >>> > > Pandas-dev at python.org > >>> > > https://mail.python.org/mailman/listinfo/pandas-dev > >>> _______________________________________________ > >>> Pandas-dev mailing list > >>> Pandas-dev at python.org > >>> https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jbrockmendel at gmail.com Tue Jun 11 16:38:18 2019 From: jbrockmendel at gmail.com (Brock Mendel) Date: Tue, 11 Jun 2019 15:38:18 -0500 Subject: [Pandas-dev] Arithmetic Proposal Message-ID: I've been working on arithmetic/comparison bugs and more recently on performance problems caused by fixing some of those bugs. After trying less-invasive approaches, I've concluded a fairly big fix is called for. This is an RFC for that proposed fix.

------

In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by making DataFrame arithmetic ops operate column-by-column, dispatching to the Series implementations. This led to a significant performance hit for operations on DataFrames with many columns (#24990, #26061).

To restore the lost performance, we need to have these operations take place at the Block level. To prevent DataFrame behavior from diverging from Series behavior (again), we need to retain a single shared implementation. This is a proposal for how to meet these two needs.

Proposal:
- Allow EA to support 2D arrays
- Use PandasArray to back Block subclasses currently backed by ndarray
- Implement arithmetic and comparison ops directly on PandasArray, then have Series, DataFrame, and Index ops pass through to the PandasArray implementations.

Fixes:
- Performance degradation in DataFrame ops (#24990, #26061)
- The last remaining inconsistencies between Index and Series ops (#19322, #18824)
- Most of the xfailing arithmetic tests
- #22120: Transposing dataframe loses dtype and ExtensionArray
- #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has no reshape
- #23925 DataFrame Quantile Broken with Datetime Data

Other Upsides:
- Series constructor could dispatch to pd.array, de-duplicating a lot of code.
- Easier to move to Arrow backend if Blocks are numpy-naive.
- Make EA closer to a drop-in replacement for np.ndarray, necessary if we want e.g. xarray to find them directly useful (#24716, #24583)
- Block/BlockManager simplifications, see below.

Downsides:
- Existing constructors assume 1D
- Existing downstream authors assume 1D
- Reduction ops (of which there aren't many) don't have axis kwarg ATM
  - But for PandasArray they just pass through to nanops, which already have+test the axis kwargs
  - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one implementing the reductions and am OK with this extra complication.

Block Simplifications:
- Blocks have three main attributes: values, mgr_locs, and ndim
- ndim is _usually_ the same as values.ndim, the exceptions being for cases where type(values) is restricted to 1D
- Without these restrictions, we can get rid of:
  - Block.ndim, associated kludgy ndim-checking code
  - numerous can-this-be-reshaped/transposed checks and special cases in Block and BlockManager code (which are buggy anyway, e.g. #23925)
- With ndim gone, we can then get rid of mgr_locs!
  - The blocks themselves never use mgr_locs except when passing to their own constructors.
  - mgr_locs makes _much_ more sense as an attribute of the BlockManager
- With mgr_locs gone, Block becomes just a thin wrapper around an EA

Implementation Strategy:
- Remove the 1D restriction
  - Fairly small tweak: an EA subclass must define `shape` instead of `__len__`; other attrs are defined in terms of shape.
  - Define `transpose`, `T`, `reshape`, and `ravel`
- With this done, several tasks can proceed in parallel:
  - simplifications in core.internals, as special-cases for 1D-only can be removed
  - implement and test arithmetic ops on PandasArray
  - back Blocks with PandasArray
  - back Index (and numeric subclasses) with PandasArray
  - Change DataFrame, Series, Index ops to pass through to underlying Blocks/PandasArrays
-------------- next part -------------- An HTML attachment was scrubbed...
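The "remove the 1D restriction" step in the proposal can be sketched with a toy class. This is only a rough illustration, not pandas' actual ExtensionArray or PandasArray implementation: the class name `NDBackedArray` and its internals are hypothetical. It shows the shape-first idea being proposed — the subclass defines `shape`, the other attributes (`__len__`, `ndim`, `T`, `reshape`, `ravel`) derive from it, and arithmetic is implemented once so Series, DataFrame, and Index wrappers could all dispatch to the same code path:

```python
import numpy as np

class NDBackedArray:
    """Toy stand-in for a 2D-capable, ndarray-backed extension array."""

    def __init__(self, values):
        self._ndarray = np.asarray(values)

    # Per the proposal: the subclass defines `shape`; everything else
    # (len, ndim, transpose, reshape, ravel) is derived from it.
    @property
    def shape(self):
        return self._ndarray.shape

    def __len__(self):
        return self.shape[0]

    @property
    def ndim(self):
        return len(self.shape)

    @property
    def T(self):
        return type(self)(self._ndarray.T)

    def reshape(self, *shape):
        return type(self)(self._ndarray.reshape(*shape))

    def ravel(self):
        return type(self)(self._ndarray.ravel())

    # Arithmetic implemented exactly once, at the array level. A Series,
    # DataFrame, or Index wrapper would pass through to this method, so
    # their behaviors cannot diverge.
    def __add__(self, other):
        if isinstance(other, NDBackedArray):
            other = other._ndarray
        return type(self)(self._ndarray + other)

arr = NDBackedArray([[1, 2, 3], [4, 5, 6]])  # a whole 2D "block" at once
out = arr + 10                               # one op for all columns
print(out.shape)     # (2, 3)
print(len(out))      # 2
print(out.T.shape)   # (3, 2)
```

Because `__add__` operates on the full 2D array, a DataFrame-like wrapper could run arithmetic block-by-block instead of column-by-column, which is the performance point of the proposal.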
URL: From wesmckinn at gmail.com Tue Jun 11 17:09:36 2019 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 11 Jun 2019 16:09:36 -0500 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: On Tue, Jun 11, 2019 at 2:26 PM Andy Ray Terrel wrote: > > > > On Tue, Jun 11, 2019 at 12:51 PM Wes McKinney wrote: >> >> hi, >> >> On Tue, Jun 11, 2019 at 11:16 AM Ralf Gommers wrote: >> > >> > >> > >> > On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel wrote: >> >> >> >> While the original lifter agreement was an individual contract, in our negotiations with Tidelift, NumFOCUS has explicitly sought a model that allows the project to split the money how they prefer. This was always Tidelift's intention, it was just faster and easier to scale to focus on paying individuals. >> > >> > >> > +1 the project decides for themselves is the intent and a good principle. >> > >> >> >> >> I do like the idea of paying for maintence work, I would recommend we set up folks as contractors with NumFOCUS rather than just pocketing money. It will give a lot more legal protection. Then if some folks don't want to take the cash you they can donate their time and be recognized as in-kind donations, which might have some tax deductions. >> > >> > >> > Keep in mind that this has a lot of potential issues. Examples: >> > 1. Who decides who gets paid, and how? The pandas repo has 1500+ contributors. Lots of potential for friction over small amount of $. >> >> More or less the _entire_ point of Tidelift is to incentivize people >> to do more maintenance work. I think it's worth at least attempting to >> use this money for its intended economic purpose. 
>> >> The maintainers are, as a first approximation, the ~10-15 active core >> members listed on >> >> https://github.com/pandas-dev/pandas-governance >> >> IMHO those are the people that should get paid (going forward) -- if >> contributors are more motivated to become core team members / >> maintainers as a result of the Tidelift money, then it has had the >> desired outcome. > > > I would suggest leaving the decision to the project core team with the project Numfocus committee to be the overseer of the implementation. > Yes, of course, that's the governance that we have in place. I am just stressing that we should try to honor the intent of the asset that is being purchased by Tidelift customers. Tidelift is telling their customers that the money they are paying is going to end up in the pockets of the project maintainers https://tidelift.com/about/lifter If the pandas core team wishes to deny themselves the income (which, divided up, isn't going to be a life-changing amount of money) that's their prerogative -- I just wanted to be clear about where I stand on it, and there's nothing immoral about wanting to be compensated for one's time (given how much volunteered time has already gone uncompensated). One risk to be aware of is that if a high profile project like pandas take's TL's money and none of the maintainers pay themselves with it, then the monthly number may not have as much of a chance of increasing (since current or prospective TL customers may observe that the subscription dollars aren't being used in the way that is being pitched). >> >> >> > 2. Many people have employment contracts, those typically forbid contracting on the side. So inherently unfair to distribute only to those who are in a position to accept the money. >> >> This is true -- at least Jeff and maybe others fall into this >> category. 
In such cases their "cut" of the maintenance funds can go >> into the communal fund to pay for other stuff >> > > Yes such accommodation will need to be worked out. > >> >> > 3. You're now introducing lots of extra paperwork and admin, both directly and indirectly (who wants to deal with the extra complications when filing your taxes?). >> >> Hopefully we're talking just a 1099 from NumFOCUS with a single number >> to type in, but I'm the wrong person to judge since my taxes are more >> complicated than most people's =) > > > Generally it is done that way for US based folks and for folks out of the US we tend to let them handle their own taxes. We would need to work that out. > > Additionally, as in all dealings with businesses, we do the extra paperwork for the other benefits such as limiting the liability of a maintainer. > >> >> >> > 4. It may create other weird social dynamics. E.g. if money is now directly coupled to a commit bit, that makes the "who do we give commit rights and when" a potentially more loaded question. >> >> I think this is where the honest self-reporting of time spent comes >> in. The goal is to increase the average number of maintainer hours per >> month/year. It's sort of like a crypto-mining pool, but for open >> source software maintenance =) Obviously maintainers are accountable >> to the rest of the core team to behave with integrity >> (professionalism, honesty, etc.) or they can be voted to be removed if >> they are found to be dishonest. > > >> >> > >> >> > And, dividing it into N chunks, the funding becomes nice beer money and a thank you for volunteering. Could be exactly what you'd prefer as a team. But that's imho more in line with the current version of Patreon or GitHub Sponsors rather then with what Tidelift is aiming for. >> > >> > I'd like the idea of "paying for maintenance" if there were enough money to employ people. But realistically, that will take many years. 
The Tidelift slogan on this is unrealistic for a project like Pandas where maintenance effort is many FTEs; it's perhaps feasible for your typical Javascript library that's popular but small enough for one person maintaining it part-time. >> > >> >> >> >> It is something I would volunteer to help manage in order to learn how other projects might use the same techniques. >> >> >> >> -- Andy >> >> >> >> On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney wrote: >> >>> >> >>> > How you allocate the money to each other is something you can debate privately >> >>> >> >>> On this, I'm sure that you could set up a lightweight virtual >> >>> "timesheet" so you can put yourselves "on the clock" when you're doing >> >>> project maintenance work (there are many of these online, I just read >> >>> about https://www.clockspot.com/ recently) to make time reporting a >> >>> bit more accurate >> >>> >> >>> On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney wrote: >> >>> > >> >>> > Personally, I would recommend putting most of the money in your own >> >>> > pockets. The whole idea of Tidelift (as I understand it) is for the >> >>> > individuals doing work that is of importance to project users (to whom >> >>> > Tidelift is providing indemnification and "insurance" against defects) >> > >> > >> > Actually that's only partially true. Tidelift is paying for very specific things, that allow them to do aggregated reporting on licensing, dependencies, security vulnerabilities, release streams & release docs, etc. - basically the stuff that helps large corporations do due diligence and management of a large software stack. >> > >> > It is explicitly out of scope to work on bugs or enhancements in the NumFOCUS-Tidelift agreement (and working on particular technical items was never their intention). 
So "insurance against defects" isn't part of this, except in a very abstract sense of making the project healthier and therefore reducing the risk of it being abandoned or a lot more buggy on the many-year time scale. >> > >> > Cheers, >> > Ralf >> > >> > >> >>> > to get paid for their labor. So I think the most honest way to use the >> >>> > money is to put it in your respective bank accounts. If you've getting >> >>> > a little bit of money to spend on yourself, doesn't that make doing >> >>> > the maintenance work a bit less thankless? If you don't pay >> >>> > yourselves, I think it actually "breaks" Tidelift's pitch to customers >> >>> > which is that open source projects need to have a higher fraction of >> >>> > compensated maintenance and support work than they do now. >> >>> > >> >>> > How you allocate the money to each other is something you can debate privately >> >>> > >> >>> > On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche >> >>> > wrote: >> >>> > > >> >>> > > >> >>> > > >> >>> > > Op di 11 jun. 2019 om 15:31 schreef Ralf Gommers : >> >>> > >> >> >>> > >> >> >>> > >> >> >>> > >> On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger wrote: >> >>> > >>> >> >>> > >>> >> >>> > >>> >> >>> > >>> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev wrote: >> >>> > >>>> >> >>> > >>>> Just some counterpoints to consider: >> >>> > >>>> >> >>> > >>>> - $ 3,000 a month isn?t really that much, and if it?s just a number that a well-funded company chose for us chances are they are benefiting from it way more than we are >> >>> > >> >> >>> > >> >> >>> > >> "it's not really that much" is something I don't agree with. It doesn't employ someone, but it's enough to pay for things like developer meetups, hiring an extra GSoC student if a good one happens to come along, paying a web dev for a full redesign of the project website, etc. 
Each of those things is in the $5,000 - %15,000 range, and it's _very_ nice to be able to do them without having to look for funding first. >> >>> > >> >> >>> > >> Tidelift is a small (now ~25 employees) company by the way, and they have a real understanding of the open source sustainability issues and seem dedicated to helping fix it. >> >>> > >> >> >>> > >>>> - There is no such thing as free money; we have to consider how to account for and actually manage it (perhaps mitigated somewhat by NumFocus) >> >>> > >>> >> >>> > >>> >> >>> > >>> Perhaps Ralph can share how this has gone for NumPy. I imagine it's not too work on their end, thanks to NumFOCUS. >> >>> > >> >> >>> > >> >> >>> > >> NumFOCUS handles receiving the money and associated admin. As the project you'll be responsible for the setup and ongoing tasks. For NumPy and SciPy I have done those tasks. It's a fairly minimal amount of work: https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed. The main one was dealing with GitHub not recognizing our license, and you don't have that issue for Pandas (it's reported correctly as BSD-3 in the UI at https://github.com/pandas-dev/pandas). >> >>> > >> >> >>> > >> So it's probably a day of work for one person, to get familiar with the interface, check dependencies, release streams, paste in release notes, etc. And then ongoing maybe one or a couple of hours a month. So far it's been a much more effective way of spending time than, for example, grant writing. >> >>> > >> >> >>> > >>> >> >>> > >>>> >> >>> > >>>> - Advertising and ties to a corporate sponsorship may weaken the brand of pandas; at that point we may lose some creditability as open source volunteers >> >>> > >>> >> >>> > >>> >> >>> > >>> Anecdotally, I don't think that's how the community views Tidelift. My perception (from Twitter, blogs / comments) is that it's been well received. >> >>> > >> >> >>> > >> >> >>> > >> Agree, the feedback I've seen is all quite positive. 
>> >>> > > >> >>> > > Additionally, I don't think there is any "advertisement" involved, at least not in the classical sense of adding ads for third-party companies in a sidebar to our website for which we get money. Of course we will need to mention Tidelift in some way, e.g. in our sponsors / institutional partners section, but we already do that for some other companies as well (that employ core devs). >> >>> > > >> >>> > >> >> >>> > >> >> >>> > >>> >> >>> > >>>> >> >>> > >>>> - We don't (AFAIK) have a plan on how to spend or allocate it >> >>> > >>>> >> >>> > >>>> Not totally against it but perhaps the last point above is the main sticking one. Do we have any idea how much we'd actually pocket out of the $ 3k they offer us and subsequently what we would do with it? Cover travel expenses? Support PyData conferences? Scholarships? >> >>> > >>> >> >>> > >>> >> >>> > >>> Agreed that we should set a purpose for this money (though, I have no objection to collecting while we set that dedicated purpose). >> >>> > >> >> >>> > >> >> >>> > > Indeed we need to discuss this, but I don't think we already need to know *exactly* what we want to do with it before setting up a contract with Tidelift. It's good for me to already start discussing it now, but maybe in a separate thread? >> >>> > > >> >>> > >> >> >>> > >> For NumPy and SciPy we haven't earmarked the funds yet. It's nice to build up a buffer first. One thing I'm thinking of is that we're participating in Google Season of Docs, and are getting more high quality applicants than Google will accept. So we could pay one or two tech writers from the funds. Our website and high level docs (tutorial, restructuring of all docs to guide users better) sure could use it :) >> >>> > >> >> >>> > >> My abstract advice would be: pay for things that require money (like a dev meeting) or don't get done for free. Don't pay for writing code unless the case is extremely compelling, because that'll be a drop in the bucket. 
>> >>> > >> >> >>> > >> Cheers, >> >>> > >> Ralf >> >>> > >> >> >>> > >> >> >>> > >>> >> >>> > >>>> >> >>> > >>>> - Will >> >>> > >>>> >> >>> > >>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers wrote: >> >>> > >>>> >> >>> > >>>> >> >>> > >>>> >> >>> > >>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche wrote: >> >>> > >>>>> >> >>> > >>>>> The current page about pandas (https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 a month (but I am not fully sure this is what is already available from their current subscribers, or if it is a prospect). >> >>> > >>>> >> >>> > >>>> >> >>> > >>>> It's not just a prospect, that's what you should/will get. NumPy and SciPy get the listed amounts too. >> >>> > >>>> >> >>> > >>>> Agreed that the NumPy amount is not that much. The amount gets determined automatically; it's some combination of customer interest, dependency analysis and size of the API surface. >> >>> > >>>> >> >>> > >>>> The current amounts are: >> >>> > >>>> NumPy: $1000 >> >>> > >>>> SciPy: $2500 >> >>> > >>>> Pandas: $3000 >> >>> > >>>> Matplotlib: n.a. >> >>> > >>>> Scikit-learn: $1500 >> >>> > >>>> Scikit-image: $50 >> >>> > >>>> Statsmodels: $50 >> >>> > >>>> >> >>> > >>>> So there's an element of randomness, but the results are not completely surprising I think. The four libraries that get on the order of thousands of dollars are the ones that large corporations are going to have the highest interest in. >> >>> > >>>> >> >>> > >>>> Cheers, >> >>> > >>>> Ralf >> >>> > >>>> >> >>> > >>>>> >> >>> > >>>>> >> >>> > >>>>> Op za 8 jun. 2019 om 22:54 schreef William Ayd : >> >>> > >>>>>> >> >>> > >>>>>> What is the minimum amount we are asking for? The $1,000 a month for NumPy seems rather low and I thought previous emails had something in the range of $3k a month. 
>> >>> > >>>>>> >> >>> > >>>>>> I don't think we necessarily need or would be that much improved by $12k per year so would rather aim higher if we are going to do this >> >>> > >>>>>> >> >>> > >>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche wrote: >> >>> > >>>>>> >> >>> > >>>>>> Hi all, >> >>> > >>>>>> >> >>> > >>>>>> We discussed this on the last dev chat, but putting it on the mailing list for those who were not present: we are planning to contact Tidelift to enter into a sponsor agreement for Pandas. >> >>> > >>>>>> >> >>> > >>>>>> The idea is to follow what NumPy (and recently also Scipy) did to have an agreement between Tidelift and NumFOCUS instead of an individual maintainer (see their announcement mail: https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html). >> >>> > >>>>>> Blog with overview about Tidelift: https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift. >> >>> > >>>>>> >> >>> > >>>>>> We didn't discuss yet what to do specifically with those funds, that should still be discussed in the future. 
>> >>> > >>>>>> >> >>> > >>>>>> Cheers, >> >>> > >>>>>> Joris >> >>> > >>>>>> _______________________________________________ >> >>> > >>>>>> Pandas-dev mailing list >> >>> > >>>>>> Pandas-dev at python.org >> >>> > >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev >> >>> > >>>>>> >> >>> > >>>>>> >> >>> > >>>>> _______________________________________________ >> >>> > >>>>> Pandas-dev mailing list >> >>> > >>>>> Pandas-dev at python.org >> >>> > >>>>> https://mail.python.org/mailman/listinfo/pandas-dev >> >>> > >>>> >> >>> > >>>> >> >>> > >>>> _______________________________________________ >> >>> > >>>> Pandas-dev mailing list >> >>> > >>>> Pandas-dev at python.org >> >>> > >>>> https://mail.python.org/mailman/listinfo/pandas-dev >> >>> > >> >> >>> > >> _______________________________________________ >> >>> > >> Pandas-dev mailing list >> >>> > >> Pandas-dev at python.org >> >>> > >> https://mail.python.org/mailman/listinfo/pandas-dev >> >>> > > >> >>> > > _______________________________________________ >> >>> > > Pandas-dev mailing list >> >>> > > Pandas-dev at python.org >> >>> > > https://mail.python.org/mailman/listinfo/pandas-dev >> >>> _______________________________________________ >> >>> Pandas-dev mailing list >> >>> Pandas-dev at python.org >> >>> https://mail.python.org/mailman/listinfo/pandas-dev From jorisvandenbossche at gmail.com Tue Jun 11 18:17:50 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Wed, 12 Jun 2019 00:17:50 +0200 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: Hi Brock, Thanks a lot for starting this discussion and the detailed proposal! I will try to look at it in more detail tomorrow, but one general remark: from time to time, we talked about "getting rid of the BlockManager" or "simplifying the BlockManager" (although I am not sure if there is any specific github issue about it, might be from in-person discussions). 
One of the interpretations of that (or at least how I understood those discussions) was to get away from the 2D block-based internals, and go to a simpler "table as collection of 1D arrays" model. This would also enable a simplification of the internals / BlockManager and many of the other items you mention. So I think we should at least compare a more detailed version of what I described above against your proposal. Because if we want to go in that direction long term, I am not sure extensive work on the current 2D blocks-based BlockManager is worth our time. Joris Op di 11 jun. 2019 om 22:38 schreef Brock Mendel : > I've been working on arithmetic/comparison bugs and more recently on > performance problems caused by fixing some of those bugs. After trying > less-invasive approaches, I've concluded a fairly big fix is called for. > This is an RFC for that proposed fix. > > ------ > In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by making > DataFrame arithmetic ops operate column-by-column, dispatching to the > Series implementations. This led to a significant performance hit for > operations on DataFrames with many columns (#24990, #26061). > > To restore the lost performance, we need to have these operations take > place > at the Block level. To prevent DataFrame behavior from diverging from > Series > behavior (again), we need to retain a single shared implementation. > > This is a proposal for how to meet these two needs. > > Proposal: > - Allow EA to support 2D arrays > - Use PandasArray to back Block subclasses currently backed by ndarray > - Implement arithmetic and comparison ops directly on PandasArray, then > have Series, DataFrame, and Index ops pass through to the PandasArray > implementations. 
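The performance issue the proposal describes can be illustrated with a rough sketch (hypothetical code, not pandas internals): operating column-by-column pays one Python-level call and one small allocation per column, while a block-level op is a single vectorized call over the whole 2D buffer. For a frame with thousands of columns, the per-column overhead dominates.

```python
import numpy as np

# Hypothetical illustration, not pandas internals: the same addition done
# column-by-column (one call and one allocation per column, as in the
# 0.24.0 column-wise dispatch) versus once over the whole 2D block.
def add_column_by_column(block, value):
    # one NumPy call per column, then stack the results back together
    return np.column_stack([block[:, i] + value for i in range(block.shape[1])])

def add_block_level(block, value):
    # one NumPy call over the full 2D block
    return block + value

block = np.arange(12.0).reshape(3, 4)
assert np.array_equal(add_column_by_column(block, 1.0), add_block_level(block, 1.0))
```

Both produce the same result; the difference is only in how many times the Python/NumPy dispatch machinery runs.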
> > Fixes: > - Performance degradation in DataFrame ops (#24990, #26061) > - The last remaining inconsistencies between Index and Series ops (#19322, > #18824) > - Most of the xfailing arithmetic tests > - #22120: Transposing dataframe loses dtype and ExtensionArray > - #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has no > reshape > - #23925 DataFrame Quantile Broken with Datetime Data > > Other Upsides: > - Series constructor could dispatch to pd.array, de-duplicating a lot of > code. > - Easier to move to Arrow backend if Blocks are numpy-naive. > - Make EA closer to a drop-in replacement for np.ndarray, necessary if we > want e.g. xarray to find them directly useful (#24716, #24583) > - Block/BlockManager simplifications, see below. > > Downsides: > - Existing constructors assume 1D > - Existing downstream authors assume 1D > - Reduction ops (of which there aren't many) don't have axis kwarg ATM > - But for PandasArray they just pass through to nanops, which already > have+test the axis kwargs > - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one > implementing the reductions and am OK with this extra complication. > > Block Simplifications: > - Blocks have three main attributes: values, mgr_locs, and ndim > - ndim is _usually_ the same as values.ndim, the exceptions being for > cases where type(values) is restricted to 1D > - Without these restrictions, we can get rid of: > - Block.ndim, associated kludgy ndim-checking code > - numerous can-this-be-reshaped/transposed checks and special cases in > Block and BlockManager code (which are buggy anyway, e.g. #23925) > - With ndim gone, we can then get rid of mgr_locs! > - The blocks themselves never use mgr_locs except when passing to their > own constructors. 
> - mgr_locs makes _much_ more sense as an attribute of the BlockManager > - With mgr_locs gone, Block becomes just a thin wrapper around an EA > > Implementation Strategy: > - Remove the 1D restriction > - Fairly small tweak, EA subclass must define `shape` instead of > `__len__`; other attrs define in terms of shape. > - Define `transpose`, `T`, `reshape`, and `ravel` > - With this done, several tasks can proceed in parallel: > - simplifications in core.internals, as special-cases for 1D-only can > be removed > - implement and test arithmetic ops on PandasArray > - back Blocks with PandasArray > - back Index (and numeric subclasses) with PandasArray > - Change DataFrame, Series, Index ops to pass through to underlying > Blocks/PandasArrays > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Jun 11 18:25:12 2019 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 11 Jun 2019 18:25:12 -0400 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: Indeed, it's worth considering if perhaps it would be OK to have a performance regression for very wide dataframes instead. With regards to xarray, 2D extension arrays are interesting but still not particularly helpful. We would still need a wrapper to make them fully N-D, which we need for our data model. On Tue, Jun 11, 2019 at 6:18 PM Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > Hi Brock, > > Thanks a lot for starting this discussion and the detailed proposal! > > I will try to look at it in more detail tomorrow, but one general remark: > from time to time, we talked about "getting rid of the BlockManager" or > "simplifying the BlockManager" (although I am not sure if there is any > specific github issue about it, might be from in-person discussions). 
One > of the interpretations of that (or at least how I understood those > discussions) was to get away of the 2D block based internals, and go to a > simpler "table as collection of 1D arrays" model. This would also enable a > simplication of the internals / BlockManager and many of the other items > you mention. > > So I think we should at least compare a more detailed version of what I > described above against your proposal. As if we would want to go in that > direction long term, I am not sure extensive work on the current 2D > blocks-based BlockManager is worth our time. > > Joris > > Op di 11 jun. 2019 om 22:38 schreef Brock Mendel : > >> I've been working on arithmetic/comparison bugs and more recently on >> performance problems caused by fixing some of those bugs. After trying >> less-invasive approaches, I've concluded a fairly big fix is called for. >> This is an RFC for that proposed fix. >> >> ------ >> In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by making >> DataFrame arithmetic ops operate column-by-column, dispatching to the >> Series implementations. This led to a significant performance hit for >> operations on DataFrames with many columns (#24990, #26061). >> >> To restore the lost performance, we need to have these operations take >> place >> at the Block level. To prevent DataFrame behavior from diverging from >> Series >> behavior (again), we need to retain a single shared implementation. >> >> This is a proposal for how meet these two needs. >> >> Proposal: >> - Allow EA to support 2D arrays >> - Use PandasArray to back Block subclasses currently backed by ndarray >> - Implement arithmetic and comparison ops directly on PandasArray, then >> have Series, DataFrame, and Index ops pass through to the PandasArray >> implementations. 
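The "remove the 1D restriction" step in the implementation strategy above, where attributes are defined in terms of `shape` rather than `__len__`, could look roughly like this (a hypothetical minimal container for illustration, not the actual `PandasArray` implementation):

```python
import numpy as np

# Hypothetical sketch: an ndarray-backed array whose attributes all derive
# from `shape`, so the same class can hold 1D or 2D data. Not pandas code.
class NDArrayBacked:
    def __init__(self, values):
        self._ndarray = np.asarray(values)

    @property
    def shape(self):
        return self._ndarray.shape

    @property
    def ndim(self):
        return len(self.shape)

    def __len__(self):
        # derived from shape: length of the first axis
        return self.shape[0]

    @property
    def T(self):
        return type(self)(self._ndarray.T)

    def reshape(self, *shape):
        return type(self)(self._ndarray.reshape(*shape))

    def ravel(self):
        return type(self)(self._ndarray.ravel())

arr = NDArrayBacked([[1, 2, 3], [4, 5, 6]])
assert arr.shape == (2, 3) and arr.ndim == 2
assert len(arr) == 2
assert arr.T.shape == (3, 2)
assert arr.ravel().shape == (6,)
```

With `transpose`/`reshape`/`ravel` in place, a 2D-aware container like this is what would let Blocks hold such arrays directly.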
>> >> Fixes: >> - Performance degradation in DataFrame ops (#24990, #26061) >> - The last remaining inconsistencies between Index and Series ops >> (#19322, #18824) >> - Most of the xfailing arithmetic tests >> - #22120: Transposing dataframe loses dtype and ExtensionArray >> - #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has no >> reshape >> - #23925 DataFrame Quantile Broken with Datetime Data >> >> Other Upsides: >> - Series constructor could dispatch to pd.array, de-duplicating a lot of >> code. >> - Easier to move to Arrow backend if Blocks are numpy-naive. >> - Make EA closer to a drop-in replacement for np.ndarray, necessary if we >> want e.g. xarray to find them directly useful (#24716, #24583) >> - Block/BlockManager simplifications, see below. >> >> Downsides: >> - Existing constructors assume 1D >> - Existing downstream authors assume 1D >> - Reduction ops (of which there aren't many) don't have axis kwarg ATM >> - But for PandasArray they just pass through to nanops, which already >> have+test the axis kwargs >> - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one >> implementing the reductions and am OK with this extra complication. >> >> Block Simplifications: >> - Blocks have three main attributes: values, mgr_locs, and ndim >> - ndim is _usually_ the same as values.ndim, the exceptions being for >> cases where type(values) is restricted to 1D >> - Without these restrictions, we can get rid of: >> - Block.ndim, associated kludgy ndim-checking code >> - numerous can-this-be-reshaped/transposed checks and special cases in >> Block and BlockManager code (which are buggy anyway, e.g. #23925) >> - With ndim gone, we can then get rid of mgr_locs! >> - The blocks themselves never use mgr_locs except when passing to >> their own constructors. 
>> - mgr_locs makes _much_ more sense as an attribute of the BlockManager >> - With mgr_locs gone, Block becomes just a thin wrapper around an EA >> >> Implementation Strategy: >> - Remove the 1D restriction >> - Fairly small tweak, EA subclass must define `shape` instead of >> `__len__`; other attrs define in terms of shape. >> - Define `transpose`, `T`, `reshape`, and `ravel` >> - With this done, several tasks can proceed in parallel: >> - simplifications in core.internals, as special-cases for 1D-only can >> be removed >> - implement and test arithmetic ops on PandasArray >> - back Blocks with PandasArray >> - back Index (and numeric subclasses) with PandasArray >> - Change DataFrame, Series, Index ops to pass through to underlying >> Blocks/PandasArrays >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Tue Jun 11 21:34:06 2019 From: jeffreback at gmail.com (Jeff Reback) Date: Tue, 11 Jun 2019 21:34:06 -0400 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: > One risk to be aware of is that if a high profile > project like pandas takes TL's money and none of the maintainers pay > themselves with it, then the monthly number may not have as much of a > chance of increasing (since current or prospective TL customers may > observe that the subscription dollars aren't being used in the way > that is being pitched). I actually see the exact opposite here. A project of pandas stature that decides to better the project is a pretty respectable goal. 
I believe we would be within the letter and, more importantly, the spirit of Tidelift for the pandas project itself to take this burden & receive the income. Having the project itself with the combined force of multiple maintainers actually would be much more comforting (from the customer's perspective) than a single maintainer (who may not always be there). Furthermore, we could use these funds for the combined benefit of the project, mainly I think for gatherings like the upcoming sprints. I am not sure many of you know, but pandas has not actively solicited *any* monies, and only received 2 largish contributions over the years, which are the majority of our current funds. The Tidelift agreement looks to provide a stream of income which we currently do not have. With an income stream we have options; without we don't. We can always decide to remunerate maintainers who contribute to this effort, though this should be a separate discussion. Jeff On Tue, Jun 11, 2019 at 5:10 PM Wes McKinney wrote: > On Tue, Jun 11, 2019 at 2:26 PM Andy Ray Terrel > wrote: > > > > > > > > On Tue, Jun 11, 2019 at 12:51 PM Wes McKinney > wrote: > >> > >> hi, > >> > >> On Tue, Jun 11, 2019 at 11:16 AM Ralf Gommers > wrote: > >> > > >> > > >> > > >> > On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel < > andy.terrel at gmail.com> wrote: > >> >> > >> >> While the original lifter agreement was an individual contract, in > our negotiations with Tidelift, NumFOCUS has explicitly sought a model that > allows the project to split the money how they prefer. This was always > Tidelift's intention, it was just faster and easier to scale to focus on > paying individuals. > >> > > >> > > >> > +1 the project decides for themselves is the intent and a good > principle. > >> > > >> >> > >> >> I do like the idea of paying for maintenance work; I would recommend > we set up folks as contractors with NumFOCUS rather than just pocketing > money. It will give a lot more legal protection. 
Then if some folks don't > want to take the cash, they can donate their time and be recognized as > in-kind donations, which might have some tax deductions. > >> > > >> > Keep in mind that this has a lot of potential issues. Examples: > >> > 1. Who decides who gets paid, and how? The pandas repo has 1500+ > contributors. Lots of potential for friction over small amounts of $. > >> > >> More or less the _entire_ point of Tidelift is to incentivize people > >> to do more maintenance work. I think it's worth at least attempting to > >> use this money for its intended economic purpose. > >> > >> The maintainers are, as a first approximation, the ~10-15 active core > >> members listed on > >> > >> https://github.com/pandas-dev/pandas-governance > >> > >> IMHO those are the people that should get paid (going forward) -- if > >> contributors are more motivated to become core team members / > >> maintainers as a result of the Tidelift money, then it has had the > >> desired outcome. > > > > > > I would suggest leaving the decision to the project core team with the > project Numfocus committee to be the overseer of the implementation. > > > > Yes, of course, that's the governance that we have in place. I am just > stressing that we should try to honor the intent of the asset that is > being purchased by Tidelift customers. Tidelift is telling their > customers that the money they are paying is going to end up in the > pockets of the project maintainers > > https://tidelift.com/about/lifter > > If the pandas core team wishes to deny themselves the income (which, > divided up, isn't going to be a life-changing amount of money) that's > their prerogative -- I just wanted to be clear about where I stand on > it, and there's nothing immoral about wanting to be compensated for > one's time (given how much volunteered time has already gone > uncompensated). 
One risk to be aware of is that if a high profile > project like pandas takes TL's money and none of the maintainers pay > themselves with it, then the monthly number may not have as much of a > chance of increasing (since current or prospective TL customers may > observe that the subscription dollars aren't being used in the way > that is being pitched). > > >> > >> > >> > 2. Many people have employment contracts, those typically forbid > contracting on the side. So inherently unfair to distribute only to those > who are in a position to accept the money. > >> > >> This is true -- at least Jeff and maybe others fall into this > >> category. In such cases their "cut" of the maintenance funds can go > >> into the communal fund to pay for other stuff > >> > > > > Yes such accommodation will need to be worked out. > > > >> > >> > 3. You're now introducing lots of extra paperwork and admin, both > directly and indirectly (who wants to deal with the extra complications > when filing your taxes?). > >> > >> Hopefully we're talking just a 1099 from NumFOCUS with a single number > >> to type in, but I'm the wrong person to judge since my taxes are more > >> complicated than most people's =) > > > > > > Generally it is done that way for US based folks and for folks out of > the US we tend to let them handle their own taxes. We would need to work > that out. > > > > Additionally, as in all dealings with businesses, we do the extra > paperwork for the other benefits such as limiting the liability of a > maintainer. > > > >> > >> > >> > 4. It may create other weird social dynamics. E.g. if money is now > directly coupled to a commit bit, that makes the "who do we give commit > rights and when" a potentially more loaded question. > >> > >> I think this is where the honest self-reporting of time spent comes > >> in. The goal is to increase the average number of maintainer hours per > >> month/year. 
It's sort of like a crypto-mining pool, but for open > >> source software maintenance =) Obviously maintainers are accountable > >> to the rest of the core team to behave with integrity > >> (professionalism, honesty, etc.) or they can be voted to be removed if > >> they are found to be dishonest. > > > > > >> > >> > > >> > >> > And, dividing it into N chunks, the funding becomes nice beer money > and a thank you for volunteering. Could be exactly what you'd prefer as a > team. But that's imho more in line with the current version of Patreon or > GitHub Sponsors rather than with what Tidelift is aiming for. > >> > > >> > I'd like the idea of "paying for maintenance" if there were enough > money to employ people. But realistically, that will take many years. The > Tidelift slogan on this is unrealistic for a project like Pandas where > maintenance effort is many FTEs; it's perhaps feasible for your typical > Javascript library that's popular but small enough for one person > maintaining it part-time. > >> > > >> >> > >> >> It is something I would volunteer to help manage in order to learn > how other projects might use the same techniques. > >> >> > >> >> -- Andy > >> >> > >> >> On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney > wrote: > >> >>> > >> >>> > How you allocate the money to each other is something you can > debate privately > >> >>> > >> >>> On this, I'm sure that you could set up a lightweight virtual > >> >>> "timesheet" so you can put yourselves "on the clock" when you're > doing > >> >>> project maintenance work (there are many of these online, I just > read > >> >>> about https://www.clockspot.com/ recently) to make time reporting a > >> >>> bit more accurate > >> >>> > >> >>> On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney > wrote: > >> >>> > > >> >>> > Personally, I would recommend putting most of the money in your > own > >> >>> > pockets. 
The whole idea of Tidelift (as I understand it) is for > the > >> >>> > individuals doing work that is of importance to project users (to > whom > >> >>> > Tidelift is providing indemnification and "insurance" against > defects) > >> > > >> > > >> > Actually that's only partially true. Tidelift is paying for very > specific things, that allow them to do aggregated reporting on licensing, > dependencies, security vulnerabilities, release streams & release docs, > etc. - basically the stuff that helps large corporations do due diligence > and management of a large software stack. > >> > > >> > It is explicitly out of scope to work on bugs or enhancements in the > NumFOCUS-Tidelift agreement (and working on particular technical items was > never their intention). So "insurance against defects" isn't part of this, > except in a very abstract sense of making the project healthier and > therefore reducing the risk of it being abandoned or a lot more buggy on > the many-year time scale. > >> > > >> > Cheers, > >> > Ralf > >> > > >> > > >> >>> > to get paid for their labor. So I think the most honest way to > use the > >> >>> > money is to put it in your respective bank accounts. If you've > getting > >> >>> > a little bit of money to spend on yourself, doesn't that make > doing > >> >>> > the maintenance work a bit less thankless? If you don't pay > >> >>> > yourselves, I think it actually "breaks" Tidelift's pitch to > customers > >> >>> > which is that open source projects need to have a higher fraction > of > >> >>> > compensated maintenance and support work than they do now. > >> >>> > > >> >>> > How you allocate the money to each other is something you can > debate privately > >> >>> > > >> >>> > On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche > >> >>> > wrote: > >> >>> > > > >> >>> > > > >> >>> > > > >> >>> > > Op di 11 jun. 
2019 om 15:31 schreef Ralf Gommers < > ralf.gommers at gmail.com>: > >> >>> > >> > >> >>> > >> > >> >>> > >> > >> >>> > >> On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger < > tom.augspurger88 at gmail.com> wrote: > >> >>> > >>> > >> >>> > >>> > >> >>> > >>> > >> >>> > >>> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via Pandas-dev < > pandas-dev at python.org> wrote: > >> >>> > >>>> > >> >>> > >>>> Just some counterpoints to consider: > >> >>> > >>>> > >> >>> > >>>> - $ 3,000 a month isn?t really that much, and if it?s just a > number that a well-funded company chose for us chances are they are > benefiting from it way more than we are > >> >>> > >> > >> >>> > >> > >> >>> > >> "it's not really that much" is something I don't agree with. > It doesn't employ someone, but it's enough to pay for things like developer > meetups, hiring an extra GSoC student if a good one happens to come along, > paying a web dev for a full redesign of the project website, etc. Each of > those things is in the $5,000 - %15,000 range, and it's _very_ nice to be > able to do them without having to look for funding first. > >> >>> > >> > >> >>> > >> Tidelift is a small (now ~25 employees) company by the way, > and they have a real understanding of the open source sustainability issues > and seem dedicated to helping fix it. > >> >>> > >> > >> >>> > >>>> - There is no such thing as free money; we have to consider > how to account for and actually manage it (perhaps mitigated somewhat by > NumFocus) > >> >>> > >>> > >> >>> > >>> > >> >>> > >>> Perhaps Ralph can share how this has gone for NumPy. I > imagine it's not too work on their end, thanks to NumFOCUS. > >> >>> > >> > >> >>> > >> > >> >>> > >> NumFOCUS handles receiving the money and associated admin. As > the project you'll be responsible for the setup and ongoing tasks. For > NumPy and SciPy I have done those tasks. It's a fairly minimal amount of > work: https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed. 
> The main one was dealing with GitHub not recognizing our license, and you > don't have that issue for Pandas (it's reported correctly as BSD-3 in the > UI at https://github.com/pandas-dev/pandas). > >> >>> > >> > >> >>> > >> So it's probably a day of work for one person, to get familiar > with the interface, check dependencies, release streams, paste in release > notes, etc. And then ongoing maybe one or a couple of hours a month. So far > it's been a much more effective way of spending time than, for example, > grant writing. > >> >>> > >> > >> >>> > >>> > >> >>> > >>>> > >> >>> > >>>> - Advertising and ties to a corporate sponsorship may weaken > the brand of pandas; at that point we may lose some creditability as open > source volunteers > >> >>> > >>> > >> >>> > >>> > >> >>> > >>> Anecdotally, I don't think that's how the community views > Tidelift. My perception (from Twitter, blogs / comments) is that it's been > well received. > >> >>> > >> > >> >>> > >> > >> >>> > >> Agree, the feedback I've seen is all quite positive. > >> >>> > > > >> >>> > > > >> >>> > > Additionally, I don't think there is any "advertisement" > involved, at least not in the classical sense of adding adds for > third-party companies in a side bar to our website for which we get money. > Of course we will need to mention Tidelift in some way, e.g. in our > sponsors / institutional partners section, but we already do that for some > other companies as well (that employ core devs). > >> >>> > > > >> >>> > >> > >> >>> > >> > >> >>> > >>> > >> >>> > >>>> > >> >>> > >>>> - We don?t (AFAIK) have a plan on how to spend or allocate it > >> >>> > >>>> > >> >>> > >>>> Not totally against it but perhaps the last point above is > the main sticking one. Do we have any idea how much we?d actually pocket > out of the $ 3k they offer us and subsequently what we would do with it? > Cover travel expenses? Support PyData conferences? Scholarships? 
> >> >>> > >>> > >> >>> > >>> > >> >>> > >>> Agreed that we should set a purpose for this money (though, I > have no objection to collecting while we set that dedicated purpose). > >> >>> > >> > >> >>> > >> > >> >>> > > Indeed we need to discuss this, but I don't think we already > need to know *exactly* what we want to do with it before setting up a > contract with Tidelift. It's good for me to alraedy start discussing it > now, but maybe in a separate thread? > >> >>> > > > >> >>> > >> > >> >>> > >> For NumPy and SciPy we haven't earmarked the funds yet. It's > nice to build up a buffer first. One thing I'm thinking of is that we're > participating in Google Season of Docs, and are getting more high quality > applicants than Google will accept. So we could pay one or two tech writers > from the funds. Our website and high level docs (tutorial, restructuring of > all docs to guide users better) sure could use it:) > >> >>> > >> > >> >>> > >> My abstract advice would be: pay for things that require money > (like a dev meeting) or don't get done for free. Don't pay for writing code > unless the case is extremely compelling, because that'll be a drop in the > bucket. > >> >>> > >> > >> >>> > >> Cheers, > >> >>> > >> Ralf > >> >>> > >> > >> >>> > >> > >> >>> > >>> > >> >>> > >>>> > >> >>> > >>>> - Will > >> >>> > >>>> > >> >>> > >>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers < > ralf.gommers at gmail.com> wrote: > >> >>> > >>>> > >> >>> > >>>> > >> >>> > >>>> > >> >>> > >>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >> >>> > >>>>> > >> >>> > >>>>> The current page about pandas ( > https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 dollar a > month (but I am not fully sure this is what is already available from their > current subscribers, or if it is a prospect). > >> >>> > >>>> > >> >>> > >>>> > >> >>> > >>>> It's not just a prospect, that's what you should/will get. 
> NumPy and SciPy get the listed amounts too. > >> >>> > >>>> > >> >>> > >>>> Agreed that the NumPy amount is not that much. The amount > gets determined automatically; it's some combination of customer interest, > dependency analysis and size of the API surface. > >> >>> > >>>> > >> >>> > >>>> The current amounts are: > >> >>> > >>>> NumPy: $1000 > >> >>> > >>>> SciPy: $2500 > >> >>> > >>>> Pandas: $3000 > >> >>> > >>>> Matplotlib: n.a. > >> >>> > >>>> Scikit-learn: $1500 > >> >>> > >>>> Scikit-image: $50 > >> >>> > >>>> Statsmodels: $50 > >> >>> > >>>> > >> >>> > >>>> So there's an element of randomness, but the results are not > completely surprising I think. The four libraries that get on the order of thousands > of dollars are the ones that large corporations are going to have the > highest interest in. > >> >>> > >>>> > >> >>> > >>>> Cheers, > >> >>> > >>>> Ralf > >> >>> > >>>> > >> >>> > >>>>> > >> >>> > >>>>> > >> >>> > >>>>> Op za 8 jun. 2019 om 22:54 schreef William Ayd < > william.ayd at icloud.com>: > >> >>> > >>>>>> > >> >>> > >>>>>> What is the minimum amount we are asking for? The $1,000 a > month for NumPy seems rather low and I thought previous emails had > something in the range of $3k a month. > >> >>> > >>>>>> > >> >>> > >>>>>> I don't think we necessarily need or would be that much > improved by $12k per year, so would rather aim higher if we are going to do > this. > >> >>> > >>>>>> > >> >>> > >>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >> >>> > >>>>>> > >> >>> > >>>>>> Hi all, > >> >>> > >>>>>> > >> >>> > >>>>>> We discussed this on the last dev chat, but putting it on > the mailing list for those who were not present: we are planning to contact > Tidelift to enter into a sponsor agreement for Pandas.
> >> >>> > >>>>>> > >> >>> > >>>>>> The idea is to follow what NumPy (and recently also SciPy) > did to have an agreement between Tidelift and NumFOCUS instead of an > individual maintainer (see their announcement mail: > https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html > ). > >> >>> > >>>>>> Blog with overview about Tidelift: > https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift > . > >> >>> > >>>>>> > >> >>> > >>>>>> We didn't yet discuss what to do specifically with those > funds; that should still be discussed in the future. > >> >>> > >>>>>> > >> >>> > >>>>>> Cheers, > >> >>> > >>>>>> Joris > >> >>> > >>>>>> _______________________________________________ > >> >>> > >>>>>> Pandas-dev mailing list > >> >>> > >>>>>> Pandas-dev at python.org > >> >>> > >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev > >> >>> > >>>>>> > >> >>> > >>>>>> > >> >>> > >>>>> _______________________________________________ > >> >>> > >>>>> Pandas-dev mailing list > >> >>> > >>>>> Pandas-dev at python.org > >> >>> > >>>>> https://mail.python.org/mailman/listinfo/pandas-dev > >> >>> > >>>> > >> >>> > >>>> > >> >>> > >>>> _______________________________________________ > >> >>> > >>>> Pandas-dev mailing list > >> >>> > >>>> Pandas-dev at python.org > >> >>> > >>>> https://mail.python.org/mailman/listinfo/pandas-dev > >> >>> > >> > >> >>> > >> _______________________________________________ > >> >>> > >> Pandas-dev mailing list > >> >>> > >> Pandas-dev at python.org > >> >>> > >> https://mail.python.org/mailman/listinfo/pandas-dev > >> >>> > > > >> >>> > > _______________________________________________ > >> >>> > > Pandas-dev mailing list > >> >>> > > Pandas-dev at python.org > >> >>> > > https://mail.python.org/mailman/listinfo/pandas-dev > >> >>> _______________________________________________ > >> >>> Pandas-dev mailing list > >> >>> Pandas-dev at python.org > >> >>> 
https://mail.python.org/mailman/listinfo/pandas-dev > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.augspurger88 at gmail.com Tue Jun 11 22:08:00 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Tue, 11 Jun 2019 21:08:00 -0500 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: One general question, motivated by Joris' same concern about the future simplified BlockManager: why do block-based, rather than column-based, ops require 2D Extension Arrays? You say > by making DataFrame arithmetic ops operate column-by-column, dispatching to > the Series implementations. Could we instead dispatch both Series and DataFrame ops to Block ops (which then do the op on the ndarray or dispatch to the EA)? If I understand your proposal correctly, then you still have the general DataFrame -> Block -> Array nesting doll. It seems like that should work equally well with our current mix of 2-D and 1-D blocks. So while I agree that Blocks being backed by a maybe 1D / maybe 2D array causes no end of headaches, I don't see why block-based ops need 2D EAs (though I'm not especially familiar with this area; I could easily be missing something basic). - Tom On Tue, Jun 11, 2019 at 5:25 PM Stephan Hoyer wrote: > Indeed, it's worth considering if perhaps it would be OK to have a > performance regression for very wide dataframes instead. > > With regards to xarray, 2D extension arrays are interesting but still not > particularly helpful. We would still need a wrapper to make them fully N-D, > which we need for our data model. > > On Tue, Jun 11, 2019 at 6:18 PM Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >> Hi Brock, >> >> Thanks a lot for starting this discussion and the detailed proposal! 
>> >> I will try to look at it in more detail tomorrow, but one general remark: >> from time to time, we talked about "getting rid of the BlockManager" or >> "simplifying the BlockManager" (although I am not sure if there is any >> specific github issue about it, might be from in-person discussions). One >> of the interpretations of that (or at least how I understood those >> discussions) was to get away from the 2D block-based internals, and go to a >> simpler "table as collection of 1D arrays" model. This would also enable a >> simplification of the internals / BlockManager and many of the other items >> you mention. >> >> So I think we should at least compare a more detailed version of what I >> described above against your proposal. If we would want to go in that >> direction long term, I am not sure extensive work on the current 2D >> blocks-based BlockManager is worth our time. >> >> Joris >> >> Op di 11 jun. 2019 om 22:38 schreef Brock Mendel > >: >> >>> I've been working on arithmetic/comparison bugs and more recently on >>> performance problems caused by fixing some of those bugs. After trying >>> less-invasive approaches, I've concluded a fairly big fix is called for. >>> This is an RFC for that proposed fix. >>> >>> ------ >>> In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by >>> making DataFrame arithmetic ops operate column-by-column, dispatching to >>> the Series implementations. This led to a significant performance hit for >>> operations on DataFrames with many columns (#24990, #26061). >>> >>> To restore the lost performance, we need to have these operations take >>> place >>> at the Block level. To prevent DataFrame behavior from diverging from >>> Series >>> behavior (again), we need to retain a single shared implementation. >>> >>> This is a proposal for how to meet these two needs. 
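As a toy illustration of the performance trade-off described in the quoted proposal (the helper functions here are hypothetical sketches, not pandas internals), contrast per-column dispatch with a single operation on the underlying 2D ndarray:

```python
import numpy as np
import pandas as pd

# Toy sketch, not pandas internals: contrast the two dispatch strategies
# on a wide DataFrame.
df = pd.DataFrame(np.random.randn(100, 1000))

def columnwise_add(frame, other):
    # Column-by-column: one Series op (with all its per-op overhead)
    # for each of the 1000 columns.
    return pd.DataFrame({col: frame[col] + other for col in frame.columns})

def blockwise_add(frame, other):
    # Block-level: a single vectorized op on the underlying 2D ndarray.
    return pd.DataFrame(frame.to_numpy() + other,
                        index=frame.index, columns=frame.columns)

# Both give the same result; the block-level version does one numpy
# operation instead of one per column, which is the gap discussed in
# #24990 / #26061.
assert columnwise_add(df, 1).equals(blockwise_add(df, 1))
```

Timing the two with `%timeit` on a wide frame shows the per-column overhead the thread is concerned about.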
>>> >>> Proposal: >>> - Allow EA to support 2D arrays >>> - Use PandasArray to back Block subclasses currently backed by ndarray >>> - Implement arithmetic and comparison ops directly on PandasArray, then >>> have Series, DataFrame, and Index ops pass through to the PandasArray >>> implementations. >>> >>> Fixes: >>> - Performance degradation in DataFrame ops (#24990, #26061) >>> - The last remaining inconsistencies between Index and Series ops >>> (#19322, #18824) >>> - Most of the xfailing arithmetic tests >>> - #22120: Transposing dataframe loses dtype and ExtensionArray >>> - #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has >>> no reshape >>> - #23925 DataFrame Quantile Broken with Datetime Data >>> >>> Other Upsides: >>> - Series constructor could dispatch to pd.array, de-duplicating a lot of >>> code. >>> - Easier to move to Arrow backend if Blocks are numpy-naive. >>> - Make EA closer to a drop-in replacement for np.ndarray, necessary if >>> we want e.g. xarray to find them directly useful (#24716, #24583) >>> - Block/BlockManager simplifications, see below. >>> >>> Downsides: >>> - Existing constructors assume 1D >>> - Existing downstream authors assume 1D >>> - Reduction ops (of which there aren't many) don't have axis kwarg ATM >>> - But for PandasArray they just pass through to nanops, which already >>> have+test the axis kwargs >>> - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one >>> implementing the reductions and am OK with this extra complication. >>> >>> Block Simplifications: >>> - Blocks have three main attributes: values, mgr_locs, and ndim >>> - ndim is _usually_ the same as values.ndim, the exceptions being for >>> cases where type(values) is restricted to 1D >>> - Without these restrictions, we can get rid of: >>> - Block.ndim, associated kludgy ndim-checking code >>> - numerous can-this-be-reshaped/transposed checks and special cases >>> in Block and BlockManager code (which are buggy anyway, e.g. 
#23925) >>> - With ndim gone, we can then get rid of mgr_locs! >>> - The blocks themselves never use mgr_locs except when passing to >>> their own constructors. >>> - mgr_locs makes _much_ more sense as an attribute of the BlockManager >>> - With mgr_locs gone, Block becomes just a thin wrapper around an EA >>> >>> Implementation Strategy: >>> - Remove the 1D restriction >>> - Fairly small tweak, EA subclass must define `shape` instead of >>> `__len__`; other attrs defined in terms of shape. >>> - Define `transpose`, `T`, `reshape`, and `ravel` >>> - With this done, several tasks can proceed in parallel: >>> - simplifications in core.internals, as special-cases for 1D-only can >>> be removed >>> - implement and test arithmetic ops on PandasArray >>> - back Blocks with PandasArray >>> - back Index (and numeric subclasses) with PandasArray >>> - Change DataFrame, Series, Index ops to pass through to underlying >>> Blocks/PandasArrays >>> _______________________________________________ >>> Pandas-dev mailing list >>> Pandas-dev at python.org >>> https://mail.python.org/mailman/listinfo/pandas-dev >>> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From william.ayd at icloud.com Wed Jun 12 08:53:22 2019 From: william.ayd at icloud.com (William Ayd) Date: Wed, 12 Jun 2019 08:53:22 -0400 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: I'm wary to expand operations done at the Block level. As a core developer for over a year now, I've done zero work with blocks and I think they definitely come at an extra development / maintenance cost. 
I think wide DataFrames are the exception rather than the norm so it's probably not worth code to eke out a 10% performance boost for those (I'm taking that figure from one of your comments in #24990). - Will > On Jun 11, 2019, at 10:08 PM, Tom Augspurger > wrote: > > One general question, motivated by Joris' same concern about the future > simplified BlockManager: why do block-based, rather than column-based, ops > require 2D Extension Arrays? You say > > > by making DataFrame arithmetic ops operate column-by-column, dispatching to > > the Series implementations. > > Could we instead dispatch both Series and DataFrame ops to Block ops (which then > do the op on the ndarray or dispatch to the EA)? If I understand your proposal > correctly, then you still have the general DataFrame -> Block -> Array nesting > doll. It seems like that should work equally well with our current mix of 2-D > and 1-D blocks. > > So while I agree that Blocks being backed by a maybe 1D / maybe 2D array causes > no end of headaches, I don't see why block-based ops need 2D EAs (though I'm not > especially familiar with this area; I could easily be missing something basic). > > - Tom > > On Tue, Jun 11, 2019 at 5:25 PM Stephan Hoyer > wrote: > Indeed, it's worth considering if perhaps it would be OK to have a performance regression for very wide dataframes instead. > > With regards to xarray, 2D extension arrays are interesting but still not particularly helpful. We would still need a wrapper to make them fully N-D, which we need for our data model. > > On Tue, Jun 11, 2019 at 6:18 PM Joris Van den Bossche > wrote: > Hi Brock, > > Thanks a lot for starting this discussion and the detailed proposal! > > I will try to look at it in more detail tomorrow, but one general remark: from time to time, we talked about "getting rid of the BlockManager" or "simplifying the BlockManager" (although I am not sure if there is any specific github issue about it, might be from in-person discussions). 
One of the interpretations of that (or at least how I understood those discussions) was to get away from the 2D block-based internals, and go to a simpler "table as collection of 1D arrays" model. This would also enable a simplification of the internals / BlockManager and many of the other items you mention. > > So I think we should at least compare a more detailed version of what I described above against your proposal. If we would want to go in that direction long term, I am not sure extensive work on the current 2D blocks-based BlockManager is worth our time. > > Joris > > Op di 11 jun. 2019 om 22:38 schreef Brock Mendel >: > I've been working on arithmetic/comparison bugs and more recently on performance problems caused by fixing some of those bugs. After trying less-invasive approaches, I've concluded a fairly big fix is called for. This is an RFC for that proposed fix. > > ------ > In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by making DataFrame arithmetic ops operate column-by-column, dispatching to the Series implementations. This led to a significant performance hit for operations on DataFrames with many columns (#24990, #26061). > > To restore the lost performance, we need to have these operations take place > at the Block level. To prevent DataFrame behavior from diverging from Series > behavior (again), we need to retain a single shared implementation. > > This is a proposal for how to meet these two needs. > > Proposal: > - Allow EA to support 2D arrays > - Use PandasArray to back Block subclasses currently backed by ndarray > - Implement arithmetic and comparison ops directly on PandasArray, then have Series, DataFrame, and Index ops pass through to the PandasArray implementations. 
> > Fixes: > - Performance degradation in DataFrame ops (#24990, #26061) > - The last remaining inconsistencies between Index and Series ops (#19322, #18824) > - Most of the xfailing arithmetic tests > - #22120: Transposing dataframe loses dtype and ExtensionArray > - #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has no reshape > - #23925 DataFrame Quantile Broken with Datetime Data > > Other Upsides: > - Series constructor could dispatch to pd.array, de-duplicating a lot of code. > - Easier to move to Arrow backend if Blocks are numpy-naive. > - Make EA closer to a drop-in replacement for np.ndarray, necessary if we want e.g. xarray to find them directly useful (#24716, #24583) > - Block/BlockManager simplifications, see below. > > Downsides: > - Existing constructors assume 1D > - Existing downstream authors assume 1D > - Reduction ops (of which there aren't many) don't have axis kwarg ATM > - But for PandasArray they just pass through to nanops, which already have+test the axis kwargs > - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one implementing the reductions and am OK with this extra complication. > > Block Simplifications: > - Blocks have three main attributes: values, mgr_locs, and ndim > - ndim is _usually_ the same as values.ndim, the exceptions being for cases where type(values) is restricted to 1D > - Without these restrictions, we can get rid of: > - Block.ndim, associated kludgy ndim-checking code > - numerous can-this-be-reshaped/transposed checks and special cases in Block and BlockManager code (which are buggy anyway, e.g. #23925) > - With ndim gone, we can then get rid of mgr_locs! > - The blocks themselves never use mgr_locs except when passing to their own constructors. 
> - mgr_locs makes _much_ more sense as an attribute of the BlockManager > - With mgr_locs gone, Block becomes just a thin wrapper around an EA > > Implementation Strategy: > - Remove the 1D restriction > - Fairly small tweak, EA subclass must define `shape` instead of `__len__`; other attrs define in terms of shape. > - Define `transpose`, `T`, `reshape`, and `ravel` > - With this done, several tasks can proceed in parallel: > - simplifications in core.internals, as special-cases for 1D-only can be removed > - implement and test arithmetic ops on PandasArray > - back Blocks with PandasArray > - back Index (and numeric subclasses) with PandasArray > - Change DataFrame, Series, Index ops to pass through to underlying Blocks/PandasArrays > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbrockmendel at gmail.com Wed Jun 12 10:46:37 2019 From: jbrockmendel at gmail.com (Brock Mendel) Date: Wed, 12 Jun 2019 09:46:37 -0500 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: TL;DR: > So while I agree that Blocks being backed by a maybe 1D / maybe 2D array causes no end of headaches For readers who don't find the performance issue compelling, the bugs and complexity this addresses should be compelling. 
-------- > Could we instead dispatch both Series and DataFrame ops to Block ops (which then do the op on the ndarray or dispatch to the EA)? @TomAugspurger Yes, though as mentioned in the OP, my attempts so far to make this work have failed. This suggestion boils down to effectively implementing these ops on Block, which is the opposite of the direction we want to be taking the Block classes. In terms of Separation of Concerns it makes much more sense for the array-like operations to be defined on a dedicated array class, in this case PandasArray. Moreover, implementing them on PandasArray gives us "for free" consistency between Series/DataFrame, Index, and PandasArray ops, whereas implementing them on Block gives only Series/DataFrame consistency. > 10% performance boost for those (I'm taking that figure from one of your comments in #24990). @WillAyd that comment referred to the cost of instantiating the DataFrame, not the arithmetic op. Earlier in that same comment I refer to the arithmetic op as being 10x slower, not 10% slower. > I've done zero work with blocks and I think they definitely come at an extra development / maintenance cost. I've done a bunch of work with blocks, mostly trying to get code _out_ of them. Ignore the entire performance issue: allowing EA to be 2D (heck, even restricted to (1, N) and (N, 1) would be enough!) would let us rip out so much (buggy) code I'll shed tears of joy. On Wed, Jun 12, 2019 at 7:53 AM William Ayd via Pandas-dev < pandas-dev at python.org> wrote: > I'm wary to expand operations done at the Block level. As a core developer > for over a year now, I've done zero work with blocks and I think they > definitely come at an extra development / maintenance cost. > > I think wide DataFrames are the exception rather than the norm so it's > probably not worth code to eke out a 10% performance boost for those (I'm > taking that figure from one of your comments in #24990). 
> > - Will > > On Jun 11, 2019, at 10:08 PM, Tom Augspurger > wrote: > > One general question, motivated by Joris' same concern about the future > simplified BlockManager: why does block-based, rather than column-based, > ops > require 2D Extension Arrays? You say > > > by making DataFrame arithmetic ops operate column-by-column, dispatching > to > > the Series implementations. > > Could we instead dispatch both Series and DataFrame ops to Block ops > (which then > do the op on the ndarray or dispatch to the EA)? If I understand your > proposal > correctly, then you still have the general DataFrame -> Block -> Array > nesting > doll. It seems like that should work equally well with our current mix of > 2-D > and 1-D blocks. > > So while I agree that Blocks being backed by a maybe 1D / maybe 2D array > causes > no end of headaches, I don't see why block-based ops need 2D EAs (though > I'm not > especially familiar with this area; I could easily be missing something > basic). > > - Tom > > On Tue, Jun 11, 2019 at 5:25 PM Stephan Hoyer wrote: > >> Indeed, it's worth considering if perhaps it would be OK to have a >> performance regression for very wide dataframes instead. >> >> With regards to xarray, 2D extension arrays are interesting but still not >> particularly helpful. We would still need a wrapper to make them fully N-D, >> which we need for our data model. >> >> On Tue, Jun 11, 2019 at 6:18 PM Joris Van den Bossche < >> jorisvandenbossche at gmail.com> wrote: >> >>> Hi Brock, >>> >>> Thanks a lot for starting this discussion and the detailed proposal! >>> >>> I will try to look at it in more detail tomorrow, but one general >>> remark: from time to time, we talked about "getting rid of the >>> BlockManager" or "simplifying the BlockManager" (although I am not sure if >>> there is any specific github issue about it, might be from in-person >>> discussions). 
One of the interpretations of that (or at least how I >>> understood those discussions) was to get away of the 2D block based >>> internals, and go to a simpler "table as collection of 1D arrays" model. >>> This would also enable a simplication of the internals / BlockManager and >>> many of the other items you mention. >>> >>> So I think we should at least compare a more detailed version of what I >>> described above against your proposal. As if we would want to go in that >>> direction long term, I am not sure extensive work on the current 2D >>> blocks-based BlockManager is worth our time. >>> >>> Joris >>> >>> Op di 11 jun. 2019 om 22:38 schreef Brock Mendel >> >: >>> >>>> I've been working on arithmetic/comparison bugs and more recently on >>>> performance problems caused by fixing some of those bugs. After trying >>>> less-invasive approaches, I've concluded a fairly big fix is called for. >>>> This is an RFC for that proposed fix. >>>> >>>> ------ >>>> In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by >>>> making DataFrame arithmetic ops operate column-by-column, dispatching to >>>> the Series implementations. This led to a significant performance hit for >>>> operations on DataFrames with many columns (#24990, #26061). >>>> >>>> To restore the lost performance, we need to have these operations take >>>> place >>>> at the Block level. To prevent DataFrame behavior from diverging from >>>> Series >>>> behavior (again), we need to retain a single shared implementation. >>>> >>>> This is a proposal for how meet these two needs. >>>> >>>> Proposal: >>>> - Allow EA to support 2D arrays >>>> - Use PandasArray to back Block subclasses currently backed by ndarray >>>> - Implement arithmetic and comparison ops directly on PandasArray, then >>>> have Series, DataFrame, and Index ops pass through to the PandasArray >>>> implementations. 
>>>> >>>> Fixes: >>>> - Performance degradation in DataFrame ops (#24990, #26061) >>>> - The last remaining inconsistencies between Index and Series ops >>>> (#19322, #18824) >>>> - Most of the xfailing arithmetic tests >>>> - #22120: Transposing dataframe loses dtype and ExtensionArray >>>> - #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has >>>> no reshape >>>> - #23925 DataFrame Quantile Broken with Datetime Data >>>> >>>> Other Upsides: >>>> - Series constructor could dispatch to pd.array, de-duplicating a lot >>>> of code. >>>> - Easier to move to Arrow backend if Blocks are numpy-naive. >>>> - Make EA closer to a drop-in replacement for np.ndarray, necessary if >>>> we want e.g. xarray to find them directly useful (#24716, #24583) >>>> - Block/BlockManager simplifications, see below. >>>> >>>> Downsides: >>>> - Existing constructors assume 1D >>>> - Existing downstream authors assume 1D >>>> - Reduction ops (of which there aren't many) don't have axis kwarg ATM >>>> - But for PandasArray they just pass through to nanops, which >>>> already have+test the axis kwargs >>>> - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one >>>> implementing the reductions and am OK with this extra complication. >>>> >>>> Block Simplifications: >>>> - Blocks have three main attributes: values, mgr_locs, and ndim >>>> - ndim is _usually_ the same as values.ndim, the exceptions being for >>>> cases where type(values) is restricted to 1D >>>> - Without these restrictions, we can get rid of: >>>> - Block.ndim, associated kludgy ndim-checking code >>>> - numerous can-this-be-reshaped/transposed checks and special cases >>>> in Block and BlockManager code (which are buggy anyway, e.g. #23925) >>>> - With ndim gone, we can then get rid of mgr_locs! >>>> - The blocks themselves never use mgr_locs except when passing to >>>> their own constructors. 
>>>> - mgr_locs makes _much_ more sense as an attribute of the >>>> BlockManager >>>> - With mgr_locs gone, Block becomes just a thin wrapper around an EA >>>> >>>> Implementation Strategy: >>>> - Remove the 1D restriction >>>> - Fairly small tweak, EA subclass must define `shape` instead of >>>> `__len__`; other attrs define in terms of shape. >>>> - Define `transpose`, `T`, `reshape`, and `ravel` >>>> - With this done, several tasks can proceed in parallel: >>>> - simplifications in core.internals, as special-cases for 1D-only >>>> can be removed >>>> - implement and test arithmetic ops on PandasArray >>>> - back Blocks with PandasArray >>>> - back Index (and numeric subclasses) with PandasArray >>>> - Change DataFrame, Series, Index ops to pass through to underlying >>>> Blocks/PandasArrays >>>> _______________________________________________ >>>> Pandas-dev mailing list >>>> Pandas-dev at python.org >>>> https://mail.python.org/mailman/listinfo/pandas-dev >>>> >>> _______________________________________________ >>> Pandas-dev mailing list >>> Pandas-dev at python.org >>> https://mail.python.org/mailman/listinfo/pandas-dev >>> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tom.augspurger88 at gmail.com Wed Jun 12 11:56:02 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Wed, 12 Jun 2019 10:56:02 -0500 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: On Wed, Jun 12, 2019 at 9:46 AM Brock Mendel wrote: > TL;DR: > > > So while I agree that Blocks being backed by a maybe 1D / maybe 2D array > causes no end of headaches > > For readers who don't find the performance issue compelling, the bugs and > complexity this addresses should be compelling. > > -------- > > > Could we instead dispatch both Series and DataFrame ops to Block ops > (which then > do the op on the ndarray or dispatch to the EA)? > > @TomAugspurger Yes, though as mentioned in the OP, my attempts so far to > make this work have failed. > > This suggestion boils down to effectively implementing these ops on Block, > which is the opposite of the direction we want to be taking the Block > classes. In terms of Separation of Concerns it makes much more sense for > the array-like operations to be defined on a dedicated array class, in this > case PandasArray. > I think we're in agreement here. Moreover, implementing them on PandasArray gives us "for free" consistency > between Series/DataFrame, Index, and PandasArray ops, whereas implementing > them on Block gives only Series/DataFrame consistency. > > > 10% performance boost for those (I?m taking that figure from one of your > comments in #24990). > > @WillAyd that comment referred to the cost of instantiating the DataFrame, > not the arithmetic op. Earlier in that same comment I refer to the > arithmetic op as being 10x slower, not 10% slower. > > > I?ve done zero work with blocks and I think they definitely come at an > extra development / maintenance cost. > > I've done a bunch of work with blocks, mostly trying to get code _out_ of > them. Ignore the entire performance issue: allowing EA to be 2D (heck, > even restricted to (1, N) and (N, 1) would be enough!) 
would let us rip out > so much (buggy) code I'll shed tears of joy. > > Stepping back a bit, I see two potential issues we'd like to solve 1. The current structure of - Container (dataframe, series, index) -> - Block (DataFrame / Series only) -> - Array (ndarray or EA) is bad for two reasons: first, Indexes don't have Blocks; this argues for putting more functionality on the Array, to share code between all the containers; second, Array can be an ndarray or an EA. They're different enough that EA isn't a drop-in replacement for ndarray. 2. Arrays being either 1D or 2D causes many issues. A few questions Q1: Do those two issues accurately capture your concerns as well? Q2: Can you clarify: with 2D EAs would *all* EAs stored within pandas be 2D internally (and Series / Index would squeeze before data gets back to the user)? Otherwise, I don't see how we get the internal simplification. Q3: What do you think about a simple, private PandasArray-like thing that *is* allowed to be 2D, and itself wraps a 2D ndarray? That solves my problem 1, but doesn't address problem 2. Tom > On Wed, Jun 12, 2019 at 7:53 AM William Ayd via Pandas-dev < > pandas-dev at python.org> wrote: > >> I?m wary to expand operations done at the Block level. As a core >> developer for over a year now, I?ve done zero work with blocks and I think >> they definitely come at an extra development / maintenance cost. >> >> I think wide DataFrames are the exception rather than the norm so it?s >> probably not worth code to eek out a 10% performance boost for those (I?m >> taking that figure from one of your comments in #24990). >> >> - Will >> >> On Jun 11, 2019, at 10:08 PM, Tom Augspurger >> wrote: >> >> One general question, motivated by Joris' same concern about the future >> simplified BlockManager: why does block-based, rather than column-based, >> ops >> require 2D Extension Arrays? 
You say >> >> > by making DataFrame arithmetic ops operate column-by-column, >> dispatching to >> > the Series implementations. >> >> Could we instead dispatch both Series and DataFrame ops to Block ops >> (which then >> do the op on the ndarray or dispatch to the EA)? If I understand your >> proposal >> correctly, then you still have the general DataFrame -> Block -> Array >> nesting >> doll. It seems like that should work equally well with our current mix of >> 2-D >> and 1-D blocks. >> >> So while I agree that Blocks being backed by a maybe 1D / maybe 2D array >> causes >> no end of headaches, I don't see why block-based ops need 2D EAs (though >> I'm not >> especially familiar with this area; I could easily be missing something >> basic). >> >> - Tom >> >> On Tue, Jun 11, 2019 at 5:25 PM Stephan Hoyer wrote: >> >>> Indeed, it's worth considering if perhaps it would be OK to have a >>> performance regression for very wide dataframes instead. >>> >>> With regards to xarray, 2D extension arrays are interesting but still >>> not particularly helpful. We would still need a wrapper to make them fully >>> N-D, which we need for our data model. >>> >>> On Tue, Jun 11, 2019 at 6:18 PM Joris Van den Bossche < >>> jorisvandenbossche at gmail.com> wrote: >>> >>>> Hi Brock, >>>> >>>> Thanks a lot for starting this discussion and the detailed proposal! >>>> >>>> I will try to look at it in more detail tomorrow, but one general >>>> remark: from time to time, we talked about "getting rid of the >>>> BlockManager" or "simplifying the BlockManager" (although I am not sure if >>>> there is any specific github issue about it, might be from in-person >>>> discussions). One of the interpretations of that (or at least how I >>>> understood those discussions) was to get away from the 2D block-based >>>> internals, and go to a simpler "table as collection of 1D arrays" model.
>>>> This would also enable a simplification of the internals / BlockManager and >>>> many of the other items you mention. >>>> >>>> So I think we should at least compare a more detailed version of what I >>>> described above against your proposal. Because if we want to go in that >>>> direction long term, I am not sure extensive work on the current 2D >>>> blocks-based BlockManager is worth our time. >>>> >>>> Joris >>>> >>>> Op di 11 jun. 2019 om 22:38 schreef Brock Mendel < >>>> jbrockmendel at gmail.com>: >>>> >>>>> I've been working on arithmetic/comparison bugs and more recently on >>>>> performance problems caused by fixing some of those bugs. After trying >>>>> less-invasive approaches, I've concluded a fairly big fix is called for. >>>>> This is an RFC for that proposed fix. >>>>> >>>>> ------ >>>>> In 0.24.0 we fixed some arithmetic bugs in DataFrame operations by >>>>> making DataFrame arithmetic ops operate column-by-column, dispatching to >>>>> the Series implementations. This led to a significant performance hit for >>>>> operations on DataFrames with many columns (#24990, #26061). >>>>> >>>>> To restore the lost performance, we need to have these operations take >>>>> place >>>>> at the Block level. To prevent DataFrame behavior from diverging from >>>>> Series >>>>> behavior (again), we need to retain a single shared implementation. >>>>> >>>>> This is a proposal for how to meet these two needs. >>>>> >>>>> Proposal: >>>>> - Allow EA to support 2D arrays >>>>> - Use PandasArray to back Block subclasses currently backed by ndarray >>>>> - Implement arithmetic and comparison ops directly on PandasArray, >>>>> then have Series, DataFrame, and Index ops pass through to the PandasArray >>>>> implementations.
>>>>> >>>>> Fixes: >>>>> - Performance degradation in DataFrame ops (#24990, #26061) >>>>> - The last remaining inconsistencies between Index and Series ops >>>>> (#19322, #18824) >>>>> - Most of the xfailing arithmetic tests >>>>> - #22120: Transposing dataframe loses dtype and ExtensionArray >>>>> - #24600 BUG: DataFrame[Sparse] quantile fails because SparseArray has >>>>> no reshape >>>>> - #23925 DataFrame Quantile Broken with Datetime Data >>>>> >>>>> Other Upsides: >>>>> - Series constructor could dispatch to pd.array, de-duplicating a lot >>>>> of code. >>>>> - Easier to move to Arrow backend if Blocks are numpy-naive. >>>>> - Make EA closer to a drop-in replacement for np.ndarray, necessary if >>>>> we want e.g. xarray to find them directly useful (#24716, #24583) >>>>> - Block/BlockManager simplifications, see below. >>>>> >>>>> Downsides: >>>>> - Existing constructors assume 1D >>>>> - Existing downstream authors assume 1D >>>>> - Reduction ops (of which there aren't many) don't have axis kwarg ATM >>>>> - But for PandasArray they just pass through to nanops, which >>>>> already have+test the axis kwargs >>>>> - For DatetimeArray/TimedeltaArray/PeriodArray, I'm the one >>>>> implementing the reductions and am OK with this extra complication. >>>>> >>>>> Block Simplifications: >>>>> - Blocks have three main attributes: values, mgr_locs, and ndim >>>>> - ndim is _usually_ the same as values.ndim, the exceptions being for >>>>> cases where type(values) is restricted to 1D >>>>> - Without these restrictions, we can get rid of: >>>>> - Block.ndim, associated kludgy ndim-checking code >>>>> - numerous can-this-be-reshaped/transposed checks and special cases >>>>> in Block and BlockManager code (which are buggy anyway, e.g. #23925) >>>>> - With ndim gone, we can then get rid of mgr_locs! >>>>> - The blocks themselves never use mgr_locs except when passing to >>>>> their own constructors. 
>>>>> - mgr_locs makes _much_ more sense as an attribute of the >>>>> BlockManager >>>>> - With mgr_locs gone, Block becomes just a thin wrapper around an EA >>>>> >>>>> Implementation Strategy: >>>>> - Remove the 1D restriction >>>>> - Fairly small tweak, EA subclass must define `shape` instead of >>>>> `__len__`; other attrs define in terms of shape. >>>>> - Define `transpose`, `T`, `reshape`, and `ravel` >>>>> - With this done, several tasks can proceed in parallel: >>>>> - simplifications in core.internals, as special-cases for 1D-only >>>>> can be removed >>>>> - implement and test arithmetic ops on PandasArray >>>>> - back Blocks with PandasArray >>>>> - back Index (and numeric subclasses) with PandasArray >>>>> - Change DataFrame, Series, Index ops to pass through to underlying >>>>> Blocks/PandasArrays >>>>> _______________________________________________ >>>>> Pandas-dev mailing list >>>>> Pandas-dev at python.org >>>>> https://mail.python.org/mailman/listinfo/pandas-dev >>>>> >>>> _______________________________________________ >>>> Pandas-dev mailing list >>>> Pandas-dev at python.org >>>> https://mail.python.org/mailman/listinfo/pandas-dev >>>> >>> _______________________________________________ >>> Pandas-dev mailing list >>> Pandas-dev at python.org >>> https://mail.python.org/mailman/listinfo/pandas-dev >>> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> >> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > -------------- next part -------------- An HTML attachment was scrubbed... 
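The performance claim that runs through this thread — per-column dispatch on a wide frame being roughly an order of magnitude slower than one vectorized op over a consolidated 2D block — can be illustrated outside pandas with plain numpy. This is a toy sketch under stated assumptions, not the actual pandas code path; the function names are invented for illustration, and the real slowdown (the "10x" from #24990) also involves Series construction overhead not modeled here.

```python
import numpy as np

def op_blockwise(values):
    # One vectorized call over the whole consolidated 2D block.
    return values + 1

def op_columnwise(values):
    # Column-by-column, mimicking dispatch to per-column (Series-like)
    # implementations; the per-column Python-level overhead dominates
    # when there are many short columns.
    return np.column_stack([values[:, i] + 1 for i in range(values.shape[1])])

# A "wide" frame: 10 rows, 1000 columns of a single dtype.
values = np.arange(10_000, dtype=np.float64).reshape(10, 1_000)

# Both paths produce the same result; they differ only in how much
# Python-level work happens per column.
assert np.array_equal(op_blockwise(values), op_columnwise(values))
```

Timing the two functions (e.g. with `timeit`) on such a wide array shows the per-column loop paying a fixed Python cost per column, which is the shape of the regression the proposal aims to undo.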
URL: From jeffreback at gmail.com Wed Jun 12 12:17:44 2019 From: jeffreback at gmail.com (Jeff Reback) Date: Wed, 12 Jun 2019 12:17:44 -0400 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: The reason we have Blocks in the first place is that performance on same-dtype data is much better with 2D containers. I don't think we can match that if we hold things as 1D and dispatch ops via Python. So longer term the solution is to use a table of 1D arrays and use pyarrow (as holder, with kernels to operate), or hold as 1D and use something like numba to perform the kernel operations. So we either need to keep the Blocks around as a way to hold things and do block operations (i.e. we actually hold things as 2D), or change to holding 1D as indicated above. Now we currently have a hybrid approach: numpy-array-backed blocks are 2D, while EA arrays are 1D. I fully agree this hybrid approach is, and has been, the cause of many issues. Our contract on EA is 1D and I agree that changing this is not a good idea (at least publicly). So here's another proposal (a bit half-baked, but...): You *could* build a single-dtype container that actually holds the 1D arrays themselves. Then you could put EA arrays and numpy arrays on the same footing. Meaning each 'Block' would be exactly the same.
- This would make operations the *same* across all 'Blocks', reducing complexity - We could simply take views on 2D numpy arrays to actually avoid a performance penalty of copying (as we construct from a 2D numpy array a lot); this causes some aggregation ops to be much slower than if we actually copy, but that has a cost too - Ops could be defined on EA & Pandas Arrays; these can then operate array-by-array (within a Block), or using numba we could implement the ops in a way that we can get a pretty big speedup for a particular kernel Jeff On Wed, Jun 12, 2019 at 11:56 AM Tom Augspurger wrote: > > > On Wed, Jun 12, 2019 at 9:46 AM Brock Mendel > wrote: > >> TL;DR: >> >> > So while I agree that Blocks being backed by a maybe 1D / maybe 2D >> array causes no end of headaches >> >> For readers who don't find the performance issue compelling, the bugs and >> complexity this addresses should be compelling. >> >> -------- >> >> > Could we instead dispatch both Series and DataFrame ops to Block ops >> (which then >> do the op on the ndarray or dispatch to the EA)? >> >> @TomAugspurger Yes, though as mentioned in the OP, my attempts so far to >> make this work have failed. >> >> This suggestion boils down to effectively implementing these ops on >> Block, which is the opposite of the direction we want to be taking the >> Block classes. In terms of Separation of Concerns it makes much more sense >> for the array-like operations to be defined on a dedicated array class, in >> this case PandasArray. >> > > I think we're in agreement here. > > Moreover, implementing them on PandasArray gives us "for free" consistency >> between Series/DataFrame, Index, and PandasArray ops, whereas implementing >> them on Block gives only Series/DataFrame consistency. >> >> > 10% performance boost for those (I'm taking that figure from one of >> your comments in #24990). >> >> @WillAyd that comment referred to the cost of instantiating the >> DataFrame, not the arithmetic op.
Earlier in that same comment I refer to >> the arithmetic op as being 10x slower, not 10% slower. >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jorisvandenbossche at gmail.com Wed Jun 12 16:41:49 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Wed, 12 Jun 2019 22:41:49 +0200 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: I find the idea of paying people (the core devs) for their maintenance work intriguing, but also see a lot of obstacles (the ones Ralf mentioned, what part of our volunteer time is considered maintenance work, who are active maintainers, do we tie certain expectations to rewarding money, ...?). I am not sure how it would work out in practice. In any case something to discuss, as we should also discuss other possible ways that we might want to spend the money if not paying the current maintainers. Just to be clear: even if we go with a NumFOCUS - Tidelift agreement, we can decide to pay ourselves instead of letting Tidelift directly do that. So that discussion can be done separately from deciding on going forward with NumFOCUS making a contract with Tidelift (I don't think I heard any pushback on that aspect itself, but just wanted to state this explicitly). Joris -------------- next part -------------- An HTML attachment was scrubbed... URL: From gfyoung17 at gmail.com Wed Jun 12 16:43:44 2019 From: gfyoung17 at gmail.com (G Young) Date: Wed, 12 Jun 2019 13:43:44 -0700 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: I second Joris' point here. 
Logistically, it seems easier to determine the details of where the $$$ goes AFTER it comes through from Tidelift to NUMFOCUS On Wed, Jun 12, 2019 at 1:42 PM Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > I find the idea of paying people (the core devs) for their maintenance > work intriguing, but also see a lot of obstacles (the ones Ralf mentioned, > what part of our volunteer time is considered maintenance work, who are > active maintainers, do we tie certain expectations to rewarding money, > ...?). I am not sure how it would work out in practice. In any case > something to discuss, as we should also discuss other possible ways that we > might want to spend the money if not paying the current maintainers. > > Just to be clear: even if we go with a NumFOCUS - Tidelift agreement, we > can decide to pay ourselves instead of letting Tidelift directly do that. > So that discussion can be done separately from deciding on going forward > with NumFOCUS making a contract with Tidelift (I don't think I heard any > pushback on that aspect itself, but just wanted to state this explicitly). > > Joris > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Wed Jun 12 16:55:52 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Wed, 12 Jun 2019 22:55:52 +0200 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: Op wo 12 jun. 2019 om 18:18 schreef Jeff Reback : > ... > So here's another proposal (a bit half-baked but....): > > You *could* build a single dtyped container that actually holds the 1D > arrays themselves). Then you could put EA arrays and numpy arrays on the > same footing. Meaning each > 'Block' would be exactly the same. 
> > - This would make operations the *same* across all 'Blocks', reducing > complexity > - We could simply take views on 2D numpy arrays to actually avoid a > performance penalty of copying (as we construct from a 2D numpy array > a lot); this causes some aggregation ops to be much slower than if we > actually copy, but that has a cost too > - Ops could be defined on EA & Pandas Arrays; these can then operate > array-by-array (within a Block), or using numba we could implement the ops > in a way that we can get a pretty big speedup for a particular kernel > How would this proposal avoid the above-mentioned performance implication of doing ops column-by-column? In general, I think we should try to do a few basic benchmarks on what the performance impact would be for some typical use cases when all ops are done column-by-column / all columns are stored as separate blocks (Jeff had a branch at some point that made this optional), to have a better idea of the (dis)advantages of the different proposals. Brock, can you give your thoughts about the idea of having _only_ 1D Blocks? That could also solve a lot of the complexity of the internals (no 1D vs 2D) and have many of the advantages you mentioned in the first email, but I don't think you really answered that aspect. Joris -------------- next part -------------- An HTML attachment was scrubbed... URL: From garcia.marc at gmail.com Thu Jun 13 10:18:51 2019 From: garcia.marc at gmail.com (Marc Garcia) Date: Thu, 13 Jun 2019 15:18:51 +0100 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: From my side, happy to receive funds from anyone as long as they don't expect anything unreasonable in exchange. And happy with the Tidelift conditions, sounds like a great deal to me. To me it makes sense to spend the money on grants, like the NumFOCUS grant.
Based on the GSoC experience, where nobody from the team was interested in mentoring junior people, I guess the grants would mainly be used by maintainers or senior developers. While we're still far from FTEs, I think the money we're discussing starts to be significant for people who could do freelance work,... Based on Wes's comments, I'd also be happy to give grants for work already done, if anyone would like to claim that. I think other formulas like giving the money to maintainers directly,... will be too complex/controversial. Regarding what is expected in return for the money, I think it's worth seeing what Quansight is doing. We can check with Travis the exact details, but my understanding is that companies can give funds to Quansight to get the things in the project roadmaps done. The roadmaps of the projects they support are here: https://www.quansight.com/projects I think having a "paid work" roadmap would be great, so as a team we define what the top priorities are, and if we find the right people we use the funds for the things that are not making progress through volunteer work. I think Wes, Travis and other people will have experience with this format, happy to follow their advice. Slightly unrelated, but we're getting some progress with the docs. We're building them in Azure (available now at https://dev.pandas.io) and getting close to building without warnings (and validating them in the CI), so we can speed up changes. After that I'll work on changing the theme, and will propose to have a "fancy" home page which IMO we can use to give credit to companies supporting the project. Like what we're doing at https://pandas.pydata.org/about.html, but more visible and hopefully with many more visits. You can see a proof of concept from some time ago here: http://dev.pandas.io/pandas-sphinx-theme/pr-datapythonista_base/.
If people are happy with the idea, we should probably define formally which companies are listed there (companies employing maintainers to work on pandas, big donors,..). On Wed, Jun 12, 2019 at 9:46 PM G Young wrote: > I second Joris' point here. Logistically, it seems easier to determine > the details of where the $$$ goes AFTER it comes through from Tidelift to > NumFOCUS > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jbrockmendel at gmail.com Thu Jun 13 13:19:46 2019 From: jbrockmendel at gmail.com (Brock Mendel) Date: Thu, 13 Jun 2019 10:19:46 -0700 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: > can you give your thoughts about the idea of having _only_ 1D Blocks? That could also solve a lot of the complexity of the internals (no 1D vs 2D) and have many of the advantages you mentioned in the first email, but I don't think you really answered that aspect. My read from the earlier email was that you were going to present a more detailed proposal for what this would look like. Going all-1D would solve the sometimes-1D-sometimes-2D problem, but I think it would cause real problems for transpose and reductions with axis=1 (I've seen it argued that these are not common use cases). It also wouldn't change the fact that we need BlockManager or something like it to do alignment. Getting array-like operations out of BlockManager/Block and into PandasArray could work with the all-1D idea. > 1. The current structure of [...] They're different enough that EA isn't a drop-in replacement for ndarray. > 2. Arrays being either 1D or 2D causes many issues. > Q1: Do those two issues accurately capture your concerns as well? Pretty much, yes. > Q2: Can you clarify: with 2D EAs would *all* EAs stored within pandas be 2D internally (and Series / Index would squeeze before data gets back to the user)? Otherwise, I don't see how we get the internal simplification. My thought was that only EAs backing Blocks inside DataFrames would get reshaped. Everything else would retain their existing dimensions. It isn't clear to me why you'd want 2D backing Index/Series, though I'm open to being convinced. Demonstrating the internal simplification may require a Proof of Concept. > Our contract on EA is 1D and I agree that changing this is not a good idea (at least publicly).
Another slightly hacky option would be to secretly allow EAs to be temporarily 2D while backing DataFrame Blocks, but restrict that to just (N, 1) or (1, N). That wouldn't do anything to address the arithmetic performance, but might let us de-kludge the 1D/2D code in core.internals. On Wed, Jun 12, 2019 at 1:56 PM Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > Op wo 12 jun. 2019 om 18:18 schreef Jeff Reback : > >> ... >> > So here's another proposal (a bit half-baked but....): >> >> You *could* build a single dtyped container that actually holds the 1D >> arrays themselves). Then you could put EA arrays and numpy arrays on the >> same footing. Meaning each >> 'Block' would be exactly the same. >> >> - This would make operations the *same* across all 'Blocks', reducing >> complexity >> - We could simply take views on 2D numpy arrays to actually avoid a >> performance penaltly of copying (as we can construct from a 2D numpy array >> a lot); this causes some aggregation ops to be much slower that if we >> actually copy, but that has a cost too >> - Ops could be defined on EA & Pandas Arrays; these can then operate >> array-by-array (within a Block), or using numba we could implement the ops >> in a way that we can get a pretty big speedup for a particular kernel >> > > How would this proposal avoid the above-mentioned performance implication > of doing ops column-by-column? > > In general, I think we should try to do a few basic benchmarks on what the > performance impact would be for some typical use cases when all ops are > done column-by-column / all columns are stored as separate blocks (Jeff had > a branch at some point that made this optional). To have a better idea of > the (dis)advantages for the different proposals. > > Brock, can you given your thoughts about the idea of having _only_ 1D > Blocks? 
That could also solve a lot of the complexity of the internals (no > 1D vs 2D) and have many of the advantages you mentioned in the first email, > but I don't think you really answered that aspect. > > Joris > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Thu Jun 13 13:28:59 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Thu, 13 Jun 2019 19:28:59 +0200 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: Op do 13 jun. 2019 om 19:19 schreef Brock Mendel : > > can you give your thoughts about the idea of having _only_ 1D Blocks? > That could also solve a lot of the complexity of the internals (no 1D vs > 2D) and have many of the advantages you mentioned in the first email, but I > don't think you really answered that aspect. > > My read from the earlier email was that you were going to present a more > detailed proposal for what this would look like. > A bit simplistic, but basically the same as your proposal but all 1D instead of all 2D: all blocks are 1D (solves inconsistency), use custom EA for the numpy based ones (the current PandasArray), ops are defined on the arrays and Series/Index/DataFrame dispatch to them (which also ensures that a column in a dataframe or a series will be handled consistently). The rest of your Block simplifications and Implementation strategy could (I think) more or less be translated to the above as well. > Going all-1D would solve the sometimes-1D-sometimes-2D problem, but I > think it would cause real problems for transpose and reductions with axis=1 > (I've seen it argued that these are not common use cases). > For a DataFrame with only a single dtype, those operations will certainly be a lot slower, but for heterogeneous maybe not that much. 
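[Editor's note: as a rough illustration of the kind of benchmark Joris is asking for, the sketch below (hypothetical code, not from pandas or this thread) applies the same arithmetic op block-wise to one consolidated 2D array and column-by-column to separate 1D arrays. The per-column layout pays a fixed Python/ufunc dispatch cost per column, which dominates for wide, single-dtype frames.]

```python
# Toy benchmark: block-wise vs column-by-column arithmetic on the same data.
# All names here are illustrative; this is not pandas internals.
import timeit
import numpy as np

n_rows, n_cols = 10_000, 100
block = np.random.randn(n_rows, n_cols)                # one consolidated 2D "block"
columns = [block[:, i].copy() for i in range(n_cols)]  # the all-1D layout

def op_blockwise():
    return block * 2 + 1

def op_columnwise():
    return [col * 2 + 1 for col in columns]

# The two layouts produce identical results; only call overhead differs.
expected = op_blockwise()
for i, col in enumerate(op_columnwise()):
    assert np.allclose(expected[:, i], col)

t_block = timeit.timeit(op_blockwise, number=50)
t_cols = timeit.timeit(op_columnwise, number=50)
print(f"block-wise: {t_block:.3f}s  column-wise: {t_cols:.3f}s")
```

The absolute gap depends on n_rows and n_cols; for long, narrow, mixed-dtype frames the per-column overhead is amortized, which is Joris's point about heterogeneous DataFrames.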
> It also wouldn't change the fact that we need BlockManager or something > like it to do alignment. Getting array-like operations out of > BlockManager/Block and into PandasArray could work with the all-1D idea. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.augspurger88 at gmail.com Thu Jun 13 14:29:51 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Thu, 13 Jun 2019 13:29:51 -0500 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: On Thu, Jun 13, 2019 at 12:20 PM Brock Mendel wrote: > > can you given your thoughts about the idea of having _only_ 1D Blocks? > That could also solve a lot of the complexity of the internals (no 1D vs > 2D) and have many of the advantages you mentioned in the first email, but I > don't think you really answered that aspect. > > My read from the earlier email was that you were going to present a more > detailed proposal for what this would look like. Going all-1D would solve > the sometimes-1D-sometimes-2D problem, but I think it would cause real > problems for transpose and reductions with axis=1 (I've seen it argued that > these are not common use cases). It also wouldn't change the fact that we > need BlockManager or something like it to do alignment. Getting array-like > operations out of BlockManager/Block and into PandasArray could work with > the all-1D idea. > > > 1. The current structure of [...] They're different enough that EA isn't > a drop-in replacement for ndarray. > > 2. Arrays being either 1D or 2D causes many issues. > > Q1: Do those two issues accurately capture your concerns as well? > > Pretty much, yes. > > > Q2: Can you clarify: with 2D EAs would *all* EAs stored within pandas be > 2D internally (and Series / Index would squeeze before data gets back to > the user)? Otherwise, I don't see how we get the internal simplification. > > My thought was that only EAs backing Blocks inside DataFrames would get > reshaped. 
Everything else would retain their existing dimensions. It > isn't clear to me why you'd want 2D backing Index/Series, though I'm open > to being convinced. > I think I was missing a subtle point; We'll still have a mix of 1-D and 2-D blocks under your proposal. What we *won't* have is cases where the Block.shape doesn't match the Block.values.shape? ``` In [9]: df = pd.DataFrame({"A": [1, 2], 'B': pd.array([1, 2], dtype='Int64')}) In [10]: df._data.blocks Out[10]: (IntBlock: slice(0, 1, 1), 1 x 2, dtype: int64, ExtensionBlock: slice(1, 2, 1), 1 x 2, dtype: Int64) In [11]: df._data.blocks[0].shape == df._data.blocks[0].values.shape Out[11]: True In [12]: df._data.blocks[1].shape == df._data.blocks[1].values.shape Out[12]: False ``` So assuming we want the laudable goal of Block.{ndim,shape} == Block.values.{ndim,shape}, we have two options 1. Allow BlockManager to store a mix of 1-D and 2-D blocks. 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just (N, 1) or (1, N)). Is that fair? Doing number 1 sounds really bad / difficult, right? Demonstrating the internal simplification may require a Proof of Concept. > > > Our contract on EA is 1D and I agree that changing this is not a good > idea (at least publicly). > > Another slightly hacky option would be to secretly allow EAs to be > temporarily 2D while backing DataFrame Blocks, but restrict that to just > (N, 1) or (1, N). That wouldn't do anything to address the arithmetic > performance, but might let us de-kludge the 1D/2D code in core.internals. > I would be curious to see this. And I think it would help with the *regression* in arithmetic performance, right? Since ndarray-backed blocks would be allowed to be (N, P)? > On Wed, Jun 12, 2019 at 1:56 PM Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >> Op wo 12 jun. 2019 om 18:18 schreef Jeff Reback : >> >>> ... 
>>> [...] 
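[Editor's note: Jeff's idea of "a single dtyped container that actually holds the 1D arrays themselves" might look roughly like the sketch below. The names (ColumnBlock, apply) and the structure are invented for illustration only and are not pandas internals.]

```python
# Hypothetical sketch: one block type holding per-column 1D arrays, so
# numpy-backed and EA-backed columns share the same code path, and ops
# run array-by-array within the block.
import numpy as np

class ColumnBlock:
    """One dtype, many columns, each column stored as a separate 1D array."""

    def __init__(self, arrays, locs):
        assert all(arr.ndim == 1 for arr in arrays)
        self.arrays = list(arrays)   # 1D arrays (can be views on a 2D block)
        self.locs = list(locs)       # column positions within the DataFrame

    def apply(self, func):
        # The same loop works whether arrays are ndarrays or EAs.
        return ColumnBlock([func(arr) for arr in self.arrays], self.locs)

base = np.arange(6.0).reshape(3, 2)
blk = ColumnBlock([base[:, 0], base[:, 1]], locs=[0, 1])  # views, no copy
doubled = blk.apply(lambda a: a * 2)
print(doubled.arrays[1])   # [ 2.  6. 10.]
```

Because the columns can be views on an existing 2D array, construction from a 2D ndarray stays cheap, at the cost Jeff notes for some aggregation ops.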
>> >> Joris >> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at pietrobattiston.it Thu Jun 13 14:58:28 2019 From: me at pietrobattiston.it (Pietro Battiston) Date: Thu, 13 Jun 2019 20:58:28 +0200 Subject: [Pandas-dev] Tidelift In-Reply-To: References: <57676A0D-A1E8-441B-B60D-AA64ADD9184B@icloud.com> Message-ID: <5fafd563dee86aa1907ec90faab4b04be6a1d135.camel@pietrobattiston.it> I agree with Jeff that having the money gives us options that we don't have otherwise. So I'm generally +1 on the Tidelift offer. As to how to spend it, I think - we can certainly create topic-specific grants for freelancers when it makes sense, we just do not want the pressure to necessarily create grants just because we have the money - for what is not spent on grants, I suspect a simple online sheet where each core dev each month states a) number of hours devoted to pandas and b) use for the money ("in my pocket" vs. "in personal Python-related tickets/expenses" vs. "in a fund for pandas expenses" - although the latter options would probably be the same from the project's side) would I think solve the hassle-profit tradeoff in the simplest way. Pietro Il giorno mar, 11/06/2019 alle 21.34 -0400, Jeff Reback ha scritto: > I > > One risk to be aware of is that if a high profile > > project like pandas take's TL's money and none of the maintainers > > pay > > themselves with it, then the monthly number may not have as much of > > a > > chance of increasing (since current or prospective TL customers may > > observe that the subscription dollars aren't being used in the way > > that is being pitched). 
> > I actually see the exact opposite here. A project of pandas stature > that decides to better the project is a pretty respectable goal. > > I believe we would be in the letter and more importantly the spirit > of Tidelift for the pandas project itself to take this burden & > receive the income. Having the project itself with the combined > force of multiple maintainers actually would be much more comforting > (from the customer's perspective), than a single maintainer (who may > not always be there). > > Furthermore, we could use these funds for the combined benefit of the > project, mainly I think for gatherings like the upcoming sprints. I > am not sure many of you know, but pandas has not actively solicited > *any* monies, and only received 2 largish contributions over the > years, which are the majority of our current funds. The tidelift > agreement looks to provide a stream of income which we currently do > not have. With an income stream we have options; without we don't. > > We can always decide to remunerate maintainers who contribute to this > effort, though, this should be a separate discussion. > > Jeff > > On Tue, Jun 11, 2019 at 5:10 PM Wes McKinney > wrote: > > On Tue, Jun 11, 2019 at 2:26 PM Andy Ray Terrel < > > andy.terrel at gmail.com> wrote: > > > > > > > > > > > > On Tue, Jun 11, 2019 at 12:51 PM Wes McKinney < > > wesmckinn at gmail.com> wrote: > > >> > > >> hi, > > >> > > >> On Tue, Jun 11, 2019 at 11:16 AM Ralf Gommers < > > ralf.gommers at gmail.com> wrote: > > >> > > > >> > > > >> > > > >> > On Tue, Jun 11, 2019 at 4:56 PM Andy Ray Terrel < > > andy.terrel at gmail.com> wrote: > > >> >> > > >> >> While the original lifter agreement was an individual > > contract, in our negotiations with Tidelift, NumFOCUS has > > explicitly sought a model that allows the project to split the > > money how they prefer. This was always Tidelift's intention, it was > > just faster and easier to scale to focus on paying individuals. 
> > >> > > > >> > > > >> > +1 the project decides for themselves is the intent and a good > > principle. > > >> > > > >> >> > > >> >> I do like the idea of paying for maintence work, I would > > recommend we set up folks as contractors with NumFOCUS rather than > > just pocketing money. It will give a lot more legal protection. > > Then if some folks don't want to take the cash you they can donate > > their time and be recognized as in-kind donations, which might have > > some tax deductions. > > >> > > > >> > > > >> > Keep in mind that this has a lot of potential issues. > > Examples: > > >> > 1. Who decides who gets paid, and how? The pandas repo has > > 1500+ contributors. Lots of potential for friction over small > > amount of $. > > >> > > >> More or less the _entire_ point of Tidelift is to incentivize > > people > > >> to do more maintenance work. I think it's worth at least > > attempting to > > >> use this money for its intended economic purpose. > > >> > > >> The maintainers are, as a first approximation, the ~10-15 active > > core > > >> members listed on > > >> > > >> https://github.com/pandas-dev/pandas-governance > > >> > > >> IMHO those are the people that should get paid (going forward) > > -- if > > >> contributors are more motivated to become core team members / > > >> maintainers as a result of the Tidelift money, then it has had > > the > > >> desired outcome. > > > > > > > > > I would suggest leaving the decision to the project core team > > with the project Numfocus committee to be the overseer of the > > implementation. > > > > > > > Yes, of course, that's the governance that we have in place. I am > > just > > stressing that we should try to honor the intent of the asset that > > is > > being purchased by Tidelift customers. 
Tidelift is telling their > > customers that the money they are paying is going to end up in the > > pockets of the project maintainers > > > > https://tidelift.com/about/lifter > > > > If the pandas core team wishes to deny themselves the income > > (which, > > divided up, isn't going to be a life-changing amount of money) > > that's > > their prerogative -- I just wanted to be clear about where I stand > > on > > it, and there's nothing immoral about wanting to be compensated for > > one's time (given how much volunteered time has already gone > > uncompensated). One risk to be aware of is that if a high profile > > project like pandas take's TL's money and none of the maintainers > > pay > > themselves with it, then the monthly number may not have as much of > > a > > chance of increasing (since current or prospective TL customers may > > observe that the subscription dollars aren't being used in the way > > that is being pitched). > > > > >> > > >> > > >> > 2. Many people have employment contracts, those typically > > forbid contracting on the side. So inherently unfair to distribute > > only to those who are in a position to accept the money. > > >> > > >> This is true -- at least Jeff and maybe others fall into this > > >> category. In such cases their "cut" of the maintenance funds can > > go > > >> into the communal fund to pay for other stuff > > >> > > > > > > Yes such accommodation will need to be worked out. > > > > > >> > > >> > 3. You're now introducing lots of extra paperwork and admin, > > both directly and indirectly (who wants to deal with the extra > > complications when filing your taxes?). > > >> > > >> Hopefully we're talking just a 1099 from NumFOCUS with a single > > number > > >> to type in, but I'm the wrong person to judge since my taxes are > > more > > >> complicated than most people's =) > > > > > > > > > Generally it is done that way for US based folks and for folks > > out of the US we tend to let them handle their own taxes. 
We would > > need to work that out. > > > > > > Additionally, as in all dealings with businesses, we do the extra > > paperwork for the other benefits such as limiting the liability of > > a maintainer. > > > > > >> > > >> > > >> > 4. It may create other weird social dynamics. E.g. if money is > > now directly coupled to a commit bit, that makes the "who do we > > give commit rights and when" a potentially more loaded question. > > >> > > >> I think this is where the honest self-reporting of time spent > > comes > > >> in. The goal is to increase the average number of maintainer > > hours per > > >> month/year. It's sort of like a crypto-mining pool, but for open > > >> source software maintenance =) Obviously maintainers are > > accountable > > >> to the rest of the core team to behave with integrity > > >> (professionalism, honesty, etc.) or they can be voted to be > > removed if > > >> they are found to be dishonest. > > > > > > > > >> > > >> > > > >> > > >> > And, dividing it into N chunks, the funding becomes nice beer > > money and a thank you for volunteering. Could be exactly what you'd > > prefer as a team. But that's imho more in line with the current > > version of Patreon or GitHub Sponsors rather then with what > > Tidelift is aiming for. > > >> > > > >> > I'd like the idea of "paying for maintenance" if there were > > enough money to employ people. But realistically, that will take > > many years. The Tidelift slogan on this is unrealistic for a > > project like Pandas where maintenance effort is many FTEs; it's > > perhaps feasible for your typical Javascript library that's popular > > but small enough for one person maintaining it part-time. > > >> > > > >> >> > > >> >> It is something I would volunteer to help manage in order to > > learn how other projects might use the same techniques. 
> > >> >> > > >> >> -- Andy > > >> >> > > >> >> On Tue, Jun 11, 2019 at 9:13 AM Wes McKinney < > > wesmckinn at gmail.com> wrote: > > >> >>> > > >> >>> > How you allocate the money to each other is something you > > can debate privately > > >> >>> > > >> >>> On this, I'm sure that you could set up a lightweight > > virtual > > >> >>> "timesheet" so you can put yourselves "on the clock" when > > you're doing > > >> >>> project maintenance work (there are many of these online, I > > just read > > >> >>> about https://www.clockspot.com/ recently) to make time > > reporting a > > >> >>> bit more accurate > > >> >>> > > >> >>> On Tue, Jun 11, 2019 at 9:09 AM Wes McKinney < > > wesmckinn at gmail.com> wrote: > > >> >>> > > > >> >>> > Personally, I would recommend putting most of the money in > > your own > > >> >>> > pockets. The whole idea of Tidelift (as I understand it) > > is for the > > >> >>> > individuals doing work that is of importance to project > > users (to whom > > >> >>> > Tidelift is providing indemnification and "insurance" > > against defects) > > >> > > > >> > > > >> > Actually that's only partially true. Tidelift is paying for > > very specific things, that allow them to do aggregated reporting on > > licensing, dependencies, security vulnerabilities, release streams > > & release docs, etc. - basically the stuff that helps large > > corporations do due diligence and management of a large software > > stack. > > >> > > > >> > It is explicitly out of scope to work on bugs or enhancements > > in the NumFOCUS-Tidelift agreement (and working on particular > > technical items was never their intention). So "insurance against > > defects" isn't part of this, except in a very abstract sense of > > making the project healthier and therefore reducing the risk of it > > being abandoned or a lot more buggy on the many-year time scale. > > >> > > > >> > Cheers, > > >> > Ralf > > >> > > > >> > > > >> >>> > to get paid for their labor. 
So I think the most honest > > way to use the > > >> >>> > money is to put it in your respective bank accounts. If > > you've getting > > >> >>> > a little bit of money to spend on yourself, doesn't that > > make doing > > >> >>> > the maintenance work a bit less thankless? If you don't > > pay > > >> >>> > yourselves, I think it actually "breaks" Tidelift's pitch > > to customers > > >> >>> > which is that open source projects need to have a higher > > fraction of > > >> >>> > compensated maintenance and support work than they do now. > > >> >>> > > > >> >>> > How you allocate the money to each other is something you > > can debate privately > > >> >>> > > > >> >>> > On Tue, Jun 11, 2019 at 8:42 AM Joris Van den Bossche > > >> >>> > wrote: > > >> >>> > > > > >> >>> > > > > >> >>> > > > > >> >>> > > Op di 11 jun. 2019 om 15:31 schreef Ralf Gommers < > > ralf.gommers at gmail.com>: > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> On Tue, Jun 11, 2019 at 3:03 PM Tom Augspurger < > > tom.augspurger88 at gmail.com> wrote: > > >> >>> > >>> > > >> >>> > >>> > > >> >>> > >>> > > >> >>> > >>> On Tue, Jun 11, 2019 at 7:58 AM William Ayd via > > Pandas-dev wrote: > > >> >>> > >>>> > > >> >>> > >>>> Just some counterpoints to consider: > > >> >>> > >>>> > > >> >>> > >>>> - $ 3,000 a month isn?t really that much, and if it?s > > just a number that a well-funded company chose for us chances are > > they are benefiting from it way more than we are > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> "it's not really that much" is something I don't agree > > with. It doesn't employ someone, but it's enough to pay for things > > like developer meetups, hiring an extra GSoC student if a good one > > happens to come along, paying a web dev for a full redesign of the > > project website, etc. Each of those things is in the $5,000 - > > %15,000 range, and it's _very_ nice to be able to do them without > > having to look for funding first. 
> > >> >>> > >> > > >> >>> > >> Tidelift is a small (now ~25 employees) company by the > > way, and they have a real understanding of the open source > > sustainability issues and seem dedicated to helping fix it. > > >> >>> > >> > > >> >>> > >>>> - There is no such thing as free money; we have to > > consider how to account for and actually manage it (perhaps > > mitigated somewhat by NumFocus) > > >> >>> > >>> > > >> >>> > >>> > > >> >>> > >>> Perhaps Ralph can share how this has gone for NumPy. I > > imagine it's not too work on their end, thanks to NumFOCUS. > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> NumFOCUS handles receiving the money and associated > > admin. As the project you'll be responsible for the setup and > > ongoing tasks. For NumPy and SciPy I have done those tasks. It's a > > fairly minimal amount of work: > > https://github.com/numpy/numpy/pulls?q=is%3Apr+tidelift+is%3Aclosed > > . The main one was dealing with GitHub not recognizing our license, > > and you don't have that issue for Pandas (it's reported correctly > > as BSD-3 in the UI at https://github.com/pandas-dev/pandas). > > >> >>> > >> > > >> >>> > >> So it's probably a day of work for one person, to get > > familiar with the interface, check dependencies, release streams, > > paste in release notes, etc. And then ongoing maybe one or a couple > > of hours a month. So far it's been a much more effective way of > > spending time than, for example, grant writing. > > >> >>> > >> > > >> >>> > >>> > > >> >>> > >>>> > > >> >>> > >>>> - Advertising and ties to a corporate sponsorship may > > weaken the brand of pandas; at that point we may lose some > > creditability as open source volunteers > > >> >>> > >>> > > >> >>> > >>> > > >> >>> > >>> Anecdotally, I don't think that's how the community > > views Tidelift. My perception (from Twitter, blogs / comments) is > > that it's been well received. 
> > >> >>> > >> > > >> >>> > >> > > >> >>> > >> Agree, the feedback I've seen is all quite positive. > > >> >>> > > > > >> >>> > > > > >> >>> > > Additionally, I don't think there is any "advertisement" > > involved, at least not in the classical sense of adding adds for > > third-party companies in a side bar to our website for which we get > > money. Of course we will need to mention Tidelift in some way, e.g. > > in our sponsors / institutional partners section, but we already do > > that for some other companies as well (that employ core devs). > > >> >>> > > > > >> >>> > >> > > >> >>> > >> > > >> >>> > >>> > > >> >>> > >>>> > > >> >>> > >>>> - We don?t (AFAIK) have a plan on how to spend or > > allocate it > > >> >>> > >>>> > > >> >>> > >>>> Not totally against it but perhaps the last point > > above is the main sticking one. Do we have any idea how much we?d > > actually pocket out of the $ 3k they offer us and subsequently what > > we would do with it? Cover travel expenses? Support PyData > > conferences? Scholarships? > > >> >>> > >>> > > >> >>> > >>> > > >> >>> > >>> Agreed that we should set a purpose for this money > > (though, I have no objection to collecting while we set that > > dedicated purpose). > > >> >>> > >> > > >> >>> > >> > > >> >>> > > Indeed we need to discuss this, but I don't think we > > already need to know *exactly* what we want to do with it before > > setting up a contract with Tidelift. It's good for me to alraedy > > start discussing it now, but maybe in a separate thread? > > >> >>> > > > > >> >>> > >> > > >> >>> > >> For NumPy and SciPy we haven't earmarked the funds yet. > > It's nice to build up a buffer first. One thing I'm thinking of is > > that we're participating in Google Season of Docs, and are getting > > more high quality applicants than Google will accept. So we could > > pay one or two tech writers from the funds. 
Our website and high > > level docs (tutorial, restructuring of all docs to guide users > > better) sure could use it:) > > >> >>> > >> > > >> >>> > >> My abstract advice would be: pay for things that > > require money (like a dev meeting) or don't get done for free. > > Don't pay for writing code unless the case is extremely compelling, > > because that'll be a drop in the bucket. > > >> >>> > >> > > >> >>> > >> Cheers, > > >> >>> > >> Ralf > > >> >>> > >> > > >> >>> > >> > > >> >>> > >>> > > >> >>> > >>>> > > >> >>> > >>>> - Will > > >> >>> > >>>> > > >> >>> > >>>> On Jun 11, 2019, at 4:44 AM, Ralf Gommers < > > ralf.gommers at gmail.com> wrote: > > >> >>> > >>>> > > >> >>> > >>>> > > >> >>> > >>>> > > >> >>> > >>>> On Tue, Jun 11, 2019 at 10:15 AM Joris Van den > > Bossche wrote: > > >> >>> > >>>>> > > >> >>> > >>>>> The current page about pandas ( > > https://tidelift.com/lifter/search/pypi/pandas) mentions $3,000 > > dollar a month (but I am not fully sure this is what is already > > available from their current subscribers, or if it is a prospect). > > >> >>> > >>>> > > >> >>> > >>>> > > >> >>> > >>>> It's not just a prospect, that's what you should/will > > get. NumPy and SciPy get the listed amounts too. > > >> >>> > >>>> > > >> >>> > >>>> Agreed that the NumPy amount is not that much. The > > amount gets determined automatically; it's some combination of > > customer interest, dependency analysis and size of the API surface. > > >> >>> > >>>> > > >> >>> > >>>> The current amounts are: > > >> >>> > >>>> NumPy: $1000 > > >> >>> > >>>> SciPy: $2500 > > >> >>> > >>>> Pandas: $3000 > > >> >>> > >>>> Matplotlib: n.a. > > >> >>> > >>>> Scikit-learn: $1500 > > >> >>> > >>>> Scikit-image: $50 > > >> >>> > >>>> Statsmodels: $50 > > >> >>> > >>>> > > >> >>> > >>>> So there's an element of randomness, but the results > > are not completely surprising I think. 
The four libraries that get > > order thousands of dollars are the ones that large corporations are > > going to have the highest interest in. > > >> >>> > >>>> > > >> >>> > >>>> Cheers, > > >> >>> > >>>> Ralf > > >> >>> > >>>> > > >> >>> > >>>>> > > >> >>> > >>>>> > > >> >>> > >>>>> Op za 8 jun. 2019 om 22:54 schreef William Ayd < > > william.ayd at icloud.com>: > > >> >>> > >>>>>> > > >> >>> > >>>>>> What is the minimum amount we are asking for? The > > $1,000 a month for NumPy seems rather low and I thought previous > > emails had something in the range of $3k a month. > > >> >>> > >>>>>> > > >> >>> > >>>>>> I don?t think we necessarily need or would be that > > much improved by $12k per year so would rather aim higher if we are > > going to do this > > >> >>> > >>>>>> > > >> >>> > >>>>>> On Jun 7, 2019, at 12:53 PM, Joris Van den Bossche > > wrote: > > >> >>> > >>>>>> > > >> >>> > >>>>>> Hi all, > > >> >>> > >>>>>> > > >> >>> > >>>>>> We discussed this on the last dev chat, but putting > > it on the mailing list for those who were not present: we are > > planning to contact Tidelift to enter into a sponsor agreement for > > Pandas. > > >> >>> > >>>>>> > > >> >>> > >>>>>> The idea is to follow what NumPy (and recently also > > Scipy) did to have an agreement between Tidelift and NumFOCUS > > instead of an individual maintainer (see their announcement mail: > > https://mail.python.org/pipermail/numpy-discussion/2019-April/079370.html > > ). > > >> >>> > >>>>>> Blog with overview about Tidelift: > > https://blog.tidelift.com/how-to-start-earning-money-for-your-open-source-project-with-tidelift > > . > > >> >>> > >>>>>> > > >> >>> > >>>>>> We didn't discuss yet what to do specifically with > > those funds, that should still be discussed in the future. 
> > >> >>> > >>>>>> > > >> >>> > >>>>>> Cheers, > > >> >>> > >>>>>> Joris > > >> >>> > >>>>>> _______________________________________________ > > >> >>> > >>>>>> Pandas-dev mailing list > > >> >>> > >>>>>> Pandas-dev at python.org > > >> >>> > >>>>>> https://mail.python.org/mailman/listinfo/pandas-dev > > >> >>> > >>>>>> > > >> >>> > >>>>>> > > >> >>> > >>>>> _______________________________________________ > > >> >>> > >>>>> Pandas-dev mailing list > > >> >>> > >>>>> Pandas-dev at python.org > > >> >>> > >>>>> https://mail.python.org/mailman/listinfo/pandas-dev > > >> >>> > >>>> > > >> >>> > >>>> > > >> >>> > >>>> _______________________________________________ > > >> >>> > >>>> Pandas-dev mailing list > > >> >>> > >>>> Pandas-dev at python.org > > >> >>> > >>>> https://mail.python.org/mailman/listinfo/pandas-dev > > >> >>> > >> > > >> >>> > >> _______________________________________________ > > >> >>> > >> Pandas-dev mailing list > > >> >>> > >> Pandas-dev at python.org > > >> >>> > >> https://mail.python.org/mailman/listinfo/pandas-dev > > >> >>> > > > > >> >>> > > _______________________________________________ > > >> >>> > > Pandas-dev mailing list > > >> >>> > > Pandas-dev at python.org > > >> >>> > > https://mail.python.org/mailman/listinfo/pandas-dev > > >> >>> _______________________________________________ > > >> >>> Pandas-dev mailing list > > >> >>> Pandas-dev at python.org > > >> >>> https://mail.python.org/mailman/listinfo/pandas-dev > > _______________________________________________ > > Pandas-dev mailing list > > Pandas-dev at python.org > > https://mail.python.org/mailman/listinfo/pandas-dev > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev From jbrockmendel at gmail.com Thu Jun 13 15:28:29 2019 From: jbrockmendel at gmail.com (Brock Mendel) Date: Thu, 13 Jun 2019 12:28:29 -0700 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: 
References: Message-ID: > I think I was missing a subtle point; We'll still have a mix of 1-D and 2-D blocks under your proposal. What we *won't* have is cases where the Block.shape doesn't match the Block.values.shape? Correct. For curious readers, "mix" here only means that both 1D and 2D blocks will exist. Within a DataFrame you will only ever see 2D blocks. Within a Series you will only ever find a single 1D Block. (at least under the current proposal; Tom's option 1 above would change this) > So assuming we want the laudable goal of Block.{ndim,shape} == Block.values.{ndim,shape}, Following that train of thought (hopefully this helps explain the promised simplifications): - now blocks don't need an ndim attribute or kwarg for the constructor - so their only attributes are mgr_locs and values - and hey, when do they ever use self.mgr_locs? Only when calling their own constructors! These make more sense as a BlockManager attribute anyway. - so... if Block's only attribute is `values`, maybe we can get rid of Block altogether? > 1. Allow BlockManager to store a mix of 1-D and 2-D blocks. > 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just (N, 1) or (1, N)). > > Is that fair? Doing number 1 sounds really bad / difficult, right? Option 1 I haven't really thought about, but my intuition is that it would cause new headaches. I expect this would show up in the reshape/concat code, which I'm not as familiar with. Option 2: allowing (N,1) and (1,N) would give us most of the discussed simplifications (and bugfixes) in Block/BlockManager. Even without the performance considerations, I would consider this a massive win. > And I think it would help with the *regression* in arithmetic performance, right? Since ndarray-backed blocks would be allowed to be (N, P)? Not directly, but I think it would make one of the previously-tried-and-failed approaches less problematic. 
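[Editor's note: a toy rendering of Brock's train of thought, with hypothetical names that deliberately oversimplify the real internals: if a block's shape always equals its values' shape, `ndim` can be dropped, and if `mgr_locs` moves onto the manager, a "block" collapses to just its values array.]

```python
# Toy manager where blocks are bare arrays and column locations live on
# the manager itself -- the endpoint of "maybe we can get rid of Block".
import numpy as np

class SimpleManager:
    def __init__(self, blocks, blk_locs):
        self.blocks = blocks       # plain arrays -- no Block wrapper needed
        self.blk_locs = blk_locs   # per-block column positions

    def column(self, i):
        for values, locs in zip(self.blocks, self.blk_locs):
            if i in locs:
                # 2D blocks are laid out (n_columns, n_rows), as in pandas
                return values[locs.index(i)] if values.ndim == 2 else values
        raise KeyError(i)

ints = np.array([[1, 2], [3, 4]])     # one 2D block holding columns 0 and 1
mask = np.array([True, False])        # a 1D "EA-like" block for column 2
mgr = SimpleManager([ints, mask], [[0, 1], [2]])
print(mgr.column(1))   # [3 4]
print(mgr.column(2))   # [ True False]
```

Note the mix of 1D and 2D blocks here corresponds to the status quo being discussed; under Brock's option 2 the 1D entry would instead be an EA reshaped to (1, N).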
On Thu, Jun 13, 2019 at 11:30 AM Tom Augspurger wrote:
>
> On Thu, Jun 13, 2019 at 12:20 PM Brock Mendel wrote:
>
>> > can you give your thoughts about the idea of having _only_ 1D Blocks?
>> That could also solve a lot of the complexity of the internals (no 1D vs
>> 2D) and have many of the advantages you mentioned in the first email, but I
>> don't think you really answered that aspect.
>>
>> My read from the earlier email was that you were going to present a more
>> detailed proposal for what this would look like. Going all-1D would solve
>> the sometimes-1D-sometimes-2D problem, but I think it would cause real
>> problems for transpose and reductions with axis=1 (I've seen it argued that
>> these are not common use cases). It also wouldn't change the fact that we
>> need BlockManager or something like it to do alignment. Getting array-like
>> operations out of BlockManager/Block and into PandasArray could work with
>> the all-1D idea.
>>
>> > 1. The current structure of [...] They're different enough that EA
>> isn't a drop-in replacement for ndarray.
>> > 2. Arrays being either 1D or 2D causes many issues.
>> > Q1: Do those two issues accurately capture your concerns as well?
>>
>> Pretty much, yes.
>>
>> > Q2: Can you clarify: with 2D EAs would *all* EAs stored within pandas
>> be 2D internally (and Series / Index would squeeze before data gets back to
>> the user)? Otherwise, I don't see how we get the internal simplification.
>>
>> My thought was that only EAs backing Blocks inside DataFrames would get
>> reshaped. Everything else would retain their existing dimensions. It
>> isn't clear to me why you'd want 2D backing Index/Series, though I'm open
>> to being convinced.
>
> I think I was missing a subtle point; we'll still have a mix of 1-D and
> 2-D blocks under your proposal. What we *won't* have is cases where the
> Block.shape doesn't match the Block.values.shape?
> ```
> In [9]: df = pd.DataFrame({"A": [1, 2], 'B': pd.array([1, 2], dtype='Int64')})
>
> In [10]: df._data.blocks
> Out[10]:
> (IntBlock: slice(0, 1, 1), 1 x 2, dtype: int64,
>  ExtensionBlock: slice(1, 2, 1), 1 x 2, dtype: Int64)
>
> In [11]: df._data.blocks[0].shape == df._data.blocks[0].values.shape
> Out[11]: True
>
> In [12]: df._data.blocks[1].shape == df._data.blocks[1].values.shape
> Out[12]: False
> ```
>
> So assuming we want the laudable goal of Block.{ndim,shape} ==
> Block.values.{ndim,shape}, we have two options:
>
> 1. Allow BlockManager to store a mix of 1-D and 2-D blocks.
> 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just
> (N, 1) or (1, N)).
>
> Is that fair? Doing number 1 sounds really bad / difficult, right?
>
>> Demonstrating the internal simplification may require a Proof of Concept.
>>
>> > Our contract on EA is 1D and I agree that changing this is not a good
>> idea (at least publicly).
>>
>> Another slightly hacky option would be to secretly allow EAs to be
>> temporarily 2D while backing DataFrame Blocks, but restrict that to just
>> (N, 1) or (1, N). That wouldn't do anything to address the arithmetic
>> performance, but might let us de-kludge the 1D/2D code in core.internals.
>
> I would be curious to see this. And I think it would help with the
> *regression* in arithmetic performance, right? Since ndarray-backed blocks
> would be allowed to be (N, P)?
>
>> On Wed, Jun 12, 2019 at 1:56 PM Joris Van den Bossche <
>> jorisvandenbossche at gmail.com> wrote:
>>
>>> On Wed, Jun 12, 2019 at 18:18, Jeff Reback wrote:
>>>
>>>> ...
>>>>
>>>> So here's another proposal (a bit half-baked but....):
>>>>
>>>> You *could* build a single dtyped container that actually holds the 1D
>>>> arrays themselves. Then you could put EA arrays and numpy arrays on the
>>>> same footing. Meaning each 'Block' would be exactly the same.
>>>> - This would make operations the *same* across all 'Blocks', reducing
>>>> complexity
>>>> - We could simply take views on 2D numpy arrays to actually avoid a
>>>> performance penalty of copying (as we can construct from a 2D numpy array
>>>> a lot); this causes some aggregation ops to be much slower than if we
>>>> actually copy, but that has a cost too
>>>> - Ops could be defined on EA & Pandas Arrays; these can then operate
>>>> array-by-array (within a Block), or using numba we could implement the ops
>>>> in a way that we can get a pretty big speedup for a particular kernel
>>>
>>> How would this proposal avoid the above-mentioned performance
>>> implication of doing ops column-by-column?
>>>
>>> In general, I think we should try to do a few basic benchmarks on what
>>> the performance impact would be for some typical use cases when all ops are
>>> done column-by-column / all columns are stored as separate blocks (Jeff had
>>> a branch at some point that made this optional). To have a better idea of
>>> the (dis)advantages for the different proposals.
>>>
>>> Brock, can you give your thoughts about the idea of having _only_ 1D
>>> Blocks? That could also solve a lot of the complexity of the internals (no
>>> 1D vs 2D) and have many of the advantages you mentioned in the first email,
>>> but I don't think you really answered that aspect.
>>>
>>> Joris
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tom.augspurger88 at gmail.com Thu Jun 13 15:48:31 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Thu, 13 Jun 2019 14:48:31 -0500
Subject: [Pandas-dev] Arithmetic Proposal
In-Reply-To: References: Message-ID:

OK thanks, I think I understand things better now. IMO, the most promising
line of development is internally reshaping EAs to be (N, 1). No concrete
thoughts on how to do this yet though.

And there's another reason I'd prefer not to have public 2-D EAs yet.
Aside from the potential future block manager simplification, 2+-dimensional
arrays open up a bunch of complexity for EA authors, especially around
indexing, take, and concat. I'd prefer to delay that while we still have
other options on the table that look promising.

From jbrockmendel at gmail.com Wed Jun 19 15:26:09 2019
From: jbrockmendel at gmail.com (Brock Mendel)
Date: Wed, 19 Jun 2019 12:26:09 -0700
Subject: [Pandas-dev] Arithmetic Proposal
In-Reply-To: References: Message-ID:

A Proof of Concept is up at
https://github.com/pandas-dev/pandas/pull/26914. A brief overview:

- implement ReshapeMixin for EAs that wrap an ndarray (e.g. DatetimeArray,
PeriodArray, Categorical, CyberPandas, ...). For this type of EA, the
implementation is pretty trivial.
- Patch DatetimeArray.__getitem__ and _box_values
- See how much core.internals simplification becomes feasible
- DatetimeTZBlock can now use the base class implementations for shape,
_slice, copy, iget, and interpolate.
- DatetimeTZBlock becomes a thin wrapper around Block.where for when
Block.where incorrectly casts to object dtype.
- With minor additional edits on DatetimeArray, we could also remove the need for DatetimeTZBlock to override diff, shift, take_nd - If we allow DatetimeTZBlock to hold multiple columns, _unstack could also use the base class implementation. This also turned up existing bugs ( https://github.com/pandas-dev/pandas/issues/26864) that I speculate would be easier to address with less Block/BlockManager complexity. Bottom Line: The relevant array operations are going to be defined for EAs regardless. The question is whether they are going to be defined directly on the EAs (and tested in isolation), or defined by the Blocks (and tested indirectly). I advocate the former. On Thu, Jun 13, 2019 at 12:48 PM Tom Augspurger wrote: > OK thanks, I think I understand things better now. IMO, the most promising > line of development is internally reshaping EAs to be (N, 1). No concrete > thoughts on how to do this yet though. > > And there's another reason I'd prefer not to have public 2-D EAs yet. > Aside from the potential future block manager simplification, 2+-dimensional > arrays open up a bunch of complexity for EA authors, especially around > indexing, take, and concat. I'd prefer to delay that while we still have > other > options on the table that look promising. > > > On Thu, Jun 13, 2019 at 2:28 PM Brock Mendel > wrote: > >> > I think I was missing a subtle point; We'll still have a mix of 1-D and >> 2-D blocks under your proposal. What we *won't* have is cases where the >> Block.shape doesn't match the Block.values.shape? >> >> Correct. For curious readers, "mix" here only means that both 1D Block >> and 2D blocks will exist. Within a DataFrame you will only ever see 2D >> blocks. Within a Series you will only ever find a single 1D Block. 
(at >> least under the current proposal; Tom's option 1 above would change this) >> >> > So assuming we want the laudable goal of Block.{ndim,shape} == >> Block.values.{ndim,shape}, >> >> Following that train of thought (hopefully this helps explain the >> promised simplifications): >> - now blocks don't need an ndim attribute or kwarg for the constructor >> - so their only attributes are mgr_locs and values >> - and hey, when do they ever use self.mgr_locs? Only when calling their >> own constructors! These make more sense as a BlockManager attribute anyway. >> - so... if Block's only attribute is `values`, maybe we can get rid of >> Block altogether? >> >> > 1. Allow BlockManager to store a mix of 1-D and 2-D blocks. >> > 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just >> (N, 1) or (1, N)). >> > >> > Is that fair? Doing number 1 sounds really bad / difficult, right? >> >> Option 1: I haven't really thought about, but my intuition is that it >> would cause new headaches. I expect this would show up in the >> reshape/concat code, which I'm not as familiar with. >> >> Option 2: allowing (N,1) and (1,N) would give us most of the discussed >> simplifications (and bugfixes) in Block/BlockManager. Even without the >> performance considerations, I would consider this a massive win. >> >> > And I think it would help with the *regression* in arithmetic >> performance, right? Since ndarray-backed blocks would be allowed to be (N, >> P)? >> >> Not directly, but I think it would make one of the >> previously-tried-and-failed approaches less problematic. >> >> >> On Thu, Jun 13, 2019 at 11:30 AM Tom Augspurger < >> tom.augspurger88 at gmail.com> wrote: >> >>> >>> >>> On Thu, Jun 13, 2019 at 12:20 PM Brock Mendel >>> wrote: >>> >>>> > can you given your thoughts about the idea of having _only_ 1D >>>> Blocks? 
That could also solve a lot of the complexity of the internals (no >>>> 1D vs 2D) and have many of the advantages you mentioned in the first email, >>>> but I don't think you really answered that aspect. >>>> >>>> My read from the earlier email was that you were going to present a >>>> more detailed proposal for what this would look like. Going all-1D would >>>> solve the sometimes-1D-sometimes-2D problem, but I think it would cause >>>> real problems for transpose and reductions with axis=1 (I've seen it argued >>>> that these are not common use cases). It also wouldn't change the fact >>>> that we need BlockManager or something like it to do alignment. Getting >>>> array-like operations out of BlockManager/Block and into PandasArray could >>>> work with the all-1D idea. >>>> >>>> > 1. The current structure of [...] They're different enough that EA >>>> isn't a drop-in replacement for ndarray. >>>> > 2. Arrays being either 1D or 2D causes many issues. >>>> > Q1: Do those two issues accurately capture your concerns as well? >>>> >>>> Pretty much, yes. >>>> >>>> > Q2: Can you clarify: with 2D EAs would *all* EAs stored within pandas >>>> be 2D internally (and Series / Index would squeeze before data gets back to >>>> the user)? Otherwise, I don't see how we get the internal simplification. >>>> >>>> My thought was that only EAs backing Blocks inside DataFrames would get >>>> reshaped. Everything else would retain their existing dimensions. It >>>> isn't clear to me why you'd want 2D backing Index/Series, though I'm open >>>> to being convinced. >>>> >>> >>> I think I was missing a subtle point; We'll still have a mix of 1-D and >>> 2-D blocks under your proposal. What we *won't* have is cases where the >>> Block.shape doesn't match the Block.values.shape? 
>>> >>> ``` >>> In [9]: df = pd.DataFrame({"A": [1, 2], 'B': pd.array([1, 2], >>> dtype='Int64')}) >>> >>> In [10]: df._data.blocks >>> Out[10]: >>> (IntBlock: slice(0, 1, 1), 1 x 2, dtype: int64, >>> ExtensionBlock: slice(1, 2, 1), 1 x 2, dtype: Int64) >>> >>> In [11]: df._data.blocks[0].shape == df._data.blocks[0].values.shape >>> Out[11]: True >>> >>> In [12]: df._data.blocks[1].shape == df._data.blocks[1].values.shape >>> Out[12]: False >>> >>> ``` >>> >>> So assuming we want the laudable goal of Block.{ndim,shape} == >>> Block.values.{ndim,shape}, we have two options >>> >>> 1. Allow BlockManager to store a mix of 1-D and 2-D blocks. >>> 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just >>> (N, 1) or (1, N)). >>> >>> Is that fair? Doing number 1 sounds really bad / difficult, right? >>> >>> Demonstrating the internal simplification may require a Proof of Concept. >>>> >>>> > Our contract on EA is 1D and I agree that changing this is not a good >>>> idea (at least publicly). >>>> >>>> Another slightly hacky option would be to secretly allow EAs to be >>>> temporarily 2D while backing DataFrame Blocks, but restrict that to just >>>> (N, 1) or (1, N). That wouldn't do anything to address the arithmetic >>>> performance, but might let us de-kludge the 1D/2D code in core.internals. >>>> >>> >>> I would be curious to see this. And I think it would help with the >>> *regression* in arithmetic performance, right? Since ndarray-backed blocks >>> would be allowed to be (N, P)? >>> >>> >>>> On Wed, Jun 12, 2019 at 1:56 PM Joris Van den Bossche < >>>> jorisvandenbossche at gmail.com> wrote: >>>> >>>>> Op wo 12 jun. 2019 om 18:18 schreef Jeff Reback >>>> >: >>>>> >>>>>> ... >>>>>> >>>>> So here's another proposal (a bit half-baked but....): >>>>>> >>>>>> You *could* build a single dtyped container that actually holds the >>>>>> 1D arrays themselves). Then you could put EA arrays and numpy arrays on the >>>>>> same footing. 
Meaning each >>>>>> 'Block' would be exactly the same. >>>>>> >>>>>> - This would make operations the *same* across all 'Blocks', reducing >>>>>> complexity >>>>>> - We could simply take views on 2D numpy arrays to actually avoid a >>>>>> performance penaltly of copying (as we can construct from a 2D numpy array >>>>>> a lot); this causes some aggregation ops to be much slower that if we >>>>>> actually copy, but that has a cost too >>>>>> - Ops could be defined on EA & Pandas Arrays; these can then operate >>>>>> array-by-array (within a Block), or using numba we could implement the ops >>>>>> in a way that we can get a pretty big speedup for a particular kernel >>>>>> >>>>> >>>>> How would this proposal avoid the above-mentioned performance >>>>> implication of doing ops column-by-column? >>>>> >>>>> In general, I think we should try to do a few basic benchmarks on what >>>>> the performance impact would be for some typical use cases when all ops are >>>>> done column-by-column / all columns are stored as separate blocks (Jeff had >>>>> a branch at some point that made this optional). To have a better idea of >>>>> the (dis)advantages for the different proposals. >>>>> >>>>> Brock, can you given your thoughts about the idea of having _only_ 1D >>>>> Blocks? That could also solve a lot of the complexity of the internals (no >>>>> 1D vs 2D) and have many of the advantages you mentioned in the first email, >>>>> but I don't think you really answered that aspect. >>>>> >>>>> Joris >>>>> >>>>> _______________________________________________ >>>>> Pandas-dev mailing list >>>>> Pandas-dev at python.org >>>>> https://mail.python.org/mailman/listinfo/pandas-dev >>>>> >>>> _______________________________________________ >>>> Pandas-dev mailing list >>>> Pandas-dev at python.org >>>> https://mail.python.org/mailman/listinfo/pandas-dev >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tom.augspurger88 at gmail.com Wed Jun 19 21:41:40 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Wed, 19 Jun 2019 20:41:40 -0500 Subject: [Pandas-dev] Pandas dev meeting June 20th Message-ID: Hi all, The next pandas dev meeting is June 20th (tomorrow) at 17:00 UTC / 12:00 central. All are welcome to join. Minutes: https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?usp=sharing Hangout: meet.google.com/xbh-thar-tzw Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Thu Jun 20 11:42:49 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Thu, 20 Jun 2019 17:42:49 +0200 Subject: [Pandas-dev] Arithmetic Proposal In-Reply-To: References: Message-ID: Thanks Brock for the update and POC. I certainly don't fully understand yet all details and consequences for the EA of the proposal (given the discussion on github ..), but I have the feeling that this is moving complexity (to deal with both 1D/2D) from our internals / BlockManager to the ExtensionArray. Or, if we still want to allow EAs that are strictly 1D: moving the complexity to the ExtensionBlock (which is of course a more focused part of the internals (possibly a plus), but that also means that internal and external 1D EAs start to deviate more). I am not sure that I find that a net improvement. I rather keep some complexity concentrated in our internals, than exposing that complexity on the ExtensionArrays. But maybe I don't work enough in the internals code to really understand the problem this is trying to solve. Repeating myself from earlier in this thread: if we want to put considerable effort in refactoring the internals, I think we should seriously consider other options. Joris Op wo 19 jun. 2019 om 21:26 schreef Brock Mendel : > A Proof of Concept is up at > https://github.com/panhttps://issues.apache.org/jira/browse/ARROW-5665das-dev/pandas/pull/26914 > . 
A brief overview: > > - implement ReshapeMixin for EAs that wrap an ndarray (e.g. DatetimeArray, > PeriodArray, Categorical, CyberPandas, ...). For this type of EAs, the > implementation is pretty trivial. > - Patch DatetimeArray.__getitem__ and _box_values > - See how much core.internals simplification becomes feasible > - DatetimeTZBlock can now use the base class implementations for shape, > _slice, copy, iget, and interpolate. > - DatetimeTZBlock becomes a thin wrapper around Block.where for when > Block.where incorrectly casts to object dtype. > - With minor additional edits on DatetimeArray, we could also remove > the need for DatetimeTZBlock to override diff, shift, take_nd > - If we allow DatetimeTZBlock to hold multiple columns, _unstack could > also use the base class implementation. > > This also turned up existing bugs ( > https://github.com/pandas-dev/pandas/issues/26864) that I speculate would > be easier to address with less Block/BlockManager complexity. > > Bottom Line: The relevant array operations are going to be defined for EAs > regardless. The question is whether they are going to be defined directly > on the EAs (and tested in isolation), or defined by the Blocks (and tested > indirectly). I advocate the former. > > On Thu, Jun 13, 2019 at 12:48 PM Tom Augspurger < > tom.augspurger88 at gmail.com> wrote: > >> OK thanks, I think I understand things better now. IMO, the most >> promising line of development is internally reshaping EAs to be (N, 1). No >> concrete >> thoughts on how to do this yet though. >> >> And there's another reason I'd prefer not to have public 2-D EAs yet. >> Aside from the potential future block manager simplification, 2+-dimensional >> arrays open up a bunch of complexity for EA authors, especially around >> indexing, take, and concat. I'd prefer to delay that while we still have >> other >> options on the table that look promising. 
>> >> >> On Thu, Jun 13, 2019 at 2:28 PM Brock Mendel >> wrote: >> >>> > I think I was missing a subtle point; We'll still have a mix of 1-D >>> and 2-D blocks under your proposal. What we *won't* have is cases where the >>> Block.shape doesn't match the Block.values.shape? >>> >>> Correct. For curious readers, "mix" here only means that both 1D Block >>> and 2D blocks will exist. Within a DataFrame you will only ever see 2D >>> blocks. Within a Series you will only ever find a single 1D Block. (at >>> least under the current proposal; Tom's option 1 above would change this) >>> >>> > So assuming we want the laudable goal of Block.{ndim,shape} == >>> Block.values.{ndim,shape}, >>> >>> Following that train of thought (hopefully this helps explain the >>> promised simplifications): >>> - now blocks don't need an ndim attribute or kwarg for the constructor >>> - so their only attributes are mgr_locs and values >>> - and hey, when do they ever use self.mgr_locs? Only when calling >>> their own constructors! These make more sense as a BlockManager attribute >>> anyway. >>> - so... if Block's only attribute is `values`, maybe we can get rid of >>> Block altogether? >>> >>> > 1. Allow BlockManager to store a mix of 1-D and 2-D blocks. >>> > 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just >>> (N, 1) or (1, N)). >>> > >>> > Is that fair? Doing number 1 sounds really bad / difficult, right? >>> >>> Option 1: I haven't really thought about, but my intuition is that it >>> would cause new headaches. I expect this would show up in the >>> reshape/concat code, which I'm not as familiar with. >>> >>> Option 2: allowing (N,1) and (1,N) would give us most of the discussed >>> simplifications (and bugfixes) in Block/BlockManager. Even without the >>> performance considerations, I would consider this a massive win. >>> >>> > And I think it would help with the *regression* in arithmetic >>> performance, right? 
Since ndarray-backed blocks would be allowed to be (N, >>> P)? >>> >>> Not directly, but I think it would make one of the >>> previously-tried-and-failed approaches less problematic. >>> >>> >>> On Thu, Jun 13, 2019 at 11:30 AM Tom Augspurger < >>> tom.augspurger88 at gmail.com> wrote: >>> >>>> >>>> >>>> On Thu, Jun 13, 2019 at 12:20 PM Brock Mendel >>>> wrote: >>>> >>>>> > can you given your thoughts about the idea of having _only_ 1D >>>>> Blocks? That could also solve a lot of the complexity of the internals (no >>>>> 1D vs 2D) and have many of the advantages you mentioned in the first email, >>>>> but I don't think you really answered that aspect. >>>>> >>>>> My read from the earlier email was that you were going to present a >>>>> more detailed proposal for what this would look like. Going all-1D would >>>>> solve the sometimes-1D-sometimes-2D problem, but I think it would cause >>>>> real problems for transpose and reductions with axis=1 (I've seen it argued >>>>> that these are not common use cases). It also wouldn't change the fact >>>>> that we need BlockManager or something like it to do alignment. Getting >>>>> array-like operations out of BlockManager/Block and into PandasArray could >>>>> work with the all-1D idea. >>>>> >>>>> > 1. The current structure of [...] They're different enough that EA >>>>> isn't a drop-in replacement for ndarray. >>>>> > 2. Arrays being either 1D or 2D causes many issues. >>>>> > Q1: Do those two issues accurately capture your concerns as well? >>>>> >>>>> Pretty much, yes. >>>>> >>>>> > Q2: Can you clarify: with 2D EAs would *all* EAs stored within >>>>> pandas be 2D internally (and Series / Index would squeeze before data gets >>>>> back to the user)? Otherwise, I don't see how we get the internal >>>>> simplification. >>>>> >>>>> My thought was that only EAs backing Blocks inside DataFrames would >>>>> get reshaped. Everything else would retain their existing dimensions. 
It >>>>> isn't clear to me why you'd want 2D backing Index/Series, though I'm open >>>>> to being convinced. >>>>> >>>> >>>> I think I was missing a subtle point; We'll still have a mix of 1-D and >>>> 2-D blocks under your proposal. What we *won't* have is cases where the >>>> Block.shape doesn't match the Block.values.shape? >>>> >>>> ``` >>>> In [9]: df = pd.DataFrame({"A": [1, 2], 'B': pd.array([1, 2], >>>> dtype='Int64')}) >>>> >>>> In [10]: df._data.blocks >>>> Out[10]: >>>> (IntBlock: slice(0, 1, 1), 1 x 2, dtype: int64, >>>> ExtensionBlock: slice(1, 2, 1), 1 x 2, dtype: Int64) >>>> >>>> In [11]: df._data.blocks[0].shape == df._data.blocks[0].values.shape >>>> Out[11]: True >>>> >>>> In [12]: df._data.blocks[1].shape == df._data.blocks[1].values.shape >>>> Out[12]: False >>>> >>>> ``` >>>> >>>> So assuming we want the laudable goal of Block.{ndim,shape} == >>>> Block.values.{ndim,shape}, we have two options >>>> >>>> 1. Allow BlockManager to store a mix of 1-D and 2-D blocks. >>>> 2. Allow EAs to be (at least internally) reshaped to 2D (possibly just >>>> (N, 1) or (1, N)). >>>> >>>> Is that fair? Doing number 1 sounds really bad / difficult, right? >>>> >>>> Demonstrating the internal simplification may require a Proof of >>>>> Concept. >>>>> >>>>> > Our contract on EA is 1D and I agree that changing this is not a >>>>> good idea (at least publicly). >>>>> >>>>> Another slightly hacky option would be to secretly allow EAs to be >>>>> temporarily 2D while backing DataFrame Blocks, but restrict that to just >>>>> (N, 1) or (1, N). That wouldn't do anything to address the arithmetic >>>>> performance, but might let us de-kludge the 1D/2D code in core.internals. >>>>> >>>> >>>> I would be curious to see this. And I think it would help with the >>>> *regression* in arithmetic performance, right? Since ndarray-backed blocks >>>> would be allowed to be (N, P)? 
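The two options above can be made concrete with plain NumPy and the public `pd.array` API (a sketch for orientation only — it does not touch the real Block internals being discussed):

```python
import numpy as np
import pandas as pd

# Today's public contract: extension arrays are one-dimensional.
ea = pd.array([1, 2, 3], dtype="Int64")
print(ea.ndim)  # 1

# NumPy-backed block values already live as 2D arrays, so a single
# column is carried as shape (1, N) inside a DataFrame.
np_vals = np.array([1, 2, 3])
block_like = np_vals.reshape(1, -1)  # a (1, 3) view, no copy
print(block_like.shape)  # (1, 3)

# Option 2 amounts to letting EA-backed values take the same
# (1, N) / (N, 1) shape internally, so Block.shape == values.shape.
```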
>>>> >>>> >>>>> On Wed, Jun 12, 2019 at 1:56 PM Joris Van den Bossche < >>>>> jorisvandenbossche at gmail.com> wrote: >>>>> >>>>>> Op wo 12 jun. 2019 om 18:18 schreef Jeff Reback >>>>> >: >>>>>> >>>>>>> ... >>>>>>> >>>>>> So here's another proposal (a bit half-baked but....): >>>>>>> >>>>>>> You *could* build a single dtyped container (that actually holds the >>>>>>> 1D arrays themselves). Then you could put EA arrays and numpy arrays on the >>>>>>> same footing. Meaning each >>>>>>> 'Block' would be exactly the same. >>>>>>> >>>>>>> - This would make operations the *same* across all 'Blocks', >>>>>>> reducing complexity >>>>>>> - We could simply take views on 2D numpy arrays to actually avoid a >>>>>>> performance penalty of copying (as we can construct from a 2D numpy array >>>>>>> a lot); this causes some aggregation ops to be much slower than if we >>>>>>> actually copy, but that has a cost too >>>>>>> - Ops could be defined on EA & Pandas Arrays; these can then operate >>>>>>> array-by-array (within a Block), or using numba we could implement the ops >>>>>>> in a way that we can get a pretty big speedup for a particular kernel >>>>>>> >>>>>> >>>>>> How would this proposal avoid the above-mentioned performance >>>>>> implication of doing ops column-by-column? >>>>>> >>>>>> In general, I think we should try to do a few basic benchmarks on >>>>>> what the performance impact would be for some typical use cases when all >>>>>> ops are done column-by-column / all columns are stored as separate blocks >>>>>> (Jeff had a branch at some point that made this optional). To have a better >>>>>> idea of the (dis)advantages for the different proposals. >>>>>> >>>>>> Brock, can you give your thoughts about the idea of having _only_ 1D >>>>>> Blocks? That could also solve a lot of the complexity of the internals (no >>>>>> 1D vs 2D) and have many of the advantages you mentioned in the first email, >>>>>> but I don't think you really answered that aspect. 
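The kind of basic benchmark suggested here can be sketched with plain NumPy (illustrative only — it isolates the column-by-column iteration cost and ignores the alignment and dispatch overhead real blocks add on top):

```python
import timeit

import numpy as np

# The same data held as one consolidated 2D block vs. twenty 1D columns.
data = np.random.rand(100_000, 20)
columns = [data[:, i].copy() for i in range(data.shape[1])]

blocked = timeit.timeit(lambda: data.sum(axis=0), number=50)
columnwise = timeit.timeit(
    lambda: np.array([col.sum() for col in columns]), number=50
)
print(f"2D block:         {blocked:.4f}s")
print(f"column-by-column: {columnwise:.4f}s")
```

Both paths produce identical results; the interesting number is the ratio, which grows as columns get shorter and more numerous.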
>>>>>> Joris
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pandas-dev mailing list
>>>>>> Pandas-dev at python.org
>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev

From vlb at cfcl.com Sun Jun 23 22:54:23 2019 From: vlb at cfcl.com (Vicki Brown) Date: Sun, 23 Jun 2019 19:54:23 -0700 Subject: [Pandas-dev] Unexpected error using pd.read_json with chunksize Message-ID: <804E8B36-7808-4F2E-BE66-F8CE7F922C00@cfcl.com>

TL;DR

I am getting an error I don't understand from my pd.read_json reader:

ValueError: Unexpected character found when decoding 'false'

Details

I am working through a set of code exercises that use a Yelp reviews dataset. At this point in the exercises I am supposed to read in review.json, which has one JSON record per line (and 6 million lines! ;-).

Because the complete dataset file is enormous, the tutorial recommends using chunksize and building a json reader. When I try this, I get an error, even with my test input.

I am befuddled.

TESTS:

I have smaller versions of the JSON file, with 100, 3, and 1 records, for testing. I can read all of the test files into a pandas dataframe and examine them, e.g.:

path = 'file://localhost/Users/vlb/Learn/DSC_Intro/'
filename = path + 'yelp_dataset/review_test.json'

# read the entire file
reviews = pd.read_json(filename, lines=True)

I cannot read any of the test files in chunks. I get the same error.

CODE:

My code currently looks like this:

path = 'file://localhost/Users/.../DSC_Intro/'
filename = path + 'yelp_dataset/review_test.json'

# create a reader to read in chunks
review_reader = pd.read_json(StringIO(filename), lines=True, chunksize=1)

`type(review_reader)` returns pandas.io.json.json.JsonReader as expected. 
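One likely culprit — a guess, not something confirmed at this point in the thread: `StringIO(filename)` wraps the path *string* rather than the file's contents, so the JSON parser is handed the literal text `file://...` and trips over the leading `f`, which it reads as the start of `false`. A self-contained sketch shows the chunked reader working when it is given actual JSON lines:

```python
import pandas as pd
from io import StringIO

# Two line-delimited JSON records standing in for review_test.json.
json_lines = (
    '{"stars": 5.0, "useful": 1}\n'
    '{"stars": 4.0, "useful": 0}\n'
)

# Passing real content (or a bare file path) works; wrapping the *path*
# in StringIO feeds the path text itself to the parser.
review_reader = pd.read_json(StringIO(json_lines), lines=True, chunksize=1)
chunks = list(review_reader)
print(len(chunks))  # 2
print(chunks[0])
```

If that is indeed the problem, the minimal fix would be `pd.read_json(filename, lines=True, chunksize=1)` — the path passed directly, exactly as in the working non-chunked call above.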
However

for chunk in review_reader:
    print(chunk)

throws an error on all test files.

/Local/Users/vlb/anaconda3/lib/python3.7/site-packages/pandas/io/json/json.py in _parse_no_numpy(self)
    869         if orient == "columns":
    870             self.obj = DataFrame(
--> 871                 loads(json, precise_float=self.precise_float), dtype=None)
    872         elif orient == "split":
    873             decoded = {str(k): v for k, v in compat.iteritems(

ValueError: Unexpected character found when decoding 'false'

INPUT: Sample JSON

{"review_id":"Amo5gZBvCuPc_tZNpHwtsA","user_id":"DzZ7piLBF-WsJxqosfJgtA","business_id":"qx6WhZ42eDKmBchZDax4dQ","stars":5.0,"useful":1,"funny":0,"cool":0,"text":"Our family LOVES the food here. Quick, friendly, delicious, and a great restaurant to take kids to. 5 stars!","date":"2017-03-27 01:14:37"}

references:
https://www.yelp.com/dataset/documentation/main
https://courses.springboard.com/courses/448797

From william.ayd at icloud.com Mon Jun 24 09:11:01 2019 From: william.ayd at icloud.com (William Ayd) Date: Mon, 24 Jun 2019 09:11:01 -0400 Subject: [Pandas-dev] Unexpected error using pd.read_json with chunksize In-Reply-To: <804E8B36-7808-4F2E-BE66-F8CE7F922C00@cfcl.com> References: <804E8B36-7808-4F2E-BE66-F8CE7F922C00@cfcl.com> Message-ID:

Can you determine which line in the file is throwing the error?

> On Jun 23, 2019, at 10:54 PM, Vicki Brown wrote:
>
> TL;DR
>
> I am getting an error I don't understand from my pd.read_json reader:
>
> ValueError: Unexpected character found when decoding 'false'

From vlb at cfcl.com Mon Jun 24 12:35:43 2019 From: vlb at cfcl.com (Vicki Brown) Date: Mon, 24 Jun 2019 09:35:43 -0700 Subject: [Pandas-dev] Unexpected error using pd.read_json with chunksize In-Reply-To: References: Message-ID: <35E1F112-393B-4642-9F53-607AD87CEE72@cfcl.com>

On Mon, 24 Jun 2019 09:11:01 -0400 William Ayd wrote
>> Can you determine which line in the file is throwing the error?

It doesn't seem to matter. I've pulled all three of the records in my 3-record test file out into individual files. They each fail the same way.

I also tried sending one of those records directly through json.loads

import json
json_string = """
{"review_id":"Amo5gZBvCuPc_tZNpHwtsA","user_id":"DzZ7piLBF-WsJxqosfJgtA","business_id":"qx6WhZ42eDKmBchZDax4dQ","stars":5.0,"useful":1,"funny":0,"cool":0,"text":"Our family LOVES the food here. Quick, friendly, delicious, and a great restaurant to take kids to. 5 stars!","date":"2017-03-27 01:14:37"}
"""
data = json.loads(json_string)
type(data)

where it works as expected. (I don't have the precise_float argument set. Perhaps that's failing?)

- Vicki

From garcia.marc at gmail.com Mon Jun 24 13:30:58 2019 From: garcia.marc at gmail.com (Marc Garcia) Date: Mon, 24 Jun 2019 18:30:58 +0100 Subject: [Pandas-dev] Unexpected error using pd.read_json with chunksize In-Reply-To: <35E1F112-393B-4642-9F53-607AD87CEE72@cfcl.com> References: <35E1F112-393B-4642-9F53-607AD87CEE72@cfcl.com> Message-ID:

Probably better if you open an issue in GitHub with an exact example that fails, and we'll investigate.

On Mon, 24 Jun 2019, 17:36 Vicki Brown, wrote:
> It doesn't seem to matter. I've pulled all three of the records in my
> 3-record test file out into individual files. They each fail the same way.

From vlb at cfcl.com Mon Jun 24 14:08:50 2019 From: vlb at cfcl.com (Vicki Brown) Date: Mon, 24 Jun 2019 11:08:50 -0700 Subject: [Pandas-dev] Unexpected error using pd.read_json with chunksize In-Reply-To: References: <35E1F112-393B-4642-9F53-607AD87CEE72@cfcl.com> Message-ID:

https://github.com/pandas-dev/pandas/issues/27022

> On Jun 24, 2019, at 10:30 , Marc Garcia wrote:
>
> Probably better if you open an issue in GitHub with an exact example that fails, and we'll investigate.

From pmlandwehr at gmail.com Wed Jun 26 12:23:47 2019 From: pmlandwehr at gmail.com (Pete[r] Landwehr) Date: Wed, 26 Jun 2019 09:23:47 -0700 Subject: [Pandas-dev] Old Version Number On The Website? 
Message-ID:

Hey all,

My apologies if this was covered in the last dev meeting or is simply the way the website has always been & I've just failed to notice, but the central release notes column on pandas.pydata.org appears to be one major version behind the current release.

The "Versions" column shows the current release as 0.24.2 and the development release as 0.25.0, but the central column with change log notes shows the latest release as v0.23.4 Final (August 3, 2018).

Is there a way to get the website updated? It's a bit confusing.

Best,

pml

From tom.augspurger88 at gmail.com Wed Jun 26 12:41:41 2019 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Wed, 26 Jun 2019 11:41:41 -0500 Subject: [Pandas-dev] Old Version Number On The Website? In-Reply-To: References: Message-ID:

Fixed in https://github.com/pandas-dev/pandas-website/pull/73. Thanks.

On Wed, Jun 26, 2019 at 11:24 AM Pete[r] Landwehr wrote:
> Is there a way to get the website updated? It's a bit confusing.

From mahak at fossa.com Thu Jun 27 17:39:45 2019 From: mahak at fossa.com (Mahak Bandi) Date: Thu, 27 Jun 2019 14:39:45 -0700 Subject: [Pandas-dev] Pandas - FOSSA Badge Message-ID:

Hey Pandas Team,

One of our engineers stumbled upon your project, Pandas, on Github and posted it to our Slack! I think it's an awesome project. As you finish up development, I thought it would be great to offer up FOSSA for free to your project.

We ship a tool that tracks your dependencies at each commit and scans them for licenses and copyright violations. We can block PR's that bring in bad dependencies, auto-generate attribution/NOTICE files, and more. We work with a lot of open source projects that you may have heard of (ESLint, Mocha, Webpack/Grunt, Moment, Prometheus, etc.). We also just released a new badge that a couple of other popular projects like Kubernetes and Docker are getting on (screenshots below):

[image: FOSSA_badge_example.png]
[image: FOSSA_badge_example_2.png]

If you're interested, I'd love to help you get something in place. Feel free to import Pandas at http://fossa.com and I can upgrade your account for free and follow up with some PRs!

Best,

Mahak Bandi