From william.ayd at icloud.com Fri Nov 1 19:35:00 2019
From: william.ayd at icloud.com (William Ayd)
Date: Fri, 1 Nov 2019 16:35:00 -0700
Subject: [Pandas-dev] ANN: Pandas 0.25.3 Released!
Message-ID: <88C3D14C-DEDB-45FE-8B83-EDCC139A2BB4@icloud.com>

This is a minor bug-fix release in the 0.25.x series and includes some regression fixes and bug fixes. We recommend that all users upgrade to this version.

See the full whatsnew for a list of all the changes.

The release can be installed with conda using the conda-forge channel:

    conda install pandas

Or via PyPI:

    python3 -m pip install --upgrade pandas

Please report any issues with the release on the pandas issue tracker.

Note that there was an issue with building the PDF documentation. It will be uploaded later.

- Will

From garcia.marc at gmail.com Mon Nov 4 15:38:23 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Mon, 4 Nov 2019 15:38:23 -0500
Subject: [Pandas-dev] Discourse discussion forum
In-Reply-To:
References:

Message-ID:

We had a discussion at the NumFOCUS summit about Discourse. People with more experience also considered that it is not feasible to use Discourse for all projects in a reasonable and organized way. So, if nobody objects, we'll move forward with the pandas Discourse. I think a PyData / NumFOCUS one for communication among projects can make sense.

If anybody has ideas on categories we should have, what to set up to communicate among projects... they are more than welcome. Otherwise we'll keep adding them and figure out what works best as we use Discourse.

If you want to start signing up for our Discourse, pandas.discourse.group, I think we can start using it for discussions, and delete this list once we're confident. There are already a few people in the core team who are on Discourse and have admin rights. Please let us know if you sign up, so we can grant you admin rights too.

Cheers!

On Tue, Oct 29, 2019 at 12:45 PM Andy Ray Terrel wrote:

> Okay well I can go bug the heads of all the pydata projects, but the
> confusion comes when a user doesn't know where to post. Having lots of
> discourse sites seems like it will lead to confusion and more work for
> maintainers to curate the community discussion.
>
>
> On Tue, Oct 29, 2019 at 10:21 AM Marc Garcia
> wrote:
>
>> That's a good point. I guess it doesn't make a big difference in terms of
>> organization of the threads, as a discussion on something dask-pandas will
>> still need to be in one of the categories (pandas-dev or dask-dev). But
>> being able to tag people from other projects could be useful.
>>
>> But I still think that having separate discourse instances will make our
>> lives easier. Feels like a huge mess to have all projects in the same
>> instance with the navigation of discourse.
>>
>> On Tue, Oct 29, 2019 at 12:09 PM Andy Ray Terrel
>> wrote:
>>
>>> I think the value many have is for cross project issues, but maybe those
>>> are few and far between.
>>>
>>> On Tue, Oct 29, 2019 at 10:07 AM Marc Garcia
>>> wrote:
>>>
>>>> I personally don't see the value of having a common discourse for all
>>>> the projects, where the top-level is a list of possibly 100 items, where
>>>> pandas has a few groups lost there, and no more structure than that, as
>>>> opposed to having a discourse per project.
>>>>
>>>> Single-login is the only advantage I can see, and this can also be
>>>> achieved with separate groups from what I've seen.
>>>>
>>>> Tom, Joris, I think you were the ones who preferred having a common
>>>> discourse. Does it still sound like the best option, given the limitations?
>>>>
>>>> On Tue, Oct 29, 2019 at 11:51 AM Andy Ray Terrel
>>>> wrote:
>>>>
>>>>> Sorry I've been traveling.
>>>>>
>>>>> I have https://pydata.discourse.group set
>>>>> up. I can send out invites.
>>>>>
>>>>> I guess as you have pointed out, we can set up categories for each
>>>>> project, e.g. dask-users, pandas-users, pandas-dev, but maybe not exactly
>>>>> what you want.
>>>>>
>>>>> Happy to invite anyone to the discourse instance before we open it up
>>>>> to the wild
>>>>>
>>>>> -- Andy
>>>>>
>>>>> On Tue, Oct 29, 2019 at 9:24 AM Marc Garcia
>>>>> wrote:
>>>>>
>>>>>> Andy, could you experiment on having multiple projects in a single
>>>>>> discourse? I saw the PyData one was activated some time ago.
>>>>>>
>>>>>> If it doesn't look feasible as I think, let me know so I'll move
>>>>>> forward discussing what to have in the pandas one.
>>>>>>
>>>>>> Cheers!
>>>>>>
>>>>>> On Wed, Sep 25, 2019 at 8:03 AM Marc Garcia
>>>>>> wrote:
>>>>>>
>>>>>>> Discourse has private categories, we already have a private
>>>>>>> "Maintainers" one, that only admins can see and use. And there are other
>>>>>>> permissions levels that can be used. For example, we can have a private
>>>>>>> category for the members of the code of conduct committee...
I just need
>>>>>>> to check if we can associate email addresses to those groups, so when
>>>>>>> someone emails coc at pandas.io the messages are posted in that
>>>>>>> private group. But if we can set that up as we need, I think we should be
>>>>>>> able to replace all those and centralize everything in Discourse.
>>>>>>>
>>>>>>> I'm skeptical on being able to set up a global Discourse for all the
>>>>>>> ecosystem, where things are easy to find, based on how Discourse works and
>>>>>>> the tests I did. I'd move forward with our own for now if nobody is able to
>>>>>>> set that up.
>>>>>>>
>>>>>>> Andy, I got the pandas account approved in minutes. I see that we
>>>>>>> can have a custom domain, so you can use the pandas one and see if we can
>>>>>>> manage to have multiple projects in a way we like, and if we do we just
>>>>>>> change the domain to discuss.pydata.org (or whatever). You're
>>>>>>> already an admin, feel free to experiment and change the set up as you need.
>>>>>>>
>>>>>>> Maarten, not sure I understand your point. Not a fan of Discourse so
>>>>>>> far, but I think having the user and the devs discussions in a single place
>>>>>>> makes it easier to find the information, and I think the Discourse interface
>>>>>>> also makes it easier to find compared to mailman, or google groups.
>>>>>>> Regardless of gitter (there are no important discussions or decision making
>>>>>>> there I think), would you prefer to stay with mailman and google groups
>>>>>>> over Discourse? Or what do you think would be the ideal or best option?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> On Wed, Sep 25, 2019 at 8:39 AM Joris Van den Bossche <
>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>
>>>>>>>> What do other people think about starting to use discourse for
>>>>>>>> pandas?
>>>>>>>> (and about sharing it with other projects or having our own?)
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> On the existing lists: I don't think discourse would replace the
>>>>>>>> core devs list (that is intentionally private). And IMO also not gitter
>>>>>>>> (discourse is not a real-time chat).
>>>>>>>>
>>>>>>>> Joris
>>>>>>>>
>>>>>>>> On Fri, 20 Sep 2019 at 14:58, Marc Garcia
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> For what I've seen I'd say that Discourse can be configured to
>>>>>>>>> interact with a category like a distribution list (subscribe and have an
>>>>>>>>> email address to send messages there). Not sure, but for the settings I've
>>>>>>>>> seen should be possible.
>>>>>>>>>
>>>>>>>>> Personally I think it should replace all the existing lists:
>>>>>>>>> - pydata google group
>>>>>>>>> - pandas-dev (this)
>>>>>>>>> - core devs list
>>>>>>>>>
>>>>>>>>> I'm also ok to get rid of gitter once we move to discourse (also
>>>>>>>>> ok to keep it if people find it useful, but I rarely use it).
>>>>>>>>>
>>>>>>>>> I created an issue for this discussion some time ago:
>>>>>>>>> https://github.com/pandas-dev/pandas/issues/27903
>>>>>>>>>
>>>>>>>>> On Fri, Sep 20, 2019 at 1:50 PM Tom Augspurger <
>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 20, 2019 at 6:57 AM Andy Terrel
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Joris for splitting the thread, sorry if I hijacked the
>>>>>>>>>>> other one.
>>>>>>>>>>>
>>>>>>>>>>> For some discussion from numpy you can see here
>>>>>>>>>>> https://github.com/numpy/numpy.org/issues/28
>>>>>>>>>>>
>>>>>>>>>>> Julia and Jupyter both run their own discourse but Dask, Numpy,
>>>>>>>>>>> Scipy have all told me "I don't want to run it ourselves but be part of a
>>>>>>>>>>> larger one"
>>>>>>>>>>>
>>>>>>>>>>> I bet we can figure out how to organize it.
>>>>>>>>>>>
>>>>>>>>>>> I just put in an application to get pydata.discourse.org.
>>>>>>>>>>>
>>>>>>>>>>> --
Andy
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 20, 2019 at 6:52 AM Joris Van den Bossche <
>>>>>>>>>>> jorisvandenbossche at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> (let's use a new thread for discourse, as it is a different
>>>>>>>>>>>> discussion from the website hosting I think, regardless whether OVH might
>>>>>>>>>>>> also host discourse)
>>>>>>>>>>>>
>>>>>>>>>>>> I am not familiar enough myself with discourse to know whether
>>>>>>>>>>>> multiple projects sharing a single discourse will become annoying. But
>>>>>>>>>>>> indeed, that sounds like it needs some kind of hierarchical category /
>>>>>>>>>>>> tagging.
>>>>>>>>>>>>
>>>>>>>>>>>> For pandas itself: I think I quite like the idea of having a
>>>>>>>>>>>> discourse, but *if* we do that, we should think about how that
>>>>>>>>>>>> fits with / replaces / adds to /... some of the other communication
>>>>>>>>>>>> channels (pandas-dev mailing list, pydata mailing list, github issues, ..).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> IMO, we can replace the pandas-dev & pydata mailing lists with
>>>>>>>>>> it. Possibly gitter as well.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Joris
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 20 Sep 2019 at 13:18, Marc Garcia <
>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm fine with that conceptually, but I think Discourse will
>>>>>>>>>>>>> make things quite tricky to find then.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We already got our discourse approved, if you want to join it
>>>>>>>>>>>>> and experiment with the settings. But it's the first thing I tried, and after
>>>>>>>>>>>>> you join a category (project), everything feels like it's in the same place
>>>>>>>>>>>>> (even if subcategories and tags exist). And I think we need at least a
>>>>>>>>>>>>> clear separation between pandas/users and pandas/contributors discussions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe I just couldn't find the settings, let me know if you
>>>>>>>>>>>>> manage to get a multi-project set up that makes sense.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 12:07 PM Tom Augspurger <
>>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'd prefer to join a discourse along with NumPy, Dask, and
>>>>>>>>>>>>>> other PyData or NumFOCUS projects, rather than going out on our own.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Sep 20, 2019 at 4:47 AM Marc Garcia <
>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't know much about discourse, but why do we want to
>>>>>>>>>>>>>>> self-host it? Seems like Discourse does it for free for open source
>>>>>>>>>>>>>>> projects: https://free.discourse.group/ And I don't think
>>>>>>>>>>>>>>> we want another system to maintain. Am I missing something?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I applied for https://pandas.discourse.group, so we can
>>>>>>>>>>>>>>> give it a try. We should have it approved and working in a couple of days.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> From what I saw, Discourse has one level of categories, so I
>>>>>>>>>>>>>>> guess we want one per project, so we can have categories for "Users",
>>>>>>>>>>>>>>> "Contributors", "Ecosystem"... or something similar. I guess if we have a
>>>>>>>>>>>>>>> single Discourse for NumFOCUS, every project will be a category, and it'll
>>>>>>>>>>>>>>> be difficult to group conversations.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If anyone already has experience with Discourse and
>>>>>>>>>>>>>>> disagrees with my guesses, please let me know.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 4:32 PM Andy Terrel <
>>>>>>>>>>>>>>> andy at numfocus.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds great to me. Just let me know where everything goes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> NumPy wants me to help host a discourse for them, maybe OVH
>>>>>>>>>>>>>>>> would be a good place to do that as well, (although I would be more
>>>>>>>>>>>>>>>> inclined if it was pydata and we had pandas, scipy, and numpy on it).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Andy
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:51 AM Tom Augspurger <
>>>>>>>>>>>>>>>> tom.augspurger88 at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sounds good w.r.t crediting OVH on those pages.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For the ASV results at pandas.pydata.org/speed (which I
>>>>>>>>>>>>>>>>> now notice is currently broken for pandas), the only thing on the webserver
>>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>> cron job doing a `git pull` from
>>>>>>>>>>>>>>>>> https://github.com/asv-runner/asv-collection, from within
>>>>>>>>>>>>>>>>> `/usr/share/nginx`.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Tom
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Sep 18, 2019 at 8:18 AM Marc Garcia <
>>>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> An update on the new website infrastructure. We need to
>>>>>>>>>>>>>>>>>> finish discussing the details, but OVH is happy to provide the hosting for
>>>>>>>>>>>>>>>>>> the pandas infrastructure we need.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My initial idea is to credit them in the page with the
>>>>>>>>>>>>>>>>>> rest of the sponsors in the new website:
>>>>>>>>>>>>>>>>>> https://datapythonista.github.io/pandas-web/community/team.html#institutional-partners and
>>>>>>>>>>>>>>>>>> also in the top right corner of the runnable code widgets (see for example
>>>>>>>>>>>>>>>>>> where Binder is credited here: https://spacy.io/).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What I'd like to ask is:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 1. For the production website and docs (static content
>>>>>>>>>>>>>>>>>> only, for the traffic we need):
>>>>>>>>>>>>>>>>>> https://us.ovhcloud.com/products/public-cloud/object-storage
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2.
For our tools and processes, like the benchmarks,
>>>>>>>>>>>>>>>>>> builds, CI stuff (temporarily publish the docs for every PR,...):
>>>>>>>>>>>>>>>>>> https://www.ovh.co.uk/vps/vps-ssd.xml (VPS SSD 3)
>>>>>>>>>>>>>>>>>> 3. For BinderHub (runnable code in our docs, launch
>>>>>>>>>>>>>>>>>> tutorials on Binder...):
>>>>>>>>>>>>>>>>>> https://www.ovh.co.uk/public-cloud/kubernetes/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For the BinderHub, QuantStack offered help with the set
>>>>>>>>>>>>>>>>>> up (which is great, because I don't know much about Binder myself, and I'm
>>>>>>>>>>>>>>>>>> not sure if anyone else does or wants to take care of this). I don't think
>>>>>>>>>>>>>>>>>> it'll be easy to estimate how big a cluster we need beforehand, but I
>>>>>>>>>>>>>>>>>> guess we can add things to Binder iteratively, and have more info as we
>>>>>>>>>>>>>>>>>> grow.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> OVH gave us a 200 euro voucher to experiment with the
>>>>>>>>>>>>>>>>>> different services. Let me know how all this sounds, and if there are no
>>>>>>>>>>>>>>>>>> objections, I'll create an account and buy those services with the voucher,
>>>>>>>>>>>>>>>>>> and I'll start to prototype and see how everything works.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Aug 20, 2019 at 11:06 PM Marc Garcia <
>>>>>>>>>>>>>>>>>> garcia.marc at gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Somehow related to the work on the new website (
>>>>>>>>>>>>>>>>>>> https://github.com/pandas-dev/pandas/pull/28014), I've
>>>>>>>>>>>>>>>>>>> been discussing with the Binder team, and it looks like it should be quite easy
>>>>>>>>>>>>>>>>>>> soon (with a Sphinx extension) to make all the documentation pages runnable
>>>>>>>>>>>>>>>>>>> with Binder, directly from the website (without opening the page as a
>>>>>>>>>>>>>>>>>>> Jupyter in mybinder).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> While they are very happy with the idea of having this
>>>>>>>>>>>>>>>>>>> in pandas, it's uncertain if the current infrastructure Binder has is
>>>>>>>>>>>>>>>>>>> able to handle all the traffic we would send. And scikit-learn is working
>>>>>>>>>>>>>>>>>>> on it too (today they added to the dev docs a link to mybinder to run the
>>>>>>>>>>>>>>>>>>> examples).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm discussing with OVH (their infrastructure provider)
>>>>>>>>>>>>>>>>>>> on whether they'd be happy to provide a dedicated BinderHub specific to
>>>>>>>>>>>>>>>>>>> pandas (or maybe we can have one for all NumFOCUS projects). We'll see how
>>>>>>>>>>>>>>>>>>> it goes, but wanted to let you know, so you're updated, and in case anyone
>>>>>>>>>>>>>>>>>>> is interested in participating in the discussions. Of course before any
>>>>>>>>>>>>>>>>>>> decision is made I'll open a discussion here or on GitHub.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As part of the discussion I'm also trying to get a
>>>>>>>>>>>>>>>>>>> server for the website, and one for development stuff. Specifically for the
>>>>>>>>>>>>>>>>>>> dev docs (including rendered docs of every PR) and the GitHub app that will
>>>>>>>>>>>>>>>>>>> generate them. I guess it should be very easy to find a sponsor for these
>>>>>>>>>>>>>>>>>>> two servers (in exchange for a small note in the footer of the website, or
>>>>>>>>>>>>>>>>>>> something like that).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Let me know if you have any comment, want to be involved
>>>>>>>>>>>>>>>>>>> or whatever.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cheers!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> Pandas-dev mailing list
>>>>>>>>>>>>>>>>>> Pandas-dev at python.org
>>>>>>>>>>>>>>>>>> https://mail.python.org/mailman/listinfo/pandas-dev
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Andy R.
Terrel, PhD
>>>>>>>>>>>>>>>> President
>>>>>>>>>>>>>>>> NumFOCUS
>>>>>>>>>>>>>>>> andy at numfocus.org
>>>>>>>>>>>>>>>>

From tom.augspurger88 at gmail.com Mon Nov 11 10:27:20 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Mon, 11 Nov 2019 09:27:20 -0600
Subject: [Pandas-dev] Monthly Dev Meeting
Message-ID:

Hi all,

We're having our monthly dev call at 19:00 UTC.
Hangout: https://meet.google.com/hav-rmax-zjx

Agenda:
https://www.google.com/url?q=https://docs.google.com/document/u/1/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?ouid%3D102771015311436394588%26usp%3Ddocs_home%26ths%3Dtrue&sa=D&ust=1568140984702000&usg=AOvVaw2b6zalhLhHSiI8GrLB5VVL

Please feel free to add agenda items before the call.

Tom

From jorisvandenbossche at gmail.com Mon Nov 11 10:32:54 2019
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Mon, 11 Nov 2019 16:32:54 +0100
Subject: [Pandas-dev] Monthly Dev Meeting
In-Reply-To:
References:
Message-ID:

To add to Tom's message: the dev call is this coming Wednesday, November 13th (19:00 UTC).

From garcia.marc at gmail.com Mon Nov 11 19:46:00 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Tue, 12 Nov 2019 01:46:00 +0100
Subject: [Pandas-dev] Remove old mailing lists
Message-ID:

(Forwarding message from Discourse here, but better to keep the discussion there: https://pandas.discourse.group/t/remove-old-mailing-lists/)

I don't think we should keep the existing mailing lists for too long in parallel with the new Discourse. Otherwise we'll keep the two systems, and have a mess with the discussions.
My proposal is:
- Announce Discourse via Twitter and the pydata mailing list
- Leave a couple of weeks to see that there are no unexpected issues with Discourse
- On the 25th of November, delete the previous mailing lists (pandas-dev, and the core team google group), and keep just Discourse

I guess the pydata google group has a broader scope than pandas, and should be left as is for now.

Any objection?

From jtratner at gmail.com Mon Nov 11 21:47:34 2019
From: jtratner at gmail.com (Jeffrey Tratner)
Date: Mon, 11 Nov 2019 18:47:34 -0800
Subject: [Pandas-dev] Remove old mailing lists
In-Reply-To:
References:
Message-ID:

Here's a (possible) guide to make discourse work like a mailing list. I haven't tried it yet but am hoping to soon:
https://discourse.mozilla.org/t/how-do-i-use-discourse-via-email/15279
From me at pietrobattiston.it Tue Nov 12 03:26:04 2019
From: me at pietrobattiston.it (Pietro Battiston)
Date: Tue, 12 Nov 2019 09:26:04 +0100
Subject: [Pandas-dev] Remove old mailing lists
In-Reply-To:
References:
Message-ID: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

To be honest, I totally missed this, having thought that discourse was an experiment in community involvement and not an immediate replacement for lists, even less so for dev lists.

Precisely because I missed this, I totally accept whatever other devs, who spent time planning this, think works best.

But the more the system works like a mailing list (see Jeffrey's mail), the better I like it.

Cheers,

Pietro

From tom.augspurger88 at gmail.com Tue Nov 12 06:47:30 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 12 Nov 2019 05:47:30 -0600
Subject: [Pandas-dev] Remove old mailing lists
In-Reply-To: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>
References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>
Message-ID:

Let's actually keep the discussion here, to get feedback from people who haven't signed up for discourse.

I agree that we don't want to run these in parallel. So if we're going to make the switch we'll want to shut this list down and at least announce it on the PyData google group.

I don't know what mailman supports, but we won't want to really delete the pandas-dev mailing list. https://mail.python.org/pipermail/pandas-dev/ is a valuable archive of past discussions. Hopefully we're able to put this list in archive mode.

We can discuss this on Wednesday's call.

From irv at princeton.com Tue Nov 12 10:14:38 2019
From: irv at princeton.com (Irv Lustig)
Date: Tue, 12 Nov 2019 10:14:38 -0500
Subject: [Pandas-dev] Remove old mailing lists
In-Reply-To:
References:
Message-ID:

I'm just a lurker giving occasional feedback and occasional PRs, but I will say that you haven't had an announcement here on how to sign up for discourse. Or maybe I missed it.

Dr-Irv
From jorisvandenbossche at gmail.com Tue Nov 12 11:01:29 2019
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Tue, 12 Nov 2019 17:01:29 +0100
Subject: [Pandas-dev] Plans for pandas 1.0 [Re: Plans for pandas 0.25.0 and pandas 1.0]
In-Reply-To: References: Message-ID:

Hi all,

In the meantime, September went by .. :) (as usual with 1.0 targets). See also https://github.com/pandas-dev/pandas/issues/17287

But on a more serious note: if we want to get a 1.0 out in the somewhat near future, I think we should do some planning for it / make decisions on what the blockers are / organize some effort towards them (right now, I don't have the feeling there is much (coordinated) effort to get closer to 1.0, also from my side to be clear).

- The initial idea was to remove all deprecations from pre 0.25
  - Do we still want to do this? (I would say yes; according to our policy they otherwise need to stay until 2.0; we can of course update our versioning policy)
  - Is this a blocker? (probably not each and every one of them; but having done a majority of them might be? Or certain specific ones like .ix might be?)
  - Is there a way we can do some coordinated effort on this? Or try to get more contributions in this area?
- In the meantime, also some specific features / improvements have been discussed that might want to get into 1.0. Are those blockers?
  - The missing values discussion. Given my personal involvement, I would like to see some basics of it in 1.0 (so the new StringArray can also already start using it). But it's also something that will take time to implement, so maybe that will get tight.
  - The block-wise ops performance improvement. Is this still targeted for 1.0?
  - Some API breaks we still want to do in 1.0? There are several ExtensionArray related ones (e.g. the return value of .astype(..)) that might be nice.
  - See also the GitHub milestone: https://github.com/pandas-dev/pandas/milestone/16
  - ... ?
- Is the website and docs redesign a blocker?
To what extent do we want to delay for some of the blockers above?

What are our current ideas for a timeline? The GitHub milestone is saying December 1, but I don't think that is realistic.

Joris

On Wed, 17 Jul 2019 at 00:55, Joris Van den Bossche <jorisvandenbossche at gmail.com> wrote:

> Hi all,
>
> We had some discussion about this at the in-person dev sprint at the end of June, and I thought it would be good to have some public record of this as well.
>
> A pandas 0.25.0 release is close (the RC was released earlier this month), see https://github.com/pandas-dev/pandas/issues/24950
>
> For pandas 1.0, the current plan is to finally "just do it". The idea is that it should not take too long after 0.25.0, without additional major API changes (additions are fine of course) but with removing the current deprecated functionalities.
> Depending on how much feedback there is on 0.25.0 and on how smoothly it goes for removing deprecated stuff, we could (maybe optimistically) target September for that.
>
> Comments certainly welcome!
>
> Joris

From tom.augspurger88 at gmail.com Tue Nov 12 11:15:03 2019
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Tue, 12 Nov 2019 10:15:03 -0600
Subject: [Pandas-dev] Plans for pandas 1.0 [Re: Plans for pandas 0.25.0 and pandas 1.0]
In-Reply-To: References: Message-ID:

I started going through the issues in the 1.0 milestone this morning. Hoping to finish all of them by the end of the day.

On Tue, Nov 12, 2019 at 10:01 AM Joris Van den Bossche <jorisvandenbossche at gmail.com> wrote:

> Hi all,
>
> In the meantime, September went by ..
> :) (as usual with 1.0 targets). See also https://github.com/pandas-dev/pandas/issues/17287
>
> But on a more serious note: if we want to get a 1.0 out in the somewhat near future, I think we should do some planning for it / make decisions on what the blockers are / organize some effort towards them (right now, I don't have the feeling there is much (coordinated) effort to get closer to 1.0, also from my side to be clear).
>
> - The initial idea was to remove all deprecations from pre 0.25
>   - Do we still want to do this? (I would say yes; according to our policy they otherwise need to stay until 2.0; we can of course update our versioning policy)
>   - Is this a blocker? (probably not each and every one of them; but having done a majority of them might be? Or certain specific ones like .ix might be?)
>   - Is there a way we can do some coordinated effort on this? Or try to get more contributions in this area?

I think not a blocker in general. But perhaps we could organize a mini-sprint (online) to knock a bunch out? We can do it as devs and invite community members to join in.

> - In the meantime, also some specific features / improvements have been discussed that might want to get into 1.0. Are those blockers?
>   - The missing values discussion. Given my personal involvement, I would like to see some basics of it in 1.0 (so the new StringArray can also already start using it). But it's also something that will take time to implement, so maybe that will get tight.

I'd consider delaying 1.0 a bit for this, but it's not absolutely necessary. We can also break small pieces off: StringArray can define and use its own pd.NA, without larger changes.

>   - The block-wise ops performance improvement. Is this still targeted for 1.0?

Would be great to have this in. *Probably* not a blocker though. We can do 1.1 shortly after 1.0.

>   - Some API breaks we still want to do in 1.0?
>     There are several ExtensionArray related ones (e.g. the return value of .astype(..)) that might be nice.
>   - See also the GitHub milestone: https://github.com/pandas-dev/pandas/milestone/16
>   - ... ?
> - Is the website and docs redesign a blocker?

Happy to delay for a week or so, but not a blocker IMO. The nice thing there is we can continue to improve things after tagging the release. For major improvements to the docs, we can rebuild with a fake tag and reupload.

> To what extent do we want to delay for some of the blockers above?
>
> What are our current ideas for a timeline? The GitHub milestone is saying December 1, but I don't think that is realistic.

I think we'll have a better idea after the call tomorrow, but let's aim for a release candidate sometime in December. But people will likely be less active around the holidays, so a final release this year may be too ambitious.

> Joris
>
> On Wed, 17 Jul 2019 at 00:55, Joris Van den Bossche <jorisvandenbossche at gmail.com> wrote:
>
>> Hi all,
>>
>> We had some discussion about this at the in-person dev sprint at the end of June, and I thought it would be good to have some public record of this as well.
>>
>> A pandas 0.25.0 release is close (the RC was released earlier this month), see https://github.com/pandas-dev/pandas/issues/24950
>>
>> For pandas 1.0, the current plan is to finally "just do it". The idea is that it should not take too long after 0.25.0, without additional major API changes (additions are fine of course) but with removing the current deprecated functionalities.
>> Depending on how much feedback there is on 0.25.0 and on how smoothly it goes for removing deprecated stuff, we could (maybe optimistically) target September for that.
>>
>> Comments certainly welcome!
>> >> Joris >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Tue Nov 12 13:33:31 2019 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 12 Nov 2019 10:33:31 -0800 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it> Message-ID: I'm not very active in pandas development anymore, but for what it's worth, I find mailing lists much easier to follow casually. Everyone once in a while I see something interesting and chime in. I would be unlikely to do that on Discourse. On Tue, Nov 12, 2019 at 3:47 AM Tom Augspurger wrote: > Let's actually keep the discussion here, to get feedback from people who > haven't signed up for discourse. > > I agree that we don't want to run these in parallel. So if we're going to > make the switch we'll want to shut this list down > and at least announce it on the PyData google group. > > I don't know what mailman supports, but we won't want to really delete the > pandas-dev mailing list. https://mail.python.org/pipermail/pandas-dev/ > is a valuable archive of past discussions. Hopefully we're able to put > this list in archive mode. > > We can discuss this on Wednesday's call. > > On Tue, Nov 12, 2019 at 2:32 AM Pietro Battiston > wrote: > >> To be honest, I totally missed this, having thought that discourse was >> an experiment in community envolvement and not an immediate replacement >> for lists, even less so for devs lists. >> >> Precisely because I missed this, I totally accept whatever other devs, >> who spent time planning this, think works best. >> >> But the more the system works similarly as mailing lists (see Jeffrey's >> mail), the better I like it. 
>> >> Cheers, >> >> Pietro >> >> Il giorno lun, 11/11/2019 alle 18.47 -0800, Jeffrey Tratner ha scritto: >> > Here?s a (possible) guide to make discourse work like a mailing list >> > . I haven?t tried it yet but am hoping to soon: >> > https://discourse.mozilla.org/t/how-do-i-use-discourse-via-email/15279 >> > >> > On Mon, Nov 11, 2019 at 4:46 PM Marc Garcia >> > wrote: >> > > (Forwarding message from Discourse here, but better keep the >> > > discussion there: >> > > https://pandas.discourse.group/t/remove-old-mailing-lists/) >> > > >> > > I don't think we should keep the existing mailing lists for too >> > > long in parallel with the new Discourse. Otherwise we'll keep the >> > > two systems, and have a mess with the discussions. >> > > >> > > My proposal is: >> > > - Announce Discourse via twitter and the pydata mailing list >> > > - Leave couple of weeks to see the there is no unexpected issue >> > > with Discourse >> > > - On the 25th of November, delete the previous mailing lists >> > > (pandas-dev, and the core team google group), and keep just >> > > Discourse >> > > >> > > I guess the pydata google group has a broader scope than pandas, >> > > and should be left as is for now. >> > > >> > > Any objection? >> > > _______________________________________________ >> > > Pandas-dev mailing list >> > > Pandas-dev at python.org >> > > https://mail.python.org/mailman/listinfo/pandas-dev >> > >> > _______________________________________________ >> > Pandas-dev mailing list >> > Pandas-dev at python.org >> > https://mail.python.org/mailman/listinfo/pandas-dev >> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mrocklin at gmail.com Tue Nov 12 13:51:03 2019 From: mrocklin at gmail.com (Matthew Rocklin) Date: Tue, 12 Nov 2019 10:51:03 -0800 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID: I'm in the same position as Stephan. I would be interested if there was some broader PyData wide discourse though. I rarely interact here though, so I wouldn't optimize for me. On Tue, Nov 12, 2019 at 10:34 AM Stephan Hoyer wrote: > I'm not very active in pandas development anymore, but for what it's > worth, I find mailing lists much easier to follow casually. Everyone once > in a while I see something interesting and chime in. I would be unlikely to > do that on Discourse. > > On Tue, Nov 12, 2019 at 3:47 AM Tom Augspurger > wrote: > >> Let's actually keep the discussion here, to get feedback from people who >> haven't signed up for discourse. >> >> I agree that we don't want to run these in parallel. So if we're going to >> make the switch we'll want to shut this list down >> and at least announce it on the PyData google group. >> >> I don't know what mailman supports, but we won't want to really delete >> the pandas-dev mailing list. >> https://mail.python.org/pipermail/pandas-dev/ >> is a valuable archive of past discussions. Hopefully we're able to put >> this list in archive mode. >> >> We can discuss this on Wednesday's call. >> >> On Tue, Nov 12, 2019 at 2:32 AM Pietro Battiston >> wrote: >> >>> To be honest, I totally missed this, having thought that discourse was >>> an experiment in community envolvement and not an immediate replacement >>> for lists, even less so for devs lists. >>> >>> Precisely because I missed this, I totally accept whatever other devs, >>> who spent time planning this, think works best. >>> >>> But the more the system works similarly as mailing lists (see Jeffrey's >>> mail), the better I like it. >>> >>> Cheers, >>> >>> Pietro >>> >>> Il giorno lun, 11/11/2019 alle 18.47 -0800, Jeffrey Tratner ha scritto: >>> > Here?s a (possible) guide to make discourse work like a mailing list >>> > . 
I haven?t tried it yet but am hoping to soon: >>> > https://discourse.mozilla.org/t/how-do-i-use-discourse-via-email/15279 >>> > >>> > On Mon, Nov 11, 2019 at 4:46 PM Marc Garcia >>> > wrote: >>> > > (Forwarding message from Discourse here, but better keep the >>> > > discussion there: >>> > > https://pandas.discourse.group/t/remove-old-mailing-lists/) >>> > > >>> > > I don't think we should keep the existing mailing lists for too >>> > > long in parallel with the new Discourse. Otherwise we'll keep the >>> > > two systems, and have a mess with the discussions. >>> > > >>> > > My proposal is: >>> > > - Announce Discourse via twitter and the pydata mailing list >>> > > - Leave couple of weeks to see the there is no unexpected issue >>> > > with Discourse >>> > > - On the 25th of November, delete the previous mailing lists >>> > > (pandas-dev, and the core team google group), and keep just >>> > > Discourse >>> > > >>> > > I guess the pydata google group has a broader scope than pandas, >>> > > and should be left as is for now. >>> > > >>> > > Any objection? 
>>> > > _______________________________________________ >>> > > Pandas-dev mailing list >>> > > Pandas-dev at python.org >>> > > https://mail.python.org/mailman/listinfo/pandas-dev >>> > >>> > _______________________________________________ >>> > Pandas-dev mailing list >>> > Pandas-dev at python.org >>> > https://mail.python.org/mailman/listinfo/pandas-dev >>> >>> _______________________________________________ >>> Pandas-dev mailing list >>> Pandas-dev at python.org >>> https://mail.python.org/mailman/listinfo/pandas-dev >>> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From garcia.marc at gmail.com Tue Nov 12 14:21:53 2019 From: garcia.marc at gmail.com (Marc Garcia) Date: Tue, 12 Nov 2019 20:21:53 +0100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID:

There was a separate thread and a GitHub issue to discuss possible options, including the ones we currently have, and a PyData-wide Discourse.

I haven't used Discourse before, so I can't really say much about how it compares. But most projects of the ecosystem have moved or are planning to move to Discourse, and that was the preferred option after the previous discussions.

As it has been said, Discourse can be set up to work as a mailing list for people who don't want to use the Discourse interface. So, I don't think this should be a reason to not move forward; you'll just need to subscribe elsewhere, and send messages to a different email address.

You can check those past conversations, and if there is a new reason to use something different than Discourse, please let us know asap. Otherwise, please sign up to Discourse at pandas.discourse.group, and let me know if you have any objection to the proposed plan to make the current lists inactive in a couple of weeks (keeping the archive accessible as Tom said).

On Tue, 12 Nov 2019, 19:51 Matthew Rocklin, wrote:

> I'm in the same position as Stephan.
>
> I would be interested if there was some broader PyData wide discourse though.
>
> I rarely interact here though, so I wouldn't optimize for me.
>
> On Tue, Nov 12, 2019 at 10:34 AM Stephan Hoyer wrote:
>
>> I'm not very active in pandas development anymore, but for what it's worth, I find mailing lists much easier to follow casually. Every once in a while I see something interesting and chime in. I would be unlikely to do that on Discourse.
>>
>> On Tue, Nov 12, 2019 at 3:47 AM Tom Augspurger <tom.augspurger88 at gmail.com> wrote:
>>
>>> Let's actually keep the discussion here, to get feedback from people who haven't signed up for discourse.
>>>
>>> I agree that we don't want to run these in parallel. So if we're going to make the switch we'll want to shut this list down and at least announce it on the PyData google group.
>>> >>> I don't know what mailman supports, but we won't want to really delete >>> the pandas-dev mailing list. >>> https://mail.python.org/pipermail/pandas-dev/ >>> is a valuable archive of past discussions. Hopefully we're able to put >>> this list in archive mode. >>> >>> We can discuss this on Wednesday's call. >>> >>> On Tue, Nov 12, 2019 at 2:32 AM Pietro Battiston >>> wrote: >>> >>>> To be honest, I totally missed this, having thought that discourse was >>>> an experiment in community envolvement and not an immediate replacement >>>> for lists, even less so for devs lists. >>>> >>>> Precisely because I missed this, I totally accept whatever other devs, >>>> who spent time planning this, think works best. >>>> >>>> But the more the system works similarly as mailing lists (see Jeffrey's >>>> mail), the better I like it. >>>> >>>> Cheers, >>>> >>>> Pietro >>>> >>>> Il giorno lun, 11/11/2019 alle 18.47 -0800, Jeffrey Tratner ha scritto: >>>> > Here?s a (possible) guide to make discourse work like a mailing list >>>> > . I haven?t tried it yet but am hoping to soon: >>>> > >>>> https://discourse.mozilla.org/t/how-do-i-use-discourse-via-email/15279 >>>> > >>>> > On Mon, Nov 11, 2019 at 4:46 PM Marc Garcia >>>> > wrote: >>>> > > (Forwarding message from Discourse here, but better keep the >>>> > > discussion there: >>>> > > https://pandas.discourse.group/t/remove-old-mailing-lists/) >>>> > > >>>> > > I don't think we should keep the existing mailing lists for too >>>> > > long in parallel with the new Discourse. Otherwise we'll keep the >>>> > > two systems, and have a mess with the discussions. 
>>>> > > >>>> > > My proposal is: >>>> > > - Announce Discourse via twitter and the pydata mailing list >>>> > > - Leave couple of weeks to see the there is no unexpected issue >>>> > > with Discourse >>>> > > - On the 25th of November, delete the previous mailing lists >>>> > > (pandas-dev, and the core team google group), and keep just >>>> > > Discourse >>>> > > >>>> > > I guess the pydata google group has a broader scope than pandas, >>>> > > and should be left as is for now. >>>> > > >>>> > > Any objection? >>>> > > _______________________________________________ >>>> > > Pandas-dev mailing list >>>> > > Pandas-dev at python.org >>>> > > https://mail.python.org/mailman/listinfo/pandas-dev >>>> > >>>> > _______________________________________________ >>>> > Pandas-dev mailing list >>>> > Pandas-dev at python.org >>>> > https://mail.python.org/mailman/listinfo/pandas-dev >>>> >>>> _______________________________________________ >>>> Pandas-dev mailing list >>>> Pandas-dev at python.org >>>> https://mail.python.org/mailman/listinfo/pandas-dev >>>> >>> _______________________________________________ >>> Pandas-dev mailing list >>> Pandas-dev at python.org >>> https://mail.python.org/mailman/listinfo/pandas-dev >>> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at pietrobattiston.it Tue Nov 12 14:46:12 2019 From: me at pietrobattiston.it (Pietro Battiston) Date: Tue, 12 Nov 2019 20:46:12 +0100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID:

Marc,

is the thread you refer to https://github.com/pandas-dev/pandas/issues/27903 ?

Again, I didn't participate in the discussion, I am not contributing much lately, and so my opinion is not very important. But I had followed the discussion, I now checked the thread above, and while I derived the idea that Discourse could (and probably should) replace the pydata ML as a frontend to users (ideally of pydata, not just pandas), I fail to see a real discussion, or even just arguments, in favour of replacing the devs MLs.

On Tue, 12/11/2019 at 20.21 +0100, Marc Garcia wrote:
> [...]
>
> As it has been said, Discourse can be set up to work as a mailing list for people who don't want to use the Discourse interface. So, I don't think this should be a reason to not move forward, you'll just need to subscribe elsewhere, and send messages to a different email address.

I just read the FAQ about how to "use Discourse via email"... I guess we can live with filtering incoming emails by List-ID, but I don't understand whether there will be multiple email addresses to write to, for the different "subcommunities". Could you shed some light?

Thanks,

Pietro

From garcia.marc at gmail.com Tue Nov 12 15:12:31 2019
From: garcia.marc at gmail.com (Marc Garcia)
Date: Tue, 12 Nov 2019 21:12:31 +0100
Subject: [Pandas-dev] Remove old mailing lists
In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID:

That's the GitHub issue; most of the discussion was in a thread on this (pandas-dev) list. I'm on my phone and can't look for the thread now, but I think it started in the conversation about the website hosting, and then Joris created a separate thread specific to Discourse.

As I said, I have no preference on Discourse (never used it as I said), but I think what we have now is suboptimal, and it would be great to have something better. But we had that discussion several weeks ago. We decided to move forward with Discourse at that time. I spent many hours learning about it, and setting it up. So, I'm not really looking forward to starting again with this. I'm happy to hand this over to you, if you want to research further and lead the discussion. And implement whatever is best.

But if I need to continue spending time on this myself, I'd appreciate it if you can find solutions and not problems. I have no idea about how to set up Discourse as a mailing list, or how to do it for subcommunities. But if you have a specific way you want it to work, please do the research, and propose (and implement) the best solution for all of us. Or is your proposal to stay with what we have?

Thanks!

On Tue, 12 Nov 2019, 20:46 Pietro Battiston, wrote:

> Marc,
>
> is the thread you refer to https://github.com/pandas-dev/pandas/issues/27903 ?
>
> Again, I didn't participate in the discussion, I am not contributing much lately, and so my opinion is not very important. But I had followed the discussion, I now checked the thread above, and while I derived the idea that Discourse could (and probably should) replace the pydata ML as a frontend to users (ideally of pydata, not just pandas), I fail to see a real discussion, or even just arguments, in favour of replacing the devs MLs.
>
> On Tue, 12/11/2019 at 20.21 +0100, Marc Garcia wrote:
> > [...]
> > > > As it has been said, Discourse can be set up to work as a mailing > > list for people who don't want to use the Discourse interface. So, I > > don't think this should be a reason to not move forward, you'll just > > need to subscribe elsewhere, and send messages to a different email > > address. > > I just read the FAQ about how to "use Discourse via email"... I guess > we can live with filtering incoming emails by List-ID, but I don't > understand whether there will be multiple emails address to write to, > for the different "subcommunities". Could you shed some light? > > Thanks, > > Pietro
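Editorial note: the "filtering incoming emails by List-ID" idea mentioned in the quoted message can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: the List-Id values and folder names below are invented, and the identifiers that a given Mailman list or Discourse instance actually sends would have to be copied from the headers of real messages.

```python
import re
from email import message_from_string

# Hypothetical mapping from List-Id values to local folders. Both the
# identifiers and the folder names are assumptions for illustration only.
FOLDER_BY_LIST_ID = {
    "pandas-dev.python.org": "lists/pandas-dev",
    "dev.forum.example.org": "forum/dev",
}

# RFC 2919 puts the list identifier in angle brackets after an optional phrase.
_ANGLE_BRACKETS = re.compile(r"<([^<>]+)>")


def list_id_of(raw_message):
    """Return the lower-cased list identifier, or None without a List-Id header."""
    header = message_from_string(raw_message).get("List-Id")
    if header is None:
        return None
    match = _ANGLE_BRACKETS.search(header)
    return (match.group(1) if match else header).strip().lower()


def folder_for(raw_message, default="inbox"):
    """Choose a folder for a raw message based on its List-Id header."""
    return FOLDER_BY_LIST_ID.get(list_id_of(raw_message), default)


SAMPLE = (
    "From: someone@example.org\n"
    "List-Id: pandas-dev <pandas-dev.python.org>\n"
    "Subject: [Pandas-dev] Remove old mailing lists\n"
    "\n"
    "Any objection?\n"
)

print(folder_for(SAMPLE))
```

Whether each forum category gets its own identifier (and its own address to write to) is exactly the open question raised in the message, so the second dictionary entry should be read as a placeholder.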
From pierre.augier at univ-grenoble-alpes.fr Tue Nov 12 16:51:42 2019
From: pierre.augier at univ-grenoble-alpes.fr (PIERRE AUGIER)
Date: Tue, 12 Nov 2019 22:51:42 +0100 (CET)
Subject: [Pandas-dev] Re: [Numpy-discussion] Transonic Vision: unifying Python-Numpy accelerators
In-Reply-To: References: Message-ID: <1204228262.17823550.1573595502265.JavaMail.zimbra@univ-grenoble-alpes.fr>

Dear Pandas developers,

Ralf Gommers wrote to me that there was a discussion on the pandas-dev mailing list a couple of weeks ago about adopting Numba as a dependency.

We recently wrote a serious text on this subject: http://tiny.cc/transonic-vision.

As a side remark, we also played with Transonic and Pandas in this notebook https://github.com/fluiddyn/transonic-demos/blob/master/pandas.ipynb (the binder link to run the benchmarks: https://mybinder.org/v2/gh/fluiddyn/transonic-demos/master)

> Date: Wed, 6 Nov 2019 23:49:08 -0500
> From: Ralf Gommers
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] Transonic Vision: unifying Python-Numpy accelerators
> Message-ID:
> Content-Type: text/plain; charset="utf-8"
>
> On Mon, Nov 4, 2019 at 4:54 PM PIERRE AUGIER <pierre.augier at univ-grenoble-alpes.fr> wrote:
>
>> Dear Python-Numpy community,
>>
>> Transonic is a pure Python package to easily accelerate modern Python-Numpy code with different accelerators (currently Cython, Pythran and Numba).
>>
>> I'm trying to get some funding for this project. The related work would benefit in particular Cython, Numba, Pythran and Xtensor.
>>
>> To obtain this funding, we really need some feedback from people who know the subject of performance with Python-Numpy code.
>>
>> That's one of the reasons why we wrote this long and serious text on Transonic Vision: http://tiny.cc/transonic-vision.
>> We describe some issues (perf for numerical kernels, incompatible accelerators, community split between experts and simple users, ...) and possible improvements.
>
> Thanks Pierre, that's a very interesting vision paper.
>
> In case you haven't seen it, there was a discussion on the pandas-dev mailing list a couple of weeks ago about adopting Numba as a dependency (and issues with that).
>
> Your comment on my assessment from 1.5 years ago being a little unfair to Pythran may be true - not sure it was at the time, but Pythran seems to mature nicely.
>
> The ability to switch between just-in-time and ahead-of-time compilation is nice. One thing I noticed is that this actual switching is not completely fluent: the jit and boost decorators have different signatures, and there's no way to globally switch behavior (say with an env var, as for backend selection).
>
>> Help would be very much appreciated.
>
> I'd be interested to help think about adoption and/or funding.
>
> Cheers,
> Ralf
>
>> Now a coding riddle:
>>
>> import numpy as np
>> from transonic import jit
>>
>> @jit(native=True, xsimd=True)
>> def fxfy(ft, fn, theta):
>>     sin_theta = np.sin(theta)
>>     cos_theta = np.cos(theta)
>>     fx = cos_theta * ft - sin_theta * fn
>>     fy = sin_theta * ft + cos_theta * fn
>>     return fx, fy
>>
>> @jit(native=True, xsimd=True)
>> def fxfy_loops(ft, fn, theta):
>>     n0 = theta.size
>>     fx = np.empty_like(ft)
>>     fy = np.empty_like(fn)
>>     for index in range(n0):
>>         sin_theta = np.sin(theta[index])
>>         cos_theta = np.cos(theta[index])
>>         fx[index] = cos_theta * ft[index] - sin_theta * fn[index]
>>         fy[index] = sin_theta * ft[index] + cos_theta * fn[index]
>>     return fx, fy
>>
>> How do the performances of these functions compare with pure Numpy, Numba and Pythran?
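Editorial note: the riddle above can be tried without any accelerator as a pure-NumPy baseline. The sketch below is not from the original message: it drops the Transonic decorators, so it says nothing about Numba or Pythran, and the array size, the random seed and the repetition count are arbitrary assumptions. It only checks that the two variants agree and prints how long each takes on the local machine.

```python
import timeit

import numpy as np


def fxfy(ft, fn, theta):
    """Vectorized rotation of (ft, fn) by the angle theta."""
    sin_theta = np.sin(theta)
    cos_theta = np.cos(theta)
    fx = cos_theta * ft - sin_theta * fn
    fy = sin_theta * ft + cos_theta * fn
    return fx, fy


def fxfy_loops(ft, fn, theta):
    """Same computation written as an explicit Python loop."""
    n0 = theta.size
    fx = np.empty_like(ft)
    fy = np.empty_like(fn)
    for index in range(n0):
        sin_theta = np.sin(theta[index])
        cos_theta = np.cos(theta[index])
        fx[index] = cos_theta * ft[index] - sin_theta * fn[index]
        fy[index] = sin_theta * ft[index] + cos_theta * fn[index]
    return fx, fy


# Arbitrary test data; the size is kept small so the loop version stays quick.
rng = np.random.default_rng(0)
n = 2_000
ft, fn, theta = rng.random(n), rng.random(n), rng.random(n)

# Both variants must produce the same numbers before timing means anything.
fx_vec, fy_vec = fxfy(ft, fn, theta)
fx_loop, fy_loop = fxfy_loops(ft, fn, theta)
assert np.allclose(fx_vec, fx_loop) and np.allclose(fy_vec, fy_loop)

t_vec = timeit.timeit(lambda: fxfy(ft, fn, theta), number=5)
t_loop = timeit.timeit(lambda: fxfy_loops(ft, fn, theta), number=5)
print(f"vectorized: {t_vec:.6f} s, loops: {t_loop:.6f} s (5 calls each)")
```

To compare against an accelerator, the same two functions would be decorated as in the quoted riddle and timed after a warm-up call, so that compilation time is not counted.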
>> >> You can find out the answer in our note http://tiny.cc/transonic-vision >> :-) >> >> Pierre >> >> > Message: 1 >> > Date: Thu, 31 Oct 2019 21:16:06 +0100 (CET) >> > From: PIERRE AUGIER >> > To: numpy-discussion at python.org >> > Subject: [Numpy-discussion] Transonic Vision: unifying Python-Numpy >> > accelerators >> > Message-ID: >> > < >> 1080118635.5930814.1572552966711.JavaMail.zimbra at univ-grenoble-alpes.fr> >> > >> > Content-Type: text/plain; charset=utf-8 >> > >> > Dear Python-Numpy community, >> > >> > Few years ago I started to use a lot Python and Numpy for science. I'd >> like to >> > thanks all people who contribute to this fantastic community. >> > >> > I used a lot Cython, Pythran and Numba and for the FluidDyn project, we >> created >> > Transonic, a pure Python package to easily accelerate modern >> Python-Numpy code >> > with different accelerators. We wrote a long and serious text to explain >> why we >> > think Transonic could have a positive impact on the scientific Python >> > ecosystem. >> > >> > Here it is: http://tiny.cc/transonic-vision >> > >> > Feedback and discussions would be greatly appreciated! >> > >> > Pierre >> > >> > -- >> > Pierre Augier - CR CNRS http://www.legi.grenoble-inp.fr >> > LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels >> > BP53, 38041 Grenoble Cedex, France tel:+33.4.56.52.86.16 >> _______________________________________________ From garcia.marc at gmail.com Tue Nov 12 21:10:49 2019 From: garcia.marc at gmail.com (Marc Garcia) Date: Wed, 13 Nov 2019 03:10:49 +0100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID:

I was granting admin permissions to maintainers on the new Discourse, and it looks like we've got a limit of 5. I spoke with Discourse support, and they want us to pay >$300 per month to have more. I think we could live with this, but I'm a bit worried about their open source plan being more of a freemium plan they use to make big open source projects pay. I was reading the details of the conditions, and it seems we also have bandwidth limits we may reach.

An alternative is to host the instance ourselves. If we do that, we can also consider other alternatives. I just saw Flarum, which is in beta, but at first glance it looks like it manages subcategories in a way that would make it possible to have a NumFOCUS-wide forum. I'm not sure whether it has the feature of letting categories/subcategories behave as mailing lists; I need to research that.

What are your thoughts?

On Tue, Nov 12, 2019 at 9:12 PM Marc Garcia wrote: > That's the GitHub issue, most of the discussion was in a thread in this > (pandas-dev) list. I'm in my phone and can't look for the thread now, but I > think it started in the conversation about the website hosting, and then > Joris created a separate thread specific to Discourse. > > As I said, I have no preference on Discourse (never used it as I said), > but I think what we have now is suboptimal, and would be great to have > something better. But we had that discussion several weeks ago. We decided > to move forward with Discourse at that time. I spent many hours learning > about it, and setting it up. So, I'm not really looking forward to start > again with this. I'm happy to hand over this to you, if you want to > research further and lead the discussion. And implement whatever is best. > > But if I need to continue spending time on this myself, I'd appreciate if > you can find solutions and not problems. I have no idea about how to set up > Discourse as a mailing list, or how to do it for subcommunities. 
But if you > have a specific way you want it to work, please do the research, and > propose (and implement) the best solution for all us. Or is your proposal > to stay with what we have. > > Thanks! > > > On Tue, 12 Nov 2019, 20:46 Pietro Battiston, > wrote: > >> Marc, >> >> is the thread you refer to >> https://github.com/pandas-dev/pandas/issues/27903 ? >> >> Again, I didn't participate in the discussion, I am not contributing >> much lately, and so my opinion is not very important. But I had >> followed the discussion, I now checked the thread above, and while I >> derived the idea that Discourse could (and probably should) replace the >> pydata ML as a frontend to users (ideally of pydata, not just pandas), >> I fail to see a real discussion, or even just arguments, in favour of >> replacing the devs MLs. >> >> >> Il giorno mar, 12/11/2019 alle 20.21 +0100, Marc Garcia ha scritto: >> > [...] >> > >> > As it has been said, Discourse can be set up to work as a mailing >> > list for people who don't want to use the Discourse interface. So, I >> > don't think this should be a reason to not move forward, you'll just >> > need to subscribe elsewhere, and send messages to a different email >> > address. >> >> I just read the FAQ about how to "use Discourse via email"... I guess >> we can live with filtering incoming emails by List-ID, but I don't >> understand whether there will be multiple emails address to write to, >> for the different "subcommunities". Could you shed some light? >> >> Thanks, >> >> Pietro >> >>
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From david.rashty at pandibay.com Tue Nov 12 17:17:01 2019 From: david.rashty at pandibay.com (David Rashty) Date: Tue, 12 Nov 2019 17:17:01 -0500 Subject: [Pandas-dev] FW: [EXTERNAL] Re: pandas or new project In-Reply-To: References: <4bc4feca077b4b56bd14b9a7483c5b7f@FSTROYMSMAIL04.CORP.FSROOT.FLAGSTAR.COM> Message-ID:

Have you guys seen these?

https://www.python.org/dev/peps/pep-0589/
https://github.com/typelevel/frameless

I would love to have TypedDataFrame implemented in pandas and then have my IDE introspect type errors. I have a large set of business processes like this:

def gen_y(df: pd.DataFrame[x1, x2, x3, x4]) -> pd.DataFrame[x1, x2, x3, x4, y]:
    ...
    return df

Food for thought...

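For readers who have not used PEP 589, here is a minimal sketch of the idea in the message above: a TypedDict names the expected columns, and a small helper reports which of them are missing from a set of column names. `LoanRow` and `missing_columns` are hypothetical names invented for this illustration; they are not part of pandas or of the `typing` module, and the `pd.DataFrame[x1, ...]` annotation in the message is a wished-for syntax, not something this sketch provides. It assumes Python 3.8 or later, where `typing.TypedDict` is available.

```python
from typing import Iterable, List, TypedDict, get_type_hints


class LoanRow(TypedDict):
    """Hypothetical row schema; the column names are illustrative only."""
    x1: float
    x2: float
    y: float


def missing_columns(columns: Iterable[str], schema: type) -> List[str]:
    """Return the schema keys absent from `columns`, in schema order."""
    present = set(columns)
    # get_type_hints returns the declared keys of the TypedDict in order.
    return [name for name in get_type_hints(schema) if name not in present]
```

With a real DataFrame, one could presumably pass `df.columns` as the first argument. This only checks names at run time; it does not give the static, IDE-level checking the message asks for.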
On Wed, Jan 9, 2019 at 9:31 PM David M Rashty wrote:

> *From:* Wes McKinney [mailto:wesmckinn at gmail.com]
> *Sent:* Thursday, September 13, 2018 9:56 PM
> *To:* Tom Augspurger
> *Cc:* David M Rashty ; pandas-dev at python.org
> *Subject:* [EXTERNAL] Re: [Pandas-dev] pandas or new project
>
> hi David,
>
> There's nothing really wrong with injecting a bunch of custom methods into
> the DataFrame.* namespace. If you wanted, you could release your package as
> like
>
> import pandas_stata
>
> and then the new methods would be available. This is pretty common in
> large corporate environments that use pandas AFAICT. You can also propose
> your changes in pull requests to pandas.
>
> - Wes
>
> On Thu, Sep 13, 2018 at 9:41 PM Tom Augspurger wrote:
>
> With respect to your `sdrop` and `skeep`, that's the goal of
> DataFrame.filter, though the name isn't the best so it'll
> maybe be deprecated in favor of something better.
>
> The rest sound interesting, but likely out of scope for pandas. If you
> build an open source library then we'd be
> happy to include in pandas' ecosystem page:
> http://pandas.pydata.org/pandas-docs/stable/ecosystem.html
>
> Tom
>
> On Thu, Sep 13, 2018 at 7:58 PM David M Rashty wrote:
>
> Dear pandas team,
>
> I am a long time Stata user and I started using pandas about a year ago in
> order to build web applications using an in memory dataframe structure. As
> a business user, I've found Stata to have a key advantage over pandas that
> many others have also noted: much faster development time. Examples in
> Stata:
>
> drop myvar* // drops all columns starting with myvar
> keep myvar* // drops all columns except those starting with myvar
> reg z y x // runs the regression z = a+bx+cy + error
>
> In order to use pandas in a Stata-like fashion, I've had to monkey patch
> large parts of the library e.g.,
>
> df = df.sdrop('myvar*') # same as above
> df = df.skeep('myvar*') # same as above
> df = df.sreg('z y x') # same as above
> df = df.squery('a>80 & b.str.contains("hello") & c.isin([1,2,3])') #
> df.query doesn't support str.contains and isin to my knowledge
>
> I put an 's' in front of my methods to mean either 'stata' or 'sugar'.
>
> Additionally, I've built a system to:
>
> a) Automatically load new DataFrame methods into memory (no
> additional imports required)
>
> b) A caching system to make loading data blazing fast along with a
> much tighter syntax e.g., pd.read_stata('mydata.dta') (6 secs load time) vs
> use.mydata (0.001 secs load time after the first read from file)
>
> c) A system of column 'labels' and formats to prettify various
> reports e.g., df.sscatter('rate score') produces a scatter plot with labels
> 'Interest Rate, %' and 'Credit Score', respectively.
>
> d) A reactive web app (using Flask/Redis) to quickly view the full
> DataFrame content in a browser:
>
> Basically, I've tried to eliminate any obvious advantages Stata has over
> pandas.
>
> I'm potentially interested in developing this project into something
> bigger. Would you like me to share my work in the context of pandas or
> should it be a completely separate project with a different scope?
>
> Thanks,
>
> David Rashty | Flagstar Bank | Whole Loan Trading | 248-312-6692 |
> david.rashty at flagstar.com
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From short.chrisd at gmail.com Tue Nov 12 17:22:58 2019 From: short.chrisd at gmail.com (Christopher Short) Date: Wed, 13 Nov 2019 09:22:58 +1100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID:

For what it's worth - all the Discourse channels have an RSS feed, so using your preferred RSS reader makes the occasional dipping in and out straightforward.

For intermittent readers, it also offers the benefit that you can easily cast your eye over all that has transpired since you last dipped in.

cheers

> On 13 Nov 2019, at 7:12 am, Marc Garcia wrote: > > That's the GitHub issue, most of the discussion was in a thread in this (pandas-dev) list. I'm in my phone and can't look for the thread now, but I think it started in the conversation about the website hosting, and then Joris created a separate thread specific to Discourse. > > As I said, I have no preference on Discourse (never used it as I said), but I think what we have now is suboptimal, and would be great to have something better. But we had that discussion several weeks ago. We decided to move forward with Discourse at that time. I spent many hours learning about it, and setting it up. So, I'm not really looking forward to start again with this. I'm happy to hand over this to you, if you want to research further and lead the discussion. And implement whatever is best. > > But if I need to continue spending time on this myself, I'd appreciate if you can find solutions and not problems. I have no idea about how to set up Discourse as a mailing list, or how to do it for subcommunities. But if you have a specific way you want it to work, please do the research, and propose (and implement) the best solution for all us. Or is your proposal to stay with what we have. > > Thanks! > > > On Tue, 12 Nov 2019, 20:46 Pietro Battiston, > wrote: > Marc, > > is the thread you refer to > https://github.com/pandas-dev/pandas/issues/27903 ? > > Again, I didn't participate in the discussion, I am not contributing > much lately, and so my opinion is not very important. But I had > followed the discussion, I now checked the thread above, and while I > derived the idea that Discourse could (and probably should) replace the > pydata ML as a frontend to users (ideally of pydata, not just pandas), > I fail to see a real discussion, or even just arguments, in favour of > replacing the devs MLs. > > > Il giorno mar, 12/11/2019 alle 20.21 +0100, Marc Garcia ha scritto: > > [...] > > > > As it has been said, Discourse can be set up to work as a mailing > > list for people who don't want to use the Discourse interface. So, I > > don't think this should be a reason to not move forward, you'll just > > need to subscribe elsewhere, and send messages to a different email > > address. > > I just read the FAQ about how to "use Discourse via email"... I guess > we can live with filtering incoming emails by List-ID, but I don't > understand whether there will be multiple emails address to write to, > for the different "subcommunities". Could you shed some light? > > Thanks, > > Pietro >
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From me at pietrobattiston.it Wed Nov 13 03:21:03 2019 From: me at pietrobattiston.it (Pietro Battiston) Date: Wed, 13 Nov 2019 09:21:03 +0100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID: Marc, on whether Discourse (or Flarum) is the right tool to best reach/host our (pydata) community, while not being a fan of Discourse from my very limited experience (GNOME), I was not rhetorical when I wrote that I trust what you, and other people who spent time on this, think. I say this not just because I don't have time now to devote to the issue, but also because you already did great things for pandas in the past in terms of community involvement. You definitely know how to reach people, better than me. My previous two emails were just pointing out that while the pydata ML and gitter were probably a suboptimal way to involve our community, the devs MLs seemed to me to just work fine, and I didn't read of specific complaints that would justify closing them. Sure, some people had missed the fact that pandas-dev is open to non-core contributors, but I think we did some steps in clarifying the options to get in contact with us: https://github.com/pandas-dev/pandas-website/issues/68 Then, I might have missed some arguments/conclusions: in this case, please let me know, and I will stop bothering. But I'm not "against Discourse". Cheers, Pietro Il giorno mer, 13/11/2019 alle 03.10 +0100, Marc Garcia ha scritto: > I was granting admin permissions to maintainers on the new Discourse, > and looks like we've got a limit of 5. Spoke with Discourse support, > and they want us to pay >$300 per month to have more. I think we > could live with this, but I'm a bit worried about their open source > plan being more of a freemium plan they use to make big open source > projects pay. I was reading the details of the conditions, and seems > like we also have bandwidth limits we may reach. > > An alternative is to host the instance ourselves. If we do that, we > can also consider other alternatives. Just seem Flarum, which is in > beta, but at a first glance it looks like it manages subcategories in > a way that would make it possible to have a NumFOCUS broad forum. 
Not > sure if they have the feature of letting categories/subcategories > behave as mailing lists, need to research. > > What are your thoughts? > > On Tue, Nov 12, 2019 at 9:12 PM Marc Garcia > wrote: > > That's the GitHub issue, most of the discussion was in a thread in > > this (pandas-dev) list. I'm in my phone and can't look for the > > thread now, but I think it started in the conversation about the > > website hosting, and then Joris created a separate thread specific > > to Discourse. > > > > As I said, I have no preference on Discourse (never used it as I > > said), but I think what we have now is suboptimal, and would be > > great to have something better. But we had that discussion several > > weeks ago. We decided to move forward with Discourse at that time. > > I spent many hours learning about it, and setting it up. So, I'm > > not really looking forward to start again with this. I'm happy to > > hand over this to you, if you want to research further and lead the > > discussion. And implement whatever is best. > > > > But if I need to continue spending time on this myself, I'd > > appreciate if you can find solutions and not problems. I have no > > idea about how to set up Discourse as a mailing list, or how to do > > it for subcommunities. But if you have a specific way you want it > > to work, please do the research, and propose (and implement) the > > best solution for all us. Or is your proposal to stay with what we > > have. > > > > Thanks! > > > > > > On Tue, 12 Nov 2019, 20:46 Pietro Battiston, > > wrote: > > > Marc, > > > > > > is the thread you refer to > > > https://github.com/pandas-dev/pandas/issues/27903 ? > > > > > > Again, I didn't participate in the discussion, I am not > > > contributing > > > much lately, and so my opinion is not very important. 
But I had > > > followed the discussion, I now checked the thread above, and > > > while I > > > derived the idea that Discourse could (and probably should) > > > replace the > > > pydata ML as a frontend to users (ideally of pydata, not just > > > pandas), > > > I fail to see a real discussion, or even just arguments, in > > > favour of > > > replacing the devs MLs. > > > > > > > > > Il giorno mar, 12/11/2019 alle 20.21 +0100, Marc Garcia ha > > > scritto: > > > > [...] > > > > > > > > As it has been said, Discourse can be set up to work as a > > > mailing > > > > list for people who don't want to use the Discourse interface. > > > So, I > > > > don't think this should be a reason to not move forward, you'll > > > just > > > > need to subscribe elsewhere, and send messages to a different > > > email > > > > address. > > > > > > I just read the FAQ about how to "use Discourse via email"... I > > > guess > > > we can live with filtering incoming emails by List-ID, but I > > > don't > > > understand whether there will be multiple emails address to write > > > to, > > > for the different "subcommunities". Could you shed some light? > > > > > > Thanks, > > > > > > Pietro > > >
From garcia.marc at gmail.com Wed Nov 13 06:07:36 2019 From: garcia.marc at gmail.com (Marc Garcia) Date: Wed, 13 Nov 2019 12:07:36 +0100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID: We can probably continue in the call, it'll be easier. My only point was that the discussion on whether we want Discourse or not already happened. And we also discussed on being able to use Discourse as a mailing list. So, if anybody wants to reopen that discussion, it's worth to catch up with those discussions, and propose specific changes to the current plan. We spent a significant amount of hours on this, and I don't think general opinions or preferences are helpful at this point. It's very difficult to move forward if we keep rediscussing things without proposing alternatives to what was previously agreed. I see your point now, but even if pandas-dev is working well, and there is no reason to close it, there is no reason to continue using it if Discourse can do exactly the same, and more things. In my opinion, people who have a preference on using mailing lists (and that may be me too) shouldn't be a reason to not move forward. The change will be transparent and you'll just have to sign up in Discourse, and start using a different email address to send messages. And other people will benefit from the new features that those platforms offer. And things will be easier to manage and easier to understand by newcomers, if we have a single platform for the communications. Does this make sense? Or is there something I'm missing? On Wed, 13 Nov 2019, 09:21 Pietro Battiston, wrote: > Marc, > > on whether Discourse (or Flarum) is the right tool to best reach/host > our (pydata) community, while not being a fan of Discourse from my very > limited experience (GNOME), I was not rhetorical when I wrote that I > trust what you, and other people who spent time on this, think. I say > this not just because I don't have time now to devote to the issue, but > also because you already did great things for pandas in the past in > terms of community involvement. You definitely know how to reach > people, better than me. 
> > My previous two emails were just pointing out that while the pydata ML > and gitter were probably a suboptimal way to involve our community, the > devs MLs seemed to me to just work fine, and I didn't read of specific > complaints that would justify closing them. Sure, some people had > missed the fact that pandas-dev is open to non-core contributors, but I > think we did some steps in clarifying the options to get in contact > with us: > https://github.com/pandas-dev/pandas-website/issues/68 > > Then, I might have missed some arguments/conclusions: in this case, > please let me know, and I will stop bothering. But I'm not "against > Discourse". > > Cheers, > > Pietro > > Il giorno mer, 13/11/2019 alle 03.10 +0100, Marc Garcia ha scritto: > > I was granting admin permissions to maintainers on the new Discourse, > > and looks like we've got a limit of 5. Spoke with Discourse support, > > and they want us to pay >$300 per month to have more. I think we > > could live with this, but I'm a bit worried about their open source > > plan being more of a freemium plan they use to make big open source > > projects pay. I was reading the details of the conditions, and seems > > like we also have bandwidth limits we may reach. > > > > An alternative is to host the instance ourselves. If we do that, we > > can also consider other alternatives. Just seem Flarum, which is in > > beta, but at a first glance it looks like it manages subcategories in > > a way that would make it possible to have a NumFOCUS broad forum. Not > > sure if they have the feature of letting categories/subcategories > > behave as mailing lists, need to research. > > > > What are your thoughts? > > > > On Tue, Nov 12, 2019 at 9:12 PM Marc Garcia > > wrote: > > > That's the GitHub issue, most of the discussion was in a thread in > > > this (pandas-dev) list. 
I'm in my phone and can't look for the > > > thread now, but I think it started in the conversation about the > > > website hosting, and then Joris created a separate thread specific > > > to Discourse. > > > > > > As I said, I have no preference on Discourse (never used it as I > > > said), but I think what we have now is suboptimal, and would be > > > great to have something better. But we had that discussion several > > > weeks ago. We decided to move forward with Discourse at that time. > > > I spent many hours learning about it, and setting it up. So, I'm > > > not really looking forward to start again with this. I'm happy to > > > hand over this to you, if you want to research further and lead the > > > discussion. And implement whatever is best. > > > > > > But if I need to continue spending time on this myself, I'd > > > appreciate if you can find solutions and not problems. I have no > > > idea about how to set up Discourse as a mailing list, or how to do > > > it for subcommunities. But if you have a specific way you want it > > > to work, please do the research, and propose (and implement) the > > > best solution for all us. Or is your proposal to stay with what we > > > have. > > > > > > Thanks! > > > > > > > > > On Tue, 12 Nov 2019, 20:46 Pietro Battiston, > > > wrote: > > > > Marc, > > > > > > > > is the thread you refer to > > > > https://github.com/pandas-dev/pandas/issues/27903 ? > > > > > > > > Again, I didn't participate in the discussion, I am not > > > > contributing > > > > much lately, and so my opinion is not very important. But I had > > > > followed the discussion, I now checked the thread above, and > > > > while I > > > > derived the idea that Discourse could (and probably should) > > > > replace the > > > > pydata ML as a frontend to users (ideally of pydata, not just > > > > pandas), > > > > I fail to see a real discussion, or even just arguments, in > > > > favour of > > > > replacing the devs MLs. 
> > > > > > > > > > > > Il giorno mar, 12/11/2019 alle 20.21 +0100, Marc Garcia ha > > > > scritto: > > > > > [...] > > > > > > > > > > As it has been said, Discourse can be set up to work as a > > > > mailing > > > > > list for people who don't want to use the Discourse interface. > > > > So, I > > > > > don't think this should be a reason to not move forward, you'll > > > > just > > > > > need to subscribe elsewhere, and send messages to a different > > > > email > > > > > address. > > > > > > > > I just read the FAQ about how to "use Discourse via email"... I > > > > guess > > > > we can live with filtering incoming emails by List-ID, but I > > > > don't > > > > understand whether there will be multiple emails address to write > > > > to, > > > > for the different "subcommunities". Could you shed some light? > > > > > > > > Thanks, > > > > > > > > Pietro > > > > > >
-------------- next part -------------- An HTML attachment was scrubbed... URL:
From jorisvandenbossche at gmail.com Wed Nov 13 06:51:27 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Wed, 13 Nov 2019 12:51:27 +0100 Subject: [Pandas-dev] Remove old mailing lists In-Reply-To: References: <5b0ff6da2658eb4af9bd0582b1f4480da2bb7486.camel@pietrobattiston.it>

Message-ID: On Wed, 13 Nov 2019 at 12:08, Marc Garcia wrote: > We can probably continue in the call, it'll be easier. > > My only point was that the discussion on whether we want Discourse or not > already happened. > There was indeed a mailing thread, but I don't think there was much discussion (and I am not blaming you for that, to be clear ;)) or involvement of many people. For such a change of moving away from the current mailing lists, personally I think we should have more buy in from the people actively using those mailing lists (also for our internal one). So I would rather welcome that we now finally have some discussion about it. And IMO we should also not defer everything to a call. Not all people are present then, mailing list (or discourse :)) discussions about significant topics are important as well. That said: I am in favor of trying out discourse (although the limitations of their free plan might sound problematic. It might be worth checking with other projects using it (jupyter, matplotlib) how they are dealing with that, or are they hosting themselves?). But I think it should be a "trying out", and not already saying "in a month we will close the mailing lists". So I am in favor of trying out discourse for a while, so we can then evaluate if it works and if we want to archive or discourage the mailing lists. Regarding the usage as a mailing list: there is also a "mailing list mode" in the preferences (not fully sure what the difference is with the previously mentioned link about configuring it to use with email) Joris > And we also discussed on being able to use Discourse as a mailing list. > So, if anybody wants to reopen that discussion, it's worth to catch up with > those discussions, and propose specific changes to the current plan. We > spent a significant amount of hours on this, and I don't think general > opinions or preferences are helpful at this point. 
It's very difficult to > move forward if we keep rediscussing things without proposing alternatives > to what was previously agreed. > > I see your point now, but even if pandas-dev is working well, and there is > no reason to close it, there is no reason to continue using it if Discourse > can do exactly the same, and more things. > > In my opinion, people who have a preference on using mailing lists (and > that may be me too) shouldn't be a reason to not move forward. The change > will be transparent and you'll just have to sign up in Discourse, and start > using a different email address to send messages. And other people will > benefit from the new features that those platforms offer. And things will > be easier to manage and easier to understand by newcomers, if we have a > single platform for the communications. > > Does this make sense? Or is there something I'm missing? > > On Wed, 13 Nov 2019, 09:21 Pietro Battiston, > wrote: > >> Marc, >> >> on whether Discourse (or Flarum) is the right tool to best reach/host >> our (pydata) community, while not being a fan of Discourse from my very >> limited experience (GNOME), I was not rhetorical when I wrote that I >> trust what you, and other people who spent time on this, think. I say >> this not just because I don't have time now to devote to the issue, but >> also because you already did great things for pandas in the past in >> terms of community involvement. You definitely know how to reach >> people, better than me. >> >> My previous two emails were just pointing out that while the pydata ML >> and gitter were probably a suboptimal way to involve our community, the >> devs MLs seemed to me to just work fine, and I didn't read of specific >> complaints that would justify closing them. 
Sure, some people had >> missed the fact that pandas-dev is open to non-core contributors, but I >> think we did some steps in clarifying the options to get in contact >> with us: >> https://github.com/pandas-dev/pandas-website/issues/68 >> >> Then, I might have missed some arguments/conclusions: in this case, >> please let me know, and I will stop bothering. But I'm not "against >> Discourse". >> >> Cheers, >> >> Pietro >> >> Il giorno mer, 13/11/2019 alle 03.10 +0100, Marc Garcia ha scritto: >> > I was granting admin permissions to maintainers on the new Discourse, >> > and looks like we've got a limit of 5. Spoke with Discourse support, >> > and they want us to pay >$300 per month to have more. I think we >> > could live with this, but I'm a bit worried about their open source >> > plan being more of a freemium plan they use to make big open source >> > projects pay. I was reading the details of the conditions, and seems >> > like we also have bandwidth limits we may reach. >> > >> > An alternative is to host the instance ourselves. If we do that, we >> > can also consider other alternatives. Just seem Flarum, which is in >> > beta, but at a first glance it looks like it manages subcategories in >> > a way that would make it possible to have a NumFOCUS broad forum. Not >> > sure if they have the feature of letting categories/subcategories >> > behave as mailing lists, need to research. >> > >> > What are your thoughts? >> > >> > On Tue, Nov 12, 2019 at 9:12 PM Marc Garcia >> > wrote: >> > > That's the GitHub issue, most of the discussion was in a thread in >> > > this (pandas-dev) list. I'm in my phone and can't look for the >> > > thread now, but I think it started in the conversation about the >> > > website hosting, and then Joris created a separate thread specific >> > > to Discourse. 
>> > > >> > > As I said, I have no preference on Discourse (never used it as I >> > > said), but I think what we have now is suboptimal, and would be >> > > great to have something better. But we had that discussion several >> > > weeks ago. We decided to move forward with Discourse at that time. >> > > I spent many hours learning about it, and setting it up. So, I'm >> > > not really looking forward to start again with this. I'm happy to >> > > hand over this to you, if you want to research further and lead the >> > > discussion. And implement whatever is best. >> > > >> > > But if I need to continue spending time on this myself, I'd >> > > appreciate if you can find solutions and not problems. I have no >> > > idea about how to set up Discourse as a mailing list, or how to do >> > > it for subcommunities. But if you have a specific way you want it >> > > to work, please do the research, and propose (and implement) the >> > > best solution for all us. Or is your proposal to stay with what we >> > > have. >> > > >> > > Thanks! >> > > >> > > >> > > On Tue, 12 Nov 2019, 20:46 Pietro Battiston, > > > > wrote: >> > > > Marc, >> > > > >> > > > is the thread you refer to >> > > > https://github.com/pandas-dev/pandas/issues/27903 ? >> > > > >> > > > Again, I didn't participate in the discussion, I am not >> > > > contributing >> > > > much lately, and so my opinion is not very important. But I had >> > > > followed the discussion, I now checked the thread above, and >> > > > while I >> > > > derived the idea that Discourse could (and probably should) >> > > > replace the >> > > > pydata ML as a frontend to users (ideally of pydata, not just >> > > > pandas), >> > > > I fail to see a real discussion, or even just arguments, in >> > > > favour of >> > > > replacing the devs MLs. >> > > > >> > > > >> > > > Il giorno mar, 12/11/2019 alle 20.21 +0100, Marc Garcia ha >> > > > scritto: >> > > > > [...] 
>> > > > > >> > > > > As it has been said, Discourse can be set up to work as a >> > > > mailing >> > > > > list for people who don't want to use the Discourse interface. >> > > > So, I >> > > > > don't think this should be a reason to not move forward, you'll >> > > > just >> > > > > need to subscribe elsewhere, and send messages to a different >> > > > email >> > > > > address. >> > > > >> > > > I just read the FAQ about how to "use Discourse via email"... I >> > > > guess >> > > > we can live with filtering incoming emails by List-ID, but I >> > > > don't >> > > > understand whether there will be multiple emails address to write >> > > > to, >> > > > for the different "subcommunities". Could you shed some light? >> > > > >> > > > Thanks, >> > > > >> > > > Pietro >> > > > >> >> _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From jorisvandenbossche at gmail.com Thu Nov 14 15:44:53 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Thu, 14 Nov 2019 21:44:53 +0100 Subject: [Pandas-dev] ROADMAP proposal: Consistent missing value handling with new NA scalar In-Reply-To: References: Message-ID: Quick update on this: there has been discussion at https://github.com/pandas-dev/pandas/issues/28095 and https://github.com/pandas-dev/pandas/issues/28778/, and there is now also a PR implementing such a pd.NA scalar missing value indicator: https://github.com/pandas-dev/pandas/pull/29597 Feedback is still very welcome! On Thu, 3 Oct 2019 at 22:32, Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > Hi all, > > I would like to propose a revisit of missing value handling in pandas. > It's already being discussed on github ( > https://github.com/pandas-dev/pandas/issues/28095), but want to mention > this on the mailing list as well for broader feedback. > A more detailed proposal can be found here: > https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB, and discussion can be > found at the above github issue. 
> > A summary of the proposal is to introduce *a new NA value (singleton) for > representing scalar missing values* (instead of np.nan) that can be used > consistently across all data types. This could be achieved under the hood > by using a mask-based approach to store the missing values on the > array/series-level, but the main discussion here is about the user-facing > API: the scalar NA value and the behaviour of NA in several operation. > > Motivation for this change: > > - *Consistent user interface.* > Currently, the value you get back for a missing scalar (eg from scalar > access s[idx]) depends on the data type (np.nan for many, but pd.NaT > for datetime-likes). Some types support missing values, others don't. This > proposal would ensure you get back pd.NA regardless of the dtype. > - *No "mis-use" of the np.nan floating point value.* > The NaN value is a specific floating point value, and not necessarily > an indicator for missing values (although pandas has always used it that > way). And because we also use it for other dtypes, you get back a float > value for non-float dtypes, giving misleading dtype information. > - *A missing value that behaves accordingly.* > Our current behaviour of missing values is inherited of the np.nan > behaviour. Other languages that have a NA/NULL value that is distinguished > from NaN (eg Julia, SQL, R) typically have different behaviour in > comparison and logical operations. For example, comparison with NA could > give NA instead of False, and consequently we need to have a boolean dtype > with NA support. A new NA value opens up the possibility of having such > behaviour. > - An "NA" scalar *matches the terminology* that is used throughout > pandas in functions and argument names (isna, dropna, fillna, skipna, > ...). > > > See the proposal for > more details. > > This has of course many consequences in the user API of pandas. 
Initially, > it could therefore be introduced optionally (eg only in the new data types > as nullable integer or string dtype). > And given those pervasive changes, many eyes on it are important. *So > feedback on this idea would be greatly appreciated!* > > Joris > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.augier at univ-grenoble-alpes.fr Fri Nov 15 16:29:45 2019 From: pierre.augier at univ-grenoble-alpes.fr (PIERRE AUGIER) Date: Fri, 15 Nov 2019 22:29:45 +0100 (CET) Subject: [Pandas-dev] [Numpy-discussion] Transonic Vision: unifying Python-Numpy accelerators Message-ID: <2041899490.23161412.1573853385839.JavaMail.zimbra@univ-grenoble-alpes.fr> Dear Pandas developers, Ralf Gommers wrote me that there was a discussion on the pandas-dev mailing list a couple of weeks ago about adopting Numba as a dependency. We recently wrote a serious text on this subject: http://tiny.cc/transonic-vision. As a side remark, we also played with Transonic and Pandas in this notebook https://github.com/fluiddyn/transonic-demos/blob/master/pandas.ipynb (the binder link to run the benchmarks: https://mybinder.org/v2/gh/fluiddyn/transonic-demos/master) > Date: Wed, 6 Nov 2019 23:49:08 -0500 > From: Ralf Gommers > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Transonic Vision: unifying > Python-Numpy accelerators > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > On Mon, Nov 4, 2019 at 4:54 PM PIERRE AUGIER < > pierre.augier at univ-grenoble-alpes.fr> wrote: > >> Dear Python-Numpy community, >> >> Transonic is a pure Python package to easily accelerate modern >> Python-Numpy code with different accelerators (currently Cython, Pythran >> and Numba). >> >> I'm trying to get some funding for this project. The related work would >> benefit in particular to Cython, Numba, Pythran and Xtensor. 
>> >> To obtain this funding, we really need some feedback from some people >> knowing the subject of performance with Python-Numpy code. >> >> That's one of the reason why we wrote this long and serious text on >> Transonic Vision: http://tiny.cc/transonic-vision. We describe some >> issues (perf for numerical kernels, incompatible accelerators, community >> split between experts and simple users, ...) and possible improvements. >> > > Thanks Pierre, that's a very interesting vision paper. > > In case you haven't seen it, there was a discussion on the pandas-dev > mailing list a couple of weeks ago about adopting Numba as a dependency > (and issues with that). > > Your comment on my assessment from 1.5 years ago being a little unfair to > Pythran may be true - not sure it was at the time, but Pythran seems to > mature nicely. > > The ability to switch between just-in-time and ahead-of-time compilation is > nice. One thing I noticed is that this actual switching is not completely > fluent: the jit and boost decorators have different signatures, and there's > no way to globally switch behavior (say with an env var, as for backend > selection). > > >> Help would be very much appreciated. >> > > I'd be interested to help think about adoption and/or funding. 
> > Cheers, > Ralf > > >> >> Now a coding riddle: >>
>> import numpy as np
>> from transonic import jit
>>
>> @jit(native=True, xsimd=True)
>> def fxfy(ft, fn, theta):
>>     sin_theta = np.sin(theta)
>>     cos_theta = np.cos(theta)
>>     fx = cos_theta * ft - sin_theta * fn
>>     fy = sin_theta * ft + cos_theta * fn
>>     return fx, fy
>>
>> @jit(native=True, xsimd=True)
>> def fxfy_loops(ft, fn, theta):
>>     n0 = theta.size
>>     fx = np.empty_like(ft)
>>     fy = np.empty_like(fn)
>>     for index in range(n0):
>>         sin_theta = np.sin(theta[index])
>>         cos_theta = np.cos(theta[index])
>>         fx[index] = cos_theta * ft[index] - sin_theta * fn[index]
>>         fy[index] = sin_theta * ft[index] + cos_theta * fn[index]
>>     return fx, fy
>>
>> How can be compared the performances of these functions with pure Numpy, >> Numba and Pythran ? >> >> You can find out the answer in our note http://tiny.cc/transonic-vision >> :-) >> >> Pierre >> >> > Message: 1 >> > Date: Thu, 31 Oct 2019 21:16:06 +0100 (CET) >> > From: PIERRE AUGIER >> > To: numpy-discussion at python.org >> > Subject: [Numpy-discussion] Transonic Vision: unifying Python-Numpy >> > accelerators >> > Message-ID: >> > < >> 1080118635.5930814.1572552966711.JavaMail.zimbra at univ-grenoble-alpes.fr> >> > >> > Content-Type: text/plain; charset=utf-8 >> > >> > Dear Python-Numpy community, >> > >> > Few years ago I started to use a lot Python and Numpy for science. I'd >> like to >> > thanks all people who contribute to this fantastic community. >> > >> > I used a lot Cython, Pythran and Numba and for the FluidDyn project, we >> created >> > Transonic, a pure Python package to easily accelerate modern >> Python-Numpy code >> > with different accelerators. We wrote a long and serious text to explain >> why we >> > think Transonic could have a positive impact on the scientific Python >> > ecosystem. >> > >> > Here it is: http://tiny.cc/transonic-vision >> > >> > Feedback and discussions would be greatly appreciated! 
>> > >> > Pierre >> > >> > -- >> > Pierre Augier - CR CNRS http://www.legi.grenoble-inp.fr >> > LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels >> > BP53, 38041 Grenoble Cedex, France tel:+33.4.56.52.86.16 >> _______________________________________________
From ralf.gommers at gmail.com Sun Nov 17 14:29:37 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 17 Nov 2019 11:29:37 -0800 Subject: [Pandas-dev] [Numpy-discussion] Transonic Vision: unifying Python-Numpy accelerators In-Reply-To: <2041899490.23161412.1573853385839.JavaMail.zimbra@univ-grenoble-alpes.fr> References: <2041899490.23161412.1573853385839.JavaMail.zimbra@univ-grenoble-alpes.fr> Message-ID: On Sat, Nov 16, 2019 at 1:49 AM PIERRE AUGIER < pierre.augier at univ-grenoble-alpes.fr> wrote: > Dear Pandas developers, > > Ralf Gommers wrote me that there was a discussion on the pandas-dev > mailing list a couple of weeks ago about adopting Numba as a dependency. > > We recently wrote a serious text on this subject: > http://tiny.cc/transonic-vision. > > As a side remark, we also played with Transonic and Pandas in this > notebook > https://github.com/fluiddyn/transonic-demos/blob/master/pandas.ipynb (the > binder link to run the benchmarks: > https://mybinder.org/v2/gh/fluiddyn/transonic-demos/master) >
Thanks for sharing Pierre. For those who are interested, here are the results of running those benchmarks:
Cython: 133 ms ± 11 ms
Numba: 39.3 ms ± 5.52 ms
Pythran: 36.7 ms ± 2.43 ms
Cython + type annotations: 56.3 ms ± 6.56 ms
The ease of switching between Cython, Numba and Pythran with Transonic in your notebook is very cool. 
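For readers who want to see how timings of the "mean plus or minus spread" form can be produced, here is a minimal standard-library sketch. It is not the notebook's code and uses no Transonic, Cython, Numba or Pythran: it times two plain-Python versions of the rotation kernel from the coding riddle quoted in this thread. The names `fxfy_comprehension`, `time_call` and `format_timing` are hypothetical helpers introduced only for this illustration, and the numbers it prints depend entirely on the machine.

```python
import math
import statistics
import timeit


def fxfy_loops(ft, fn, theta):
    """Rotate (ft, fn) by theta with an explicit index loop (plain lists, no NumPy)."""
    fx = [0.0] * len(theta)
    fy = [0.0] * len(theta)
    for index in range(len(theta)):
        sin_theta = math.sin(theta[index])
        cos_theta = math.cos(theta[index])
        fx[index] = cos_theta * ft[index] - sin_theta * fn[index]
        fy[index] = sin_theta * ft[index] + cos_theta * fn[index]
    return fx, fy


def fxfy_comprehension(ft, fn, theta):
    """Same rotation written with list comprehensions (hypothetical variant)."""
    sin_theta = [math.sin(t) for t in theta]
    cos_theta = [math.cos(t) for t in theta]
    fx = [c * a - s * b for c, s, a, b in zip(cos_theta, sin_theta, ft, fn)]
    fy = [s * a + c * b for c, s, a, b in zip(cos_theta, sin_theta, ft, fn)]
    return fx, fy


def time_call(func, *args, repeat=5, number=20):
    """Return (mean, standard deviation) of the per-call time in milliseconds."""
    totals = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    per_call_ms = [1e3 * total / number for total in totals]
    return statistics.mean(per_call_ms), statistics.stdev(per_call_ms)


def format_timing(mean_ms, std_ms):
    """Format a timing as '<mean> ms ± <std> ms' with three significant digits."""
    return f"{mean_ms:.3g} ms ± {std_ms:.3g} ms"


# Small synthetic inputs, chosen only so that the example runs quickly.
n = 1000
theta = [2 * math.pi * i / n for i in range(n)]
ft = [float(i) for i in range(n)]
fn = [float(n - i) for i in range(n)]

for func in (fxfy_loops, fxfy_comprehension):
    print(func.__name__, format_timing(*time_call(func, ft, fn, theta)))
```

Checking that both variants return the same values before comparing their timings is a cheap safeguard, since a faster but different result is not a speed-up.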
Cheers, Ralf > > Date: Wed, 6 Nov 2019 23:49:08 -0500 > > From: Ralf Gommers > > To: Discussion of Numerical Python > > Subject: Re: [Numpy-discussion] Transonic Vision: unifying > > Python-Numpy accelerators > > Message-ID: > > Mw4bCOXb0H1dvU9AjTMw at mail.gmail.com> > > Content-Type: text/plain; charset="utf-8" > > > > On Mon, Nov 4, 2019 at 4:54 PM PIERRE AUGIER < > > pierre.augier at univ-grenoble-alpes.fr> wrote: > > > >> Dear Python-Numpy community, > >> > >> Transonic is a pure Python package to easily accelerate modern > >> Python-Numpy code with different accelerators (currently Cython, Pythran > >> and Numba). > >> > >> I'm trying to get some funding for this project. The related work would > >> benefit in particular to Cython, Numba, Pythran and Xtensor. > >> > >> To obtain this funding, we really need some feedback from some people > >> knowing the subject of performance with Python-Numpy code. > >> > >> That's one of the reason why we wrote this long and serious text on > >> Transonic Vision: http://tiny.cc/transonic-vision. We describe some > >> issues (perf for numerical kernels, incompatible accelerators, community > >> split between experts and simple users, ...) and possible improvements. > >> > > > > Thanks Pierre, that's a very interesting vision paper. > > > > In case you haven't seen it, there was a discussion on the pandas-dev > > mailing list a couple of weeks ago about adopting Numba as a dependency > > (and issues with that). > > > > Your comment on my assessment from 1.5 years ago being a little unfair to > > Pythran may be true - not sure it was at the time, but Pythran seems to > > mature nicely. > > > > The ability to switch between just-in-time and ahead-of-time compilation > is > > nice. One thing I noticed is that this actual switching is not completely > > fluent: the jit and boost decorators have different signatures, and > there's > > no way to globally switch behavior (say with an env var, as for backend > > selection). 
> > > > > >> Help would be very much appreciated. > >> > > > > I'd be interested to help think about adoption and/or funding. > > > > Cheers, > > Ralf > > > > > >> > >> Now a coding riddle: > >>
> >> import numpy as np
> >> from transonic import jit
> >>
> >> @jit(native=True, xsimd=True)
> >> def fxfy(ft, fn, theta):
> >>     sin_theta = np.sin(theta)
> >>     cos_theta = np.cos(theta)
> >>     fx = cos_theta * ft - sin_theta * fn
> >>     fy = sin_theta * ft + cos_theta * fn
> >>     return fx, fy
> >>
> >> @jit(native=True, xsimd=True)
> >> def fxfy_loops(ft, fn, theta):
> >>     n0 = theta.size
> >>     fx = np.empty_like(ft)
> >>     fy = np.empty_like(fn)
> >>     for index in range(n0):
> >>         sin_theta = np.sin(theta[index])
> >>         cos_theta = np.cos(theta[index])
> >>         fx[index] = cos_theta * ft[index] - sin_theta * fn[index]
> >>         fy[index] = sin_theta * ft[index] + cos_theta * fn[index]
> >>     return fx, fy
> >>
> >> How can be compared the performances of these functions with pure Numpy, > >> Numba and Pythran ? > >> > >> You can find out the answer in our note http://tiny.cc/transonic-vision > >> :-) > >> > >> Pierre > >> > >> > Message: 1 > >> > Date: Thu, 31 Oct 2019 21:16:06 +0100 (CET) > >> > From: PIERRE AUGIER > >> > To: numpy-discussion at python.org > >> > Subject: [Numpy-discussion] Transonic Vision: unifying Python-Numpy > >> > accelerators > >> > Message-ID: > >> > < > >> 1080118635.5930814.1572552966711.JavaMail.zimbra at univ-grenoble-alpes.fr > > > >> > > >> > Content-Type: text/plain; charset=utf-8 > >> > > >> > Dear Python-Numpy community, > >> > > >> > Few years ago I started to use a lot Python and Numpy for science. I'd > >> like to > >> > thanks all people who contribute to this fantastic community. > >> > > >> > I used a lot Cython, Pythran and Numba and for the FluidDyn project, > we > >> created > >> > Transonic, a pure Python package to easily accelerate modern > >> Python-Numpy code > >> > with different accelerators. 
We wrote a long and serious text to > explain > >> why we > >> > think Transonic could have a positive impact on the scientific Python > >> > ecosystem. > >> > > >> > Here it is: http://tiny.cc/transonic-vision > >> > > >> > Feedback and discussions would be greatly appreciated! > >> > > >> > Pierre > >> > > >> > -- > >> > Pierre Augier - CR CNRS > http://www.legi.grenoble-inp.fr > >> > LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et > Industriels > >> > BP53, 38041 Grenoble Cedex, France > tel:+33.4.56.52.86.16 > >> _______________________________________________ > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Tue Nov 19 12:56:07 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Tue, 19 Nov 2019 18:56:07 +0100 Subject: [Pandas-dev] ROADMAP proposal: Consistent missing value handling with new NA scalar In-Reply-To: References:

Message-ID: In case that people are interested in this: we are having a dev chat (hangout) about this topic tomorrow at 18:20 UTC. Certainly welcome to join! On Thu, 14 Nov 2019 at 21:44, Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > Quick update on this: there has been discussion at > https://github.com/pandas-dev/pandas/issues/28095 and > https://github.com/pandas-dev/pandas/issues/28778/, and there is now also > a PR implementing such a pd.NA scalar missing value indicator: > https://github.com/pandas-dev/pandas/pull/29597 > Feedback is still very welcome! > > On Thu, 3 Oct 2019 at 22:32, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > >> Hi all, >> >> I would like to propose a revisit of missing value handling in pandas. >> It's already being discussed on github ( >> https://github.com/pandas-dev/pandas/issues/28095), but want to mention >> this on the mailing list as well for broader feedback. >> A more detailed proposal can be found here: >> https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB, and discussion can be >> found at the above github issue. >> >> A summary of the proposal is to introduce *a new NA value (singleton) >> for representing scalar missing values* (instead of np.nan) that can be >> used consistently across all data types. This could be achieved under the >> hood by using a mask-based approach to store the missing values on the >> array/series-level, but the main discussion here is about the user-facing >> API: the scalar NA value and the behaviour of NA in several operation. >> >> Motivation for this change: >> >> - *Consistent user interface.* >> Currently, the value you get back for a missing scalar (eg from >> scalar access s[idx]) depends on the data type (np.nan for many, but >> pd.NaT for datetime-likes). Some types support missing values, others >> don't. This proposal would ensure you get back pd.NA regardless of >> the dtype. 
>> - *No "mis-use" of the np.nan floating point value.* >> The NaN value is a specific floating point value, and not necessarily >> an indicator for missing values (although pandas has always used it that >> way). And because we also use it for other dtypes, you get back a float >> value for non-float dtypes, giving misleading dtype information. >> - *A missing value that behaves accordingly.* >> Our current behaviour of missing values is inherited of the np.nan >> behaviour. Other languages that have a NA/NULL value that is distinguished >> from NaN (eg Julia, SQL, R) typically have different behaviour in >> comparison and logical operations. For example, comparison with NA could >> give NA instead of False, and consequently we need to have a boolean dtype >> with NA support. A new NA value opens up the possibility of having such >> behaviour. >> - An "NA" scalar *matches the terminology* that is used throughout >> pandas in functions and argument names (isna, dropna, fillna, skipna, >> ...). >> >> >> See the proposal for >> more details. >> >> This has of course many consequences in the user API of pandas. >> Initially, it could therefore be introduced optionally (eg only in the new >> data types as nullable integer or string dtype). >> And given those pervasive changes, many eyes on it are important. *So >> feedback on this idea would be greatly appreciated!* >> >> Joris >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Nov 19 13:00:20 2019 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 19 Nov 2019 10:00:20 -0800 Subject: [Pandas-dev] ROADMAP proposal: Consistent missing value handling with new NA scalar In-Reply-To: References:

Message-ID: On Tue, 2019-11-19 at 18:56 +0100, Joris Van den Bossche wrote: > In case that people are interested in this: we are having a dev chat > (hangout) about this topic tomorrow at 18:20 UTC. Certainly welcome > to join! > Hi, I think I will listen in. Can you send the meeting details around? Best, Sebastian > > On Thu, 14 Nov 2019 at 21:44, Joris Van den Bossche < > jorisvandenbossche at gmail.com> wrote: > > Quick update on this: there has been discussion at > > https://github.com/pandas-dev/pandas/issues/28095 and > > https://github.com/pandas-dev/pandas/issues/28778/, and there is > > now also a PR implementing such a pd.NA scalar missing value > > indicator: https://github.com/pandas-dev/pandas/pull/29597 > > Feedback is still very welcome! > > > > On Thu, 3 Oct 2019 at 22:32, Joris Van den Bossche < > > jorisvandenbossche at gmail.com> wrote: > > > Hi all, > > > > > > I would like to propose a revisit of missing value handling in > > > pandas. It's already being discussed on github ( > > > https://github.com/pandas-dev/pandas/issues/28095), but want to > > > mention this on the mailing list as well for broader feedback. > > > A more detailed proposal can be found here: > > > https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB, and discussion > > > can be found at the above github issue. > > > > > > A summary of the proposal is to introduce a new NA value > > > (singleton) for representing scalar missing values (instead of > > > np.nan) that can be used consistently across all data types. This > > > could be achieved under the hood by using a mask-based approach > > > to store the missing values on the array/series-level, but the > > > main discussion here is about the user-facing API: the scalar NA > > > value and the behaviour of NA in several operation. > > > > > > Motivation for this change: > > > Consistent user interface. 
> > > Currently, the value you get back for a missing scalar (eg from > > > scalar access s[idx]) depends on the data type (np.nan for many, > > > but pd.NaT for datetime-likes). Some types support missing > > > values, others don't. This proposal would ensure you get back > > > pd.NA regardless of the dtype. > > > No "mis-use" of the np.nan floating point value. > > > The NaN value is a specific floating point value, and not > > > necessarily an indicator for missing values (although pandas has > > > always used it that way). And because we also use it for other > > > dtypes, you get back a float value for non-float dtypes, giving > > > misleading dtype information. > > > A missing value that behaves accordingly. > > > Our current behaviour of missing values is inherited of the > > > np.nan behaviour. Other languages that have a NA/NULL value that > > > is distinguished from NaN (eg Julia, SQL, R) typically have > > > different behaviour in comparison and logical operations. For > > > example, comparison with NA could give NA instead of False, and > > > consequently we need to have a boolean dtype with NA support. A > > > new NA value opens up the possibility of having such behaviour. > > > An "NA" scalar matches the terminology that is used throughout > > > pandas in functions and argument names (isna, dropna, fillna, > > > skipna, ...). > > > > > > See the proposal for more details. > > > > > > This has of course many consequences in the user API of pandas. > > > Initially, it could therefore be introduced optionally (eg only > > > in the new data types as nullable integer or string dtype). > > > And given those pervasive changes, many eyes on it are important. > > > So feedback on this idea would be greatly appreciated! 
> > > > > > Joris > > > > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From jorisvandenbossche at gmail.com Wed Nov 20 02:44:15 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Wed, 20 Nov 2019 08:44:15 +0100 Subject: [Pandas-dev] ROADMAP proposal: Consistent missing value handling with new NA scalar In-Reply-To: References:

Message-ID: We will use this hangout link: https://meet.google.com/hav-rmax-zjx (we are having a call on another topic right before, so if we are already/still talking about something else when you join: don't worry!) And here is a google calendar event for it: https://calendar.google.com/event?action=TEMPLATE&tmeid=NGEwbGZyM242bW1ibW9zZ2ppYmZwNWNoN2YgcGdibjE0cDZwb2phOGExY2YyZHYyamhybWdAZw&tmsrc=pgbn14p6poja8a1cf2dv2jhrmg%40group.calendar.google.com On Tue, 19 Nov 2019 at 19:00, Sebastian Berg wrote: > On Tue, 2019-11-19 at 18:56 +0100, Joris Van den Bossche wrote: > > In case that people are interested in this: we are having a dev chat > > (hangout) about this topic tomorrow at 18:20 UTC. Certainly welcome > > to join! > > > > Hi, > > I think I will listen in. Can you send the meeting details around? > > Best, > > Sebastian > > > > > > On Thu, 14 Nov 2019 at 21:44, Joris Van den Bossche < > > jorisvandenbossche at gmail.com> wrote: > > > Quick update on this: there has been discussion at > > > https://github.com/pandas-dev/pandas/issues/28095 and > > > https://github.com/pandas-dev/pandas/issues/28778/, and there is > > > now also a PR implementing such a pd.NA scalar missing value > > > indicator: https://github.com/pandas-dev/pandas/pull/29597 > > > Feedback is still very welcome! > > > > > > On Thu, 3 Oct 2019 at 22:32, Joris Van den Bossche < > > > jorisvandenbossche at gmail.com> wrote: > > > > Hi all, > > > > > > > > I would like to propose a revisit of missing value handling in > > > > pandas. It's already being discussed on github ( > > > > https://github.com/pandas-dev/pandas/issues/28095), but want to > > > > mention this on the mailing list as well for broader feedback. > > > > A more detailed proposal can be found here: > > > > https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB, and discussion > > > > can be found at the above github issue. 
> > > > > > > > A summary of the proposal is to introduce a new NA value > > > > (singleton) for representing scalar missing values (instead of > > > > np.nan) that can be used consistently across all data types. This > > > > could be achieved under the hood by using a mask-based approach > > > > to store the missing values on the array/series-level, but the > > > > main discussion here is about the user-facing API: the scalar NA > > > > value and the behaviour of NA in several operations. > > > > > > > > Motivation for this change: > > > > Consistent user interface. > > > > Currently, the value you get back for a missing scalar (eg from > > > > scalar access s[idx]) depends on the data type (np.nan for many, > > > > but pd.NaT for datetime-likes). Some types support missing > > > > values, others don't. This proposal would ensure you get back > > > > pd.NA regardless of the dtype. > > > > No "mis-use" of the np.nan floating point value. > > > > The NaN value is a specific floating point value, and not > > > > necessarily an indicator for missing values (although pandas has > > > > always used it that way). And because we also use it for other > > > > dtypes, you get back a float value for non-float dtypes, giving > > > > misleading dtype information. > > > > A missing value that behaves accordingly. > > > > Our current behaviour of missing values is inherited from the > > > > np.nan behaviour. Other languages that have an NA/NULL value that > > > > is distinguished from NaN (eg Julia, SQL, R) typically have > > > > different behaviour in comparison and logical operations. For > > > > example, comparison with NA could give NA instead of False, and > > > > consequently we need to have a boolean dtype with NA support. A > > > > new NA value opens up the possibility of having such behaviour. > > > > An "NA" scalar matches the terminology that is used throughout > > > > pandas in functions and argument names (isna, dropna, fillna, > > > > skipna, ...).
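As a rough illustration of the comparison and logical behaviour the quoted proposal describes (comparison with NA giving NA rather than False), here is a minimal pure-Python sketch. The NAType class and the NA name are hypothetical stand-ins written for this example; they are not the implementation in the PR linked above, and the real pd.NA may differ in details.

```python
class NAType:
    """Hypothetical stand-in for the proposed NA singleton (not pandas code)."""

    _instance = None

    def __new__(cls):
        # Always return the same object, so `x is NA` is a reliable check.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "NA"

    def _propagate(self, other):
        # Any comparison involving NA yields NA: the answer is unknown.
        return self

    __eq__ = __ne__ = __lt__ = __le__ = __gt__ = __ge__ = _propagate
    __hash__ = object.__hash__  # keep NA hashable despite the custom __eq__

    def __bool__(self):
        # An unknown value cannot be silently coerced to True or False.
        raise TypeError("boolean value of NA is ambiguous")

    def __and__(self, other):
        # Three-valued logic: False wins regardless of the unknown operand.
        return False if other is False else self

    __rand__ = __and__

    def __or__(self, other):
        # Three-valued logic: True wins regardless of the unknown operand.
        return True if other is True else self

    __ror__ = __or__


NA = NAType()

print(NA == 1)                        # NA
print(float("nan") == float("nan"))   # False
print(NA & False, NA | True, NA & True)  # False True NA
```

The contrast with float NaN is the point: NaN compares as False even against itself, whereas in this sketch a comparison with NA stays "unknown" until a logical operation can decide the result on its own.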
> > > > > > > > See the proposal for more details. > > > > > > > > This has of course many consequences in the user API of pandas. > > > > Initially, it could therefore be introduced optionally (eg only > > > > in the new data types such as the nullable integer or string dtype). > > > > And given those pervasive changes, many eyes on it are important. > > > > So feedback on this idea would be greatly appreciated! > > > > > > > > Joris From me at diehlpk.de Mon Nov 25 22:18:53 2019 From: me at diehlpk.de (Patrick Diehl) Date: Mon, 25 Nov 2019 21:18:53 -0600 Subject: [Pandas-dev] FLOSS for science podcast Message-ID: <3dda10b0-3358-2c82-b7a8-924b21da7e0d@diehlpk.de> Dear Sir or Madam, Together with a colleague, I started the podcast FLOSS for Science [0] with the goal of showcasing uses of free, libre and open source software in science. We want to highlight how FLOSS empowers researchers and enables them to produce high quality research. In each of our episodes, we want to showcase a scientist using FLOSS to produce his/her research, or the developers of software used for scientific research. Would any of you be interested in being interviewed about pandas? Best, Patrick [0] https://flossforscience.com/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From bhava0895 at gmail.com Tue Nov 26 09:29:35 2019 From: bhava0895 at gmail.com (Bhavani Ravi) Date: Tue, 26 Nov 2019 19:59:35 +0530 Subject: [Pandas-dev] FLOSS for science podcast In-Reply-To: <3dda10b0-3358-2c82-b7a8-924b21da7e0d@diehlpk.de> References: <3dda10b0-3358-2c82-b7a8-924b21da7e0d@diehlpk.de> Message-ID: Definitely. I would love that. Though I'm not a core researcher, I have used pandas inside and out for building various reporting platforms. If this aligns with what you're looking for, I would be more than happy to do it. On Tue, Nov 26, 2019, 12:24 PM Patrick Diehl wrote: > Dear Sir or Madam, > > I started with a colleague the podcast FLOSS for Science [0] with the > goal of showcasing free, libre and open source software uses in science. > We want to highlight how FLOSS empowers researchers and enables them to > produce high quality research. Through each of our episodes, we want to > showcase a scientist using FLOSS to produce his/her research or the > developers of software used for scientific research. > > Would some one of you are interested to be interviewed about Pandas. > > Best, > > Patrick > > [0] https://flossforscience.com/ From bms91 at abv.bg Thu Nov 28 22:30:27 2019 From: bms91 at abv.bg (Martin Gantchev) Date: Fri, 29 Nov 2019 05:30:27 +0200 (EET) Subject: [Pandas-dev] Just a quick question from a regular pandas user Message-ID: <1309884188.227412.1574998229127@nm43.abv.bg> Dear Representatives of Pandas-dev, This is Martin here, a regular user of the pandas library. First of all, thank you for providing, maintaining and still developing this amazing library, which I use pretty much every day. 
On that note, I am facing a project that will involve working with pandas heavily, and whose code is supposed to be retained for a long period of time (hopefully, for years to come). I am referring to this piece of information: https://github.com/pandas-dev/pandas/milestones It seems that pandas 1.0 has a 90% completion rate, while pandas 2.0 is expected to be ready as early as August 2020, although, strangely, it has just 10 problems that need to be solved. Of course, no precise answer is requested. However, I am afraid that in the next couple of months I may write code that might become obsolete in the middle of next summer. Am I right about that? I have read around the internet and gone through several articles, so I don't expect either 1.0 or 2.0 to be drastically different from 0.25.3. At least, I guess most of the code I'd use in 0.25.3 should work normally under 1.0 or 2.0. Is that correct? Shedding light on this subject may save tons of worries for me, so even a loose delineation of your schedule and the potential impact it may have on code written in 0.25.3 would be greatly appreciated. Thank you very much! Looking forward to your answer. Best, Martin From jorisvandenbossche at gmail.com Fri Nov 29 06:05:33 2019 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Fri, 29 Nov 2019 12:05:33 +0100 Subject: [Pandas-dev] Just a quick question from a regular pandas user In-Reply-To: <1309884188.227412.1574998229127@nm43.abv.bg> References: <1309884188.227412.1574998229127@nm43.abv.bg> Message-ID: Hi Martin, The 2.0 milestone has not been updated for a very long time, and is also not yet really used (there are a few issues tagged with it to mean "maybe in a next big release but not yet in 1.0"). So I wouldn't read too much into that. In any case, we are certainly not going to do a pandas 2.0 release in summer 2020 (so we should update the milestone date). 
What we *do* plan is a final 1.0 release in early 2020. What we also discussed recently is a version policy starting with 1.0: https://dev.pandas.io/docs/development/policies.html#version-policy This means that code working with 1.0 should mostly keep working in the full 1.x series of releases when not using experimental features (although we will keep doing deprecations, so you still might need to change code to get rid of such warnings, in preparation for pandas 2.0). And you are correct: pandas 1.0 will not be drastically different from 0.25.3 (the main difference will be that a lot of things that were deprecated before will now be removed, plus some documented API changes). While we do not yet have many concrete plans for pandas 2.0, I think the expectation is that it will be similar (and also not something for the coming year anyway). So if you are writing code now for 0.25.3, and you take notice of possible deprecation warnings and fix your code for those, you can be assured that your code will mostly work on 1.0 as well. Still, it is highly recommended to write tests for your code, so you can run those on new pandas releases to verify this is indeed the case (and running such tests on release candidates of new pandas releases is also very valuable, so potential regressions can be reported and fixed early). Hopefully that sheds some light. Joris On Fri, 29 Nov 2019 at 05:42, Martin Gantchev wrote: > Dear Representatives of Pandas-dev, > > This is Martin here, a regular user of the pandas library. > > First of all, thank you for providing, maintaining and still developing > this amazing library which I use pretty much every day. > > On that note, I am facing a project that will involve working with pandas > heavily, but that is supposed to retain the code for a long period of time > (hopefully, for years to come). 
> > I am referring to this piece of information: > > https://github.com/pandas-dev/pandas/milestones > > It seems that pandas 1.0 has 90% completion rate, while pandas 2.0 is > expected to be ready for as early as August 2020, however it strangely has > just 10 problems that need to be solved. > > Of course, no precise answer is requested. However, I am afraid that in > the next couple of months I may write code that might become obsolete in > the middle of next summer. Am I right about that? > > I did read around the internet and read more articles, so I don't expect > neither 1.0 or 2.0 to be drastically different from 0.25.3. At least, I > guess most of the code I'd use in 0.25.3 should work normally under 1.0 or > 2.0. Is that correct? > > Shedding light on this subject may save tons of worries for me, so even a > loose delineation of your schedule and the potential impact it may have on > code written in 0.25.3 would be greatly appreciated. > > Thank you very much! > > Looking forward to your answer. > Best, > Martin From me at pietrobattiston.it Sat Nov 30 10:39:21 2019 From: me at pietrobattiston.it (Pietro Battiston) Date: Sat, 30 Nov 2019 16:39:21 +0100 Subject: [Pandas-dev] Just a quick question from a regular pandas user In-Reply-To: References: <1309884188.227412.1574998229127@nm43.abv.bg> Message-ID: <9e4fe1eaa534dfd3150cf694c0fce0e557fed7ab.camel@pietrobattiston.it> Dear devs, every time "pandas 2" comes up, it is (it seems to me) not because of our concrete plans for it, or even because it is used as inspiration for current pandas (which by the way is receiving great and substantial improvements), but because some user is confused by the docs/issues mentioning it. 
I know it is somewhat of a rhetorical question - because we ourselves always considered "pandas 2" first and foremost as a direction to take (or at least discuss) rather than as a version to release - but I'm wondering whether having pandas 2 mentioned, discussed and postponed (inevitably, as we are not even really targeting it) is really helpful, and in particular whether the separate github project is really helpful. I see two options: - spend serious effort in communicating to users what/when to expect (and not to expect) from pandas 2 - delete any mention of pandas 2 from our github and from the "pandas 2.0 Design Documents" - which could be just described as "the future of pandas" ... which clearly doesn't mean we do not need to introduce important changes in pandas (this is happening daily), or that there shouldn't be a version 2.0 some day. This is an example of the "confused users" I have in mind: https://www.reddit.com/r/datascience/comments/8rcoou/what_happened_to_pandas_2/ Cheers, Pietro On Fri, 29/11/2019 at 12.05 +0100, Joris Van den Bossche wrote: > Hi Martin, > > The 2.0 milestone is not updated for a very long time, and also not > yet really used (there are a few issues tagged with it to mean "maybe > in a next big release but not yet in 1.0"). So I wouldn't look too > much to that. In any case, we are certainly not going to do a pandas > 2.0 release in summer 2020 (so we should update the milestone date). > > What we do plan is a final 1.0 release in early 2020. 
What we also > discussed recently is a version policy for starting with 1.0: > https://dev.pandas.io/docs/development/policies.html#version-policy > This means that code working with 1.0 should mostly keep working in > the full 1.x series of releases when not using experimental features > (although we will keep doing deprecations, so you still might need to > change code to get rid of such warnings, in preparation of pandas > 2.0). 
> > And you are correct: pandas 1.0 will not be drastically different > from 0.25.3 (the main difference will be that a lot of things that > were deprecated before will now be removed, plus some documented API > changes). While we do not yet have much concrete plans for pandas > 2.0, I think the expectation is that it will be similar (and also not > something for the coming year anyway). > > So if you are writing code now for 0.25.3, and you take notice of > possible deprecation warnings and fix your code for those, you can be > ensured that your code will mostly work on 1.0 as well. > Now, it is still very recommended to ensure you write tests for your > code, so you can run those on new pandas releases to verify this is > indeed the case (and running such tests on release candidates of new > pandas releases is also very valuable, so potential regressions can > be reported and fixed early). > > Hopefully that could shed some light > Joris > > > On Fri, 29 Nov 2019 at 05:42, Martin Gantchev wrote: > > Dear Representatives of Pandas-dev, > > > > This is Martin here, a regular user of the pandas library. > > > > First of all, thank you for providing, maintaining and still > > developing this amazing library which I use pretty much every day. > > > > On that note, I am facing a project that will involve working with > > pandas heavily, but that is supposed to retain the code for a long > > period of time (hopefully, for years to come). > > > > I am referring to this piece of information: > > > > https://github.com/pandas-dev/pandas/milestones > > > > It seems that pandas 1.0 has 90% completion rate, while pandas 2.0 > > is expected to be ready for as early as August 2020, however it > > strangely has just 10 problems that need to be solved. > > > > Of course, no precise answer is requested. However, I am afraid > > that in the next couple of months I may write code that might > > become obsolete in the middle of next summer. Am I right about > > that? 
> > > > I did read around the internet and read more articles, so I don't > > expect neither 1.0 or 2.0 to be drastically different from 0.25.3. > > At least, I guess most of the code I'd use in 0.25.3 should work > > normally under 1.0 or 2.0. Is that correct? > > > > Shedding light on this subject may save tons of worries for me, so > > even a loose delineation of your schedule and the potential impact > > it may have on code written in 0.25.3 would be greatly appreciated. > > > > Thank you very much! > > > > Looking forward to your answer. > > Best, > > Martin > > _______________________________________________ > > Pandas-dev mailing list > > Pandas-dev at python.org > > https://mail.python.org/mailman/listinfo/pandas-dev > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev
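For readers following the advice in this thread (take notice of deprecation warnings and run your own tests against new releases), below is a minimal sketch of how a test can turn deprecation-style warnings into hard failures using only the standard library warnings module. The functions old_api, new_api and uses_deprecated_api are hypothetical names made up for this example, not pandas functions; the sketch assumes the library signals deprecations with FutureWarning or DeprecationWarning.

```python
import warnings


def old_api():
    """Hypothetical library function that has been deprecated."""
    warnings.warn(
        "old_api is deprecated; use new_api instead",
        FutureWarning,
        stacklevel=2,
    )
    return 42


def new_api():
    """Hypothetical replacement that emits no warning."""
    return 42


def uses_deprecated_api(func):
    """Return True if calling `func` emits a deprecation-style warning."""
    with warnings.catch_warnings():
        # Escalate these warning categories to exceptions, but only inside
        # this block; the previous filters are restored on exit.
        warnings.simplefilter("error", FutureWarning)
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except (FutureWarning, DeprecationWarning):
            return True
    return False


print(uses_deprecated_api(old_api))  # True
print(uses_deprecated_api(new_api))  # False
```

The same effect can be had for a whole test run by starting Python with -W error::FutureWarning, so that any call into a deprecated code path fails immediately instead of scrolling past in the logs.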