From jorisvandenbossche at gmail.com Tue Feb 9 19:59:32 2016
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Wed, 10 Feb 2016 01:59:32 +0100
Subject: [Pandas-dev] On bug-fix releases and maintenance branches
Message-ID:

Hi all,

I wanted to stir some discussion on pandas' policy on bug-fix releases and upgrading pains. First some context:

*Context part 1*: Currently we do not use maintenance branches for bugfix releases, and we actually also do not *really* do bugfix releases. We just develop further on master, and try to not merge breaking changes the first weeks/months, so we can do a minor kind of bug-fix release (but usually also with a lot of new features).
But we don't, for example, backport fixes of regressions if they are fixed after master is pointing to the next major release.

*Context part 2*: pandas is not yet that stable, in the sense that there are still quite a few breaking changes in each release. I am not arguing for not doing these breaking changes, as some of these changes are really needed to clean up the API (although there are also arguments for that, but I think that is another discussion). This has the consequence that updating your pandas version is not always that pleasant.

Sidenote: I do not have that much experience with using pandas in a larger company or in larger codebases that need to be upgraded, rather with just my own code for my PhD. So it is difficult for me to judge how much this is a problem or if bug-fix releases would help.

Questions:

- What are other people's experiences with upgrading pandas? And would more bug-fix releases actually ease the upgrading?
- Do we want to do more bug-fix releases?
- Having a maintenance branch and backporting fixes is extra work. Would we be able to handle this? Would it be worth the effort? (It has been mentioned before, but I think the main point raised was lack of manpower to maintain separate branches)

To put it another way.
In our whatsnew notice there is "*We recommend that all users upgrade to this version*", but I am actually not sure we should recommend that. I personally do not always recommend that, at least not *without careful consideration*.

Regards,
Joris

From jeffreback at gmail.com Thu Feb 11 22:34:16 2016
From: jeffreback at gmail.com (Jeff Reback)
Date: Thu, 11 Feb 2016 22:34:16 -0500
Subject: [Pandas-dev] Fwd: [pydata] Google Summer of Code 2016, GSoC2016
In-Reply-To: <20160208135350.GR1293@buriti.rgaiacs.com>
References: <20160208135350.GR1293@buriti.rgaiacs.com>
Message-ID:

Anyone have interest in being a mentor for google summer of code?

I think we could put in the application pretty easily (though it is due next week): https://developers.google.com/open-source/gsoc/timeline

if you do pls respond. I would say that we would need at least 1 other person (I can be a mentor).

Jeff

---------- Forwarded message ----------
From: Raniere Silva
Date: Mon, Feb 8, 2016 at 8:53 AM
Subject: [pydata] Google Summer of Code 2016, GSoC2016
To: pydata at googlegroups.com

Hi all,

Since Pandas is a NumFOCUS sponsored project and NumFOCUS will apply to be a mentoring organization on GSoC I want to know (1) if Pandas is planning to apply this year and (2) if it wants to apply under the NumFOCUS umbrella.

Pandas is welcome and encouraged to apply as a separate mentoring organization directly with Google. We're happy to help you fill out your application and improve your ideas pages, as well as link your page to help students find you. We may also be able to be a reference for you.

It is totally fine if you want to use the NumFOCUS umbrella org as a backup plan in case you don't get selected and we do!

Cheers,
Raniere

--
You received this message because you are subscribed to the Google Groups "PyData" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pydata+unsubscribe at googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

From tom.augspurger88 at gmail.com Fri Feb 12 08:44:40 2016
From: tom.augspurger88 at gmail.com (tom)
Date: Fri, 12 Feb 2016 07:44:40 -0600
Subject: [Pandas-dev] On bug-fix releases and maintenance branches
In-Reply-To:
References:
Message-ID:

> On Feb 9, 2016, at 6:59 PM, Joris Van den Bossche wrote:
>
> Hi all,
>
> I wanted to stir some discussion on pandas' policy on bug-fix releases and upgrading pains. First some context:
>
> Context part 1: Currently we do not use maintenance branches for bugfix releases, and we actually also do not really do bugfix releases. We just develop further on master, and try to not merge breaking changes the first weeks/months, so we can do a minor kind of bug-fix release (but usually also with a lot of new features).
> But we don't, for example, backport fixes of regressions if they are fixed after master is pointing to the next major release.
>
> Context part 2: pandas is not yet that stable, in the sense that there are still quite a few breaking changes in each release. I am not arguing for not doing these breaking changes, as some of these changes are really needed to clean up the API (although there are also arguments for that, but I think that is another discussion). This has the consequence that updating your pandas version is not always that pleasant.

The third bit of context here is an eventual pandas 1.0. I could see us applying bug fixes to a pre-1.0 maintenance branch alongside the 1.x branch initially. Perhaps it's worth practicing that policy a bit before we get to 1.0.
> Sidenote: I do not have that much experience with using pandas in a larger company or in larger codebases that need to be upgraded, rather with just my own code for my PhD. So it is difficult for me to judge how much this is a problem or if bug-fix releases would help.
>
> Questions:
> - What are other people's experiences with upgrading pandas? And would more bug-fix releases actually ease the upgrading?
> - Do we want to do more bug-fix releases?
> - Having a maintenance branch and backporting fixes is extra work. Would we be able to handle this? Would it be worth the effort?
> (It has been mentioned before, but I think the main point raised was lack of manpower to maintain separate branches)
>
> To put it another way. In our whatsnew notice there is "We recommend that all users upgrade to this version", but I am actually not sure we should recommend that. I personally do not always recommend that, at least not without careful consideration.

This style guide was going around today, and it mentioned

> The basic Pandas API is still changing. When possible, production code should use numpy or standard Python.

The copyright on that page is 2012 so it could be a bit dated (it's also copyrighted to Chang She among others).

I do think pandas is at the point in its development cycle where we should be more conservative. And I think we have been a bit, but perhaps we can advertise that more.

> Regards,
> Joris
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev

- Tom

From jeffreback at gmail.com Fri Feb 12 23:38:09 2016
From: jeffreback at gmail.com (Jeff Reback)
Date: Fri, 12 Feb 2016 23:38:09 -0500
Subject: [Pandas-dev] google summer of code
Message-ID:

So I started an application

I need to have a co-administrator (or several), pls raise your hand if you'd like to do that.
I think I tried to have it send e-mail but not sure if it went thru.

I also need to know who could possibly serve as a mentor.

Here is a start of an ideas list, needs to be more text / fleshed out.

Feel free to add / update.

https://github.com/pydata/pandas/wiki/Google-Summer-of-Code

The application is due friday the 19th (next FRIDAY)! so pls let me know ASAP about the above

Jeff

From jeffreback at gmail.com Sat Feb 13 19:53:15 2016
From: jeffreback at gmail.com (Jeff Reback)
Date: Sat, 13 Feb 2016 19:53:15 -0500
Subject: [Pandas-dev] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE
Message-ID:

Hi,

I'm pleased to announce the availability of the first release candidate of Pandas 0.18.0.
Please try this RC and report any issues here: Pandas Issues
We will be releasing officially in 1-2 weeks or so.

**RELEASE CANDIDATE 1**

This is a major release from 0.17.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version.

Highlights include:

- pandas >= 0.18.0 will no longer support compatibility with Python version 2.6 (GH7718) or version 3.3 (GH11273)
- Moving and expanding window functions are now methods on Series and DataFrame, similar to .groupby-like objects, see here.
- Adding support for a RangeIndex as a specialized form of the Int64Index for memory savings, see here.
- API breaking .resample changes to make it more .groupby-like, see here.
- Removal of support for positional indexing with floats, which has been deprecated since 0.14.0. This will now raise a TypeError, see here.
- The .to_xarray() function has been added for compatibility with the xarray package, see here.
- Addition of the .str.extractall() method, and API changes to the .str.extract() method and the .str.cat() method
- pd.test() top-level nose test runner is available (GH4327)

See the Whatsnew for much more information.

Best way to get this is to install via conda from our development channel. Builds for osx-64, linux-64, win-64 for Python 2.7 and Python 3.5 are all available.

conda install pandas=v0.18.0rc1 -c pandas

Thanks to all who made this release happen. It is a very large release!

Jeff

From jorisvandenbossche at gmail.com Sun Feb 14 19:23:09 2016
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Mon, 15 Feb 2016 01:23:09 +0100
Subject: [Pandas-dev] google summer of code
In-Reply-To:
References:
Message-ID:

It would be nice if pandas could participate in Google Summer of Code this year, but at this moment I cannot yet guarantee I have the time to be a mentor.

Regards,
Joris

2016-02-13 5:38 GMT+01:00 Jeff Reback :

> So I started an application
>
> I need to have a co-administrator (or several), pls raise your hand if you'd like to do that.
>
> I think I tried to have it send e-mail but not sure if it went thru.
>
> I also need to know who could possibly serve as a mentor.
>
> Here is a start of an ideas list, needs to be more text / fleshed out.
>
> Feel free to add / update.
>
> https://github.com/pydata/pandas/wiki/Google-Summer-of-Code
>
> The application is due friday the 19th (next FRIDAY)! so pls let me know ASAP about the above
>
> Jeff

-------------- next part --------------
An HTML attachment was scrubbed...
From tom.augspurger88 at gmail.com Mon Feb 15 09:07:07 2016
From: tom.augspurger88 at gmail.com (tom)
Date: Mon, 15 Feb 2016 08:07:07 -0600
Subject: [Pandas-dev] google summer of code
In-Reply-To:
References:
Message-ID: <7FDA76E5-4D6A-480B-B0A2-D69C2C4CC62B@gmail.com>

Like Joris, I'm not sure that I'll have the time this summer (having a kid in May, so I might be contributing even less for a while unfortunately).
I could perhaps be a co-administrator, depending on what all that involves.

- Tom

> On Feb 14, 2016, at 6:23 PM, Joris Van den Bossche wrote:
>
> It would be nice if pandas could participate in Google Summer of Code this year,
> but at this moment I cannot yet guarantee I have the time to be a mentor.
>
> Regards,
> Joris
>
> 2016-02-13 5:38 GMT+01:00 Jeff Reback :
> So I started an application
>
> I need to have a co-administrator (or several), pls raise your hand if you'd like to do that.
>
> I think I tried to have it send e-mail but not sure if it went thru.
>
> I also need to know who could possibly serve as a mentor.
>
> Here is a start of an ideas list, needs to be more text / fleshed out.
>
> Feel free to add / update.
>
> https://github.com/pydata/pandas/wiki/Google-Summer-of-Code
>
> The application is due friday the 19th (next FRIDAY)! so pls let me know ASAP about the above
>
> Jeff

-------------- next part --------------
An HTML attachment was scrubbed...
From jeffreback at gmail.com Mon Feb 15 15:34:22 2016
From: jeffreback at gmail.com (Jeff Reback)
Date: Mon, 15 Feb 2016 15:34:22 -0500
Subject: [Pandas-dev] google summer of code
In-Reply-To: <7FDA76E5-4D6A-480B-B0A2-D69C2C4CC62B@gmail.com>
References: <7FDA76E5-4D6A-480B-B0A2-D69C2C4CC62B@gmail.com>
Message-ID:

ok I invited you all to be administrators. I don't think there is actually any work for this.

The real 'work' is for mentoring, but I suppose it will be much like reviewing pull-requests and such.

Jeff

On Mon, Feb 15, 2016 at 9:07 AM, tom wrote:

> Like Joris, I'm not sure that I'll have the time this summer (having a kid in May, so I might be contributing even less for a while unfortunately).
> I could perhaps be a co-administrator, depending on what all that involves.
>
> - Tom
>
>
> On Feb 14, 2016, at 6:23 PM, Joris Van den Bossche <jorisvandenbossche at gmail.com> wrote:
>
> It would be nice if pandas could participate in Google Summer of Code this year,
> but at this moment I cannot yet guarantee I have the time to be a mentor.
>
> Regards,
> Joris
>
> 2016-02-13 5:38 GMT+01:00 Jeff Reback :
>
>> So I started an application
>>
>> I need to have a co-administrator (or several), pls raise your hand if you'd like to do that.
>>
>> I think I tried to have it send e-mail but not sure if it went thru.
>>
>> I also need to know who could possibly serve as a mentor.
>>
>> Here is a start of an ideas list, needs to be more text / fleshed out.
>>
>> Feel free to add / update.
>>
>> https://github.com/pydata/pandas/wiki/Google-Summer-of-Code
>>
>> The application is due friday the 19th (next FRIDAY)!
so pls let me know ASAP about the above
>>
>> Jeff

From shoyer at gmail.com Mon Feb 15 15:52:22 2016
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 15 Feb 2016 12:52:22 -0800
Subject: [Pandas-dev] google summer of code
In-Reply-To:
References: <7FDA76E5-4D6A-480B-B0A2-D69C2C4CC62B@gmail.com>
Message-ID:

I just signed up to be an admin. I'd also be willing to be a mentor for either pandas or NumPy, if there's a good project fit -- I think I should be able to make the time this summer.

Cheers,
Stephan

On Mon, Feb 15, 2016 at 12:34 PM, Jeff Reback wrote:

> ok I invited you all to be administrators. I don't think there is actually any work for this.
>
> The real 'work' is for mentoring, but I suppose it will be much like reviewing pull-requests and such.
>
> Jeff
>
> On Mon, Feb 15, 2016 at 9:07 AM, tom wrote:
>
>> Like Joris, I'm not sure that I'll have the time this summer (having a kid in May, so I might be contributing even less for a while unfortunately).
>> I could perhaps be a co-administrator, depending on what all that involves.
>>
>> - Tom
>>
>>
>> On Feb 14, 2016, at 6:23 PM, Joris Van den Bossche <jorisvandenbossche at gmail.com> wrote:
>>
>> It would be nice if pandas could participate in Google Summer of Code this year,
>> but at this moment I cannot yet guarantee I have the time to be a mentor.
>>
>> Regards,
>> Joris
>>
>> 2016-02-13 5:38 GMT+01:00 Jeff Reback :
>>
>>> So I started an application
>>>
>>> I need to have a co-administrator (or several), pls raise your hand if you'd like to do that.
>>>
>>> I think I tried to have it send e-mail but not sure if it went thru.
>>>
>>> I also need to know who could possibly serve as a mentor.
>>>
>>> Here is a start of an ideas list, needs to be more text / fleshed out.
>>>
>>> Feel free to add / update.
>>>
>>> https://github.com/pydata/pandas/wiki/Google-Summer-of-Code
>>>
>>> The application is due friday the 19th (next FRIDAY)! so pls let me know ASAP about the above
>>>
>>> Jeff

From jeffreback at gmail.com Mon Feb 15 15:53:51 2016
From: jeffreback at gmail.com (Jeff Reback)
Date: Mon, 15 Feb 2016 15:53:51 -0500
Subject: [Pandas-dev] google summer of code
In-Reply-To:
References: <7FDA76E5-4D6A-480B-B0A2-D69C2C4CC62B@gmail.com>
Message-ID:

awesome Stephan!

further, if something is cross-related, e.g. fixing up some Panel -> xray things, we could add that to the list (wiki)

On Mon, Feb 15, 2016 at 3:52 PM, Stephan Hoyer wrote:

> I just signed up to be an admin.
I'd also be willing to be a mentor for either pandas or NumPy, if there's a good project fit -- I think I should be able to make the time this summer.
>
> Cheers,
> Stephan
>
> On Mon, Feb 15, 2016 at 12:34 PM, Jeff Reback wrote:
>
>> ok I invited you all to be administrators. I don't think there is actually any work for this.
>>
>> The real 'work' is for mentoring, but I suppose it will be much like reviewing pull-requests and such.
>>
>> Jeff
>>
>> On Mon, Feb 15, 2016 at 9:07 AM, tom wrote:
>>
>>> Like Joris, I'm not sure that I'll have the time this summer (having a kid in May, so I might be contributing even less for a while unfortunately).
>>> I could perhaps be a co-administrator, depending on what all that involves.
>>>
>>> - Tom
>>>
>>>
>>> On Feb 14, 2016, at 6:23 PM, Joris Van den Bossche <jorisvandenbossche at gmail.com> wrote:
>>>
>>> It would be nice if pandas could participate in Google Summer of Code this year,
>>> but at this moment I cannot yet guarantee I have the time to be a mentor.
>>>
>>> Regards,
>>> Joris
>>>
>>> 2016-02-13 5:38 GMT+01:00 Jeff Reback :
>>>
>>>> So I started an application
>>>>
>>>> I need to have a co-administrator (or several), pls raise your hand if you'd like to do that.
>>>>
>>>> I think I tried to have it send e-mail but not sure if it went thru.
>>>>
>>>> I also need to know who could possibly serve as a mentor.
>>>>
>>>> Here is a start of an ideas list, needs to be more text / fleshed out.
>>>>
>>>> Feel free to add / update.
>>>>
>>>> https://github.com/pydata/pandas/wiki/Google-Summer-of-Code
>>>>
>>>> The application is due friday the 19th (next FRIDAY)!
so pls let me know ASAP about the above
>>>>
>>>> Jeff

From wesmckinn at gmail.com Tue Feb 23 14:21:37 2016
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 23 Feb 2016 11:21:37 -0800
Subject: [Pandas-dev] On bug-fix releases and maintenance branches
In-Reply-To:
References:
Message-ID:

hi Joris,

I'm sorry it's taken a couple weeks to write a reply -- been really busy and wanted to put some thought into this.

This is a really important discussion given how important pandas has become to so many people, thank you for bringing it up.

On Tue, Feb 9, 2016 at 4:59 PM, Joris Van den Bossche wrote:
> Hi all,
>
> I wanted to stir some discussion on pandas' policy on bug-fix releases and upgrading pains. First some context:
>
> Context part 1: Currently we do not use maintenance branches for bugfix releases, and we actually also do not really do bugfix releases.
We just develop further on master, and try to not merge breaking changes the first weeks/months, so we can do a minor kind of bug-fix release (but usually also with a lot of new features).
> But we don't, for example, backport fixes of regressions if they are fixed after master is pointing to the next major release.

I think in general it would be a good idea to tilt development away from new feature development and toward bug fixes and stability. Given that we are contemplating making some breaking changes in a 1.x development branch (like removing the Panel classes), we should decide at some point to create a 0.X.Y maintenance line where we can backport bug fixes only, so that "legacy pandas" users can have a "LTS" (in Ubuntu parlance) maintenance branch. This introduces some development overhead but it seems worth it.

>
> Context part 2: pandas is not yet that stable, in the sense that there are still quite a few breaking changes in each release. I am not arguing for not doing these breaking changes, as some of these changes are really needed to clean up the API (although there are also arguments for that, but I think that is another discussion). This has the consequence that updating your pandas version is not always that pleasant.

Over the years I've heard many horror stories from companies who have created and maintained internal 0.7.x, 0.8.x, or 0.9.x pandas forks because of the API breakage issues. This is definitely an anti-pattern that we should try to avoid happening in the future, but API breakages in many cases are the inevitable price of progress.

Some of the API breakage has resulted from experiences accumulated over a long period of time -- I made a lot of decisions early on in the project that ended up not being the right ones (e.g. resample default arguments changed at one point). There wasn't enough community engagement at that point to have a thorough design process to potentially come up with the "right" design first.
In other cases, the "right" choice was perhaps more ambiguous.

API changes are most painful for users who do not write tests for their code that depends on pandas. That problem is probably not fixable =)

I think having stable releases with backports of serious correctness bugs helps mitigate this problem, while keeping API changes between major releases modest. I would also be in favor of having point releases only contain bug fixes rather than the current system of point releases being a stable snapshot of trunk.

Since Jeff is the most affected by this on a day to day basis as de facto steward of the PR queue I would be curious what process he feels would be the most helpful.

- Wes

>
> Sidenote: I do not have that much experience with using pandas in a larger company or in larger codebases that need to be upgraded, rather with just my own code for my PhD. So it is difficult for me to judge how much this is a problem or if bug-fix releases would help.
>
> Questions:
>
> What are other people's experiences with upgrading pandas? And would more bug-fix releases actually ease the upgrading?
> Do we want to do more bug-fix releases?
> Having a maintenance branch and backporting fixes is extra work. Would we be able to handle this? Would it be worth the effort?
>
> (It has been mentioned before, but I think the main point raised was lack of manpower to maintain separate branches)
>
> To put it another way. In our whatsnew notice there is "We recommend that all users upgrade to this version", but I am actually not sure we should recommend that. I personally do not always recommend that, at least not without careful consideration.
>
> Regards,
> Joris

From 0xdeadcafebeef at gmail.com Tue Feb 23 09:23:29 2016
From: 0xdeadcafebeef at gmail.com (P C)
Date: Tue, 23 Feb 2016 14:23:29 +0000
Subject: [Pandas-dev] On bug-fix releases and maintenance branches
Message-ID:

Apologies if the formatting for this is messed up, I have just joined the list but would like to reply to a thread that started before I joined, so I have to copy and paste relevant items in.

Specifically, I have comments on this:

- What are other people's experiences with upgrading pandas? And would more bug-fix releases actually ease the upgrading?

I work for a large company with an established Python code base (3+ years of maintenance, tens of users). Pandas and numpy are heavily used throughout the code base. In our experience, upgrading pandas is usually quite an involved exercise.

When we upgraded from 0.15.2 to 0.17, it took several passes, as follows:

* Attempted to upgrade from 0.15.2 to 0.16.1: found the following bugs, raised the following PRs:
  https://github.com/pydata/pandas/issues/10181 / https://github.com/pydata/pandas/pull/10188
  https://github.com/pydata/pandas/issues/10195 / https://github.com/pydata/pandas/pull/10196
  https://github.com/pydata/pandas/issues/10193 / https://github.com/pydata/pandas/pull/10379
* The above fixes went into 0.16.2 and 0.17.0 so we decided to wait for 0.17.0.
* Tried 0.17.0: found https://github.com/pydata/pandas/issues/11372 and https://github.com/pydata/pandas/issues/11370
* https://github.com/pydata/pandas/issues/11372 was fixed in 0.17.1 but not https://github.com/pydata/pandas/issues/11370 (targeted 0.18.0)
* In the end we went with an intermediate build: 0.17.1.post78+g158e85a in the normal git versioning parlance. (https://github.com/pydata/pandas/pull/11427 was the specific item that we needed to fix issue 11372 above).
We chose not to wait for 0.18.0 because that contains possibly breaking changes to resample. We judged that it was better to go with a known non-official intermediate build of master, that passed all our tests, than to risk another cycle of chasing bugs.

We found that from 0.15.2 to 0.17.1, each release fixed one or more issues that affected us but also introduced other issues.

I describe this just to add a data point and relate our experience with upgrading, rather than to shout for massive changes in the release process. The main things that hit us were edge cases breaking when core classes were refactored.

Thanks,
PC

From jeffreback at gmail.com Tue Feb 23 15:11:30 2016
From: jeffreback at gmail.com (Jeff Reback)
Date: Tue, 23 Feb 2016 15:11:30 -0500
Subject: [Pandas-dev] On bug-fix releases and maintenance branches
In-Reply-To:
References:
Message-ID:

Thanks for bringing this up Joris, here are some thoughts.

1) I agree that the next releases should probably focus on bug fixes. So this might mean we should shoot for 0.18.2, 0.18.3, etc. However, we do need a 0.19.0 in order to provide any big deprecations (Panel) and API changes that are needed.
2) I am a bit hesitant to even make a big break (1.0) because I have seen this just bifurcating people (e.g. do I upgrade now, what if I want compat). This just creates less community. So I think this should be a goal, that even though it's called 1.0 it is as back-compat as possible.
3) Releases can be big, and do fix lots of bugs, and usually introduce new ones. This is almost inevitable as we add new features, changes, and even bug fixes which occasionally have regressions (though the test suite is pretty good, so hopefully not too often).
4) I don't relish backporting things. I think this could lead to lots of headaches and IMHO doesn't really buy much.
5) We don't want to just go into maintenance mode because we still have a fair amount of feature requests (though these are often pretty targeted), but off of the top of my head, nothing really *new*, mainly some API changes to bring consistency. E.g. ``.agg`` on a DataFrame is a long-requested feature, which actually after 0.18.0 is quite trivial to do.
6) I think we telegraph any API changes and really really try to have back-compat, so people do have the ability to upgrade at their leisure.

> API changes are most painful for users who do not write tests for
> their code that depends on pandas. That problem is probably not
> fixable =)

of course this is a telling point. pandas upgrades often expose bugs in user code. I view this as a good thing!

So given all of the somewhat contradictory points above, what do I really think we should do?

In order for pandas to be (even more of) a force in leading the scientific community, I think we have to grow. So having more contributors is a great thing. People do like / appreciate fixing bugs, but even more (IMHO), are performance enhancements and *some* new features.

I will probably try to do more bug-fixing (rather than large API-ish fixes), I think. There is quite a back-log. This should *slow* the issue of the BIG API changes.

So I am kind of -1 on backports, mostly for 2), it seems to just slow things down, and 4), it can often lead to MORE things being inconsistent (you need machinery to ensure that what is backported is correct and is included). I can easily foresee that we decide to create 'stable' branches, which in fact are stable but might have inconsistent fixes; this is even more confusing in my view.

I think we have a fairly aggressive release cycle. We for sure don't want to debate everything. I am of the opinion that it is much better to put things out there quicker than to endlessly debate extremely minor points (not naming project names here :).

For the general user what we do w.r.t.
release cycles probably doesn't matter, and for the corporate user, they almost always have a 'fixed' version anyhow (and then they do of course port to the new ones, but then they have people who upgrade carefully).

I am not so sure we should impose structure on this. We already have announced major releases and minor releases. I am all for better 'language' in the minor releases.

Jeff

On Tue, Feb 23, 2016 at 2:21 PM, Wes McKinney wrote:

> hi Joris,
>
> I'm sorry it's taken a couple weeks to write a reply -- been really busy and wanted to put some thought into this.
>
> This is a really important discussion given how important pandas has become to so many people, thank you for bringing it up.
>
> On Tue, Feb 9, 2016 at 4:59 PM, Joris Van den Bossche wrote:
> > Hi all,
> >
> > I wanted to stir some discussion on pandas' policy on bug-fix releases and upgrading pains. First some context:
> >
> > Context part 1: Currently we do not use maintenance branches for bugfix releases, and we actually also do not really do bugfix releases. We just develop further on master, and try to not merge breaking changes the first weeks/months, so we can do a minor kind of bug-fix release (but usually also with a lot of new features).
> > But we don't, for example, backport fixes of regressions if they are fixed after master is pointing to the next major release.
>
> I think in general it would be a good idea to tilt development away from new feature development and toward bug fixes and stability. Given that we are contemplating making some breaking changes in a 1.x development branch (like removing the Panel classes), we should decide at some point to create a 0.X.Y maintenance line where we can backport bug fixes only, so that "legacy pandas" users can have a "LTS" (in Ubuntu parlance) maintenance branch. This introduces some development overhead but it seems worth it.
>
> >
> > Context part 2: pandas is not yet that stable, in the sense that there
> > are still quite some breaking changes in each release. I am not arguing
> > for not doing these breaking changes, as some of these changes are really
> > needed to clean up the API (although there are also arguments for that,
> > but I think that is another discussion). This has the consequence that
> > updating your pandas version is not always that pleasant.
>
> Over the years I've heard many horror stories from companies who have
> created and maintained internal 0.7.x, 0.8.x, or 0.9.x pandas forks
> because of the API breakage issues. This is definitely an anti-pattern
> that we should try to avoid happening in the future, but API breakages
> in many cases are the inevitable price of progress.
>
> Some of the API breakage has resulted from experiences accumulated
> over a long period of time -- I made a lot of decisions early on in
> the project that ended up not being the right ones (e.g. resample
> default arguments changed at one point). There wasn't enough community
> engagement at that point to have a thorough design process to
> potentially come up with the "right" design first. In other cases, the
> "right" choice was perhaps more ambiguous.
>
> API changes are most painful for users who do not write tests for
> their code that depends on pandas. That problem is probably not
> fixable =)
>
> I think having stable releases with backports of serious correctness
> bugs helps mitigate this problem, combined with only modest API changes
> between major releases. I would also be in favor of having point releases
> only contain bug fixes rather than the current system of point releases
> being a stable snapshot of trunk.
>
> Since Jeff is the most affected by this on a day to day basis as de
> facto steward of the PR queue I would be curious what process he feels
> would be the most helpful.
>
> - Wes
>
> >
> > Sidenote: I do not have that much experience with using pandas in a
> > larger company or in larger codebases that need to be upgraded, rather
> > with just my own code for my PhD. So it is difficult for me to judge how
> > much this is a problem or if bug-fix releases would help.
> >
> > Questions:
> >
> > - What are other people's experiences with upgrading pandas? And would
> >   more bug-fix releases actually ease the upgrading?
> > - Do we want to do more bug-fix releases?
> > - Having a maintenance branch and backporting fixes is extra work. Would
> >   we be able to handle this? Would it be worth the effort?
> >
> > (It has been mentioned before, but I think the main point raised was
> > lack of manpower to maintain separate branches)
> >
> > To put it another way. In our whatsnew notice there is "We recommend that
> > all users upgrade to this version", but I am actually not sure we should
> > recommend that. I personally do not always recommend that no matter what
> > without careful consideration.
> >
> > Regards,
> > Joris
> >
> > _______________________________________________
> > Pandas-dev mailing list
> > Pandas-dev at python.org
> > https://mail.python.org/mailman/listinfo/pandas-dev

From jeffreback at gmail.com Sun Feb 28 12:03:45 2016
From: jeffreback at gmail.com (Jeff)
Date: Sun, 28 Feb 2016 09:03:45 -0800 (PST)
Subject: [Pandas-dev] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE
In-Reply-To: <4f18bfb7-ad9e-4d50-bf3e-1f7164e96bfa@googlegroups.com>
References: <4f18bfb7-ad9e-4d50-bf3e-1f7164e96bfa@googlegroups.com>
Message-ID: <5224abfa-fccd-4f07-8d4a-7360c9764cc4@googlegroups.com>

These are pre-releases.
In other words, we would want the community to test them out before an
official release, and see if there are any show stoppers. The docs are set up
for the official releases. These are not put into official channels at all
(that is the point), e.g. not on PyPI, nor in the conda main channels. Only
official releases will go there.

Generally we will try to do release candidates before major changes, but not
before minor changes.

So the official release of 0.18.0 has not happened yet! (In fact we are going
to do a v0.18.0rc2 next week.)

We would love for you to test it out!

Jeff

On Sunday, February 28, 2016 at 11:50:57 AM UTC-5, John E wrote:
>
> I hope this doesn't come across as a trivial, semantical question, but...
>
> The initial releases of the last 2 or so versions have been labelled as
> "release candidates" but still say "We recommend that all
> users upgrade to this version."
>
> So this is a little confusing to me for using pandas in a production
> environment. "Release candidate" seems to suggest that you should wait for
> 0.18.1, but the note unambiguously says not to wait. So which
> interpretation is recommended for a production environment?
>
>
> On Saturday, February 13, 2016 at 7:53:18 PM UTC-5, Jeff wrote:
>>
>> Hi,
>>
>> I'm pleased to announce the availability of the first release candidate
>> of Pandas 0.18.0.
>> Please try this RC and report any issues here: Pandas Issues
>>
>> We will be releasing officially in 1-2 weeks or so.
>>
>> **RELEASE CANDIDATE 1**
>>
>> This is a major release from 0.17.1 and includes a small number of API
>> changes, several new features, enhancements, and performance improvements
>> along with a large number of bug fixes. We recommend that all
>> users upgrade to this version.
>>
>> Highlights include:
>>
>>    - pandas >= 0.18.0 will no longer support compatibility with Python
>>      version 2.6 (GH7718) or version 3.3 (GH11273)
>>    - Moving and expanding window functions are now methods on Series and
>>      DataFrame, similar to .groupby-like objects, see here.
>>    - Adding support for a RangeIndex as a specialized form of the
>>      Int64Index for memory savings, see here.
>>    - API breaking .resample changes to make it more .groupby-like, see
>>      here
>>    - Removal of support for positional indexing with floats, which was
>>      deprecated since 0.14.0. This will now raise a TypeError, see here
>>    - The .to_xarray() function has been added for compatibility with the
>>      xarray package, see here.
>>    - Addition of the .str.extractall() method, and API changes to the
>>      .str.extract() method and the .str.cat() method
>>    - pd.test() top-level nose test runner is available (GH4327)
>>
>> See the Whatsnew for much more information.
>>
>> Best way to get this is to install via conda from our development
>> channel. Builds for osx-64, linux-64, win-64 for Python 2.7 and Python 3.5
>> are all available.
>>
>>     conda install pandas=v0.18.0rc1 -c pandas
>>
>> Thanks to all who made this release happen. It is a very large release!
>>
>> Jeff
>>

From jeffreback at gmail.com Sun Feb 28 13:49:41 2016
From: jeffreback at gmail.com (Jeff)
Date: Sun, 28 Feb 2016 10:49:41 -0800 (PST)
Subject: [Pandas-dev] ANN: pandas v0.18.0rc1 - RELEASE CANDIDATE
In-Reply-To:
References: <4f18bfb7-ad9e-4d50-bf3e-1f7164e96bfa@googlegroups.com> <5224abfa-fccd-4f07-8d4a-7360c9764cc4@googlegroups.com>
Message-ID:

So you are probably reading sas7bdat, support for which was put in AFTER
0.18.0rc1 was cut. If you are reading the xport format then you are good to
go; otherwise you may want to wait a bit for 0.18.0rc2.
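
For readers who want to try the release candidate, three of the highlights
quoted in the announcement above (window functions as methods, the deferred
.resample, and RangeIndex) can be exercised with a short script. This is only
a sketch and not part of the original thread: the sample data and variable
names are made up for illustration, and it assumes pandas 0.18.0 or later is
installed.

```python
import pandas as pd

# Hypothetical sample data: six daily observations.
idx = pd.date_range("2016-01-01", periods=6, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# Window functions are methods returning a deferred object, much like
# .groupby; nothing is computed until an aggregation such as .mean() runs.
rolled = s.rolling(window=3).mean()

# .resample is deferred in the same way and needs an explicit aggregation.
two_day = s.resample("2D").sum()

# A default integer index is a RangeIndex, which stores only
# start/stop/step instead of materializing every label.
default_index = pd.Series([10, 20, 30]).index

print(rolled)
print(two_day)
print(type(default_index).__name__)
```

Code written against the older pd.rolling_mean(...) functions or the old
.resample(..., how=...) signature is where the upgrade friction discussed in
this thread would show up, so a script like this is a quick way to see which
style an environment supports.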

On Sunday, February 28, 2016 at 1:42:53 PM UTC-5, John E wrote:
>
> OK, thanks, I got it. Although... I would consider pandas.pydata.org to
> be a common end user gateway and if one starts there they will read "We
> recommend that *all* users upgrade to this version." And then if they
> scroll down a short distance they will see a single line instruction for
> installing via conda: "conda install pandas=v0.18.0rc1 -c pandas".
>
> And also somewhat confusing to me about pandas.pydata.org is that looking
> to the right, you have a choice of RC, dev, and previous releases, but
> nothing that says something like "current, stable release".
>
> Anyways.... quite possibly this is confusing only to me, and not others,
> but I thought I'd mention it just in case. FWIW.
>
> I've now installed 0.18.0rc1 and will try to test out some of the newer
> features. I'm really interested to see how well the SAS reader works (i.e.
> how fast). I hate SAS myself, but this would be a really, really nice
> feature for my organization and likely increase adoption of python & pandas.
>
>
> On Sunday, February 28, 2016 at 12:03:45 PM UTC-5, Jeff wrote:
>>
>> These are pre-releases. In other words, we would want the community to
>> test them out before an official release, and see if there are any show
>> stoppers. The docs are set up for the official releases. These are not put
>> into official channels at all (that is the point), e.g. not on PyPI, nor in
>> the conda main channels. Only official releases will go there.
>>
>> Generally we will try to do release candidates before major changes, but
>> not before minor changes.
>>
>> So the official release of 0.18.0 has not happened yet! (In fact we are
>> going to do a v0.18.0rc2 next week.)
>>
>> We would love for you to test it out!
>>
>> Jeff
>>
>> On Sunday, February 28, 2016 at 11:50:57 AM UTC-5, John E wrote:
>>>
>>> I hope this doesn't come across as a trivial, semantical question, but...