From tom.augspurger88 at gmail.com Mon Jun 5 08:36:31 2017 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Mon, 5 Jun 2017 07:36:31 -0500 Subject: [Pandas-dev] ANN: pandas v0.20.2 released Message-ID: I'm pleased to announce the release of pandas 0.20.2. This is a minor bug-fix release in the 0.20.x series and includes some small regression fixes, bug fixes, and performance improvements. See the Whatsnew Page for all of the changes. We recommend that all users upgrade to this version. This release covers 4 weeks of development, with 67 commits by 34 authors. Tom --- *What is it* pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. *How to get it* Source tarballs and Windows/Mac/Linux wheels are available on PyPI (thanks to Christoph Gohlke for the Windows wheels, and to Matthew Brett for setting up the Mac/Linux wheels). Conda packages are already available via the conda-forge channel (conda install pandas -c conda-forge). It will be available on the main channel shortly. *Issues* Please report any issues on our issue tracker: https://github.com/pydata/pandas/issues *Thanks* Thanks to all the contributors: - Aaron Barber - Andrew ? - Becky Sweger - Christian Prinoth - Christian Stade-Schuldt - DSM - Erik Fredriksen - Hugues Valois - Jeff Reback - Jeff Tratner - JimStearns206 - John W.
O'Brien - Joris Van den Bossche - JosephWagner - Keith Webber - Mehmet Ali "Mali" Akmanalp - Pankaj Pandey - Patrick Luo - Patrick O'Melveny - Pietro Battiston - RobinFiveWords - Ryan Hendrickson - SimonBaron - Tom Augspurger - WBare - bpraggastis - chernrick - chris-b1 - economy - gfyoung - jaredsnyder - keitakurita - linebp - lloydkirk -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Wed Jun 14 11:30:23 2017 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Wed, 14 Jun 2017 17:30:23 +0200 Subject: [Pandas-dev] [pydata] Pandas 2.0 Design Request: A more dplyr-like API In-Reply-To: References: Message-ID: 2017-06-14 17:24 GMT+02:00 Paul Hobson : > Just my 2 cents on indexes: > > Every time I think I'm done with them and don't need them any more, I get > into some weird situation where a complex, nested, categorical index makes > my life soooo much easier. > > I recognize that if the library and general community doesn't need them, > they can represent a significant maintenance burden. But they saved my ass > a couple of times this week. > > Stephan mentioned some ideas to make those cases where you don't need them easier (eg allow not to have an index), but there are no plans to ditch Indexes altogether (if you look at the linked issue, it speaks about "optional indexes", but Stephan's wording in the mail below was maybe a bit misleading). Joris > -paul > > On Tue, Jun 13, 2017 at 5:03 PM, Stephan Hoyer wrote: > >> Hi Chris, >> >> I think most of us agree with you. We've been slowly moving in this >> direction (e.g., with .assign()) and hope to do more. For example, see our speculative >> discussion concerning >> getting rid of indexes for pandas2 and a proposal for allowing indexes >> to be referenced by name >> . >> >> There are a few major obstacles here: >> 1. Coming up with concrete plans for how new APIs should work. 
This is >> harder than just copying dplyr, because we don't have access to >> non-standard evaluation in Python. >> 2. Figuring out how to deprecate/replace existing behavior in a minimally >> painful way, to minimize clutter of the pandas API. (Arguably, we already >> have too many methods.) >> 3. Actually implementing these changes in a consistent fashion in the >> complex pandas codebase. >> >> These are all important work, but only the last item requires actually >> writing code. Help would be appreciated on all of these. >> >> It's worth noting that some of this may actually be easier to do outside >> of pandas proper. For example, Wes and Phil have been working on a pandas >> backend to Ibis . >> >> Best, >> Stephan >> >> On Tue, Jun 13, 2017 at 3:48 PM, Chris Said wrote: >> >>> Hi Pandas developers, >>> >>> I want to start by thanking all of the pandas developers for the effort >>> they've put into the project. So much of what you do is thankless, and I >>> want you to know it is really appreciated. Pandas is a huge part of my >>> day-to-day coding. >>> >>> Because I use it so much, I want to submit a request. I want somebody to >>> #MakePandasMoreLikeDplyr. To me and to almost everyone else I've talked to >>> who knows pandas and dplyr, this is more important than performance >>> improvements and arguably more important than most of the goals in the pandas >>> 2.0 design docs . >>> >>> I'm not an R guy. 95% of my work is done in pandas. But everyone I know >>> who uses pandas is constantly having to google how to do things. In >>> contrast, dplyr feels like coding at the speed of thought. In particular, >>> the combination of groupby->{mutate, summarize} is incredibly natural >>> . It is so >>> easy to create multiple named output columns from multiple input columns. >>> That's because the definition of new columns, with reference to multiple >>> input columns, is all done inside the call to mutate / summarize. 
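The contrast Chris draws can be made concrete. A rough pandas equivalent of dplyr's group_by %>% summarize(mean_x = mean(x), mean_y = mean(y)) is an aggregate-then-rename pipeline (a sketch on made-up data; the mean_x/mean_y names are illustrative):

```python
import pandas as pd

# Toy stand-in for the diamonds dataset.
df = pd.DataFrame({
    'cut': ['Ideal', 'Ideal', 'Premium', 'Premium'],
    'x':   [3.95, 4.05, 4.20, 4.40],
    'y':   [3.98, 4.07, 4.23, 4.43],
})

# Aggregate each group, then rename to get the desired named outputs.
out = (df.groupby('cut')
         .agg({'x': 'mean', 'y': 'mean'})
         .rename(columns={'x': 'mean_x', 'y': 'mean_y'})
         .reset_index())
```

The separate renaming step is exactly the part that dplyr folds into the summarize call itself.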
With >>> pandas, it's much more complicated and hard to remember >>> >>> . >>> >>> The new transform method in 0.20 gets us part of the way there. But >>> instead of allowing users to name the output columns, it returns multi-indexed >>> columns , >>> which for me and most other people I've talked to are unwanted >>> . >>> >>> Thank you again for all your hard work. Just as a TL;DR: More like >>> dplyr, less injection of multi-indexes. (Could they be eliminated entirely?) >>> >>> Best, >>> Chris >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "PyData" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to pydata+unsubscribe at googlegroups.com. >>> For more options, visit https://groups.google.com/d/optout. >>> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "PyData" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to pydata+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. >> > > -- > You received this message because you are subscribed to the Google Groups > "PyData" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to pydata+unsubscribe at googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From th020394 at gmail.com Sun Jun 11 16:11:50 2017 From: th020394 at gmail.com (Tyler Hardin) Date: Sun, 11 Jun 2017 16:11:50 -0400 Subject: [Pandas-dev] Why is it not possible to access index entries like columns? Message-ID: Hi, Is it just a matter of implementing the lookup in the . and [] operator methods to return df.index.get_level_values() with the argument, or is there some deeper reason this can't work that I'm not seeing? E.g. 
I want df['date'] to do df.get_level_values('date') when there is no date column and there is a level in the index named 'date'. Presumably, this would be a secondary lookup w/ columns taking precedence -- i.e. a 'date' column would mask a 'date' index. Thanks, Tyler -------------- next part -------------- An HTML attachment was scrubbed... URL: From cbartak at gmail.com Wed Jun 14 12:27:24 2017 From: cbartak at gmail.com (Chris Bartak) Date: Wed, 14 Jun 2017 11:27:24 -0500 Subject: [Pandas-dev] [pydata] Pandas 2.0 Design Request: A more dplyr-like API In-Reply-To: References: Message-ID: Chris, I'd encourage you to experiment with (and contribute to!) the ibis expression API that Stephan mentioned. The pandas backend is a work in progress, but functional enough to try out. It is already quite dplyr-like; for example, here's the translation of your first example. The obvious difference is the verbosity of having to fully qualify column names - this is essentially a Python syntax limitation.

In [77]: (diamonds
    ...:     .groupby(diamonds.cut)
    ...:     .aggregate(mean_x=diamonds.x.mean(),
    ...:                mean_y=diamonds.y.mean()))

https://github.com/ibis-project/ibis On Wed, Jun 14, 2017 at 10:30 AM, Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > > 2017-06-14 17:24 GMT+02:00 Paul Hobson : > >> Just my 2 cents on indexes: >> >> Every time I think I'm done with them and don't need them any more, I get >> into some weird situation where a complex, nested, categorical index makes >> my life soooo much easier. >> >> I recognize that if the library and general community doesn't need them, >> they can represent a significant maintenance burden. But they saved my ass >> a couple of times this week.
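Paul's point about nested indexes can be illustrated with a small sketch (toy data; the site/year/flow names are hypothetical):

```python
import pandas as pd

# A two-level (nested) index over site and year.
df = pd.DataFrame(
    {'flow': [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_product(
        [['north', 'south'], [2016, 2017]],
        names=['site', 'year'],
    ),
)

# Selecting a whole group is a single label lookup...
north = df.loc['north']

# ...and cross-sections on any level need no manual filtering.
year_2017 = df.xs(2017, level='year')
```

This is the kind of convenience that would have to be rebuilt by hand if indexes went away entirely.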
>> >> > Stephan mentioned some ideas to make those cases where you don't need them > easier (eg allow not to have an index), but there are no plans to ditch > Indexes altogether (if you look at the linked issue, it speaks about > "optional indexes", but Stephan's wording in the mail below was maybe a bit > misleading). > > Joris > > >> -paul >> >> On Tue, Jun 13, 2017 at 5:03 PM, Stephan Hoyer wrote: >> >>> Hi Chris, >>> >>> I think most of us agree with you. We've been slowly moving in this >>> direction (e.g., with .assign()) and hope to do more. For example, see our speculative >>> discussion concerning >>> getting rid of indexes for pandas2 and a proposal for allowing indexes >>> to be referenced by name >>> . >>> >>> There are a few major obstacles here: >>> 1. Coming up with concrete plans for how new APIs should work. This is >>> harder than just copying dplyr, because we don't have access to >>> non-standard evaluation in Python. >>> 2. Figuring out how to deprecate/replace existing behavior in a >>> minimally painful way, to minimize clutter of the pandas API. (Arguably, we >>> already have too many methods.) >>> 3. Actually implementing these changes in a consistent fashion in the >>> complex pandas codebase. >>> >>> These are all important work, but only the last item requires actually >>> writing code. Help would be appreciated on all of these. >>> >>> It's worth noting that some of this may actually be easier to do outside >>> of pandas proper. For example, Wes and Phil have been working on a pandas >>> backend to Ibis . >>> >>> Best, >>> Stephan >>> >>> On Tue, Jun 13, 2017 at 3:48 PM, Chris Said >>> wrote: >>> >>>> Hi Pandas developers, >>>> >>>> I want to start by thanking all of the pandas developers for the effort >>>> they've put into the project. So much of what you do is thankless, and I >>>> want you to know it is really appreciated. Pandas is a huge part of my >>>> day-to-day coding. 
>>>> >>>> Because I use it so much, I want to submit a request. I want somebody >>>> to #MakePandasMoreLikeDplyr. To me and to almost everyone else I've talked >>>> to who knows pandas and dplyr, this is more important than performance >>>> improvements and arguably more important than most of the goals in the pandas >>>> 2.0 design docs . >>>> >>>> I'm not an R guy. 95% of my work is done in pandas. But everyone I know >>>> who uses pandas is constantly having to google how to do things. In >>>> contrast, dplyr feels like coding at the speed of thought. In particular, >>>> the combination of groupby->{mutate, summarize} is incredibly natural >>>> . It is so >>>> easy to create multiple named output columns from multiple input columns. >>>> That's because the definition of new columns, with reference to multiple >>>> input columns, is all done inside the call to mutate / summarize. With >>>> pandas, it's much more complicated and hard to remember >>>> >>>> . >>>> >>>> The new transform method in 0.20 gets us part of the way there. But >>>> instead of allowing users to name the output columns, it returns multi-indexed >>>> columns , >>>> which for me and most other people I've talked to are unwanted >>>> . >>>> >>>> Thank you again for all your hard work. Just as a TL;DR: More like >>>> dplyr, less injection of multi-indexes. (Could they be eliminated entirely?) >>>> >>>> Best, >>>> Chris >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "PyData" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to pydata+unsubscribe at googlegroups.com. >>>> For more options, visit https://groups.google.com/d/optout. >>>> >>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "PyData" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to pydata+unsubscribe at googlegroups.com. 
>>> For more options, visit https://groups.google.com/d/optout. >>> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "PyData" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to pydata+unsubscribe at googlegroups.com. >> For more options, visit https://groups.google.com/d/optout. >> > > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jorisvandenbossche at gmail.com Thu Jun 15 10:15:21 2017 From: jorisvandenbossche at gmail.com (Joris Van den Bossche) Date: Thu, 15 Jun 2017 16:15:21 +0200 Subject: [Pandas-dev] Why is it not possible to access index entries like columns? In-Reply-To: References: Message-ID: Hi Tyler, I think there is agreement that this is something we would like to change in pandas. The related GitHub issue is here: https://github.com/pandas-dev/pandas/issues/8162 So as far as I can see, it is mainly a matter of someone taking up the implementation work. There are a few API questions to be answered as well (see the discussion in the issue). For example, it would probably *not* just return df.index.get_level_values(), but rather wrap that in a Series (and keep the original index as well). What to do when a name appears both as a column name and an index level name is another question (both in the short and the long term). You are very welcome to work on this if you are interested! Regards, Joris 2017-06-11 22:11 GMT+02:00 Tyler Hardin : > Hi, > > Is it just a matter of implementing the lookup in the . and [] operator > methods to return df.index.get_level_values() with the argument, or is > there some deeper reason this can't work that I'm not seeing? > > E.g.
Presumably, > this would be a secondary lookup w/ columns taking precedence -- i.e. a > 'date' column would mask a 'date' index. > > Thanks, > Tyler > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tom.augspurger88 at gmail.com Thu Jun 15 10:33:57 2017 From: tom.augspurger88 at gmail.com (Tom Augspurger) Date: Thu, 15 Jun 2017 09:33:57 -0500 Subject: [Pandas-dev] Why is it not possible to access index entries like columns? In-Reply-To: References: Message-ID: https://github.com/pandas-dev/pandas/pull/12404 started to implement it, but stalled. On Thu, Jun 15, 2017 at 9:15 AM, Joris Van den Bossche < jorisvandenbossche at gmail.com> wrote: > Hi Tyler, > > I think there is agreement on that it is something we would like to change > in pandas. Related github issue is here: https://github.com/pandas-dev/ > pandas/issues/8162 > > So as far as I can see, it is mainly someone taking up the implementation > work. There are a few API questions to be answered as well (see the > discussion in the issue). > For example, it would probably *not* be just returning > df.index.get_level_values(), but rather wrapped that in a Series (and > keeping the original index as well). > What to do when a name appears both as column name and index level name, > is another question (both on short and long term). > > Very welcome to work on this if you would be interested! > > Regards, > Joris > > 2017-06-11 22:11 GMT+02:00 Tyler Hardin : > >> Hi, >> >> Is it just a matter of implementing the lookup in the . and [] operator >> methods to return df.index.get_level_values() with the argument, or is >> there some deeper reason this can't work that I'm not seeing? >> >> E.g. 
I want df['date'] to do df.get_level_values('date') when there is no >> date column and there is a level in the index named 'date'. Presumably, >> this would be a secondary lookup w/ columns taking precedence -- i.e. a >> 'date' column would mask a 'date' index. >> >> Thanks, >> Tyler >> >> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> >> > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Thu Jun 15 20:55:28 2017 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 15 Jun 2017 20:55:28 -0400 Subject: [Pandas-dev] Pandas Deferred Expressions In-Reply-To: References: Message-ID: Probably one of the least invasive places where a deferred syntax could be introduced would be in pandas's IO / data access layer. Then we could start to think about simple predicate pushdown in a uniform way, and in some cases this could help avoid materializing huge datasets in memory only to immediately filter them down On Tue, May 30, 2017 at 6:51 PM, Matthew Rocklin wrote: > *(My apologies for chiming in here without intending to do any of the > actual work.)* > > I wonder if there is a half-solution where a small subset of operations > are lazy much in the same way that the current groupby operations are lazy > in Pandas 0.x. If this laziness were extended to a small set of mostly > linear operations (element-wise, filters, aggregations, column projections, > groupbys) then that might hit a few of the bigger optimizations that people > care about without going down the full lazy-relational-algebra-in-python > path. Once you do an operation that is not one of these, we collapse the > lazy dataframe and replace it with a concrete one. 
Slowing extending a > small set of operations may also be doable in an incremental fashion as > needed, which might be an easier transition for a community of users. > > Of course, half-measures can also cause more maintenance costs long term > and may lack optimizations that Pandas devs find valuable. I'm unqualified > to judge the merits of any of these solutions, just thought I'd bring this > up. Feel free to ignore. > > On Tue, May 30, 2017 at 6:28 PM, Phillip Cloud wrote: > >> On Tue, May 30, 2017 at 5:19 PM Phillip Cloud wrote: >> >> Hi all, >>> >>> I'd like to fork part of the thread from Wes's original email about the >>> future of pandas and discuss all things deferred expressions. To start, >>> here's Wes's original thoughts, and a response from Chris Bartak that was >>> in a different thread. After I send this email I'm going to follow up with >>> my own thoughts in a different email so I can address any specific concerns >>> as well as offer up a list of advantages and disadvantages to this approach >>> and lessons learned about building DSLs in Python. >>> >>> *Wes's post:* >>> >>> *TOPIC THREE:* I think we should start developing a "deferred pandas >>> API" that is designed and directly developed by the pandas developer >>> community. From our respective experiences creating expression DSLs and >>> other computation frameworks on top of pandas, I believe this is something >>> where we can build something reasonable and useful. As one concrete problem >>> this would help with: addressing some of the awkwardness around complex >>> groupby-aggregate expressions (custom aggregations would simply be named >>> expressions). >>> >>> The idea of the deferred expression API would be similar to dplyr in R: >>> >> >>> * "True" schemas (we'll have to work around pandas 0.x warts with >>> implicit casts, etc.) 
>>> >>> * Immutable data structures / no mutation outside "amend" operations >>> that change values by returning new objects >>> >>> * Less index-related stuff in this API (perhaps this is controversial, >>> we shall see) >>> >>> We can create an in-memory backend for "pandas expressions" on pandas >>> 0.x/1.0 and separately create an alternative backend using libpandas (once >>> that is more fully baked / functional) -- this will also help provide a >>> forcing function for implementing analytics that are required for >>> implementing the backend. >>> >>> Distributed execution for us is almost certainly out of scope, and even >>> if so we would probably want to offload onto prior art in Dask or >>> elsewhere. So if the dask.dataframe API and the pandas expression API >>> look different in ways that are unpleasant, we could either compile from >>> pandas -> dask under the hood, or make API changes to make the semantics >>> more conforming. >>> >>> When libpandas / pandas 2.0 is more mature we can consider building >>> stronger out-of-core execution (plenty of prior art we can learn from here, >>> e.g. SFrame). >>> >>> As far as tools to implement the deferred expression API -- I will >>> leave this to discussion. I spent a considerable amount of time making a >>> pandas-like expression API for SQL in Ibis (see >>> https://github.com/cloudera/ibis/tree/master/ibis/expr) while I was at >>> Cloudera, so there's some ideas there (like separating the "internal" AST >>> from the "external" user expressions) that we can learn from, or fork >>> or use some of that expression code in some way. I don't have a strong >>> opinion as long as the expressions are as strongly-typed as possible >>> (i.e. tables have schemas, operations have checked input and output types) >>> and catch user errors as soon as feasible. >>> >>> *Chris B's response:* >>> >>> Deferred API >>> >>> Mixed thoughts about this. 
On the one hand, it's obviously a good >>> thing, enables smarter execution, typing/schemas could result in much >>> easier/safer to write code, etc. >> >>> On the other hand, the pandas API is already massive and reasonably >>> difficult to master, and it's a big ask to learn a new one. Dask is a good >>> example of how NOT having a new API can be very valuable. All this to say >>> I think adoption might be pretty low? Could be my own biases - coming from >>> a "smallish data" user of pandas, I've never found the "write once, execute >>> on different backends" argument especially compelling because I've never >>> had the need. >>> >> I agree with the underlying sentiment in Chris's post. If we are going to >> build something new, there need to be very compelling reasons to switch so >> that there's some offset to the switching costs. >> Benefits I see from using expressions that individual users may find >> convincing: >> >> 1. Code correctness guarantees and API clarity using schemas and >> types. >> 1. Operations fail very early and tab completion shows you exactly >> what operations are valid on a particular object. >> 2. Optimizations through expression rewriting (column pruning, >> predicate pushdown). >> 1. We don't need to read every column to select just one. Last >> time I checked nearly all of our IO APIs require reading in all columns to >> do an operation on just a few. >> 3. Somewhat ironically, a much smaller API to learn. >> 1. No indexes, extremely complex slicing or functions that have >> many different ways to do the same thing (like our old friend >> replace). >> >> Reasons that I think individual users will not find convincing: >> >> 1. The ability to run on multiple backends. Many people do not have >> this problem. I suspect the majority of pandas users do *not* have >> this problem. We shouldn't try to convince our users that this is why they >> should switch, nor should we prioritize this aspect of pandas2.
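The column-pruning optimization Phillip mentions (benefit 2 above) can already be done by hand at the IO boundary; an expression layer would simply infer it from the downstream computation. A minimal sketch using read_csv's usecols:

```python
import io
import pandas as pd

csv = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")

# Manual column pruning: parse only the column we actually need,
# instead of materializing all three columns and then selecting one.
a = pd.read_csv(csv, usecols=['a'])
```

Here `usecols` plays by hand the role that automatic pruning would play in a deferred API.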
>> >> Potential pitfalls to adoption with using expressions to build pandas2: >> >> 1. Too dissimilar from current pandas. >> 2. Development getting bogged down in lowest common denominator >> problems (i.e., requiring that every backend implement every operation) >> resulting in an extremely limited API. >> 3. More abstract execution model, and therefore more difficult to >> understand and debug errors. >> >> I personally think we should do the following: >> >> 1. Draft a list of "must-have" operations on DataFrames. >> 2. Use ibis as a base for building experimental pandas deferred >> expressions. >> 3. Forget about supporting "all the backends" and focus on SQL and >> pandas. Make sure that most of our users don't have to care about this >> aspect of pandas. The fact that operations are delayed should be almost >> invisible unless desired. For example, even though we are delaying >> operations internally, the result should appear to be eagerly evaluated. >> The model would be: "write once, execute on pandas only by default, nearly >> invisible to the user". >> 4. Go deep on pandas expressions and add non-SQL-compatible ones if >> necessary to preserve as much of the spec'd-out API as we can. >> 5. Try not to break backwards compatibility with SQL backends, but >> don't require it if it's needed for pandas2. Alternatively, we build the >> pandas backend on top of ibis instead of inside it so that we have even more >> freedom. >> >> I've got a patch up that implements some of the pandas API in ibis here, >> if anyone would like to follow along. >> >> -Phillip
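The "record a small set of operations, then collapse into a concrete frame" model discussed in this thread could look roughly like the toy wrapper below. The names and API are entirely hypothetical, not a proposed pandas or ibis interface:

```python
import pandas as pd

class Deferred:
    """Toy deferred frame: records operations, materializes on demand."""

    def __init__(self, df, ops=()):
        self._df = df
        self._ops = list(ops)

    def filter(self, pred):
        # pred is a function mapping a DataFrame to a boolean mask.
        return Deferred(self._df, self._ops + [lambda d: d[pred(d)]])

    def select(self, *cols):
        return Deferred(self._df, self._ops + [lambda d: d[list(cols)]])

    def collect(self):
        # Collapse the recorded pipeline into a concrete DataFrame.
        out = self._df
        for op in self._ops:
            out = op(out)
        return out

df = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})
result = (Deferred(df)
          .filter(lambda d: d['a'] > 1)
          .select('b')
          .collect())
```

A real implementation would rewrite the recorded pipeline (column pruning, predicate pushdown) before executing it; this sketch only shows the record/collapse mechanics.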
>> _______________________________________________ >> Pandas-dev mailing list >> Pandas-dev at python.org >> https://mail.python.org/mailman/listinfo/pandas-dev >> >> > > _______________________________________________ > Pandas-dev mailing list > Pandas-dev at python.org > https://mail.python.org/mailman/listinfo/pandas-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Wed Jun 21 06:49:21 2017 From: jeffreback at gmail.com (Jeff Reback) Date: Wed, 21 Jun 2017 06:49:21 -0400 Subject: [Pandas-dev] travis slowing down? Message-ID: We are fairly heavy open-source users (plus we have paid additional builds) of Travis CI. Thanks for all of the hard work! We have noticed a considerable slowdown generally on builds, see here from 9 days ago: https://travis-ci.org/pandas-dev/pandas/builds/241842849 total time 2.5 hrs and compare to here (yesterday): https://travis-ci.org/pandas-dev/pandas/builds/244822108 total time 3.5 hrs We have had very minor code changes in this period. In fact, I tried reverting all of this code and build times stayed elevated. I have to conclude that something in Travis itself has changed. Any ideas? Thanks, Jeff Reback pandas development team. -------------- next part -------------- An HTML attachment was scrubbed... URL: From support at travis-ci.com Fri Jun 23 06:09:46 2017 From: support at travis-ci.com (Hiro Asari) Date: Fri, 23 Jun 2017 10:09:46 +0000 Subject: [Pandas-dev] travis slowing down? In-Reply-To: References: Message-ID: -- Please reply above this line -- Hi, Jeff, Thanks for the email. We are sorry to hear about the problem you are having. We are happy to look into it. We have not made any changes to the container-based Precise infrastructure in a while (there is no plan to do so in the foreseeable future, as Precise is an outgoing Ubuntu release).
Could you try restarting the previously-fast build (https://travis-ci.org/pandas-dev/pandas/builds/241842849 [1]) to see how fast it goes? If the change is due to our infrastructure, I'd expect it to be slow. Thank you. Links: ------ [1] https://travis-ci.org/pandas-dev/pandas/builds/241842849 -- Hiro Asari Travis Builder support at travis-ci.com P.S. We have updated our Ubuntu Trusty 14.04 images on Wednesday, June 21st, 2017. Read all about it on our blog [4] and take note that if you use the sudo-enabled Trusty image, you can add "group: deprecated-2017Q2" to use the previous versions. Visit https://www.traviscistatus.com/ [5] for service status and uptime details. _Travis CI GmbH, Rigaer Str. 8, 10247 Berlin, Germany | GF/CEO: Mathias Meyer, Joshua Kalderimis | Contact: contact at travis-ci.org | Amtsgericht Charlottenburg, Berlin, HRB 140133 B | Umsatzsteuer-ID gemäß § 27 a Umsatzsteuergesetz: DE282002648_ Links: ------ [4] https://blog.travis-ci.com/2017-06-21-trusty-updates-2017-Q2-launch [5] https://www.traviscistatus.com/ > On Wed, Jun 21, 2017 at 10:49:30 UTC, Jeff Reback <jeff at reback.net> wrote: > > We are fairly heavy open-source users (plus we have paid additional > builds) of Travis CI. Thanks for all of the hard work! > > We have noticed a considerable slowdown generally on builds, see here > from 9 days ago: > https://travis-ci.org/pandas-dev/pandas/builds/241842849 [1] total > time 2.5 hrs > and compare to here (yesterday): > https://travis-ci.org/pandas-dev/pandas/builds/244822108 [2] total > time 3.5 hrs > we have had very minor code changes in this period. In fact I tried > reverting all of this code and build times stayed elevated.
I have to > conclude that something in travis itself has changed. > any ideas? > thanks > Jeff Reback pandas development team. > > > > Links: > ------ > [1] https://travis-ci.org/pandas-dev/pandas/builds/241842849 > [2] https://travis-ci.org/pandas-dev/pandas/builds/244822108 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: