From tom.augspurger88 at gmail.com  Fri Aug 18 06:57:29 2017
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Fri, 18 Aug 2017 05:57:29 -0500
Subject: [Pandas-dev] August 2017 Developer Meeting
Message-ID:

Hi all,

We're holding a developer meeting at 2:00 PM EDT / 6:00 PM UTC today.

You're welcome to join at
https://plus.google.com/hangouts/_/calendar/dG9tLmF1Z3NwdXJnZXI4OEBnbWFpbC5jb20.6i17sn8l2g2js2tog1q8cgrqts?authuser=0
or view the minutes at
https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit

Tom
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tom.augspurger88 at gmail.com  Sun Aug 20 08:21:12 2017
From: tom.augspurger88 at gmail.com (Tom Augspurger)
Date: Sun, 20 Aug 2017 07:21:12 -0500
Subject: [Pandas-dev] Benchmark updates
Message-ID:

Hi everyone,

I did some work on the benchmark running & publishing yesterday. The
results are now hosted at http://pandas.pydata.org/speed/, so pandas'
benchmarks are at http://pandas.pydata.org/speed/pandas. An RSS feed of
regressions is available at
http://pandas.pydata.org/speed/pandas/regressions.xml. I plan to track
that and manually open issues if they seem legitimate.

The runs are now triggered and monitored by Apache Airflow (instead of
the cron job I had set up). This gives us a nice dashboard with the
ability to view logs and see when benchmarks fail (and, eventually,
email alerts). The dashboard isn't exposed publicly. If you have SSH
access to the benchmark server, you can create a tunnel to port 8080:

    ssh -L 8080:localhost:8080 pandas at panda.likescandy.com

Tom

From wesmckinn at gmail.com  Tue Aug 29 15:58:59 2017
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 29 Aug 2017 15:58:59 -0400
Subject: [Pandas-dev] Benchmark updates
In-Reply-To:
References:
Message-ID:

hey Tom,

This is really great.
Any chance we can create a wiki or README about the configuration, in
the event that the Airflow config needs to be recreated or changed?
Thanks again for setting this up, and for keeping my coat closet (where
the machine is located) toasty.

One minor thing with the benchmarking: I've noticed that default Linux
configs can be a little aggressive about throttling the CPU frequency.
This can be edited in the cpufrequtils script, but at least on my laptop
and desktop (Ubuntu 14.04) I find myself having to run
"/etc/init.d/cpufrequtils restart" to get it to disable frequency
scaling. This should probably happen at boot time, but I'm not sure yet
how to do that. We might want to document this so that we get the
best-quality performance data out of the machine.

- Wes

On Sun, Aug 20, 2017 at 8:21 AM, Tom Augspurger wrote:
> Hi everyone,
>
> I did some work on the benchmark running & publishing yesterday. The
> results are now hosted at http://pandas.pydata.org/speed/, so pandas'
> benchmarks are at http://pandas.pydata.org/speed/pandas.
> An RSS feed of regressions is available at
> http://pandas.pydata.org/speed/pandas/regressions.xml. I plan to track
> that and manually open issues if they seem legitimate.
>
> The runs are now triggered and monitored by Apache Airflow (instead of
> the cron job I had set up). This gives us a nice dashboard with the
> ability to view logs and see when benchmarks fail (and, eventually,
> email alerts). The dashboard isn't exposed publicly. If you have SSH
> access to the benchmark server, you can create a tunnel to port 8080:
>
>     ssh -L 8080:localhost:8080 pandas at panda.likescandy.com
>
> Tom
>
> _______________________________________________
> Pandas-dev mailing list
> Pandas-dev at python.org
> https://mail.python.org/mailman/listinfo/pandas-dev
>

From nesdis at gmail.com  Thu Aug 31 09:12:19 2017
From: nesdis at gmail.com (Siddarth Sen)
Date: Thu, 31 Aug 2017 13:12:19 -0000
Subject: [Pandas-dev] Make the underlying data structure of a sparse DataFrame a sparse matrix instead of sparse Series
Message-ID:

Hi,

I would like to consider the option of converting the underlying
structure of a sparse DataFrame to a sparse matrix instead of multiple
sparse Series, in the case where all the columns of the DataFrame have
the same dtype. This would make row/column slicing of the DataFrame much
faster than it currently is.
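The layout difference behind this proposal can be sketched outside of pandas entirely. The snippet below is a hypothetical illustration (not pandas' actual internals), assuming scipy and numpy are available: it contrasts column-oriented sparse storage (one sparse vector per column, roughly analogous to a DataFrame backed by sparse Series) with a single scipy.sparse CSR matrix for the whole same-dtype block, where extracting a row is one indexing operation rather than a loop over every column.

```python
import numpy as np
from scipy import sparse

# A mostly-zero dataset with a single dtype across all columns.
rng = np.random.RandomState(0)
dense = rng.rand(1000, 50)
dense[dense < 0.99] = 0.0  # roughly 99% of entries are zero

# Column-oriented storage: one sparse column vector per column,
# loosely analogous to a DataFrame of sparse Series.
columns = [sparse.csc_matrix(dense[:, [j]]) for j in range(dense.shape[1])]

# Matrix-oriented storage: one CSR matrix holding the whole block.
mat = sparse.csr_matrix(dense)

i = 123

# Slicing row i from the per-column layout has to touch every column
# object in a Python-level loop...
row_from_columns = np.hstack([col[i].toarray().ravel() for col in columns])

# ...while CSR row slicing is a single indexing operation on one object.
row_from_matrix = mat[i].toarray().ravel()

assert np.array_equal(row_from_columns, row_from_matrix)
```

Both layouts yield the same row values; the difference is that the per-column path pays Python-loop and per-object overhead proportional to the number of columns, which is the slicing cost the proposal aims to avoid when all columns share a dtype.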