[Pandas-dev] #3089 [PERF: regression from 0.10.1] discussion

Stephen Lin swlin at post.harvard.edu
Wed Mar 20 19:46:25 CET 2013


P.S. Also, "triggering the optimization only when the take operation
is done along the shorter of the two dimensions" is probably more
restrictive than it has to be, but I'm not comfortable hardcoding a
lower-limit size for calling memmove. (I searched online for guidance
on setting such a limit appropriately but couldn't find any; I think
the presumption is usually that the limit doesn't matter as long as
the compiler does the right thing.)
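
To make the axis/layout point concrete, here is a minimal NumPy sketch.
It is only an illustration, not the pandas take code: the helper name
slices_are_contiguous is made up, and it assumes a 2-D array. It checks
whether each slice copied by a take along a given axis is one
contiguous run of memory, which is the case where a block copy such as
memmove could apply.

```python
import numpy as np


def slices_are_contiguous(arr, axis):
    """Hypothetical helper: True if each slice picked by a take along
    `axis` of a 2-D array occupies one contiguous run of memory."""
    # A take along `axis` copies, for each index, the 1-D slice that
    # runs along the *other* axis. That slice is contiguous when the
    # other axis has a stride equal to the item size.
    other = 1 - axis
    return arr.strides[other] == arr.itemsize


# Same values, two memory layouts.
c_arr = np.arange(20, dtype=np.float64).reshape(5, 4)  # C-contiguous
f_arr = np.asfortranarray(c_arr)                       # F-contiguous

indexer = np.array([4, 0, 2])

# The result of take is the same regardless of layout...
taken_c = c_arr.take(indexer, axis=0)
taken_f = f_arr.take(indexer, axis=0)

# ...but only one layout lets each taken row be copied as one block.
print(slices_are_contiguous(c_arr, axis=0))  # rows of a C array
print(slices_are_contiguous(f_arr, axis=0))  # rows of an F array
print(slices_are_contiguous(f_arr, axis=1))  # columns of an F array
```

The same data can therefore hit either the fast or the slow path
depending only on layout, which is why a benchmark built from arrays
of a single layout may not be representative.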

On Wed, Mar 20, 2013 at 1:25 AM, Stephen Lin <swlin at post.harvard.edu> wrote:
> Ahh! I figured it out...the platform issue is part of it, but mostly
> it's that two (independently tested) commits had a weird effect when
> merged.
>
> And the reason they did so is that this particular test (and, it turns
> out, all of our reindexing tests) is testing something very
> non-representative because of the way it's constructed, so
> unfortunately we're not really getting representative performance data
> (it has to do with the DataFrame constructor and C-contiguity vs
> F-contiguity). We should probably write new tests to fix this issue.
>
> I'll write up a fuller explanation when I get a chance. Anyway, sorry
> for sending you on a git bisect goose chase, Jeff.
>
> Stephen
>
> On Wed, Mar 20, 2013 at 1:01 AM, Stephen Lin <swlin at post.harvard.edu> wrote:
>> As per the "we're getting too chatty on GitHub" comment, should we be
>> moving extended issue discussion about bugs to this list whenever
>> possible?
>>
>> I posted a few comments on #3089 just now but realized that starting
>> an e-mail chain might be better.
>>
>> Anyway, I'm looking into the issue. I suspect it's a corner case due
>> to an array that's very large in one dimension but small in another,
>> and possibly that there are compiler and architecture differences
>> causing different results as well. Jeff, do you mind sending me the
>> output of "gcc -dumpmachine" and "gcc -dumpspecs" on the machine
>> you ran vb_suite on?
>>
>> I'll set up a 64-bit dev machine going forward so I can test on both platforms.
>>
>> Thanks,
>> Stephen

