[pypy-dev] Change to the frontpage of speed.pypy.org

Tue Mar 8 20:20:06 CET 2011

Ok, I just committed the changes.

They address two general cases:
- You want to know how fast PyPy is *now* compared to CPython in
different benchmark scenarios, or tasks.
- You want to know how PyPy has been *improving* overall over the last releases.

Both are now answered on the front page, and the reports are now much
less prominent (I didn't change their logic, because that is something I
want to do properly, not just as a hack for speed.pypy).

I have not yet addressed the "smaller is better" point.

I am aware that the wording of the "faster on average" paragraph needs
to be improved (I am discussing it with Holger even now ;). Please chime
in so that we end up with a paragraph that is informative and short
enough without being misleading.

Miquel

2011/3/8 Miquel Torres <tobami at googlemail.com>:
> you mean this timeline, right?:
> http://speed.pypy.org/timeline/?ben=spectral-norm
>
> Because the December 22 result is so high, the y-axis maximum goes up
> to 2.5, leaving less space for the more interesting < 1 range,
> right?
>
> Regarding mozilla, do you mean this site?: http://arewefastyet.com/
> I can see their timelines have some holes, probably failed runs...
>
> I see a problem with the approach you suggest. Entering an arbitrary
> y-axis maximum is not a good thing. I think the onus is on
> the benchmark infrastructure to not send results that aren't
> statistically significant. See JavaStats
> (http://www.elis.ugent.be/en/JavaStats), or ReBench
> (https://github.com/smarr/ReBench).
>
> Something that can be done on the Codespeed side is to treat points
> that have too high a stddev differently. In the aforementioned
> spectral-norm timeline, the stddev "floor" is around 0.0050, while the
> spike has a 0.30 stddev, much higher. A "strict" mode could be
> implemented that invalidates or hides statistically unsound data.
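
A minimal sketch of what such a "strict" mode could look like, assuming each
result arrives as a (value, stddev) pair and that the stddev "floor" can be
estimated as the median stddev of the series. The function name, the
factor of 10, and the sample values are hypothetical (only the 0.0050 floor
and the 0.30 spike come from the timeline discussed above); this is not
Codespeed's actual code.

```python
from statistics import median


def split_by_stddev(points, factor=10.0):
    """Partition (value, stddev) pairs into (sound, suspect) lists.

    A point counts as suspect when its stddev exceeds `factor` times the
    median stddev of the series. Both the median-as-floor estimate and the
    default factor are illustrative assumptions.
    """
    floor = median(sd for _, sd in points)
    limit = floor * factor
    sound = [p for p in points if p[1] <= limit]
    suspect = [p for p in points if p[1] > limit]
    return sound, suspect


# Made-up series: three quiet runs and one noisy spike.
points = [(0.52, 0.0050), (0.51, 0.0048), (2.40, 0.30), (0.50, 0.0052)]
sound, suspect = split_by_stddev(points)
```

A timeline view could then plot only `sound` and either hide the `suspect`
points or draw them with a different marker.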
>
> Btw., I had written to the arewefastyet guys about the possibility of
> configuring a Codespeed instance for them. We may yet see
> collaboration there ;-)
>
> Miquel
>
>
> 2011/3/8 Maciej Fijalkowski <fijall at gmail.com>:
>> On Tue, Mar 8, 2011 at 8:14 AM, Laura Creighton <lac at openend.se> wrote:
>>> In a message of Tue, 08 Mar 2011 09:10:32 +0100, Miquel Torres writes:
>>>>Hi,
>>>>
>>>>I finished the changes to the speed.pypy.org home page last night, but
>>>>alas!, I didn't have time to deploy. I will do it later today and will
>>>>then ping you back.
>>>>
>>>>The extra info provided is really nice as an overview, you will see ;-)
>>>>
>>>>
>>>
>>> Ah good.  Thank you very much.  We spent yesterday afternoon with
>>> the Mozilla engineers, and I got to talk to the person who maintains
>>> the benchmarks for tracemonkey.  He had timelines very much like ours.
>>> There is one feature he has that I would like to have.  Take a look
>>> at the timeline for spectral-norm.  There are two spikes there.
>>> Mozilla has lines like that too, though mostly it is because their
>>> jit decides that the whole benchmark is bogus and optimises out all the
>>> code.  So it takes 0 time.  oops.
>>>
>>> At any rate, aside from knowing that something went horribly wrong with
>>> that rev, you don't really need to know how wrong.  And making the
>>> graph display up to that point means that the dots where things really
>>> do matter get crammed closer together than would otherwise be the case.
>>> So he had a mode where things were displayed with an arbitrary value
>>> at the bottom (in our case it would be the top) which he could specify.
>>> Then the graph would be replotted, with the outliers off the graph, but
>>> making it easier to read the dots for the more normal cases.
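
The replotting described above could be sketched roughly as follows, assuming
the timeline is a plain list of values and the user supplies the y-axis
maximum. The function name and the sample numbers are hypothetical; this is
not how the Mozilla tool or Codespeed implements it.

```python
def clip_to_ymax(values, ymax):
    """Return (plotted, off_scale) for a user-chosen y-axis maximum.

    `plotted` holds each value capped at `ymax`, so the axis never has to
    stretch to fit an outlier. `off_scale` lists the indices of the values
    that were capped, so the plot can mark them as "off the graph".
    """
    plotted = [min(v, ymax) for v in values]
    off_scale = [i for i, v in enumerate(values) if v > ymax]
    return plotted, off_scale


# Made-up timeline with one spike; cap the axis at 1.0.
values = [0.52, 0.51, 2.40, 0.50]
plotted, off_scale = clip_to_ymax(values, ymax=1.0)
```

The capped points still show that something went wrong at that revision,
while the normal points get the full height of the graph.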
>>>
>>> Any chance we could do that too?
>>
>> Link maybe?
>>
>>>
>>> Laura