[pypy-dev] Benchmarking PyPy performance on real-world Django app

Igor Katson igor.katson at gmail.com
Sat Oct 8 01:28:05 CEST 2011


On 10/08/2011 02:50 AM, Maciej Fijalkowski wrote:
> On Sat, Oct 8, 2011 at 12:48 AM, Andy<angelflow at yahoo.com>  wrote:
>> 15 times more memory? That's a lot.
>> Interestingly Quora reported that their PyPy processes were only 50% larger
>> than CPython ones:
>> http://www.quora.com/Quora-Infrastructure/Did-Quoras-switch-to-PyPy-result-in-increased-memory-consumption
>>
>> "our PyPy worker processes themselves take approximately 50% more memory
>> than our equivalent CPython worker processes, although we did not do a large
>> amount of tuning of the GC. Regardless, this wasn't the main cause of our
>> memory blowup.
>> "In our development, we found that certain functions were not worth being
>> ported from their C libraries to pure Python, things like crypto, lxml,
>> PyML, and a couple other random libraries. Our solution for those functions
>> was to run a parallel CPython process that would do nothing but take
>> arguments via an execnet channel, and output return values via the same
>> execnet channel.
>>
>> "The overhead for some of these Python processes, especially for the ones
>> that required a lot of state (for example, PyML) is comparable to the amount
>> of memory taken by the master PyPy process, effectively causing a 2-3x
>> blowup in memory just to maintain the CPython processes; this is our main
>> memory sink for our PyPy branch."
>> ----
>> I wonder what accounts for this large difference in PyPy memory consumption
>> (50% more vs. 1,400% more). What type of "large amount of tuning of the GC"
>> did Quora do?
> I think this is a bug, but also different stack was used right?
> Indeed, pypy should not use much more than 2x of CPython usage, I
> would like to give it a go if you can come up with a small
> reproducible example.
>
> Cheers,
> fijal
Yeah, I will send you the test suite in a while. It is a slightly
different setup: the same site with no data and sqlite instead of pypq.
It's clear that the memory usage is also huge there, though far more
requests are needed to bump it to 200 MB. CPython memory usage stays
constant.
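
For anyone who wants to reproduce this kind of measurement, here is a rough
sketch of sampling memory while requests are served. It is only an
illustration, not the test suite mentioned above: `fake_request` is a
hypothetical stand-in for one request to the site, the helper names are made
up, and it measures the current process with the stdlib `resource` module
(Unix only), whereas a real run would watch the server worker under both
interpreters.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def sample_memory(handle_request, total_requests, sample_every):
    """Call handle_request repeatedly and record (request_count, peak_rss_mb)."""
    samples = []
    for count in range(1, total_requests + 1):
        handle_request()
        if count % sample_every == 0:
            samples.append((count, peak_rss_mb()))
    return samples


def fake_request():
    # Hypothetical stand-in for one request to the site under test.
    return [str(i) for i in range(1000)]


if __name__ == "__main__":
    for count, rss in sample_memory(fake_request, 5000, 1000):
        print("%6d requests: peak RSS %.1f MB" % (count, rss))
```

If the peak keeps climbing with the request count under one interpreter and
stays flat under the other, that would match the pattern described above.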

