[pypy-dev] Python vs pypy: interesting performance difference [dict.setdefault]

Antonio Cuni anto.cuni at gmail.com
Fri Aug 12 14:51:36 CEST 2011


Hello David,

On 10/08/11 21:27, David Naylor wrote:
> Hi,
>
> I needed to create a cache of date and time objects and I wondered what was
> the best way to handle the cache.  For comparison I put together the
> following test:
>
[cut]
> PyPy shows a significant slowdown in the defaultdict function but otherwise
> shows its usual speedup.  To find the cause I replaced i.date() with i.day
> and found no major difference in times.  It appears dict.setdefault (or its
> interaction with the JIT) is causing the slowdown.

I don't think that setdefault is the culprit here, as shown by this benchmark:

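import bench  # local timing helper module; not shown in this message

@bench.bench
def setdef():
    # Insert each key with setdefault; every key is new, so every call stores.
    d = {}
    for i in range(10000000):
        d.setdefault(i, i)
    return d

@bench.bench
def tryexcept():
    # Same insertions, but via a failed lookup followed by a store.
    d = {}
    for i in range(10000000):
        try:
            d[i]
        except KeyError:
            d[i] = i
    return d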

setdef()
tryexcept()

$ python dictbench.py
setdef: 2.03 seconds
tryexcept: 8.54 seconds

tmp $ pypy-c dictbench.py
setdef: 1.31 seconds
tryexcept: 1.37 seconds

As you can see, in PyPy there is almost no difference between using
try/except and using setdefault.
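
The bench module used above is not shown in this message. For anyone who wants
to rerun the comparison, here is a minimal sketch of a timing decorator that
prints output in the same "name: N seconds" shape. The decorator body, the n
parameter and the smaller loop count are my assumptions, not the original
helper:

```python
import functools
import time

def bench(func):
    """Hypothetical stand-in for bench.bench: time one call and print it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print("%s: %.2f seconds" % (func.__name__, elapsed))
        return result
    return wrapper

@bench
def setdef(n=100000):
    # Every key is new, so setdefault always stores the default.
    d = {}
    for i in range(n):
        d.setdefault(i, i)
    return d

@bench
def tryexcept(n=100000):
    # Same insertions via a failed lookup followed by a store.
    d = {}
    for i in range(n):
        try:
            d[i]
        except KeyError:
            d[i] = i
    return d

result_setdef = setdef()
result_tryexcept = tryexcept()
```

Both functions build the same dictionary, so any timing gap comes from the
insertion strategy alone. Raise n to 10000000 to match the numbers above.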


What is very slow on PyPy seems to be hashing datetime objects:

import datetime

@bench.bench
def hashdate():
    res = 0
    for i in range(100000):
        now = datetime.datetime.now()
        res ^= hash(now)
    return res

hashdate()

$ pypy-c dictbench.py
hashdate: 0.83 seconds

$ python dictbench.py
hashdate: 0.22 seconds

I had a quick look at the code (which in PyPy is written at applevel, i.e. in
pure Python) and it does a lot of nonsense.  In particular, __hash__ calls
__getstate, which formats a dynamically created string just to call hash() on
it.  I suppose that this code can (and should) be optimized a lot.  I may try
to look at it, but it's unclear when, since I'm about to go on vacation.
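
To illustrate the pattern, here is a rough sketch contrasting the two
approaches. This is not the actual PyPy datetime code: the function names and
the format string are made up for the example, and tzinfo is ignored. It only
shows why building a string on every hash() call is wasteful compared to
hashing the integer fields directly:

```python
import datetime

def hash_via_string(dt):
    # Format a fresh string on every call, then hash it
    # (the kind of work described above).
    state = "%04d-%02d-%02d %02d:%02d:%02d.%06d" % (
        dt.year, dt.month, dt.day,
        dt.hour, dt.minute, dt.second, dt.microsecond)
    return hash(state)

def hash_via_fields(dt):
    # Hash a tuple of the integer fields; no string formatting involved.
    return hash((dt.year, dt.month, dt.day,
                 dt.hour, dt.minute, dt.second, dt.microsecond))

a = datetime.datetime(2011, 8, 12, 14, 51, 36)
b = datetime.datetime(2011, 8, 12, 14, 51, 36)

# Equal datetimes must hash equal under either scheme.
same_string_hash = hash_via_string(a) == hash_via_string(b)
same_fields_hash = hash_via_fields(a) == hash_via_fields(b)
```

Either scheme keeps equal objects hashing equal, which is all a dict needs;
the second just skips the intermediate string.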

ciao,
Anto

