[SciPy-dev] Fwd: [SciPy-user] MemoryError in scipy_core

Travis Oliphant oliphant at ee.byu.edu
Wed Nov 16 16:18:32 EST 2005


Chris Fonnesbeck wrote:

>>On 11/16/05, Travis Oliphant <oliphant at ee.byu.edu> wrote:
>>
>>>>Good news: the leak appears to be fixed. cbm.py is chugging along at
>>>>constant memory usage (pre-burn-in, of course). Only took 41 messages
>>>>to sort it out!
>>>>
>>>Fantastic.   Your code is actually a pretty good test because apparently
>>>you are using a lot of scalars.  The performance of the array scalars
>>>will be improved as time goes on.  I'd be interested in hearing how your
>>>code performs relative to the Numeric version because you have a lot of
>>>small arrays in your code base --- and a lot of scalars.
>
>Travis,
>
>I have profiled the Numeric-based and scipy-based versions of my code.
>The Numeric version is just under twice as fast. The slowest bits
>appear to be in the numeric.py and oldnumeric.py modules, although the
>f2py stuff appears a little slower also. Here are the two profiles:
>
This seems about right given your heavy use of small arrays. Although it
is all done in C, array creation in scipy core is a bit more complicated,
indexing is more complicated, and the ufuncs require an attribute lookup.
I suspect these three things account for most of the slowdown, although
I'd be thrilled to find other low-hanging fruit.
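
As a rough check on where the extra time goes, the figures in the two
profiles quoted below can be tallied directly. This is only a
back-of-the-envelope sketch: the numbers are copied from those profiles,
and the variable names are chosen here for illustration.

```python
# Totals and per-function "tottime" values copied from the two profiles.
scipy_total = 58.010      # CPU seconds, scipy core run
numeric_total = 34.990    # CPU seconds, Numeric run

# A few of the larger scipy core entries (tottime, in CPU seconds).
scipy_tottime = {
    "numeric.py:67(asarray)": 14.090,
    "oldnumeric.py:220(resize)": 10.490,
    "shape_base.py:115(atleast_1d)": 2.020,
}

slowdown = scipy_total / numeric_total
print(f"scipy core / Numeric run time: {slowdown:.2f}x")

# Largest entries first, with each one's share of the whole scipy core run.
for name, seconds in sorted(scipy_tottime.items(), key=lambda kv: -kv[1]):
    share = seconds / scipy_total
    print(f"{name:32s} {seconds:7.3f} s  {share:5.1%} of scipy run")
```

On these numbers, asarray alone accounts for roughly a quarter of the
scipy core run.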

There are still optimizations possible. For example, defining math
directly on the array scalars rather than going through ufuncs would
probably be a big gain.
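
To illustrate the idea, here is a toy sketch in pure Python. It is not the
scipy core implementation: generic_elementwise, SlowScalar, and FastScalar
are hypothetical names, and generic_elementwise only stands in for a
ufunc-like path that has to set up sequence handling on every call.

```python
import operator


def generic_elementwise(op, a, b):
    """Hypothetical stand-in for a ufunc: handles sequences, so even a
    scalar call pays for wrapping, looping, and unwrapping."""
    a_seq = a if isinstance(a, list) else [a]
    b_seq = b if isinstance(b, list) else [b]
    return [op(x, y) for x, y in zip(a_seq, b_seq)]


class SlowScalar:
    """Scalar whose addition is routed through the generic path."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        out = generic_elementwise(operator.add, self.value, other.value)
        return SlowScalar(out[0])


class FastScalar:
    """Scalar that implements addition directly on its own value."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return FastScalar(self.value + other.value)
```

Both classes give the same answer; the second simply skips the generic
machinery, which is where the gain for scalar-heavy code would come from.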

The other possible optimization is to find a different way to get the
special computation flags than an attribute lookup in the local, global,
and builtin dictionaries. Or perhaps allow some way to bypass the
standard lookup approach and fix the behavior globally.
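
The two approaches can be sketched in pure Python as follows. This is an
assumption-laden illustration, not the scipy core code: the flag name
ERR_FLAGS_DEMO and the helper functions are hypothetical, and
sys._getframe is a CPython-specific way of reaching the caller's frame.

```python
import builtins
import sys

# Hypothetical flag name, used only for this illustration.
FLAG_NAME = "ERR_FLAGS_DEMO"


def lookup_flags(name=FLAG_NAME, default=None):
    """Search the caller's locals, then globals, then builtins.

    Up to three dictionary lookups are paid on every call.
    """
    frame = sys._getframe(1)  # the caller's frame (CPython-specific)
    for namespace in (frame.f_locals, frame.f_globals, vars(builtins)):
        if name in namespace:
            return namespace[name]
    return default


# Alternative: fix the behavior globally in one known place, so that
# reading it is a single access with no scope search.
_GLOBAL_FLAGS = {"value": None}


def set_global_flags(value):
    _GLOBAL_FLAGS["value"] = value


def get_global_flags():
    return _GLOBAL_FLAGS["value"]
```

The scoped lookup lets a single function override the flags locally; the
global setting gives that up in exchange for a cheaper read.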

Profiling should help show where to focus the optimization effort.
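
For anyone wanting to produce a similar profile, here is a minimal sketch
using the standard library's cProfile and pstats modules. The workload
functions are stand-ins invented for this example, not part of MCMC.py.
Sorting by cumulative time rather than by standard name puts the most
expensive call paths at the top.

```python
import cProfile
import io
import pstats


def build_small_lists(n):
    """Stand-in workload: many tiny allocations."""
    return [[i, i + 1] for i in range(n)]


def workload():
    total = 0
    for _ in range(50):
        total += len(build_small_lists(1000))
    return total


# Run the workload under the profiler and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(workload)

# Write the five most expensive entries, by cumulative time, to a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```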

-Travis


Original profiles for scipy developers:

>scipy code:
>
>         1399359 function calls in 58.010 CPU seconds
>
>   Ordered by: standard name
>
>   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>    57005    0.400    0.000    0.400    0.000 :0(append)
>       55    0.350    0.006    0.350    0.006 :0(array)
>    60008    1.610    0.000    1.610    0.000 :0(concatenate)
>     8000    0.060    0.000    0.060    0.000 :0(copy)
>       15    0.000    0.000    0.000    0.000 :0(isinstance)
>   165103    1.350    0.000    1.350    0.000 :0(len)
>    15000    0.220    0.000    0.220    0.000 :0(normal)
>     7544    0.100    0.000    0.100    0.000 :0(random_sample)
>        3    0.000    0.000    0.000    0.000 :0(range)
>    65010    0.470    0.000    0.470    0.000 :0(ravel)
>    60023    2.840    0.000    2.840    0.000 :0(reduce)
>    80010    0.720    0.000    0.720    0.000 :0(reshape)
>     7238    0.100    0.000    0.100    0.000 :0(round)
>        1    0.000    0.000    0.000    0.000 :0(setprofile)
>        9    0.020    0.002    0.020    0.002 :0(sort)
>    60015    1.370    0.000    1.370    0.000 :0(sum)
>       51    0.000    0.000    0.000    0.000 :0(time)
>        3    0.000    0.000    0.000    0.000 :0(transpose)
>    10020    0.220    0.000    0.220    0.000 :0(values)
>    15006    0.440    0.000    0.440    0.000 :0(zip)
>        1    0.000    0.000   58.010   58.010 <string>:1(?)
>    30003    3.410    0.000   28.380    0.001 MCMC.py:1176(poisson_like)
>        2    0.000    0.000    0.000    0.000 MCMC.py:1567(parameter)
>        1    0.000    0.000    0.000    0.000 MCMC.py:1588(node)
>        1    0.010    0.010    1.680    1.680 MCMC.py:1593(summary)
>        1    0.000    0.000    0.050    0.050 MCMC.py:1989(calculate_dic)
>        1    0.000    0.000    0.000    0.000 MCMC.py:2055(__init__)
>    10000    1.750    0.000   35.700    0.004 MCMC.py:2061(test)
>        1    0.860    0.860   58.010   58.010 MCMC.py:2086(sample)
>        1    0.000    0.000    0.000    0.000 MCMC.py:2197(__init__)
>    15002    2.380    0.000   49.710    0.003 MCMC.py:2217(calculate_likelihood)
>        1    0.000    0.000    0.000    0.000 MCMC.py:279(make_indices)
>        3    0.000    0.000    0.000    0.000 MCMC.py:335(__init__)
>    36009    0.280    0.000    0.280    0.000 MCMC.py:352(get_value)
>     8739    0.090    0.000    0.090    0.000 MCMC.py:357(set_value)
>    12000    0.640    0.000    0.920    0.000 MCMC.py:362(tally)
>       39    0.090    0.002    0.430    0.011 MCMC.py:384(get_trace)
>        4    0.000    0.000    0.050    0.012 MCMC.py:396(quantiles)
>       15    0.000    0.000    0.190    0.013 MCMC.py:444(mean)
>        8    1.170    0.146    1.370    0.171 MCMC.py:452(var)
>        4    0.000    0.000    0.710    0.178 MCMC.py:462(mcerror)
>        4    0.020    0.005    0.070    0.017 MCMC.py:470(hpd)
>    15000    0.210    0.000    0.430    0.000 MCMC.py:557(normal_deviate)
>        2    0.000    0.000    0.000    0.000 MCMC.py:592(__init__)
>    10000    0.710    0.000    1.700    0.000 MCMC.py:628(sample_candidate)
>    10000    0.710    0.000   38.470    0.004 MCMC.py:645(propose)
>       18    0.000    0.000    0.000    0.000 MCMC.py:676(tune)
>        1    0.000    0.000    0.000    0.000 MCMC.py:737(__init__)
>     7238    0.110    0.000    0.210    0.000 MCMC.py:756(set_value)
>        1    0.000    0.000    0.000    0.000 MCMC.py:775(__init__)
>    15002    3.550    0.000   18.850    0.001 MCMC.py:820(uniform_like)
>        1    0.000    0.000    0.000    0.000 Matplot.py:36(__init__)
>       15    0.000    0.000    0.010    0.001 function_base.py:127(average)
>   280046   14.090    0.000   14.090    0.000 numeric.py:67(asarray)
>    65008    1.670    0.000    3.120    0.000 oldnumeric.py:131(reshape)
>        3    0.000    0.000    0.000    0.000 oldnumeric.py:179(transpose)
>        9    0.000    0.000    0.020    0.002 oldnumeric.py:187(sort)
>    60008   10.490    0.000   21.470    0.000 oldnumeric.py:220(resize)
>    65010    1.500    0.000    3.300    0.000 oldnumeric.py:258(ravel)
>    45008    0.600    0.000    1.270    0.000 oldnumeric.py:270(shape)
>    60015    1.380    0.000    3.510    0.000 oldnumeric.py:289(sum)
>        9    0.000    0.000    0.000    0.000 oldnumeric.py:348(rank)
>        0    0.000             0.000          profile:0(profiler)
>        1    0.000    0.000   58.010   58.010 profile:0(sampler=DisasterSampler(); sampler.sample(iterations=5000,burn=1000,verbose=False,plot=False))
>    45005    2.020    0.000   13.600    0.000 shape_base.py:115(atleast_1d)
>
>Numeric code:
>
>         1236421 function calls in 34.990 CPU seconds
>
>   Ordered by: standard name
>
>   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>    12000    0.060    0.000    0.060    0.000 :0(append)
>    22787    0.270    0.000    0.270    0.000 :0(apply)
>    45062    4.630    0.000    4.630    0.000 :0(array)
>    60009    1.050    0.000    1.050    0.000 :0(concatenate)
>     4000    0.010    0.000    0.010    0.000 :0(copy)
>    22802    0.260    0.000    0.260    0.000 :0(isinstance)
>   277933    2.460    0.000    2.460    0.000 :0(len)
>        3    0.000    0.000    0.000    0.000 :0(range)
>   105038    1.930    0.000    1.930    0.000 :0(reduce)
>   130020    3.320    0.000    3.320    0.000 :0(reshape)
>     7440    0.100    0.000    0.100    0.000 :0(round)
>        1    0.000    0.000    0.000    0.000 :0(setprofile)
>        9    0.010    0.001    0.010    0.001 :0(sort)
>       51    0.000    0.000    0.000    0.000 :0(time)
>        3    0.000    0.000    0.000    0.000 :0(transpose)
>    10011    0.180    0.000    0.180    0.000 :0(values)
>    15006    0.200    0.000    0.200    0.000 :0(zip)
>        1    0.000    0.000   34.990   34.990 <string>:1(?)
>    30004    2.850    0.000   15.410    0.001 MCMC.py:1179(poisson_like)
>        2    0.000    0.000    0.000    0.000 MCMC.py:1579(parameter)
>        1    0.000    0.000    0.000    0.000 MCMC.py:1600(node)
>        1    0.000    0.000    0.660    0.660 MCMC.py:1605(summary)
>        1    0.000    0.000    0.020    0.020 MCMC.py:2001(calculate_dic)
>        1    0.000    0.000    0.000    0.000 MCMC.py:2067(__init__)
>    10000    0.650    0.000   19.840    0.002 MCMC.py:2073(test)
>        1    0.590    0.590   34.990   34.990 MCMC.py:2098(sample)
>        1    0.000    0.000    0.000    0.000 MCMC.py:2210(__init__)
>    15002    1.380    0.000   27.670    0.002 MCMC.py:2230(calculate_likelihood)
>        1    0.000    0.000    0.000    0.000 MCMC.py:271(make_indices)
>        3    0.000    0.000    0.000    0.000 MCMC.py:327(__init__)
>    40000    0.360    0.000    0.360    0.000 MCMC.py:344(get_value)
>     9014    0.060    0.000    0.060    0.000 MCMC.py:349(set_value)
>    12000    0.640    0.000    0.870    0.000 MCMC.py:354(tally)
>       39    0.080    0.002    0.380    0.010 MCMC.py:376(get_trace)
>        4    0.000    0.000    0.050    0.012 MCMC.py:388(quantiles)
>       15    0.000    0.000    0.140    0.009 MCMC.py:436(mean)
>        8    0.200    0.025    0.430    0.054 MCMC.py:444(var)
>        4    0.010    0.003    0.280    0.070 MCMC.py:454(mcerror)
>        4    0.000    0.000    0.030    0.008 MCMC.py:462(hpd)
>    15000    0.250    0.000    1.710    0.000 MCMC.py:544(normal_deviate)
>        2    0.000    0.000    0.000    0.000 MCMC.py:579(__init__)
>    10000    0.560    0.000    2.790    0.000 MCMC.py:622(sample_candidate)
>    10000    0.560    0.000   23.590    0.002 MCMC.py:639(propose)
>       18    0.010    0.001    0.010    0.001 MCMC.py:670(tune)
>        1    0.000    0.000    0.000    0.000 MCMC.py:731(__init__)
>     7440    0.150    0.000    0.250    0.000 MCMC.py:757(set_value)
>        1    0.000    0.000    0.000    0.000 MCMC.py:776(__init__)
>    15002    2.090    0.000   10.770    0.001 MCMC.py:821(uniform_like)
>        1    0.000    0.000    0.000    0.000 Matplot.py:36(__init__)
>    60009    1.210    0.000    2.260    0.000 Numeric.py:231(concatenate)
>        3    0.000    0.000    0.000    0.000 Numeric.py:243(transpose)
>        9    0.000    0.000    0.010    0.001 Numeric.py:252(sort)
>    60009    4.260    0.000   13.350    0.000 Numeric.py:415(resize)
>    65011    0.830    0.000    2.840    0.000 Numeric.py:583(ravel)
>    60016    2.030    0.000    3.020    0.000 Numeric.py:634(sum)
>        9    0.000    0.000    0.000    0.000 Numeric.py:738(rank)
>    45018    0.400    0.000    0.400    0.000 Numeric.py:744(shape)
>       15    0.000    0.000    0.000    0.000 Numeric.py:762(average)
>    22787    0.760    0.000    1.490    0.000 RandomArray.py:37(_build_random_array)
>     7787    0.120    0.000    0.610    0.000 RandomArray.py:54(random)
>    15000    0.270    0.000    1.270    0.000 RandomArray.py:87(standard_normal)
>    15000    0.190    0.000    1.460    0.000 RandomArray.py:93(normal)
>        0    0.000             0.000          profile:0(profiler)
>        1    0.000    0.000   34.990   34.990 profile:0(sampler=DisasterSampler(); sampler.sample(iterations=5000,burn=1000,verbose=False,plot=False))
>
>
>--
>Chris Fonnesbeck
>Atlanta, GA