[Tutor] Memory Management etc

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Thu May 4 19:31:41 CEST 2006


> I have a problem with a large programme that uses a lot of memory 
> (numerous large arrays which are dynamically extended).  I keep getting 
> unpredictable crashes (on a machine with 1/2 GB memory) whereas on my 
> laptop (1 GB memory) it seems OK.

Hi Philip,

Can you be more specific about what you mean by "crash"?  Does the whole 
operating system freeze, or do you get an error message from Python, or 
...?

The reason I ask is that what you see may have nothing to do with Python: 
there may be lower-level issues, such as failing physical hardware.  So 
more information about the symptoms would be very helpful.


> There are no debugger messages; it just crashes (and reboots the machine 
> more often than not).

Ok, just to make sure I understand: the machine physically reboots, 
without any user prompting?  If so, that's almost certainly NOT Python.  A 
spontaneous reboot means that even your operating system is having 
difficulty keeping the machine usable.  The most likely explanation in 
that circumstance is defective hardware.

I'd recommend running diagnostics like a RAM checker.  Also try running 
your program on another machine: that gives you an extra data point on 
whether the hardware, and not the software, is the issue.


> I have to say I have noticed (the programme is basically a 
> batch-factoring programme for integers) that no matter how I tune gc I 
> can't get it to reliably free memory in between factoring each integer.

How are you measuring this?

Note that Python does not necessarily give allocated memory back to the 
operating system: it keeps a memory pool that it reuses for performance 
reasons.

Is the amount of memory you're using at least bounded?
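If you want a concrete number, on Unix-like systems you can watch the 
process's peak memory from inside Python.  Here's a rough sketch; note 
that the resource module is Unix-only, and the units of ru_maxrss vary by 
platform:

    import resource

    def peak_memory():
        # Peak resident set size so far; units are platform-dependent
        # (kilobytes on Linux, bytes on Mac OS X).
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    before = peak_memory()
    # ... factor one integer here ...
    after = peak_memory()
    print("peak before: %d, peak after: %d" % (before, after))

If that number levels off after the first few integers, your memory use is 
bounded and the pool behavior above is probably what you're seeing.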


> Because this is a development programme (and for performance reasons) I 
> use global variables some of which refer to the large arrays.

Global variables don't necessarily make programs fast.  I would strongly 
discourage this kind of ad-hoc performance optimization.  Don't guess: let 
the machine tell you where the program is slow. If you really want to make 
your program fast, use a profiler.

Also note that passing a parameter from one function to another is a 
constant-time operation: no object values are copied, only references.  So 
the cost assumptions you're making about passing large arrays between 
functions may not be correct.
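
Here's a toy sketch of what I mean: the function below mutates the very 
list the caller passed in, because the parameter is just another name for 
the same object, not a copy of it.

    def add_element(arr):
        # 'arr' is a second name for the caller's list; nothing was copied.
        arr.append(42)

    big_list = [0] * 1000000
    add_element(big_list)
    print(len(big_list))    # 1000001 -- the caller sees the mutation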



> 1)  Does the mere fact that a function cites a variable as global create 
> a reference which prevents it being collected by gc?

The 'global' declaration by itself isn't a contributing factor.  But 
values bound to global names at the top level don't die: that's the point 
of global variables.  The module keeps at least one reference to them, so 
they stay reachable (and uncollectable) until you rebind or delete the 
name.
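
If you want one of your large global arrays to become collectable between 
factorizations, you have to drop the module-level reference yourself.  A 
sketch, with a hypothetical name big_array:

    import gc

    big_array = [0] * 1000000    # kept alive by the module-level name

    def release_big_array():
        global big_array
        big_array = None    # drop the last reference to the old list;
                            # CPython frees it immediately via refcounting
        gc.collect()        # only needed if reference cycles are involved

    release_big_array()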


> 3)  Is there any way to allocate a pool of available memory to my 
> programme at the outset from which I can allocate my large arrays?

Practically everything in Python happens at runtime, not compile time.  At 
program startup, though, you could preallocate some arrays and keep a pool 
of them for your own use, as sketched below.
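
A rough sketch of that idea, with made-up sizes; it recycles fixed-size 
scratch lists instead of letting them be reallocated for every integer:

    class ArrayPool(object):
        """Hand out preallocated scratch lists and take them back."""

        def __init__(self, count, size):
            self._size = size
            self._free = [[0] * size for _ in range(count)]

        def acquire(self):
            # Reuse a preallocated list if one is free, else allocate.
            if self._free:
                return self._free.pop()
            return [0] * self._size

        def release(self, arr):
            # Return the scratch list to the pool for the next integer.
            self._free.append(arr)

    pool = ArrayPool(count=4, size=1000000)
    scratch = pool.acquire()
    # ... use 'scratch' while factoring one integer ...
    pool.release(scratch)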

But are you finding allocation to be a significant factor in your program?  
Again, before you go ahead with optimization, I'd strongly recommend using 
a profiler to do a principled analysis of the hotspots in your program.  
Have you looked at the Python Profiler yet?



> I'm keen to solve this because I would like to make my programme 
> generally available - in every other respect it's nearly complete and 
> massively outperforms the only other comparable pure Python module 
> (nzmath) which does the same job.

If you would like a code review, and if the program is short enough, I'm 
sure people here would be happy to give some pointers.

Good luck to you!

