Renting CPU time for a Python script
Zachary Bortolot
zbortolo at vt.edu
Sat Jul 20 16:16:16 EDT 2002
Hi Fernando,
Thanks for your very informative post! Unfortunately the bottleneck
in my program lies with the PIL extension since I have to perform a
lot of putpixel and getpixel operations, which are quite expensive in
PIL. In the future I would like to rewrite sections of the code in C,
but I have to admit that dealing with LibTiff greatly exceeds my C
programming abilities. I really like your idea of building in
restarting functionality, and I will add it when I get some free time.
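(For the archives: much of the getpixel/putpixel cost comes from making one Python-level call per pixel. A minimal sketch of pulling all pixels out in one call with PIL's real getdata()/putdata() methods instead; invert_gray and the image contents here are purely illustrative:)

```python
from PIL import Image

def invert_gray(im):
    # One getdata() call fetches every pixel at once, instead of
    # one getpixel() round trip per pixel
    pixels = list(im.getdata())
    # A plain list comprehension is far cheaper than per-pixel
    # putpixel() calls
    inverted = [255 - p for p in pixels]
    out = Image.new(im.mode, im.size)
    out.putdata(inverted)
    return out

im = Image.new("L", (4, 4), 100)   # 4x4 gray test image
result = invert_gray(im)
print(result.getpixel((0, 0)))     # 155
```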
Thanks again!
Zack Bortolot
Fernando Pérez <fperez528 at yahoo.com> wrote in message news:<ah9k4o$259$1 at peabody.colorado.edu>...
> Zachary Bortolot wrote:
>
> > Hello Everyone,
> >
> > I am a graduate student conducting research on using computer vision
> > techniques to process digital air photos. As part of my research I am
> > using a genetic optimization routine that I wrote in Python to find
> > values for several key parameters. Unfortunately the genetic
> > algorithm is quite CPU intensive, and each run requires anywhere from
> > five to twelve days to complete on a 450MHz Pentium II. This is
> > problematic since I have to run the program in a computing lab that is
> > frequently used for teaching, which means that I am often unable to
> > complete my runs. I would like to know if anyone has any suggestions
> > on where I might go to rent CPU time that is Python and PIL friendly
> > (the university I am at does have an AIX-based mainframe, but Python
> > is not installed and users are only given 600k of disk space). Speed
> > is not a major issue for me and the program does not use a lot of
> > memory or disk space. However, stability is a definite must. Thanks
> > in advance for any advice or suggestions!
>
> 5 to 12 days is a lot. Have you profiled this thing to find the bottlenecks?
> At that point a few weeks spent on writing the time-critical parts in C would
> be time well spent, I think. On the other hand, if you've already optimized
> this to death and 5 to 12 days is the best you can get, good luck.
>
> At any rate, any algorithm that takes that long to run should incorporate
> checkpointing and restarting capabilities. That way if a machine crashes or
> your lab stops your runs, you only lose a few hours of CPU time. If you can
> save the state of the code in an object, you can even (using pickle) very
> easily move a job from one machine to another in mid-run. The stability of
> your environment should be a non-issue in terms of the survival of your runs.
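(The checkpoint-and-restart idea quoted above can be sketched roughly as follows, in current Python; CHECKPOINT, load_state, and save_state are hypothetical names, and the loop body stands in for one GA generation:)

```python
import os
import pickle

CHECKPOINT = "ga_state.pkl"   # hypothetical checkpoint filename

def load_state(initial_state):
    # Resume from the last checkpoint if one exists
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return initial_state

def save_state(state):
    # Write to a temp file, then rename: a crash mid-write can
    # never corrupt the previous checkpoint
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state({"generation": 0, "population": list(range(10))})
while state["generation"] < 100:
    state["generation"] += 1      # one GA generation would go here
    if state["generation"] % 10 == 0:
        save_state(state)
print(state["generation"])        # 100
```

Because the whole state lives in one picklable object, the same file can be copied to another machine to migrate a run, exactly as suggested above.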
>
> So I'd recommend:
>
> 1- code in automatic checkpointing and self-restarting abilities. It's fairly
> easy to do, and saves a lot of headaches.
>
> 2- profile your code. See if there are any bottlenecks left in Python. Without
> seeing your code I can't say, but if you have numerical bottlenecks, Numeric
> might help. If not, look at weave (scipy.org). It's often enough, and faster
> than writing a full-blown extension by hand. Pyrex is another option. Finally,
> writing the extension yourself isn't really that difficult; SWIG is pretty
> good (weave can also help there quite a bit).
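(A sketch of step 2 with the standard library's profiler; the post predates it, but today's stdlib spells this cProfile/pstats. slow_inner and run are hypothetical stand-ins for the GA's hot path:)

```python
import cProfile
import io
import pstats

def slow_inner(n):
    # Stand-in for an expensive per-pixel operation
    return sum(i * i for i in range(n))

def run():
    total = 0
    for _ in range(200):
        total += slow_inner(500)
    return total

# Profile one run and print the five most expensive calls,
# sorted by cumulative time spent in each function
profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The report makes it obvious which function to rewrite in C (or Numeric/weave/Pyrex) first.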
>
> 3- Once your code is optimized in C and self-restarting, you can distribute
> runs over that lab without any problem. Just have your jobs get their state
> from a server on the network so they can migrate automatically, and you
> should be able to 'fire and forget'. With a central job manager keeping track
> of what's been done, you can just set runs for a month and collect the
> results at the end.
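(A minimal sketch of the 'central job manager' idea in step 3, using a shared directory and atomic renames rather than a real network server. JOB_DIR, DONE_DIR, submit, and claim_and_run are all hypothetical names, and the sum() stands in for an actual GA run:)

```python
import os
import pickle

JOB_DIR = "jobs"    # hypothetical shared directory (e.g. on NFS)
DONE_DIR = "done"

def submit(job_id, params):
    # The manager drops one pickled parameter file per run
    os.makedirs(JOB_DIR, exist_ok=True)
    with open(os.path.join(JOB_DIR, job_id), "wb") as f:
        pickle.dump(params, f)

def claim_and_run(worker):
    # Each lab machine claims a pending job by renaming it; rename is
    # atomic, so two machines can never grab the same job
    os.makedirs(DONE_DIR, exist_ok=True)
    for name in sorted(os.listdir(JOB_DIR)):
        src = os.path.join(JOB_DIR, name)
        claimed = src + "." + worker
        try:
            os.rename(src, claimed)
        except OSError:
            continue   # another machine got it first
        with open(claimed, "rb") as f:
            params = pickle.load(f)
        result = sum(params)        # the real GA run would go here
        with open(os.path.join(DONE_DIR, name), "wb") as f:
            pickle.dump(result, f)
        os.remove(claimed)
        return name, result
    return None

submit("run-001", [1, 2, 3])
print(claim_and_run("machine-a"))   # ('run-001', 6)
```

With the checkpointing from step 1 layered on top, a machine that gets rebooted mid-run simply leaves its checkpoint behind for another worker to resume.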
>
> Cheers,
>
> f.