Renting CPU time for a Python script
Zachary Bortolot
zbortolo at vt.edu
Sat Jul 20 16:16:16 EDT 2002
Hi Fernando,
Thanks for your very informative post! Unfortunately the bottleneck
in my program lies with the PIL extension since I have to perform a
lot of putpixel and getpixel operations, which are quite expensive in
PIL. In the future I would like to rewrite sections of the code in C,
but I have to admit that dealing with LibTiff greatly exceeds my C
programming abilities. I really like your idea of building in
restarting functionality, and I will add it when I get some free time.
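(For the archives: much of the getpixel/putpixel cost comes from making one Python-level call per pixel. A minimal sketch of pulling all pixels out in one call with PIL's real getdata()/putdata() methods instead; invert_gray and the image contents here are purely illustrative:)

```python
from PIL import Image

def invert_gray(im):
    # One getdata() call fetches every pixel at once, instead of
    # one getpixel() round trip per pixel
    pixels = list(im.getdata())
    # A plain list comprehension is far cheaper than per-pixel
    # putpixel() calls
    inverted = [255 - p for p in pixels]
    out = Image.new(im.mode, im.size)
    out.putdata(inverted)
    return out

im = Image.new("L", (4, 4), 100)   # 4x4 gray test image
result = invert_gray(im)
print(result.getpixel((0, 0)))     # 155
```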
Thanks again!
Zack Bortolot
Fernando Pérez <fperez528 at yahoo.com> wrote in message news:<ah9k4o$259$1 at peabody.colorado.edu>...
> Zachary Bortolot wrote:
>
> > Hello Everyone,
> >
> > I am a graduate student conducting research on using computer vision
> > techniques to process digital air photos. As part of my research I am
> > using a genetic optimization routine that I wrote in Python to find
> > values for several key parameters. Unfortunately the genetic
> > algorithm is quite CPU intensive, and each run requires anywhere from
> > five to twelve days to complete on a 450MHz Pentium II. This is
> > problematic since I have to run the program in a computing lab that is
> > frequently used for teaching, which means that I am often unable to
> > complete my runs. I would like to know if anyone has any suggestions
> > on where I might go to rent CPU time that is Python and PIL friendly
> > (the university I am at does have an AIX-based mainframe, but Python
> > is not installed and users are only given 600k of disk space). Speed
> > is not a major issue for me and the program does not use a lot of
> > memory or disk space. However, stability is a definite must. Thanks
> > in advance for any advice or suggestions!
>
> 5 to 12 days is a lot. Have you profiled this thing to find the bottlenecks?
> At that point a few weeks spent on writing the time-critical parts in C would
> be time well spent, I think. On the other hand, if you've already optimized
> this to death and 5 to 12 days is the best you can get, good luck.
>
> At any rate, any algorithm that takes that long to run should incorporate
> checkpointing and restarting capabilities. That way if a machine crashes or
> your lab stops your runs, you only lose a few hours of CPU time. If you can
> save the state of the code in an object, you can even (using pickle) very
> easily move a job from one machine to another in mid-run. The stability of
> your environment should be a non-issue in terms of the survival of your runs.
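(The checkpoint-and-restart idea quoted above can be sketched roughly as follows, in current Python; CHECKPOINT, load_state, and save_state are hypothetical names, and the loop body stands in for one GA generation:)

```python
import os
import pickle

CHECKPOINT = "ga_state.pkl"   # hypothetical checkpoint filename

def load_state(initial_state):
    # Resume from the last checkpoint if one exists
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return initial_state

def save_state(state):
    # Write to a temp file, then rename: a crash mid-write can
    # never corrupt the previous checkpoint
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state({"generation": 0, "population": list(range(10))})
while state["generation"] < 100:
    state["generation"] += 1      # one GA generation would go here
    if state["generation"] % 10 == 0:
        save_state(state)
print(state["generation"])        # 100
```

Because the whole state lives in one picklable object, the same file can be copied to another machine to migrate a run, exactly as suggested above.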
>
> So I'd recommend:
>
> 1- code in automatic checkpointing and self-restarting abilities. It's fairly
> easy to do, and saves a lot of headaches.
>
> 2- profile your code. See if there are any bottlenecks left in Python. Without
> seeing your code I can't say, but if you have numerical bottlenecks, Numeric
> might help. If not, look at weave (scipy.org). It's often enough, and faster
> than writing a full-blown extension by hand. Pyrex is another option. Finally,
> writing the extension yourself isn't really that difficult; SWIG is pretty
> good (weave can also help there quite a bit).
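(A sketch of step 2 with the standard library's profiler; the post predates it, but today's stdlib spells this cProfile/pstats. slow_inner and run are hypothetical stand-ins for the GA's hot path:)

```python
import cProfile
import io
import pstats

def slow_inner(n):
    # Stand-in for an expensive per-pixel operation
    return sum(i * i for i in range(n))

def run():
    total = 0
    for _ in range(200):
        total += slow_inner(500)
    return total

# Profile one run and print the five most expensive calls,
# sorted by cumulative time spent in each function
profiler = cProfile.Profile()
profiler.enable()
run()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The report makes it obvious which function to rewrite in C (or Numeric/weave/Pyrex) first.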
>
> 3- Once your code is optimized in C and self-restarting, you can distribute
> runs over that lab without any problem. Just have your jobs get their state
> from a server on the network so they can migrate automatically, and you
> should be able to 'fire and forget'. With a central job manager keeping track
> of what's been done, you can just set runs for a month and collect the
> results at the end.
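(A minimal sketch of the 'central job manager' idea in step 3, using a shared directory and atomic renames rather than a real network server. JOB_DIR, DONE_DIR, submit, and claim_and_run are all hypothetical names, and the sum() stands in for an actual GA run:)

```python
import os
import pickle

JOB_DIR = "jobs"    # hypothetical shared directory (e.g. on NFS)
DONE_DIR = "done"

def submit(job_id, params):
    # The manager drops one pickled parameter file per run
    os.makedirs(JOB_DIR, exist_ok=True)
    with open(os.path.join(JOB_DIR, job_id), "wb") as f:
        pickle.dump(params, f)

def claim_and_run(worker):
    # Each lab machine claims a pending job by renaming it; rename is
    # atomic, so two machines can never grab the same job
    os.makedirs(DONE_DIR, exist_ok=True)
    for name in sorted(os.listdir(JOB_DIR)):
        src = os.path.join(JOB_DIR, name)
        claimed = src + "." + worker
        try:
            os.rename(src, claimed)
        except OSError:
            continue   # another machine got it first
        with open(claimed, "rb") as f:
            params = pickle.load(f)
        result = sum(params)        # the real GA run would go here
        with open(os.path.join(DONE_DIR, name), "wb") as f:
            pickle.dump(result, f)
        os.remove(claimed)
        return name, result
    return None

submit("run-001", [1, 2, 3])
print(claim_and_run("machine-a"))   # ('run-001', 6)
```

With the checkpointing from step 1 layered on top, a machine that gets rebooted mid-run simply leaves its checkpoint behind for another worker to resume.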
>
> Cheers,
>
> f.