python in parallel for pattern discovery in genome data

Tim Churches tchur at optushome.com.au
Wed Jul 30 07:44:29 EDT 2003


BalyanM wrote:
> Sent: Wednesday, 30 July 2003 9:19 PM
> To: python-list at python.org
> Subject: python in parallel for pattern discovery in genome data
> 
> Hi,
> 
> I am new to Python. I am using it on Red Hat Linux 9.
> I would like to run Python on a Sun
> machine (Sun E420R, OS: Solaris) with 4 CPUs for a pattern
> discovery/search program on biological (genomic)
> sequences. I want to write the Python code so that it uses
> all 4 CPUs. Do I also need any other libraries?
> Kindly advise.

Manoj,

Have a look at PyPar, which is Ole Nielsen's excellent Python wrapper
for the MPI parallel library. It is very easy to use, and Ole gives some
nice examples. See http://datamining.anu.edu.au/~ole/pypar/ for details;
that page also lists some other Python wrappers for MPI. We have been using PyPar
with excellent results to parallelise probabilistic record linkage
problems, running on a 4 CPU SMP Sun server under Solaris, just like
yours. Some example code for doing this can be found in the latest
Febrl release at http://datamining.anu.edu.au/projects/linkage.html
However, PyPar and MPI (in particular the LAM/MPI distribution) also
work well on clusters of Linux machines. I think that LAM/MPI is
available as a pre-packaged RPM for Red Hat Linux, which makes
installation even easier.
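
To give a rough idea of how a sequence search might be divided among 4
CPUs, here is a minimal sketch of the usual MPI-style decomposition. It
is not taken from PyPar or Febrl: the function names chunk_bounds and
count_motif are hypothetical, the motif is assumed to be non-empty, and
the ranks are simulated in an ordinary loop so the code runs without
MPI. In a real MPI program each process would obtain its own rank and
the process count from the MPI wrapper (in PyPar, as far as I know,
via pypar.rank() and pypar.size()) and the partial counts would be sent
back to one process to be summed.

```python
def chunk_bounds(seq_len, n_procs, rank, overlap):
    """Return (start, end) of the slice that process `rank` should scan."""
    # Split the sequence as evenly as possible; the first `extra`
    # processes each take one additional position.
    base, extra = divmod(seq_len, n_procs)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    # Extend the slice by `overlap` so that matches straddling a chunk
    # boundary are still seen by the process that owns their start.
    return start, min(seq_len, end + overlap)


def count_motif(sequence, motif, n_procs, rank):
    """Count (overlapping) occurrences of `motif` that start in this rank's chunk."""
    start, end = chunk_bounds(len(sequence), n_procs, rank, len(motif) - 1)
    chunk = sequence[start:end]
    last_start = len(chunk) - len(motif)
    return sum(1 for i in range(last_start + 1) if chunk.startswith(motif, i))


# Simulate 4 processes in a loop; under MPI each rank would run
# count_motif once with its own rank number.
sequence = "ACGACGACGTTACG"
motif = "ACGACG"
n_procs = 4
counts = [count_motif(sequence, motif, n_procs, r) for r in range(n_procs)]
total = sum(counts)
print(counts, total)
```

Overlapping each chunk by len(motif) - 1 characters means every match
is counted by exactly one process, so the partial counts can simply be
added together.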

Just out of interest, what types of pattern matching algorithms are you
using?

Hope this helps,

Tim C






