python in parallel for pattern discovery in genome data
Tim Churches
tchur at optushome.com.au
Wed Jul 30 07:44:29 EDT 2003
BalyanM wrote:
> Sent: Wednesday, 30 July 2003 9:19 PM
> To: python-list at python.org
> Subject: python in parallel for pattern discovery in genome data
>
> Hi,
>
> I am new to python.I am using it on redhat linux 9.
> I am interested to run python on a sun
> machine(SunE420R,os=solaris) with 4 cpu's for a pattern
> discovery/search program on biological sequence(genomic
> sequence).I want to write the python code so that it utilizes
> all the 4 cpu's.Moreover do i need some other libraries.
> Kindly advice.
Manoj,
Have a look at PyPar, which is Ole Nielsen's excellent Python wrapper
for the MPI parallel library. It is very easy to use, and Ole gives some
nice examples. See http://datamining.anu.edu.au/~ole/pypar/. That page
also lists some other Python wrappers for MPI. We have been using PyPar
with excellent results to parallelise probabilistic record linkage
problems, running on a 4 CPU SMP Sun server under Solaris, just like
yours. Some example code for doing this can be found in the latest
Febrl release at http://datamining.anu.edu.au/projects/linkage.html.
However, PyPar and MPI (in particular the LAM/MPI distribution) also
work well on clusters of Linux machines. I think that LAM/MPI is
available as a pre-packaged RPM for RedHat Linux, which makes
installation even easier.
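
To give a rough idea of the approach, here is a minimal sketch (not taken
from Febrl or from Ole's examples). It assumes PyPar is installed on top of
an MPI library, that every process can hold the whole sequence in memory,
and that the "pattern" is a plain exact-match motif, which is only a
stand-in because I don't know your algorithm. The function names block,
count_motif and parallel_count are made up for illustration. If PyPar
cannot be imported, the script simply runs as a single process.

```python
# Sketch only: each MPI process counts motif hits that start in its own
# block of the sequence; process 0 adds up the partial counts.
try:
    import pypar  # assumed to be installed together with an MPI library
    rank, size = pypar.rank(), pypar.size()
except ImportError:
    pypar = None  # fall back to a single process
    rank, size = 0, 1


def block(n, nprocs, r):
    """Return (start, end) of the r-th of nprocs roughly equal blocks of range(n)."""
    step = (n + nprocs - 1) // nprocs  # ceiling division
    start = min(r * step, n)
    return start, min(start + step, n)


def count_motif(seq, motif, start, end):
    """Count (possibly overlapping) occurrences of motif that start in [start, end)."""
    count = 0
    pos = seq.find(motif, start)
    while pos != -1 and pos < end:
        count += 1
        pos = seq.find(motif, pos + 1)
    return count


def parallel_count(seq, motif):
    """Return the total count on process 0 and None on all other processes."""
    start, end = block(len(seq), size, rank)
    # A hit may run past `end`; that is fine because every process holds
    # the full sequence and only the start position decides ownership.
    local = count_motif(seq, motif, start, end)
    if rank != 0:
        pypar.send(local, 0)  # ship the partial count to process 0
        return None
    total = local
    for source in range(1, size):
        total += pypar.receive(source)
    return total


if __name__ == "__main__":
    total = parallel_count("GATTACAGATTACAGATTACA", "GATTACA")
    if rank == 0:
        print("matches:", total)
    if pypar is not None:
        pypar.finalize()
```

With LAM/MPI you would typically start it with something like
"mpirun -np 4 python motif_count.py" (the file name is hypothetical).
Assigning each hit to the block containing its start position means hits
that straddle a block boundary are counted exactly once.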
Just out of interest, what types of pattern matching algorithms are you
using?
Hope this helps,
Tim C