[Tutor] python pipeline
jarod_v6 at libero.it
Mon Sep 1 16:58:48 CEST 2014
Dear all,
I am trying to write a pipeline that starts from a CSV file in which I list
the name and the path of my files.
example.csv
Name,FASTQ1,FASTQ2,DIRECTORY
sampleA,A_R1_.fastq.gz,A_R2_.fastq.gz,108,~/FASTQ/
sampleB,B_R1_.fastq.gz,B_R2_.fastq.gz,112,~/FASTQ/
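A minimal sketch of reading such a sample sheet with the standard csv module. It assumes the header names exactly one column per value; the rows shown above carry one more value than the header, so the column layout used here is an assumption. The function name read_samples is hypothetical.

```python
import csv
import os


def read_samples(csv_path):
    """Return one dict per sample, keyed by role rather than by column index."""
    samples = []
    with open(csv_path) as handle:
        for row in csv.DictReader(handle):
            # Expand "~" here so the paths also work when no shell is involved.
            directory = os.path.expanduser(row["DIRECTORY"])
            samples.append({
                "name": row["Name"],
                "read1": os.path.join(directory, row["FASTQ1"]),
                "read2": os.path.join(directory, row["FASTQ2"]),
            })
    return samples
```

Looking columns up by header name avoids hard-coded positions such as lines[10], which break as soon as a column is added or removed.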
For each entry in that list I need to run three different scripts that depend
on one another, so I need to run the first and, only when it has finished,
start the second and then the third.
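One possible way to get this "run, wait, then run the next" behaviour is subprocess.check_call, which blocks until the process exits and raises an exception on a non-zero exit status. The sketch below uses placeholder commands, not the real three scripts, and the name run_in_order is hypothetical.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command only after the previous one has finished successfully."""
    for command in commands:
        # check_call waits for the process to exit and raises
        # CalledProcessError on a non-zero status, which stops the chain.
        subprocess.check_call(command)


if __name__ == "__main__":
    # Placeholder steps; replace them with the three real scripts.
    steps = [
        [sys.executable, "-c", "print('step 1')"],
        [sys.executable, "-c", "print('step 2')"],
        [sys.executable, "-c", "print('step 3')"],
    ]
    run_in_order(steps)
```

Because a failing step raises, the second and third steps never start on incomplete output from the first.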
One of the problems is that each script writes its output only in the
directory from which I launch the program, so I need to create the directories
first. I set the output directory, and I want to obtain this folder layout:
.
├── sampleA
│   ├── ref.txt
│   └── second
└── sampleB
    ├── ref.txt
    └── second
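The directory part of this layout could be created up front, for example as in the sketch below. It assumes that "second" is a sub-directory of each sample directory and that ref.txt is written later by one of the scripts; make_sample_dirs is a hypothetical helper name.

```python
import os


def make_sample_dirs(base_dir, sample_names, subdir="second"):
    """Create base_dir/<sample>/<subdir> for every sample and return the sample dirs."""
    sample_dirs = []
    for name in sample_names:
        sample_dir = os.path.join(base_dir, name)
        target = os.path.join(sample_dir, subdir)
        # makedirs also creates the intermediate sample directory.
        if not os.path.isdir(target):
            os.makedirs(target)
        sample_dirs.append(sample_dir)
    return sample_dirs
```

Calling it twice with the same names is harmless, because existing directories are skipped.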
I have problems with how to move between the different folders and how to use
subprocess to execute everything.
Any idea how I can do this?
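For the "moving between folders" part, one option is not to move at all: the subprocess functions accept a cwd argument that sets the working directory of the child process only. The sketch below uses a stand-in command that writes ref.txt into its current directory; run_in_dir is a hypothetical name.

```python
import os
import subprocess
import sys
import tempfile


def run_in_dir(command, workdir):
    """Run command with workdir as its working directory and wait for it."""
    # cwd only affects the child process, so the Python script itself
    # never has to call os.chdir between samples.
    subprocess.check_call(command, cwd=workdir)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    # Stand-in for a tool that writes its output to the current directory.
    writer = [sys.executable, "-c", "open('ref.txt', 'w').close()"]
    run_in_dir(writer, workdir)
    print(os.listdir(workdir))
```

A tool that always writes next to where it was launched then ends up writing into the per-sample directory passed as workdir.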
import os

STAR = "~/software/STAR_2.3.0e.Linux_x86_64_static/STAR"
# Alignment command; the placeholders are genome directory, read 1 and read 2.
ALIGN = (STAR + " --genomeDir %s --runMode alignReads --readFilesIn %s %s"
         " --runThreadN 12 --readFilesCommand zcat")
# Second-pass genome generation from the splice junctions of the first pass.
GENERATE = (STAR + " --runMode genomeGenerate --genomeDir $PWD/hg19_second/"
            " --genomeFastaFiles ~/databases/bowtie2Database/hg19.fa"
            " --sjdbFileChrStartEnd $PWD/SJ.out.tab"
            " --sjdbOverhang 49 --runThreadN 12")


def Staralign(file, pos):
    """Write toRun.sh with the STAR commands for every sample listed in file."""
    Nome, Path, Read1, Read2 = [], [], [], []
    with open(file) as p:
        for i in p:
            lines = i.rstrip("\n").split(",")
            if lines[0] != "Name":  # skip the header row
                # NOTE: indices 7, 8 and 10 do not exist in the example.csv
                # above; they are assumed to match the full file.
                Path.append(lines[10])
                Nome.append(lines[0])
                Read1.append(lines[7])
                Read2.append(lines[8])
    dizionario = {}
    with open("toRun.sh", "w") as out:
        out.write("#!/bin/bash\n")
        for i in range(len(Nome)):
            dx = Path[i] + "/" + Read1[i]
            sn = Path[i] + "/" + Read2[i]
            # One output directory per sample; the original concatenated the
            # integer i here, which raises a TypeError.
            sample_dir = pos + "/" + Nome[i]
            if not os.path.exists(sample_dir):
                os.makedirs(sample_dir)
            commands = [
                # Run inside the sample directory so STAR writes its output there.
                "cd " + sample_dir,
                ALIGN % ("/home/sbsuser/databases/Starhg19/GenomeDir/", dx, sn),
                "cd $PWD/hg19_second/",
                GENERATE,
                "cd ..",
                ALIGN % ("$PWD/hg19_second/GenomeDir/", dx, sn),
            ]
            for command in commands:
                out.write(command + "\n")
            # Map each sample name to its commands, in run order
            # (the original setdefault() call had no arguments).
            dizionario.setdefault(Nome[i], []).extend(commands)
    return dizionario
This is the function I wrote, but this way I am only able to write a bash
script.
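One way to go beyond writing a bash script might be to build each command as an argument list that subprocess can run directly. The sketch below reuses the STAR path and flags from the function above; the helper name star_align_command and its parameters are hypothetical. Note that "~" is only expanded by a shell, so the sketch expands it explicitly.

```python
import os

STAR = "~/software/STAR_2.3.0e.Linux_x86_64_static/STAR"
GENOME_DIR = "/home/sbsuser/databases/Starhg19/GenomeDir/"


def star_align_command(read1, read2, genome_dir=GENOME_DIR, threads=12):
    """Return the STAR alignment command as a list of arguments."""
    return [
        # Without a shell nothing expands "~", so do it here.
        os.path.expanduser(STAR),
        "--genomeDir", genome_dir,
        "--runMode", "alignReads",
        "--readFilesIn", read1, read2,
        "--runThreadN", str(threads),
        "--readFilesCommand", "zcat",
    ]
```

Such a list can then be passed to subprocess.check_call together with cwd set to the sample directory, so no intermediate toRun.sh is needed.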