I used defaultdict to store some variables but the output is blank

Peter Otten __peter__ at web.de
Sun Jun 9 07:56:41 EDT 2013


claire morandin wrote:

> I have the following script which does not return anything, no apparent
> mistake but my output file is empty. I am just trying to extract some
> decimal numbers from a file according to their names, which are in another
> file.
> 
> [code]from collections import defaultdict
> import numpy as np
> 
> ercc_contigs= {}
> for line in open ('Faq_ERCC_contigs_name.txt'):
>     gene = line.strip().split()

You probably planned to use the loop above to populate the ercc_contigs 
dict, but there's no code for that.
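
Something along these lines might do it. This is only a sketch: it assumes the contig name is the first whitespace-separated field on each line of the names file, and it uses a set rather than a dict because the script only tests membership. The helper name read_contig_names is made up for illustration.

```python
def read_contig_names(lines):
    """Collect the first whitespace-separated field of every non-blank line."""
    names = set()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            names.add(fields[0])
    return names

# Possible usage with the file from the original script:
# with open('Faq_ERCC_contigs_name.txt') as f:
#     ercc_contigs = read_contig_names(f)
```

With ercc_contigs populated this way, the "if gene in ercc_contigs" test further down can actually succeed.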

 
> ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))
> output_file = open('out.txt','w')
> 
> rpkm_file = open('RSEM_Faq_Q1.genes.results.txt')
> rpkm_file.readline()
> for line in rpkm_file:
>     line = line.strip()
>     columns =  line.strip().split()
>     gene = columns[0].strip()
>     rpkm_value = float(columns[6].strip())

Remember that ercc_contigs is empty; therefore the test 

>     if gene in ercc_contigs:

always fails and the following line is never executed.

>         ercc_rpkm[gene] += rpkm_value
> 
> ercc_fh = open ('out.txt','w')
> for gene, rpkm_value in ercc_rpkm.iteritems():
>     ercc = '{0}\t{1}\n'.format(gene, rpkm_value)
>     ercc_fh.write (ercc)[/code]
> 
> If someone could help me spot what's wrong it would be much appreciated
> cheers

By the way: it is unclear to me why you are using a numpy array here:

> ercc_rpkm = defaultdict(lambda: np.zeros(1, dtype=float))

I think

ercc_rpkm = defaultdict(float)

should suffice. Also:

>     line = line.strip()
>     columns =  line.strip().split()
>     gene = columns[0].strip()
>     rpkm_value = float(columns[6].strip())

You can remove all strip() method calls here: line.split() without 
arguments splits on runs of whitespace and ignores leading and trailing 
whitespace, so the resulting fields never contain any.
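
Putting both suggestions together, the accumulation loop could look roughly like the sketch below. The function name sum_rpkm and its parameters are hypothetical. The column index 6 is taken from the original script, and skipping lines that are too short is my own assumption about how malformed lines should be handled.

```python
from collections import defaultdict

def sum_rpkm(wanted, lines, value_column=6):
    """Sum the values in value_column for every gene that is in wanted."""
    totals = defaultdict(float)  # missing keys start at 0.0
    for line in lines:
        columns = line.split()  # no strip() calls needed
        if len(columns) <= value_column:
            continue  # assumption: silently skip blank or short lines
        gene = columns[0]
        if gene in wanted:
            totals[gene] += float(columns[value_column])
    return totals

# Possible usage with the file from the original script:
# with open('RSEM_Faq_Q1.genes.results.txt') as rpkm_file:
#     next(rpkm_file)  # skip the header line
#     ercc_rpkm = sum_rpkm(ercc_contigs, rpkm_file)
```

Because the totals are plain floats, writing them out with '{0}\t{1}\n'.format(gene, rpkm_value) gives a bare number instead of the string representation of a one-element array.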




