[Tutor] How to create a dictionary for count elements

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Tue Jun 3 21:56:00 CEST 2014


On 03.06.2014 18:24, jarod_v6 at libero.it wrote:
> Hi there!
> I have a file like this:
> file.txt
> programs 	sample	gene
> program1	sample1	TP53
> program1	sample1	TP53
> program1	sample2	PRNP
> program1	sample2	ATF3
> program2	sample1	TP53
> program2	sample1	PRNP
> program2	sample2	TRIM32
> program2	sample2	TLK1
> program2	sample2	KIT
>
>
> with open("prova.csv") as p:
>     for i in p:
>         lines = i.rstrip("\n").split("\t")
>         print lines
>
> ['programs ', 'sample', 'gene', 'values']
> ['program1', 'sample1', 'TP53', '2']
> ['program1', 'sample1', 'TP53', '3']
> ['program1', 'sample2', 'PRNP', '4']
> ['program1', 'sample2', 'ATF3', '3']
> ['program2', 'sample1', 'TP53', '2']
> ['program2', 'sample1', 'PRNP', '5']
> ['program2', 'sample2', 'TRIM32', '4']
> ['program2', 'sample2', 'TLK1', '4']
>

Be exact. Do not provide approximate information if you are looking for 
adequate answers!

Your file does not look like the one you showed: according to your output, 
there is an additional 'values' column in it.
What do you want to do with it?

>
> I want to create a dictionary with set data with the names of the genes:
>
> example:
> dic = {}
>
>
> dic['program1-sample1] = set(TP53)
> dic['program1-sample2] = set(TP53,PRNP,ATF3)
>

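Again, this is not something you ever actually tried in a Python shell, 
since it would raise errors for several reasons: the key strings are 
missing their closing quotes, the gene names are not quoted, and set() 
accepts at most one argument. Just try it yourself!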

I would not build dictionary keys by concatenating the 'programs' and 
'sample' strings. Rather, use a tuple of the two (any hashable object, 
such as a tuple of strings, works as a dict key), e.g.:

dic[('program1', 'sample1')] = {'TP53'}
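
To illustrate why the tuple is convenient, here is a small sketch (written 
for Python 3; the second entry and the names `pairs` and `genes_p1_s1` are 
made up for illustration). The key unpacks directly into program and sample, 
so nothing has to be split out of a concatenated string later:

```python
# Hypothetical example data; only the first entry comes from the text above.
dic = {}
dic[('program1', 'sample1')] = {'TP53'}
dic[('program1', 'sample2')] = {'PRNP', 'ATF3'}

# A tuple key unpacks directly into its two parts.
pairs = []
for (program, sample), genes in sorted(dic.items()):
    pairs.append((program, sample, sorted(genes)))

# Lookup uses the same tuple form as the assignment.
genes_p1_s1 = dic[('program1', 'sample1')]
```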

Essentially, what you need to do is:

- instead of printing each individual list you've parsed from the input 
file, use the first two elements as a tuple for the dict key, then add 
the third element (the gene) to the set stored under that key (use 
set.add() for that purpose).

- the tricky part is what to do with keys that are encountered for the 
first time and, thus, don't have a set associated with them yet.
Here, dict.setdefault() will help you 
(https://docs.python.org/2.7/library/stdtypes.html?highlight=setdefault#dict.setdefault).
hint: your_dict.setdefault(your_key, set()).add(the_gene) will work 
whether or not the key has been encountered before.
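
Put together, the two steps above could look roughly like the following 
sketch (written for Python 3). It assumes the four-column layout from your 
printed output; the io.StringIO object is only a stand-in so that the 
example runs on its own, and with the real file you would use 
open("prova.csv") instead. The names `data`, `fields` and `key` are my own 
choices for illustration:

```python
import io

# Stand-in for the real file, using the rows from the quoted output.
data = io.StringIO(
    "programs \tsample\tgene\tvalues\n"
    "program1\tsample1\tTP53\t2\n"
    "program1\tsample1\tTP53\t3\n"
    "program1\tsample2\tPRNP\t4\n"
    "program1\tsample2\tATF3\t3\n"
    "program2\tsample1\tTP53\t2\n"
    "program2\tsample1\tPRNP\t5\n"
    "program2\tsample2\tTRIM32\t4\n"
    "program2\tsample2\tTLK1\t4\n"
)

dic = {}
with data as p:
    next(p)  # skip the header line
    for line in p:
        fields = line.rstrip("\n").split("\t")
        key = (fields[0], fields[1])  # (program, sample)
        # setdefault returns the set already stored under key, or stores
        # and returns a new empty set if the key is seen for the first time.
        dic.setdefault(key, set()).add(fields[2])
```

Because the values are sets, the repeated TP53 row for program1/sample1 
ends up in the set only once.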

> So If I have a dictionary like that I can compare two set  I will compare the
> capacity of the programs in function of the gene show.

I have no idea what you are trying to do, so I can't tell you whether 
the data structure will be good for it.

Wolfgang

