[Tutor] How to create a dictionary to count elements
Wolfgang Maier
wolfgang.maier at biologie.uni-freiburg.de
Tue Jun 3 21:56:00 CEST 2014
On 03.06.2014 18:24, jarod_v6 at libero.it wrote:
> Hi there!
> I have a file like this:
> file.txt
> programs sample gene
> program1 sample1 TP53
> program1 sample1 TP53
> program1 sample2 PRNP
> program1 sample2 ATF3
> program2 sample1 TP53
> program2 sample1 PRNP
> program2 sample2 TRIM32
> program2 sample2 TLK1
> program2 sample2 KIT
>
>
> with open("prova.csv") as p:
>     for i in p:
>         lines = i.rstrip("\n").split("\t")
>         print lines
>
> ['programs ', 'sample', 'gene', 'values']
> ['program1', 'sample1', 'TP53', '2']
> ['program1', 'sample1', 'TP53', '3']
> ['program1', 'sample2', 'PRNP', '4']
> ['program1', 'sample2', 'ATF3', '3']
> ['program2', 'sample1', 'TP53', '2']
> ['program2', 'sample1', 'PRNP', '5']
> ['program2', 'sample2', 'TRIM32', '4']
> ['program2', 'sample2', 'TLK1', '4']
>
Please be exact and do not give approximate information if you are
looking for adequate answers!
Your file does not look like the one you showed: it has an additional
'values' column in it.
What do you want to do with it?
>
> I want to create a dictionary with set data with the names of the genes:
>
> example:
> dic = {}
>
>
> dic['program1-sample1] = set(TP53)
> dic['program1-sample2] = set(TP53,PRNP,ATF3)
>
Again, this cannot be something you actually tried in a Python shell,
since it would raise errors for several reasons. Just try it yourself!
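To make those reasons concrete, here is a small sketch of my own (not taken from your post) that triggers each problem separately and records the exception type. The list name `errors` is just for illustration.

```python
errors = []

# 1. The missing closing quote in 'program1-sample1 is a syntax error.
#    compile() is used so the broken line can be tested without stopping the script.
try:
    compile("dic['program1-sample1] = set(TP53)", "<example>", "exec")
except SyntaxError as exc:
    errors.append(type(exc).__name__)

# 2. Unquoted gene names are treated as (undefined) variable names.
try:
    set(TP53)
except NameError as exc:
    errors.append(type(exc).__name__)

# 3. set() accepts at most one argument, which must be an iterable.
try:
    set('TP53', 'PRNP', 'ATF3')
except TypeError as exc:
    errors.append(type(exc).__name__)

# 4. Even a single quoted name does not do what you want:
#    the string is iterated character by character.
letters = set('TP53')
```

Note the last case: `set('TP53')` raises no error but gives you a set of single characters, which is why the set literal `{'TP53'}` is the safer spelling.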
I would not build dictionary keys by concatenating the 'programs' and
'sample' strings. Rather, use a tuple of the two (any hashable object,
such as a tuple of strings, works as a dict key), e.g.:
dic[('program1', 'sample1')] = {'TP53'}
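As a rough illustration of why the tuple is more convenient than a concatenated string (the variable names below are made up for this example), the two fields can be recovered from the key by plain unpacking:

```python
dic = {}
dic[('program1', 'sample1')] = {'TP53'}
dic[('program1', 'sample2')] = {'PRNP', 'ATF3'}

# Iterating over a dict yields its keys; each key unpacks into its two fields,
# so no string splitting on '-' is needed.
programs = sorted({program for program, sample in dic})
samples_of_program1 = sorted(
    sample for program, sample in dic if program == 'program1'
)
```

With a key like 'program1-sample1' you would have to split the string again, and that breaks as soon as a program or sample name itself contains a '-'.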
Essentially, what you need to do is:
- instead of printing each individual list you've parsed from the input
file, use the first two elements as a tuple for the dict key, then add
the third element (the gene) to the set stored under that key (use
set.add() for that purpose).
- the tricky part is what to do with keys that are encountered for the
first time and, thus, don't have a set associated with them yet.
Here, dict.setdefault() will help you
(https://docs.python.org/2.7/library/stdtypes.html?highlight=setdefault#dict.setdefault).
hint: your_dict.setdefault(your_key, set()).add(the_gene) will work
whether or not the key has been encountered before.
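Putting the two steps together, a minimal sketch could look like the following. It assumes a tab-separated file with one header line and the gene in the third column, as in your printed output. The in-memory `sample_file` and the function name `genes_by_program_and_sample` are my own stand-ins, so adapt them to your real file.

```python
import io

# Stand-in for the tab-separated file (assumption: first line is a header).
sample_file = io.StringIO(
    "programs\tsample\tgene\tvalues\n"
    "program1\tsample1\tTP53\t2\n"
    "program1\tsample1\tTP53\t3\n"
    "program1\tsample2\tPRNP\t4\n"
    "program1\tsample2\tATF3\t3\n"
    "program2\tsample1\tTP53\t2\n"
    "program2\tsample1\tPRNP\t5\n"
    "program2\tsample2\tTRIM32\t4\n"
    "program2\tsample2\tTLK1\t4\n"
)


def genes_by_program_and_sample(lines):
    """Map (program, sample) tuples to the set of genes seen for them."""
    dic = {}
    lines = iter(lines)
    next(lines, None)  # skip the header line
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        key = (fields[0], fields[1])
        # setdefault returns the set already stored under key, or stores
        # a new empty set first if the key has not been seen yet.
        dic.setdefault(key, set()).add(fields[2])
    return dic


dic = genes_by_program_and_sample(sample_file)
```

With a real file you would pass the open file object instead, e.g. inside `with open("prova.csv") as p:`. Because the values are sets, the duplicate TP53 line for program1/sample1 is stored only once.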
> So If I have a dictionary like that I can compare two set I will compare the
> capacity of the programs in function of the gene show.
I have no idea what you are trying to do, so I can't tell you whether
the data structure will be good for it.
Wolfgang