[Tutor] How to show dictionary item non present on file

Peter Otten __peter__ at web.de
Tue Jul 22 13:48:09 CEST 2014


jarod_v6 at libero.it wrote:

> Hi there!
> 
> I have a naive question about dictionary analysis:
> If you have a dictionary like this:
> diz
> Out[8]: {'elenour': 1, 'frank': 1, 'jack': 1, 'ralph': 1}
> 
> and you have a list and you want to know which keys are not present in
> the dictionary, the code is simple:
> for i in diz.keys():
>    ...:     if i in mitico:
>    ...:         print "NO"
>    ...:     else:
>    ...:         print i
>    ...:
> NO
> 
> But I have this problem: I have a file and I want to know which elements
> of my dictionary are not present in the file.
>  more data.tmp
> jack	1
> pippo	1
> luis	1
> frate	1
> livio	1
> frank	1
> 
> 
> with open("data.tmp") as p:
>     for i in p:
>         lines= i.strip("\n").split("\t")
>         if not diz.has_key(lines[0]):
>    ....:             print i
>    ....:
> pippo	1
> 
> luis	1
> 
> frate	1
> 
> livio	1
> 
> The output I want is:
> ralph and elenour. How can I do this?
> Thanks in advance!

You have the logic backwards. Your loop goes over the file and prints the 
lines whose names are missing from the dict. Instead, you have to iterate 
over the names in the dict and look them up in the file:

>>> diz = {'elenour': 1, 'frank': 1, 'jack': 1, 'ralph': 1}
>>> def find_name(name):
...     with open("data.tmp") as f:
...         for line in f:
...             if name == line.split("\t")[0]:
...                 return True
...     return False
... 
>>> for name in diz:
...     if not find_name(name):
...         print name
... 
ralph
elenour

However, this is very inefficient because the file is read once per name, 
i.e. len(diz) times. It is better to first store the names in a data 
structure suited for fast lookup and then to use that instead of the file. 
Python's dict and set types are such data structures; since we don't care 
about an associated value, let's use a set:

>>> with open("data.tmp") as f:
...     names_in_file = {line.split("\t")[0] for line in f}
... 
>>> for name in diz:
...     if name not in names_in_file:
...         print name
... 
ralph
elenour
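
For reference, here is the same idea packaged as a function; take it as a 
minimal sketch rather than the one right way. The function name 
missing_names and the temporary file standing in for data.tmp are my own 
inventions for illustration. It also strips the newline and skips blank 
lines, which the session above does not need for this particular file. The 
result is sorted only to get a stable order.

```python
import os
import tempfile


def missing_names(names, path):
    """Return the names that never occur in the first tab-separated column of the file."""
    with open(path) as f:
        # Strip the newline before splitting so a line without a tab still
        # compares cleanly, and ignore blank lines.
        names_in_file = {
            line.rstrip("\n").split("\t")[0] for line in f if line.strip()
        }
    # Set difference: everything in names that was not seen in the file.
    return sorted(set(names) - names_in_file)


# Hypothetical sample data mirroring data.tmp from the question.
sample = "jack\t1\npippo\t1\nluis\t1\nfrate\t1\nlivio\t1\nfrank\t1\n"
fd, tmp_path = tempfile.mkstemp(suffix=".tmp")
with os.fdopen(fd, "w") as f:
    f.write(sample)

diz = {'elenour': 1, 'frank': 1, 'jack': 1, 'ralph': 1}
result = missing_names(diz, tmp_path)
os.remove(tmp_path)
print(result)
```

The file is read exactly once, no matter how many names the dict holds.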

Digging a bit deeper, you'll find that you can get these names with set 
arithmetic:

>>> set(diz) - names_in_file
set(['ralph', 'elenour'])

or even, using the set-like key view available in Python 2.7:

>>> diz.viewkeys() - names_in_file
set(['elenour', 'ralph'])
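
Should you move to Python 3 at some point: viewkeys() and has_key() no 
longer exist there, but keys() itself returns a set-like view and the "in" 
operator replaces has_key(). A small sketch in Python 3 syntax, with the 
file contents written out by hand as a set (an assumption, to keep the 
example self-contained):

```python
# Python 3 only: dict.keys() is a set-like view, so viewkeys() is not needed.
diz = {'elenour': 1, 'frank': 1, 'jack': 1, 'ralph': 1}

# Stand-in for the names read from data.tmp (typed in by hand here).
names_in_file = {'jack', 'pippo', 'luis', 'frate', 'livio', 'frank'}

# The difference of a key view and a set is an ordinary set.
missing = diz.keys() - names_in_file
print(sorted(missing))

# Membership test that replaces diz.has_key('pippo').
pippo_known = 'pippo' in diz
print(pippo_known)
```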




