[Tutor] a dictionary method good for this process

Kent Johnson kent37 at tds.net
Thu Aug 4 20:01:41 CEST 2005


Srinivas Iyyer wrote:
> Dear group:
> 
> I have two lists and I have to match the element in
> the first list to element in second list and print the
> line from second list:
> 
> Example:
> listA =['apple','baby','cat']
> listB =['fruit\tapple','tree\tapple','fruit\tmango',
>         'human\tbaby'
>         'infant\tbaby'
>         'human\tAlbert'
>         'animal\tcat'
>         'tiger\tcat'
>         'animan\tdog']
> 
> 
> I have to take apple, search in listB and print
> 
> fruit  apple
> tree apple
> infant baby
> human baby
> animal cat
> tiger cat

Presumably those are the results of searching for 'apple', 'baby' and 'cat'...

I would make a helper dictionary that maps the second column of each listB entry to the list of listB entries that have that value in the second column. This way you make just one pass over listB, instead of one pass for every element of listA:

listA =['apple','baby','cat']
listB =['fruit\tapple','tree\tapple','fruit\tmango',
        'human\tbaby',
        'infant\tbaby',
        'human\tAlbert',
        'animal\tcat',
        'tiger\tcat',
        'animan\tdog']

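# Helper dict: maps the second column of each listB entry to the list of
# listB entries that have that value in the second column
d = {}
for m in listB:
    cols = m.split('\t')
    term = cols[1]
    d.setdefault(term, []).append(m)

for i in listA:
    # d.get(i, []) yields an empty list for a term that never occurs in listB
    for item in d.get(i, []):
        print(item)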


The only tricky part is the line
  d.setdefault(term, []).append(m)

This is just a shortcut for something like
  try:
    data = d[term]
  except KeyError:
    d[term] = data = []
  data.append(m)
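
The same grouping can also be sketched with collections.defaultdict, which creates the empty list for a new key automatically. This is an alternative to setdefault, not something the code above requires. It assumes a Python version that has defaultdict, and the function names build_index and lookup are made up for illustration:

```python
from collections import defaultdict

listA = ['apple', 'baby', 'cat']
listB = ['fruit\tapple', 'tree\tapple', 'fruit\tmango',
         'human\tbaby', 'infant\tbaby', 'human\tAlbert',
         'animal\tcat', 'tiger\tcat', 'animan\tdog']


def build_index(lines):
    """Map the second tab-separated column to the lines that contain it.

    build_index is a hypothetical helper name, not part of the original code.
    """
    index = defaultdict(list)
    for line in lines:
        term = line.split('\t')[1]
        index[term].append(line)  # a missing key starts out as an empty list
    return index


def lookup(terms, index):
    """Return the indexed lines for each term, in the order of terms.

    lookup is a hypothetical helper name. index.get() is used so that a
    term absent from the index adds nothing and does not create a new key.
    """
    result = []
    for term in terms:
        result.extend(index.get(term, []))
    return result


matches = lookup(listA, build_index(listB))
for line in matches:
    print(line)
```

Building the index touches each listB entry once, and each lookup afterwards is a single dictionary access, so the total work grows with len(listA) + len(listB) rather than len(listA) * len(listB).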

Kent

> 
> 
> I am doing it this way:
> 
> for i in listA:
>    for m in listB:
>         cols = m.split('\t')
>         term = cols[2]
>         if i == term:
>            print m
> 
> this is very time consuming.
> 
> Because the two columns in listB are redundant I am
> unable to do a dict method.
> 
> The other way could be using a reg.ex method (which I
> did not try yet because the script written based on
> equality is still running.
> 
> for i in listA:
>        pat = re.compile(i)
>        for m in listB:
>              cols = m.split('\t')
>              terms = cols[1]
>              if pat.match(terms):
>                      print m
> 
> Experts, could you please suggest any other method
> which is fast and does the job correctly. 
> 
> Thank you.
> 
> Srini