Dictionary/Hash question
Gabriel Genellina
gagsl-py at yahoo.com.ar
Tue Feb 6 19:49:11 EST 2007
On Tue, 06 Feb 2007 20:31:17 -0300, Sick Monkey <sickcodemonkey at gmail.com>
wrote:
> Even though I am starting to get the hang of Python, I continue to find
> myself finding problems that I cannot solve.
> I have never used dictionaries before and I feel that they really help
> improve efficiency when trying to analyze huge amounts of data (rather
> than having nested loops).
You are right, a list is not the right data structure in your case.
But a dictionary is a mapping from keys to values, and you have no values
to store.
In this case one should use a set: like a list, but unordered and with
no duplicate elements.
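To illustrate the difference with a small sketch (the addresses below are made up for the example, they are not from your files):

```python
# Hypothetical sample data, for illustration only.
addresses = ["ann@example.com", "bob@example.com", "ann@example.com"]

unique = set(addresses)               # duplicates collapse into one element
print(len(addresses))                 # 3
print(len(unique))                    # 2
print("bob@example.com" in unique)    # True

# A dict would need a value for every key; here the values carry no information.
as_dict = dict.fromkeys(addresses)    # every value is None
print(set(as_dict) == unique)         # True
```

The dict ends up with the same keys as the set, plus a column of values you never look at.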
Also, it's not necessary to read all lines at once; you can process both
files line by line. And since reading both files appears to be the same
thing, you can make a function:
def mailsfromfile(fname):
    # Collect every address found in fname into a set (no duplicates).
    result = set()
    with open(fname, 'r') as finput:
        for line in finput:
            # some_regular_expression: your own compiled address pattern
            mails = some_regular_expression.findall(line)
            if mails:
                result.update(mails)
    return result

mails1 = mailsfromfile(f1name)
mails2 = mailsfromfile(f2name)
# & = set intersection: mails present in both files
for mail in mails1 & mails2:
    pass  # write mail to output file
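If you want something you can run as is, here is one possible way to fill in the missing pieces. Everything named here that is not in the code above is an assumption made for the example: MAIL_RE is a deliberately simple pattern standing in for some_regular_expression (it is not a full address validator), and write_common and its outname parameter are hypothetical.

```python
import re

# Assumption: a simple illustrative pattern, not a complete address validator.
MAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def mails_from_file(fname):
    """Return the set of addresses found in fname, reading line by line."""
    result = set()
    with open(fname, "r") as finput:
        for line in finput:
            # findall returns an empty list when nothing matches,
            # so update() can be called unconditionally.
            result.update(MAIL_RE.findall(line))
    return result

def write_common(f1name, f2name, outname):
    """Write the addresses present in both files to outname, one per line."""
    common = mails_from_file(f1name) & mails_from_file(f2name)
    with open(outname, "w") as foutput:
        # Sets have no order, so sort to get a reproducible output file.
        for mail in sorted(common):
            foutput.write(mail + "\n")
    return common
```

Sorting is only there so that the output file comes out the same on every run; drop it if the order does not matter to you.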
--
Gabriel Genellina