[Tutor] Matching zipcode in address file

TGW galaxywatcher at gmail.com
Tue Apr 6 03:35:56 CEST 2010


> OK - you handled the problem regarding reading to end-of-file. Yes it
> takes a lot longer, because now you are actually iterating through
> match_zips for each line.
>
> How large are these files? Consider creating a set from match_zips. As
> lists get longer, set membership tests become faster than list
> membership tests.
>
> If the outfile is empty that means that line[149:154] is never in
> match_zips.
>
> I suggest you take a look at match_zips. You will find a list of strings
> of length 6, which cannot match line[149:154], a string of length 5.
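
To make the length mismatch in the quoted advice concrete, here is a small sketch. It uses an in-memory stand-in for the zip file (the io.StringIO object and its two codes are assumptions for illustration, not the real data). readlines() keeps each line's trailing newline, so a five-character code comes back as a six-character string and never equals a five-character slice:

```python
import io

# Hypothetical stand-in for the zip-code file; the real file is not shown here.
zips = io.StringIO(u"12345\n23456\n")
match_zips = zips.readlines()

sample = match_zips[0]
print(repr(sample))            # the trailing newline is part of the string
print(len(sample))             # 6, not 5

# A five-character slice can never equal a six-character list entry.
print("12345" in match_zips)   # False

# Stripping each entry and building a set fixes the comparison and makes
# membership tests fast for long lists.
stripped = set(z.strip() for z in match_zips)
print("12345" in stripped)     # True
```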

I am still struggling with this. I have simplified the code because
I need to understand the principle.

#!/usr/bin/env python

import string

def main():
      infile = open("filex")
      outfile = open("results_testx", "w")
      zips = open("zippys", "r")
      match_zips = zips.readlines()
      lines = [line for line in infile if line[0:3] + '\n' in match_zips]
      outfile.write(''.join(lines))
      print line[0:3]
      zips.close()
      infile.close()
      outfile.close()
main()

filex:
112332424
23423423423
34523423423
456234234234
234234234234
5672342342
683824242

zippys:
123
123
234
345
456
567
678
555


I want to output records from filex whose first 3 characters match a
record in zippys. Output:
23423423423
34523423423
456234234234
234234234234
5672342342

I am not sure where I should put a '\n', or whether I need to tweak
something that I just cannot see.
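
One way to avoid the '\n' bookkeeping altogether is to strip the zip entries once and compare bare prefixes. The following is only a sketch: the function name matching_records and its width parameter are made up for illustration, the sample data is copied from the listings above, and the missing newline on the last zippys entry is an assumption added to show that stripping makes it harmless. Whether that is what actually goes wrong with the real files is a guess.

```python
def matching_records(records, zip_lines, width=3):
    """Return the records whose first `width` characters appear in zip_lines."""
    # strip() removes '\n' (and '\r\n'), so no newline has to be appended
    # to the slice; blank lines are skipped.
    wanted = set(z.strip() for z in zip_lines if z.strip())
    return [rec for rec in records if rec[:width] in wanted]


# Sample data from the post, written as the lines a file object would yield.
filex = ["112332424\n", "23423423423\n", "34523423423\n", "456234234234\n",
         "234234234234\n", "5672342342\n", "683824242\n"]
# Assumption: the final entry has no trailing newline.
zippys = ["123\n", "123\n", "234\n", "345\n", "456\n", "567\n", "678\n", "555"]

result = matching_records(filex, zippys)
print("".join(result))
```

With real files, the same function can be called with the open file objects in place of the two lists, since iterating over a file yields its lines.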

Thanks

