[Tutor] Matching zipcode in address file
TGW
galaxywatcher at gmail.com
Tue Apr 6 03:35:56 CEST 2010
> OK - you handled the problem regarding reading to end-of-file. Yes it
> takes a lot longer, because now you are actually iterating through
> match_zips for each line.
>
> How large are these files? Consider creating a set from match_zips. As
> lists get longer, set membership tests become faster than list
> membership tests.
>
> If the outfile is empty that means that line[149:154] is never in
> match_zips.
>
> I suggest you take a look at match_zips. You will find a list of strings
> of length 6, which cannot match line[149:154], a string of length 5.
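To check the point about string lengths, here is a minimal sketch. The zip codes and the record are made up, and io.StringIO is only a stand-in for the real file, so treat it as an illustration rather than the actual data:

```python
import io

# Hypothetical stand-in for the zip-code file; real data would come from open(...)
zips_file = io.StringIO(u"12345\n23456\n")

raw_zips = zips_file.readlines()       # each entry keeps its trailing '\n'
print([len(z) for z in raw_zips])      # every entry has length 6, not 5

# Stripping the line ending and building a set gives exact, fast lookups
zip_set = set(z.strip() for z in raw_zips)

# Made-up fixed-width record with the zip code in columns 149-153
line = u"x" * 149 + u"23456" + u" rest of record\n"
field = line[149:154]

print(field in raw_zips)   # False: '23456' is not equal to '23456\n'
print(field in zip_set)    # True
```

Printing repr() of a few entries of match_zips is another quick way to see stray '\n' or '\r\n' characters.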
I am still struggling with this. I have simplified the code because
I need to understand the principle.
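#!/usr/bin/env python
import string

def main():
    infile = open("filex")
    outfile = open("results_testx", "w")
    zips = open("zippys", "r")
    # readlines() keeps the trailing '\n' on each entry
    match_zips = zips.readlines()
    # keep lines whose first 3 characters (plus '\n') appear in match_zips
    lines = [line for line in infile if line[0:3] + '\n' in match_zips]
    outfile.write(''.join(lines))
    # Python 2 only: prints the prefix of the last line read (debugging aid)
    print line[0:3]
    zips.close()
    infile.close()
    outfile.close()

main()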
filex:
112332424
23423423423
34523423423
456234234234
234234234234
5672342342
683824242
zippys:
123
123
234
345
456
567
678
555
I want to output records from filex whose first 3 characters match a
record in zippys. Expected output:
23423423423
34523423423
456234234234
234234234234
5672342342
I am not sure where I should put a '\n' or tweak something that I just
cannot see.
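One approach that might sidestep the newline question is to strip each line of zippys before comparing, so that a '\n', a '\r\n', or a missing final newline no longer matters. Below is a minimal sketch using the sample data above as in-memory lists instead of files; filter_by_prefix is a made-up helper name, not part of the script:

```python
def filter_by_prefix(records, prefixes, width=3):
    """Return the records whose first `width` characters appear in `prefixes`."""
    # Strip line endings and skip blank lines; a set makes lookups fast
    wanted = set(p.strip() for p in prefixes if p.strip())
    return [r for r in records if r[:width] in wanted]

# Sample data copied from filex and zippys, as readlines() would return it
filex = ["112332424\n", "23423423423\n", "34523423423\n", "456234234234\n",
         "234234234234\n", "5672342342\n", "683824242\n"]
zippys = ["123\n", "123\n", "234\n", "345\n", "456\n", "567\n", "678\n", "555\n"]

matches = filter_by_prefix(filex, zippys)
print(''.join(matches))
# prints:
# 23423423423
# 34523423423
# 456234234234
# 234234234234
# 5672342342
```

With real files, the two lists would be replaced by the open file objects, since iterating over a file yields the same kind of newline-terminated strings.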
Thanks