[Tutor] Matching zipcode in address file
bob gailer
bgailer at gmail.com
Mon Apr 5 14:19:56 CEST 2010
On 4/5/2010 1:15 AM, TGW wrote:
>> Sorry - my mistake - try:
>>
>> infile = open("filex")
>> match_zips = open("zippys")
>> result = [line for line in infile if line in match_zips]
>> print result
> When I apply readlines to the original file, it takes a lot longer
> to process, and the outfile still remains blank. Any suggestions?
OK - you handled the problem regarding reading to end-of-file. Yes, it
takes a lot longer, because now you are actually searching through the
whole match_zips list for each line of infile.
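For reference, here is a small sketch of what I take the end-of-file
problem to have been: an "in" test against an open file reads the file
until it finds a match, so later tests start from wherever the previous
one stopped. io.StringIO and the zip codes below are stand-ins I made up,
not your data.

```python
import io

# io.StringIO stands in for open("zippys"); the values are made up.
match_zips = io.StringIO(u"87101\n87501\n")

first = u"87501\n" in match_zips   # True, but the search read to end-of-file
second = u"87101\n" in match_zips  # False: nothing is left to read

# Reading the lines into a list once avoids the problem.
zip_lines = io.StringIO(u"87101\n87501\n").readlines()
both = (u"87501\n" in zip_lines, u"87101\n" in zip_lines)
```

With the list, every test sees all the lines, which is also why each
test now costs a full scan.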
How large are these files? Consider creating a set from match_zips. A
set membership test takes roughly constant time, while a list membership
test scans the list item by item, so the set pulls further ahead as the
list gets longer.
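A minimal sketch of the idea, with made-up zip codes standing in for
the contents of your file. Build the set once, before the loop, and
test against it instead of the list:

```python
# Hypothetical sample data standing in for the contents of the zip file.
zip_list = ["87101", "87102", "87501"]
zip_set = set(zip_list)  # build once, before looping over the records

# Both tests give the same answer. The list is scanned item by item,
# while the set does a hash lookup.
in_list = "87501" in zip_list
in_set = "87501" in zip_set
missing = "99999" in zip_set
```

For a handful of zip codes the difference will not be noticeable; it
starts to matter when both files are large.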
If the outfile is empty, that means that line[149:154] is never in
match_zips.
I suggest you take a look at match_zips. Because readlines() keeps the
newline at the end of each line, you will find a list of strings of
length 6, which cannot match line[149:154], a string of length 5.
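Here is a sketch of the mismatch and one possible fix, stripping the
line endings before comparing. The record and zip codes are invented
for illustration; I am assuming one zip code per line in zips.txt.

```python
# What readlines() would return for a file with one zip code per line
# (hypothetical values): each string keeps its trailing "\n".
match_zips = ["87101\n", "87501\n"]

# A made-up record: 149 filler characters, then a 5-character zip code.
line = "x" * 149 + "87501" + " rest of record\n"

field = line[149:154]          # the 5-character zip code field
raw_hit = field in match_zips  # no match: "87501" != "87501\n"

# Strip the line endings and keep the codes in a set.
clean_zips = set(z.strip() for z in match_zips)
clean_hit = field in clean_zips
```

In your script that would mean building the stripped set from
zips.readlines() and testing line[149:154] against it.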
>
> #!/usr/bin/env python
> # Find records that match zipcodes in zips.txt
>
> import os
> import sys
>
> def main():
>     infile = open("/Users/tgw/NM_2010/NM_APR.txt", "r")
>     outfile = open("zip_match_apr_2010.txt", "w")
>     zips = open("zips.txt", "r")
>     match_zips = zips.readlines()
>     lines = [ line for line in infile if line[149:154] in match_zips ]
>
>     outfile.write(''.join(lines))
>     # print line[149:154]
>     print lines
>     infile.close()
>     outfile.close()
> main()
>
>
>
--
Bob Gailer
919-636-4239
Chapel Hill NC