Stuck on a three word street name regex

Lie Ryan lie.1296 at gmail.com
Thu Jan 28 09:27:59 EST 2010


On 01/28/10 11:28, Brian D wrote:
> I've tackled this kind of problem before by looping through a patterns
> dictionary, but there must be a smarter approach.
> 
> Two addresses. Note that the first has incorrectly transposed the
> direction and street name. The second has an extra space in it before
> the street type. Clearly done by someone who didn't know how to
> concatenate properly -- or didn't care.
> 
> 1000 RAMPART S ST
> 
> 100 JOHN CHURCHILL CHASE  ST
> 
> I want to parse the elements into an array of values that can be
> inserted into new database fields.
> 
> Anyone who loves solving these kinds of puzzles care to relieve my
> frazzled brain?
> 
> The pattern I'm using doesn't keep the "CHASE" with the "JOHN
> CHURCHILL":


How does the following perform?

import re
pat = re.compile(r'(?P<streetnum>\d+)\s+(?P<streetname>[A-Z\s]+)\s+(?P<streetdir>S?E|SW?|N?W|NE?|)\s+(?P<streettype>ST|RD|AVE?)$')

or more legibly:

pat = re.compile(
    r'''
      (?P<streetnum>  \d+              )  # mandatory: series of digits
      \s+
      (?P<streetname> [A-Z\s]+         )  # mandatory: one or more words
      \s+
      (?P<streetdir>  S?E|SW?|N?W|NE?| )  # optional: direction or nothing
      \s+
      (?P<streettype> ST|RD|AVE?       )  # mandatory: street type
      $                                   # mandatory: end of string
    ''', re.VERBOSE)
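
As a quick check, here is a minimal sketch that runs the verbose pattern against your two sample addresses. The helper name parse_address and the third address ("100 MAIN ST") are mine, made up for illustration. Note that the pattern wants whitespace on both sides of the (possibly empty) direction, so an address with no direction and only a single space before the street type will not match as written.

```python
import re

# The verbose pattern from this message, with the comments dropped.
pat = re.compile(
    r'''
      (?P<streetnum>  \d+              )
      \s+
      (?P<streetname> [A-Z\s]+         )
      \s+
      (?P<streetdir>  S?E|SW?|N?W|NE?| )
      \s+
      (?P<streettype> ST|RD|AVE?       )
      $
    ''', re.VERBOSE)


def parse_address(address):
    """Return the named groups as a dict, or None if nothing matches.

    parse_address is a hypothetical helper, not part of the re module.
    """
    m = pat.match(address)
    return m.groupdict() if m else None


# The first two addresses are from the original question; the third is
# an invented case with no direction and a single space before "ST".
for address in ("1000 RAMPART S ST",
                "100 JOHN CHURCHILL CHASE  ST",
                "100 MAIN ST"):
    print(repr(address), "->", parse_address(address))
```

The groupdict() result maps straight onto database fields; an empty string in streetdir means no direction was found.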



