Stuck on a three word street name regex

Brian D briandenzer at gmail.com
Wed Jan 27 22:57:25 EST 2010


On Jan 27, 7:27 pm, MRAB <pyt... at mrabarnett.plus.com> wrote:
> Brian D wrote:
> > I've tackled this kind of problem before by looping through a patterns
> > dictionary, but there must be a smarter approach.
>
> > Two addresses. Note that the first has incorrectly transposed the
> > direction and street name. The second has an extra space in it before
> > the street type. Clearly done by someone who didn't know how to
> > concatenate properly -- or didn't care.
>
> > 1000 RAMPART S ST
>
> > 100 JOHN CHURCHILL CHASE  ST
>
> > I want to parse the elements into an array of values that can be
> > inserted into new database fields.
>
> > Anyone who loves solving these kinds of puzzles care to relieve my
> > frazzled brain?
>
> > The pattern I'm using doesn't keep the "CHASE" with the "JOHN
> > CHURCHILL":
>
> [snip]
> Regex doesn't gain you much. I'd split the string and then fix the parts
> as necessary:
>
>  >>> def parse_address(address):
> ...     parts = address.split()
> ...     if parts[-2] == "S":
> ...         parts[1 : -1] = [parts[-2]] + parts[1 : -2]
> ...     parts[1 : -1] = [" ".join(parts[1 : -1])]
> ...     return parts
> ...
>  >>> print parse_address("1000 RAMPART S ST")
> ['1000', 'S RAMPART', 'ST']
>  >>> print parse_address("100 JOHN CHURCHILL CHASE  ST")
> ['100', 'JOHN CHURCHILL CHASE', 'ST']

This is a nice approach that I wouldn't have thought to pursue. I've
never seen list elements referenced from the end with negative indices,
so that certainly expands my knowledge of Python. Of course, I'd want
to check for the other directionals too -- probably with a membership
test, e.g.,

if parts[-2] in ('E', 'W', 'N', 'S'):
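
Folded back into the function above, that check might look roughly like the sketch below. The DIRECTIONALS name and the short-input guard are my own additions, not part of MRAB's version, and the set holds only the four single-letter directionals (no 'NE', 'SW', etc.), so treat it as a starting point rather than a finished parser:

```python
# Assumed set of single-letter directionals; extend it if the data needs more.
DIRECTIONALS = {"N", "S", "E", "W"}


def parse_address(address):
    """Split an address into [number, street name, street type]."""
    # split() with no argument collapses runs of whitespace,
    # so the doubled space before "ST" is harmless.
    parts = address.split()
    # Guard added for this sketch: fewer than three tokens cannot be
    # split into number/name/type, so return them untouched.
    if len(parts) < 3:
        return parts
    # parts[-2] is the token just before the street type. If it is a
    # directional, move it to the front of the street name.
    if parts[-2] in DIRECTIONALS:
        parts[1:-1] = [parts[-2]] + parts[1:-2]
    # Collapse everything between the number and the type into one string.
    parts[1:-1] = [" ".join(parts[1:-1])]
    return parts


print(parse_address("1000 RAMPART S ST"))
print(parse_address("100 JOHN CHURCHILL CHASE  ST"))
```

The slice parts[1:-1] means "everything after the house number and before the street type", which is why the same code handles one-word and three-word street names alike.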

Thanks for sharing your approach.


