matching a street address with regular expressions
Karthik Gurusamy
kar1107 at gmail.com
Wed Oct 10 15:21:16 EDT 2007
On Oct 10, 10:02 am, "Shawn Milochik" <Sh... at Milochik.com> wrote:
> On 10/4/07, Ricardo Aráoz <ricar... at gmail.com> wrote:
>
>
>
> > Christopher Spears wrote:
> > > One of the exercises in Core Python Programming is to
> > > create a regular expression that will match a street
> > > address. Here is one of my attempts.
>
> > >>>> street = "1180 Bordeaux Drive"
> > >>>> patt = "\d+ \w+"
> > >>>> import re
> > >>>> m = re.match(patt, street)
> > >>>> if m is not None: m.group()
> > > ...
> > > '1180 Bordeaux'
>
> > > Obviously, I can just create a pattern "\d+ \w+ \w+".
> > > However, the pattern would be useless if I had a
> > > street name like 3120 De la Cruz Boulevard. Any
> > > hints?
>
> Also, that pattern can be easily modified to have any number of words
> at the end:
> patt = "\d+ (\w+){1,}"
> This would take care of 3120 De la Cruz Boulevard.
\w doesn't match whitespace, so that pattern stops after the first word of the street name. The following will work:
patt = r"\d+ (\w+\s*){1,}"
BTW, {1,} is the same as +, so
patt = r"\d+ (\w+\s*)+"
will work as well.
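
Here is a quick sketch to check the patterns above against both addresses from the thread. The names loose_patt, street_patt and match_street are just ones I made up for this example.

```python
import re

# Pattern from the quoted suggestion: \w+ cannot cross a space,
# so the match ends after the first word of the street name.
loose_patt = re.compile(r"\d+ (\w+){1,}")

# Corrected pattern: each word may be followed by optional whitespace.
street_patt = re.compile(r"\d+ (\w+\s*)+")


def match_street(address):
    """Return the matched text, or None if the address does not match."""
    m = street_patt.match(address)
    return m.group() if m is not None else None


print(loose_patt.match("3120 De la Cruz Boulevard").group())  # 3120 De
print(match_street("1180 Bordeaux Drive"))        # 1180 Bordeaux Drive
print(match_street("3120 De la Cruz Boulevard"))  # 3120 De la Cruz Boulevard
```

Because \s* is greedy, trailing whitespace after the last word would also end up in the match, so strip the result if that matters.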
Note that using a raw string for an re pattern is safer in most cases, since the backslashes are then passed to re unchanged.
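
To illustrate the raw-string point, here is a small sketch using \b, which is not in the patterns above but shows the problem clearly: in an ordinary string literal "\b" is a backspace character, while in a raw string it reaches re as the word-boundary escape. The variable names are only for illustration.

```python
import re

plain = "\bDrive\b"   # Python turns each \b into a backspace character
raw = r"\bDrive\b"    # backslashes kept, re sees \b as a word boundary

print(len(plain), len(raw))                           # 7 9
print(re.search(plain, "1180 Bordeaux Drive"))        # None
print(re.search(raw, "1180 Bordeaux Drive").group())  # Drive
```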
Karthik